Compare commits

...

383 Commits

Author SHA1 Message Date
Linus Torvalds
3d7cb6b04c Linux 5.19 2022-07-31 14:03:01 -07:00
Linus Torvalds
334c0ef642 Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
 "One-liner fix of a NULL pointer deref in the Allwinner clk driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: sunxi-ng: Fix H6 RTC clock definition
2022-07-31 09:52:20 -07:00
Linus Torvalds
89caf57540 Merge tag 'x86_urgent_for_v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Update the 'mitigations=' kernel param documentation

 - Check the IBPB feature flag before enabling IBPB in firmware calls
   because cloud vendors' fantasy when it comes to creating guest
   configurations is unlimited

 - Unexport sev_es_ghcb_hv_call() before 5.19 releases now that HyperV
   doesn't need it anymore

 - Remove dead CONFIG_* items

* tag 'x86_urgent_for_v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  docs/kernel-parameters: Update descriptions for "mitigations=" param with retbleed
  x86/bugs: Do not enable IBPB at firmware entry when IBPB is not available
  Revert "x86/sev: Expose sev_es_ghcb_hv_call() for use by HyperV"
  x86/configs: Update configs in x86_debug.config
2022-07-31 09:26:53 -07:00
Linus Torvalds
5e4823e6da Merge tag 'locking_urgent_for_v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Borislav Petkov:

 - Avoid rwsem lockups in certain situations when handling the handoff
   bit

* tag 'locking_urgent_for_v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
2022-07-31 09:21:13 -07:00
Linus Torvalds
cd2715b792 Merge tag 'edac_urgent_for_v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fixes from Borislav Petkov:

 - Relax the condition under which the DIMM label in ghes_edac is set in
   order to accomodate an HPE BIOS which sets only the device but not
   the bank

 - Two forgotten fixes to synopsys_edac when handling error interrupts

* tag 'edac_urgent_for_v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/ghes: Set the DIMM label unconditionally
  EDAC/synopsys: Re-enable the error interrupts on v3 hw
  EDAC/synopsys: Use the correct register to disable the error interrupt on v3 hw
2022-07-31 09:12:58 -07:00
Linus Torvalds
6a01025844 Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Last set of ARM fixes for 5.19:

   - fix for MAX_DMA_ADDRESS overflow

   - fix for find_*_bit performing an out of bounds memory access"

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: findbit: fix overflowing offset
  ARM: 9216/1: Fix MAX_DMA_ADDRESS overflow
2022-07-30 17:24:16 -07:00
Waiman Long
6eebd5fb20 locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
With commit d257cc8cb8 ("locking/rwsem: Make handoff bit handling more
consistent"), the writer that sets the handoff bit can be interrupted
out without clearing the bit if the wait queue isn't empty. This disables
reader and writer optimistic lock spinning and stealing.

Now if a non-first writer in the queue is somehow woken up or a new
waiter enters the slowpath, it can't acquire the lock.  This is not the
case before commit d257cc8cb8 as the writer that set the handoff bit
will clear it when exiting out via the out_nolock path. This is less
efficient as the busy rwsem stays in an unlock state for a longer time.

In some cases, this new behavior may cause lockups as shown in [1] and
[2].

This patch allows a non-first writer to ignore the handoff bit if it
is not originally set or initiated by the first waiter. This patch is
shown to be effective in fixing the lockup problem reported in [1].

[1] https://lore.kernel.org/lkml/20220617134325.GC30825@techsingularity.net/
[2] https://lore.kernel.org/lkml/3f02975c-1a9d-be20-32cf-f1d8e3dfafcc@oracle.com/

Fixes: d257cc8cb8 ("locking/rwsem: Make handoff bit handling more consistent")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20220622200419.778799-1-longman@redhat.com
2022-07-30 10:58:28 +02:00
Linus Torvalds
620725263f Merge tag 'mm-hotfixes-stable-2022-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "Two hotfixes, both cc:stable"

* tag 'mm-hotfixes-stable-2022-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/hmm: fault non-owner device private entries
  page_alloc: fix invalid watermark check on a negative value
2022-07-29 21:02:35 -07:00
Linus Torvalds
8a91f86f3e Merge tag 'block-5.19-2022-07-29' of git://git.kernel.dk/linux-block
Pull block fix from Jens Axboe:
 "Just a single fix for NVMe, yet another quirk addition"

* tag 'block-5.19-2022-07-29' of git://git.kernel.dk/linux-block:
  nvme-pci: Crucial P2 has bogus namespace ids
2022-07-29 16:07:35 -07:00
Linus Torvalds
e65c6a46df Merge tag 'drm-fixes-2022-07-30' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Dave Airlie:
 "Maxime had the dog^Wmailing list server eat his homework^Wmisc pull
  request.

  Two more small fixes, one in nouveau svm code and the other in
  simpledrm.

  nouveau:
   - page migration fix

  simpledrm:
   - fix mode_valid return value"

* tag 'drm-fixes-2022-07-30' of git://anongit.freedesktop.org/drm/drm:
  nouveau/svm: Fix to migrate all requested pages
  drm/simpledrm: Fix return type of simpledrm_simple_display_pipe_mode_valid()
2022-07-29 13:25:31 -07:00
Dave Airlie
ce156c8a18 Merge tag 'drm-misc-fixes-2022-07-29' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
One fix to fix simpledrm mode_valid return value, and one for page
migration in nouveau

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220729094514.sfzhc3gqjgwgal62@penduick
2022-07-30 06:09:57 +10:00
Linus Torvalds
1c8ac1c4af Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Four fixes, three in drivers.

  The two biggest fixes are ufs and the remaining driver and core fix
  are small and obvious (and the core fix is low risk)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: core: Fix a race condition related to device management
  scsi: core: Fix warning in scsi_alloc_sgtables()
  scsi: ufs: host: Hold reference returned by of_parse_phandle()
  scsi: mpt3sas: Stop fw fault watchdog work item during system shutdown
2022-07-29 13:07:03 -07:00
Eiichi Tsukata
ea304a8b89 docs/kernel-parameters: Update descriptions for "mitigations=" param with retbleed
Updates descriptions for "mitigations=off" and "mitigations=auto,nosmt"
with the respective retbleed= settings.

Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: corbet@lwn.net
Link: https://lore.kernel.org/r/20220728043907.165688-1-eiichi.tsukata@nutanix.com
2022-07-29 20:47:07 +02:00
Ralph Campbell
8a295dbbaf mm/hmm: fault non-owner device private entries
If hmm_range_fault() is called with the HMM_PFN_REQ_FAULT flag and a
device private PTE is found, the hmm_range::dev_private_owner page is used
to determine if the device private page should not be faulted in. 
However, if the device private page is not owned by the caller,
hmm_range_fault() returns an error instead of calling migrate_to_ram() to
fault in the page.

For example, if a page is migrated to GPU private memory and a RDMA fault
capable NIC tries to read the migrated page, without this patch it will
get an error.  With this patch, the page will be migrated back to system
memory and the NIC will be able to read the data.

Link: https://lkml.kernel.org/r/20220727000837.4128709-2-rcampbell@nvidia.com
Link: https://lkml.kernel.org/r/20220725183615.4118795-2-rcampbell@nvidia.com
Fixes: 08ddddda66 ("mm/hmm: check the device private page owner in hmm_range_fault()")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reported-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Philip Yang <Philip.Yang@amd.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-29 11:33:37 -07:00
Jaewon Kim
9282012fc0 page_alloc: fix invalid watermark check on a negative value
There was a report that a task is waiting at the
throttle_direct_reclaim. The pgscan_direct_throttle in vmstat was
increasing.

This is a bug where zone_watermark_fast returns true even when the free
is very low. The commit f27ce0e140 ("page_alloc: consider highatomic
reserve in watermark fast") changed the watermark fast to consider
highatomic reserve. But it did not handle a negative value case which
can be happened when reserved_highatomic pageblock is bigger than the
actual free.

If watermark is considered as ok for the negative value, allocating
contexts for order-0 will consume all free pages without direct reclaim,
and finally free page may become depleted except highatomic free.

Then allocating contexts may fall into throttle_direct_reclaim. This
symptom may easily happen in a system where wmark min is low and other
reclaimers like kswapd does not make free pages quickly.

Handle the negative case by using MIN.

Link: https://lkml.kernel.org/r/20220725095212.25388-1-jaewon31.kim@samsung.com
Fixes: f27ce0e140 ("page_alloc: consider highatomic reserve in watermark fast")
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Reported-by: GyeongHwan Hong <gh21.hong@samsung.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Yong-Taek Lee <ytk.lee@samsung.com>
Cc: <stable@vger.kerenl.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-29 11:33:37 -07:00
Linus Torvalds
bb83c99d3d Merge tag 'perf-tools-fixes-for-v5.19-2022-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix addresses for bss symbols, describing variables used in resolving
   data access in tools such as 'perf c2c' and 'perf mem'.

 - Skip symbols if SHF_ALLOC flag is not set, a technique used for
   listing deprecated symbols, its addresses are zeros, so not useful.

 - Remove undefined behavior from bpf_perf_object__next() when dealing
   with an empty bpf_objects_list list.

 - Make a ARM CoreSight disasm script work with both python2 and
   python3.

 - Sync x86's cpufeatures header with with the kernel sources.

* tag 'perf-tools-fixes-for-v5.19-2022-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf bpf: Remove undefined behavior from bpf_perf_object__next()
  perf symbol: Skip symbols if SHF_ALLOC flag is not set
  perf symbol: Correct address for bss symbols
  perf scripts python: Let script to be python2 compliant
  tools headers cpufeatures: Sync with the kernel sources
2022-07-29 11:26:28 -07:00
Linus Torvalds
4b20426d04 Merge tag 'wq-for-5.19-rc8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
 "Just one commit to suppress a spurious warning added during the 5.19
  cycle"

* tag 'wq-for-5.19-rc8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Avoid a false warning in unbind_workers()
2022-07-29 11:20:40 -07:00
Linus Torvalds
506e6dfb0f Merge tag 'pm-5.19-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
 "Make some false positive RCU splats resulting from a recent intel_idle
  driver change go away (Waiman Long)"

* tag 'pm-5.19-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  intel_idle: Fix false positive RCU splats due to incorrect hardirqs state
2022-07-29 10:57:26 -07:00
Lai Jiangshan
46a4d679ef workqueue: Avoid a false warning in unbind_workers()
Doing set_cpus_allowed_ptr() with wq_unbound_cpumask can be possible
fails and trigger the false warning.

Use cpu_possible_mask instead when wq_unbound_cpumask has no active CPUs.

It is very easy to trigger the warning:
  Set wq_unbound_cpumask to a small set of CPUs.
  Offline all the CPUs of wq_unbound_cpumask.
  Offline an extra CPU and trigger the warning.

Fixes: 10a5a651e3 ("workqueue: Restrict kworker in the offline CPU pool running on housekeeping CPUs")
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-07-29 07:49:02 -10:00
Linus Torvalds
e4d8b09d67 Merge tag 'riscv-for-linus-5.19-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fix from Palmer Dabbelt:
 "A build fix for 'make vdso_install' that avoids an issue trying to
  install the compat VDSO"

* tag 'riscv-for-linus-5.19-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: compat: vdso: Fix vdso_install target
2022-07-29 10:46:03 -07:00
Linus Torvalds
a95eb1d086 Merge tag 'loongarch-fixes-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:

 - Fix cache size calculation, stack protection attributes, ptrace's
   fpr_set and "ROM Size" in boardinfo

 - Some cleanups and improvements of assembly

 - Some cleanups of unused code and useless code

* tag 'loongarch-fixes-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Fix wrong "ROM Size" of boardinfo
  LoongArch: Fix missing fcsr in ptrace's fpr_set
  LoongArch: Fix shared cache size calculation
  LoongArch: Disable executable stack by default
  LoongArch: Remove unused variables
  LoongArch: Remove clock setting during cpu hotplug stage
  LoongArch: Remove useless header compiler.h
  LoongArch: Remove several syntactic sugar macros for branches
  LoongArch: Re-tab the assembly files
  LoongArch: Simplify "BGT foo, zero" with BGTZ
  LoongArch: Simplify "BLT foo, zero" with BLTZ
  LoongArch: Simplify "BEQ/BNE foo, zero" with BEQZ/BNEZ
  LoongArch: Use the "move" pseudo-instruction where applicable
  LoongArch: Use the "jr" pseudo-instruction where applicable
  LoongArch: Use ABI names of registers where appropriate
2022-07-29 10:10:30 -07:00
Linus Torvalds
9d928d9b78 Merge tag 'powerpc-5.19-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - Re-enable the new amdgpu display engine for powerpc, as long as the
   compiler is correctly configured.

 - Disable stack variable initialisation in prom_init to fix GCC 12
   allmodconfig.

Thanks to Dan Horák and Sudip Mukherjee.

* tag 'powerpc-5.19-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  drm/amdgpu: Re-enable DCN for 64-bit powerpc
  powerpc/64s: Disable stack variable initialisation for prom_init
2022-07-29 09:57:07 -07:00
Tiezhu Yang
45b53c9051 LoongArch: Fix wrong "ROM Size" of boardinfo
We can see the "ROM Size" is different in the following outputs:

[root@linux loongson]# cat /sys/firmware/loongson/boardinfo
BIOS Information
Vendor                  : Loongson
Version                 : vUDK2018-LoongArch-V2.0.pre-beta8
ROM Size                : 63 KB
Release Date            : 06/15/2022

Board Information
Manufacturer            : Loongson
Board Name              : Loongson-LS3A5000-7A1000-1w-A2101
Family                  : LOONGSON64

[root@linux loongson]# dmidecode | head -11
...
Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
	Vendor: Loongson
	Version: vUDK2018-LoongArch-V2.0.pre-beta8
	Release Date: 06/15/2022
	ROM Size: 4 MB

According to "BIOS Information (Type 0) structure" in the SMBIOS
Reference Specification [1], it shows 64K * (n+1) is the size of
the physical device containing the BIOS if the size is less than
16M.

Additionally, we can see the related code in dmidecode [2]:

  u64 s = { .l = (code1 + 1) << 6 };

So the output of dmidecode is correct, the output of boardinfo
is wrong, fix it.

By the way, at present no need to consider the size is 16M or
greater on LoongArch, because it is usually 4M or 8M which is
enough to use.

[1] https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.6.0.pdf
[2] https://git.savannah.nongnu.org/cgit/dmidecode.git/tree/dmidecode.c#n347

Fixes: 628c3bb40e ("LoongArch: Add boot and setup routines")
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:33 +08:00
Qi Hu
b0f3bdc002 LoongArch: Fix missing fcsr in ptrace's fpr_set
In file ptrace.c, function fpr_set does not copy fcsr data from ubuf
to kbuf. That's the reason why fcsr cannot be modified by ptrace.

This patch fixs this problem and allows users using ptrace to modify
the fcsr.

Co-developed-by: Xu Li <lixu@loongson.cn>
Signed-off-by: Qi Hu <huqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:33 +08:00
Huacai Chen
1aea29d7c3 LoongArch: Fix shared cache size calculation
Current calculation of shared cache size is from the node (die) scope,
but we hope 'lscpu' to show the shared cache size of the whole package
for multi-die chips (e.g., Loongson-3C5000L, which contains 4 dies in
one package). So fix it by multiplying nodes_per_package.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:33 +08:00
Huacai Chen
317980e6b4 LoongArch: Disable executable stack by default
Disable executable stack for LoongArch by default, as all modern
architectures do.

Reported-by: Andreas Schwab <schwab@suse.de>
Suggested-by: WANG Xuerui <git@xen0n.name>
Link: https://sourceware.org/pipermail/binutils/2022-July/121992.html
Tested-by: WANG Xuerui <git@xen0n.name>
Tested-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
Bibo Mao
3a3a4f7a65 LoongArch: Remove unused variables
There are some variables never used or referenced, this patch
removes these varaibles and make the code cleaner.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
Bibo Mao
71610ab1d0 LoongArch: Remove clock setting during cpu hotplug stage
On physical machine we can save power by disabling clock of hot removed
cpu. However as different platforms require different methods to
configure clocks, the code is platform-specific, and probably belongs to
firmware/pmu or cpu regulator, rather than generic arch/loongarch code.

Also, there is no such register on QEMU virt machine since the
clock/frequency regulation is not emulated.

This patch removes the hard-coded clock register accesses in generic
LoongArch cpu hotplug flow.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
Jun Yi
f62b7626cb LoongArch: Remove useless header compiler.h
The content of LoongArch's compiler.h is trivial, with some unused
anywhere, so inline the definitions and remove the header.

Signed-off-by: Jun Yi <yijun@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
WANG Xuerui
ab6e57a69d LoongArch: Remove several syntactic sugar macros for branches
These syntactic sugars have been supported by upstream binutils from the
beginning, so no need to patch them locally.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
WANG Xuerui
f5c3c22f21 LoongArch: Re-tab the assembly files
Reflow the *.S files for better stylistic consistency, namely hard tabs
after mnemonic position, and vertical alignment of the first operand
with hard tabs. Tab width is obviously 8. Some pre-existing intra-block
vertical alignments are preserved.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
WANG Xuerui
1fdb9a9249 LoongArch: Simplify "BGT foo, zero" with BGTZ
Support for the syntactic sugar is present in upstream binutils port
from the beginning. Use it for shorter lines and better consistency.
Generated code should be identical.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
WANG Xuerui
d1bc75d759 LoongArch: Simplify "BLT foo, zero" with BLTZ
Support for the syntactic sugar is present in upstream binutils port
from the beginning. Use it for shorter lines and better consistency.
Generated code should be identical.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
WANG Xuerui
d47b2dc87c LoongArch: Simplify "BEQ/BNE foo, zero" with BEQZ/BNEZ
While B{EQ,NE}Z and B{EQ,NE} are different instructions, and the vastly
expanded range for branch destination does not really matter in the few
cases touched, use the B{EQ,NE}Z where possible for shorter lines and
better consistency (e.g. some places used "BEQ foo, zero", while some
used "BEQ zero, foo").

Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
WANG Xuerui
57ce5d3eef LoongArch: Use the "move" pseudo-instruction where applicable
Some of the assembly code in the LoongArch port likely originated
from a time when the assembler did not support pseudo-instructions like
"move" or "jr", so the desugared form was used and readability suffers
(to a minor degree) as a result.

As the upstream toolchain supports these pseudo-instructions from the
beginning, migrate the existing few usages to them for better
readability.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
WANG Xuerui
07b480695d LoongArch: Use the "jr" pseudo-instruction where applicable
Some of the assembly code in the LoongArch port likely originated
from a time when the assembler did not support pseudo-instructions like
"move" or "jr", so the desugared form was used and readability suffers
(to a minor degree) as a result.

As the upstream toolchain supports these pseudo-instructions from the
beginning, migrate the existing few usages to them for better
readability.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
WANG Xuerui
d8e7f201a4 LoongArch: Use ABI names of registers where appropriate
Some of the assembly in the LoongArch port seem to come from a
prehistoric time, when the assembler didn't even have support for the
ABI names we all come to know and love, thus used raw register numbers
which hampered readability.

The usages are found with a regex match inside arch/loongarch, then
manually adjusted for those non-definitions.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29 18:22:32 +08:00
Russell King (Oracle)
ec85bd369f ARM: findbit: fix overflowing offset
When offset is larger than the size of the bit array, we should not
attempt to access the array as we can perform an access beyond the
end of the array. Fix this by changing the pre-condition.

Using "cmp r2, r1; bhs ..." covers us for the size == 0 case, since
this will always take the branch when r1 is zero, irrespective of
the value of r2. This means we can fix this bug without adding any
additional code!

Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-07-29 09:54:26 +01:00
Thadeu Lima de Souza Cascardo
571c30b1a8 x86/bugs: Do not enable IBPB at firmware entry when IBPB is not available
Some cloud hypervisors do not provide IBPB on very recent CPU processors,
including AMD processors affected by Retbleed.

Using IBPB before firmware calls on such systems would cause a GPF at boot
like the one below. Do not enable such calls when IBPB support is not
present.

  EFI Variables Facility v0.08 2004-May-17
  general protection fault, maybe for address 0x1: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 0 PID: 24 Comm: kworker/u2:1 Not tainted 5.19.0-rc8+ #7
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  Workqueue: efi_rts_wq efi_call_rts
  RIP: 0010:efi_call_rts
  Code: e8 37 33 58 ff 41 bf 48 00 00 00 49 89 c0 44 89 f9 48 83 c8 01 4c 89 c2 48 c1 ea 20 66 90 b9 49 00 00 00 b8 01 00 00 00 31 d2 <0f> 30 e8 7b 9f 5d ff e8 f6 f8 ff ff 4c 89 f1 4c 89 ea 4c 89 e6 48
  RSP: 0018:ffffb373800d7e38 EFLAGS: 00010246
  RAX: 0000000000000001 RBX: 0000000000000006 RCX: 0000000000000049
  RDX: 0000000000000000 RSI: ffff94fbc19d8fe0 RDI: ffff94fbc1b2b300
  RBP: ffffb373800d7e70 R08: 0000000000000000 R09: 0000000000000000
  R10: 000000000000000b R11: 000000000000000b R12: ffffb3738001fd78
  R13: ffff94fbc2fcfc00 R14: ffffb3738001fd80 R15: 0000000000000048
  FS:  0000000000000000(0000) GS:ffff94fc3da00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff94fc30201000 CR3: 000000006f610000 CR4: 00000000000406f0
  Call Trace:
   <TASK>
   ? __wake_up
   process_one_work
   worker_thread
   ? rescuer_thread
   kthread
   ? kthread_complete_and_exit
   ret_from_fork
   </TASK>
  Modules linked in:

Fixes: 28a99e95f5 ("x86/amd: Use IBPB for firmware calls")
Reported-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220728122602.2500509-1-cascardo@canonical.com
2022-07-29 10:02:35 +02:00
Linus Torvalds
6e2c049076 Merge tag 'drm-fixes-2022-07-29' of git://anongit.freedesktop.org/drm/drm
Pull drm fix from Dave Airlie:
 "Quiet extra week, just a single fix for i915 workaround with execlist
  backend.

  i915:

   - Further reset robustness improvements for execlists [Wa_22011802037]"

* tag 'drm-fixes-2022-07-29' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend
2022-07-28 20:34:59 -07:00
Dave Airlie
f16a2f593d Merge tag 'drm-intel-fixes-2022-07-28-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Further reset robustness improvements for execlists [Wa_22011802037] (Umesh Nerlige Ramappa)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YuJIWaEbKcs/q0NY@tursulin-desk
2022-07-29 11:39:13 +10:00
Alistair Popple
66cee9097e nouveau/svm: Fix to migrate all requested pages
Users may request that pages from an OpenCL SVM allocation be migrated
to the GPU with clEnqueueSVMMigrateMem(). In Nouveau this will call into
nouveau_dmem_migrate_vma() to do the migration. If the total range to be
migrated exceeds SG_MAX_SINGLE_ALLOC the pages will be migrated in
chunks of size SG_MAX_SINGLE_ALLOC. However a typo in updating the
starting address means that only the first chunk will get migrated.

Fix the calculation so that the entire range will get migrated if
possible.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Fixes: e3d8b08904 ("drm/nouveau/svm: map pages after migration")
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220720062745.960701-1-apopple@nvidia.com
Cc: <stable@vger.kernel.org> # v5.8+
2022-07-28 16:43:31 -04:00
Linus Torvalds
33ea1340ba Merge tag 'net-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth and netfilter, no known blockers for
  the release.

  Current release - regressions:

   - wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop(), fix
     taking the lock before its initialized

   - Bluetooth: mgmt: fix double free on error path

  Current release - new code bugs:

   - eth: ice: fix tunnel checksum offload with fragmented traffic

  Previous releases - regressions:

   - tcp: md5: fix IPv4-mapped support after refactoring, don't take the
     pure v6 path

   - Revert "tcp: change pingpong threshold to 3", improving detection
     of interactive sessions

   - mld: fix netdev refcount leak in mld_{query | report}_work() due to
     a race

   - Bluetooth:
      - always set event mask on suspend, avoid early wake ups
      - L2CAP: fix use-after-free caused by l2cap_chan_put

   - bridge: do not send empty IFLA_AF_SPEC attribute

  Previous releases - always broken:

   - ping6: fix memleak in ipv6_renew_options()

   - sctp: prevent null-deref caused by over-eager error paths

   - virtio-net: fix the race between refill work and close, resulting
     in NAPI scheduled after close and a BUG()

   - macsec:
      - fix three netlink parsing bugs
      - avoid breaking the device state on invalid change requests
      - fix a memleak in another error path

  Misc:

   - dt-bindings: net: ethernet-controller: rework 'fixed-link' schema

   - two more batches of sysctl data race adornment"

* tag 'net-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  stmmac: dwmac-mediatek: fix resource leak in probe
  ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr
  net: ping6: Fix memleak in ipv6_renew_options().
  net/funeth: Fix fun_xdp_tx() and XDP packet reclaim
  sctp: leave the err path free in sctp_stream_init to sctp_stream_free
  sfc: disable softirqs for ptp TX
  ptp: ocp: Select CRC16 in the Kconfig.
  tcp: md5: fix IPv4-mapped support
  virtio-net: fix the race between refill work and close
  mptcp: Do not return EINPROGRESS when subflow creation succeeds
  Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put
  Bluetooth: Always set event mask on suspend
  Bluetooth: mgmt: Fix double free on error path
  wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()
  ice: do not setup vlan for loopback VSI
  ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
  ice: Fix VSIs unable to share unicast MAC
  ice: Fix tunnel checksum offload with fragmented traffic
  ice: Fix max VLANs available for VF
  netfilter: nft_queue: only allow supported familes and hooks
  ...
2022-07-28 11:54:59 -07:00
Dan Carpenter
4d3d3a1b24 stmmac: dwmac-mediatek: fix resource leak in probe
If mediatek_dwmac_clks_config() fails, then call stmmac_remove_config_dt()
before returning.  Otherwise it is a resource leak.

Fixes: fa4b3ca60e ("stmmac: dwmac-mediatek: fix clock issue")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YuJ4aZyMUlG6yGGa@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 10:43:04 -07:00
Ziyang Xuan
85f0173df3 ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr
Change net device's MTU to smaller than IPV6_MIN_MTU or unregister
device while matching route. That may trigger null-ptr-deref bug
for ip6_ptr probability as following.

=========================================================
BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134
Read of size 4 at addr 0000000000000308 by task ping6/263

CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ #14
Call trace:
 dump_backtrace+0x1a8/0x230
 show_stack+0x20/0x70
 dump_stack_lvl+0x68/0x84
 print_report+0xc4/0x120
 kasan_report+0x84/0x120
 __asan_load4+0x94/0xd0
 find_match.part.0+0x70/0x134
 __find_rr_leaf+0x408/0x470
 fib6_table_lookup+0x264/0x540
 ip6_pol_route+0xf4/0x260
 ip6_pol_route_output+0x58/0x70
 fib6_rule_lookup+0x1a8/0x330
 ip6_route_output_flags_noref+0xd8/0x1a0
 ip6_route_output_flags+0x58/0x160
 ip6_dst_lookup_tail+0x5b4/0x85c
 ip6_dst_lookup_flow+0x98/0x120
 rawv6_sendmsg+0x49c/0xc70
 inet_sendmsg+0x68/0x94

Reproducer as following:
Firstly, prepare conditions:
$ip netns add ns1
$ip netns add ns2
$ip link add veth1 type veth peer name veth2
$ip link set veth1 netns ns1
$ip link set veth2 netns ns2
$ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1
$ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2
$ip netns exec ns1 ifconfig veth1 up
$ip netns exec ns2 ifconfig veth2 up
$ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1
$ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1

Secondly, execute the following two commands in two ssh windows
respectively:
$ip netns exec ns1 sh
$while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done

$ip netns exec ns1 sh
$while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done

It is because ip6_ptr has been assigned to NULL in addrconf_ifdown() firstly,
then ip6_ignore_linkdown() accesses ip6_ptr directly without NULL check.

	cpu0			cpu1
fib6_table_lookup
__find_rr_leaf
			addrconf_notify [ NETDEV_CHANGEMTU ]
			addrconf_ifdown
			RCU_INIT_POINTER(dev->ip6_ptr, NULL)
find_match
ip6_ignore_linkdown

So we can add NULL check for ip6_ptr before using in ip6_ignore_linkdown() to
fix the null-ptr-deref bug.

Fixes: dcd1f57295 ("net/ipv6: Remove fib6_idev")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220728013307.656257-1-william.xuanziyang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 10:42:44 -07:00
Kuniyuki Iwashima
e27326009a net: ping6: Fix memleak in ipv6_renew_options().
When we close ping6 sockets, some resources are left unfreed because
pingv6_prot is missing sk->sk_prot->destroy().  As reported by
syzbot [0], just three syscalls leak 96 bytes and easily cause OOM.

    struct ipv6_sr_hdr *hdr;
    char data[24] = {0};
    int fd;

    hdr = (struct ipv6_sr_hdr *)data;
    hdr->hdrlen = 2;
    hdr->type = IPV6_SRCRT_TYPE_4;

    fd = socket(AF_INET6, SOCK_DGRAM, NEXTHDR_ICMP);
    setsockopt(fd, IPPROTO_IPV6, IPV6_RTHDR, data, 24);
    close(fd);

To fix memory leaks, let's add a destroy function.

Note the socket() syscall checks if the GID is within the range of
net.ipv4.ping_group_range.  The default value is [1, 0] so that no
GID meets the condition (1 <= GID <= 0).  Thus, the local DoS does
not succeed until we change the default value.  However, at least
Ubuntu/Fedora/RHEL loosen it.

    $ cat /usr/lib/sysctl.d/50-default.conf
    ...
    -net.ipv4.ping_group_range = 0 2147483647

Also, there could be another path reported with these options, and
some of them require CAP_NET_RAW.

  setsockopt
      IPV6_ADDRFORM (inet6_sk(sk)->pktoptions)
      IPV6_RECVPATHMTU (inet6_sk(sk)->rxpmtu)
      IPV6_HOPOPTS (inet6_sk(sk)->opt)
      IPV6_RTHDRDSTOPTS (inet6_sk(sk)->opt)
      IPV6_RTHDR (inet6_sk(sk)->opt)
      IPV6_DSTOPTS (inet6_sk(sk)->opt)
      IPV6_2292PKTOPTIONS (inet6_sk(sk)->opt)

  getsockopt
      IPV6_FLOWLABEL_MGR (inet6_sk(sk)->ipv6_fl_list)

For the record, I left a different splat with syzbot's one.

  unreferenced object 0xffff888006270c60 (size 96):
    comm "repro2", pid 231, jiffies 4294696626 (age 13.118s)
    hex dump (first 32 bytes):
      01 00 00 00 44 00 00 00 00 00 00 00 00 00 00 00  ....D...........
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<00000000f6bc7ea9>] sock_kmalloc (net/core/sock.c:2564 net/core/sock.c:2554)
      [<000000006d699550>] do_ipv6_setsockopt.constprop.0 (net/ipv6/ipv6_sockglue.c:715)
      [<00000000c3c3b1f5>] ipv6_setsockopt (net/ipv6/ipv6_sockglue.c:1024)
      [<000000007096a025>] __sys_setsockopt (net/socket.c:2254)
      [<000000003a8ff47b>] __x64_sys_setsockopt (net/socket.c:2265 net/socket.c:2262 net/socket.c:2262)
      [<000000007c409dcb>] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
      [<00000000e939c4a9>] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)

[0]: https://syzkaller.appspot.com/bug?extid=a8430774139ec3ab7176

Fixes: 6d0bfe2261 ("net: ipv6: Add IPv6 support to the ping socket.")
Reported-by: syzbot+a8430774139ec3ab7176@syzkaller.appspotmail.com
Reported-by: Ayushman Dutta <ayudutta@amazon.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220728012220.46918-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 10:42:08 -07:00
Linus Torvalds
e64ab2dbd8 watch_queue: Fix missing locking in add_watch_to_object()
If a watch is being added to a queue, it needs to guard against
interference from addition of a new watch, manual removal of a watch and
removal of a watch due to some other queue being destroyed.

KEYCTL_WATCH_KEY guards against this for the same {key,queue} pair by
holding the key->sem writelocked and by holding refs on both the key and
the queue - but that doesn't prevent interaction from other {key,queue}
pairs.

While add_watch_to_object() does take the spinlock on the event queue,
it doesn't take the lock on the source's watch list.  The assumption was
that the caller would prevent that (say by taking key->sem) - but that
doesn't prevent interference from the destruction of another queue.

Fix this by locking the watcher list in add_watch_to_object().

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: syzbot+03d7b43290037d1f87ca@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: keyrings@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-28 10:06:49 -07:00
David Howells
e0339f036e watch_queue: Fix missing rcu annotation
Since __post_watch_notification() walks wlist->watchers with only the
RCU read lock held, we need to use RCU methods to add to the list (we
already use RCU methods to remove from the list).

Fix add_watch_to_object() to use hlist_add_head_rcu() instead of
hlist_add_head() for that list.

Fixes: c73be61ced ("pipe: Add general notification queue support")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-28 10:06:49 -07:00
Dimitris Michailidis
51a83391d7 net/funeth: Fix fun_xdp_tx() and XDP packet reclaim
The current implementation of fun_xdp_tx(), used for XPD_TX, is
incorrect in that it takes an address/length pair and later releases it
with page_frag_free(). It is OK for XDP_TX but the same code is used by
ndo_xdp_xmit. In that case it loses the XDP memory type and releases the
packet incorrectly for some of the types. Assorted breakage follows.

Change fun_xdp_tx() to take xdp_frame and rely on xdp_return_frame() in
reclaim.

Fixes: db37bc177d ("net/funeth: add the data path")
Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
Link: https://lore.kernel.org/r/20220726215923.7887-1-dmichail@fungible.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-28 12:54:10 +02:00
Jakub Kicinski
bf84719df7 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-07-26

This series contains updates to ice driver only.

Przemyslaw corrects accounting for VF VLANs to allow for correct number
of VLANs for untrusted VF. He also correct issue with checksum offload
on VXLAN tunnels.

Ani allows for two VSIs to share the same MAC address.

Maciej corrects checked bits for descriptor completion of loopback

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: do not setup vlan for loopback VSI
  ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
  ice: Fix VSIs unable to share unicast MAC
  ice: Fix tunnel checksum offload with fragmented traffic
  ice: Fix max VLANs available for VF
====================

Link: https://lore.kernel.org/r/20220726204646.2171589-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27 19:56:28 -07:00
Xin Long
181d8d2066 sctp: leave the err path free in sctp_stream_init to sctp_stream_free
A NULL pointer dereference was reported by Wei Chen:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  RIP: 0010:__list_del_entry_valid+0x26/0x80
  Call Trace:
   <TASK>
   sctp_sched_dequeue_common+0x1c/0x90
   sctp_sched_prio_dequeue+0x67/0x80
   __sctp_outq_teardown+0x299/0x380
   sctp_outq_free+0x15/0x20
   sctp_association_free+0xc3/0x440
   sctp_do_sm+0x1ca7/0x2210
   sctp_assoc_bh_rcv+0x1f6/0x340

This happens when calling sctp_sendmsg without connecting to server first.
In this case, a data chunk already queues up in send queue of client side
when processing the INIT_ACK from server in sctp_process_init() where it
calls sctp_stream_init() to alloc stream_in. If it fails to alloc stream_in
all stream_out will be freed in sctp_stream_init's err path. Then in the
asoc freeing it will crash when dequeuing this data chunk as stream_out
is missing.

As we can't free stream out before dequeuing all data from send queue, and
this patch is to fix it by moving the err path stream_out/in freeing in
sctp_stream_init() to sctp_stream_free() which is eventually called when
freeing the asoc in sctp_association_free(). This fix also makes the code
in sctp_process_init() more clear.

Note that in sctp_association_init() when it fails in sctp_stream_init(),
sctp_association_free() will not be called, and in that case it should
go to 'stream_free' err path to free stream instead of 'fail_init'.

Fixes: 5bbbbe32a4 ("sctp: introduce stream scheduler foundations")
Reported-by: Wei Chen <harperchen1110@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/831a3dc100c4908ff76e5bcc363be97f2778bc0b.1658787066.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27 18:23:22 -07:00
Alejandro Lucero
67c3b611d9 sfc: disable softirqs for ptp TX
Sending a PTP packet can imply to use the normal TX driver datapath but
invoked from the driver's ptp worker. The kernel generic TX code
disables softirqs and preemption before calling specific driver TX code,
but the ptp worker does not. Although current ptp driver functionality
does not require it, there are several reasons for doing so:

   1) The invoked code is always executed with softirqs disabled for non
      PTP packets.
   2) Better if a ptp packet transmission is not interrupted by softirq
      handling which could lead to high latencies.
   3) netdev_xmit_more used by the TX code requires preemption to be
      disabled.

Indeed a solution for dealing with kernel preemption state based on static
kernel configuration is not possible since the introduction of dynamic
preemption level configuration at boot time using the static calls
functionality.

Fixes: f79c957a0b ("drivers: net: sfc: use netdev_xmit_more helper")
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Link: https://lore.kernel.org/r/20220726064504.49613-1-alejandro.lucero-palau@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27 18:20:43 -07:00
Jonathan Lemon
0c10455626 ptp: ocp: Select CRC16 in the Kconfig.
The crc16() function is used to check the firmware validity, but
the library was not explicitly selected.

Fixes: 3c3673bde5 ("ptp: ocp: Add firmware header checks")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Vadim Fedorenko <vadfed@fb.com>
Link: https://lore.kernel.org/r/20220726220604.1339972-1-jonathan.lemon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27 18:11:34 -07:00
Jernej Skrabec
8dc592c41f clk: sunxi-ng: Fix H6 RTC clock definition
While RTC clock was added in H616 ccu_common list, it was not in H6
list. That caused invalid pointer dereference like this:

Unable to handle kernel NULL pointer dereference at virtual address 000000000000020c
Mem abort info:
  ESR = 0x96000004
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x04: level 0 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000004
  CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=000000004d574000
[000000000000020c] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
CPU: 3 PID: 339 Comm: cat Tainted: G    B             5.18.0-rc1+ #1352
Hardware name: Tanix TX6 (DT)
pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : ccu_gate_is_enabled+0x48/0x74
lr : ccu_gate_is_enabled+0x40/0x74
sp : ffff80000c0b76d0
x29: ffff80000c0b76d0 x28: 00000000016e3600 x27: 0000000000000000
x26: 0000000000000000 x25: 0000000000000002 x24: ffff00000952fe08
x23: ffff800009611400 x22: ffff00000952fe79 x21: 0000000000000000
x20: 0000000000000001 x19: ffff80000aad6f08 x18: 0000000000000000
x17: 2d2d2d2d2d2d2d2d x16: 2d2d2d2d2d2d2d2d x15: 2d2d2d2d2d2d2d2d
x14: 0000000000000000 x13: 00000000f2f2f2f2 x12: ffff700001816e89
x11: 1ffff00001816e88 x10: ffff700001816e88 x9 : dfff800000000000
x8 : ffff80000c0b7447 x7 : 0000000000000001 x6 : ffff700001816e88
x5 : ffff80000c0b7440 x4 : 0000000000000001 x3 : ffff800008935c50
x2 : dfff800000000000 x1 : 0000000000000000 x0 : 000000000000020c
Call trace:
 ccu_gate_is_enabled+0x48/0x74
 clk_core_is_enabled+0x7c/0x1c0
 clk_summary_show_subtree+0x1dc/0x334
 clk_summary_show_subtree+0x250/0x334
 clk_summary_show_subtree+0x250/0x334
 clk_summary_show_subtree+0x250/0x334
 clk_summary_show_subtree+0x250/0x334
 clk_summary_show+0x90/0xdc
 seq_read_iter+0x248/0x6d4
 seq_read+0x17c/0x1fc
 full_proxy_read+0x90/0xf0
 vfs_read+0xdc/0x28c
 ksys_read+0xc8/0x174
 __arm64_sys_read+0x44/0x5c
 invoke_syscall+0x60/0x190
 el0_svc_common.constprop.0+0x7c/0x160
 do_el0_svc+0x38/0xa0
 el0_svc+0x68/0x160
 el0t_64_sync_handler+0x10c/0x140
 el0t_64_sync+0x18c/0x190
Code: d1006260 97e5c981 785e8260 8b0002a0 (b9400000)
---[ end trace 0000000000000000 ]---

Fix that by adding rtc clock to H6 ccu_common list too.

Fixes: 38d321b61b ("clk: sunxi-ng: h6-r: Add RTC gate clock")
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220719183725.2605141-1-jernej.skrabec@gmail.com
Reviewed-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-07-27 16:45:58 -07:00
Eric Dumazet
e62d2e1103 tcp: md5: fix IPv4-mapped support
After the blamed commit, IPv4 SYN packets handled
by a dual stack IPv6 socket are dropped, even if
perfectly valid.

$ nstat | grep MD5
TcpExtTCPMD5Failure             5                  0.0

For a dual stack listener, an incoming IPv4 SYN packet
would call tcp_inbound_md5_hash() with @family == AF_INET,
while tp->af_specific is pointing to tcp_sock_ipv6_specific.

Only later when an IPv4-mapped child is created, tp->af_specific
is changed to tcp_sock_ipv6_mapped_specific.

Fixes: 7bbb765b73 ("net/tcp: Merge TCP-MD5 inbound callbacks")
Reported-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Tested-by: Leonard Crestez <cdleonard@gmail.com>
Link: https://lore.kernel.org/r/20220726115743.2759832-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27 10:18:21 -07:00
Florian Fainelli
fb0fd3469e ARM: 9216/1: Fix MAX_DMA_ADDRESS overflow
Commit 26f09e9b3a ("mm/memblock: add memblock memory allocation apis")
added a check to determine whether arm_dma_zone_size is exceeding the
amount of kernel virtual address space available between the upper 4GB
virtual address limit and PAGE_OFFSET in order to provide a suitable
definition of MAX_DMA_ADDRESS that should fit within the 32-bit virtual
address space. The quantity used for comparison was off by a missing
trailing 0, leading to MAX_DMA_ADDRESS to be overflowing a 32-bit
quantity.

This was caught thanks to CONFIG_DEBUG_VIRTUAL on the bcm2711 platform
where we define a dma_zone_size of 1GB and we have a PAGE_OFFSET value
of 0xc000_0000 (CONFIG_VMSPLIT_3G) leading to MAX_DMA_ADDRESS being
0x1_0000_0000 which overflows the unsigned long type used throughout
__pa() and then __virt_addr_valid(). Because the virtual address passed
to __virt_addr_valid() would now be 0, the function would loudly warn
and flood the kernel log, thus making the platform unable to boot
properly.

Fixes: 26f09e9b3a ("mm/memblock: add memblock memory allocation apis")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-07-27 17:55:01 +01:00
Linus Torvalds
6e7765cb47 Merge tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic fixes from Arnd Bergmann:
 "Two more bug fixes for asm-generic, one addressing an incorrect
  Kconfig symbol reference and another one fixing a build failure for
  the perf tool on mips and possibly others"

* tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: remove a broken and needless ifdef conditional
  tools: Fixed MIPS builds due to struct flock re-definition
2022-07-27 09:50:18 -07:00
Linus Torvalds
9d8a8616ee Merge tag 'soc-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "One last set of changes for the soc tree:

   - fix clock frequency on lan966x

   - fix incorrect GPIO numbers on some pxa machines

   - update Baolin's email address"

* tag 'soc-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: pxa2xx: Fix GPIO descriptor tables
  mailmap: update Baolin Wang's email
  ARM: dts: lan966x: fix sys_clk frequency
2022-07-27 09:43:07 -07:00
Borislav Petkov
5bb6c1d112 Revert "x86/sev: Expose sev_es_ghcb_hv_call() for use by HyperV"
This reverts commit 007faec014.

Now that hyperv does its own protocol negotiation:

  49d6a3c062 ("x86/Hyper-V: Add SEV negotiate protocol support in Isolation VM")

revert this exposure of the sev_es_ghcb_hv_call() helper.

Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by:Tianyu Lan <tiala@microsoft.com>
Link: https://lore.kernel.org/r/20220614014553.1915929-1-ltykernel@gmail.com
2022-07-27 18:09:13 +02:00
Lukas Bulwahn
871808fd69 x86/configs: Update configs in x86_debug.config
Commit

  4675ff05de ("kmemcheck: rip it out")

removed kmemcheck and its corresponding build config KMEMCHECK.

Commit

  0f620cefd7 ("objtool: Rename "VMLINUX_VALIDATION" -> "NOINSTR_VALIDATION"")

renamed the debug config option.

Adjust x86_debug.config to those changes in debug configs.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220722121815.27535-1-lukas.bulwahn@gmail.com
2022-07-27 18:09:11 +02:00
Jens Axboe
eda3953b6a Merge tag 'nvme-5.19-2022-07-27' of git://git.infradead.org/nvme into block-5.19
Pull NVMe fix from Christoph:

"nvme fix for Linux 5.19

 - yet another duplicate ID quirk (Tobias Gruetzmacher)"

* tag 'nvme-5.19-2022-07-27' of git://git.infradead.org/nvme:
  nvme-pci: Crucial P2 has bogus namespace ids
2022-07-27 10:03:40 -06:00
Ian Rogers
9a24180567 perf bpf: Remove undefined behavior from bpf_perf_object__next()
bpf_perf_object__next() folded the last element in the list test with the
empty list test. However, this meant that offsets were computed against
null and that a struct list_head was compared against a 'struct
bpf_perf_object'.

Working around this with clang's undefined behavior sanitizer required
-fno-sanitize=null and -fno-sanitize=object-size.

Remove the undefined behavior by using the regular Linux list APIs and
handling the starting case separately from the end testing case.

Looking at uses like bpf_perf_object__for_each(), as the constant NULL
or non-NULL argument can be constant propagated, the code is no less
efficient.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Christy Lee <christylee@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220726220921.2567761-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:19:39 -03:00
Leo Yan
882528d2e7 perf symbol: Skip symbols if SHF_ALLOC flag is not set
Some symbols are observed with the 'st_value' field zeroed.  E.g.
libc.so.6 in Ubuntu contains a symbol '__evoke_link_warning_getwd' which
resides in the '.gnu.warning.getwd' section.

Unlike normal sections, such kind of sections are used for linker
warning when a file calls deprecated functions, but they are not part of
memory images, the symbols in these sections should be dropped.

This patch checks the section attribute SHF_ALLOC bit, if the bit is not
set, it skips symbols to avoid spurious ones.

Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chang Rui <changruinj@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Leo Yan
2d86612aac perf symbol: Correct address for bss symbols
When using 'perf mem' and 'perf c2c', an issue is observed that tool
reports the wrong offset for global data symbols.  This is a common
issue on both x86 and Arm64 platforms.

Let's see an example, for a test program, below is the disassembly for
its .bss section which is dumped with objdump:

  ...

  Disassembly of section .bss:

  0000000000004040 <completed.0>:
  	...

  0000000000004080 <buf1>:
  	...

  00000000000040c0 <buf2>:
  	...

  0000000000004100 <thread>:
  	...

First we used 'perf mem record' to run the test program and then used
'perf --debug verbose=4 mem report' to observe what's the symbol info
for 'buf1' and 'buf2' structures.

  # ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
  # ./perf --debug verbose=4 mem report
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
    symbol__new: buf2 0x30a8-0x30e8
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
    symbol__new: buf1 0x3068-0x30a8
    ...

The perf tool relies on libelf to parse symbols, in executable and
shared object files, 'st_value' holds a virtual address; 'sh_addr' is
the address at which section's first byte should reside in memory, and
'sh_offset' is the byte offset from the beginning of the file to the
first byte in the section.  The perf tool uses below formula to convert
a symbol's memory address to a file address:

  file_address = st_value - sh_addr + sh_offset
                    ^
                    ` Memory address

We can see the final adjusted address ranges for buf1 and buf2 are
[0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
compiler attribute with 64-byte alignment.

The problem happens for 'sh_offset', libelf returns it as 0x3028 which
is not 64-byte aligned, combining with disassembly, it's likely libelf
doesn't respect the alignment for .bss section, therefore, it doesn't
return the aligned value for 'sh_offset'.

Suggested by Fangrui Song, ELF file contains program header which
contains PT_LOAD segments, the fields p_vaddr and p_offset in PT_LOAD
segments contain the execution info.  A better choice for converting
memory address to file address is using the formula:

  file_address = st_value - p_vaddr + p_offset

This patch introduces elf_read_program_header() which returns the
program header based on the passed 'st_value', then it uses the formula
above to calculate the symbol file address; and the debugging log is
updated respectively.

After applying the change:

  # ./perf --debug verbose=4 mem report
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 p_vaddr: 0x3d28 p_offset: 0x2d28
    symbol__new: buf2 0x30c0-0x3100
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x4080 p_vaddr: 0x3d28 p_offset: 0x2d28
    symbol__new: buf1 0x3080-0x30c0
    ...

Fixes: f17e04afaf ("perf report: Fix ELF symbol parsing")
Reported-by: Chang Rui <changruinj@gmail.com>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Leo Yan
b226521923 perf scripts python: Let script to be python2 compliant
The mainline kernel can be used for relative old distros, e.g. RHEL 7.
The distro doesn't upgrade from python2 to python3, this causes the
building error that the python script is not python2 compliant.

To fix the building failure, this patch changes from the python f-string
format to traditional string format.

Fixes: 12fdd6c009 ("perf scripts python: Support Arm CoreSight trace data disassembly")
Reported-by: Akemi Yagi <toracat@elrepo.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: ElRepo <contact@elrepo.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220725104220.1106663-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Arnaldo Carvalho de Melo
553de6e115 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  28a99e95f5 ("x86/amd: Use IBPB for firmware calls")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Link: https://lore.kernel.org/lkml/Yt6oWce9UDAmBAtX@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Jason Wang
5a159128fa virtio-net: fix the race between refill work and close
We try using cancel_delayed_work_sync() to prevent the work from
enabling NAPI. This is insufficient since we don't disable the source
of the refill work scheduling. This means an NAPI poll callback after
cancel_delayed_work_sync() can schedule the refill work then can
re-enable the NAPI that leads to use-after-free [1].

Since the work can enable NAPI, we can't simply disable NAPI before
calling cancel_delayed_work_sync(). So fix this by introducing a
dedicated boolean to control whether or not the work could be
scheduled from NAPI.

[1]
==================================================================
BUG: KASAN: use-after-free in refill_work+0x43/0xd4
Read of size 2 at addr ffff88810562c92e by task kworker/2:1/42

CPU: 2 PID: 42 Comm: kworker/2:1 Not tainted 5.19.0-rc1+ #480
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Workqueue: events refill_work
Call Trace:
 <TASK>
 dump_stack_lvl+0x34/0x44
 print_report.cold+0xbb/0x6ac
 ? _printk+0xad/0xde
 ? refill_work+0x43/0xd4
 kasan_report+0xa8/0x130
 ? refill_work+0x43/0xd4
 refill_work+0x43/0xd4
 process_one_work+0x43d/0x780
 worker_thread+0x2a0/0x6f0
 ? process_one_work+0x780/0x780
 kthread+0x167/0x1a0
 ? kthread_exit+0x50/0x50
 ret_from_fork+0x22/0x30
 </TASK>
...

Fixes: b2baed69e6 ("virtio_net: set/cancel work on ndo_open/ndo_stop")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-27 13:20:44 +01:00
Toshi Kani
5e2805d537 EDAC/ghes: Set the DIMM label unconditionally
The commit

  cb51a371d0 ("EDAC/ghes: Setup DIMM label from DMI and use it in error reports")

enforced that both the bank and device strings passed to
dimm_setup_label() are not NULL.

However, there are BIOSes, for example on a

  HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019

which don't populate both strings:

  Handle 0x0020, DMI type 17, 84 bytes
  Memory Device
          Array Handle: 0x0013
          Error Information Handle: Not Provided
          Total Width: 72 bits
          Data Width: 64 bits
          Size: 32 GB
          Form Factor: DIMM
          Set: None
          Locator: PROC 1 DIMM 1        <===== device
          Bank Locator: Not Specified   <===== bank

This results in a buffer overflow because ghes_edac_register() calls
strlen() on an uninitialized label, which had non-zero values left over
from krealloc_array():

  detected buffer overflow in __fortify_strlen
   ------------[ cut here ]------------
   kernel BUG at lib/string_helpers.c:983!
   invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
   CPU: 1 PID: 1 Comm: swapper/0 Tainted: G          I       5.18.6-200.fc36.x86_64 #1
   Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019
   RIP: 0010:fortify_panic
   ...
   Call Trace:
    <TASK>
    ghes_edac_register.cold
    ghes_probe
    platform_probe
    really_probe
    __driver_probe_device
    driver_probe_device
    __driver_attach
    ? __device_attach_driver
    bus_for_each_dev
    bus_add_driver
    driver_register
    acpi_ghes_init
    acpi_init
    ? acpi_sleep_proc_init
    do_one_initcall

The label contains garbage because the commit in Fixes reallocs the
DIMMs array while scanning the system but doesn't clear the newly
allocated memory.

Change dimm_setup_label() to always initialize the label to fix the
issue. Set it to the empty string in case BIOS does not provide both
bank and device so that ghes_edac_register() can keep the default label
given by edac_mc_alloc_dimms().

  [ bp: Rewrite commit message. ]

Fixes: b9cae27728 ("EDAC/ghes: Scan the system once on driver init")
Co-developed-by: Robert Richter <rric@kernel.org>
Signed-off-by: Robert Richter <rric@kernel.org>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Robert Elliott <elliott@hpe.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220719220124.760359-1-toshi.kani@hpe.com
2022-07-27 10:42:52 +02:00
Mat Martineau
b5177ed92b mptcp: Do not return EINPROGRESS when subflow creation succeeds
New subflows are created within the kernel using O_NONBLOCK, so
EINPROGRESS is the expected return value from kernel_connect().
__mptcp_subflow_connect() has the correct logic to consider EINPROGRESS
to be a successful case, but it has also used that error code as its
return value.

Before v5.19 this was benign: all the callers ignored the return
value. Starting in v5.19 there is a MPTCP_PM_CMD_SUBFLOW_CREATE generic
netlink command that does use the return value, so the EINPROGRESS gets
propagated to userspace.

Make __mptcp_subflow_connect() always return 0 on success instead.

Fixes: ec3edaa7ca ("mptcp: Add handling of outgoing MP_JOIN requests")
Fixes: 702c2f646d ("mptcp: netlink: allow userspace-driven subflow establishment")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Link: https://lore.kernel.org/r/20220725205231.87529-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 19:57:55 -07:00
Jakub Kicinski
e77ea97d2b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:

====================
netfilter updates for net

Three late fixes for netfilter:

1) If nf_queue user requests packet truncation below size of l3 header,
   we corrupt the skb, then crash.  Reject such requests.

2) add cond_resched() calls when doing cycle detection in the
   nf_tables graph.  This avoids softlockup warning with certain
   rulesets.

3) Reject rulesets that use nftables 'queue' expression in family/chain
   combinations other than those that are supported.  Currently the ruleset
   will load, but when userspace attempts to reinject you get WARN splat +
   packet drops.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_queue: only allow supported familes and hooks
  netfilter: nf_tables: add rescheduling points during loop detection walks
  netfilter: nf_queue: do not allow packet truncation below transport header offset
====================

Link: https://lore.kernel.org/r/20220726192056.13497-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 19:53:09 -07:00
Jakub Kicinski
e53f529397 Merge tag 'for-net-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - Fix early wakeup after suspend
 - Fix double free on error
 - Fix use-after-free on l2cap_chan_put

* tag 'for-net-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put
  Bluetooth: Always set event mask on suspend
  Bluetooth: mgmt: Fix double free on error path
====================

Link: https://lore.kernel.org/r/20220726221328.423714-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 19:48:24 -07:00
Linus Torvalds
39c3c396f8 Merge tag 'mm-hotfixes-stable-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "Thirteen hotfixes.

  Eight are cc:stable and the remainder are for post-5.18 issues or are
  too minor to warrant backporting"

* tag 'mm-hotfixes-stable-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: update Gao Xiang's email addresses
  userfaultfd: provide properly masked address for huge-pages
  Revert "ocfs2: mount shared volume without ha stack"
  hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte
  fs: sendfile handles O_NONBLOCK of out_fd
  ntfs: fix use-after-free in ntfs_ucsncmp()
  secretmem: fix unhandled fault in truncate
  mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
  mm: fix missing wake-up event for FSDAX pages
  mm: fix page leak with multiple threads mapping the same page
  mailmap: update Seth Forshee's email address
  tmpfs: fix the issue that the mount and remount results are inconsistent.
  mm: kfence: apply kmemleak_ignore_phys on early allocated pool
2022-07-26 19:38:46 -07:00
Bart Van Assche
f5c2976e0c scsi: ufs: core: Fix a race condition related to device management
If a device management command completion happens after
wait_for_completion_timeout() times out and before ufshcd_clear_cmds() is
called, then the completion code may crash on the complete() call in
__ufshcd_transfer_req_compl().

Fix the following crash:

  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
  Call trace:
   complete+0x64/0x178
   __ufshcd_transfer_req_compl+0x30c/0x9c0
   ufshcd_poll+0xf0/0x208
   ufshcd_sl_intr+0xb8/0xf0
   ufshcd_intr+0x168/0x2f4
   __handle_irq_event_percpu+0xa0/0x30c
   handle_irq_event+0x84/0x178
   handle_fasteoi_irq+0x150/0x2e8
   __handle_domain_irq+0x114/0x1e4
   gic_handle_irq.31846+0x58/0x300
   el1_irq+0xe4/0x1c0
   efi_header_end+0x110/0x680
   __irq_exit_rcu+0x108/0x124
   __handle_domain_irq+0x118/0x1e4
   gic_handle_irq.31846+0x58/0x300
   el1_irq+0xe4/0x1c0
   cpuidle_enter_state+0x3ac/0x8c4
   do_idle+0x2fc/0x55c
   cpu_startup_entry+0x84/0x90
   kernel_init+0x0/0x310
   start_kernel+0x0/0x608
   start_kernel+0x4ec/0x608

Link: https://lore.kernel.org/r/20220720170228.1598842-1-bvanassche@acm.org
Fixes: 5a0b0cb9be ("[SCSI] ufs: Add support for sending NOP OUT UPIU")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-26 21:59:29 -04:00
Jason Yan
d9a434fa0c scsi: core: Fix warning in scsi_alloc_sgtables()
As explained in SG_IO howto[1]:

"If iovec_count is non-zero then 'dxfer_len' should be equal to the sum of
iov_len lengths. If not, the minimum of the two is the transfer length."

When iovec_count is non-zero and dxfer_len is zero, the sg_io() just
genarated a null bio, and finally caused a warning below. To fix it, skip
generating a bio for this request if dxfer_len is zero.

[1] https://tldp.org/HOWTO/SCSI-Generic-HOWTO/x198.html

WARNING: CPU: 2 PID: 3643 at drivers/scsi/scsi_lib.c:1032 scsi_alloc_sgtables+0xc7d/0xf70 drivers/scsi/scsi_lib.c:1032
Modules linked in:

CPU: 2 PID: 3643 Comm: syz-executor397 Not tainted
5.17.0-rc3-syzkaller-00316-gb81b1829e7e3 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-204/01/2014
RIP: 0010:scsi_alloc_sgtables+0xc7d/0xf70 drivers/scsi/scsi_lib.c:1032
Code: e7 fc 31 ff 44 89 f6 e8 c1 4e e7 fc 45 85 f6 0f 84 1a f5 ff ff e8
93 4c e7 fc 83 c5 01 0f b7 ed e9 0f f5 ff ff e8 83 4c e7 fc <0f> 0b 41
   bc 0a 00 00 00 e9 2b fb ff ff 41 bc 09 00 00 00 e9 20 fb
RSP: 0018:ffffc90000d07558 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88801bfc96a0 RCX: 0000000000000000
RDX: ffff88801c876000 RSI: ffffffff849060bd RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff849055b9 R11: 0000000000000000 R12: ffff888012b8c000
R13: ffff88801bfc9580 R14: 0000000000000000 R15: ffff88801432c000
FS:  00007effdec8e700(0000) GS:ffff88802cc00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007effdec6d718 CR3: 00000000206d6000 CR4: 0000000000150ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 scsi_setup_scsi_cmnd drivers/scsi/scsi_lib.c:1219 [inline]
 scsi_prepare_cmd drivers/scsi/scsi_lib.c:1614 [inline]
 scsi_queue_rq+0x283e/0x3630 drivers/scsi/scsi_lib.c:1730
 blk_mq_dispatch_rq_list+0x6ea/0x22e0 block/blk-mq.c:1851
 __blk_mq_sched_dispatch_requests+0x20b/0x410 block/blk-mq-sched.c:299
 blk_mq_sched_dispatch_requests+0xfb/0x180 block/blk-mq-sched.c:332
 __blk_mq_run_hw_queue+0xf9/0x350 block/blk-mq.c:1968
 __blk_mq_delay_run_hw_queue+0x5b6/0x6c0 block/blk-mq.c:2045
 blk_mq_run_hw_queue+0x30f/0x480 block/blk-mq.c:2096
 blk_mq_sched_insert_request+0x340/0x440 block/blk-mq-sched.c:451
 blk_execute_rq+0xcc/0x340 block/blk-mq.c:1231
 sg_io+0x67c/0x1210 drivers/scsi/scsi_ioctl.c:485
 scsi_ioctl_sg_io drivers/scsi/scsi_ioctl.c:866 [inline]
 scsi_ioctl+0xa66/0x1560 drivers/scsi/scsi_ioctl.c:921
 sd_ioctl+0x199/0x2a0 drivers/scsi/sd.c:1576
 blkdev_ioctl+0x37a/0x800 block/ioctl.c:588
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:874 [inline]
 __se_sys_ioctl fs/ioctl.c:860 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7effdecdc5d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 81 14 00 00 90 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007effdec8e2f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007effded664c0 RCX: 00007effdecdc5d9
RDX: 0000000020002300 RSI: 0000000000002285 RDI: 0000000000000004
RBP: 00007effded34034 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 00007effded34054 R14: 2f30656c69662f2e R15: 00007effded664c8

Link: https://lore.kernel.org/r/20220720025120.3226770-1-yanaijie@huawei.com
Fixes: 25636e282f ("block: fix SG_IO vector request data length handling")
Reported-by: syzbot+d44b35ecfb807e5af0b5@syzkaller.appspotmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-26 21:54:30 -04:00
Liang He
a3435afba8 scsi: ufs: host: Hold reference returned by of_parse_phandle()
In ufshcd_populate_vreg(), we should hold the reference returned by
of_parse_phandle() and then use it to call of_node_put() for refcount
balance.

Link: https://lore.kernel.org/r/20220719071529.1081166-1-windhl@126.com
Fixes: aa49761309 ("ufs: Add regulator enable support")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-26 21:51:40 -04:00
David Jeffery
0fde22c542 scsi: mpt3sas: Stop fw fault watchdog work item during system shutdown
During system shutdown or reboot, mpt3sas will reset the firmware back to
ready state. However, the driver leaves running a watchdog work item
intended to keep the firmware in operational state. This causes a second,
unneeded reset on shutdown and moves the firmware back to operational
instead of in ready state as intended. And if the mpt3sas_fwfault_debug
module parameter is set, this extra reset also panics the system.

mpt3sas's scsih_shutdown needs to stop the watchdog before resetting the
firmware back to ready state.

Link: https://lore.kernel.org/r/20220722142448.6289-1-djeffery@redhat.com
Fixes: fae21608c3 ("scsi: mpt3sas: Transition IOC to Ready state during shutdown")
Tested-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-26 21:40:43 -04:00
Gao Xiang
1f7ea54727 mailmap: update Gao Xiang's email addresses
I've been in Alibaba Cloud for more than one year, mainly to address
cloud-native challenges (such as high-performance container images) for
open source communities.

Update my email addresses on behalf of my current employer (Alibaba Cloud)
to support all my (team) work in this area.  Also add an outdated
@redhat.com address of me.

Link: https://lkml.kernel.org/r/20220719154246.62970-1-xiang@kernel.org
Signed-off-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-26 18:25:01 -07:00
Nadav Amit
d172b1a3bd userfaultfd: provide properly masked address for huge-pages
Commit 824ddc601a ("userfaultfd: provide unmasked address on
page-fault") was introduced to fix an old bug, in which the offset in the
address of a page-fault was masked.  Concerns were raised - although were
never backed by actual code - that some userspace code might break because
the bug has been around for quite a while.  To address these concerns a
new flag was introduced, and only when this flag is set by the user,
userfaultfd provides the exact address of the page-fault.

The commit however had a bug, and if the flag is unset, the offset was
always masked based on a base-page granularity.  Yet, for huge-pages, the
behavior prior to the commit was that the address is masked to the
huge-page granulrity.

While there are no reports on real breakage, fix this issue.  If the flag
is unset, use the address with the masking that was done before.

Link: https://lkml.kernel.org/r/20220711165906.2682-1-namit@vmware.com
Fixes: 824ddc601a ("userfaultfd: provide unmasked address on page-fault")
Signed-off-by: Nadav Amit <namit@vmware.com>
Reported-by: James Houghton <jthoughton@google.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-26 18:25:01 -07:00
Luiz Augusto von Dentz
d0be8347c6 Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put
This fixes the following trace which is caused by hci_rx_work starting up
*after* the final channel reference has been put() during sock_close() but
*before* the references to the channel have been destroyed, so instead
the code now rely on kref_get_unless_zero/l2cap_chan_hold_unless_zero to
prevent referencing a channel that is about to be destroyed.

  refcount_t: increment on 0; use-after-free.
  BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
  Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705

  CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W
  4.14.234-00003-g1fb6d0bd49a4-dirty #28
  Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150
  Google Inc. MSM sm8150 Flame DVT (DT)
  Workqueue: hci0 hci_rx_work
  Call trace:
   dump_backtrace+0x0/0x378
   show_stack+0x20/0x2c
   dump_stack+0x124/0x148
   print_address_description+0x80/0x2e8
   __kasan_report+0x168/0x188
   kasan_report+0x10/0x18
   __asan_load4+0x84/0x8c
   refcount_dec_and_test+0x20/0xd0
   l2cap_chan_put+0x48/0x12c
   l2cap_recv_frame+0x4770/0x6550
   l2cap_recv_acldata+0x44c/0x7a4
   hci_acldata_packet+0x100/0x188
   hci_rx_work+0x178/0x23c
   process_one_work+0x35c/0x95c
   worker_thread+0x4cc/0x960
   kthread+0x1a8/0x1c4
   ret_from_fork+0x10/0x18

Cc: stable@kernel.org
Reported-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-26 13:35:24 -07:00
Abhishek Pandit-Subedi
ef61b6ea15 Bluetooth: Always set event mask on suspend
When suspending, always set the event mask once disconnects are
successful. Otherwise, if wakeup is disallowed, the event mask is not
set before suspend continues and can result in an early wakeup.

Fixes: 182ee45da0 ("Bluetooth: hci_sync: Rework hci_suspend_notifier")
Cc: stable@vger.kernel.org
Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-26 13:35:13 -07:00
Dan Carpenter
4b2f4e072f Bluetooth: mgmt: Fix double free on error path
Don't call mgmt_pending_remove() twice (double free).

Fixes: 6b88eff43704 ("Bluetooth: hci_sync: Refactor remove Adv Monitor")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-26 13:32:40 -07:00
Tetsuo Handa
aa40d5a435 wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()
lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1],
for commit f856373e2f ("wifi: mac80211: do not wake queues on a vif
that is being stopped") guards clear_bit() using fq.lock even before
fq_init() from ieee80211_txq_setup_flows() initializes this spinlock.

According to discussion [2], Toke was not happy with expanding usage of
fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we
can instead use synchronize_rcu() for flushing ieee80211_wake_txqs().

Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1]
Link: https://lkml.kernel.org/r/874k0zowh2.fsf@toke.dk [2]
Reported-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: f856373e2f ("wifi: mac80211: do not wake queues on a vif that is being stopped")
Tested-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@kernel.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/9cc9b81d-75a3-3925-b612-9d0ad3cab82b@I-love.SAKURA.ne.jp
[ pick up commit 3598cb6e18 ("wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()") from -next]
Link: https://lore.kernel.org/all/87o7xcq6qt.fsf@kernel.org/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26 13:23:05 -07:00
Maciej Fijalkowski
cc019545a2 ice: do not setup vlan for loopback VSI
Currently loopback test is failiing due to the error returned from
ice_vsi_vlan_setup(). Skip calling it when preparing loopback VSI.

Fixes: 0e674aeb0b ("ice: Add handler for ethtool selftest")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 13:15:36 -07:00
Maciej Fijalkowski
283d736ff7 ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
Tx side sets EOP and RS bits on descriptors to indicate that a
particular descriptor is the last one and needs to generate an irq when
it was sent. These bits should not be checked on completion path
regardless whether it's the Tx or the Rx. DD bit serves this purpose and
it indicates that a particular descriptor is either for Rx or was
successfully Txed. EOF is also set as loopback test does not xmit
fragmented frames.

Look at (DD | EOF) bits setting in ice_lbtest_receive_frames() instead
of EOP and RS pair.

Fixes: 0e674aeb0b ("ice: Add handler for ethtool selftest")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 13:15:36 -07:00
Anirudh Venkataramanan
5c8e3c7ff3 ice: Fix VSIs unable to share unicast MAC
The driver currently does not allow two VSIs in the same PF domain
to have the same unicast MAC address. This is incorrect in the sense
that a policy decision is being made in the driver when it must be
left to the user. This approach was causing issues when rebooting
the system with VFs spawned not being able to change their MAC addresses.
Such errors were present in dmesg:

[ 7921.068237] ice 0000:b6:00.2 ens2f2: Unicast MAC 6a:0d:e4:70:ca:d1 already
exists on this PF. Preventing setting VF 7 unicast MAC address to 6a:0d:e4:70:ca:d1

Fix that by removing this restriction. Doing this also allows
us to remove some additional code that's checking if a unicast MAC
filter already exists.

Fixes: 47ebc7b024 ("ice: Check if unicast MAC exists before setting VF MAC")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 13:15:36 -07:00
Przemyslaw Patynowski
01658aeead ice: Fix tunnel checksum offload with fragmented traffic
Fix checksum offload on VXLAN tunnels.
In case, when mpls protocol is not used, set l4 header to transport
header of skb. This fixes case, when user tries to offload checksums
of VXLAN tunneled traffic.

Steps for reproduction (requires link partner with tunnels):
ip l s enp130s0f0 up
ip a f enp130s0f0
ip a a 10.10.110.2/24 dev enp130s0f0
ip l s enp130s0f0 mtu 1600
ip link add vxlan12_sut type vxlan id 12 group 238.168.100.100 dev enp130s0f0 dstport 4789
ip l s vxlan12_sut up
ip a a 20.10.110.2/24 dev vxlan12_sut
iperf3 -c 20.10.110.1 #should connect

Offload params: td_offset, cd_tunnel_params were
corrupted, due to l4 header pointing wrong address. NIC would then drop
those packets internally, due to incorrect TX descriptor data,
which increased GLV_TEPC register.

Fixes: 69e66c04c6 ("ice: Add mpls+tso support")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 13:15:36 -07:00
Przemyslaw Patynowski
1e308c6fb7 ice: Fix max VLANs available for VF
Legacy VLAN implementation allows for untrusted VF to have 8 VLAN
filters, not counting VLAN 0 filters. Current VLAN_V2 implementation
lowers available filters for VF, by counting in VLAN 0 filter for both
TPIDs.
Fix this by counting only non zero VLAN filters.
Without this patch, untrusted VF would not be able to access 8 VLAN
filters.

Fixes: cc71de8fa1 ("ice: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26 13:15:18 -07:00
Florian Westphal
47f4f510ad netfilter: nft_queue: only allow supported familes and hooks
Trying to use 'queue' statement in ingress (for example)
triggers a splat on reinject:

WARNING: CPU: 3 PID: 1345 at net/netfilter/nf_queue.c:291

... because nf_reinject cannot find the ruleset head.

The netdev family doesn't support async resume at the moment anyway,
so disallow loading such rulesets with a more appropriate
error message.

v2: add 'validate' callback and also check hook points, v1 did
allow ingress use in 'table inet', but that doesn't work either. (Pablo)

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-07-26 21:12:42 +02:00
Florian Westphal
81ea010667 netfilter: nf_tables: add rescheduling points during loop detection walks
Add explicit rescheduling points during ruleset walk.

Switching to a faster algorithm is possible but this is a much
smaller change, suitable for nf tree.

Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1460
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-07-26 21:12:42 +02:00
Florian Westphal
99a63d36cb netfilter: nf_queue: do not allow packet truncation below transport header offset
Domingo Dirutigliano and Nicola Guerrera report kernel panic when
sending nf_queue verdict with 1-byte nfta_payload attribute.

The IP/IPv6 stack pulls the IP(v6) header from the packet after the
input hook.

If user truncates the packet below the header size, this skb_pull() will
result in a malformed skb (skb->len < 0).

Fixes: 7af4cc3fa1 ("[NETFILTER]: Add "nfnetlink_queue" netfilter queue handler over nfnetlink")
Reported-by: Domingo Dirutigliano <pwnzer0tt1@proton.me>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-07-26 21:12:42 +02:00
Linus Torvalds
5de64d4496 Merge tag 's390-5.19-7' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fix from Alexander GordeevL

 - Prevent relatively slow PRNO TRNG random number operation from being
   called from interrupt context. That could for example cause some
   network loads to timeout.

* tag 's390-5.19-7' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/archrandom: prevent CPACF trng invocations in interrupt context
2022-07-26 10:03:53 -07:00
Qi Zheng
cdb281e638 mm: fix NULL pointer dereference in wp_page_reuse()
The vmf->page can be NULL when the wp_page_reuse() is invoked by
wp_pfn_shared(), it will cause the following panic:

  BUG: kernel NULL pointer dereference, address: 000000000000008
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP PTI
  CPU: 18 PID: 923 Comm: Xorg Not tainted 5.19.0-rc8.bm.1-amd64 #263
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g14
  RIP: 0010:_compound_head+0x0/0x40
  [...]
  Call Trace:
    wp_page_reuse+0x1c/0xa0
    do_wp_page+0x1a5/0x3f0
    __handle_mm_fault+0x8cf/0xd20
    handle_mm_fault+0xd5/0x2a0
    do_user_addr_fault+0x1d0/0x680
    exc_page_fault+0x78/0x170
    asm_exc_page_fault+0x22/0x30

To fix it, this patch performs a NULL pointer check before dereferencing
the vmf->page.

Fixes: 6c287605fd ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-26 09:21:43 -07:00
Nathan Chancellor
0c09bc33aa drm/simpledrm: Fix return type of simpledrm_simple_display_pipe_mode_valid()
When booting a kernel compiled with clang's CFI protection
(CONFIG_CFI_CLANG), there is a CFI failure in
drm_simple_kms_crtc_mode_valid() when trying to call
simpledrm_simple_display_pipe_mode_valid() through ->mode_valid():

[    0.322802] CFI failure (target: simpledrm_simple_display_pipe_mode_valid+0x0/0x8):
...
[    0.324928] Call trace:
[    0.324969]  __ubsan_handle_cfi_check_fail+0x58/0x60
[    0.325053]  __cfi_check_fail+0x3c/0x44
[    0.325120]  __cfi_slowpath_diag+0x178/0x200
[    0.325192]  drm_simple_kms_crtc_mode_valid+0x58/0x80
[    0.325279]  __drm_helper_update_and_validate+0x31c/0x464
...

The ->mode_valid() member in 'struct drm_simple_display_pipe_funcs'
expects a return type of 'enum drm_mode_status', not 'int'. Correct it
to fix the CFI failure.

Cc: stable@vger.kernel.org
Fixes: 11e8f5fd22 ("drm: Add simpledrm driver")
Link: https://github.com/ClangBuiltLinux/linux/issues/1647
Reported-by: Tomasz Paweł Gajc <tpgxyz@gmail.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220725233629.223223-1-nathan@kernel.org
2022-07-26 17:18:45 +02:00
Benjamin Poirier
9b134b1694 bridge: Do not send empty IFLA_AF_SPEC attribute
After commit b6c02ef549 ("bridge: Netlink interface fix."),
br_fill_ifinfo() started to send an empty IFLA_AF_SPEC attribute when a
bridge vlan dump is requested but an interface does not have any vlans
configured.

iproute2 ignores such an empty attribute since commit b262a9becbcb
("bridge: Fix output with empty vlan lists") but older iproute2 versions as
well as other utilities have their output changed by the cited kernel
commit, resulting in failed test cases. Regardless, emitting an empty
attribute is pointless and inefficient.

Avoid this change by canceling the attribute if no AF_SPEC data was added.

Fixes: b6c02ef549 ("bridge: Netlink interface fix.")
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20220725001236.95062-1-bpoirier@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 15:35:53 +02:00
Paolo Abeni
33881ab73d Merge branch 'octeontx2-minor-tc-fixes'
Subbaraya Sundeep says:

====================
Octeontx2 minor tc fixes

This patch set fixes two problems found in tc code
wrt to ratelimiting and when installing UDP/TCP filters.

Patch 1: CN10K has different register format compared to
CN9xx hence fixes that.
Patch 2: Check flow mask also before installing a src/dst
port filter, otherwise installing for one port installs for other one too.
====================

Link: https://lore.kernel.org/r/1658650874-16459-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 13:05:46 +02:00
Subbaraya Sundeep
59e1be6f83 octeontx2-pf: Fix UDP/TCP src and dst port tc filters
Check the mask for non-zero value before installing tc filters
for L4 source and destination ports. Otherwise installing a
filter for source port installs destination port too and
vice-versa.

Fixes: 1d4d9e42c2 ("octeontx2-pf: Add tc flower hardware offload on ingress traffic")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 13:05:44 +02:00
Sunil Goutham
b354eaeec8 octeontx2-pf: cn10k: Fix egress ratelimit configuration
NIX_AF_TLXX_PIR/CIR register format has changed from OcteonTx2
to CN10K. CN10K supports larger burst size. Fix burst exponent
and burst mantissa configuration for CN10K.

Also fixed 'maxrate' from u32 to u64 since 'police.rate_bytes_ps'
passed by stack is also u64.

Fixes: e638a83f16 ("octeontx2-pf: TC_MATCHALL egress ratelimiting offload")
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 13:05:44 +02:00
Duoming Zhou
b89fc26f74 sctp: fix sleep in atomic context bug in timer handlers
There are sleep in atomic context bugs in timer handlers of sctp
such as sctp_generate_t3_rtx_event(), sctp_generate_probe_event(),
sctp_generate_t1_init_event(), sctp_generate_timeout_event(),
sctp_generate_t3_rtx_event() and so on.

The root cause is sctp_sched_prio_init_sid() with GFP_KERNEL parameter
that may sleep could be called by different timer handlers which is in
interrupt context.

One of the call paths that could trigger bug is shown below:

      (interrupt context)
sctp_generate_probe_event
  sctp_do_sm
    sctp_side_effects
      sctp_cmd_interpreter
        sctp_outq_teardown
          sctp_outq_init
            sctp_sched_set_sched
              n->init_sid(..,GFP_KERNEL)
                sctp_sched_prio_init_sid //may sleep

This patch changes gfp_t parameter of init_sid in sctp_sched_set_sched()
from GFP_KERNEL to GFP_ATOMIC in order to prevent sleep in atomic
context bugs.

Fixes: 5bbbbe32a4 ("sctp: introduce stream scheduler foundations")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/20220723015809.11553-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 19:39:05 -07:00
Vladimir Oltean
c7560d1203 net: dsa: fix reference counting for LAG FDBs
Due to an invalid conflict resolution on my side while working on 2
different series (LAG FDBs and FDB isolation), dsa_switch_do_lag_fdb_add()
does not store the database associated with a dsa_mac_addr structure.

So after adding an FDB entry associated with a LAG, dsa_mac_addr_find()
fails to find it while deleting it, because &a->db is zeroized memory
for all stored FDB entries of lag->fdbs, and dsa_switch_do_lag_fdb_del()
returns -ENOENT rather than deleting the entry.

Fixes: c26933639b ("net: dsa: request drivers to perform FDB isolation")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220723012411.1125066-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 19:37:06 -07:00
Michal Maloszewski
5fcbb71102 i40e: Fix interface init with MSI interrupts (no MSI-X)
Fix the inability to bring an interface up on a setup with
only MSI interrupts enabled (no MSI-X).
Solution is to add a default number of QPs = 1. This is enough,
since without MSI-X support driver enables only a basic feature set.

Fixes: bc6d33c8d9 ("i40e: Fix the number of queues available to be mapped for use")
Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220722175401.112572-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 18:48:41 -07:00
Michael Ellerman
c653c59178 drm/amdgpu: Re-enable DCN for 64-bit powerpc
Commit d11219ad53 ("amdgpu: disable powerpc support for the newer
display engine") disabled the DCN driver for all of powerpc due to
unresolved build failures with some compilers.

Further digging shows that the build failures only occur with compilers
that default to 64-bit long double.

Both the ppc64 and ppc64le ABIs define long double to be 128-bits, but
there are compilers in the wild that default to 64-bits. The compilers
provided by the major distros (Fedora, Ubuntu) default to 128-bits and
are not affected by the build failure.

There is a compiler flag to force 128-bit long double, which may be the
correct long term fix, but as an interim fix only allow building the DCN
driver if long double is 128-bits by default.

The bisection in commit d11219ad53 must have gone off the rails at
some point, the build failure occurs all the way back to the original
commit that enabled DCN support on powerpc, at least with some
toolchains.

Depends-on: d11219ad53 ("amdgpu: disable powerpc support for the newer display engine")
Fixes: 16a9dea110 ("amdgpu: Enable initial DCN support on POWER")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Dan Horák <dan@danny.cz>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2100
Link: https://lore.kernel.org/r/20220725123918.1903255-1-mpe@ellerman.id.au
2022-07-26 08:25:38 +10:00
Waiman Long
d295ad34f2 intel_idle: Fix false positive RCU splats due to incorrect hardirqs state
Commit 32d4fd5751 ("cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE")
uses raw_local_irq_enable/local_irq_disable() around call to
__intel_idle() in intel_idle_irq().

With interrupt enabled, timer tick interrupt can happen and a
subsequently call to __do_softirq() may change the lockdep hardirqs state
of a debug kernel back to 'on'. This will result in a mismatch between
the cpu hardirqs state (off) and the lockdep hardirqs state (on) causing
a number of false positive "WARNING: suspicious RCU usage" splats.

Fix that by using local_irq_disable() to disable interrupt in
intel_idle_irq().

Fixes: 32d4fd5751 ("cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE")
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: 5.16+ <stable@vger.kernel.org> # 5.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-07-25 19:39:07 +02:00
Umesh Nerlige Ramappa
a7a47a5dfa drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend
For execlists backend, current implementation of Wa_22011802037 is to
stop the CS before doing a reset of the engine. This WA was further
extended to wait for any pending MI FORCE WAKEUPs before issuing a
reset. Add the extended steps in the execlist path of reset.

In addition, extend the WA to gen11.

v2: (Tvrtko)
- Clarify comments, commit message, fix typos
- Use IS_GRAPHICS_VER for gen 11/12 checks

v3: (Daneile)
- Drop changes to intel_ring_submission since WA does not apply to it
- Log an error if MSG IDLE is not defined for an engine

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: f6aa0d713c ("drm/i915: Add Wa_22011802037 force cs halt")
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621192105.2100585-1-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 0667429ce6)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-07-25 15:57:54 +01:00
David S. Miller
9af0620de1 Merge branch 'net-sysctl-races-part-6'
Kuniyuki Iwashima says:

====================
sysctl: Fix data-races around ipv4_net_table (Round 6, Final).

This series fixes data-races around 11 knobs after tcp_pacing_ss_ratio
ipv4_net_table, and this is the final round for ipv4_net_table.

While at it, other data-races around these related knobs are fixed.

  - decnet_mem
  - decnet_rmem
  - tipc_rmem

There are still 58 tables possibly missing some fixes under net/.

  $ grep -rnE "struct ctl_table.*?\[\] =" net/ | wc -l
  60
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:42:10 +01:00
Kuniyuki Iwashima
96b9bd8c6d ipv4: Fix data-races around sysctl_fib_notify_on_flag_change.
While reading sysctl_fib_notify_on_flag_change, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 680aea08e7 ("net: ipv4: Emit notification when fib hardware flags are changed")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:42:10 +01:00
Kuniyuki Iwashima
870e3a634b tcp: Fix data-races around sysctl_tcp_reflect_tos.
While reading sysctl_tcp_reflect_tos, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: ac8f1710c1 ("tcp: reflect tos value received in SYN to the socket")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:42:10 +01:00
Kuniyuki Iwashima
79f55473bf tcp: Fix a data-race around sysctl_tcp_comp_sack_nr.
While reading sysctl_tcp_comp_sack_nr, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 9c21d2fc41 ("tcp: add tcp_comp_sack_nr sysctl")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:42:10 +01:00
Kuniyuki Iwashima
22396941a7 tcp: Fix a data-race around sysctl_tcp_comp_sack_slack_ns.
While reading sysctl_tcp_comp_sack_slack_ns, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: a70437cc09 ("tcp: add hrtimer slack to sack compression")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:42:10 +01:00
Kuniyuki Iwashima
4866b2b0f7 tcp: Fix a data-race around sysctl_tcp_comp_sack_delay_ns.
While reading sysctl_tcp_comp_sack_delay_ns, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 6d82aa2420 ("tcp: add tcp_comp_sack_delay_ns sysctl")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:42:09 +01:00
Kuniyuki Iwashima
0273954595 net: Fix data-races around sysctl_[rw]mem(_offset)?.
While reading these sysctl variables, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.

  - .sysctl_rmem
  - .sysctl_rwmem
  - .sysctl_rmem_offset
  - .sysctl_wmem_offset
  - sysctl_tcp_rmem[1, 2]
  - sysctl_tcp_wmem[1, 2]
  - sysctl_decnet_rmem[1]
  - sysctl_decnet_wmem[1]
  - sysctl_tipc_rmem[1]

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:42:09 +01:00
Kuniyuki Iwashima
59bf6c65a0 tcp: Fix data-races around sk_pacing_rate.
While reading sysctl_tcp_pacing_(ss|ca)_ratio, they can be changed
concurrently.  Thus, we need to add READ_ONCE() to their readers.

Fixes: 43e122b014 ("tcp: refine pacing rate determination")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:42:09 +01:00
Taehee Yoo
3e7d18b9dc net: mld: fix reference count leak in mld_{query | report}_work()
mld_{query | report}_work() processes queued events.
If there are too many events in the queue, it re-queue a work.
And then, it returns without in6_dev_put().
But if queuing is failed, it should call in6_dev_put(), but it doesn't.
So, a reference count leak would occur.

THREAD0				THREAD1
mld_report_work()
				spin_lock_bh()
				if (!mod_delayed_work())
					in6_dev_hold();
				spin_unlock_bh()
	spin_lock_bh()
	schedule_delayed_work()
	spin_unlock_bh()

Script to reproduce(by Hangbin Liu):
   ip netns add ns1
   ip netns add ns2
   ip netns exec ns1 sysctl -w net.ipv6.conf.all.force_mld_version=1
   ip netns exec ns2 sysctl -w net.ipv6.conf.all.force_mld_version=1

   ip -n ns1 link add veth0 type veth peer name veth0 netns ns2
   ip -n ns1 link set veth0 up
   ip -n ns2 link set veth0 up

   for i in `seq 50`; do
           for j in `seq 100`; do
                   ip -n ns1 addr add 2021:${i}::${j}/64 dev veth0
                   ip -n ns2 addr add 2022:${i}::${j}/64 dev veth0
           done
   done
   modprobe -r veth
   ip -a netns del

splat looks like:
 unregister_netdevice: waiting for veth0 to become free. Usage count = 2
 leaked reference.
  ipv6_add_dev+0x324/0xec0
  addrconf_notify+0x481/0xd10
  raw_notifier_call_chain+0xe3/0x120
  call_netdevice_notifiers+0x106/0x160
  register_netdevice+0x114c/0x16b0
  veth_newlink+0x48b/0xa50 [veth]
  rtnl_newlink+0x11a2/0x1a40
  rtnetlink_rcv_msg+0x63f/0xc00
  netlink_rcv_skb+0x1df/0x3e0
  netlink_unicast+0x5de/0x850
  netlink_sendmsg+0x6c9/0xa90
  ____sys_sendmsg+0x76a/0x780
  __sys_sendmsg+0x27c/0x340
  do_syscall_64+0x43/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: f185de28d9 ("mld: add new workqueues for process mld events")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 12:33:59 +01:00
Jianglei Nie
c7b205fbbf net: macsec: fix potential resource leak in macsec_add_rxsa() and macsec_add_txsa()
init_rx_sa() allocates relevant resource for rx_sa->stats and rx_sa->
key.tfm with alloc_percpu() and macsec_alloc_tfm(). When some error
occurs after init_rx_sa() is called in macsec_add_rxsa(), the function
released rx_sa with kfree() without releasing rx_sa->stats and rx_sa->
key.tfm, which will lead to a resource leak.

We should call macsec_rxsa_put() instead of kfree() to decrease the ref
count of rx_sa and release the relevant resource if the refcount is 0.
The same bug exists in macsec_add_txsa() for tx_sa as well. This patch
fixes the above two bugs.

Fixes: 3cf3227a21 ("net: macsec: hardware offloading infrastructure")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 11:51:05 +01:00
David S. Miller
20a854616d Merge branch 'macsec-config-issues'
Sabrina Dubroca says:

====================
macsec: fix config issues

The patch adding netlink support for XPN (commit 48ef50fa86
("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)"))
introduced several issues, including a kernel panic reported at [1].

Reproducing those bugs with upstream iproute is limited, since iproute
doesn't currently support XPN. I'm also working on this.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=208315
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 11:49:25 +01:00
Sabrina Dubroca
c630d1fe62 macsec: always read MACSEC_SA_ATTR_PN as a u64
Currently, MACSEC_SA_ATTR_PN is handled inconsistently, sometimes as a
u32, sometimes forced into a u64 without checking the actual length of
the attribute. Instead, we can use nla_get_u64 everywhere, which will
read up to 64 bits into a u64, capped by the actual length of the
attribute coming from userspace.

This fixes several issues:
 - the check in validate_add_rxsa doesn't work with 32-bit attributes
 - the checks in validate_add_txsa and validate_upd_sa incorrectly
   reject X << 32 (with X != 0)

Fixes: 48ef50fa86 ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 11:49:25 +01:00
Sabrina Dubroca
b07a0e2044 macsec: limit replay window size with XPN
IEEE 802.1AEbw-2013 (section 10.7.8) specifies that the maximum value
of the replay window is 2^30-1, to help with recovery of the upper
bits of the PN.

To avoid leaving the existing macsec device in an inconsistent state
if this test fails during changelink, reuse the cleanup mechanism
introduced for HW offload. This wasn't needed until now because
macsec_changelink_common could not fail during changelink, as
modifying the cipher suite was not allowed.

Finally, this must happen after handling IFLA_MACSEC_CIPHER_SUITE so
that secy->xpn is set.

Fixes: 48ef50fa86 ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 11:49:25 +01:00
Sabrina Dubroca
3240eac4ff macsec: fix error message in macsec_add_rxsa and _txsa
The expected length is MACSEC_SALT_LEN, not MACSEC_SA_ATTR_SALT.

Fixes: 48ef50fa86 ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 11:49:25 +01:00
Sabrina Dubroca
f46040eeaf macsec: fix NULL deref in macsec_add_rxsa
Commit 48ef50fa86 added a test on tb_sa[MACSEC_SA_ATTR_PN], but
nothing guarantees that it's not NULL at this point. The same code was
added to macsec_add_txsa, but there it's not a problem because
validate_add_txsa checks that the MACSEC_SA_ATTR_PN attribute is
present.

Note: it's not possible to reproduce with iproute, because iproute
doesn't allow creating an SA without specifying the PN.

Fixes: 48ef50fa86 ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208315
Reported-by: Frantisek Sumsal <fsumsal@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 11:49:25 +01:00
Slark Xiao
1aaa62c483 s390/qeth: Fix typo 'the the' in comment
Replace 'the the' with 'the' in the comment.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 10:52:28 +01:00
Slark Xiao
2540d3c999 net: ipa: Fix typo 'the the' in comment
Replace 'the the' with 'the' in the comment.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 10:52:28 +01:00
Slark Xiao
af35f95aca nfp: bpf: Fix typo 'the the' in comment
Replace 'the the' with 'the' in the comment.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25 10:52:28 +01:00
Tobias Gruetzmacher
d6c52fa3e9 nvme-pci: Crucial P2 has bogus namespace ids
This adds a quirk for the Crucial P2.

Signed-off-by: Tobias Gruetzmacher <tobias-git@23.gs>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-07-25 07:34:07 +02:00
Michael Ellerman
3c69a99b62 Merge tag 'v5.19-rc7' into fixes
Merge v5.19-rc7 into fixes to bring in:
  d11219ad53 ("amdgpu: disable powerpc support for the newer display engine")
2022-07-25 13:49:22 +10:00
Xin Long
aa709da0e0 Documentation: fix sctp_wmem in ip-sysctl.rst
Since commit 1033990ac5 ("sctp: implement memory accounting on tx path"),
SCTP has supported memory accounting on tx path where 'sctp_wmem' is used
by sk_wmem_schedule(). So we should fix the description for this option in
ip-sysctl.rst accordingly.

v1->v2:
  - Improve the description as Marcelo suggested.

Fixes: 1033990ac5 ("sctp: implement memory accounting on tx path")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-24 21:41:58 +01:00
Maxim Mikityanskiy
f6336724a4 net/tls: Remove the context from the list in tls_device_down
tls_device_down takes a reference on all contexts it's going to move to
the degraded state (software fallback). If sk_destruct runs afterwards,
it can reduce the reference counter back to 1 and return early without
destroying the context. Then tls_device_down will release the reference
it took and call tls_device_free_ctx. However, the context will still
stay in tls_device_down_list forever. The list will contain an item,
memory for which is released, making a memory corruption possible.

Fix the above bug by properly removing the context from all lists before
any call to tls_device_free_ctx.

Fixes: 3740651bf7 ("tls: Fix context leak on tls_device_down")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-24 21:40:56 +01:00
Linus Torvalds
e0dccc3b76 Linux 5.19-rc8 2022-07-24 13:26:27 -07:00
Adam Borowski
e90886291c certs: make system keyring depend on x509 parser
This code requires x509_load_certificate_list() to be built-in.

Fixes: 60050ffe3d ("certs: Move load_certificate_list() to be with the asymmetric keys code")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/all/202206221515.DqpUuvbQ-lkp@intel.com/
Link: https://lore.kernel.org/all/20220712104554.408dbf42@gandalf.local.home/
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-24 12:53:55 -07:00
Linus Torvalds
af2c9ac240 Merge tag 'perf_urgent_for_v5.19_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:

 - Reorganize the perf LBR init code so that a TSX quirk is applied
   early enough in order for the LBR MSR access to not #GP

* tag 'perf_urgent_for_v5.19_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/lbr: Fix unchecked MSR access error on HSW
2022-07-24 09:55:53 -07:00
Linus Torvalds
c2602a7ce0 Merge tag 'sched_urgent_for_v5.19_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Borislav Petkov:
 "A single fix to correct a wrong BUG_ON() condition for deboosted
  tasks"

* tag 'sched_urgent_for_v5.19_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Fix BUG_ON condition for deboosted tasks
2022-07-24 09:50:53 -07:00
Linus Torvalds
05017fed92 Merge tag 'x86_urgent_for_v5.19_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
 "A couple more retbleed fallout fixes.

  It looks like their urgency is decreasing so it seems like we've
  managed to catch whatever snafus the limited -rc testing has exposed.
  Maybe we're getting ready... :)

   - Make retbleed mitigations 64-bit only (32-bit will need a bit more
     work if even needed, at all).

   - Prevent return thunks patching of the LKDTM modules as it is not
     needed there

   - Avoid writing the SPEC_CTRL MSR on every kernel entry on eIBRS
     parts

   - Enhance error output of apply_returns() when it fails to patch a
     return thunk

   - A sparse fix to the sev-guest module

   - Protect EFI fw calls by issuing an IBPB on AMD"

* tag 'x86_urgent_for_v5.19_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Make all RETbleed mitigations 64-bit only
  lkdtm: Disable return thunks in rodata.c
  x86/bugs: Warn when "ibrs" mitigation is selected on Enhanced IBRS parts
  x86/alternative: Report missing return thunk details
  virt: sev-guest: Pass the appropriate argument type to iounmap()
  x86/amd: Use IBPB for firmware calls
2022-07-24 09:40:17 -07:00
Linus Torvalds
714b82c18b Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
 "One more fix to set the correct IO mapping for a clk gate in the
  lan966x driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: lan966x: Fix the lan966x clock gate register address
2022-07-24 09:33:13 -07:00
Linus Torvalds
515f71412b Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:

 - Check for invalid flags to KVM_CAP_X86_USER_SPACE_MSR

 - Fix use of sched_setaffinity in selftests

 - Sync kernel headers to tools

 - Fix KVM_STATS_UNIT_MAX

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Protect the unused bits in MSR exiting flags
  tools headers UAPI: Sync linux/kvm.h with the kernel sources
  KVM: selftests: Fix target thread to be migrated in rseq_test
  KVM: stats: Fix value for KVM_STATS_UNIT_MAX for boolean stats
2022-07-23 10:22:26 -07:00
Ben Hutchings
b648ab487f x86/speculation: Make all RETbleed mitigations 64-bit only
The mitigations for RETBleed are currently ineffective on x86_32 since
entry_32.S does not use the required macros.  However, for an x86_32
target, the kconfig symbols for them are still enabled by default and
/sys/devices/system/cpu/vulnerabilities/retbleed will wrongly report
that mitigations are in place.

Make all of these symbols depend on X86_64, and only enable RETHUNK by
default on X86_64.

Fixes: f43b9876e8 ("x86/retbleed: Add fine grained Kconfig knobs")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/YtwSR3NNsWp1ohfV@decadent.org.uk
2022-07-23 18:45:11 +02:00
Linus Torvalds
301c894932 Merge tag 'spi-fix-v5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A few more small driver specific fixes"

* tag 'spi-fix-v5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-rspi: Fix PIO fallback on RZ platforms
  spi: spi-cadence: Fix SPI NO Slave Select macro definition
  spi: bcm2835: bcm2835_spi_handle_err(): fix NULL pointer deref for non DMA transfers
2022-07-22 16:40:03 -07:00
Wei Wang
4d8f24eeed Revert "tcp: change pingpong threshold to 3"
This reverts commit 4a41f453be.

This to-be-reverted commit was meant to apply a stricter rule for the
stack to enter pingpong mode. However, the condition used to check for
interactive session "before(tp->lsndtime, icsk->icsk_ack.lrcvtime)" is
jiffy based and might be too coarse, which delays the stack entering
pingpong mode.
We revert this patch so that we no longer use the above condition to
determine interactive session, and also reduce pingpong threshold to 1.

Fixes: 4a41f453be ("tcp: change pingpong threshold to 3")
Reported-by: LemmyHuang <hlm3280@163.com>
Suggested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220721204404.388396-1-weiwan@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-22 15:09:10 -07:00
Linus Torvalds
70664fc10c Merge tag 'riscv-for-linus-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - Two kexec-related build fixes

 - A DTS update to make the GPIO nodes match the upcoming dtschema

 - A fix that passes -mno-relax directly to the assembler when building
   modules, to work around compilers that fail to do so

* tag 'riscv-for-linus-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: add as-options for modules with assembly compontents
  riscv: dts: align gpio-key node names with dtschema
  RISC-V: kexec: Fix build error without CONFIG_KEXEC
  RISCV: kexec: Fix build error without CONFIG_MODULES
2022-07-22 13:02:05 -07:00
Linus Torvalds
ae21fbac18 Merge tag 'acpi-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
 "Fix yet another piece of ACPI CPPC changes fallout on AMD platforms
  (Mario Limonciello)"

* tag 'acpi-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: CPPC: Don't require flexible address space if X86_FEATURE_CPPC is supported
2022-07-22 12:56:49 -07:00
Linus Torvalds
a5235996e1 Merge tag 'io_uring-5.19-2022-07-21' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "Fix for a bad kfree() introduced in this cycle, and a quick fix for
  disabling buffer recycling for IORING_OP_READV.

  The latter will get reworked for 5.20, but it gets the job done for
  5.19"

* tag 'io_uring-5.19-2022-07-21' of git://git.kernel.dk/linux-block:
  io_uring: do not recycle buffer in READV
  io_uring: fix free of unallocated buffer list
2022-07-22 12:47:09 -07:00
Linus Torvalds
d945404f74 Merge tag 'block-5.19-2022-07-21' of git://git.kernel.dk/linux-block
Pull block fix from Jens Axboe:
 "Just a single fix for missing error propagation for an allocation
  failure in raid5"

* tag 'block-5.19-2022-07-21' of git://git.kernel.dk/linux-block:
  md/raid5: missing error code in setup_conf()
2022-07-22 12:41:14 -07:00
Linus Torvalds
4a1dcf77f4 Merge tag 'i2c-for-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "Two driver bugfixes and a typo fix"

* tag 'i2c-for-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: cadence: Change large transfer count reset logic to be unconditional
  i2c: imx: fix typo in comment
  i2c: mlxcpld: Fix register setting for 400KHz frequency
2022-07-22 12:36:59 -07:00
Linus Torvalds
6f8e4e1043 Merge tag 'gpio-fixes-for-v5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix several regmap usage issues in gpio-pca953x

 - fix out-of-tree build for GPIO selftests

 - fix integer overflow in gpio-xilinx

* tag 'gpio-fixes-for-v5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: gpio-xilinx: Fix integer overflow
  selftests: gpio: fix include path to kernel headers for out of tree builds
  gpio: pca953x: use the correct register address when regcache sync during init
  gpio: pca953x: use the correct range when do regmap sync
  gpio: pca953x: only use single read/write for No AI mode
2022-07-22 12:28:47 -07:00
Linus Torvalds
6147191112 Merge tag 'pinctrl-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
 "Only driver fixes:

   - NULL check for the ralink and sunplus drivers

   - Add Jacky Bai as maintainer for the Freescale pin controllers

   - Fix pin config ops for the Ocelot LAN966x and SparX5

   - Disallow AMD pin control to be a module: the GPIO lines need to be
     active in early boot, so no can do

   - Fix the Armada 37xx to use raw spinlocks in the interrupt handler
     path to avoid wait context"

* tag 'pinctrl-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: armada-37xx: use raw spinlocks for regmap to avoid invalid wait context
  pinctrl: armada-37xx: make irq_lock a raw spinlock to avoid invalid wait context
  pinctrl: Don't allow PINCTRL_AMD to be a module
  pinctrl: ocelot: Fix pincfg
  pinctrl: ocelot: Fix pincfg for lan966x
  MAINTAINERS: Update freescale pin controllers maintainer
  pinctrl: sunplus: Add check for kcalloc
  pinctrl: ralink: Check for null return of devm_kcalloc
2022-07-22 12:24:04 -07:00
Emil Renner Berthing
88bd24d73d riscv: compat: vdso: Fix vdso_install target
When CONFIG_COMPAT=y the vdso_install target fails:

$ make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- vdso_install
  INSTALL vdso.so
make[1]: *** No rule to make target 'vdso_install'.  Stop.
make: *** [arch/riscv/Makefile:112: vdso_install] Error 2

The problem is that arch/riscv/kernel/compat_vdso/Makefile doesn't
have a vdso_install target, but instead calls it compat_vdso_install.

Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Link: https://lore.kernel.org/r/20220625154207.80972-1-emil.renner.berthing@canonical.com
Fixes: 0715372a06 ("riscv: compat: vdso: Add COMPAT_VDSO base code implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-07-22 12:23:14 -07:00
Linus Torvalds
8f636c6a16 Merge tag 'sound-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "Only undoes the Rockchip BCLK changes to address a regression"

* tag 'sound-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: rockchip-i2s: Undo BCLK pinctrl changes
  ASoC: rockchip: i2s: Fix NULL pointer dereference when pinctrl is not found
2022-07-22 12:19:02 -07:00
Linus Torvalds
85029503fc Merge tag 'mmc-v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fix from Ulf Hansson:

 - sdhci-omap: Fix a lockdep warning while probing

* tag 'mmc-v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-omap: Fix a lockdep warning for PM runtime init
2022-07-22 12:14:13 -07:00
Linus Torvalds
8e65afba6b Merge tag 'drm-fixes-2022-07-22' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Fixes for this week.

  The main one is the i915 firmware fix for the phoronix reported issue.
  I've written some firmware guidelines as a result, should land in
  -next soon. Otherwise a few amdgpu fixes, a scheduler fix, ttm fix and
  two other minor ones.

  scheduler:
   - scheduling while atomic fix

  ttm:
   - locking fix

  edp:
   - variable typo fix

  i915:
   - add back support for v69 firmware on ADL-P

  amdgpu:
   - Drop redundant buffer cleanup that can lead to a segfault
   - Add a bo_list mutex to avoid possible list corruption in CS
   - dmub notification fix

  imx:
   - fix error path"

* tag 'drm-fixes-2022-07-22' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: Protect the amdgpu_bo_list list with a mutex v2
  drm/imx/dcss: Add missing of_node_put() in fail path
  drm/i915/guc: support v69 in parallel to v70
  drm/i915/guc: Support programming the EU priority in the GuC descriptor
  drm/panel-edp: Fix variable typo when saving hpd absent delay from DT
  drm/amdgpu: Remove one duplicated ef removal
  drm/ttm: fix locking in vmap/vunmap TTM GEM helpers
  drm/scheduler: Don't kill jobs in interrupt context
  drm/amd/display: Fix new dmub notification enabling in DM
2022-07-22 12:03:19 -07:00
Linus Torvalds
4ba1329cbb Merge tag 'rcu-urgent.2022.07.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney:
 "This contains a pair of commits that fix 282d8998e9 ("srcu: Prevent
  expedited GPs and blocking readers from consuming CPU"), which was
  itself a fix to an SRCU expedited grace-period problem that could
  prevent kernel live patching (KLP) from completing.

  That SRCU fix for KLP introduced large (as in minutes) boot-time
  delays to embedded Linux kernels running on qemu/KVM. These delays
  were due to the emulation of certain MMIO operations controlling
  memory layout, which were emulated with one expedited grace period per
  access. Common configurations required thousands of boot-time MMIO
  accesses, and thus thousands of boot-time expedited SRCU grace
  periods.

  In these configurations, the occasional sleeps that allowed KLP to
  proceed caused excessive boot delays. These commits preserve enough
  sleeps to permit KLP to proceed, but few enough that the virtual
  embedded kernels still boot reasonably quickly.

  This represents a regression introduced in the v5.19 merge window, and
  the bug is causing significant inconvenience"

* tag 'rcu-urgent.2022.07.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  srcu: Make expedited RCU grace periods block even less frequently
  srcu: Block less aggressively for expedited grace periods
2022-07-22 10:01:20 -07:00
Linus Torvalds
7fb5e50831 mmu_gather: fix the CONFIG_MMU_GATHER_NO_RANGE case
Sudip reports that alpha doesn't build properly, with errors like

  include/asm-generic/tlb.h:401:1: error: redefinition of 'tlb_update_vma_flags'
    401 | tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma)
        | ^~~~~~~~~~~~~~~~~~~~
  include/asm-generic/tlb.h:372:1: note: previous definition of 'tlb_update_vma_flags' with type 'void(struct mmu_gather *, struct vm_area_struct *)'
    372 | tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma) { }

the cause being that We have this odd situation where some architectures
were never converted to the newer TLB flushing interfaces that have a
range for the flush.  Instead people left them alone, and we have them
select the MMU_GATHER_NO_RANGE config option to make the tlb header
files account for this.

Peter Zijlstra cleaned some of these nasty header file games up in
commits

  1e9fdf21a4 ("mmu_gather: Remove per arch tlb_{start,end}_vma()")
  18ba064e42 ("mmu_gather: Let there be one tlb_{start,end}_vma() implementation")

but tlb_update_vma_flags() was left alone, and then commit b67fbebd4c
("mmu_gather: Force tlb-flush VM_PFNMAP vmas") ended up removing only
_one_ of the two stale duplicate dummy inline functions.

This removes the other stale one.

Somebody braver than me should try to remove MMU_GATHER_NO_RANGE
entirely, but it requires fixing up the oddball architectures that use
it: alpha, m68k, microblaze, nios2 and openrisc.

The fixups should be fairly straightforward ("fix the build errors it
exposes by adding the appropriate range arguments"), but the reason this
wasn't done in the first place is that so few people end up working on
those architectures.  But it could be done one architecture at a time,
hint, hint.

Reported-by: Sudip Mukherjee (Codethink) <sudipm.mukherjee@gmail.com>
Fixes: b67fbebd4c ("mmu_gather: Force tlb-flush VM_PFNMAP vmas")
Link: https://lore.kernel.org/all/YtpXh0QHWwaEWVAY@debian/
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-22 09:28:34 -07:00
Linus Walleij
c5cdb92869 ARM: pxa2xx: Fix GPIO descriptor tables
Laurence reports:

"Kernel >5.18 on Zaurus has a bug where the power management code can't
talk to devices, emitting the following errors:

sharpsl-pm sharpsl-pm: Error: AC check failed: voltage -22.
sharpsl-pm sharpsl-pm: Charging Error!
sharpsl-pm sharpsl-pm: Warning: Cannot read main battery!

Looking at the recent changes, I found that commit 31455bbda2 ("spi:
pxa2xx_spi: Convert to use GPIO descriptors") replaced the deprecated
SPI chip select platform device code with a gpiod lookup table. However,
this didn't seem to work until I changed the `dev_id` member from the
device name to the bus id. I'm not entirely sure why this is necessary,
but I suspect it is related to the fact that in sysfs SPI devices are
attached under /sys/devices/.../dev_name/spi_master/spiB/spiB.C, rather
than directly to the device."

After reviewing the change I conclude that the same fix is needed
for all affected boards.

Fixes: 31455bbda2 ("spi: pxa2xx_spi: Convert to use GPIO descriptors")
Reported-by: Laurence de Bruxelles <lfdebrux@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220722114611.1517414-1-linus.walleij@linaro.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-22 15:03:08 +02:00
Lukas Bulwahn
e2a619ca0b asm-generic: remove a broken and needless ifdef conditional
Commit 527701eda5 ("lib: Add a generic version of devmem_is_allowed()")
introduces the config symbol GENERIC_LIB_DEVMEM_IS_ALLOWED, but then
falsely refers to CONFIG_GENERIC_DEVMEM_IS_ALLOWED (note the missing LIB
in the reference) in ./include/asm-generic/io.h.

Luckily, ./scripts/checkkconfigsymbols.py warns on non-existing configs:

GENERIC_DEVMEM_IS_ALLOWED
Referencing files: include/asm-generic/io.h

The actual fix, though, is simply to not to make this function declaration
dependent on any kernel config. For architectures that intend to use
the generic version, the arch's 'select GENERIC_LIB_DEVMEM_IS_ALLOWED' will
lead to picking the function definition, and for other architectures, this
function is simply defined elsewhere.

The wrong '#ifndef' on a non-existing config symbol also always had the
same effect (although more by mistake than by intent). So, there is no
functional change.

Remove this broken and needless ifdef conditional.

Fixes: 527701eda5 ("lib: Add a generic version of devmem_is_allowed()")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-22 15:00:00 +02:00
Sherry Sun
4bcffe9417 EDAC/synopsys: Re-enable the error interrupts on v3 hw
zynqmp_get_error_info() writes 0 to the ECC_CLR_OFST register after
an interrupt for a {un-,}correctable error is raised, which disables
the error interrupts. Then the interrupt handler will be called only
once. Therefore, re-enable the error interrupt line at the end of
intr_handler() for v3.x Synopsys EDAC DDR.

Fixes: f7824ded41 ("EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR")
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Shubhrajyoti Datta <Shubhrajyoti.datta@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220427015137.8406-3-sherry.sun@nxp.com
2022-07-22 14:31:38 +02:00
Sherry Sun
be76ceaf03 EDAC/synopsys: Use the correct register to disable the error interrupt on v3 hw
v3.x Synopsys EDAC DDR doesn't have the QOS Interrupt register. Use the
ECC Clear Register to disable the error interrupts instead.

Fixes: f7824ded41 ("EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR")
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Shubhrajyoti Datta <Shubhrajyoti.datta@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220427015137.8406-2-sherry.sun@nxp.com
2022-07-22 14:31:30 +02:00
Christophe JAILLET
8ee18e2a9e caif: Fix bitmap data type in "struct caifsock"
Bitmap are "unsigned long", so use it instead of a "u32" to make things
more explicit.

While at it, remove some useless cast (and leading spaces) when using the
bitmap API.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:51:45 +01:00
Rob Herring
030f21ba2a dt-bindings: net: fsl,fec: Add missing types to phy-reset-* properties
The phy-reset-* properties are missing type definitions and are not common
properties. Even though they are deprecated, a type is needed.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:38:16 +01:00
Rob Herring
17161c341d dt-bindings: net: ethernet-controller: Rework 'fixed-link' schema
While the if/then schemas mostly work, there's a few issues. The 'allOf'
schema will also be true if 'fixed-link' is not an array or object as a
false 'if' schema (without an 'else') will be true. In the array case
doesn't set the type (uint32-array) in the 'then' clause. In the node case,
'additionalProperties' is missing.

Rework the schema to use oneOf with each possible type.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:38:16 +01:00
David S. Miller
b20a7ca8cf Merge branch 'sysctl-races-part-5'
Kuniyuki Iwashima says:

====================
sysctl: Fix data-races around ipv4_net_table (Round 5).

This series fixes data-races around 15 knobs after tcp_dsack in
ipv4_net_table.

tcp_tso_win_divisor was skipped because it already uses READ_ONCE().

So, the final round for ipv4_net_table will start with tcp_pacing_ss_ratio.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:18 +01:00
Kuniyuki Iwashima
2afdbe7b8d tcp: Fix a data-race around sysctl_tcp_invalid_ratelimit.
While reading sysctl_tcp_invalid_ratelimit, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 032ee42369 ("tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacks")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:18 +01:00
Kuniyuki Iwashima
85225e6f0a tcp: Fix a data-race around sysctl_tcp_autocorking.
While reading sysctl_tcp_autocorking, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: f54b311142 ("tcp: auto corking")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:18 +01:00
Kuniyuki Iwashima
1330ffacd0 tcp: Fix a data-race around sysctl_tcp_min_rtt_wlen.
While reading sysctl_tcp_min_rtt_wlen, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: f672258391 ("tcp: track min RTT using windowed min-filter")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:18 +01:00
Kuniyuki Iwashima
2455e61b85 tcp: Fix a data-race around sysctl_tcp_tso_rtt_log.
While reading sysctl_tcp_tso_rtt_log, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 65466904b0 ("tcp: adjust TSO packet sizes based on min_rtt")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
e0bb4ab9df tcp: Fix a data-race around sysctl_tcp_min_tso_segs.
While reading sysctl_tcp_min_tso_segs, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 95bd09eb27 ("tcp: TSO packets automatic sizing")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
db3815a2fa tcp: Fix a data-race around sysctl_tcp_challenge_ack_limit.
While reading sysctl_tcp_challenge_ack_limit, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 282f23c6ee ("tcp: implement RFC 5961 3.2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
9fb90193fb tcp: Fix a data-race around sysctl_tcp_limit_output_bytes.
While reading sysctl_tcp_limit_output_bytes, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 46d3ceabd8 ("tcp: TCP Small Queues")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
0f1e4d0659 tcp: Fix data-races around sysctl_tcp_workaround_signed_windows.
While reading sysctl_tcp_workaround_signed_windows, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 15d99e02ba ("[TCP]: sysctl to allow TCP window > 32767 sans wscale")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
7804764888 tcp: Fix data-races around sysctl_tcp_moderate_rcvbuf.
While reading sysctl_tcp_moderate_rcvbuf, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
ab1ba21b52 tcp: Fix data-races around sysctl_tcp_no_ssthresh_metrics_save.
While reading sysctl_tcp_no_ssthresh_metrics_save, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 65e6d90168 ("net-tcp: Disable TCP ssthresh metrics cache by default")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
8499a2454d tcp: Fix a data-race around sysctl_tcp_nometrics_save.
While reading sysctl_tcp_nometrics_save, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
706c6202a3 tcp: Fix a data-race around sysctl_tcp_frto.
While reading sysctl_tcp_frto, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
36eeee75ef tcp: Fix a data-race around sysctl_tcp_adv_win_scale.
While reading sysctl_tcp_adv_win_scale, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
02ca527ac5 tcp: Fix a data-race around sysctl_tcp_app_win.
While reading sysctl_tcp_app_win, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Kuniyuki Iwashima
58ebb1c8b3 tcp: Fix data-races around sysctl_tcp_dsack.
While reading sysctl_tcp_dsack, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:06:17 +01:00
Srinivas Neeli
32c094a09d gpio: gpio-xilinx: Fix integer overflow
Current implementation is not able to configure more than 32 pins
due to incorrect data type. So type casting with unsigned long
to avoid it.

Fixes: 02b3f84d90 ("xilinx: Switch to use bitmap APIs")
Signed-off-by: Srinivas Neeli <srinivas.neeli@xilinx.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-07-22 09:17:03 +02:00
Dave Airlie
7f5ec14a4e Merge tag 'drm-misc-fixes-2022-07-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A scheduling-while-atomic fix for drm/scheduler, a locking fix for TTM,
a typo fix for panel-edp and a resource removal fix for imx/dcss

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220721085550.hrwbukj34y56rzva@houat
2022-07-22 12:19:45 +10:00
Liang He
ebbbe23fdf net: sungem_phy: Add of_node_put() for reference returned by of_get_parent()
In bcm5421_init(), we should call of_node_put() for the reference
returned by of_get_parent() which has increased the refcount.

Fixes: 3c326fe9cb ("[PATCH] ppc64: Add new PHY to sungem")
Signed-off-by: Liang He <windhl@126.com>
Link: https://lore.kernel.org/r/20220720131003.1287426-1-windhl@126.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-21 19:04:19 -07:00
Vladimir Oltean
27161db090 net: pcs: xpcs: propagate xpcs_read error to xpcs_get_state_c37_sgmii
While phylink_pcs_ops :: pcs_get_state does return void, xpcs_get_state()
does check for a non-zero return code from xpcs_get_state_c37_sgmii()
and prints that as a message to the kernel log.

However, a non-zero return code from xpcs_read() is translated into
"return false" (i.e. zero as int) and the I/O error is therefore not
printed. Fix that.

Fixes: b97b5331b8 ("net: pcs: add C37 SGMII AN support for intel mGbE controller")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220720112057.3504398-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-21 19:01:16 -07:00
Ben Dooks
c1f6eff304 riscv: add as-options for modules with assembly compontents
When trying to load modules built for RISC-V which include assembly files
the kernel loader errors with "unexpected relocation type 'R_RISCV_ALIGN'"
due to R_RISCV_ALIGN relocations being generated by the assembler.

The R_RISCV_ALIGN relocations can be removed at the expense of code space
by adding -mno-relax to gcc and as.  In commit 7a8e7da422
("RISC-V: Fixes to module loading") -mno-relax is added to the build
variable KBUILD_CFLAGS_MODULE. See [1] for more info.

The issue is that when kbuild builds a .S file, it invokes gcc with
the -mno-relax flag, but this is not being passed through to the
assembler. Adding -Wa,-mno-relax to KBUILD_AFLAGS_MODULE ensures that
the assembler is invoked correctly. This may have now been fixed in
gcc[2] and this addition should not stop newer gcc and as from working.

[1] https://github.com/riscv/riscv-elf-psabi-doc/issues/183
[2] 3b0a7d624e

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Link: https://lore.kernel.org/r/20220529152200.609809-1-ben.dooks@codethink.co.uk
Fixes: ab1ef68e54 ("RISC-V: Add sections of PLT and GOT for kernel module")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-07-21 12:09:29 -07:00
Linus Torvalds
68e77ffbfd Merge tag 'mtd/fixes-for-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fix from Richard Weinberger:
 "A aingle NAND controller fix:

   - gpmi: Fix busy timeout setting (wrong calculation, yes again)"

* tag 'mtd/fixes-for-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: gpmi: Set WAIT_FOR_READY timeout based on program/erase times
2022-07-21 11:28:26 -07:00
Linus Torvalds
7ca433dc6d Merge tag 'net-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from can.

  Still no major regressions, most of the changes are still due to data
  races fixes, plus the usual bunch of drivers fixes.

  Previous releases - regressions:

   - tcp/udp: make early_demux back namespacified.

   - dsa: fix issues with vlan_filtering_is_global

  Previous releases - always broken:

   - ip: fix data-races around ipv4_net_table (round 2, 3 & 4)

   - amt: fix validation and synchronization bugs

   - can: fix detection of mcp251863

   - eth: iavf: fix handling of dummy receive descriptors

   - eth: lan966x: fix issues with MAC table

   - eth: stmmac: dwmac-mediatek: fix clock issue

  Misc:

   - dsa: update documentation"

* tag 'net-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits)
  mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication
  net/sched: cls_api: Fix flow action initialization
  tcp: Fix data-races around sysctl_tcp_max_reordering.
  tcp: Fix a data-race around sysctl_tcp_abort_on_overflow.
  tcp: Fix a data-race around sysctl_tcp_rfc1337.
  tcp: Fix a data-race around sysctl_tcp_stdurg.
  tcp: Fix a data-race around sysctl_tcp_retrans_collapse.
  tcp: Fix data-races around sysctl_tcp_slow_start_after_idle.
  tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts.
  tcp: Fix data-races around sysctl_tcp_recovery.
  tcp: Fix a data-race around sysctl_tcp_early_retrans.
  tcp: Fix data-races around sysctl knobs related to SYN option.
  udp: Fix a data-race around sysctl_udp_l3mdev_accept.
  ip: Fix data-races around sysctl_ip_prot_sock.
  ipv4: Fix data-races around sysctl_fib_multipath_hash_fields.
  ipv4: Fix data-races around sysctl_fib_multipath_hash_policy.
  ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh.
  can: rcar_canfd: Add missing of_node_put() in rcar_canfd_probe()
  can: mcp251xfd: fix detection of mcp251863
  Documentation: fix udp_wmem_min in ip-sysctl.rst
  ...
2022-07-21 11:08:35 -07:00
Harald Freudenberger
918e75f77a s390/archrandom: prevent CPACF trng invocations in interrupt context
This patch slightly reworks the s390 arch_get_random_seed_{int,long}
implementation: Make sure the CPACF trng instruction is never
called in any interrupt context. This is done by adding an
additional condition in_task().

Justification:

There are some constrains to satisfy for the invocation of the
arch_get_random_seed_{int,long}() functions:
- They should provide good random data during kernel initialization.
- They should not be called in interrupt context as the TRNG
  instruction is relatively heavy weight and may for example
  make some network loads cause to timeout and buck.

However, it was not clear what kind of interrupt context is exactly
encountered during kernel init or network traffic eventually calling
arch_get_random_seed_long().

After some days of investigations it is clear that the s390
start_kernel function is not running in any interrupt context and
so the trng is called:

Jul 11 18:33:39 t35lp54 kernel:  [<00000001064e90ca>] arch_get_random_seed_long.part.0+0x32/0x70
Jul 11 18:33:39 t35lp54 kernel:  [<000000010715f246>] random_init+0xf6/0x238
Jul 11 18:33:39 t35lp54 kernel:  [<000000010712545c>] start_kernel+0x4a4/0x628
Jul 11 18:33:39 t35lp54 kernel:  [<000000010590402a>] startup_continue+0x2a/0x40

The condition in_task() is true and the CPACF trng provides random data
during kernel startup.

The network traffic however, is more difficult. A typical call stack
looks like this:

Jul 06 17:37:07 t35lp54 kernel:  [<000000008b5600fc>] extract_entropy.constprop.0+0x23c/0x240
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b560136>] crng_reseed+0x36/0xd8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b5604b8>] crng_make_state+0x78/0x340
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b5607e0>] _get_random_bytes+0x60/0xf8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b56108a>] get_random_u32+0xda/0x248
Jul 06 17:37:07 t35lp54 kernel:  [<000000008aefe7a8>] kfence_guarded_alloc+0x48/0x4b8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008aeff35e>] __kfence_alloc+0x18e/0x1b8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008aef7f10>] __kmalloc_node_track_caller+0x368/0x4d8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b611eac>] kmalloc_reserve+0x44/0xa0
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b611f98>] __alloc_skb+0x90/0x178
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b6120dc>] __napi_alloc_skb+0x5c/0x118
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b8f06b4>] qeth_extract_skb+0x13c/0x680
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b8f6526>] qeth_poll+0x256/0x3f8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b63d76e>] __napi_poll.constprop.0+0x46/0x2f8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b63dbec>] net_rx_action+0x1cc/0x408
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b937302>] __do_softirq+0x132/0x6b0
Jul 06 17:37:07 t35lp54 kernel:  [<000000008abf46ce>] __irq_exit_rcu+0x13e/0x170
Jul 06 17:37:07 t35lp54 kernel:  [<000000008abf531a>] irq_exit_rcu+0x22/0x50
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b922506>] do_io_irq+0xe6/0x198
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b935826>] io_int_handler+0xd6/0x110
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b9358a6>] psw_idle_exit+0x0/0xa
Jul 06 17:37:07 t35lp54 kernel: ([<000000008ab9c59a>] arch_cpu_idle+0x52/0xe0)
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b933cfe>] default_idle_call+0x6e/0xd0
Jul 06 17:37:07 t35lp54 kernel:  [<000000008ac59f4e>] do_idle+0xf6/0x1b0
Jul 06 17:37:07 t35lp54 kernel:  [<000000008ac5a28e>] cpu_startup_entry+0x36/0x40
Jul 06 17:37:07 t35lp54 kernel:  [<000000008abb0d90>] smp_start_secondary+0x148/0x158
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b935b9e>] restart_int_handler+0x6e/0x90

which confirms that the call is in softirq context. So in_task() covers exactly
the cases where we want to have CPACF trng called: not in nmi, not in hard irq,
not in soft irq but in normal task context and during kernel init.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713131721.257907-1-freude@linux.ibm.com
Fixes: e4f7440030 ("s390/archrandom: simplify back to earlier design and initialize earlier")
[agordeev@linux.ibm.com changed desc, added Fixes and Link, removed -stable]
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-21 20:03:03 +02:00
Peter Zijlstra
b67fbebd4c mmu_gather: Force tlb-flush VM_PFNMAP vmas
Jann reported a race between munmap() and unmap_mapping_range(), where
unmap_mapping_range() will no-op once unmap_vmas() has unlinked the
VMA; however munmap() will not yet have invalidated the TLBs.

Therefore unmap_mapping_range() will complete while there are still
(stale) TLB entries for the specified range.

Mitigate this by force flushing TLBs for VM_PFNMAP ranges.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21 10:50:13 -07:00
Peter Zijlstra
18ba064e42 mmu_gather: Let there be one tlb_{start,end}_vma() implementation
Now that architectures are no longer allowed to override
tlb_{start,end}_vma() re-arrange code so that there is only one
implementation for each of these functions.

This much simplifies trying to figure out what they actually do.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21 10:50:13 -07:00
Peter Zijlstra
1d7708e75c csky/tlb: Remove tlb_flush() define
The previous patch removed the tlb_flush_end() implementation which
used tlb_flush_range(). This means:

 - csky did double invalidates, a range invalidate per vma and a full
   invalidate at the end

 - csky actually has range invalidates and as such the generic
   tlb_flush implementation is more efficient for it.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21 10:50:13 -07:00
Peter Zijlstra
1e9fdf21a4 mmu_gather: Remove per arch tlb_{start,end}_vma()
Scattered across the archs are 3 basic forms of tlb_{start,end}_vma().
Provide two new MMU_GATHER_knobs to enumerate them and remove the per
arch tlb_{start,end}_vma() implementations.

 - MMU_GATHER_NO_FLUSH_CACHE indicates the arch has flush_cache_range()
   but does *NOT* want to call it for each VMA.

 - MMU_GATHER_MERGE_VMAS indicates the arch wants to merge the
   invalidate across multiple VMAs if possible.

With these it is possible to capture the three forms:

  1) empty stubs;
     select MMU_GATHER_NO_FLUSH_CACHE and MMU_GATHER_MERGE_VMAS

  2) start: flush_cache_range(), end: empty;
     select MMU_GATHER_MERGE_VMAS

  3) start: flush_cache_range(), end: flush_tlb_range();
     default

Obviously, if the architecture does not have flush_cache_range() then
it also doesn't need to select MMU_GATHER_NO_FLUSH_CACHE.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21 10:50:13 -07:00
Khalid Masum
23a67619bc scripts/gdb: Fix gdb 'lx-symbols' command
Currently the command 'lx-symbols' in gdb exits with the error`Function
"do_init_module" not defined in "kernel/module.c"`. This occurs because
the file kernel/module.c was moved to kernel/module/main.c.

Fix this breakage by changing the path to "kernel/module/main.c" in
LoadModuleBreakpoint.

Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Fixes: cfc1d27789 ("module: Move all into module/")
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21 10:40:55 -07:00
Linus Torvalds
44e29e64cf watch-queue: remove spurious double semicolon
Sedat Dilek noticed that I had an extraneous semicolon at the end of a
line in the previous patch.

It's harmless, but unintentional, and while compilers just treat it as
an extra empty statement, for all I know some other tooling might warn
about it. So clean it up before other people notice too ;)

Fixes: 353f7988dd ("watchqueue: make sure to serialize 'wqueue->defunct' properly")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
2022-07-21 10:30:14 -07:00
Biju Das
b620aa3a7b spi: spi-rspi: Fix PIO fallback on RZ platforms
RSPI IP on RZ/{A, G2L} SoC's has the same signal for both interrupt
and DMA transfer request. Setting DMARS register for DMA transfer
makes the signal to work as a DMA transfer request signal and
subsequent interrupt requests to the interrupt controller
are masked.

PIO fallback does not work as interrupt signal is disabled.

This patch fixes this issue by re-enabling the interrupts by
calling dmaengine_synchronize().

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220721143449.879257-1-biju.das.jz@bp.renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-07-21 17:21:07 +01:00
Dylan Yudaken
934447a603 io_uring: do not recycle buffer in READV
READV cannot recycle buffers as it would lose some of the data required to
reimport that buffer.

Reported-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Fixes: b66e65f414 ("io_uring: never call io_buffer_select() for a buffer re-select")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220721131325.624788-1-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-21 08:31:31 -06:00
Dylan Yudaken
ec8516f3b7 io_uring: fix free of unallocated buffer list
in the error path of io_register_pbuf_ring, only free bl if it was
allocated.

Reported-by: Dipanjan Das <mail.dipanjan.das@gmail.com>
Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/all/CANX2M5bXKw1NaHdHNVqssUUaBCs8aBpmzRNVEYEvV0n44P7ioA@mail.gmail.com/
Link: https://lore.kernel.org/all/CANX2M5YiZBXU3L6iwnaLs-HHJXRvrxM8mhPDiMDF9Y9sAvOHUA@mail.gmail.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-21 08:29:01 -06:00
Arnd Bergmann
430d31bb2e Merge tag 'at91-fixes-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes
AT91 fixes for 5.19 #3

It contains one fix for LAN966 based SoCs fixing the frequency of
sys_clk. sys_clk is feeding different IPs so having proper frequency
for it in DT is necessary for proper working of different drivers.

* tag 'at91-fixes-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux:
  ARM: dts: lan966x: fix sys_clk frequency

Link: https://lore.kernel.org/r/20220721075705.1739915-1-claudiu.beznea@microchip.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-21 14:58:46 +02:00
Juri Lelli
ddfc710395 sched/deadline: Fix BUG_ON condition for deboosted tasks
Tasks the are being deboosted from SCHED_DEADLINE might enter
enqueue_task_dl() one last time and hit an erroneous BUG_ON condition:
since they are not boosted anymore, the if (is_dl_boosted()) branch is
not taken, but the else if (!dl_prio) is and inside this one we
BUG_ON(!is_dl_boosted), which is of course false (BUG_ON triggered)
otherwise we had entered the if branch above. Long story short, the
current condition doesn't make sense and always leads to triggering of a
BUG.

Fix this by only checking enqueue flags, properly: ENQUEUE_REPLENISH has
to be present, but additional flags are not a problem.

Fixes: 64be6f1f5f ("sched/deadline: Don't replenish from a !SCHED_DEADLINE entity")
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220714151908.533052-1-juri.lelli@redhat.com
2022-07-21 10:35:28 +02:00
Dave Airlie
1c46f3c075 Merge tag 'amd-drm-fixes-5.19-2022-07-20' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.19-2022-07-20:

amdgpu:
- Drop redundant buffer cleanup that can lead to a segfault
- Add a bo_list mutex to avoid possible list corruption in CS

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220720210917.6202-1-alexander.deucher@amd.com
2022-07-21 13:22:40 +10:00
Dave Airlie
4b2b2ee1f8 Merge tag 'drm-intel-fixes-2022-07-20-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix the regression caused by the lack of GuC v70.
  Let's accept the fallback to v69.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YtgguaR5JYK083oZ@intel.com
2022-07-21 13:22:23 +10:00
Luben Tuikov
90af0ca047 drm/amdgpu: Protect the amdgpu_bo_list list with a mutex v2
Protect the struct amdgpu_bo_list with a mutex. This is used during command
submission in order to avoid buffer object corruption as recorded in
the link below.

v2 (chk): Keep the mutex looked for the whole CS to avoid using the
	  list from multiple CS threads at the same time.

Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Vitaly Prosyak <Vitaly.Prosyak@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2048
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-07-20 16:23:34 -04:00
Linus Torvalds
353f7988dd watchqueue: make sure to serialize 'wqueue->defunct' properly
When the pipe is closed, we mark the associated watchqueue defunct by
calling watch_queue_clear().  However, while that is protected by the
watchqueue lock, new watchqueue entries aren't actually added under that
lock at all: they use the pipe->rd_wait.lock instead, and looking up
that pipe happens without any locking.

The watchqueue code uses the RCU read-side section to make sure that the
wqueue entry itself hasn't disappeared, but that does not protect the
pipe_info in any way.

So make sure to actually hold the wqueue lock when posting watch events,
properly serializing against the pipe being torn down.

Reported-by: Noam Rathaus <noamr@ssd-disclosure.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-20 10:46:07 -07:00
Sai Krishna Potthuri
e1502ba416 spi: spi-cadence: Fix SPI NO Slave Select macro definition
Fix SPI NO Slave Select macro definition, when all the SPI CS bits
are high which means no slave is selected.

Fixes: 21b511ddee ("spi: spi-cadence: Fix SPI CS gets toggling sporadically")
Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com>
Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@xilinx.com>
Link: https://lore.kernel.org/r/20220713164529.28444-1-amit.kumar-mahapatra@xilinx.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-07-20 18:45:21 +01:00
Kan Liang
b0380e1350 perf/x86/intel/lbr: Fix unchecked MSR access error on HSW
The fuzzer triggers the below trace.

[ 7763.384369] unchecked MSR access error: WRMSR to 0x689
(tried to write 0x1fffffff8101349e) at rIP: 0xffffffff810704a4
(native_write_msr+0x4/0x20)
[ 7763.397420] Call Trace:
[ 7763.399881]  <TASK>
[ 7763.401994]  intel_pmu_lbr_restore+0x9a/0x1f0
[ 7763.406363]  intel_pmu_lbr_sched_task+0x91/0x1c0
[ 7763.410992]  __perf_event_task_sched_in+0x1cd/0x240

On a machine with the LBR format LBR_FORMAT_EIP_FLAGS2, when the TSX is
disabled, a TSX quirk is required to access LBR from registers.
The lbr_from_signext_quirk_needed() is introduced to determine whether
the TSX quirk should be applied. However, the
lbr_from_signext_quirk_needed() is invoked before the
intel_pmu_lbr_init(), which parses the LBR format information. Without
the correct LBR format information, the TSX quirk never be applied.

Move the lbr_from_signext_quirk_needed() into the intel_pmu_lbr_init().
Checking x86_pmu.lbr_has_tsx in the lbr_from_signext_quirk_needed() is
not required anymore.

Both LBR_FORMAT_EIP_FLAGS2 and LBR_FORMAT_INFO have LBR_TSX flag, but
only the LBR_FORMAT_EIP_FLAGS2 requirs the quirk. Update the comments
accordingly.

Fixes: 1ac7fd8159 ("perf/x86/intel/lbr: Support LBR format V7")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220714182630.342107-1-kan.liang@linux.intel.com
2022-07-20 19:24:55 +02:00
Josh Poimboeuf
efc72a665a lkdtm: Disable return thunks in rodata.c
The following warning was seen:

  WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:557 apply_returns (arch/x86/kernel/alternative.c:557 (discriminator 1))
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc4-00008-gee88d363d156 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
  RIP: 0010:apply_returns (arch/x86/kernel/alternative.c:557 (discriminator 1))
  Code: ff ff 74 cb 48 83 c5 04 49 39 ee 0f 87 81 fe ff ff e9 22 ff ff ff 0f 0b 48 83 c5 04 49 39 ee 0f 87 6d fe ff ff e9 0e ff ff ff <0f> 0b 48 83 c5 04 49 39 ee 0f 87 59 fe ff ff e9 fa fe ff ff 48 89

The warning happened when apply_returns() failed to convert "JMP
__x86_return_thunk" to RET.  It was instead a JMP to nowhere, due to the
thunk relocation not getting resolved.

That rodata.o code is objcopy'd to .rodata, and later memcpy'd, so
relocations don't work (and are apparently silently ignored).

LKDTM is only used for testing, so the naked RET should be fine.  So
just disable return thunks for that file.

While at it, disable objtool and KCSAN for the file.

Fixes: 0b53c374b9 ("x86/retpoline: Use -mfunction-return")
Reported-by: kernel test robot <oliver.sang@intel.com>
Debugged-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/Ys58BxHxoDZ7rfpr@xsang-OptiPlex-9020/
2022-07-20 19:24:53 +02:00
Pawan Gupta
eb23b5ef91 x86/bugs: Warn when "ibrs" mitigation is selected on Enhanced IBRS parts
IBRS mitigation for spectre_v2 forces write to MSR_IA32_SPEC_CTRL at
every kernel entry/exit. On Enhanced IBRS parts setting
MSR_IA32_SPEC_CTRL[IBRS] only once at boot is sufficient. MSR writes at
every kernel entry/exit incur unnecessary performance loss.

When Enhanced IBRS feature is present, print a warning about this
unnecessary performance loss.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/2a5eaf54583c2bfe0edc4fea64006656256cca17.1657814857.git.pawan.kumar.gupta@linux.intel.com
2022-07-20 19:24:53 +02:00
Kees Cook
65cdf0d623 x86/alternative: Report missing return thunk details
Debugging missing return thunks is easier if we can see where they're
happening.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/Ys66hwtFcGbYmoiZ@hirez.programming.kicks-ass.net/
2022-07-20 19:24:53 +02:00
Eric Snowberg
543ce63b66 lockdown: Fix kexec lockdown bypass with ima policy
The lockdown LSM is primarily used in conjunction with UEFI Secure Boot.
This LSM may also be used on machines without UEFI.  It can also be
enabled when UEFI Secure Boot is disabled.  One of lockdown's features
is to prevent kexec from loading untrusted kernels.  Lockdown can be
enabled through a bootparam or after the kernel has booted through
securityfs.

If IMA appraisal is used with the "ima_appraise=log" boot param,
lockdown can be defeated with kexec on any machine when Secure Boot is
disabled or unavailable.  IMA prevents setting "ima_appraise=log" from
the boot param when Secure Boot is enabled, but this does not cover
cases where lockdown is used without Secure Boot.

To defeat lockdown, boot without Secure Boot and add ima_appraise=log to
the kernel command line; then:

  $ echo "integrity" > /sys/kernel/security/lockdown
  $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" > \
    /sys/kernel/security/ima/policy
  $ kexec -ls unsigned-kernel

Add a call to verify ima appraisal is set to "enforce" whenever lockdown
is enabled.  This fixes CVE-2022-21505.

Cc: stable@vger.kernel.org
Fixes: 29d3c1c8df ("kexec: Allow kexec_file() with appropriate IMA policy when locked down")
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: John Haxby <john.haxby@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-20 09:56:48 -07:00
Marc Kleine-Budde
4ceaa68445 spi: bcm2835: bcm2835_spi_handle_err(): fix NULL pointer deref for non DMA transfers
In case a IRQ based transfer times out the bcm2835_spi_handle_err()
function is called. Since commit 1513ceee70 ("spi: bcm2835: Drop
dma_pending flag") the TX and RX DMA transfers are unconditionally
canceled, leading to NULL pointer derefs if ctlr->dma_tx or
ctlr->dma_rx are not set.

Fix the NULL pointer deref by checking that ctlr->dma_tx and
ctlr->dma_rx are valid pointers before accessing them.

Fixes: 1513ceee70 ("spi: bcm2835: Drop dma_pending flag")
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20220719072234.2782764-1-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-07-20 14:03:40 +01:00
Kent Gibson
f63731e18e selftests: gpio: fix include path to kernel headers for out of tree builds
When building selftests out of the kernel tree the gpio.h the include
path is incorrect and the build falls back to the system includes
which may be outdated.

Add the KHDR_INCLUDES to the CFLAGS to include the gpio.h from the
build tree.

Fixes: 4f4d0af7b2 ("selftests: gpio: restore CFLAGS options")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-07-20 14:35:18 +02:00
Florian Fainelli
9b31e60800 tools: Fixed MIPS builds due to struct flock re-definition
Building perf for MIPS failed after 9f79b8b723 ("uapi: simplify
__ARCH_FLOCK{,64}_PAD a little") with the following error:

  CC
/home/fainelli/work/buildroot/output/bmips/build/linux-custom/tools/perf/trace/beauty/fcntl.o
In file included from
../../../../host/mipsel-buildroot-linux-gnu/sysroot/usr/include/asm/fcntl.h:77,
                 from ../include/uapi/linux/fcntl.h:5,
                 from trace/beauty/fcntl.c:10:
../include/uapi/asm-generic/fcntl.h:188:8: error: redefinition of
'struct flock'
 struct flock {
        ^~~~~
In file included from ../include/uapi/linux/fcntl.h:5,
                 from trace/beauty/fcntl.c:10:
../../../../host/mipsel-buildroot-linux-gnu/sysroot/usr/include/asm/fcntl.h:63:8:
note: originally defined here
 struct flock {
        ^~~~~

This is due to the local copy under
tools/include/uapi/asm-generic/fcntl.h including the toolchain's kernel
headers which already define 'struct flock' and define
HAVE_ARCH_STRUCT_FLOCK to future inclusions make a decision as to
whether re-defining 'struct flock' is appropriate or not.

Make sure what do not re-define 'struct flock'
when HAVE_ARCH_STRUCT_FLOCK is already defined.

Fixes: 9f79b8b723 ("uapi: simplify __ARCH_FLOCK{,64}_PAD a little")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[arnd: sync with include/uapi/asm-generic/fcntl.h as well]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-20 12:55:33 +02:00
David S. Miller
44484fa8ee Merge tag 'linux-can-fixes-for-5.19-20220720' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:

====================
this is a pull request of 2 patches for net/master.

The first patch is by me and fixes the detection of the mcp251863 in
the mcp251xfd driver.

The last patch is by Liang He and adds a missing of_node_put() in the
rcar_canfd driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 11:13:54 +01:00
Ido Schimmel
e5ec6a2513 mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication
mlxsw needs to distinguish nexthops with a gateway from connected
nexthops in order to write the former to the adjacency table of the
device. The check used to rely on the fact that nexthops with a gateway
have a 'link' scope whereas connected nexthops have a 'host' scope. This
is no longer correct after commit 747c143072 ("ip: fix dflt addr
selection for connected nexthop").

Fix that by instead checking the address family of the gateway IP. This
is a more direct way and also consistent with the IPv6 counterpart in
mlxsw_sp_rt6_is_gateway().

Cc: stable@vger.kernel.org
Fixes: 747c143072 ("ip: fix dflt addr selection for connected nexthop")
Fixes: 597cfe4fc3 ("nexthop: Add support for IPv4 nexthops")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 11:00:46 +01:00
Oz Shlomo
c0f47c2822 net/sched: cls_api: Fix flow action initialization
The cited commit refactored the flow action initialization sequence to
use an interface method when translating tc action instances to flow
offload objects. The refactored version skips the initialization of the
generic flow action attributes for tc actions, such as pedit, that allocate
more than one offload entry. This can cause potential issues for drivers
mapping flow action ids.

Populate the generic flow action fields for all the flow action entries.

Fixes: c54e1d920f ("flow_offload: add ops to tc_action_ops for flow action setup")
Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>

----
v1 -> v2:
 - coalese the generic flow action fields initialization to a single loop
Reviewed-by: Baowen Zheng <baowen.zheng@corigine.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:54:27 +01:00
David S. Miller
3b15b3e93e Merge branch 'net-sysctl-races-round-4'
Kuniyuki Iwashima says:

====================
sysctl: Fix data-races around ipv4_net_table (Round 4).

This series fixes data-races around 17 knobs after fib_multipath_use_neigh
in ipv4_net_table.

tcp_fack was skipped because it's obsolete and there's no readers.

So, round 5 will start with tcp_dsack, 2 rounds left for 27 knobs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:50 +01:00
Kuniyuki Iwashima
a11e5b3e7a tcp: Fix data-races around sysctl_tcp_max_reordering.
While reading sysctl_tcp_max_reordering, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: dca145ffaa ("tcp: allow for bigger reordering level")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:50 +01:00
Kuniyuki Iwashima
2d17d9c738 tcp: Fix a data-race around sysctl_tcp_abort_on_overflow.
While reading sysctl_tcp_abort_on_overflow, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:50 +01:00
Kuniyuki Iwashima
0b484c9191 tcp: Fix a data-race around sysctl_tcp_rfc1337.
While reading sysctl_tcp_rfc1337, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:50 +01:00
Kuniyuki Iwashima
4e08ed41cb tcp: Fix a data-race around sysctl_tcp_stdurg.
While reading sysctl_tcp_stdurg, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:50 +01:00
Kuniyuki Iwashima
1a63cb91f0 tcp: Fix a data-race around sysctl_tcp_retrans_collapse.
While reading sysctl_tcp_retrans_collapse, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:50 +01:00
Kuniyuki Iwashima
4845b5713a tcp: Fix data-races around sysctl_tcp_slow_start_after_idle.
While reading sysctl_tcp_slow_start_after_idle, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 35089bb203 ("[TCP]: Add tcp_slow_start_after_idle sysctl.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:50 +01:00
Kuniyuki Iwashima
7c6f2a86ca tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts.
While reading sysctl_tcp_thin_linear_timeouts, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 36e31b0af5 ("net: TCP thin linear timeouts")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:50 +01:00
Kuniyuki Iwashima
e7d2ef837e tcp: Fix data-races around sysctl_tcp_recovery.
While reading sysctl_tcp_recovery, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 4f41b1c58a ("tcp: use RACK to detect losses")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:50 +01:00
Kuniyuki Iwashima
52e65865de tcp: Fix a data-race around sysctl_tcp_early_retrans.
While reading sysctl_tcp_early_retrans, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: eed530b6c6 ("tcp: early retransmit")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:49 +01:00
Kuniyuki Iwashima
3666f666e9 tcp: Fix data-races around sysctl knobs related to SYN option.
While reading these knobs, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.

  - tcp_sack
  - tcp_window_scaling
  - tcp_timestamps

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:49 +01:00
Kuniyuki Iwashima
3d72bb4188 udp: Fix a data-race around sysctl_udp_l3mdev_accept.
While reading sysctl_udp_l3mdev_accept, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 63a6fff353 ("net: Avoid receiving packets with an l3mdev on unbound UDP sockets")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:49 +01:00
Kuniyuki Iwashima
9b55c20f83 ip: Fix data-races around sysctl_ip_prot_sock.
sysctl_ip_prot_sock is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

Fixes: 4548b683b7 ("Introduce a sysctl that modifies the value of PROT_SOCK.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:49 +01:00
Kuniyuki Iwashima
8895a9c2ac ipv4: Fix data-races around sysctl_fib_multipath_hash_fields.
While reading sysctl_fib_multipath_hash_fields, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: ce5c9c20d3 ("ipv4: Add a sysctl to control multipath hash fields")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:49 +01:00
Kuniyuki Iwashima
7998c12a08 ipv4: Fix data-races around sysctl_fib_multipath_hash_policy.
While reading sysctl_fib_multipath_hash_policy, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: bf4e0a3db9 ("net: ipv4: add support for ECMP hash policy choice")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:49 +01:00
Kuniyuki Iwashima
87507bcb4f ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh.
While reading sysctl_fib_multipath_use_neigh, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: a6db4494d2 ("net: ipv4: Consider failed nexthops in multipath routes")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:14:49 +01:00
David S. Miller
ef5621758a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2022-07-20

1) Fix a policy refcount imbalance in xfrm_bundle_lookup.
   From Hangyu Hua.

2) Fix some clang -Wformat warnings.
   Justin Stitt
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20 10:11:58 +01:00
Liang He
7b66dfcc6e can: rcar_canfd: Add missing of_node_put() in rcar_canfd_probe()
We should use of_node_put() for the reference returned by
of_get_child_by_name() which has increased the refcount.

Fixes: 45721c406d ("can: rcar_canfd: Add support for r8a779a0 SoC")
Link: https://lore.kernel.org/all/20220712095623.364287-1-windhl@126.com
Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-07-20 10:20:19 +02:00
Marc Kleine-Budde
db87c005b9 can: mcp251xfd: fix detection of mcp251863
In commit c6f2a617a0 ("can: mcp251xfd: add support for mcp251863")
support for the mcp251863 was added. However it was not taken into
account that the auto detection of the chip model cannot distinguish
between mcp2518fd and mcp251863 and would lead to a warning message if
the firmware specifies a mcp251863.

Fix auto detection: If a mcp2518fd compatible chip is found, keep the
mcp251863 if specified by firmware, use mcp2518fd instead.

Link: https://lore.kernel.org/all/20220706064835.1848864-1-mkl@pengutronix.de
Fixes: c6f2a617a0 ("can: mcp251xfd: add support for mcp251863")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-07-20 10:20:19 +02:00
Liang He
02c87df248 drm/imx/dcss: Add missing of_node_put() in fail path
In dcss_dev_create() and dcss_dev_destroy(), we should call of_node_put()
in fail path or before the dcss's destroy as of_graph_get_port_by_id() has
increased the refcount.

Fixes: 9021c317b7 ("drm/imx: Add initial support for DCSS on iMX8MQ")
Signed-off-by: Liang He <windhl@126.com>
Reviewed-by: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com>
Signed-off-by: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220714081337.374761-1-windhl@126.com
2022-07-20 10:12:15 +03:00
Baolin Wang
7849f5cf76 mailmap: update Baolin Wang's email
I recently switched to my Alibaba email address. So add aliases for my
previous email addresses.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-20 09:11:10 +02:00
Michael Ellerman
be640317a1 powerpc/64s: Disable stack variable initialisation for prom_init
With GCC 12 allmodconfig prom_init fails to build:

  Error: External symbol 'memset' referenced from prom_init.c
  make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

The allmodconfig build enables KASAN, so all calls to memset in
prom_init should be converted to __memset by the #ifdefs in
asm/string.h, because prom_init must use the non-KASAN instrumented
versions.

The build failure happens because there's a call to memset that hasn't
been caught by the pre-processor and converted to __memset. Typically
that's because it's a memset generated by the compiler itself, and that
is the case here.

With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
causes the compiler to emit memset calls to initialise on-stack
variables with a pattern.

Because prom_init is non-user-facing boot-time only code, as a
workaround just disable stack variable initialisation to unbreak the
build.

Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220718134418.354114-1-mpe@ellerman.id.au
2022-07-20 15:13:02 +10:00
Daniele Ceraolo Spurio
443148858f drm/i915/guc: support v69 in parallel to v70
This patch re-introduces support for GuC v69 in parallel to v70. As this
is a quick fix, v69 has been re-introduced as the single "fallback" guc
version in case v70 is not available on disk and only for platforms that
are out of force_probe and require the GuC by default. All v69 specific
code has been labeled as such for easy identification, and the same was
done for all v70 functions for which there is a separate v69 version,
to avoid accidentally calling the wrong version via the unlabeled name.

When the fallback mode kicks in, a drm_notice message is printed in
dmesg to inform the user of the required update. The existing
logging of the fetch function has also been updated so that we no
longer complain immediately if we can't find a fw and we only throw an
error if the fetch of both the base and fallback blobs fails.

The plan is to follow this up with a more complex rework to allow for
multiple different GuC versions to be supported at the same time.

v2: reduce the fallback to platform that require it, switch to
firmware_request_nowarn(), improve logs.

Fixes: 2584b3549f ("drm/i915/guc: Update to GuC version 70.1.1")
Link: https://lists.freedesktop.org/archives/intel-gfx/2022-July/301640.html
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220718230732.1409641-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 774ce1510e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-07-19 21:25:03 -04:00
Matthew Brost
e7999fa14f drm/i915/guc: Support programming the EU priority in the GuC descriptor
In GuC submission mode the EU priority must be updated by the GuC rather
than the driver as the GuC owns the programming of the context descriptor.

Given that the GuC code uses the GuC priorities, we can't use a generic
function using i915 priorities for both execlists and GuC submission.
The existing function has therefore been pushed to the execlists
back-end while a new one has been added for GuC.

v2: correctly use the GuC prio.

Cc: John Harrison <john.c.harrison@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220504234636.2119794-1-daniele.ceraolospurio@intel.com
(cherry picked from commit a5c89f7c43)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-07-19 21:24:48 -04:00
Jakub Kicinski
48ea8ea32d Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-07-18

This series contains updates to iavf driver only.

Przemyslaw fixes handling of multiple VLAN requests to account for
individual errors instead of rejecting them all. He removes incorrect
implementations of ETHTOOL_COALESCE_MAX_FRAMES and
ETHTOOL_COALESCE_MAX_FRAMES_IRQ.

He also corrects an issue with NULL pointer caused by improper handling of
dummy receive descriptors. Finally, he corrects debug prints reporting an
unknown state.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  iavf: Fix missing state logs
  iavf: Fix handling of dummy receive descriptors
  iavf: Disallow changing rx/tx-frames and rx/tx-frames-irq
  iavf: Fix VLAN_V2 addition/rejection
====================

Link: https://lore.kernel.org/r/20220718174807.4113582-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19 17:43:02 -07:00
Xin Long
c6b10de537 Documentation: fix udp_wmem_min in ip-sysctl.rst
UDP doesn't support tx memory accounting, and sysctl udp_wmem_min
is not really used anywhere. So we should fix the description in
ip-sysctl.rst accordingly.

Fixes: 95766fff6b ("[UDP]: Add memory accounting.")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/c880a963d9b1fb5f442ae3c9e4dfa70d45296a16.1658167019.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19 17:34:53 -07:00
Lorenzo Bianconi
53eb9b0456 net: ethernet: mtk_ppe: fix possible NULL pointer dereference in mtk_flow_get_wdma_info
odev pointer can be NULL in mtk_flow_offload_replace routine according
to the flower action rules. Fix possible NULL pointer dereference in
mtk_flow_get_wdma_info.

Fixes: a333215e10 ("net: ethernet: mtk_eth_soc: implement flow offloading to WED devices")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/4e1685bc4976e21e364055f6bee86261f8f9ee93.1658137753.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19 17:27:18 -07:00
Hayes Wang
cdf0b86b25 r8152: fix a WOL issue
This fixes that the platform is waked by an unexpected packet. The
size and range of FIFO is different when the device enters S3 state,
so it is necessary to correct some settings when suspending.

Regardless of jumbo frame, set RMS to 1522 and MTPS to MTPS_DEFAULT.
Besides, enable MCU_BORW_EN to update the method of calculating the
pointer of data. Then, the hardware could get the correct data.

Fixes: 195aae321c ("r8152: support new chips")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20220718082120.10957-391-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19 17:10:56 -07:00
Nícolas F. R. A. Prado
ef2084a838 drm/panel-edp: Fix variable typo when saving hpd absent delay from DT
The value read from the "hpd-absent-delay-ms" property in DT was being
saved to the wrong variable, overriding the hpd_reliable delay. Fix the
typo.

Fixes: 5540cf8f3e ("drm/panel-edp: Implement generic "edp-panel"s probed by EDID")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220719203857.1488831-4-nfraprado@collabora.com
2022-07-19 15:47:47 -07:00
Tom Lendacky
908fc4c2ab virt: sev-guest: Pass the appropriate argument type to iounmap()
Fix a sparse warning in sev_guest_probe() where the wrong argument type is
provided to iounmap().

Fixes: fce96cf044 ("virt: Add SEV-SNP guest driver")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/202207150617.jqwQ0Rpz-lkp@intel.com
2022-07-19 22:26:02 +02:00
Jens Axboe
82e094f7bd Merge branch 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.19
Pull MD fix from Song.

* 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  md/raid5: missing error code in setup_conf()
2022-07-19 12:42:33 -06:00
Neeraj Upadhyay
4f2bfd9494 srcu: Make expedited RCU grace periods block even less frequently
The purpose of commit 282d8998e9 ("srcu: Prevent expedited GPs
and blocking readers from consuming CPU") was to prevent a long
series of never-blocking expedited SRCU grace periods from blocking
kernel-live-patching (KLP) progress.  Although it was successful, it also
resulted in excessive boot times on certain embedded workloads running
under qemu with the "-bios QEMU_EFI.fd" command line.  Here "excessive"
means increasing the boot time up into the three-to-four minute range.
This increase in boot time was due to the more than 6000 back-to-back
invocations of synchronize_rcu_expedited() within the KVM host OS, which
in turn resulted from qemu's emulation of a long series of MMIO accesses.

Commit 640a7d37c3f4 ("srcu: Block less aggressively for expedited grace
periods") did not significantly help this particular use case.

Zhangfei Gao and Shameerali Kolothum Thodi did experiments varying the
value of SRCU_MAX_NODELAY_PHASE with HZ=250 and with various values
of non-sleeping per phase counts on a system with preemption enabled,
and observed the following boot times:

+──────────────────────────+────────────────+
| SRCU_MAX_NODELAY_PHASE   | Boot time (s)  |
+──────────────────────────+────────────────+
| 100                      | 30.053         |
| 150                      | 25.151         |
| 200                      | 20.704         |
| 250                      | 15.748         |
| 500                      | 11.401         |
| 1000                     | 11.443         |
| 10000                    | 11.258         |
| 1000000                  | 11.154         |
+──────────────────────────+────────────────+

Analysis on the experiment results show additional improvements with
CPU-bound delays approaching one jiffy in duration. This improvement was
also seen when number of per-phase iterations were scaled to one jiffy.

This commit therefore scales per-grace-period phase number of non-sleeping
polls so that non-sleeping polls extend for about one jiffy. In addition,
the delay-calculation call to srcu_get_delay() in srcu_gp_end() is
replaced with a simple check for an expedited grace period.  This change
schedules callback invocation immediately after expedited grace periods
complete, which results in greatly improved boot times.  Testing done
by Marc and Zhangfei confirms that this change recovers most of the
performance degradation in boottime; for CONFIG_HZ_250 configuration,
specifically, boot times improve from 3m50s to 41s on Marc's setup;
and from 2m40s to ~9.7s on Zhangfei's setup.

In addition to the changes to default per phase delays, this
change adds 3 new kernel parameters - srcutree.srcu_max_nodelay,
srcutree.srcu_max_nodelay_phase, and srcutree.srcu_retry_check_delay.
This allows users to configure the srcu grace period scanning delays in
order to more quickly react to additional use cases.

Fixes: 640a7d37c3f4 ("srcu: Block less aggressively for expedited grace periods")
Fixes: 282d8998e9 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU")
Reported-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Reported-by: yueluck <yueluck@163.com>
Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Link: https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-07-19 11:39:59 -07:00
Paul E. McKenney
8f870e6eb8 srcu: Block less aggressively for expedited grace periods
Commit 282d8998e9 ("srcu: Prevent expedited GPs and blocking readers
from consuming CPU") fixed a problem where a long-running expedited SRCU
grace period could block kernel live patching.  It did so by giving up
on expediting once a given SRCU expedited grace period grew too old.

Unfortunately, this added excessive delays to boots of virtual embedded
systems specifying "-bios QEMU_EFI.fd" to qemu.  This commit therefore
makes the transition away from expediting less aggressive, increasing
the per-grace-period phase number of non-sleeping polls of readers from
one to three and increasing the required grace-period age from one jiffy
(actually from zero to one jiffies) to two jiffies (actually from one
to two jiffies).

Fixes: 282d8998e9 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reported-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Reported-by: chenxiang (M)" <chenxiang66@hisilicon.com>
Cc: Shameerali Kolothum Thodi  <shameerali.kolothum.thodi@huawei.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Link: https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/
2022-07-19 11:39:59 -07:00
Aaron Lewis
cf5029d5dd KVM: x86: Protect the unused bits in MSR exiting flags
The flags for KVM_CAP_X86_USER_SPACE_MSR and KVM_X86_SET_MSR_FILTER
have no protection for their unused bits.  Without protection, future
development for these features will be difficult.  Add the protection
needed to make it possible to extend these features in the future.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Message-Id: <20220714161314.1715227-1-aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-19 14:04:18 -04:00
Dan Carpenter
5f7ef4875f md/raid5: missing error code in setup_conf()
Return -ENOMEM if the allocation fails.  Don't return success.

Fixes: 8fbcba6b99 ("md/raid5: Cleanup setup_conf() error returns")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>
2022-07-19 10:58:33 -07:00
Paolo Bonzini
dc951e22a1 tools headers UAPI: Sync linux/kvm.h with the kernel sources
Silence this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-19 09:16:53 -04:00
Gavin Shan
e923b0537d KVM: selftests: Fix target thread to be migrated in rseq_test
In rseq_test, there are two threads, which are vCPU thread and migration
worker separately. Unfortunately, the test has the wrong PID passed to
sched_setaffinity() in the migration worker. It forces migration on the
migration worker because zeroed PID represents the calling thread, which
is the migration worker itself. It means the vCPU thread is never enforced
to migration and it can migrate at any time, which eventually leads to
failure as the following logs show.

  host# uname -r
  5.19.0-rc6-gavin+
  host# # cat /proc/cpuinfo | grep processor | tail -n 1
  processor    : 223
  host# pwd
  /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
  host# for i in `seq 1 100`; do \
        echo "--------> $i"; ./rseq_test; done
  --------> 1
  --------> 2
  --------> 3
  --------> 4
  --------> 5
  --------> 6
  ==== Test Assertion Failure ====
    rseq_test.c:265: rseq_cpu == cpu
    pid=3925 tid=3925 errno=4 - Interrupted system call
       1  0x0000000000401963: main at rseq_test.c:265 (discriminator 2)
       2  0x0000ffffb044affb: ?? ??:0
       3  0x0000ffffb044b0c7: ?? ??:0
       4  0x0000000000401a6f: _start at ??:?
    rseq CPU = 4, sched CPU = 27

Fix the issue by passing correct parameter, TID of the vCPU thread, to
sched_setaffinity() in the migration worker.

Fixes: 61e52f1630 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Message-Id: <20220719020830.3479482-1-gshan@redhat.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-19 09:03:49 -04:00
Oliver Upton
450a563924 KVM: stats: Fix value for KVM_STATS_UNIT_MAX for boolean stats
commit 1b870fa557 ("kvm: stats: tell userspace which values are
boolean") added a new stat unit (boolean) but failed to raise
KVM_STATS_UNIT_MAX.

Fix by pointing UNIT_MAX at the new max value of UNIT_BOOLEAN.

Fixes: 1b870fa557 ("kvm: stats: tell userspace which values are boolean")
Reported-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20220719125229.2934273-1-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-19 08:54:11 -04:00
Paolo Abeni
b3fcfc4f0c Merge branch 'amt-fix-validation-and-synchronization-bugs'
Taehee Yoo says:

====================
amt: fix validation and synchronization bugs

There are some synchronization issues in the amt module.
Especially, an amt gateway doesn't well synchronize its own variables
and status(amt->status).
It tries to use a workqueue for handles in a single thread.
A global lock is also good, but it would occur complex locking complex.

In this patchset, only the gateway uses workqueue.
The reason why only gateway interface uses workqueue is that gateway
should manage its own states and variables a little bit statefully.
But relay doesn't need to manage tunnels statefully, stateless is okay.
So, relay side message handlers are okay to be called concurrently.
But it doesn't mean that no lock is needed.

Only amt multicast data message type will not be processed by the work
queue because It contains actual multicast data.
So, it should be processed immediately.

When any amt gateway events are triggered(sending discovery message by
delayed_work, sending request message by delayed_work and receiving
messages), it stores event and skb into the event queue(amt->events[16]).
Then, workqueue processes these events one by one.

The first patch is to use the work queue.

The second patch is to remove unnecessary lock due to a previous patch.

The third patch is to use READ_ONCE() in the amt module.
Even if the amt module uses a single thread, some variables (ready4,
ready6, amt->status) can be accessed concurrently.

The fourth patch is to add missing nonce generation logic when it sends a
new request message.

The fifth patch is to drop unexpected advertisement messages.
advertisement message should be received only after the gateway sends
a discovery message first.
So, the gateway should drop advertisement messages if it has never
sent a discovery message and it also should drop duplicate advertisement
messages.
Using nonce is good to distinguish whether a received message is an
expected message or not.

The sixth patch is to drop unexpected query messages.
This is the same behavior as the fourth patch.
Query messages should be received only after the gateway sends a request
message first.
The nonce variable is used to distinguish whether it is a reply to a
previous request message or not.
amt->ready4 and amt->ready6 are used to distinguish duplicate messages.

The seventh patch is to drop unexpected multicast data.
AMT gateway should not receive multicast data message type before
establish between gateway and relay.
In order to drop unexpected multicast data messages, it checks amt->status.

The last patch is to fix a locking problem on the relay side.
amt->nr_tunnels variable is protected by amt->lock.
But amt_request_handler() doesn't protect this variable.

v2:
 - Use local_bh_disable() instead of rcu_read_lock_bh() in
   amt_membership_query_handler.
 - Fix using uninitialized variables.
 - Fix unexpectedly start the event_wq after stopping.
 - Fix possible deadlock in amt_event_work().
 - Add a limit variable in amt_event_work() to prevent infinite working.
 - Rename amt_queue_events() to amt_queue_event().
====================

Link: https://lore.kernel.org/r/20220717160910.19156-1-ap420073@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:37:05 +02:00
Taehee Yoo
989918482b amt: do not use amt->nr_tunnels outside of lock
amt->nr_tunnels is protected by amt->lock.
But, amt_request_handler() has been using this variable without the
amt->lock.
So, it expands context of amt->lock in the amt_request_handler() to
protect amt->nr_tunnels variable.

Fixes: cbc21dc1cf ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:37:02 +02:00
Taehee Yoo
e882827d5b amt: drop unexpected multicast data
AMT gateway interface should not receive unexpected multicast data.
Multicast data message type should be received after sending an update
message, which means all establishment between gateway and relay is
finished.
So, amt_multicast_data_handler() checks amt->status.

Fixes: cbc21dc1cf ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:37:02 +02:00
Taehee Yoo
239d886601 amt: drop unexpected query message
AMT gateway interface should not receive unexpected query messages.
In order to drop unexpected query messages, it checks nonce.
And it also checks ready4 and ready6 variables to drop duplicated messages.

Fixes: cbc21dc1cf ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:37:02 +02:00
Taehee Yoo
40185f359f amt: drop unexpected advertisement message
AMT gateway interface should not receive unexpected advertisement messages.
In order to drop these packets, it should check nonce and amt->status.

Fixes: cbc21dc1cf ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:37:02 +02:00
Taehee Yoo
627f16931b amt: add missing regeneration nonce logic in request logic
When AMT gateway starts sending a new request message, it should
regenerate the nonce variable.

Fixes: cbc21dc1cf ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:37:02 +02:00
Taehee Yoo
928f353cb8 amt: use READ_ONCE() in amt module
There are some data races in the amt module.
amt->ready4, amt->ready6, and amt->status can be accessed concurrently
without locks.
So, it uses READ_ONCE() and WRITE_ONCE().

Fixes: cbc21dc1cf ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:37:02 +02:00
Taehee Yoo
9c343ea618 amt: remove unnecessary locks
By the previous patch, amt gateway handlers are changed to worked by
a single thread.
So, most locks for gateway are not needed.
So, it removes.

Fixes: cbc21dc1cf ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:37:02 +02:00
Taehee Yoo
30e22a6ebc amt: use workqueue for gateway side message handling
There are some synchronization issues(amt->status, amt->req_cnt, etc)
if the interface is in gateway mode because gateway message handlers
are processed concurrently.
This applies a work queue for processing these messages instead of
expanding the locking context.

So, the purposes of this patch are to fix exist race conditions and to make
gateway to be able to validate a gateway status more correctly.

When the AMT gateway interface is created, it tries to establish to relay.
The establishment step looks stateless, but it should be managed well.
In order to handle messages in the gateway, it saves the current
status(i.e. AMT_STATUS_XXX).
This patch makes gateway code to be worked with a single thread.

Now, all messages except the multicast are triggered(received or
delay expired), and these messages will be stored in the event
queue(amt->events).
Then, the single worker processes stored messages asynchronously one
by one.
The multicast data message type will be still processed immediately.

Now, amt->lock is only needed to access the event queue(amt->events)
if an interface is the gateway mode.

Fixes: cbc21dc1cf ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:37:02 +02:00
Oleksij Rempel
1774559f07 net: dsa: vitesse-vsc73xx: silent spi_device_id warnings
Add spi_device_id entries to silent SPI warnings.

Fixes: 5fa6863ba6 ("spi: Check we have a spi_device_id for each DT compatible")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220717135831.2492844-2-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:20:40 +02:00
Oleksij Rempel
855fe49984 net: dsa: sja1105: silent spi_device_id warnings
Add spi_device_id entries to silent following warnings:
 SPI driver sja1105 has no spi_device_id for nxp,sja1105e
 SPI driver sja1105 has no spi_device_id for nxp,sja1105t
 SPI driver sja1105 has no spi_device_id for nxp,sja1105p
 SPI driver sja1105 has no spi_device_id for nxp,sja1105q
 SPI driver sja1105 has no spi_device_id for nxp,sja1105r
 SPI driver sja1105 has no spi_device_id for nxp,sja1105s
 SPI driver sja1105 has no spi_device_id for nxp,sja1110a
 SPI driver sja1105 has no spi_device_id for nxp,sja1110b
 SPI driver sja1105 has no spi_device_id for nxp,sja1110c
 SPI driver sja1105 has no spi_device_id for nxp,sja1110d

Fixes: 5fa6863ba6 ("spi: Check we have a spi_device_id for each DT compatible")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220717135831.2492844-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 12:20:40 +02:00
Hristo Venev
d7241f679a be2net: Fix buffer overflow in be_get_module_eeprom
be_cmd_read_port_transceiver_data assumes that it is given a buffer that
is at least PAGE_DATA_LEN long, or twice that if the module supports SFF
8472. However, this is not always the case.

Fix this by passing the desired offset and length to
be_cmd_read_port_transceiver_data so that we only copy the bytes once.

Fixes: e36edd9d26 ("be2net: add ethtool "-m" option support")
Signed-off-by: Hristo Venev <hristo@venev.name>
Link: https://lore.kernel.org/r/20220716085134.6095-1-hristo@venev.name
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19 11:51:16 +02:00
Haibo Chen
b8c768ccdd gpio: pca953x: use the correct register address when regcache sync during init
For regcache_sync_region, we need to use pca953x_recalc_addr() to get
the real register address.

Fixes: ec82d1eba3 ("gpio: pca953x: Zap ad-hoc reg_output cache")
Fixes: 0f25fda840 ("gpio: pca953x: Zap ad-hoc reg_direction cache")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-07-19 11:20:41 +02:00
Haibo Chen
2abc17a938 gpio: pca953x: use the correct range when do regmap sync
regmap will sync a range of registers, here use the correct range
to make sure the sync do not touch other unexpected registers.

Find on pca9557pw on imx8qxp/dxl evk board, this device support
8 pin, so only need one register(8 bits) to cover all the 8 pins's
property setting. But when sync the output, we find it actually
update two registers, output register and the following register.

Fixes: b765743005 ("gpio: pca953x: Restore registers after suspend/resume cycle")
Fixes: ec82d1eba3 ("gpio: pca953x: Zap ad-hoc reg_output cache")
Fixes: 0f25fda840 ("gpio: pca953x: Zap ad-hoc reg_direction cache")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-07-19 11:19:16 +02:00
Haibo Chen
db8edaa09d gpio: pca953x: only use single read/write for No AI mode
For the device use NO AI mode(not support auto address increment),
only use the single read/write when config the regmap.

We meet issue on PCA9557PW on i.MX8QXP/DXL evk board, this device
do not support AI mode, but when do the regmap sync, regmap will
sync 3 byte data to register 1, logically this means write first
data to register 1, write second data to register 2, write third data
to register 3. But this device do not support AI mode, finally, these
three data write only into register 1 one by one. the reault is the
value of register 1 alway equal to the latest data, here is the third
data, no operation happened on register 2 and register 3. This is
not what we expect.

Fixes: 4942723276 ("gpio: pca953x: Perform basic regmap conversion")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-07-19 11:18:08 +02:00
Herve Codina
25c2a075eb clk: lan966x: Fix the lan966x clock gate register address
The register address used for the clock gate register is the base
register address coming from first reg map (ie. the generic
clock registers) instead of the second reg map defining the clock
gate register.

Use the correct clock gate register address.

Fixes: 5ad5915dea ("clk: lan966x: Extend lan966x clock driver for clock gating support")
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://lore.kernel.org/r/20220704102845.168438-2-herve.codina@bootlin.com
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-07-19 00:04:10 -07:00
Wong Vee Khee
da791bac10 net: stmmac: remove redunctant disable xPCS EEE call
Disable is done in stmmac_init_eee() on the event of MAC link down.
Since setting enable/disable EEE via ethtool will eventually trigger
a MAC down, removing this redunctant call in stmmac_ethtool.c to avoid
calling xpcs_config_eee() twice.

Fixes: d4aeaed80b ("net: stmmac: trigger PCS EEE to turn off on link down")
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Link: https://lore.kernel.org/r/20220715122402.1017470-1-vee.khee.wong@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:15:45 -07:00
Jakub Kicinski
49a2f5c88e Merge branch 'fix-2-dsa-issues-with-vlan_filtering_is_global'
Vladimir Oltean says:

====================
Fix 2 DSA issues with vlan_filtering_is_global

This patch set fixes 2 issues with vlan_filtering_is_global switches.

Both are regressions introduced by refactoring commit d0004a020b
("net: dsa: remove the "dsa_to_port in a loop" antipattern from the
core"), which wasn't tested on a wide enough variety of switches.

Tested on the sja1105 driver.
====================

Link: https://lore.kernel.org/r/20220715151659.780544-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:14:28 -07:00
Vladimir Oltean
1699b4d502 net: dsa: fix NULL pointer dereference in dsa_port_reset_vlan_filtering
The "ds" iterator variable used in dsa_port_reset_vlan_filtering() ->
dsa_switch_for_each_port() overwrites the "dp" received as argument,
which is later used to call dsa_port_vlan_filtering() proper.

As a result, switches which do enter that code path (the ones with
vlan_filtering_is_global=true) will dereference an invalid dp in
dsa_port_reset_vlan_filtering() after leaving a VLAN-aware bridge.

Use a dedicated "other_dp" iterator variable to avoid this from
happening.

Fixes: d0004a020b ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:14:23 -07:00
Vladimir Oltean
4db2a5ef4c net: dsa: fix dsa_port_vlan_filtering when global
The blamed refactoring commit changed a "port" iterator with "other_dp",
but still looked at the slave_dev of the dp outside the loop, instead of
other_dp->slave from the loop.

As a result, dsa_port_vlan_filtering() would not call
dsa_slave_manage_vlan_filtering() except for the port in cause, and not
for all switch ports as expected.

Fixes: d0004a020b ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core")
Reported-by: Lucian Banu <Lucian.Banu@westermo.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:14:23 -07:00
Piotr Skajewski
1e53834ce5 ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero
It is possible to disable VFs while the PF driver is processing requests
from the VF driver.  This can result in a panic.

BUG: unable to handle kernel paging request at 000000000000106c
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 8 PID: 0 Comm: swapper/8 Kdump: loaded Tainted: G I      --------- -
Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS 2.8.2 08/27/2020
RIP: 0010:ixgbe_msg_task+0x4c8/0x1690 [ixgbe]
Code: 00 00 48 8d 04 40 48 c1 e0 05 89 7c 24 24 89 fd 48 89 44 24 10 83 ff
01 0f 84 b8 04 00 00 4c 8b 64 24 10 4d 03 a5 48 22 00 00 <41> 80 7c 24 4c
00 0f 84 8a 03 00 00 0f b7 c7 83 f8 08 0f 84 8f 0a
RSP: 0018:ffffb337869f8df8 EFLAGS: 00010002
RAX: 0000000000001020 RBX: 0000000000000000 RCX: 000000000000002b
RDX: 0000000000000002 RSI: 0000000000000008 RDI: 0000000000000006
RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000029780
R10: 00006957d8f42832 R11: 0000000000000000 R12: 0000000000001020
R13: ffff8a00e8978ac0 R14: 000000000000002b R15: ffff8a00e8979c80
FS:  0000000000000000(0000) GS:ffff8a07dfd00000(0000) knlGS:00000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000106c CR3: 0000000063e10004 CR4: 00000000007726e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <IRQ>
 ? ttwu_do_wakeup+0x19/0x140
 ? try_to_wake_up+0x1cd/0x550
 ? ixgbevf_update_xcast_mode+0x71/0xc0 [ixgbevf]
 ixgbe_msix_other+0x17e/0x310 [ixgbe]
 __handle_irq_event_percpu+0x40/0x180
 handle_irq_event_percpu+0x30/0x80
 handle_irq_event+0x36/0x53
 handle_edge_irq+0x82/0x190
 handle_irq+0x1c/0x30
 do_IRQ+0x49/0xd0
 common_interrupt+0xf/0xf

This can be eventually be reproduced with the following script:

while :
do
    echo 63 > /sys/class/net/<devname>/device/sriov_numvfs
    sleep 1
    echo 0 > /sys/class/net/<devname>/device/sriov_numvfs
    sleep 1
done

Add lock when disabling SR-IOV to prevent process VF mailbox communication.

Fixes: d773d13106 ("ixgbe: Fix memory leak when SR-IOV VFs are direct assigned")
Signed-off-by: Piotr Skajewski <piotrx.skajewski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220715214456.2968711-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:13:40 -07:00
Dawid Lukwinski
f838a63369 i40e: Fix erroneous adapter reinitialization during recovery process
Fix an issue when driver incorrectly detects state
of recovery process and erroneously reinitializes interrupts,
which results in a kernel error and call trace message.

The issue was caused by a combination of two factors:
1. Assuming the EMP reset issued after completing
firmware recovery means the whole recovery process is complete.
2. Erroneous reinitialization of interrupt vector after detecting
the above mentioned EMP reset.

Fixes (1) by changing how recovery state change is detected
and (2) by adjusting the conditional expression to ensure using proper
interrupt reinitialization method, depending on the situation.

Fixes: 4ff0ee1af0 ("i40e: Introduce recovery mode support")
Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220715214542.2968762-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:13:22 -07:00
Tom Rix
3696c952da net: ethernet: mtk_eth_soc: fix off by one check of ARRAY_SIZE
In mtk_wed_tx_ring_setup(.., int idx, ..), idx is used as an index here
  struct mtk_wed_ring *ring = &dev->tx_ring[idx];

The bounds of idx are checked here
  BUG_ON(idx > ARRAY_SIZE(dev->tx_ring));

If idx is the size of the array, it will pass this check and overflow.
So change the check to >= .

Fixes: 804775dfc2 ("net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)")
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20220716214654.1540240-1-trix@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:12:04 -07:00
Jakub Kicinski
b6224a36de Merge branch 'net-lan966x-fix-issues-with-mac-table'
Horatiu Vultur says:

====================
net: lan966x: Fix issues with MAC table

The patch series fixes 2 issues:
- when an entry was forgotten the irq thread was holding a spin lock and then
  was talking also rtnl_lock.
- the access to the HW MAC table is indirect, so the access to the HW MAC
  table was not synchronized, which means that there could be race conditions.
====================

Link: https://lore.kernel.org/r/20220714194040.231651-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:00:06 -07:00
Horatiu Vultur
675c807ae2 net: lan966x: Fix usage of lan966x->mac_lock when used by FDB
When the SW bridge was trying to add/remove entries to/from HW, the
access to HW was not protected by any lock. In this way, it was
possible to have race conditions.
Fix this by using the lan966x->mac_lock to protect parallel access to HW
for this cases.

Fixes: 25ee9561ec ("net: lan966x: More MAC table functionality")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:00:00 -07:00
Horatiu Vultur
c192468436 net: lan966x: Fix usage of lan966x->mac_lock inside lan966x_mac_irq_handler
The problem with this spin lock is that it was just protecting the list
of the MAC entries in SW and not also the access to the MAC entries in HW.
Because the access to HW is indirect, then it could happen to have race
conditions.
For example when SW introduced an entry in MAC table and the irq mac is
trying to read something from the MAC.
Update such that also the access to MAC entries in HW is protected by
this lock.

Fixes: 5ccd66e01c ("net: lan966x: add support for interrupts from analyzer")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:00:00 -07:00
Horatiu Vultur
99343cfa4f net: lan966x: Fix usage of lan966x->mac_lock when entry is removed
To remove an entry to the MAC table, it is required first to setup the
entry and then issue a command for the MAC to forget the entry.
So if it happens for two threads to remove simultaneously an entry
in MAC table then it would be a race condition.
Fix this by using lan966x->mac_lock to protect the HW access.

Fixes: e18aba8941 ("net: lan966x: add mactable support")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:00:00 -07:00
Horatiu Vultur
43243bb319 net: lan966x: Fix usage of lan966x->mac_lock when entry is added
To add an entry to the MAC table, it is required first to setup the
entry and then issue a command for the MAC to learn the entry.
So if it happens for two threads to add simultaneously an entry in MAC
table then it would be a race condition.
Fix this by using lan966x->mac_lock to protect the HW access.

Fixes: fc0c3fe748 ("net: lan966x: Add function lan966x_mac_ip_learn()")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 19:59:59 -07:00
Horatiu Vultur
45533a534a net: lan966x: Fix taking rtnl_lock while holding spin_lock
When the HW deletes an entry in MAC table then it generates an
interrupt. The SW will go through it's own list of MAC entries and if it
is not found then it would notify the listeners about this. The problem
is that when the SW will go through it's own list it would take a spin
lock(lan966x->mac_lock) and when it notifies that the entry is deleted.
But to notify the listeners it taking the rtnl_lock which is illegal.

This is fixed by instead of notifying right away that the entry is
deleted, move the entry on a temp list and once, it checks all the
entries then just notify that the entries from temp list are deleted.

Fixes: 5ccd66e01c ("net: lan966x: add support for interrupts from analyzer")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 19:59:59 -07:00
Linus Torvalds
ca85855bdc Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "Two bug fixes for irdma:

   - x722 does not support 1GB pages, trying to configure them will
     corrupt the dma mapping

   - Fix a sleep while holding a spinlock"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/irdma: Fix sleep from invalid context BUG
  RDMA/irdma: Do not advertise 1GB page size for x722
2022-07-18 17:16:22 -07:00
Vladimir Oltean
4546760619 pinctrl: armada-37xx: use raw spinlocks for regmap to avoid invalid wait context
The irqchip->irq_set_type method is called by __irq_set_trigger() under
the desc->lock raw spinlock.

The armada-37xx implementation, armada_37xx_irq_set_type(), uses an MMIO
regmap created by of_syscon_register(), which uses plain spinlocks
(the kind that are sleepable on RT).

Therefore, this is an invalid locking scheme for which we get a kernel
splat stating just that ("[ BUG: Invalid wait context ]"), because the
context in which the plain spinlock may sleep is atomic due to the raw
spinlock. We need to go raw spinlocks all the way.

Make this driver create its own MMIO regmap, with use_raw_spinlock=true,
and stop relying on syscon to provide it.

This patch depends on commit 67021f25d9 ("regmap: teach regmap to use
raw spinlocks if requested in the config").

Cc: <stable@vger.kernel.org> # 5.15+
Fixes: 2f22760539 ("pinctrl: armada-37xx: Add irqchip support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220716233745.1704677-3-vladimir.oltean@nxp.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-07-19 00:57:38 +02:00
Vladimir Oltean
984245b66c pinctrl: armada-37xx: make irq_lock a raw spinlock to avoid invalid wait context
The irqchip->irq_set_type method is called by __irq_set_trigger() under
the desc->lock raw spinlock.

The armada-37xx implementation, armada_37xx_irq_set_type(), takes a
plain spinlock, the kind that becomes sleepable on RT.

Therefore, this is an invalid locking scheme for which we get a kernel
splat stating just that ("[ BUG: Invalid wait context ]"), because the
context in which the plain spinlock may sleep is atomic due to the raw
spinlock. We need to go raw spinlocks all the way.

Replace the driver's irq_lock with a raw spinlock, to disable preemption
even on RT.

Cc: <stable@vger.kernel.org> # 5.15+
Fixes: 2f22760539 ("pinctrl: armada-37xx: Add irqchip support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220716233745.1704677-2-vladimir.oltean@nxp.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-07-19 00:56:01 +02:00
Junxiao Bi
c80af0c250 Revert "ocfs2: mount shared volume without ha stack"
This reverts commit 912f655d78.

This commit introduced a regression that can cause mount hung.  The
changes in __ocfs2_find_empty_slot causes that any node with none-zero
node number can grab the slot that was already taken by node 0, so node 1
will access the same journal with node 0, when it try to grab journal
cluster lock, it will hung because it was already acquired by node 0. 
It's very easy to reproduce this, in one cluster, mount node 0 first, then
node 1, you will see the following call trace from node 1.

[13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
[13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
[13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
[13148.749354] Call Trace:
[13148.750718]  <TASK>
[13148.752019]  ? usleep_range+0x90/0x89
[13148.753882]  __schedule+0x210/0x567
[13148.755684]  schedule+0x44/0xa8
[13148.757270]  schedule_timeout+0x106/0x13c
[13148.759273]  ? __prepare_to_swait+0x53/0x78
[13148.761218]  __wait_for_common+0xae/0x163
[13148.763144]  __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
[13148.765780]  ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.768312]  ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.770968]  ocfs2_journal_init+0x91/0x340 [ocfs2]
[13148.773202]  ocfs2_check_volume+0x39/0x461 [ocfs2]
[13148.775401]  ? iput+0x69/0xba
[13148.777047]  ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
[13148.779646]  ocfs2_fill_super+0x54b/0x853 [ocfs2]
[13148.781756]  mount_bdev+0x190/0x1b7
[13148.783443]  ? ocfs2_remount+0x440/0x440 [ocfs2]
[13148.785634]  legacy_get_tree+0x27/0x48
[13148.787466]  vfs_get_tree+0x25/0xd0
[13148.789270]  do_new_mount+0x18c/0x2d9
[13148.791046]  __x64_sys_mount+0x10e/0x142
[13148.792911]  do_syscall_64+0x3b/0x89
[13148.794667]  entry_SYSCALL_64_after_hwframe+0x170/0x0
[13148.797051] RIP: 0033:0x7f2309f6e26e
[13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
[13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
[13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
[13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
[13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
[13148.816564]  </TASK>

To fix it, we can just fix __ocfs2_find_empty_slot.  But original commit
introduced the feature to mount ocfs2 locally even it is cluster based,
that is a very dangerous, it can easily cause serious data corruption,
there is no way to stop other nodes mounting the fs and corrupting it. 
Setup ha or other cluster-aware stack is just the cost that we have to
take for avoiding corruption, otherwise we have to do it in kernel.

Link: https://lkml.kernel.org/r/20220603222801.42488-1-junxiao.bi@oracle.com
Fixes: 912f655d78c5("ocfs2: mount shared volume without ha stack")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <heming.zhao@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:09:15 -07:00
Miaohe Lin
da9a298f5f hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte
When alloc_huge_page fails, *pagep is set to NULL without put_page first.
So the hugepage indicated by *pagep is leaked.

Link: https://lkml.kernel.org/r/20220709092629.54291-1-linmiaohe@huawei.com
Fixes: 8cc5fcbb5b ("mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:07:52 -07:00
Andrei Vagin
bdeb77bc2c fs: sendfile handles O_NONBLOCK of out_fd
sendfile has to return EAGAIN if out_fd is nonblocking and the write into
it would block.

Here is a small reproducer for the problem:

#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/sendfile.h>


#define FILE_SIZE (1UL << 30)
int main(int argc, char **argv) {
        int p[2], fd;

        if (pipe2(p, O_NONBLOCK))
                return 1;

        fd = open(argv[1], O_RDWR | O_TMPFILE, 0666);
        if (fd < 0)
                return 1;
        ftruncate(fd, FILE_SIZE);

        if (sendfile(p[1], fd, 0, FILE_SIZE) == -1) {
                fprintf(stderr, "FAIL\n");
        }
        if (sendfile(p[1], fd, 0, FILE_SIZE) != -1 || errno != EAGAIN) {
                fprintf(stderr, "FAIL\n");
        }
        return 0;
}

It worked before b964bf53e5, it is stuck after b964bf53e5, and it
works again with this fix.

This regression occurred because do_splice_direct() calls pipe_write
that handles O_NONBLOCK.  Here is a trace log from the reproducer:

 1)               |  __x64_sys_sendfile64() {
 1)               |    do_sendfile() {
 1)               |      __fdget()
 1)               |      rw_verify_area()
 1)               |      __fdget()
 1)               |      rw_verify_area()
 1)               |      do_splice_direct() {
 1)               |        rw_verify_area()
 1)               |        splice_direct_to_actor() {
 1)               |          do_splice_to() {
 1)               |            rw_verify_area()
 1)               |            generic_file_splice_read()
 1) + 74.153 us   |          }
 1)               |          direct_splice_actor() {
 1)               |            iter_file_splice_write() {
 1)               |              __kmalloc()
 1)   0.148 us    |              pipe_lock();
 1)   0.153 us    |              splice_from_pipe_next.part.0();
 1)   0.162 us    |              page_cache_pipe_buf_confirm();
... 16 times
 1)   0.159 us    |              page_cache_pipe_buf_confirm();
 1)               |              vfs_iter_write() {
 1)               |                do_iter_write() {
 1)               |                  rw_verify_area()
 1)               |                  do_iter_readv_writev() {
 1)               |                    pipe_write() {
 1)               |                      mutex_lock()
 1)   0.153 us    |                      mutex_unlock();
 1)   1.368 us    |                    }
 1)   1.686 us    |                  }
 1)   5.798 us    |                }
 1)   6.084 us    |              }
 1)   0.174 us    |              kfree();
 1)   0.152 us    |              pipe_unlock();
 1) + 14.461 us   |            }
 1) + 14.783 us   |          }
 1)   0.164 us    |          page_cache_pipe_buf_release();
... 16 times
 1)   0.161 us    |          page_cache_pipe_buf_release();
 1)               |          touch_atime()
 1) + 95.854 us   |        }
 1) + 99.784 us   |      }
 1) ! 107.393 us  |    }
 1) ! 107.699 us  |  }

Link: https://lkml.kernel.org/r/20220415005015.525191-1-avagin@gmail.com
Fixes: b964bf53e5 ("teach sendfile(2) to handle send-to-pipe directly")
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:07:52 -07:00
ChenXiaoSong
38c9c22a85 ntfs: fix use-after-free in ntfs_ucsncmp()
Syzkaller reported use-after-free bug as follows:

==================================================================
BUG: KASAN: use-after-free in ntfs_ucsncmp+0x123/0x130
Read of size 2 at addr ffff8880751acee8 by task a.out/879

CPU: 7 PID: 879 Comm: a.out Not tainted 5.19.0-rc4-next-20220630-00001-gcc5218c8bd2c-dirty #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x1c0/0x2b0
 print_address_description.constprop.0.cold+0xd4/0x484
 print_report.cold+0x55/0x232
 kasan_report+0xbf/0xf0
 ntfs_ucsncmp+0x123/0x130
 ntfs_are_names_equal.cold+0x2b/0x41
 ntfs_attr_find+0x43b/0xb90
 ntfs_attr_lookup+0x16d/0x1e0
 ntfs_read_locked_attr_inode+0x4aa/0x2360
 ntfs_attr_iget+0x1af/0x220
 ntfs_read_locked_inode+0x246c/0x5120
 ntfs_iget+0x132/0x180
 load_system_files+0x1cc6/0x3480
 ntfs_fill_super+0xa66/0x1cf0
 mount_bdev+0x38d/0x460
 legacy_get_tree+0x10d/0x220
 vfs_get_tree+0x93/0x300
 do_new_mount+0x2da/0x6d0
 path_mount+0x496/0x19d0
 __x64_sys_mount+0x284/0x300
 do_syscall_64+0x3b/0xc0
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f3f2118d9ea
Code: 48 8b 0d a9 f4 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 76 f4 0b 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc269deac8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3f2118d9ea
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffc269dec00
RBP: 00007ffc269dec80 R08: 00007ffc269deb00 R09: 00007ffc269dec44
R10: 0000000000000000 R11: 0000000000000202 R12: 000055f81ab1d220
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>

The buggy address belongs to the physical page:
page:0000000085430378 refcount:1 mapcount:1 mapping:0000000000000000 index:0x555c6a81d pfn:0x751ac
memcg:ffff888101f7e180
anon flags: 0xfffffc00a0014(uptodate|lru|mappedtodisk|swapbacked|node=0|zone=1|lastcpupid=0x1fffff)
raw: 000fffffc00a0014 ffffea0001bf2988 ffffea0001de2448 ffff88801712e201
raw: 0000000555c6a81d 0000000000000000 0000000100000000 ffff888101f7e180
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8880751acd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff8880751ace00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff8880751ace80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
                                                          ^
 ffff8880751acf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff8880751acf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================

The reason is that struct ATTR_RECORD->name_offset is 6485, end address of
name string is out of bounds.

Fix this by adding sanity check on end address of attribute name string.

[akpm@linux-foundation.org: coding-style cleanups]
[chenxiaosong2@huawei.com: cleanup suggested by Hawkins Jiawei]
  Link: https://lkml.kernel.org/r/20220709064511.3304299-1-chenxiaosong2@huawei.com
Link: https://lkml.kernel.org/r/20220707105329.4020708-1-chenxiaosong2@huawei.com
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: ChenXiaoSong <chenxiaosong2@huawei.com>
Cc: Yongqiang Liu <liuyongqiang13@huawei.com>
Cc: Zhang Yi <yi.zhang@huawei.com>
Cc: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:07:52 -07:00
Mike Rapoport
84ac013046 secretmem: fix unhandled fault in truncate
syzkaller reports the following issue:

BUG: unable to handle page fault for address: ffff888021f7e005
PGD 11401067 P4D 11401067 PUD 11402067 PMD 21f7d063 PTE 800fffffde081060
Oops: 0002 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 3761 Comm: syz-executor281 Not tainted 5.19.0-rc4-syzkaller-00014-g941e3e791269 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:64
Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01
RSP: 0018:ffffc9000329fa90 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000ffb
RDX: 0000000000000ffb RSI: 0000000000000000 RDI: ffff888021f7e005
RBP: ffffea000087df80 R08: 0000000000000001 R09: ffff888021f7e005
R10: ffffed10043efdff R11: 0000000000000000 R12: 0000000000000005
R13: 0000000000000000 R14: 0000000000001000 R15: 0000000000000ffb
FS:  00007fb29d8b2700(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff888021f7e005 CR3: 0000000026e7b000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 zero_user_segments include/linux/highmem.h:272 [inline]
 folio_zero_range include/linux/highmem.h:428 [inline]
 truncate_inode_partial_folio+0x76a/0xdf0 mm/truncate.c:237
 truncate_inode_pages_range+0x83b/0x1530 mm/truncate.c:381
 truncate_inode_pages mm/truncate.c:452 [inline]
 truncate_pagecache+0x63/0x90 mm/truncate.c:753
 simple_setattr+0xed/0x110 fs/libfs.c:535
 secretmem_setattr+0xae/0xf0 mm/secretmem.c:170
 notify_change+0xb8c/0x12b0 fs/attr.c:424
 do_truncate+0x13c/0x200 fs/open.c:65
 do_sys_ftruncate+0x536/0x730 fs/open.c:193
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7fb29d900899
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 11 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb29d8b2318 EFLAGS: 00000246 ORIG_RAX: 000000000000004d
RAX: ffffffffffffffda RBX: 00007fb29d988408 RCX: 00007fb29d900899
RDX: 00007fb29d900899 RSI: 0000000000000005 RDI: 0000000000000003
RBP: 00007fb29d988400 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb29d98840c
R13: 00007ffca01a23bf R14: 00007fb29d8b2400 R15: 0000000000022000
 </TASK>
Modules linked in:
CR2: ffff888021f7e005
---[ end trace 0000000000000000 ]---

Eric Biggers suggested that this happens when
secretmem_setattr()->simple_setattr() races with secretmem_fault() so that
a page that is faulted in by secretmem_fault() (and thus removed from the
direct map) is zeroed by inode truncation right afterwards.

Use mapping->invalidate_lock to make secretmem_fault() and
secretmem_setattr() mutually exclusive.

[rppt@linux.ibm.com: v3]
  Link: https://lkml.kernel.org/r/20220714091337.412297-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20220707165650.248088-1-rppt@kernel.org
Reported-by: syzbot+9bd2b7adbd34b30b87e4@syzkaller.appspotmail.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:07:51 -07:00
Naoya Horiguchi
c2cb0dcce9 mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
Originally copy_hugetlb_page_range() handles migration entries and
hwpoisoned entries in similar manner.  But recently the related code path
has more code for migration entries, and when
is_writable_migration_entry() was converted to
!is_readable_migration_entry(), hwpoison entries on source processes got
to be unexpectedly updated (which is legitimate for migration entries, but
not for hwpoison entries).  This results in unexpected serious issues like
kernel panic when forking processes with hwpoison entries in pmd.

Separate the if branch into one for hwpoison entries and one for migration
entries.

Link: https://lkml.kernel.org/r/20220704013312.2415700-3-naoya.horiguchi@linux.dev
Fixes: 6c287605fd ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>	[5.18]
Cc: David Hildenbrand <david@redhat.com>
Cc: Liu Shixin <liushixin2@huawei.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:07:51 -07:00
Muchun Song
f4f451a16d mm: fix missing wake-up event for FSDAX pages
FSDAX page refcounts are 1-based, rather than 0-based: if refcount is
1, then the page is freed.  The FSDAX pages can be pinned through GUP,
then they will be unpinned via unpin_user_page() using a folio variant
to put the page, however, folio variants did not consider this special
case, the result will be to miss a wakeup event (like the user of
__fuse_dax_break_layouts()).  This results in a task being permanently
stuck in TASK_INTERRUPTIBLE state.

Since FSDAX pages are only possibly obtained by GUP users, so fix GUP
instead of folio_put() to lower overhead.

Link: https://lkml.kernel.org/r/20220705123532.283-1-songmuchun@bytedance.com
Fixes: d8ddc099c6 ("mm/gup: Add gup_put_folio()")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:07:51 -07:00
Josef Bacik
3fe2895cfe mm: fix page leak with multiple threads mapping the same page
We have an application with a lot of threads that use a shared mmap backed
by tmpfs mounted with -o huge=within_size.  This application started
leaking loads of huge pages when we upgraded to a recent kernel.

Using the page ref tracepoints and a BPF program written by Tejun Heo we
were able to determine that these pages would have multiple refcounts from
the page fault path, but when it came to unmap time we wouldn't drop the
number of refs we had added from the faults.

I wrote a reproducer that mmap'ed a file backed by tmpfs with -o
huge=always, and then spawned 20 threads all looping faulting random
offsets in this map, while using madvise(MADV_DONTNEED) randomly for huge
page aligned ranges.  This very quickly reproduced the problem.

The problem here is that we check for the case that we have multiple
threads faulting in a range that was previously unmapped.  One thread maps
the PMD, the other thread loses the race and then returns 0.  However at
this point we already have the page, and we are no longer putting this
page into the processes address space, and so we leak the page.  We
actually did the correct thing prior to f9ce0be71d, however it looks
like Kirill copied what we do in the anonymous page case.  In the
anonymous page case we don't yet have a page, so we don't have to drop a
reference on anything.  Previously we did the correct thing for file based
faults by returning VM_FAULT_NOPAGE so we correctly drop the reference on
the page we faulted in.

Fix this by returning VM_FAULT_NOPAGE in the pmd_devmap_trans_unstable()
case, this makes us drop the ref on the page properly, and now my
reproducer no longer leaks the huge pages.

[josef@toxicpanda.com: v2]
  Link: https://lkml.kernel.org/r/e90c8f0dbae836632b669c2afc434006a00d4a67.1657721478.git.josef@toxicpanda.com
Link: https://lkml.kernel.org/r/2b798acfd95c9ab9395fe85e8d5a835e2e10a920.1657051137.git.josef@toxicpanda.com
Fixes: f9ce0be71d ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:07:51 -07:00
Seth Forshee
f073c83359 mailmap: update Seth Forshee's email address
seth.forshee@canonical.com is no longer valid, use sforshee@kernel.org
instead.

Link: https://lkml.kernel.org/r/20220628200734.424495-1-sforshee@kernel.org
Signed-off-by: Seth Forshee <sforshee@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:07:51 -07:00
ZhaoLong Wang
0c98c8e1e1 tmpfs: fix the issue that the mount and remount results are inconsistent.
An undefined-behavior issue has not been completely fixed since commit
d14f5efadd ("tmpfs: fix undefined-behaviour in shmem_reconfigure()"). 
In the commit, check in the shmem_reconfigure() is added in remount
process to avoid the Ubsan problem.  However, the check is not added to
the mount process.  It causes inconsistent results between mount and
remount.  The operations to reproduce the problem in user mode as follows:

If nr_blocks is set to 0x8000000000000000, the mounting is successful.

  # mount tmpfs /dev/shm/ -t tmpfs -o nr_blocks=0x8000000000000000

However, when -o remount is used, the mount fails because of the
check in the shmem_reconfigure()

  # mount tmpfs /dev/shm/ -t tmpfs -o remount,nr_blocks=0x8000000000000000
  mount: /dev/shm: mount point not mounted or bad option.

Therefore, add checks in the shmem_parse_one() function and remove the
check in shmem_reconfigure() to avoid this problem.

Link: https://lkml.kernel.org/r/20220629124324.1640807-1-wangzhaolong1@huawei.com
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Cc: Luo Meng <luomeng12@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:07:51 -07:00
Yee Lee
07313a2b29 mm: kfence: apply kmemleak_ignore_phys on early allocated pool
This patch solves two issues.

(1) The pool allocated by memblock needs to unregister from
kmemleak scanning. Apply kmemleak_ignore_phys to replace the
original kmemleak_free as its address now is stored in the phys tree.

(2) The pool late allocated by page-alloc doesn't need to unregister.
Move out the freeing operation from its call path.

Link: https://lkml.kernel.org/r/20220628113714.7792-2-yee.lee@mediatek.com
Fixes: 0c24e06119 ("mm: kmemleak: add rbtree and store physical address for objects allocated with PA")
Signed-off-by: Yee Lee <yee.lee@mediatek.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Suggested-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-18 15:07:51 -07:00
Linus Torvalds
80e19f34c2 Merge tag 'hte/for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux
Pull hardware timestamp fix from Thierry Reding:
 "A single fix for an out-of-sync kerneldoc comment"

* tag 'hte/for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  gpiolib: cdev: Fix kernel doc for struct line
2022-07-18 11:47:04 -07:00
Mario Limonciello
09073396ea ACPI: CPPC: Don't require flexible address space if X86_FEATURE_CPPC is supported
Commit 0651ab90e4 ("ACPI: CPPC: Check _OSC for flexible address space")
changed _CPC probing to require flexible address space to be negotiated
for CPPC to work.

However it was observed that this caused a regression for Arek's ROG
Zephyrus G15 GA503QM which previously CPPC worked, but now it stopped
working.

To avoid causing a regression waive this failure when the CPU is known
to support CPPC.

Cc: Pierre Gondois <pierre.gondois@arm.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216248
Fixes: 0651ab90e4 ("ACPI: CPPC: Check _OSC for flexible address space")
Reported-and-tested-by: Arek Ruśniak <arek.rusi@gmail.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-07-18 20:40:05 +02:00
Przemyslaw Patynowski
d8fa2fd791 iavf: Fix missing state logs
Fix debug prints, by adding missing state prints.

Extend iavf_state_str by strings for __IAVF_INIT_EXTENDED_CAPS and
__IAVF_INIT_CONFIG_ADAPTER.

Without this patch, when enabling debug prints for iavf.h, user will
see:
iavf 0000:06:0e.0: state transition from:__IAVF_INIT_GET_RESOURCES to:__IAVF_UNKNOWN_STATE
iavf 0000:06:0e.0: state transition from:__IAVF_UNKNOWN_STATE to:__IAVF_UNKNOWN_STATE

Fixes: 605ca7c5c6 ("iavf: Fix kernel BUG in free_msi_irqs")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Jun Zhang <xuejun.zhang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-18 09:36:40 -07:00
Przemyslaw Patynowski
a9f49e0060 iavf: Fix handling of dummy receive descriptors
Fix memory leak caused by not handling dummy receive descriptor properly.
iavf_get_rx_buffer now sets the rx_buffer return value for dummy receive
descriptors. Without this patch, when the hardware writes a dummy
descriptor, iavf would not free the page allocated for the previous receive
buffer. This is an unlikely event but can still happen.

[Jesse: massaged commit message]

Fixes: efa14c3985 ("iavf: allow null RX descriptors")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-18 09:36:40 -07:00
Przemyslaw Patynowski
4635fd3a9d iavf: Disallow changing rx/tx-frames and rx/tx-frames-irq
Remove from supported_coalesce_params ETHTOOL_COALESCE_MAX_FRAMES
and ETHTOOL_COALESCE_MAX_FRAMES_IRQ. As tx-frames-irq allowed
user to change budget for iavf_clean_tx_irq, remove work_limit
and use define for budget.

Without this patch there would be possibility to change rx/tx-frames
and rx/tx-frames-irq, which for rx/tx-frames did nothing, while for
rx/tx-frames-irq it changed rx/tx-frames and only changed budget
for cleaning NAPI poll.

Fixes: fbb7ddfef2 ("i40evf: core ethtool functionality")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Jun Zhang <xuejun.zhang@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-18 09:36:40 -07:00
Przemyslaw Patynowski
968996c070 iavf: Fix VLAN_V2 addition/rejection
Fix VLAN addition, so that PF driver does not reject whole VLAN batch.
Add VLAN reject handling, so rejected VLANs, won't litter VLAN filter
list. Fix handling of active_(c/s)vlans, so it will be possible to
re-add VLAN filters for user.
Without this patch, after changing trust to off, with VLAN filters
saturated, no VLAN is added, due to PF rejecting addition.

Fixes: 92fc508598 ("iavf: Restrict maximum VLAN filters for VIRTCHNL_VF_OFFLOAD_VLAN_V2")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-18 09:36:40 -07:00
xinhui pan
e1aadbab44 drm/amdgpu: Remove one duplicated ef removal
That has been done in BO release notify.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2074
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-18 11:45:23 -04:00
Peter Zijlstra
28a99e95f5 x86/amd: Use IBPB for firmware calls
On AMD IBRS does not prevent Retbleed; as such use IBPB before a
firmware call to flush the branch history state.

And because in order to do an EFI call, the kernel maps a whole lot of
the kernel page table into the EFI page table, do an IBPB just in case
in order to prevent the scenario of poisoning the BTB and causing an EFI
call using the unprotected RET there.

  [ bp: Massage. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220715194550.793957-1-cascardo@canonical.com
2022-07-18 15:38:09 +02:00
David S. Miller
c32349f325 Merge branch 'dsa-docs'
Vladimir Oltean says:

====================
Update DSA documentation

These are some updates of dsa.rst, since it hasn't kept up with
development (in some cases, even since 2017). I've added Fixes: tags as
I thought was appropriate.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
7b02f40350 docs: net: dsa: mention that VLANs are now refcounted on shared ports
The blamed commit updated the way in which VLANs are handled at the
cross-chip notifier layer and didn't update the documentation to say
that. Fix it.

Fixes: 134ef2388e ("net: dsa: add explicit support for host bridge VLANs")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
6ba1a4aa59 docs: net: dsa: delete misinformation about -EOPNOTSUPP for FDB/MDB/VLAN
Returning -EOPNOTSUPP does *NOT* mean anything special.

port_vlan_add() is actually called from 2 code paths, one is
vlan_vid_add() from 8021q module and the other is
br_switchdev_port_vlan_add() from switchdev.

The bridge has a wrapper __vlan_vid_add() which first tries via
switchdev, then if that returns -EOPNOTSUPP, tries again via the VLAN RX
filters in the 8021q module. But DSA doesn't distinguish between one
call path and the other when calling the driver's port_vlan_add(), so if
the driver returns -EOPNOTSUPP to switchdev, it also returns -EOPNOTSUPP
to the 8021q module. And the latter is a hard error.

port_fdb_add() is called from the deferred dsa_owq only, so obviously
its return code isn't propagated anywhere, and cannot be interpreted in
any way.

The return code from port_mdb_add() is propagated to the bridge, but
again, this doesn't do anything special when -EOPNOTSUPP is returned,
but rather, br_switchdev_mdb_notify() returns void.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
ea7006a7aa docs: net: dsa: re-explain what port_fdb_dump actually does
Switchdev has changed radically from its initial implementation, and the
currently provided definition is incorrect and very confusing.

Rewrite it in light of what it actually does.

Fixes: 2bedde1abb ("net: dsa: Move FDB dump implementation inside DSA")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
4e9d9bb6df docs: net: dsa: add a section for address databases
The given definition for what VID 0 represents in the current
port_fdb_add and port_mdb_add is blatantly wrong. Delete it and explain
the concepts surrounding DSA's understanding of FDB isolation.

Fixes: c26933639b ("net: dsa: request drivers to perform FDB isolation")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
7f75d3dd4f docs: net: dsa: delete port_mdb_dump
This was deleted in 2017, stop documenting it.

Fixes: dc0cbff3ff ("net: dsa: Remove redundant MDB dump support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
e465d507c7 docs: net: dsa: remove port_vlan_dump
This was deleted in 2017, delete the obsolete documentation.

Fixes: c069fcd82c ("net: dsa: Remove support for bypass bridge port attributes/vlan set")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
3083623948 docs: net: dsa: remove port_bridge_tx_fwd_offload
We've changed the API through which we can offload the bridge TX
forwarding process. Update the documentation in light of the removal of
2 DSA switch ops.

Fixes: b079922ba2 ("net: dsa: add a "tx_fwd_offload" argument to ->port_bridge_join")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
0cb8682ebf docs: net: dsa: document port_fast_age
The provided information about FDB flushing is not really up to date.
The DSA core automatically calls port_fast_age() when necessary, and
drivers should just implement that rather than hooking it to
port_bridge_leave, port_stp_state_set and others.

Fixes: 732f794c1b ("net: dsa: add port fast ageing")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
3c87237ecd docs: net: dsa: document port_setup and port_teardown
These methods were added without being documented, fix that.

Fixes: fd292c189a ("net: dsa: tear down devlink port regions when tearing down the devlink port on error")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
b763f50dc1 docs: net: dsa: document the teardown method
A teardown method was added to dsa_switch_ops without being documented.
Do so now.

Fixes: 5e3f847a02 ("net: dsa: Add teardown callback for drivers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
d6a0336add docs: net: dsa: document change_tag_protocol
Support for changing the tagging protocol was added without this
operation being documented; do so now.

Fixes: 53da0ebaad ("net: dsa: allow changing the tag protocol via the "tagging" device attribute")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
c56313a42a docs: net: dsa: add more info about the other arguments to get_tag_protocol
Changes were made to the prototype of get_tag_protocol without
describing at a high level what they are about. Update the documentation
to explain that.

Fixes: 5ed4e3eb02 ("net: dsa: Pass a port to get_tag_protocol()")
Fixes: 4d776482ec ("net: dsa: Get information about stacked DSA protocol")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:37 +01:00
Vladimir Oltean
c3f0e84d10 docs: net: dsa: rename tag_protocol to get_tag_protocol
Since the blamed commit, the enum was turned into a function pointer and
also renamed. Update the documentation.

Fixes: 7b314362a2 ("net: dsa: Allow the DSA driver to indicate the tag protocol")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:36 +01:00
Vladimir Oltean
54367831c5 docs: net: dsa: document the shutdown behavior
Document the changes that took place in the DSA core in the blamed
commit.

Fixes: 0650bf52b3 ("net: dsa: be compatible with masters which unregister on shutdown")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:36 +01:00
Vladimir Oltean
19b3b13c93 docs: net: dsa: update probing documentation
Since the blamed commit we don't have register_switch_driver() and
unregister_switch_driver() anymore. Additionally, the expected
dsa_register_switch() and dsa_unregister_switch() calls aren't
documented.

Update the probing section with the details of how things are currently
done.

Fixes: 93e86b3bc8 ("net: dsa: Remove legacy probing support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:44:36 +01:00
David S. Miller
c9f21106d9 Merge branch 'net-ipv4-sysctl-races-part-3'
Kuniyuki Iwashima says:

====================
sysctl: Fix data-races around ipv4_net_table (Round 3).

This series fixes data-races around 21 knobs after
igmp_link_local_mcast_reports in ipv4_net_table.

These 4 knobs are skipped because they are safe.

  - tcp_congestion_control: Safe with RCU and xchg().
  - tcp_available_congestion_control: Read only.
  - tcp_allowed_congestion_control: Safe with RCU and spinlock().
  - tcp_fastopen_key: Safe with RCU and xchg()

So, round 4 will start with fib_multipath_use_neigh.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
021266ec64 tcp: Fix data-races around sysctl_tcp_fastopen_blackhole_timeout.
While reading sysctl_tcp_fastopen_blackhole_timeout, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: cf1ef3f071 ("net/tcp_fastopen: Disable active side TFO in certain scenarios")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
5a54213318 tcp: Fix data-races around sysctl_tcp_fastopen.
While reading sysctl_tcp_fastopen, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 2100c8d2d9 ("net-tcp: Fast Open base")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
79539f3474 tcp: Fix data-races around sysctl_max_syn_backlog.
While reading sysctl_max_syn_backlog, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
cbfc649558 tcp: Fix a data-race around sysctl_tcp_tw_reuse.
While reading sysctl_tcp_tw_reuse, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
55be873695 tcp: Fix a data-race around sysctl_tcp_notsent_lowat.
While reading sysctl_tcp_notsent_lowat, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: c9bee3b7fd ("tcp: TCP_NOTSENT_LOWAT socket option")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
39e24435a7 tcp: Fix data-races around some timeout sysctl knobs.
While reading these sysctl knobs, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.

  - tcp_retries1
  - tcp_retries2
  - tcp_orphan_retries
  - tcp_fin_timeout

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
46778cd16e tcp: Fix data-races around sysctl_tcp_reordering.
While reading sysctl_tcp_reordering, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
4177f54589 tcp: Fix data-races around sysctl_tcp_migrate_req.
While reading sysctl_tcp_migrate_req, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: f9ac779f88 ("net: Introduce net.ipv4.tcp_migrate_req.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
f2e383b5bb tcp: Fix data-races around sysctl_tcp_syncookies.
While reading sysctl_tcp_syncookies, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
20a3b1c0f6 tcp: Fix data-races around sysctl_tcp_syn(ack)?_retries.
While reading sysctl_tcp_syn(ack)?_retries, they can be changed
concurrently.  Thus, we need to add READ_ONCE() to their readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
f2f316e287 tcp: Fix data-races around keepalive sysctl knobs.
While reading sysctl_tcp_keepalive_(time|probes|intvl), they can be changed
concurrently.  Thus, we need to add READ_ONCE() to their readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
8ebcc62c73 igmp: Fix data-races around sysctl_igmp_qrv.
While reading sysctl_igmp_qrv, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

This test can be packed into a helper, so such changes will be in the
follow-up series after net is merged into net-next.

  qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv);

Fixes: a9fe8e2994 ("ipv4: implement igmp_qrv sysctl to tune igmp robustness variable")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:54 +01:00
Kuniyuki Iwashima
6ae0f2e553 igmp: Fix data-races around sysctl_igmp_max_msf.
While reading sysctl_igmp_max_msf, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:53 +01:00
Kuniyuki Iwashima
6305d821e3 igmp: Fix a data-race around sysctl_igmp_max_memberships.
While reading sysctl_igmp_max_memberships, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:53 +01:00
Kuniyuki Iwashima
f6da2267e7 igmp: Fix data-races around sysctl_igmp_llm_reports.
While reading sysctl_igmp_llm_reports, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

This test can be packed into a helper, so such changes will be in the
follow-up series after net is merged into net-next.

  if (ipv4_is_local_multicast(pmc->multiaddr) &&
      !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))

Fixes: df2cf4a78e ("IGMP: Inhibit reports for local multicast groups")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 12:21:53 +01:00
Mario Limonciello
41ef3c1a6b pinctrl: Don't allow PINCTRL_AMD to be a module
It was observed that by allowing pinctrl_amd to be loaded
later in the boot process that interrupts sent to the GPIO
controller early in the boot are not serviced.  The kernel treats
these as a spurious IRQ and disables the IRQ.

This problem was exacerbated because it happened on a system with
an encrypted partition so the kernel object was not accesssible for
an extended period of time while waiting for a passphrase.

To avoid this situation from occurring, stop allowing pinctrl-amd
from being built as a module and instead require it to be built-in
or disabled.

Reported-by: madcatx@atlas.cz
Suggested-by: jwrdegoede@fedoraproject.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216230
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220713175950.964-1-mario.limonciello@amd.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-07-18 12:47:28 +02:00
Maksym Glubokiy
1e20904e41 net: prestera: acl: use proper mask for port selector
Adjusted as per packet processor documentation.
This allows to properly match 'indev' for clsact rules.

Fixes: 47327e198d ("net: prestera: acl: migrate to new vTCAM api")

Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 11:44:32 +01:00
Tariq Toukan
f08d8c1bb9 net/tls: Fix race in TLS device down flow
Socket destruction flow and tls_device_down function sync against each
other using tls_device_lock and the context refcount, to guarantee the
device resources are freed via tls_dev_del() by the end of
tls_device_down.

In the following unfortunate flow, this won't happen:
- refcount is decreased to zero in tls_device_sk_destruct.
- tls_device_down starts, skips the context as refcount is zero, going
  all the way until it flushes the gc work, and returns without freeing
  the device resources.
- only then, tls_device_queue_ctx_destruction is called, queues the gc
  work and frees the context's device resources.

Solve it by decreasing the refcount in the socket's destruction flow
under the tls_device_lock, for perfect synchronization.  This does not
slow down the common likely destructor flow, in which both the refcount
is decreased and the spinlock is acquired, anyway.

Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 11:33:07 +01:00
Junxiao Chang
613b065ca3 net: stmmac: fix dma queue left shift overflow issue
When queue number is > 4, left shift overflows due to 32 bits
integer variable. Mask calculation is wrong for MTL_RXQ_DMA_MAP1.

If CONFIG_UBSAN is enabled, kernel dumps below warning:
[   10.363842] ==================================================================
[   10.363882] UBSAN: shift-out-of-bounds in /build/linux-intel-iotg-5.15-8e6Tf4/
linux-intel-iotg-5.15-5.15.0/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c:224:12
[   10.363929] shift exponent 40 is too large for 32-bit type 'unsigned int'
[   10.363953] CPU: 1 PID: 599 Comm: NetworkManager Not tainted 5.15.0-1003-intel-iotg
[   10.363956] Hardware name: ADLINK Technology Inc. LEC-EL/LEC-EL, BIOS 0.15.11 12/22/2021
[   10.363958] Call Trace:
[   10.363960]  <TASK>
[   10.363963]  dump_stack_lvl+0x4a/0x5f
[   10.363971]  dump_stack+0x10/0x12
[   10.363974]  ubsan_epilogue+0x9/0x45
[   10.363976]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0x10e
[   10.363979]  ? wake_up_klogd+0x4a/0x50
[   10.363983]  ? vprintk_emit+0x8f/0x240
[   10.363986]  dwmac4_map_mtl_dma.cold+0x42/0x91 [stmmac]
[   10.364001]  stmmac_mtl_configuration+0x1ce/0x7a0 [stmmac]
[   10.364009]  ? dwmac410_dma_init_channel+0x70/0x70 [stmmac]
[   10.364020]  stmmac_hw_setup.cold+0xf/0xb14 [stmmac]
[   10.364030]  ? page_pool_alloc_pages+0x4d/0x70
[   10.364034]  ? stmmac_clear_tx_descriptors+0x6e/0xe0 [stmmac]
[   10.364042]  stmmac_open+0x39e/0x920 [stmmac]
[   10.364050]  __dev_open+0xf0/0x1a0
[   10.364054]  __dev_change_flags+0x188/0x1f0
[   10.364057]  dev_change_flags+0x26/0x60
[   10.364059]  do_setlink+0x908/0xc40
[   10.364062]  ? do_setlink+0xb10/0xc40
[   10.364064]  ? __nla_validate_parse+0x4c/0x1a0
[   10.364068]  __rtnl_newlink+0x597/0xa10
[   10.364072]  ? __nla_reserve+0x41/0x50
[   10.364074]  ? __kmalloc_node_track_caller+0x1d0/0x4d0
[   10.364079]  ? pskb_expand_head+0x75/0x310
[   10.364082]  ? nla_reserve_64bit+0x21/0x40
[   10.364086]  ? skb_free_head+0x65/0x80
[   10.364089]  ? security_sock_rcv_skb+0x2c/0x50
[   10.364094]  ? __cond_resched+0x19/0x30
[   10.364097]  ? kmem_cache_alloc_trace+0x15a/0x420
[   10.364100]  rtnl_newlink+0x49/0x70

This change fixes MTL_RXQ_DMA_MAP1 mask issue and channel/queue
mapping warning.

Fixes: d43042f4da ("net: stmmac: mapping mtl rx to dma channel")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216195
Reported-by: Cedric Wassenaar <cedric@bytespeed.nl>
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 11:30:26 +01:00
Wong Vee Khee
76c16d3e19 net: stmmac: switch to use interrupt for hw crosstimestamping
Using current implementation of polling mode, there is high chances we
will hit into timeout error when running phc2sys. Hence, update the
implementation of hardware crosstimestamping to use the MAC interrupt
service routine instead of polling for TSIS bit in the MAC Timestamp
Interrupt Status register to be set.

Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 11:14:35 +01:00
Horatiu Vultur
ba9c4745fc pinctrl: ocelot: Fix pincfg
The blamed commit changed to use regmaps instead of __iomem. But it
didn't update the register offsets to be at word offset, so it uses byte
offset.
Another issue with the same commit is that it has a limit of 32 registers
which is incorrect. The sparx5 has 64 while lan966x has 77.

Fixes: 076d9e71bc ("pinctrl: ocelot: convert pinctrl to regmap")
Acked-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20220713193750.4079621-3-horatiu.vultur@microchip.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-07-18 11:22:43 +02:00
Horatiu Vultur
dc62db7138 pinctrl: ocelot: Fix pincfg for lan966x
The blamed commit introduce support for lan966x which use the same
pinconf_ops as sparx5. The problem is that pinconf_ops is specific to
sparx5. More precisely the offset of the bits in the pincfg register are
different and also lan966x doesn't have support for
PIN_CONFIG_INPUT_SCHMITT_ENABLE.

Fix this by making pinconf_ops more generic such that it can be also
used by lan966x. This is done by introducing 'ocelot_pincfg_data' which
contains the offset and what is supported for each SOC.

Fixes: 531d6ab365 ("pinctrl: ocelot: Extend support for lan966x")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220713193750.4079621-2-horatiu.vultur@microchip.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-07-18 11:22:43 +02:00
Christian König
dbd0da2453 drm/ttm: fix locking in vmap/vunmap TTM GEM helpers
I've stumbled over this while reviewing patches for DMA-buf and it looks
like we completely messed the locking up here.

In general most TTM function should only be called while holding the
appropriate BO resv lock. Without this we could break the internal
buffer object state here.

Only compile tested!

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 43676605f8 ("drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers")
Cc: stable@vger.kernel.org
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220715111533.467012-1-christian.koenig@amd.com
2022-07-18 09:18:53 +02:00
Michael Walle
ef0324b641 ARM: dts: lan966x: fix sys_clk frequency
The sys_clk frequency is 165.625MHz. The register reference of the
Generic Clock controller lists the CPU clock as 600MHz, the DDR clock as
300MHz and the SYS clock as 162.5MHz. This is wrong. It was first
noticed during the fan driver development and it was measured and
verified via the CLK_MON output of the SoC which can be configured to
output sys_clk/64.

The core PLL settings (which drives the SYS clock) seems to be as
follows:
  DIVF = 52
  DIVQ = 3
  DIVR = 1

With a refernce clock of 25MHz, this means we have a post divider clock
  Fpfd = Fref / (DIVR + 1) = 25MHz / (1 + 1) = 12.5MHz

The resulting VCO frequency is then
  Fvco = Fpfd * (DIVF + 1) * 2 = 12.5MHz * (52 + 1) * 2 = 1325MHz

And the output frequency is
  Fout = Fvco / 2^DIVQ = 1325MHz / 2^3 = 165.625Mhz

This all adds up to the constrains of the PLL:
    10MHz <= Fpfd <= 200MHz
    20MHz <= Fout <= 1000MHz
  1000MHz <= Fvco <= 2000MHz

Fixes: 290deaa10c ("ARM: dts: add DT for lan966 SoC and 2-port board pcb8291")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Kavyasree Kotagiri <kavyasree.kotagiri@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220326194028.2945985-1-michael@walle.cc
2022-07-18 09:31:22 +03:00
Robert Hancock
4ca8ca873d i2c: cadence: Change large transfer count reset logic to be unconditional
Problems were observed on the Xilinx ZynqMP platform with large I2C reads.
When a read of 277 bytes was performed, the controller NAKed the transfer
after only 252 bytes were transferred and returned an ENXIO error on the
transfer.

There is some code in cdns_i2c_master_isr to handle this case by resetting
the transfer count in the controller before it reaches 0, to allow larger
transfers to work, but it was conditional on the CDNS_I2C_BROKEN_HOLD_BIT
quirk being set on the controller, and ZynqMP uses the r1p14 version of
the core where this quirk is not being set. The requirement to do this to
support larger reads seems like an inherently required workaround due to
the core only having an 8-bit transfer size register, so it does not
appear that this should be conditional on the broken HOLD bit quirk which
is used elsewhere in the driver.

Remove the dependency on the CDNS_I2C_BROKEN_HOLD_BIT for this transfer
size reset logic to fix this problem.

Fixes: 63cab195bf ("i2c: removed work arounds in i2c driver for Zynq Ultrascale+ MPSoC")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Shubhrajyoti Datta <Shubhrajyoti.datta@amd.com>
Acked-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-07-16 14:44:12 +02:00
Flavio Suligoi
824a826e2e i2c: imx: fix typo in comment
to provid --> to provide

Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-07-16 14:18:49 +02:00
Vadim Pasternak
e1f77ecc75 i2c: mlxcpld: Fix register setting for 400KHz frequency
Fix setting of 'Half Cycle' register for 400KHz frequency.

Fixes: fa1049135c ("i2c: mlxcpld: Modify register setting for 400KHz frequency")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-07-16 14:17:11 +02:00
Kuniyuki Iwashima
11052589cf tcp/udp: Make early_demux back namespacified.
Commit e21145a987 ("ipv4: namespacify ip_early_demux sysctl knob") made
it possible to enable/disable early_demux on a per-netns basis.  Then, we
introduced two knobs, tcp_early_demux and udp_early_demux, to switch it for
TCP/UDP in commit dddb64bcb3 ("net: Add sysctl to toggle early demux for
tcp and udp").  However, the .proc_handler() was wrong and actually
disabled us from changing the behaviour in each netns.

We can execute early_demux if net.ipv4.ip_early_demux is on and each proto
.early_demux() handler is not NULL.  When we toggle (tcp|udp)_early_demux,
the change itself is saved in each netns variable, but the .early_demux()
handler is a global variable, so the handler is switched based on the
init_net's sysctl variable.  Thus, netns (tcp|udp)_early_demux knobs have
nothing to do with the logic.  Whether we CAN execute proto .early_demux()
is always decided by init_net's sysctl knob, and whether we DO it or not is
by each netns ip_early_demux knob.

This patch namespacifies (tcp|udp)_early_demux again.  For now, the users
of the .early_demux() handler are TCP and UDP only, and they are called
directly to avoid retpoline.  So, we can remove the .early_demux() handler
from inet6?_protos and need not dereference them in ip6?_rcv_finish_core().
If another proto needs .early_demux(), we can restore it at that time.

Fixes: dddb64bcb3 ("net: Add sysctl to toggle early demux for tcp and udp")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20220713175207.7727-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-15 18:50:35 -07:00
Jakub Kicinski
df254d4508 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-07-14

This series contains updates to e1000e and igc drivers.

Sasha re-enables GPT clock when exiting s0ix to prevent hardware unit
hang and reverts a workaround for this issue on e1000e.

Lennert Buytenhek restores checks for removed device while accessing
registers to prevent NULL pointer dereferences for igc.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igc: Reinstate IGC_REMOVED logic and implement it properly
  Revert "e1000e: Fix possible HW unit hang after an s0ix exit"
  e1000e: Enable GPT clock before sending message to CSME
====================

Link: https://lore.kernel.org/r/20220714175857.933537-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-15 16:45:30 -07:00
Liang He
a14bd74754 net: dsa: microchip: ksz_common: Fix refcount leak bug
In ksz_switch_register(), we should call of_node_put() for the
reference returned by of_get_child_by_name() which has increased
the refcount.

Fixes: 912aae27c6 ("net: dsa: microchip: really look for phy-mode in port nodes")
Signed-off-by: Liang He <windhl@126.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220714153138.375919-1-windhl@126.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-15 16:44:53 -07:00
Sascha Hauer
0fddf9ad06 mtd: rawnand: gpmi: Set WAIT_FOR_READY timeout based on program/erase times
06781a5026 Fixes the calculation of the DEVICE_BUSY_TIMEOUT register
value from busy_timeout_cycles. busy_timeout_cycles is calculated wrong
though: It is calculated based on the maximum page read time, but the
timeout is also used for page write and block erase operations which
require orders of magnitude bigger timeouts.

Fix this by calculating busy_timeout_cycles from the maximum of
tBERS_max and tPROG_max.

This is for now the easiest and most obvious way to fix the driver.
There's room for improvements though: The NAND_OP_WAITRDY_INSTR tells us
the desired timeout for the current operation, so we could program the
timeout dynamically for each operation instead of setting a fixed
timeout. Also we could wire up the interrupt handler to actually detect
and forward timeouts occurred when waiting for the chip being ready.

As a sidenote I verified that the change in 06781a5026 is really
correct. I wired up the interrupt handler in my tree and measured the
time between starting the operation and the timeout interrupt handler
coming in. The time increases 41us with each step in the timeout
register which corresponds to 4096 clock cycles with the 99MHz clock
that I have.

Fixes: 06781a5026 ("mtd: rawnand: gpmi: Fix setting busy timeout setting")
Fixes: b120612206 ("mtd: rawniand: gpmi: use core timings instead of an empirical derivation")
Cc: stable@vger.kernel.org
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Han Xu <han.xu@nxp.com>
Tested-by: Tomasz Moń <tomasz.mon@camlingroup.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2022-07-15 17:41:11 +02:00
Dmitry Osipenko
9b04369b06 drm/scheduler: Don't kill jobs in interrupt context
Interrupt context can't sleep. Drivers like Panfrost and MSM are taking
mutex when job is released, and thus, that code can sleep. This results
into "BUG: scheduling while atomic" if locks are contented while job is
freed. There is no good reason for releasing scheduler's jobs in IRQ
context, hence use normal context to fix the trouble.

Cc: stable@vger.kernel.org
Fixes: 542cff7893 ("drm/sched: Avoid lockdep spalt on killing a processes")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220411221536.283312-1-dmitry.osipenko@collabora.com
2022-07-15 10:09:15 -04:00
Stylon Wang
2d4bd81fea drm/amd/display: Fix new dmub notification enabling in DM
[Why]
Changes from "Fix for dmub outbox notification enable" need to land
in DM or DMUB outbox notification would be disabled.

[How]
Enable outbox notification only after interrupt are enabled and IRQ
handlers registered. Any pending notification will be sent by DMUB
once outbox notification is enabled.

Fixes: ed72087064 ("drm/amd/display: Fix for dmub outbox notification enable")
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Stylon Wang <stylon.wang@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-07-15 10:04:59 -04:00
David S. Miller
8f3184b951 Merge branch 'stmmac-dwmac-mediatec-clock-fix'
Biao Huang says:

====================
stmmac: dwmac-mediatek: fix clock issue

changes in v5:
1. add reivewd-by as Matthias's comments.
2. fix "warning: unused variable 'ret' [-Wunused-variable]" as Jakub's comments

changes in v4:
1. improve commit message and test ko insertion/remove as Matthias's comments.
2. add patch "net: stmmac: fix pm runtime issue in stmmac_dvr_remove()" to
   fix vlan filter deletion issue.
3. add patch "net: stmmac: fix unbalanced ptp clock issue in suspend/resume flow"
   to fix unbalanced ptp clock issue in suspend/resume flow.

changes in v3:
1. delete mediatek_dwmac_exit() since there is no operation in it,
as Matthias's comments.

changes in v2:
1. clock configuration is still needed in probe,
and invoke mediatek_dwmac_clks_config() instead.
2. update commit message.

v1:
remove duplicated clock configuration in init/exit.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 12:06:56 +01:00
Biao Huang
f4c7d8948e net: stmmac: fix unbalanced ptp clock issue in suspend/resume flow
Current stmmac driver will prepare/enable ptp_ref clock in
stmmac_init_tstamp_counter().

The stmmac_pltfr_noirq_suspend will disable it once in suspend flow.

But in resume flow,
	stmmac_pltfr_noirq_resume --> stmmac_init_tstamp_counter
	stmmac_resume --> stmmac_hw_setup --> stmmac_init_ptp --> stmmac_init_tstamp_counter
ptp_ref clock reference counter increases twice, which leads to unbalance
ptp clock when resume back.

Move ptp_ref clock prepare/enable out of stmmac_init_tstamp_counter to fix it.

Fixes: 0735e639f1 ("net: stmmac: skip only stmmac_ptp_register when resume from suspend")
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 12:06:56 +01:00
Biao Huang
0d9a15913b net: stmmac: fix pm runtime issue in stmmac_dvr_remove()
If netif is running when stmmac_dvr_remove is invoked,
the unregister_netdev will call ndo_stop(stmmac_release) and
vlan_kill_rx_filter(stmmac_vlan_rx_kill_vid).

Currently, stmmac_dvr_remove() will disable pm runtime before
unregister_netdev. When stmmac_vlan_rx_kill_vid is invoked,
pm_runtime_resume_and_get in it returns EACCESS error number,
and reports:

	dwmac-mediatek 11021000.ethernet eth0: stmmac_dvr_remove: removing driver
	dwmac-mediatek 11021000.ethernet eth0: FPE workqueue stop
	dwmac-mediatek 11021000.ethernet eth0: failed to kill vid 0081/0

Move the pm_runtime_disable to the end of stmmac_dvr_remove
to fix this issue.

Fixes: 6449520391 ("net: stmmac: properly handle with runtime pm in stmmac_dvr_remove()")
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 12:06:56 +01:00
Biao Huang
fa4b3ca60e stmmac: dwmac-mediatek: fix clock issue
The pm_runtime takes care of the clock handling in current
stmmac drivers, and dwmac-mediatek implement the
mediatek_dwmac_clks_config() as the callback for pm_runtime.

Then, stripping duplicated clocks handling in old init()/exit()
to fix clock issue in suspend/resume test.

As to clocks in probe/remove, vendor need symmetric handling to
ensure clocks balance.

Test pass, including suspend/resume and ko insertion/remove.

Fixes: 3186bdad97 ("stmmac: dwmac-mediatek: add platform level clocks management")
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 12:06:55 +01:00
David S. Miller
782d86fe44 Merge branch 'net-sysctl-races-round2'
Kuniyuki Iwashima says:

====================
sysctl: Fix data-races around ipv4_net_table (Round 2).

This series fixes data-races around 15 knobs after ip_default_ttl in
ipv4_net_table.

These two knobs are skipped.
  - ip_local_port_range is safe with its own lock.
  - ip_local_reserved_ports uses proc_do_large_bitmap(), which will need
    an additional lock and can be fixed later.

So, the next round will start with igmp_link_local_mcast_reports.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:56 +01:00
Kuniyuki Iwashima
2a85388f1d tcp: Fix a data-race around sysctl_tcp_probe_interval.
While reading sysctl_tcp_probe_interval, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 05cbc0db03 ("ipv4: Create probe timer for tcp PMTU as per RFC4821")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:56 +01:00
Kuniyuki Iwashima
92c0aa4175 tcp: Fix a data-race around sysctl_tcp_probe_threshold.
While reading sysctl_tcp_probe_threshold, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 6b58e0a5f3 ("ipv4: Use binary search to choose tcp PMTU probe_size")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:56 +01:00
Kuniyuki Iwashima
8e92d44236 tcp: Fix a data-race around sysctl_tcp_mtu_probe_floor.
While reading sysctl_tcp_mtu_probe_floor, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: c04b79b6cf ("tcp: add new tcp_mtu_probe_floor sysctl")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:56 +01:00
Kuniyuki Iwashima
78eb166cde tcp: Fix data-races around sysctl_tcp_min_snd_mss.
While reading sysctl_tcp_min_snd_mss, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 5f3e2bf008 ("tcp: add tcp_min_snd_mss sysctl")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:56 +01:00
Kuniyuki Iwashima
88d78bc097 tcp: Fix data-races around sysctl_tcp_base_mss.
While reading sysctl_tcp_base_mss, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 5d424d5a67 ("[TCP]: MTU probing")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Kuniyuki Iwashima
f47d00e077 tcp: Fix data-races around sysctl_tcp_mtu_probing.
While reading sysctl_tcp_mtu_probing, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 5d424d5a67 ("[TCP]: MTU probing")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Kuniyuki Iwashima
08a75f1067 tcp: Fix data-races around sysctl_tcp_l3mdev_accept.
While reading sysctl_tcp_l3mdev_accept, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 6dd9a14e92 ("net: Allow accepted sockets to be bound to l3mdev domain")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Kuniyuki Iwashima
1a0008f9df tcp/dccp: Fix a data-race around sysctl_tcp_fwmark_accept.
While reading sysctl_tcp_fwmark_accept, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 84f39b08d7 ("net: support marking accepting TCP sockets")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Kuniyuki Iwashima
85d0b4dbd7 ip: Fix a data-race around sysctl_fwmark_reflect.
While reading sysctl_fwmark_reflect, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: e110861f86 ("net: add a sysctl to reflect the fwmark on replies")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Kuniyuki Iwashima
0db2327658 ip: Fix a data-race around sysctl_ip_autobind_reuse.
While reading sysctl_ip_autobind_reuse, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 4b01a96742 ("tcp: bind(0) remove the SO_REUSEADDR restriction when ephemeral ports are exhausted.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Kuniyuki Iwashima
289d3b21fb ip: Fix data-races around sysctl_ip_nonlocal_bind.
While reading sysctl_ip_nonlocal_bind, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Kuniyuki Iwashima
7bf9e18d9a ip: Fix data-races around sysctl_ip_fwd_update_priority.
While reading sysctl_ip_fwd_update_priority, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 432e05d328 ("net: ipv4: Control SKB reprioritization after forwarding")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Kuniyuki Iwashima
60c158dc7b ip: Fix data-races around sysctl_ip_fwd_use_pmtu.
While reading sysctl_ip_fwd_use_pmtu, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: f87c10a8aa ("ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Kuniyuki Iwashima
0968d2a441 ip: Fix data-races around sysctl_ip_no_pmtu_disc.
While reading sysctl_ip_no_pmtu_disc, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Kuniyuki Iwashima
8281b7ec5c ip: Fix data-races around sysctl_ip_default_ttl.
While reading sysctl_ip_default_ttl, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-15 11:49:55 +01:00
Takashi Iwai
cf33ce6f0c Merge tag 'asoc-fix-v5.19-rc4-2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Drop Rockchip BCLK management for v5.19

As covered in the second revert commit in this pull request the version
of the BCLK muxing that's in v5.19 is causing issues, let's just revert
it and wait for the more complete support in v5.20 instead.
2022-07-15 12:31:07 +02:00
Krzysztof Kozlowski
89551fdd44 riscv: dts: align gpio-key node names with dtschema
The node names should be generic and DT schema expects certain pattern
(e.g. with key/button/switch).

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220624170811.66395-1-krzysztof.kozlowski@linaro.org
Link: https://lore.kernel.org/all/20220616005224.18391-1-krzysztof.kozlowski@linaro.org/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-07-14 14:30:20 -07:00
Li Zhengyu
3a66a08759 RISC-V: kexec: Fix build error without CONFIG_KEXEC
When CONFIG_KEXEC_FILE=y but CONFIG_KEXEC is not set:

kernel/kexec_core.o: In function `kimage_free':
kexec_core.c:(.text+0xa0c): undefined reference to `machine_kexec_cleanup'
kernel/kexec_core.o: In function `.L0 ':
kexec_core.c:(.text+0xde8): undefined reference to `machine_crash_shutdown'
kexec_core.c:(.text+0xdf4): undefined reference to `machine_kexec'
kernel/kexec_core.o: In function `.L231':
kexec_core.c:(.text+0xe1c): undefined reference to `riscv_crash_save_regs'
kernel/kexec_core.o: In function `.L0 ':
kexec_core.c:(.text+0x119e): undefined reference to `machine_shutdown'
kernel/kexec_core.o: In function `.L312':
kexec_core.c:(.text+0x11b2): undefined reference to `machine_kexec'
kernel/kexec_file.o: In function `.L0 ':
kexec_file.c:(.text+0xb84): undefined reference to `machine_kexec_prepare'
kernel/kexec_file.o: In function `.L177':
kexec_file.c:(.text+0xc5a): undefined reference to `machine_kexec_prepare'
Makefile:1160: recipe for target 'vmlinux' failed
make: *** [vmlinux] Error 1

These symbols should depend on CONFIG_KEXEC_CORE rather than CONFIG_KEXEC
when kexec_file has been implemented on RISC-V, like the other archs have
done.

Signed-off-by: Li Zhengyu <lizhengyu3@huawei.com>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20220601070204.26882-1-lizhengyu3@huawei.com
Fixes: 6261586e0c ("RISC-V: Add kexec_file support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-07-14 14:23:41 -07:00
Li Zhengyu
a927444aa9 RISCV: kexec: Fix build error without CONFIG_MODULES
When CONFIG_MODULES is not set/enabled:

../arch/riscv/kernel/elf_kexec.c:353:9: error: unknown type name 'Elf_Rela'; did you mean 'Elf64_Rela'?
  353 |         Elf_Rela *relas;
      |         ^~~~~~~~
      |         Elf64_Rela

Replace Elf_Rela by Elf64_Rela to avoid relying on CONFIG_MODULES.

Signed-off-by: Li Zhengyu <lizhengyu3@huawei.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20220601063924.13037-1-lizhengyu3@huawei.com
Fixes: 838b3e2848 ("RISC-V: Load purgatory in kexec_file")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-07-14 14:23:33 -07:00
Lennert Buytenhek
7c1ddcee53 igc: Reinstate IGC_REMOVED logic and implement it properly
The initially merged version of the igc driver code (via commit
146740f9ab, "igc: Add support for PF") contained the following
IGC_REMOVED checks in the igc_rd32/wr32() MMIO accessors:

	u32 igc_rd32(struct igc_hw *hw, u32 reg)
	{
		u8 __iomem *hw_addr = READ_ONCE(hw->hw_addr);
		u32 value = 0;

		if (IGC_REMOVED(hw_addr))
			return ~value;

		value = readl(&hw_addr[reg]);

		/* reads should not return all F's */
		if (!(~value) && (!reg || !(~readl(hw_addr))))
			hw->hw_addr = NULL;

		return value;
	}

And:

	#define wr32(reg, val) \
	do { \
		u8 __iomem *hw_addr = READ_ONCE((hw)->hw_addr); \
		if (!IGC_REMOVED(hw_addr)) \
			writel((val), &hw_addr[(reg)]); \
	} while (0)

E.g. igb has similar checks in its MMIO accessors, and has a similar
macro E1000_REMOVED, which is implemented as follows:

	#define E1000_REMOVED(h) unlikely(!(h))

These checks serve to detect and take note of an 0xffffffff MMIO read
return from the device, which can be caused by a PCIe link flap or some
other kind of PCI bus error, and to avoid performing MMIO reads and
writes from that point onwards.

However, the IGC_REMOVED macro was not originally implemented:

	#ifndef IGC_REMOVED
	#define IGC_REMOVED(a) (0)
	#endif /* IGC_REMOVED */

This led to the IGC_REMOVED logic to be removed entirely in a
subsequent commit (commit 3c215fb18e, "igc: remove IGC_REMOVED
function"), with the rationale that such checks matter only for
virtualization and that igc does not support virtualization -- but a
PCIe device can become detached even without virtualization being in
use, and without proper checks, a PCIe bus error affecting an igc
adapter will lead to various NULL pointer dereferences, as the first
access after the error will set hw->hw_addr to NULL, and subsequent
accesses will blindly dereference this now-NULL pointer.

This patch reinstates the IGC_REMOVED checks in igc_rd32/wr32(), and
implements IGC_REMOVED the way it is done for igb, by checking for the
unlikely() case of hw_addr being NULL.  This change prevents the oopses
seen when a PCIe link flap occurs on an igc adapter.

Fixes: 146740f9ab ("igc: Add support for PF")
Signed-off-by: Lennert Buytenhek <buytenh@arista.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-14 09:18:02 -07:00
Sasha Neftin
6cfa45361d Revert "e1000e: Fix possible HW unit hang after an s0ix exit"
This reverts commit 1866aa0d0d.

Commit 1866aa0d0d ("e1000e: Fix possible HW unit hang after an s0ix
exit") was a workaround for CSME problem to handle messages comes via H2ME
mailbox. This problem has been fixed by patch "e1000e: Enable the GPT
clock before sending message to the CSME".

Fixes: 3e55d23171 ("e1000e: Add handshake with the CSME to support S0ix")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214821
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-14 09:17:21 -07:00
Sasha Neftin
b49feacbef e1000e: Enable GPT clock before sending message to CSME
On corporate (CSME) ADL systems, the Ethernet Controller may stop working
("HW unit hang") after exiting from the s0ix state. The reason is that
CSME misses the message sent by the host. Enabling the dynamic GPT clock
solves this problem. This clock is cleared upon HW initialization.

Fixes: 3e55d23171 ("e1000e: Add handshake with the CSME to support S0ix")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214821
Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-14 09:16:47 -07:00
Mark Brown
1e347f861d ASoC: rockchip-i2s: Undo BCLK pinctrl changes
The version of the BCLK pinctrl management changes that made it into
v5.19 has caused problems on some systems due to overly strict DT
requirements but attempts to fix it have caused further breakage on
other platforms.  Just drop the changes for this release, we already
have a better version queued for -next.

Fixes: 26b9f2fa7b ("ASoC: rockchip: i2s: Fix NULL pointer dereference when pinctrl is not found")
Fixes: a5450aba73 ("ASoC: rockchip: i2s: switch BCLK to GPIO")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220713130451.31481-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-07-14 13:25:52 +01:00
Tony Lindgren
51189eb9dd mmc: sdhci-omap: Fix a lockdep warning for PM runtime init
We need runtime PM enabled early in probe before sdhci_setup_host() for
sdhci_omap_set_capabilities(). But on the first runtime resume we must
not call sdhci_runtime_resume_host() as sdhci_setup_host() has not been
called yet. Let's check for an initialized controller like we already do
for context restore to fix a lockdep warning.

Fixes: f433e8aac6 ("mmc: sdhci-omap: Implement PM runtime functions")
Reported-by: Yegor Yefremov <yegorslists@googlemail.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220622051215.34063-1-tony@atomide.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-07-13 12:29:17 +02:00
Justin Stitt
e79b9473e9 net: ipv4: fix clang -Wformat warnings
When building with Clang we encounter these warnings:
| net/ipv4/ah4.c:513:4: error: format specifies type 'unsigned short' but
| the argument has type 'int' [-Werror,-Wformat]
| aalg_desc->uinfo.auth.icv_fullbits / 8);
-
| net/ipv4/esp4.c:1114:5: error: format specifies type 'unsigned short'
| but the argument has type 'int' [-Werror,-Wformat]
| aalg_desc->uinfo.auth.icv_fullbits / 8);

`aalg_desc->uinfo.auth.icv_fullbits` is a u16 but due to default
argument promotion becomes an int.

Variadic functions (printf-like) undergo default argument promotion.
Documentation/core-api/printk-formats.rst specifically recommends using
the promoted-to-type's format flag.

As per C11 6.3.1.1:
(https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf) `If an int
can represent all values of the original type ..., the value is
converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions.` Thus it makes sense to change
%hu to %d not only to follow this standard but to suppress the warning
as well.

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Justin Stitt <justinstitt@google.com>
Suggested-by: Joe Perches <joe@perches.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-07-12 12:58:53 +02:00
Alexandru Elisei
26b9f2fa7b ASoC: rockchip: i2s: Fix NULL pointer dereference when pinctrl is not found
Commit a5450aba73 ("ASoC: rockchip: i2s: switch BCLK to GPIO") switched
BCLK to GPIO functions when probing the i2s bus interface, but missed
adding a check for when devm_pinctrl_get() returns an error.  This can lead
to the following NULL pointer dereference on a rockpro64-v2 if there are no
"pinctrl" properties in the i2s device tree node.

Check that i2s->pinctrl is valid before attempting to search for the
bclk_on and bclk_off pinctrl states.

Fixes: a5450aba73 ("ASoC: rockchip: i2s: switch BCLK to GPIO")
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20220711130522.401551-1-alexandru.elisei@arm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-07-11 14:25:18 +01:00
Jacky Bai
a9ab5bf33c MAINTAINERS: Update freescale pin controllers maintainer
Add myself as co-maintainer of freescale pin controllers driver.
As Stefan is no longer working on NXP pin controller, so remove
Stefan from the list as suggested by him.

Signed-off-by: Jacky Bai <ping.bai@nxp.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Acked-by: Stefan Agner <stefan@agner.ch>
Link: https://lore.kernel.org/r/20220711083528.27710-1-ping.bai@nxp.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-07-11 15:19:10 +02:00
William Dean
acf50233fc pinctrl: sunplus: Add check for kcalloc
As the potential failure of the kcalloc(),
it should be better to check it in order to
avoid the dereference of the NULL pointer.

Fixes: aa74c44be1 ("pinctrl: Add driver for Sunplus SP7021")
Reported-by: Hacash Robot <hacashRobot@santino.com>
Signed-off-by: William Dean <williamsukatube@gmail.com>
Link: https://lore.kernel.org/r/20220710154822.2610801-1-williamsukatube@163.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-07-11 15:04:19 +02:00
William Dean
c3b821e8e4 pinctrl: ralink: Check for null return of devm_kcalloc
Because of the possible failure of the allocation, data->domains might
be NULL pointer and will cause the dereference of the NULL pointer
later.
Therefore, it might be better to check it and directly return -ENOMEM
without releasing data manually if fails, because the comment of the
devm_kmalloc() says "Memory allocated with this function is
automatically freed on driver detach.".

Fixes: a86854d0c5 ("treewide: devm_kzalloc() -> devm_kcalloc()")
Reported-by: Hacash Robot <hacashRobot@santino.com>
Signed-off-by: William Dean <williamsukatube@gmail.com>
Link: https://lore.kernel.org/r/20220710154922.2610876-1-williamsukatube@163.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-07-11 14:40:17 +02:00
Mustafa Ismail
cc0315564d RDMA/irdma: Fix sleep from invalid context BUG
Taking the qos_mutex to process RoCEv2 QP's on netdev events causes a
kernel splat.

Fix this by removing the handling for RoCEv2 in
irdma_cm_teardown_connections that uses the mutex. This handling is only
needed for iWARP to avoid having connections established while the link is
down or having connections remain functional after the IP address is
removed.

  BUG: sleeping function called from invalid context at kernel/locking/mutex.
  Call Trace:
  kernel: dump_stack+0x66/0x90
  kernel: ___might_sleep.cold.92+0x8d/0x9a
  kernel: mutex_lock+0x1c/0x40
  kernel: irdma_cm_teardown_connections+0x28e/0x4d0 [irdma]
  kernel: ? check_preempt_curr+0x7a/0x90
  kernel: ? select_idle_sibling+0x22/0x3c0
  kernel: ? select_task_rq_fair+0x94c/0xc90
  kernel: ? irdma_exec_cqp_cmd+0xc27/0x17c0 [irdma]
  kernel: ? __wake_up_common+0x7a/0x190
  kernel: irdma_if_notify+0x3cc/0x450 [irdma]
  kernel: ? sched_clock_cpu+0xc/0xb0
  kernel: irdma_inet6addr_event+0xc6/0x150 [irdma]

Fixes: 146b9756f1 ("RDMA/irdma: Add connection manager")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-07-11 03:04:16 -03:00
Mustafa Ismail
5e8afb8792 RDMA/irdma: Do not advertise 1GB page size for x722
x722 does not support 1GB page size but the irdma driver incorrectly
advertises 1GB page size support for x722 device to ib_core to compute the
best page size to use on this MR.  This could lead to incorrect start
offsets computed by hardware on the MR.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-07-11 03:03:50 -03:00
Andy Shevchenko
85ff37e302 gpiolib: cdev: Fix kernel doc for struct line
Kernel doc validator is not happy:
  gpiolib-cdev.c:487: warning: Function parameter or member 'hdesc' not described in 'line'
  gpiolib-cdev.c:487: warning: Function parameter or member 'raw_level' not described in 'line'
  gpiolib-cdev.c:487: warning: Function parameter or member 'total_discard_seq' not described in 'line'
  gpiolib-cdev.c:487: warning: Function parameter or member 'last_seqno' not described in 'line'

Describe above mentioned parameters.

Fixes: 2068339a6c ("gpiolib: cdev: Add hardware timestamp clock type")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Dipen Patel <dipenp@nvidia.com>
Acked-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-06-17 12:58:57 +02:00
Hangyu Hua
f85daf0e72 xfrm: xfrm_policy: fix a possible double xfrm_pols_put() in xfrm_bundle_lookup()
xfrm_policy_lookup() will call xfrm_pol_hold_rcu() to get a refcount of
pols[0]. This refcount can be dropped in xfrm_expand_policies() when
xfrm_expand_policies() return error. pols[0]'s refcount is balanced in
here. But xfrm_bundle_lookup() will also call xfrm_pols_put() with
num_pols == 1 to drop this refcount when xfrm_expand_policies() return
error.

This patch also fix an illegal address access. pols[0] will save a error
point when xfrm_policy_lookup fails. This lead to xfrm_pols_put to resolve
an illegal address in xfrm_bundle_lookup's error path.

Fix these by setting num_pols = 0 in xfrm_expand_policies()'s error path.

Fixes: 80c802f307 ("xfrm: cache bundles instead of policies for outgoing flows")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-06-02 11:05:19 +02:00
314 changed files with 3473 additions and 2289 deletions

View File

@@ -60,6 +60,10 @@ Arnd Bergmann <arnd@arndb.de>
Atish Patra <atishp@atishpatra.org> <atish.patra@wdc.com>
Axel Dyks <xl@xlsigned.net>
Axel Lin <axel.lin@gmail.com>
Baolin Wang <baolin.wang@linux.alibaba.com> <baolin.wang@linaro.org>
Baolin Wang <baolin.wang@linux.alibaba.com> <baolin.wang@spreadtrum.com>
Baolin Wang <baolin.wang@linux.alibaba.com> <baolin.wang@unisoc.com>
Baolin Wang <baolin.wang@linux.alibaba.com> <baolin.wang7@gmail.com>
Bart Van Assche <bvanassche@acm.org> <bart.vanassche@sandisk.com>
Bart Van Assche <bvanassche@acm.org> <bart.vanassche@wdc.com>
Ben Gardner <bgardner@wabtec.com>
@@ -135,6 +139,8 @@ Frank Rowand <frowand.list@gmail.com> <frowand@mvista.com>
Frank Zago <fzago@systemfabricworks.com>
Gao Xiang <xiang@kernel.org> <gaoxiang25@huawei.com>
Gao Xiang <xiang@kernel.org> <hsiangkao@aol.com>
Gao Xiang <xiang@kernel.org> <hsiangkao@linux.alibaba.com>
Gao Xiang <xiang@kernel.org> <hsiangkao@redhat.com>
Gerald Schaefer <gerald.schaefer@linux.ibm.com> <geraldsc@de.ibm.com>
Gerald Schaefer <gerald.schaefer@linux.ibm.com> <gerald.schaefer@de.ibm.com>
Gerald Schaefer <gerald.schaefer@linux.ibm.com> <geraldsc@linux.vnet.ibm.com>
@@ -371,6 +377,7 @@ Sean Nyekjaer <sean@geanix.com> <sean.nyekjaer@prevas.dk>
Sebastian Reichel <sre@kernel.org> <sebastian.reichel@collabora.co.uk>
Sebastian Reichel <sre@kernel.org> <sre@debian.org>
Sedat Dilek <sedat.dilek@gmail.com> <sedat.dilek@credativ.de>
Seth Forshee <sforshee@kernel.org> <seth.forshee@canonical.com>
Shiraz Hashim <shiraz.linux.kernel@gmail.com> <shiraz.hashim@st.com>
Shuah Khan <shuah@kernel.org> <shuahkhan@gmail.com>
Shuah Khan <shuah@kernel.org> <shuah.khan@hp.com>

View File

@@ -3176,6 +3176,7 @@
no_entry_flush [PPC]
no_uaccess_flush [PPC]
mmio_stale_data=off [X86]
retbleed=off [X86]
Exceptions:
This does not have any effect on
@@ -3198,6 +3199,7 @@
mds=full,nosmt [X86]
tsx_async_abort=full,nosmt [X86]
mmio_stale_data=full,nosmt [X86]
retbleed=auto,nosmt [X86]
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
@@ -5796,6 +5798,24 @@
expediting. Set to zero to disable automatic
expediting.
srcutree.srcu_max_nodelay [KNL]
Specifies the number of no-delay instances
per jiffy for which the SRCU grace period
worker thread will be rescheduled with zero
delay. Beyond this limit, worker thread will
be rescheduled with a sleep delay of one jiffy.
srcutree.srcu_max_nodelay_phase [KNL]
Specifies the per-grace-period phase, number of
non-sleeping polls of readers. Beyond this limit,
grace period worker thread will be rescheduled
with a sleep delay of one jiffy, between each
rescan of the readers, for a grace period phase.
srcutree.srcu_retry_check_delay [KNL]
Specifies number of microseconds of non-sleeping
delay between each non-sleeping poll of readers.
srcutree.small_contention_lim [KNL]
Specifies the number of update-side contention
events per jiffy will be tolerated before

View File

@@ -167,70 +167,65 @@ properties:
- in-band-status
fixed-link:
allOf:
- if:
type: array
then:
deprecated: true
items:
- minimum: 0
maximum: 31
description:
Emulated PHY ID, choose any but unique to the all
specified fixed-links
oneOf:
- $ref: /schemas/types.yaml#/definitions/uint32-array
deprecated: true
items:
- minimum: 0
maximum: 31
description:
Emulated PHY ID, choose any but unique to the all
specified fixed-links
- enum: [0, 1]
description:
Duplex configuration. 0 for half duplex or 1 for
full duplex
- enum: [0, 1]
description:
Duplex configuration. 0 for half duplex or 1 for
full duplex
- enum: [10, 100, 1000, 2500, 10000]
description:
Link speed in Mbits/sec.
- enum: [10, 100, 1000, 2500, 10000]
description:
Link speed in Mbits/sec.
- enum: [0, 1]
description:
Pause configuration. 0 for no pause, 1 for pause
- enum: [0, 1]
description:
Pause configuration. 0 for no pause, 1 for pause
- enum: [0, 1]
description:
Asymmetric pause configuration. 0 for no asymmetric
pause, 1 for asymmetric pause
- enum: [0, 1]
description:
Asymmetric pause configuration. 0 for no asymmetric
pause, 1 for asymmetric pause
- type: object
additionalProperties: false
properties:
speed:
description:
Link speed.
$ref: /schemas/types.yaml#/definitions/uint32
enum: [10, 100, 1000, 2500, 10000]
full-duplex:
$ref: /schemas/types.yaml#/definitions/flag
description:
Indicates that full-duplex is used. When absent, half
duplex is assumed.
- if:
type: object
then:
properties:
speed:
description:
Link speed.
$ref: /schemas/types.yaml#/definitions/uint32
enum: [10, 100, 1000, 2500, 10000]
pause:
$ref: /schemas/types.yaml#definitions/flag
description:
Indicates that pause should be enabled.
full-duplex:
$ref: /schemas/types.yaml#/definitions/flag
description:
Indicates that full-duplex is used. When absent, half
duplex is assumed.
asym-pause:
$ref: /schemas/types.yaml#/definitions/flag
description:
Indicates that asym_pause should be enabled.
pause:
$ref: /schemas/types.yaml#definitions/flag
description:
Indicates that pause should be enabled.
link-gpios:
maxItems: 1
description:
GPIO to determine if the link is up
asym-pause:
$ref: /schemas/types.yaml#/definitions/flag
description:
Indicates that asym_pause should be enabled.
link-gpios:
maxItems: 1
description:
GPIO to determine if the link is up
required:
- speed
required:
- speed
additionalProperties: true

View File

@@ -183,6 +183,7 @@ properties:
Should specify the gpio for phy reset.
phy-reset-duration:
$ref: /schemas/types.yaml#/definitions/uint32
deprecated: true
description:
Reset duration in milliseconds. Should present only if property
@@ -191,12 +192,14 @@ properties:
and 1 millisecond will be used instead.
phy-reset-active-high:
type: boolean
deprecated: true
description:
If present then the reset sequence using the GPIO specified in the
"phy-reset-gpios" property is reversed (H=reset state, L=operation state).
phy-reset-post-delay:
$ref: /schemas/types.yaml#/definitions/uint32
deprecated: true
description:
Post reset delay in milliseconds. If present then a delay of phy-reset-post-delay

View File

@@ -503,26 +503,108 @@ per-port PHY specific details: interface connection, MDIO bus location, etc.
Driver development
==================
DSA switch drivers need to implement a dsa_switch_ops structure which will
DSA switch drivers need to implement a ``dsa_switch_ops`` structure which will
contain the various members described below.
``register_switch_driver()`` registers this dsa_switch_ops in its internal list
of drivers to probe for. ``unregister_switch_driver()`` does the exact opposite.
Probing, registration and device lifetime
-----------------------------------------
Unless requested differently by setting the priv_size member accordingly, DSA
does not allocate any driver private context space.
DSA switches are regular ``device`` structures on buses (be they platform, SPI,
I2C, MDIO or otherwise). The DSA framework is not involved in their probing
with the device core.
Switch registration from the perspective of a driver means passing a valid
``struct dsa_switch`` pointer to ``dsa_register_switch()``, usually from the
switch driver's probing function. The following members must be valid in the
provided structure:
- ``ds->dev``: will be used to parse the switch's OF node or platform data.
- ``ds->num_ports``: will be used to create the port list for this switch, and
to validate the port indices provided in the OF node.
- ``ds->ops``: a pointer to the ``dsa_switch_ops`` structure holding the DSA
method implementations.
- ``ds->priv``: backpointer to a driver-private data structure which can be
retrieved in all further DSA method callbacks.
In addition, the following flags in the ``dsa_switch`` structure may optionally
be configured to obtain driver-specific behavior from the DSA core. Their
behavior when set is documented through comments in ``include/net/dsa.h``.
- ``ds->vlan_filtering_is_global``
- ``ds->needs_standalone_vlan_filtering``
- ``ds->configure_vlan_while_not_filtering``
- ``ds->untag_bridge_pvid``
- ``ds->assisted_learning_on_cpu_port``
- ``ds->mtu_enforcement_ingress``
- ``ds->fdb_isolation``
Internally, DSA keeps an array of switch trees (group of switches) global to
the kernel, and attaches a ``dsa_switch`` structure to a tree on registration.
The tree ID to which the switch is attached is determined by the first u32
number of the ``dsa,member`` property of the switch's OF node (0 if missing).
The switch ID within the tree is determined by the second u32 number of the
same OF property (0 if missing). Registering multiple switches with the same
switch ID and tree ID is illegal and will cause an error. Using platform data,
a single switch and a single switch tree is permitted.
In case of a tree with multiple switches, probing takes place asymmetrically.
The first N-1 callers of ``dsa_register_switch()`` only add their ports to the
port list of the tree (``dst->ports``), each port having a backpointer to its
associated switch (``dp->ds``). Then, these switches exit their
``dsa_register_switch()`` call early, because ``dsa_tree_setup_routing_table()``
has determined that the tree is not yet complete (not all ports referenced by
DSA links are present in the tree's port list). The tree becomes complete when
the last switch calls ``dsa_register_switch()``, and this triggers the effective
continuation of initialization (including the call to ``ds->ops->setup()``) for
all switches within that tree, all as part of the calling context of the last
switch's probe function.
The opposite of registration takes place when calling ``dsa_unregister_switch()``,
which removes a switch's ports from the port list of the tree. The entire tree
is torn down when the first switch unregisters.
It is mandatory for DSA switch drivers to implement the ``shutdown()`` callback
of their respective bus, and call ``dsa_switch_shutdown()`` from it (a minimal
version of the full teardown performed by ``dsa_unregister_switch()``).
The reason is that DSA keeps a reference on the master net device, and if the
driver for the master device decides to unbind on shutdown, DSA's reference
will block that operation from finalizing.
Either ``dsa_switch_shutdown()`` or ``dsa_unregister_switch()`` must be called,
but not both, and the device driver model permits the bus' ``remove()`` method
to be called even if ``shutdown()`` was already called. Therefore, drivers are
expected to implement a mutual exclusion method between ``remove()`` and
``shutdown()`` by setting their drvdata to NULL after any of these has run, and
checking whether the drvdata is NULL before proceeding to take any action.
After ``dsa_switch_shutdown()`` or ``dsa_unregister_switch()`` was called, no
further callbacks via the provided ``dsa_switch_ops`` may take place, and the
driver may free the data structures associated with the ``dsa_switch``.
Switch configuration
--------------------
- ``tag_protocol``: this is to indicate what kind of tagging protocol is supported,
should be a valid value from the ``dsa_tag_protocol`` enum
- ``get_tag_protocol``: this is to indicate what kind of tagging protocol is
supported, should be a valid value from the ``dsa_tag_protocol`` enum.
The returned information does not have to be static; the driver is passed the
CPU port number, as well as the tagging protocol of a possibly stacked
upstream switch, in case there are hardware limitations in terms of supported
tag formats.
- ``probe``: probe routine which will be invoked by the DSA platform device upon
registration to test for the presence/absence of a switch device. For MDIO
devices, it is recommended to issue a read towards internal registers using
the switch pseudo-PHY and return whether this is a supported device. For other
buses, return a non-NULL string
- ``change_tag_protocol``: when the default tagging protocol has compatibility
problems with the master or other issues, the driver may support changing it
at runtime, either through a device tree property or through sysfs. In that
case, further calls to ``get_tag_protocol`` should report the protocol in
current use.
- ``setup``: setup function for the switch, this function is responsible for setting
up the ``dsa_switch_ops`` private structure with all it needs: register maps,
@@ -535,7 +617,17 @@ Switch configuration
fully configured and ready to serve any kind of request. It is recommended
to issue a software reset of the switch during this setup function in order to
avoid relying on what a previous software agent such as a bootloader/firmware
may have previously configured.
may have previously configured. The method responsible for undoing any
applicable allocations or operations done here is ``teardown``.
- ``port_setup`` and ``port_teardown``: methods for initialization and
destruction of per-port data structures. It is mandatory for some operations
such as registering and unregistering devlink port regions to be done from
these methods, otherwise they are optional. A port will be torn down only if
it has been previously set up. It is possible for a port to be set up during
probing only to be torn down immediately afterwards, for example in case its
PHY cannot be found. In this case, probing of the DSA switch continues
without that particular port.
PHY devices and link management
-------------------------------
@@ -635,26 +727,198 @@ Power management
``BR_STATE_DISABLED`` and propagating changes to the hardware if this port is
disabled while being a bridge member
Address databases
-----------------
Switching hardware is expected to have a table for FDB entries, however not all
of them are active at the same time. An address database is the subset (partition)
of FDB entries that is active (can be matched by address learning on RX, or FDB
lookup on TX) depending on the state of the port. An address database may
occasionally be called "FID" (Filtering ID) in this document, although the
underlying implementation may choose whatever is available to the hardware.
For example, all ports that belong to a VLAN-unaware bridge (which is
*currently* VLAN-unaware) are expected to learn source addresses in the
database associated by the driver with that bridge (and not with other
VLAN-unaware bridges). During forwarding and FDB lookup, a packet received on a
VLAN-unaware bridge port should be able to find a VLAN-unaware FDB entry having
the same MAC DA as the packet, which is present on another port member of the
same bridge. At the same time, the FDB lookup process must be able to not find
an FDB entry having the same MAC DA as the packet, if that entry points towards
a port which is a member of a different VLAN-unaware bridge (and is therefore
associated with a different address database).
Similarly, each VLAN of each offloaded VLAN-aware bridge should have an
associated address database, which is shared by all ports which are members of
that VLAN, but not shared by ports belonging to different bridges that are
members of the same VID.
In this context, a VLAN-unaware database means that all packets are expected to
match on it irrespective of VLAN ID (only MAC address lookup), whereas a
VLAN-aware database means that packets are supposed to match based on the VLAN
ID from the classified 802.1Q header (or the pvid if untagged).
At the bridge layer, VLAN-unaware FDB entries have the special VID value of 0,
whereas VLAN-aware FDB entries have non-zero VID values. Note that a
VLAN-unaware bridge may have VLAN-aware (non-zero VID) FDB entries, and a
VLAN-aware bridge may have VLAN-unaware FDB entries. As in hardware, the
software bridge keeps separate address databases, and offloads to hardware the
FDB entries belonging to these databases, through switchdev, asynchronously
relative to the moment when the databases become active or inactive.
When a user port operates in standalone mode, its driver should configure it to
use a separate database called a port private database. This is different from
the databases described above, and should impede operation as standalone port
(packet in, packet out to the CPU port) as little as possible. For example,
on ingress, it should not attempt to learn the MAC SA of ingress traffic, since
learning is a bridging layer service and this is a standalone port, therefore
it would consume useless space. With no address learning, the port private
database should be empty in a naive implementation, and in this case, all
received packets should be trivially flooded to the CPU port.
DSA (cascade) and CPU ports are also called "shared" ports because they service
multiple address databases, and the database that a packet should be associated
to is usually embedded in the DSA tag. This means that the CPU port may
simultaneously transport packets coming from a standalone port (which were
classified by hardware in one address database), and from a bridge port (which
were classified to a different address database).
Switch drivers which satisfy certain criteria are able to optimize the naive
configuration by removing the CPU port from the flooding domain of the switch,
and just program the hardware with FDB entries pointing towards the CPU port
for which it is known that software is interested in those MAC addresses.
Packets which do not match a known FDB entry will not be delivered to the CPU,
which will save CPU cycles required for creating an skb just to drop it.
DSA is able to perform host address filtering for the following kinds of
addresses:
- Primary unicast MAC addresses of ports (``dev->dev_addr``). These are
associated with the port private database of the respective user port,
and the driver is notified to install them through ``port_fdb_add`` towards
the CPU port.
- Secondary unicast and multicast MAC addresses of ports (addresses added
through ``dev_uc_add()`` and ``dev_mc_add()``). These are also associated
with the port private database of the respective user port.
- Local/permanent bridge FDB entries (``BR_FDB_LOCAL``). These are the MAC
addresses of the bridge ports, for which packets must be terminated locally
and not forwarded. They are associated with the address database for that
bridge.
- Static bridge FDB entries installed towards foreign (non-DSA) interfaces
present in the same bridge as some DSA switch ports. These are also
associated with the address database for that bridge.
- Dynamically learned FDB entries on foreign interfaces present in the same
bridge as some DSA switch ports, only if ``ds->assisted_learning_on_cpu_port``
is set to true by the driver. These are associated with the address database
for that bridge.
For various operations detailed below, DSA provides a ``dsa_db`` structure
which can be of the following types:
- ``DSA_DB_PORT``: the FDB (or MDB) entry to be installed or deleted belongs to
the port private database of user port ``db->dp``.
- ``DSA_DB_BRIDGE``: the entry belongs to one of the address databases of bridge
``db->bridge``. Separation between the VLAN-unaware database and the per-VID
databases of this bridge is expected to be done by the driver.
- ``DSA_DB_LAG``: the entry belongs to the address database of LAG ``db->lag``.
Note: ``DSA_DB_LAG`` is currently unused and may be removed in the future.
The drivers which act upon the ``dsa_db`` argument in ``port_fdb_add``,
``port_mdb_add`` etc should declare ``ds->fdb_isolation`` as true.
DSA associates each offloaded bridge and each offloaded LAG with a one-based ID
(``struct dsa_bridge :: num``, ``struct dsa_lag :: id``) for the purposes of
refcounting addresses on shared ports. Drivers may piggyback on DSA's numbering
scheme (the ID is readable through ``db->bridge.num`` and ``db->lag.id`` or may
implement their own.
Only the drivers which declare support for FDB isolation are notified of FDB
entries on the CPU port belonging to ``DSA_DB_PORT`` databases.
For compatibility/legacy reasons, ``DSA_DB_BRIDGE`` addresses are notified to
drivers even if they do not support FDB isolation. However, ``db->bridge.num``
and ``db->lag.id`` are always set to 0 in that case (to denote the lack of
isolation, for refcounting purposes).
Note that it is not mandatory for a switch driver to implement physically
separate address databases for each standalone user port. Since FDB entries in
the port private databases will always point to the CPU port, there is no risk
for incorrect forwarding decisions. In this case, all standalone ports may
share the same database, but the reference counting of host-filtered addresses
(not deleting the FDB entry for a port's MAC address if it's still in use by
another port) becomes the responsibility of the driver, because DSA is unaware
that the port databases are in fact shared. This can be achieved by calling
``dsa_fdb_present_in_other_db()`` and ``dsa_mdb_present_in_other_db()``.
The down side is that the RX filtering lists of each user port are in fact
shared, which means that user port A may accept a packet with a MAC DA it
shouldn't have, only because that MAC address was in the RX filtering list of
user port B. These packets will still be dropped in software, however.
Bridge layer
------------
Offloading the bridge forwarding plane is optional and handled by the methods
below. They may be absent, return -EOPNOTSUPP, or ``ds->max_num_bridges`` may
be non-zero and exceeded, and in this case, joining a bridge port is still
possible, but the packet forwarding will take place in software, and the ports
under a software bridge must remain configured in the same way as for
standalone operation, i.e. have all bridging service functions (address
learning etc) disabled, and send all received packets to the CPU port only.
Concretely, a port starts offloading the forwarding plane of a bridge once it
returns success to the ``port_bridge_join`` method, and stops doing so after
``port_bridge_leave`` has been called. Offloading the bridge means autonomously
learning FDB entries in accordance with the software bridge port's state, and
autonomously forwarding (or flooding) received packets without CPU intervention.
This is optional even when offloading a bridge port. Tagging protocol drivers
are expected to call ``dsa_default_offload_fwd_mark(skb)`` for packets which
have already been autonomously forwarded in the forwarding domain of the
ingress switch port. DSA, through ``dsa_port_devlink_setup()``, considers all
switch ports part of the same tree ID to be part of the same bridge forwarding
domain (capable of autonomous forwarding to each other).
Offloading the TX forwarding process of a bridge is a distinct concept from
simply offloading its forwarding plane, and refers to the ability of certain
driver and tag protocol combinations to transmit a single skb coming from the
bridge device's transmit function to potentially multiple egress ports (and
thereby avoid its cloning in software).
Packets for which the bridge requests this behavior are called data plane
packets and have ``skb->offload_fwd_mark`` set to true in the tag protocol
driver's ``xmit`` function. Data plane packets are subject to FDB lookup,
hardware learning on the CPU port, and do not override the port STP state.
Additionally, replication of data plane packets (multicast, flooding) is
handled in hardware and the bridge driver will transmit a single skb for each
packet that may or may not need replication.
When the TX forwarding offload is enabled, the tag protocol driver is
responsible to inject packets into the data plane of the hardware towards the
correct bridging domain (FID) that the port is a part of. The port may be
VLAN-unaware, and in this case the FID must be equal to the FID used by the
driver for its VLAN-unaware address database associated with that bridge.
Alternatively, the bridge may be VLAN-aware, and in that case, it is guaranteed
that the packet is also VLAN-tagged with the VLAN ID that the bridge processed
this packet in. It is the responsibility of the hardware to untag the VID on
the egress-untagged ports, or keep the tag on the egress-tagged ones.
- ``port_bridge_join``: bridge layer function invoked when a given switch port is
added to a bridge, this function should do what's necessary at the switch
level to permit the joining port to be added to the relevant logical
domain for it to ingress/egress traffic with other members of the bridge.
By setting the ``tx_fwd_offload`` argument to true, the TX forwarding process
of this bridge is also offloaded.
- ``port_bridge_leave``: bridge layer function invoked when a given switch port is
removed from a bridge, this function should do what's necessary at the
switch level to deny the leaving port from ingress/egress traffic from the
remaining bridge members. When the port leaves the bridge, it should be aged
out at the switch hardware for the switch to (re) learn MAC addresses behind
this port.
remaining bridge members.
- ``port_stp_state_set``: bridge layer function invoked when a given switch port STP
state is computed by the bridge layer and should be propagated to switch
hardware to forward/block/learn traffic. The switch driver is responsible for
computing a STP state change based on current and asked parameters and perform
the relevant ageing based on the intersection results
hardware to forward/block/learn traffic.
- ``port_bridge_flags``: bridge layer function invoked when a port must
configure its settings for e.g. flooding of unknown traffic or source address
@@ -667,21 +931,11 @@ Bridge layer
CPU port, and flooding towards the CPU port should also be enabled, due to a
lack of an explicit address filtering mechanism in the DSA core.
- ``port_bridge_tx_fwd_offload``: bridge layer function invoked after
``port_bridge_join`` when a driver sets ``ds->num_fwd_offloading_bridges`` to
a non-zero value. Returning success in this function activates the TX
forwarding offload bridge feature for this port, which enables the tagging
protocol driver to inject data plane packets towards the bridging domain that
the port is a part of. Data plane packets are subject to FDB lookup, hardware
learning on the CPU port, and do not override the port STP state.
Additionally, replication of data plane packets (multicast, flooding) is
handled in hardware and the bridge driver will transmit a single skb for each
packet that needs replication. The method is provided as a configuration
point for drivers that need to configure the hardware for enabling this
feature.
- ``port_bridge_tx_fwd_unoffload``: bridge layer function invoked when a driver
leaves a bridge port which had the TX forwarding offload feature enabled.
- ``port_fast_age``: bridge layer function invoked when flushing the
dynamically learned FDB entries on the port is necessary. This is called when
transitioning from an STP state where learning should take place to an STP
state where it shouldn't, or when leaving a bridge, or when address learning
is turned off via ``port_bridge_flags``.
Bridge VLAN filtering
---------------------
@@ -697,55 +951,44 @@ Bridge VLAN filtering
allowed.
- ``port_vlan_add``: bridge layer function invoked when a VLAN is configured
(tagged or untagged) for the given switch port. If the operation is not
supported by the hardware, this function should return ``-EOPNOTSUPP`` to
inform the bridge code to fallback to a software implementation.
(tagged or untagged) for the given switch port. The CPU port becomes a member
of a VLAN only if a foreign bridge port is also a member of it (and
forwarding needs to take place in software), or the VLAN is installed to the
VLAN group of the bridge device itself, for termination purposes
(``bridge vlan add dev br0 vid 100 self``). VLANs on shared ports are
reference counted and removed when there is no user left. Drivers do not need
to manually install a VLAN on the CPU port.
- ``port_vlan_del``: bridge layer function invoked when a VLAN is removed from the
given switch port
- ``port_vlan_dump``: bridge layer function invoked with a switchdev callback
function that the driver has to call for each VLAN the given port is a member
of. A switchdev object is used to carry the VID and bridge flags.
- ``port_fdb_add``: bridge layer function invoked when the bridge wants to install a
Forwarding Database entry, the switch hardware should be programmed with the
specified address in the specified VLAN Id in the forwarding database
associated with this VLAN ID. If the operation is not supported, this
function should return ``-EOPNOTSUPP`` to inform the bridge code to fallback to
a software implementation.
.. note:: VLAN ID 0 corresponds to the port private database, which, in the context
of DSA, would be its port-based VLAN, used by the associated bridge device.
associated with this VLAN ID.
- ``port_fdb_del``: bridge layer function invoked when the bridge wants to remove a
Forwarding Database entry, the switch hardware should be programmed to delete
the specified MAC address from the specified VLAN ID if it was mapped into
this port forwarding database
- ``port_fdb_dump``: bridge layer function invoked with a switchdev callback
function that the driver has to call for each MAC address known to be behind
the given port. A switchdev object is used to carry the VID and FDB info.
- ``port_fdb_dump``: bridge bypass function invoked by ``ndo_fdb_dump`` on the
physical DSA port interfaces. Since DSA does not attempt to keep in sync its
hardware FDB entries with the software bridge, this method is implemented as
a means to view the entries visible on user ports in the hardware database.
The entries reported by this function have the ``self`` flag in the output of
the ``bridge fdb show`` command.
- ``port_mdb_add``: bridge layer function invoked when the bridge wants to install
a multicast database entry. If the operation is not supported, this function
should return ``-EOPNOTSUPP`` to inform the bridge code to fallback to a
software implementation. The switch hardware should be programmed with the
a multicast database entry. The switch hardware should be programmed with the
specified address in the specified VLAN ID in the forwarding database
associated with this VLAN ID.
.. note:: VLAN ID 0 corresponds to the port private database, which, in the context
of DSA, would be its port-based VLAN, used by the associated bridge device.
- ``port_mdb_del``: bridge layer function invoked when the bridge wants to remove a
multicast database entry, the switch hardware should be programmed to delete
the specified MAC address from the specified VLAN ID if it was mapped into
this port forwarding database.
- ``port_mdb_dump``: bridge layer function invoked with a switchdev callback
function that the driver has to call for each MAC address known to be behind
the given port. A switchdev object is used to carry the VID and MDB info.
Link aggregation
----------------

View File

@@ -1052,11 +1052,7 @@ udp_rmem_min - INTEGER
Default: 4K
udp_wmem_min - INTEGER
Minimal size of send buffer used by UDP sockets in moderation.
Each UDP socket is able to use the size for sending data, even if
total pages of UDP sockets exceed udp_mem pressure. The unit is byte.
Default: 4K
UDP does not have tx memory accounting and this tunable has no effect.
RAW variables
=============
@@ -2870,7 +2866,14 @@ sctp_rmem - vector of 3 INTEGERs: min, default, max
Default: 4K
sctp_wmem - vector of 3 INTEGERs: min, default, max
Currently this tunable has no effect.
Only the first value ("min") is used, "default" and "max" are
ignored.
min: Minimum size of send buffer that can be used by SCTP sockets.
It is guaranteed to each SCTP socket (but not association) even
under moderate memory pressure.
Default: 4K
addr_scope_policy - INTEGER
Control IPv4 address scoping - draft-stewart-tsvwg-sctp-ipv4-00

View File

@@ -5658,7 +5658,7 @@ by a string of size ``name_size``.
#define KVM_STATS_UNIT_SECONDS (0x2 << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_CYCLES (0x3 << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_BOOLEAN (0x4 << KVM_STATS_UNIT_SHIFT)
#define KVM_STATS_UNIT_MAX KVM_STATS_UNIT_CYCLES
#define KVM_STATS_UNIT_MAX KVM_STATS_UNIT_BOOLEAN
#define KVM_STATS_BASE_SHIFT 8
#define KVM_STATS_BASE_MASK (0xF << KVM_STATS_BASE_SHIFT)

View File

@@ -15849,7 +15849,7 @@ PIN CONTROLLER - FREESCALE
M: Dong Aisheng <aisheng.dong@nxp.com>
M: Fabio Estevam <festevam@gmail.com>
M: Shawn Guo <shawnguo@kernel.org>
M: Stefan Agner <stefan@agner.ch>
M: Jacky Bai <ping.bai@nxp.com>
R: Pengutronix Kernel Team <kernel@pengutronix.de>
L: linux-gpio@vger.kernel.org
S: Maintained

View File

@@ -2,7 +2,7 @@
VERSION = 5
PATCHLEVEL = 19
SUBLEVEL = 0
EXTRAVERSION = -rc7
EXTRAVERSION =
NAME = Superb Owl
# *DOCUMENTATION*

View File

@@ -438,6 +438,13 @@ config MMU_GATHER_PAGE_SIZE
config MMU_GATHER_NO_RANGE
bool
select MMU_GATHER_MERGE_VMAS
config MMU_GATHER_NO_FLUSH_CACHE
bool
config MMU_GATHER_MERGE_VMAS
bool
config MMU_GATHER_NO_GATHER
bool

View File

@@ -38,7 +38,7 @@
sys_clk: sys_clk {
compatible = "fixed-clock";
#clock-cells = <0>;
clock-frequency = <162500000>;
clock-frequency = <165625000>;
};
cpu_clk: cpu_clk {

View File

@@ -10,7 +10,7 @@
#else
#define MAX_DMA_ADDRESS ({ \
extern phys_addr_t arm_dma_zone_size; \
arm_dma_zone_size && arm_dma_zone_size < (0x10000000 - PAGE_OFFSET) ? \
arm_dma_zone_size && arm_dma_zone_size < (0x100000000ULL - PAGE_OFFSET) ? \
(PAGE_OFFSET + arm_dma_zone_size) : 0xffffffffUL; })
#endif

View File

@@ -40,8 +40,8 @@ ENDPROC(_find_first_zero_bit_le)
* Prototype: int find_next_zero_bit(void *addr, unsigned int maxbit, int offset)
*/
ENTRY(_find_next_zero_bit_le)
teq r1, #0
beq 3b
cmp r2, r1
bhs 3b
ands ip, r2, #7
beq 1b @ If new byte, goto old routine
ARM( ldrb r3, [r0, r2, lsr #3] )
@@ -81,8 +81,8 @@ ENDPROC(_find_first_bit_le)
* Prototype: int find_next_zero_bit(void *addr, unsigned int maxbit, int offset)
*/
ENTRY(_find_next_bit_le)
teq r1, #0
beq 3b
cmp r2, r1
bhs 3b
ands ip, r2, #7
beq 1b @ If new byte, goto old routine
ARM( ldrb r3, [r0, r2, lsr #3] )
@@ -115,8 +115,8 @@ ENTRY(_find_first_zero_bit_be)
ENDPROC(_find_first_zero_bit_be)
ENTRY(_find_next_zero_bit_be)
teq r1, #0
beq 3b
cmp r2, r1
bhs 3b
ands ip, r2, #7
beq 1b @ If new byte, goto old routine
eor r3, r2, #0x18 @ big endian byte ordering
@@ -149,8 +149,8 @@ ENTRY(_find_first_bit_be)
ENDPROC(_find_first_bit_be)
ENTRY(_find_next_bit_be)
teq r1, #0
beq 3b
cmp r2, r1
bhs 3b
ands ip, r2, #7
beq 1b @ If new byte, goto old routine
eor r3, r2, #0x18 @ big endian byte ordering

View File

@@ -549,7 +549,7 @@ static struct pxa2xx_spi_controller corgi_spi_info = {
};
static struct gpiod_lookup_table corgi_spi_gpio_table = {
.dev_id = "pxa2xx-spi.1",
.dev_id = "spi1",
.table = {
GPIO_LOOKUP_IDX("gpio-pxa", CORGI_GPIO_ADS7846_CS, "cs", 0, GPIO_ACTIVE_LOW),
GPIO_LOOKUP_IDX("gpio-pxa", CORGI_GPIO_LCDCON_CS, "cs", 1, GPIO_ACTIVE_LOW),

View File

@@ -635,7 +635,7 @@ static struct pxa2xx_spi_controller pxa_ssp2_master_info = {
};
static struct gpiod_lookup_table pxa_ssp2_gpio_table = {
.dev_id = "pxa2xx-spi.2",
.dev_id = "spi2",
.table = {
GPIO_LOOKUP_IDX("gpio-pxa", GPIO88_HX4700_TSC2046_CS, "cs", 0, GPIO_ACTIVE_LOW),
{ },

View File

@@ -140,7 +140,7 @@ struct platform_device pxa_spi_ssp4 = {
};
static struct gpiod_lookup_table pxa_ssp3_gpio_table = {
.dev_id = "pxa2xx-spi.3",
.dev_id = "spi3",
.table = {
GPIO_LOOKUP_IDX("gpio-pxa", ICONTROL_MCP251x_nCS1, "cs", 0, GPIO_ACTIVE_LOW),
GPIO_LOOKUP_IDX("gpio-pxa", ICONTROL_MCP251x_nCS2, "cs", 1, GPIO_ACTIVE_LOW),
@@ -149,7 +149,7 @@ static struct gpiod_lookup_table pxa_ssp3_gpio_table = {
};
static struct gpiod_lookup_table pxa_ssp4_gpio_table = {
.dev_id = "pxa2xx-spi.4",
.dev_id = "spi4",
.table = {
GPIO_LOOKUP_IDX("gpio-pxa", ICONTROL_MCP251x_nCS3, "cs", 0, GPIO_ACTIVE_LOW),
GPIO_LOOKUP_IDX("gpio-pxa", ICONTROL_MCP251x_nCS4, "cs", 1, GPIO_ACTIVE_LOW),

View File

@@ -207,7 +207,7 @@ static struct spi_board_info littleton_spi_devices[] __initdata = {
};
static struct gpiod_lookup_table littleton_spi_gpio_table = {
.dev_id = "pxa2xx-spi.2",
.dev_id = "spi2",
.table = {
GPIO_LOOKUP_IDX("gpio-pxa", LITTLETON_GPIO_LCD_CS, "cs", 0, GPIO_ACTIVE_LOW),
{ },

View File

@@ -994,7 +994,7 @@ static struct pxa2xx_spi_controller magician_spi_info = {
};
static struct gpiod_lookup_table magician_spi_gpio_table = {
.dev_id = "pxa2xx-spi.2",
.dev_id = "spi2",
.table = {
/* NOTICE must be GPIO, incompatibility with hw PXA SPI framing */
GPIO_LOOKUP_IDX("gpio-pxa", GPIO14_MAGICIAN_TSC2046_CS, "cs", 0, GPIO_ACTIVE_LOW),

View File

@@ -578,7 +578,7 @@ static struct pxa2xx_spi_controller spitz_spi_info = {
};
static struct gpiod_lookup_table spitz_spi_gpio_table = {
.dev_id = "pxa2xx-spi.2",
.dev_id = "spi2",
.table = {
GPIO_LOOKUP_IDX("gpio-pxa", SPITZ_GPIO_ADS7846_CS, "cs", 0, GPIO_ACTIVE_LOW),
GPIO_LOOKUP_IDX("gpio-pxa", SPITZ_GPIO_LCDCON_CS, "cs", 1, GPIO_ACTIVE_LOW),

View File

@@ -623,7 +623,7 @@ static struct pxa2xx_spi_controller pxa_ssp2_master_info = {
};
static struct gpiod_lookup_table pxa_ssp1_gpio_table = {
.dev_id = "pxa2xx-spi.1",
.dev_id = "spi1",
.table = {
GPIO_LOOKUP_IDX("gpio-pxa", GPIO24_ZIPITZ2_WIFI_CS, "cs", 0, GPIO_ACTIVE_LOW),
{ },
@@ -631,7 +631,7 @@ static struct gpiod_lookup_table pxa_ssp1_gpio_table = {
};
static struct gpiod_lookup_table pxa_ssp2_gpio_table = {
.dev_id = "pxa2xx-spi.2",
.dev_id = "spi2",
.table = {
GPIO_LOOKUP_IDX("gpio-pxa", GPIO88_ZIPITZ2_LCD_CS, "cs", 0, GPIO_ACTIVE_LOW),
{ },

View File

@@ -4,21 +4,6 @@
#define __ASM_CSKY_TLB_H
#include <asm/cacheflush.h>
#define tlb_start_vma(tlb, vma) \
do { \
if (!(tlb)->fullmm) \
flush_cache_range(vma, (vma)->vm_start, (vma)->vm_end); \
} while (0)
#define tlb_end_vma(tlb, vma) \
do { \
if (!(tlb)->fullmm) \
flush_tlb_range(vma, (vma)->vm_start, (vma)->vm_end); \
} while (0)
#define tlb_flush(tlb) flush_tlb_mm((tlb)->mm)
#include <asm-generic/tlb.h>
#endif /* __ASM_CSKY_TLB_H */

View File

@@ -69,7 +69,6 @@ config LOONGARCH
select GENERIC_TIME_VSYSCALL
select GPIOLIB
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_COMPILER_H
select HAVE_ARCH_MMAP_RND_BITS if MMU
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
@@ -108,6 +107,7 @@ config LOONGARCH
select TRACE_IRQFLAGS_SUPPORT
select USE_PERCPU_NUMA_NODE_ID
select ZONE_DMA32
select MMU_GATHER_MERGE_VMAS if MMU
config 32BIT
bool

View File

@@ -274,16 +274,4 @@
nor \dst, \src, zero
.endm
.macro bgt r0 r1 label
blt \r1, \r0, \label
.endm
.macro bltz r0 label
blt \r0, zero, \label
.endm
.macro bgez r0 label
bge \r0, zero, \label
.endm
#endif /* _ASM_ASMMACRO_H */

View File

@@ -10,7 +10,6 @@
#include <linux/types.h>
#include <asm/barrier.h>
#include <asm/cmpxchg.h>
#include <asm/compiler.h>
#if __SIZEOF_LONG__ == 4
#define __LL "ll.w "
@@ -157,27 +156,25 @@ static inline int arch_atomic_sub_if_positive(int i, atomic_t *v)
__asm__ __volatile__(
"1: ll.w %1, %2 # atomic_sub_if_positive\n"
" addi.w %0, %1, %3 \n"
" or %1, %0, $zero \n"
" blt %0, $zero, 2f \n"
" move %1, %0 \n"
" bltz %0, 2f \n"
" sc.w %1, %2 \n"
" beq $zero, %1, 1b \n"
" beqz %1, 1b \n"
"2: \n"
__WEAK_LLSC_MB
: "=&r" (result), "=&r" (temp),
"+" GCC_OFF_SMALL_ASM() (v->counter)
: "=&r" (result), "=&r" (temp), "+ZC" (v->counter)
: "I" (-i));
} else {
__asm__ __volatile__(
"1: ll.w %1, %2 # atomic_sub_if_positive\n"
" sub.w %0, %1, %3 \n"
" or %1, %0, $zero \n"
" blt %0, $zero, 2f \n"
" move %1, %0 \n"
" bltz %0, 2f \n"
" sc.w %1, %2 \n"
" beq $zero, %1, 1b \n"
" beqz %1, 1b \n"
"2: \n"
__WEAK_LLSC_MB
: "=&r" (result), "=&r" (temp),
"+" GCC_OFF_SMALL_ASM() (v->counter)
: "=&r" (result), "=&r" (temp), "+ZC" (v->counter)
: "r" (i));
}
@@ -320,27 +317,25 @@ static inline long arch_atomic64_sub_if_positive(long i, atomic64_t *v)
__asm__ __volatile__(
"1: ll.d %1, %2 # atomic64_sub_if_positive \n"
" addi.d %0, %1, %3 \n"
" or %1, %0, $zero \n"
" blt %0, $zero, 2f \n"
" move %1, %0 \n"
" bltz %0, 2f \n"
" sc.d %1, %2 \n"
" beq %1, $zero, 1b \n"
" beqz %1, 1b \n"
"2: \n"
__WEAK_LLSC_MB
: "=&r" (result), "=&r" (temp),
"+" GCC_OFF_SMALL_ASM() (v->counter)
: "=&r" (result), "=&r" (temp), "+ZC" (v->counter)
: "I" (-i));
} else {
__asm__ __volatile__(
"1: ll.d %1, %2 # atomic64_sub_if_positive \n"
" sub.d %0, %1, %3 \n"
" or %1, %0, $zero \n"
" blt %0, $zero, 2f \n"
" move %1, %0 \n"
" bltz %0, 2f \n"
" sc.d %1, %2 \n"
" beq %1, $zero, 1b \n"
" beqz %1, 1b \n"
"2: \n"
__WEAK_LLSC_MB
: "=&r" (result), "=&r" (temp),
"+" GCC_OFF_SMALL_ASM() (v->counter)
: "=&r" (result), "=&r" (temp), "+ZC" (v->counter)
: "r" (i));
}

View File

@@ -48,9 +48,9 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
__asm__ __volatile__(
"sltu %0, %1, %2\n\t"
#if (__SIZEOF_LONG__ == 4)
"sub.w %0, $r0, %0\n\t"
"sub.w %0, $zero, %0\n\t"
#elif (__SIZEOF_LONG__ == 8)
"sub.d %0, $r0, %0\n\t"
"sub.d %0, $zero, %0\n\t"
#endif
: "=r" (mask)
: "r" (index), "r" (size)

View File

@@ -55,9 +55,9 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
__asm__ __volatile__( \
"1: " ld " %0, %2 # __cmpxchg_asm \n" \
" bne %0, %z3, 2f \n" \
" or $t0, %z4, $zero \n" \
" move $t0, %z4 \n" \
" " st " $t0, %1 \n" \
" beq $zero, $t0, 1b \n" \
" beqz $t0, 1b \n" \
"2: \n" \
__WEAK_LLSC_MB \
: "=&r" (__ret), "=ZB"(*m) \

View File

@@ -1,15 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (C) 2020-2022 Loongson Technology Corporation Limited
*/
#ifndef _ASM_COMPILER_H
#define _ASM_COMPILER_H
#define GCC_OFF_SMALL_ASM() "ZC"
#define LOONGARCH_ISA_LEVEL "loongarch"
#define LOONGARCH_ISA_ARCH_LEVEL "arch=loongarch"
#define LOONGARCH_ISA_LEVEL_RAW loongarch
#define LOONGARCH_ISA_ARCH_LEVEL_RAW LOONGARCH_ISA_LEVEL_RAW
#endif /* _ASM_COMPILER_H */

View File

@@ -288,8 +288,6 @@ struct arch_elf_state {
.interp_fp_abi = LOONGARCH_ABI_FP_ANY, \
}
#define elf_read_implies_exec(ex, exec_stk) (exec_stk == EXSTACK_DEFAULT)
extern int arch_elf_pt_proc(void *ehdr, void *phdr, struct file *elf,
bool is_interp, struct arch_elf_state *state);

View File

@@ -8,7 +8,6 @@
#include <linux/futex.h>
#include <linux/uaccess.h>
#include <asm/barrier.h>
#include <asm/compiler.h>
#include <asm/errno.h>
#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg) \
@@ -17,7 +16,7 @@
"1: ll.w %1, %4 # __futex_atomic_op\n" \
" " insn " \n" \
"2: sc.w $t0, %2 \n" \
" beq $t0, $zero, 1b \n" \
" beqz $t0, 1b \n" \
"3: \n" \
" .section .fixup,\"ax\" \n" \
"4: li.w %0, %6 \n" \
@@ -82,9 +81,9 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval, u32 newv
"# futex_atomic_cmpxchg_inatomic \n"
"1: ll.w %1, %3 \n"
" bne %1, %z4, 3f \n"
" or $t0, %z5, $zero \n"
" move $t0, %z5 \n"
"2: sc.w $t0, %2 \n"
" beq $zero, $t0, 1b \n"
" beqz $t0, 1b \n"
"3: \n"
__WEAK_LLSC_MB
" .section .fixup,\"ax\" \n"
@@ -95,8 +94,8 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval, u32 newv
" "__UA_ADDR "\t1b, 4b \n"
" "__UA_ADDR "\t2b, 4b \n"
" .previous \n"
: "+r" (ret), "=&r" (val), "=" GCC_OFF_SMALL_ASM() (*uaddr)
: GCC_OFF_SMALL_ASM() (*uaddr), "Jr" (oldval), "Jr" (newval),
: "+r" (ret), "=&r" (val), "=ZC" (*uaddr)
: "ZC" (*uaddr), "Jr" (oldval), "Jr" (newval),
"i" (-EFAULT)
: "memory", "t0");

View File

@@ -9,7 +9,6 @@
#include <linux/compiler.h>
#include <linux/stringify.h>
#include <asm/compiler.h>
#include <asm/loongarch.h>
static inline void arch_local_irq_enable(void)

View File

@@ -9,7 +9,6 @@
#include <linux/bitops.h>
#include <linux/atomic.h>
#include <asm/cmpxchg.h>
#include <asm/compiler.h>
typedef struct {
atomic_long_t a;

View File

@@ -39,18 +39,6 @@ extern const struct plat_smp_ops loongson3_smp_ops;
#define MAX_PACKAGES 16
/* Chip Config register of each physical cpu package */
extern u64 loongson_chipcfg[MAX_PACKAGES];
#define LOONGSON_CHIPCFG(id) (*(volatile u32 *)(loongson_chipcfg[id]))
/* Chip Temperature register of each physical cpu package */
extern u64 loongson_chiptemp[MAX_PACKAGES];
#define LOONGSON_CHIPTEMP(id) (*(volatile u32 *)(loongson_chiptemp[id]))
/* Freq Control register of each physical cpu package */
extern u64 loongson_freqctrl[MAX_PACKAGES];
#define LOONGSON_FREQCTRL(id) (*(volatile u32 *)(loongson_freqctrl[id]))
#define xconf_readl(addr) readl(addr)
#define xconf_readq(addr) readq(addr)
@@ -58,7 +46,7 @@ static inline void xconf_writel(u32 val, volatile void __iomem *addr)
{
asm volatile (
" st.w %[v], %[hw], 0 \n"
" ld.b $r0, %[hw], 0 \n"
" ld.b $zero, %[hw], 0 \n"
:
: [hw] "r" (addr), [v] "r" (val)
);
@@ -68,7 +56,7 @@ static inline void xconf_writeq(u64 val64, volatile void __iomem *addr)
{
asm volatile (
" st.d %[v], %[hw], 0 \n"
" ld.b $r0, %[hw], 0 \n"
" ld.b $zero, %[hw], 0 \n"
:
: [hw] "r" (addr), [v] "r" (val64)
);

View File

@@ -23,13 +23,13 @@
static __always_inline void prepare_frametrace(struct pt_regs *regs)
{
__asm__ __volatile__(
/* Save $r1 */
/* Save $ra */
STORE_ONE_REG(1)
/* Use $r1 to save PC */
"pcaddi $r1, 0\n\t"
STR_LONG_S " $r1, %0\n\t"
/* Restore $r1 */
STR_LONG_L " $r1, %1, "STR_LONGSIZE"\n\t"
/* Use $ra to save PC */
"pcaddi $ra, 0\n\t"
STR_LONG_S " $ra, %0\n\t"
/* Restore $ra */
STR_LONG_L " $ra, %1, "STR_LONGSIZE"\n\t"
STORE_ONE_REG(2)
STORE_ONE_REG(3)
STORE_ONE_REG(4)

View File

@@ -44,14 +44,14 @@ struct thread_info {
}
/* How to get the thread information struct from C. */
register struct thread_info *__current_thread_info __asm__("$r2");
register struct thread_info *__current_thread_info __asm__("$tp");
static inline struct thread_info *current_thread_info(void)
{
return __current_thread_info;
}
register unsigned long current_stack_pointer __asm__("$r3");
register unsigned long current_stack_pointer __asm__("$sp");
#endif /* !__ASSEMBLY__ */

View File

@@ -137,16 +137,6 @@ static inline void invtlb_all(u32 op, u32 info, u64 addr)
);
}
/*
* LoongArch doesn't need any special per-pte or per-vma handling, except
* we need to flush cache for area to be unmapped.
*/
#define tlb_start_vma(tlb, vma) \
do { \
if (!(tlb)->fullmm) \
flush_cache_range(vma, vma->vm_start, vma->vm_end); \
} while (0)
#define tlb_end_vma(tlb, vma) do { } while (0)
#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
static void tlb_flush(struct mmu_gather *tlb);

View File

@@ -162,7 +162,7 @@ do { \
"2: \n" \
" .section .fixup,\"ax\" \n" \
"3: li.w %0, %3 \n" \
" or %1, $r0, $r0 \n" \
" move %1, $zero \n" \
" b 2b \n" \
" .previous \n" \
" .section __ex_table,\"a\" \n" \

View File

@@ -4,8 +4,9 @@
*
* Copyright (C) 2020-2022 Loongson Technology Corporation Limited
*/
#include <asm/cpu-info.h>
#include <linux/cacheinfo.h>
#include <asm/bootinfo.h>
#include <asm/cpu-info.h>
/* Populates leaf and increments to next leaf */
#define populate_cache(cache, leaf, c_level, c_type) \
@@ -17,6 +18,8 @@ do { \
leaf->ways_of_associativity = c->cache.ways; \
leaf->size = c->cache.linesz * c->cache.sets * \
c->cache.ways; \
if (leaf->level > 2) \
leaf->size *= nodes_per_package; \
leaf++; \
} while (0)
@@ -95,11 +98,15 @@ static void cache_cpumap_setup(unsigned int cpu)
int populate_cache_leaves(unsigned int cpu)
{
int level = 1;
int level = 1, nodes_per_package = 1;
struct cpuinfo_loongarch *c = &current_cpu_data;
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
struct cacheinfo *this_leaf = this_cpu_ci->info_list;
if (loongson_sysconf.nr_nodes > 1)
nodes_per_package = loongson_sysconf.cores_per_package
/ loongson_sysconf.cores_per_node;
if (c->icache.waysize) {
populate_cache(dcache, this_leaf, level, CACHE_TYPE_DATA);
populate_cache(icache, this_leaf, level++, CACHE_TYPE_INST);

View File

@@ -27,7 +27,7 @@ SYM_FUNC_START(handle_syscall)
addi.d sp, sp, -PT_SIZE
cfi_st t2, PT_R3
cfi_rel_offset sp, PT_R3
cfi_rel_offset sp, PT_R3
st.d zero, sp, PT_R0
csrrd t2, LOONGARCH_CSR_PRMD
st.d t2, sp, PT_PRMD
@@ -50,7 +50,7 @@ SYM_FUNC_START(handle_syscall)
cfi_st a7, PT_R11
csrrd ra, LOONGARCH_CSR_ERA
st.d ra, sp, PT_ERA
cfi_rel_offset ra, PT_ERA
cfi_rel_offset ra, PT_ERA
cfi_st tp, PT_R2
cfi_st u0, PT_R21

View File

@@ -17,21 +17,6 @@ u64 efi_system_table;
struct loongson_system_configuration loongson_sysconf;
EXPORT_SYMBOL(loongson_sysconf);
u64 loongson_chipcfg[MAX_PACKAGES];
u64 loongson_chiptemp[MAX_PACKAGES];
u64 loongson_freqctrl[MAX_PACKAGES];
unsigned long long smp_group[MAX_PACKAGES];
static void __init register_addrs_set(u64 *registers, const u64 addr, int num)
{
u64 i;
for (i = 0; i < num; i++) {
*registers = (i << 44) | addr;
registers++;
}
}
void __init init_environ(void)
{
int efi_boot = fw_arg0;
@@ -50,11 +35,6 @@ void __init init_environ(void)
efi_memmap_init_early(&data);
memblock_reserve(data.phys_map & PAGE_MASK,
PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK)));
register_addrs_set(smp_group, TO_UNCACHE(0x1fe01000), 16);
register_addrs_set(loongson_chipcfg, TO_UNCACHE(0x1fe00180), 16);
register_addrs_set(loongson_chiptemp, TO_UNCACHE(0x1fe0019c), 16);
register_addrs_set(loongson_freqctrl, TO_UNCACHE(0x1fe001d0), 16);
}
static int __init init_cpu_fullname(void)

View File

@@ -27,78 +27,78 @@
.endm
.macro sc_save_fp base
EX fst.d $f0, \base, (0 * FPU_REG_WIDTH)
EX fst.d $f1, \base, (1 * FPU_REG_WIDTH)
EX fst.d $f2, \base, (2 * FPU_REG_WIDTH)
EX fst.d $f3, \base, (3 * FPU_REG_WIDTH)
EX fst.d $f4, \base, (4 * FPU_REG_WIDTH)
EX fst.d $f5, \base, (5 * FPU_REG_WIDTH)
EX fst.d $f6, \base, (6 * FPU_REG_WIDTH)
EX fst.d $f7, \base, (7 * FPU_REG_WIDTH)
EX fst.d $f8, \base, (8 * FPU_REG_WIDTH)
EX fst.d $f9, \base, (9 * FPU_REG_WIDTH)
EX fst.d $f10, \base, (10 * FPU_REG_WIDTH)
EX fst.d $f11, \base, (11 * FPU_REG_WIDTH)
EX fst.d $f12, \base, (12 * FPU_REG_WIDTH)
EX fst.d $f13, \base, (13 * FPU_REG_WIDTH)
EX fst.d $f14, \base, (14 * FPU_REG_WIDTH)
EX fst.d $f15, \base, (15 * FPU_REG_WIDTH)
EX fst.d $f16, \base, (16 * FPU_REG_WIDTH)
EX fst.d $f17, \base, (17 * FPU_REG_WIDTH)
EX fst.d $f18, \base, (18 * FPU_REG_WIDTH)
EX fst.d $f19, \base, (19 * FPU_REG_WIDTH)
EX fst.d $f20, \base, (20 * FPU_REG_WIDTH)
EX fst.d $f21, \base, (21 * FPU_REG_WIDTH)
EX fst.d $f22, \base, (22 * FPU_REG_WIDTH)
EX fst.d $f23, \base, (23 * FPU_REG_WIDTH)
EX fst.d $f24, \base, (24 * FPU_REG_WIDTH)
EX fst.d $f25, \base, (25 * FPU_REG_WIDTH)
EX fst.d $f26, \base, (26 * FPU_REG_WIDTH)
EX fst.d $f27, \base, (27 * FPU_REG_WIDTH)
EX fst.d $f28, \base, (28 * FPU_REG_WIDTH)
EX fst.d $f29, \base, (29 * FPU_REG_WIDTH)
EX fst.d $f30, \base, (30 * FPU_REG_WIDTH)
EX fst.d $f31, \base, (31 * FPU_REG_WIDTH)
EX fst.d $f0, \base, (0 * FPU_REG_WIDTH)
EX fst.d $f1, \base, (1 * FPU_REG_WIDTH)
EX fst.d $f2, \base, (2 * FPU_REG_WIDTH)
EX fst.d $f3, \base, (3 * FPU_REG_WIDTH)
EX fst.d $f4, \base, (4 * FPU_REG_WIDTH)
EX fst.d $f5, \base, (5 * FPU_REG_WIDTH)
EX fst.d $f6, \base, (6 * FPU_REG_WIDTH)
EX fst.d $f7, \base, (7 * FPU_REG_WIDTH)
EX fst.d $f8, \base, (8 * FPU_REG_WIDTH)
EX fst.d $f9, \base, (9 * FPU_REG_WIDTH)
EX fst.d $f10, \base, (10 * FPU_REG_WIDTH)
EX fst.d $f11, \base, (11 * FPU_REG_WIDTH)
EX fst.d $f12, \base, (12 * FPU_REG_WIDTH)
EX fst.d $f13, \base, (13 * FPU_REG_WIDTH)
EX fst.d $f14, \base, (14 * FPU_REG_WIDTH)
EX fst.d $f15, \base, (15 * FPU_REG_WIDTH)
EX fst.d $f16, \base, (16 * FPU_REG_WIDTH)
EX fst.d $f17, \base, (17 * FPU_REG_WIDTH)
EX fst.d $f18, \base, (18 * FPU_REG_WIDTH)
EX fst.d $f19, \base, (19 * FPU_REG_WIDTH)
EX fst.d $f20, \base, (20 * FPU_REG_WIDTH)
EX fst.d $f21, \base, (21 * FPU_REG_WIDTH)
EX fst.d $f22, \base, (22 * FPU_REG_WIDTH)
EX fst.d $f23, \base, (23 * FPU_REG_WIDTH)
EX fst.d $f24, \base, (24 * FPU_REG_WIDTH)
EX fst.d $f25, \base, (25 * FPU_REG_WIDTH)
EX fst.d $f26, \base, (26 * FPU_REG_WIDTH)
EX fst.d $f27, \base, (27 * FPU_REG_WIDTH)
EX fst.d $f28, \base, (28 * FPU_REG_WIDTH)
EX fst.d $f29, \base, (29 * FPU_REG_WIDTH)
EX fst.d $f30, \base, (30 * FPU_REG_WIDTH)
EX fst.d $f31, \base, (31 * FPU_REG_WIDTH)
.endm
.macro sc_restore_fp base
EX fld.d $f0, \base, (0 * FPU_REG_WIDTH)
EX fld.d $f1, \base, (1 * FPU_REG_WIDTH)
EX fld.d $f2, \base, (2 * FPU_REG_WIDTH)
EX fld.d $f3, \base, (3 * FPU_REG_WIDTH)
EX fld.d $f4, \base, (4 * FPU_REG_WIDTH)
EX fld.d $f5, \base, (5 * FPU_REG_WIDTH)
EX fld.d $f6, \base, (6 * FPU_REG_WIDTH)
EX fld.d $f7, \base, (7 * FPU_REG_WIDTH)
EX fld.d $f8, \base, (8 * FPU_REG_WIDTH)
EX fld.d $f9, \base, (9 * FPU_REG_WIDTH)
EX fld.d $f10, \base, (10 * FPU_REG_WIDTH)
EX fld.d $f11, \base, (11 * FPU_REG_WIDTH)
EX fld.d $f12, \base, (12 * FPU_REG_WIDTH)
EX fld.d $f13, \base, (13 * FPU_REG_WIDTH)
EX fld.d $f14, \base, (14 * FPU_REG_WIDTH)
EX fld.d $f15, \base, (15 * FPU_REG_WIDTH)
EX fld.d $f16, \base, (16 * FPU_REG_WIDTH)
EX fld.d $f17, \base, (17 * FPU_REG_WIDTH)
EX fld.d $f18, \base, (18 * FPU_REG_WIDTH)
EX fld.d $f19, \base, (19 * FPU_REG_WIDTH)
EX fld.d $f20, \base, (20 * FPU_REG_WIDTH)
EX fld.d $f21, \base, (21 * FPU_REG_WIDTH)
EX fld.d $f22, \base, (22 * FPU_REG_WIDTH)
EX fld.d $f23, \base, (23 * FPU_REG_WIDTH)
EX fld.d $f24, \base, (24 * FPU_REG_WIDTH)
EX fld.d $f25, \base, (25 * FPU_REG_WIDTH)
EX fld.d $f26, \base, (26 * FPU_REG_WIDTH)
EX fld.d $f27, \base, (27 * FPU_REG_WIDTH)
EX fld.d $f28, \base, (28 * FPU_REG_WIDTH)
EX fld.d $f29, \base, (29 * FPU_REG_WIDTH)
EX fld.d $f30, \base, (30 * FPU_REG_WIDTH)
EX fld.d $f31, \base, (31 * FPU_REG_WIDTH)
EX fld.d $f0, \base, (0 * FPU_REG_WIDTH)
EX fld.d $f1, \base, (1 * FPU_REG_WIDTH)
EX fld.d $f2, \base, (2 * FPU_REG_WIDTH)
EX fld.d $f3, \base, (3 * FPU_REG_WIDTH)
EX fld.d $f4, \base, (4 * FPU_REG_WIDTH)
EX fld.d $f5, \base, (5 * FPU_REG_WIDTH)
EX fld.d $f6, \base, (6 * FPU_REG_WIDTH)
EX fld.d $f7, \base, (7 * FPU_REG_WIDTH)
EX fld.d $f8, \base, (8 * FPU_REG_WIDTH)
EX fld.d $f9, \base, (9 * FPU_REG_WIDTH)
EX fld.d $f10, \base, (10 * FPU_REG_WIDTH)
EX fld.d $f11, \base, (11 * FPU_REG_WIDTH)
EX fld.d $f12, \base, (12 * FPU_REG_WIDTH)
EX fld.d $f13, \base, (13 * FPU_REG_WIDTH)
EX fld.d $f14, \base, (14 * FPU_REG_WIDTH)
EX fld.d $f15, \base, (15 * FPU_REG_WIDTH)
EX fld.d $f16, \base, (16 * FPU_REG_WIDTH)
EX fld.d $f17, \base, (17 * FPU_REG_WIDTH)
EX fld.d $f18, \base, (18 * FPU_REG_WIDTH)
EX fld.d $f19, \base, (19 * FPU_REG_WIDTH)
EX fld.d $f20, \base, (20 * FPU_REG_WIDTH)
EX fld.d $f21, \base, (21 * FPU_REG_WIDTH)
EX fld.d $f22, \base, (22 * FPU_REG_WIDTH)
EX fld.d $f23, \base, (23 * FPU_REG_WIDTH)
EX fld.d $f24, \base, (24 * FPU_REG_WIDTH)
EX fld.d $f25, \base, (25 * FPU_REG_WIDTH)
EX fld.d $f26, \base, (26 * FPU_REG_WIDTH)
EX fld.d $f27, \base, (27 * FPU_REG_WIDTH)
EX fld.d $f28, \base, (28 * FPU_REG_WIDTH)
EX fld.d $f29, \base, (29 * FPU_REG_WIDTH)
EX fld.d $f30, \base, (30 * FPU_REG_WIDTH)
EX fld.d $f31, \base, (31 * FPU_REG_WIDTH)
.endm
.macro sc_save_fcc base, tmp0, tmp1
movcf2gr \tmp0, $fcc0
move \tmp1, \tmp0
move \tmp1, \tmp0
movcf2gr \tmp0, $fcc1
bstrins.d \tmp1, \tmp0, 15, 8
movcf2gr \tmp0, $fcc2
@@ -113,11 +113,11 @@
bstrins.d \tmp1, \tmp0, 55, 48
movcf2gr \tmp0, $fcc7
bstrins.d \tmp1, \tmp0, 63, 56
EX st.d \tmp1, \base, 0
EX st.d \tmp1, \base, 0
.endm
.macro sc_restore_fcc base, tmp0, tmp1
EX ld.d \tmp0, \base, 0
EX ld.d \tmp0, \base, 0
bstrpick.d \tmp1, \tmp0, 7, 0
movgr2cf $fcc0, \tmp1
bstrpick.d \tmp1, \tmp0, 15, 8
@@ -138,11 +138,11 @@
.macro sc_save_fcsr base, tmp0
movfcsr2gr \tmp0, fcsr0
EX st.w \tmp0, \base, 0
EX st.w \tmp0, \base, 0
.endm
.macro sc_restore_fcsr base, tmp0
EX ld.w \tmp0, \base, 0
EX ld.w \tmp0, \base, 0
movgr2fcsr fcsr0, \tmp0
.endm
@@ -151,9 +151,9 @@
*/
SYM_FUNC_START(_save_fp)
fpu_save_csr a0 t1
fpu_save_double a0 t1 # clobbers t1
fpu_save_double a0 t1 # clobbers t1
fpu_save_cc a0 t1 t2 # clobbers t1, t2
jirl zero, ra, 0
jr ra
SYM_FUNC_END(_save_fp)
EXPORT_SYMBOL(_save_fp)
@@ -161,10 +161,10 @@ EXPORT_SYMBOL(_save_fp)
* Restore a thread's fp context.
*/
SYM_FUNC_START(_restore_fp)
fpu_restore_double a0 t1 # clobbers t1
fpu_restore_csr a0 t1
fpu_restore_cc a0 t1 t2 # clobbers t1, t2
jirl zero, ra, 0
fpu_restore_double a0 t1 # clobbers t1
fpu_restore_csr a0 t1
fpu_restore_cc a0 t1 t2 # clobbers t1, t2
jr ra
SYM_FUNC_END(_restore_fp)
/*
@@ -216,7 +216,7 @@ SYM_FUNC_START(_init_fpu)
movgr2fr.d $f30, t1
movgr2fr.d $f31, t1
jirl zero, ra, 0
jr ra
SYM_FUNC_END(_init_fpu)
/*
@@ -225,11 +225,11 @@ SYM_FUNC_END(_init_fpu)
* a2: fcsr
*/
SYM_FUNC_START(_save_fp_context)
sc_save_fcc a1 t1 t2
sc_save_fcsr a2 t1
sc_save_fp a0
li.w a0, 0 # success
jirl zero, ra, 0
sc_save_fcc a1 t1 t2
sc_save_fcsr a2 t1
sc_save_fp a0
li.w a0, 0 # success
jr ra
SYM_FUNC_END(_save_fp_context)
/*
@@ -238,14 +238,14 @@ SYM_FUNC_END(_save_fp_context)
* a2: fcsr
*/
SYM_FUNC_START(_restore_fp_context)
sc_restore_fp a0
sc_restore_fcc a1 t1 t2
sc_restore_fcsr a2 t1
li.w a0, 0 # success
jirl zero, ra, 0
sc_restore_fp a0
sc_restore_fcc a1 t1 t2
sc_restore_fcsr a2 t1
li.w a0, 0 # success
jr ra
SYM_FUNC_END(_restore_fp_context)
SYM_FUNC_START(fault)
li.w a0, -EFAULT # failure
jirl zero, ra, 0
jr ra
SYM_FUNC_END(fault)

View File

@@ -28,23 +28,23 @@ SYM_FUNC_START(__arch_cpu_idle)
nop
idle 0
/* end of rollback region */
1: jirl zero, ra, 0
1: jr ra
SYM_FUNC_END(__arch_cpu_idle)
SYM_FUNC_START(handle_vint)
BACKUP_T0T1
SAVE_ALL
la.abs t1, __arch_cpu_idle
LONG_L t0, sp, PT_ERA
LONG_L t0, sp, PT_ERA
/* 32 byte rollback region */
ori t0, t0, 0x1f
xori t0, t0, 0x1f
bne t0, t1, 1f
LONG_S t0, sp, PT_ERA
LONG_S t0, sp, PT_ERA
1: move a0, sp
move a1, sp
la.abs t0, do_vint
jirl ra, t0, 0
jirl ra, t0, 0
RESTORE_ALL_AND_RET
SYM_FUNC_END(handle_vint)
@@ -72,7 +72,7 @@ SYM_FUNC_END(except_vec_cex)
build_prep_\prep
move a0, sp
la.abs t0, do_\handler
jirl ra, t0, 0
jirl ra, t0, 0
RESTORE_ALL_AND_RET
SYM_FUNC_END(handle_\exception)
.endm
@@ -91,5 +91,5 @@ SYM_FUNC_END(except_vec_cex)
SYM_FUNC_START(handle_sys)
la.abs t0, handle_syscall
jirl zero, t0, 0
jr t0
SYM_FUNC_END(handle_sys)

View File

@@ -32,7 +32,7 @@ SYM_CODE_START(kernel_entry) # kernel entry point
/* We might not get launched at the address the kernel is linked to,
so we jump there. */
la.abs t0, 0f
jirl zero, t0, 0
jr t0
0:
la t0, __bss_start # clear .bss
st.d zero, t0, 0
@@ -50,7 +50,7 @@ SYM_CODE_START(kernel_entry) # kernel entry point
/* KSave3 used for percpu base, initialized as 0 */
csrwr zero, PERCPU_BASE_KS
/* GPR21 used for percpu base (runtime), initialized as 0 */
or u0, zero, zero
move u0, zero
la tp, init_thread_union
/* Set the SP after an empty pt_regs. */
@@ -85,8 +85,8 @@ SYM_CODE_START(smpboot_entry)
ld.d sp, t0, CPU_BOOT_STACK
ld.d tp, t0, CPU_BOOT_TINFO
la.abs t0, 0f
jirl zero, t0, 0
la.abs t0, 0f
jr t0
0:
bl start_secondary
SYM_CODE_END(smpboot_entry)

View File

@@ -193,7 +193,7 @@ static int fpr_set(struct task_struct *target,
const void *kbuf, const void __user *ubuf)
{
const int fcc_start = NUM_FPU_REGS * sizeof(elf_fpreg_t);
const int fcc_end = fcc_start + sizeof(u64);
const int fcsr_start = fcc_start + sizeof(u64);
int err;
BUG_ON(count % sizeof(elf_fpreg_t));
@@ -209,10 +209,12 @@ static int fpr_set(struct task_struct *target,
if (err)
return err;
if (count > 0)
err |= user_regset_copyin(&pos, &count, &kbuf, &ubuf,
&target->thread.fpu.fcc,
fcc_start, fcc_end);
err |= user_regset_copyin(&pos, &count, &kbuf, &ubuf,
&target->thread.fpu.fcc, fcc_start,
fcc_start + sizeof(u64));
err |= user_regset_copyin(&pos, &count, &kbuf, &ubuf,
&target->thread.fpu.fcsr, fcsr_start,
fcsr_start + sizeof(u32));
return err;
}

View File

@@ -13,7 +13,6 @@
#include <linux/console.h>
#include <acpi/reboot.h>
#include <asm/compiler.h>
#include <asm/idle.h>
#include <asm/loongarch.h>
#include <asm/reboot.h>

View File

@@ -126,7 +126,7 @@ static void __init parse_bios_table(const struct dmi_header *dm)
char *dmi_data = (char *)dm;
bios_extern = *(dmi_data + SMBIOS_BIOSEXTERN_OFFSET);
b_info.bios_size = *(dmi_data + SMBIOS_BIOSSIZE_OFFSET);
b_info.bios_size = (*(dmi_data + SMBIOS_BIOSSIZE_OFFSET) + 1) << 6;
if (bios_extern & LOONGSON_EFI_ENABLE)
set_bit(EFI_BOOT, &efi.flags);

View File

@@ -278,116 +278,29 @@ void loongson3_cpu_die(unsigned int cpu)
mb();
}
/*
* The target CPU should go to XKPRANGE (uncached area) and flush
* ICache/DCache/VCache before the control CPU can safely disable its clock.
*/
static void loongson3_play_dead(int *state_addr)
void play_dead(void)
{
register int val;
register void *addr;
register uint64_t addr;
register void (*init_fn)(void);
__asm__ __volatile__(
" li.d %[addr], 0x8000000000000000\n"
"1: cacop 0x8, %[addr], 0 \n" /* flush ICache */
" cacop 0x8, %[addr], 1 \n"
" cacop 0x8, %[addr], 2 \n"
" cacop 0x8, %[addr], 3 \n"
" cacop 0x9, %[addr], 0 \n" /* flush DCache */
" cacop 0x9, %[addr], 1 \n"
" cacop 0x9, %[addr], 2 \n"
" cacop 0x9, %[addr], 3 \n"
" addi.w %[sets], %[sets], -1 \n"
" addi.d %[addr], %[addr], 0x40 \n"
" bnez %[sets], 1b \n"
" li.d %[addr], 0x8000000000000000\n"
"2: cacop 0xa, %[addr], 0 \n" /* flush VCache */
" cacop 0xa, %[addr], 1 \n"
" cacop 0xa, %[addr], 2 \n"
" cacop 0xa, %[addr], 3 \n"
" cacop 0xa, %[addr], 4 \n"
" cacop 0xa, %[addr], 5 \n"
" cacop 0xa, %[addr], 6 \n"
" cacop 0xa, %[addr], 7 \n"
" cacop 0xa, %[addr], 8 \n"
" cacop 0xa, %[addr], 9 \n"
" cacop 0xa, %[addr], 10 \n"
" cacop 0xa, %[addr], 11 \n"
" cacop 0xa, %[addr], 12 \n"
" cacop 0xa, %[addr], 13 \n"
" cacop 0xa, %[addr], 14 \n"
" cacop 0xa, %[addr], 15 \n"
" addi.w %[vsets], %[vsets], -1 \n"
" addi.d %[addr], %[addr], 0x40 \n"
" bnez %[vsets], 2b \n"
" li.w %[val], 0x7 \n" /* *state_addr = CPU_DEAD; */
" st.w %[val], %[state_addr], 0 \n"
" dbar 0 \n"
" cacop 0x11, %[state_addr], 0 \n" /* flush entry of *state_addr */
: [addr] "=&r" (addr), [val] "=&r" (val)
: [state_addr] "r" (state_addr),
[sets] "r" (cpu_data[smp_processor_id()].dcache.sets),
[vsets] "r" (cpu_data[smp_processor_id()].vcache.sets));
idle_task_exit();
local_irq_enable();
change_csr_ecfg(ECFG0_IM, ECFGF_IPI);
set_csr_ecfg(ECFGF_IPI);
__this_cpu_write(cpu_state, CPU_DEAD);
__asm__ __volatile__(
" idle 0 \n"
" li.w $t0, 0x1020 \n"
" iocsrrd.d %[init_fn], $t0 \n" /* Get init PC */
: [init_fn] "=&r" (addr)
: /* No Input */
: "a0");
init_fn = __va(addr);
__smp_mb();
do {
__asm__ __volatile__("idle 0\n\t");
addr = iocsr_read64(LOONGARCH_IOCSR_MBUF0);
} while (addr == 0);
init_fn = (void *)TO_CACHE(addr);
iocsr_write32(0xffffffff, LOONGARCH_IOCSR_IPI_CLEAR);
init_fn();
unreachable();
}
void play_dead(void)
{
int *state_addr;
unsigned int cpu = smp_processor_id();
void (*play_dead_uncached)(int *s);
idle_task_exit();
play_dead_uncached = (void *)TO_UNCACHE(__pa((unsigned long)loongson3_play_dead));
state_addr = &per_cpu(cpu_state, cpu);
mb();
play_dead_uncached(state_addr);
}
static int loongson3_enable_clock(unsigned int cpu)
{
uint64_t core_id = cpu_data[cpu].core;
uint64_t package_id = cpu_data[cpu].package;
LOONGSON_FREQCTRL(package_id) |= 1 << (core_id * 4 + 3);
return 0;
}
static int loongson3_disable_clock(unsigned int cpu)
{
uint64_t core_id = cpu_data[cpu].core;
uint64_t package_id = cpu_data[cpu].package;
LOONGSON_FREQCTRL(package_id) &= ~(1 << (core_id * 4 + 3));
return 0;
}
static int register_loongson3_notifier(void)
{
return cpuhp_setup_state_nocalls(CPUHP_LOONGARCH_SOC_PREPARE,
"loongarch/loongson:prepare",
loongson3_enable_clock,
loongson3_disable_clock);
}
early_initcall(register_loongson3_notifier);
#endif
/*

View File

@@ -24,8 +24,8 @@ SYM_FUNC_START(__switch_to)
move tp, a2
cpu_restore_nonscratch a1
li.w t0, _THREAD_SIZE - 32
PTR_ADD t0, t0, tp
li.w t0, _THREAD_SIZE - 32
PTR_ADD t0, t0, tp
set_saved_sp t0, t1, t2
ldptr.d t1, a1, THREAD_CSRPRMD

View File

@@ -32,7 +32,7 @@ SYM_FUNC_START(__clear_user)
1: st.b zero, a0, 0
addi.d a0, a0, 1
addi.d a1, a1, -1
bgt a1, zero, 1b
bgtz a1, 1b
2: move a0, a1
jr ra

View File

@@ -35,7 +35,7 @@ SYM_FUNC_START(__copy_user)
addi.d a0, a0, 1
addi.d a1, a1, 1
addi.d a2, a2, -1
bgt a2, zero, 1b
bgtz a2, 1b
3: move a0, a2
jr ra

View File

@@ -7,7 +7,6 @@
#include <linux/smp.h>
#include <linux/timex.h>
#include <asm/compiler.h>
#include <asm/processor.h>
void __delay(unsigned long cycles)

View File

@@ -10,75 +10,75 @@
.align 5
SYM_FUNC_START(clear_page)
lu12i.w t0, 1 << (PAGE_SHIFT - 12)
add.d t0, t0, a0
lu12i.w t0, 1 << (PAGE_SHIFT - 12)
add.d t0, t0, a0
1:
st.d zero, a0, 0
st.d zero, a0, 8
st.d zero, a0, 16
st.d zero, a0, 24
st.d zero, a0, 32
st.d zero, a0, 40
st.d zero, a0, 48
st.d zero, a0, 56
addi.d a0, a0, 128
st.d zero, a0, -64
st.d zero, a0, -56
st.d zero, a0, -48
st.d zero, a0, -40
st.d zero, a0, -32
st.d zero, a0, -24
st.d zero, a0, -16
st.d zero, a0, -8
bne t0, a0, 1b
st.d zero, a0, 0
st.d zero, a0, 8
st.d zero, a0, 16
st.d zero, a0, 24
st.d zero, a0, 32
st.d zero, a0, 40
st.d zero, a0, 48
st.d zero, a0, 56
addi.d a0, a0, 128
st.d zero, a0, -64
st.d zero, a0, -56
st.d zero, a0, -48
st.d zero, a0, -40
st.d zero, a0, -32
st.d zero, a0, -24
st.d zero, a0, -16
st.d zero, a0, -8
bne t0, a0, 1b
jirl $r0, ra, 0
jr ra
SYM_FUNC_END(clear_page)
EXPORT_SYMBOL(clear_page)
.align 5
SYM_FUNC_START(copy_page)
lu12i.w t8, 1 << (PAGE_SHIFT - 12)
add.d t8, t8, a0
lu12i.w t8, 1 << (PAGE_SHIFT - 12)
add.d t8, t8, a0
1:
ld.d t0, a1, 0
ld.d t1, a1, 8
ld.d t2, a1, 16
ld.d t3, a1, 24
ld.d t4, a1, 32
ld.d t5, a1, 40
ld.d t6, a1, 48
ld.d t7, a1, 56
ld.d t0, a1, 0
ld.d t1, a1, 8
ld.d t2, a1, 16
ld.d t3, a1, 24
ld.d t4, a1, 32
ld.d t5, a1, 40
ld.d t6, a1, 48
ld.d t7, a1, 56
st.d t0, a0, 0
st.d t1, a0, 8
ld.d t0, a1, 64
ld.d t1, a1, 72
st.d t2, a0, 16
st.d t3, a0, 24
ld.d t2, a1, 80
ld.d t3, a1, 88
st.d t4, a0, 32
st.d t5, a0, 40
ld.d t4, a1, 96
ld.d t5, a1, 104
st.d t6, a0, 48
st.d t7, a0, 56
ld.d t6, a1, 112
ld.d t7, a1, 120
addi.d a0, a0, 128
addi.d a1, a1, 128
st.d t0, a0, 0
st.d t1, a0, 8
ld.d t0, a1, 64
ld.d t1, a1, 72
st.d t2, a0, 16
st.d t3, a0, 24
ld.d t2, a1, 80
ld.d t3, a1, 88
st.d t4, a0, 32
st.d t5, a0, 40
ld.d t4, a1, 96
ld.d t5, a1, 104
st.d t6, a0, 48
st.d t7, a0, 56
ld.d t6, a1, 112
ld.d t7, a1, 120
addi.d a0, a0, 128
addi.d a1, a1, 128
st.d t0, a0, -64
st.d t1, a0, -56
st.d t2, a0, -48
st.d t3, a0, -40
st.d t4, a0, -32
st.d t5, a0, -24
st.d t6, a0, -16
st.d t7, a0, -8
st.d t0, a0, -64
st.d t1, a0, -56
st.d t2, a0, -48
st.d t3, a0, -40
st.d t4, a0, -32
st.d t5, a0, -24
st.d t6, a0, -16
st.d t7, a0, -8
bne t8, a0, 1b
jirl $r0, ra, 0
bne t8, a0, 1b
jr ra
SYM_FUNC_END(copy_page)
EXPORT_SYMBOL(copy_page)

View File

@@ -18,7 +18,7 @@
REG_S a2, sp, PT_BVADDR
li.w a1, \write
la.abs t0, do_page_fault
jirl ra, t0, 0
jirl ra, t0, 0
RESTORE_ALL_AND_RET
SYM_FUNC_END(tlb_do_page_fault_\write)
.endm
@@ -34,7 +34,7 @@ SYM_FUNC_START(handle_tlb_protect)
csrrd a2, LOONGARCH_CSR_BADV
REG_S a2, sp, PT_BVADDR
la.abs t0, do_page_fault
jirl ra, t0, 0
jirl ra, t0, 0
RESTORE_ALL_AND_RET
SYM_FUNC_END(handle_tlb_protect)
@@ -47,7 +47,7 @@ SYM_FUNC_START(handle_tlb_load)
* The vmalloc handling is not in the hotpath.
*/
csrrd t0, LOONGARCH_CSR_BADV
blt t0, $r0, vmalloc_load
bltz t0, vmalloc_load
csrrd t1, LOONGARCH_CSR_PGDL
vmalloc_done_load:
@@ -80,7 +80,7 @@ vmalloc_done_load:
* see if we need to jump to huge tlb processing.
*/
andi t0, ra, _PAGE_HUGE
bne t0, $r0, tlb_huge_update_load
bnez t0, tlb_huge_update_load
csrrd t0, LOONGARCH_CSR_BADV
srli.d t0, t0, (PAGE_SHIFT + PTE_ORDER)
@@ -100,12 +100,12 @@ smp_pgtable_change_load:
srli.d ra, t0, _PAGE_PRESENT_SHIFT
andi ra, ra, 1
beq ra, $r0, nopage_tlb_load
beqz ra, nopage_tlb_load
ori t0, t0, _PAGE_VALID
#ifdef CONFIG_SMP
sc.d t0, t1, 0
beq t0, $r0, smp_pgtable_change_load
beqz t0, smp_pgtable_change_load
#else
st.d t0, t1, 0
#endif
@@ -139,23 +139,23 @@ tlb_huge_update_load:
#endif
srli.d ra, t0, _PAGE_PRESENT_SHIFT
andi ra, ra, 1
beq ra, $r0, nopage_tlb_load
beqz ra, nopage_tlb_load
tlbsrch
ori t0, t0, _PAGE_VALID
#ifdef CONFIG_SMP
sc.d t0, t1, 0
beq t0, $r0, tlb_huge_update_load
beqz t0, tlb_huge_update_load
ld.d t0, t1, 0
#else
st.d t0, t1, 0
#endif
addu16i.d t1, $r0, -(CSR_TLBIDX_EHINV >> 16)
addi.d ra, t1, 0
csrxchg ra, t1, LOONGARCH_CSR_TLBIDX
addu16i.d t1, zero, -(CSR_TLBIDX_EHINV >> 16)
addi.d ra, t1, 0
csrxchg ra, t1, LOONGARCH_CSR_TLBIDX
tlbwr
csrxchg $r0, t1, LOONGARCH_CSR_TLBIDX
csrxchg zero, t1, LOONGARCH_CSR_TLBIDX
/*
* A huge PTE describes an area the size of the
@@ -178,27 +178,27 @@ tlb_huge_update_load:
addi.d t0, ra, 0
/* Convert to entrylo1 */
addi.d t1, $r0, 1
addi.d t1, zero, 1
slli.d t1, t1, (HPAGE_SHIFT - 1)
add.d t0, t0, t1
csrwr t0, LOONGARCH_CSR_TLBELO1
/* Set huge page tlb entry size */
addu16i.d t0, $r0, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, $r0, (PS_HUGE_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
addu16i.d t0, zero, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, zero, (PS_HUGE_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
csrxchg t1, t0, LOONGARCH_CSR_TLBIDX
tlbfill
addu16i.d t0, $r0, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, $r0, (PS_DEFAULT_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
addu16i.d t0, zero, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, zero, (PS_DEFAULT_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
csrxchg t1, t0, LOONGARCH_CSR_TLBIDX
nopage_tlb_load:
dbar 0
csrrd ra, EXCEPTION_KS2
la.abs t0, tlb_do_page_fault_0
jirl $r0, t0, 0
jr t0
SYM_FUNC_END(handle_tlb_load)
SYM_FUNC_START(handle_tlb_store)
@@ -210,7 +210,7 @@ SYM_FUNC_START(handle_tlb_store)
* The vmalloc handling is not in the hotpath.
*/
csrrd t0, LOONGARCH_CSR_BADV
blt t0, $r0, vmalloc_store
bltz t0, vmalloc_store
csrrd t1, LOONGARCH_CSR_PGDL
vmalloc_done_store:
@@ -244,7 +244,7 @@ vmalloc_done_store:
* see if we need to jump to huge tlb processing.
*/
andi t0, ra, _PAGE_HUGE
bne t0, $r0, tlb_huge_update_store
bnez t0, tlb_huge_update_store
csrrd t0, LOONGARCH_CSR_BADV
srli.d t0, t0, (PAGE_SHIFT + PTE_ORDER)
@@ -265,12 +265,12 @@ smp_pgtable_change_store:
srli.d ra, t0, _PAGE_PRESENT_SHIFT
andi ra, ra, ((_PAGE_PRESENT | _PAGE_WRITE) >> _PAGE_PRESENT_SHIFT)
xori ra, ra, ((_PAGE_PRESENT | _PAGE_WRITE) >> _PAGE_PRESENT_SHIFT)
bne ra, $r0, nopage_tlb_store
bnez ra, nopage_tlb_store
ori t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
#ifdef CONFIG_SMP
sc.d t0, t1, 0
beq t0, $r0, smp_pgtable_change_store
beqz t0, smp_pgtable_change_store
#else
st.d t0, t1, 0
#endif
@@ -306,24 +306,24 @@ tlb_huge_update_store:
srli.d ra, t0, _PAGE_PRESENT_SHIFT
andi ra, ra, ((_PAGE_PRESENT | _PAGE_WRITE) >> _PAGE_PRESENT_SHIFT)
xori ra, ra, ((_PAGE_PRESENT | _PAGE_WRITE) >> _PAGE_PRESENT_SHIFT)
bne ra, $r0, nopage_tlb_store
bnez ra, nopage_tlb_store
tlbsrch
ori t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
#ifdef CONFIG_SMP
sc.d t0, t1, 0
beq t0, $r0, tlb_huge_update_store
beqz t0, tlb_huge_update_store
ld.d t0, t1, 0
#else
st.d t0, t1, 0
#endif
addu16i.d t1, $r0, -(CSR_TLBIDX_EHINV >> 16)
addi.d ra, t1, 0
csrxchg ra, t1, LOONGARCH_CSR_TLBIDX
addu16i.d t1, zero, -(CSR_TLBIDX_EHINV >> 16)
addi.d ra, t1, 0
csrxchg ra, t1, LOONGARCH_CSR_TLBIDX
tlbwr
csrxchg $r0, t1, LOONGARCH_CSR_TLBIDX
csrxchg zero, t1, LOONGARCH_CSR_TLBIDX
/*
* A huge PTE describes an area the size of the
* configured huge page size. This is twice the
@@ -345,28 +345,28 @@ tlb_huge_update_store:
addi.d t0, ra, 0
/* Convert to entrylo1 */
addi.d t1, $r0, 1
addi.d t1, zero, 1
slli.d t1, t1, (HPAGE_SHIFT - 1)
add.d t0, t0, t1
csrwr t0, LOONGARCH_CSR_TLBELO1
/* Set huge page tlb entry size */
addu16i.d t0, $r0, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, $r0, (PS_HUGE_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
addu16i.d t0, zero, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, zero, (PS_HUGE_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
csrxchg t1, t0, LOONGARCH_CSR_TLBIDX
tlbfill
/* Reset default page size */
addu16i.d t0, $r0, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, $r0, (PS_DEFAULT_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
addu16i.d t0, zero, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, zero, (PS_DEFAULT_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
csrxchg t1, t0, LOONGARCH_CSR_TLBIDX
nopage_tlb_store:
dbar 0
csrrd ra, EXCEPTION_KS2
la.abs t0, tlb_do_page_fault_1
jirl $r0, t0, 0
jr t0
SYM_FUNC_END(handle_tlb_store)
SYM_FUNC_START(handle_tlb_modify)
@@ -378,7 +378,7 @@ SYM_FUNC_START(handle_tlb_modify)
* The vmalloc handling is not in the hotpath.
*/
csrrd t0, LOONGARCH_CSR_BADV
blt t0, $r0, vmalloc_modify
bltz t0, vmalloc_modify
csrrd t1, LOONGARCH_CSR_PGDL
vmalloc_done_modify:
@@ -411,7 +411,7 @@ vmalloc_done_modify:
* see if we need to jump to huge tlb processing.
*/
andi t0, ra, _PAGE_HUGE
bne t0, $r0, tlb_huge_update_modify
bnez t0, tlb_huge_update_modify
csrrd t0, LOONGARCH_CSR_BADV
srli.d t0, t0, (PAGE_SHIFT + PTE_ORDER)
@@ -431,12 +431,12 @@ smp_pgtable_change_modify:
srli.d ra, t0, _PAGE_WRITE_SHIFT
andi ra, ra, 1
beq ra, $r0, nopage_tlb_modify
beqz ra, nopage_tlb_modify
ori t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
#ifdef CONFIG_SMP
sc.d t0, t1, 0
beq t0, $r0, smp_pgtable_change_modify
beqz t0, smp_pgtable_change_modify
#else
st.d t0, t1, 0
#endif
@@ -454,7 +454,7 @@ leave_modify:
ertn
#ifdef CONFIG_64BIT
vmalloc_modify:
la.abs t1, swapper_pg_dir
la.abs t1, swapper_pg_dir
b vmalloc_done_modify
#endif
@@ -471,14 +471,14 @@ tlb_huge_update_modify:
srli.d ra, t0, _PAGE_WRITE_SHIFT
andi ra, ra, 1
beq ra, $r0, nopage_tlb_modify
beqz ra, nopage_tlb_modify
tlbsrch
ori t0, t0, (_PAGE_VALID | _PAGE_DIRTY | _PAGE_MODIFIED)
#ifdef CONFIG_SMP
sc.d t0, t1, 0
beq t0, $r0, tlb_huge_update_modify
beqz t0, tlb_huge_update_modify
ld.d t0, t1, 0
#else
st.d t0, t1, 0
@@ -504,28 +504,28 @@ tlb_huge_update_modify:
addi.d t0, ra, 0
/* Convert to entrylo1 */
addi.d t1, $r0, 1
addi.d t1, zero, 1
slli.d t1, t1, (HPAGE_SHIFT - 1)
add.d t0, t0, t1
csrwr t0, LOONGARCH_CSR_TLBELO1
/* Set huge page tlb entry size */
addu16i.d t0, $r0, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, $r0, (PS_HUGE_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
csrxchg t1, t0, LOONGARCH_CSR_TLBIDX
addu16i.d t0, zero, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, zero, (PS_HUGE_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
csrxchg t1, t0, LOONGARCH_CSR_TLBIDX
tlbwr
/* Reset default page size */
addu16i.d t0, $r0, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, $r0, (PS_DEFAULT_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
csrxchg t1, t0, LOONGARCH_CSR_TLBIDX
addu16i.d t0, zero, (CSR_TLBIDX_PS >> 16)
addu16i.d t1, zero, (PS_DEFAULT_SIZE << (CSR_TLBIDX_PS_SHIFT - 16))
csrxchg t1, t0, LOONGARCH_CSR_TLBIDX
nopage_tlb_modify:
dbar 0
csrrd ra, EXCEPTION_KS2
la.abs t0, tlb_do_page_fault_1
jirl $r0, t0, 0
jr t0
SYM_FUNC_END(handle_tlb_modify)
SYM_FUNC_START(handle_tlb_refill)

View File

@@ -256,6 +256,7 @@ config PPC
select IRQ_FORCED_THREADING
select MMU_GATHER_PAGE_SIZE
select MMU_GATHER_RCU_TABLE_FREE
select MMU_GATHER_MERGE_VMAS
select MODULES_USE_ELF_RELA
select NEED_DMA_MAP_STATE if PPC64 || NOT_COHERENT_CACHE
select NEED_PER_CPU_EMBED_FIRST_CHUNK if PPC64
@@ -281,6 +282,10 @@ config PPC
# Please keep this list sorted alphabetically.
#
config PPC_LONG_DOUBLE_128
depends on PPC64
def_bool $(success,test "$(shell,echo __LONG_DOUBLE_128__ | $(CC) -E -P -)" = 1)
config PPC_BARRIER_NOSPEC
bool
default y

View File

@@ -19,8 +19,6 @@
#include <linux/pagemap.h>
#define tlb_start_vma(tlb, vma) do { } while (0)
#define tlb_end_vma(tlb, vma) do { } while (0)
#define __tlb_remove_tlb_entry __tlb_remove_tlb_entry
#define tlb_flush tlb_flush

View File

@@ -20,6 +20,7 @@ CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += -fno-stack-protector
CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_prom_init.o += -ffreestanding
CFLAGS_prom_init.o += $(call cc-option, -ftrivial-auto-var-init=uninitialized)
ifdef CONFIG_FUNCTION_TRACER
# Do not trace early boot code

View File

@@ -73,6 +73,7 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
endif
KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-relax)
KBUILD_AFLAGS_MODULE += $(call as-option,-Wa$(comma)-mno-relax)
# GCC versions that support the "-mstrict-align" option default to allowing
# unaligned accesses. While unaligned accesses are explicitly allowed in the
@@ -110,7 +111,7 @@ PHONY += vdso_install
vdso_install:
$(Q)$(MAKE) $(build)=arch/riscv/kernel/vdso $@
$(if $(CONFIG_COMPAT),$(Q)$(MAKE) \
$(build)=arch/riscv/kernel/compat_vdso $@)
$(build)=arch/riscv/kernel/compat_vdso compat_$@)
ifeq ($(KBUILD_EXTMOD),)
ifeq ($(CONFIG_MMU),y)

View File

@@ -35,7 +35,7 @@
gpio-keys {
compatible = "gpio-keys";
key0 {
key {
label = "KEY0";
linux,code = <BTN_0>;
gpios = <&gpio0 10 GPIO_ACTIVE_LOW>;

View File

@@ -47,7 +47,7 @@
gpio-keys {
compatible = "gpio-keys";
boot {
key-boot {
label = "BOOT";
linux,code = <BTN_0>;
gpios = <&gpio0 0 GPIO_ACTIVE_LOW>;

View File

@@ -52,7 +52,7 @@
gpio-keys {
compatible = "gpio-keys";
boot {
key-boot {
label = "BOOT";
linux,code = <BTN_0>;
gpios = <&gpio0 0 GPIO_ACTIVE_LOW>;

View File

@@ -46,19 +46,19 @@
gpio-keys {
compatible = "gpio-keys";
up {
key-up {
label = "UP";
linux,code = <BTN_1>;
gpios = <&gpio1_0 7 GPIO_ACTIVE_LOW>;
};
press {
key-press {
label = "PRESS";
linux,code = <BTN_0>;
gpios = <&gpio0 0 GPIO_ACTIVE_LOW>;
};
down {
key-down {
label = "DOWN";
linux,code = <BTN_2>;
gpios = <&gpio0 1 GPIO_ACTIVE_LOW>;

View File

@@ -23,7 +23,7 @@
gpio-keys {
compatible = "gpio-keys";
boot {
key-boot {
label = "BOOT";
linux,code = <BTN_0>;
gpios = <&gpio0 0 GPIO_ACTIVE_LOW>;

View File

@@ -78,7 +78,7 @@ obj-$(CONFIG_SMP) += cpu_ops_sbi.o
endif
obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o
obj-$(CONFIG_KGDB) += kgdb.o
obj-$(CONFIG_KEXEC) += kexec_relocate.o crash_save_regs.o machine_kexec.o
obj-$(CONFIG_KEXEC_CORE) += kexec_relocate.o crash_save_regs.o machine_kexec.o
obj-$(CONFIG_KEXEC_FILE) += elf_kexec.o machine_kexec_file.o
obj-$(CONFIG_CRASH_DUMP) += crash_dump.o

View File

@@ -349,7 +349,7 @@ int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
{
const char *strtab, *name, *shstrtab;
const Elf_Shdr *sechdrs;
Elf_Rela *relas;
Elf64_Rela *relas;
int i, r_type;
/* String & section header string table */

View File

@@ -204,6 +204,7 @@ config S390
select IOMMU_SUPPORT if PCI
select MMU_GATHER_NO_GATHER
select MMU_GATHER_RCU_TABLE_FREE
select MMU_GATHER_MERGE_VMAS
select MODULES_USE_ELF_RELA
select NEED_DMA_MAP_STATE if PCI
select NEED_SG_DMA_LENGTH if PCI

View File

@@ -2,7 +2,7 @@
/*
* Kernel interface for the s390 arch_random_* functions
*
* Copyright IBM Corp. 2017, 2020
* Copyright IBM Corp. 2017, 2022
*
* Author: Harald Freudenberger <freude@de.ibm.com>
*
@@ -14,6 +14,7 @@
#ifdef CONFIG_ARCH_RANDOM
#include <linux/static_key.h>
#include <linux/preempt.h>
#include <linux/atomic.h>
#include <asm/cpacf.h>
@@ -32,7 +33,8 @@ static inline bool __must_check arch_get_random_int(unsigned int *v)
static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
{
if (static_branch_likely(&s390_arch_random_available)) {
if (static_branch_likely(&s390_arch_random_available) &&
in_task()) {
cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
atomic64_add(sizeof(*v), &s390_arch_random_counter);
return true;
@@ -42,7 +44,8 @@ static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
{
if (static_branch_likely(&s390_arch_random_available)) {
if (static_branch_likely(&s390_arch_random_available) &&
in_task()) {
cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
atomic64_add(sizeof(*v), &s390_arch_random_counter);
return true;

View File

@@ -27,9 +27,6 @@ static inline void tlb_flush(struct mmu_gather *tlb);
static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
struct page *page, int page_size);
#define tlb_start_vma(tlb, vma) do { } while (0)
#define tlb_end_vma(tlb, vma) do { } while (0)
#define tlb_flush tlb_flush
#define pte_free_tlb pte_free_tlb
#define pmd_free_tlb pmd_free_tlb

View File

@@ -67,6 +67,8 @@ config SPARC64
select HAVE_KRETPROBES
select HAVE_KPROBES
select MMU_GATHER_RCU_TABLE_FREE if SMP
select MMU_GATHER_MERGE_VMAS
select MMU_GATHER_NO_FLUSH_CACHE
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select HAVE_DYNAMIC_FTRACE
select HAVE_FTRACE_MCOUNT_RECORD

View File

@@ -22,8 +22,6 @@ void smp_flush_tlb_mm(struct mm_struct *mm);
void __flush_tlb_pending(unsigned long, unsigned long, unsigned long *);
void flush_tlb_pending(void);
#define tlb_start_vma(tlb, vma) do { } while (0)
#define tlb_end_vma(tlb, vma) do { } while (0)
#define tlb_flush(tlb) flush_tlb_pending()
/*

View File

@@ -245,6 +245,7 @@ config X86
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select MMU_GATHER_RCU_TABLE_FREE if PARAVIRT
select MMU_GATHER_MERGE_VMAS
select HAVE_POSIX_CPU_TIMERS_TASK_WORK
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE if UNWINDER_ORC || STACK_VALIDATION
@@ -2473,7 +2474,7 @@ config RETHUNK
bool "Enable return-thunks"
depends on RETPOLINE && CC_HAS_RETURN_THUNK
select OBJTOOL if HAVE_OBJTOOL
default y
default y if X86_64
help
Compile the kernel with the return-thunks compiler option to guard
against kernel-to-user data leaks by avoiding return speculation.
@@ -2482,21 +2483,21 @@ config RETHUNK
config CPU_UNRET_ENTRY
bool "Enable UNRET on kernel entry"
depends on CPU_SUP_AMD && RETHUNK
depends on CPU_SUP_AMD && RETHUNK && X86_64
default y
help
Compile the kernel with support for the retbleed=unret mitigation.
config CPU_IBPB_ENTRY
bool "Enable IBPB on kernel entry"
depends on CPU_SUP_AMD
depends on CPU_SUP_AMD && X86_64
default y
help
Compile the kernel with support for the retbleed=ibpb mitigation.
config CPU_IBRS_ENTRY
bool "Enable IBRS on kernel entry"
depends on CPU_SUP_INTEL
depends on CPU_SUP_INTEL && X86_64
default y
help
Compile the kernel with support for the spectre_v2=ibrs mitigation.

View File

@@ -27,6 +27,7 @@ RETHUNK_CFLAGS := -mfunction-return=thunk-extern
RETPOLINE_CFLAGS += $(RETHUNK_CFLAGS)
endif
export RETHUNK_CFLAGS
export RETPOLINE_CFLAGS
export RETPOLINE_VDSO_CFLAGS

View File

@@ -278,9 +278,9 @@ enum {
};
/*
* For formats with LBR_TSX flags (e.g. LBR_FORMAT_EIP_FLAGS2), bits 61:62 in
* MSR_LAST_BRANCH_FROM_x are the TSX flags when TSX is supported, but when
* TSX is not supported they have no consistent behavior:
* For format LBR_FORMAT_EIP_FLAGS2, bits 61:62 in MSR_LAST_BRANCH_FROM_x
* are the TSX flags when TSX is supported, but when TSX is not supported
* they have no consistent behavior:
*
* - For wrmsr(), bits 61:62 are considered part of the sign extension.
* - For HW updates (branch captures) bits 61:62 are always OFF and are not
@@ -288,7 +288,7 @@ enum {
*
* Therefore, if:
*
* 1) LBR has TSX format
* 1) LBR format LBR_FORMAT_EIP_FLAGS2
* 2) CPU has no TSX support enabled
*
* ... then any value passed to wrmsr() must be sign extended to 63 bits and any
@@ -300,7 +300,7 @@ static inline bool lbr_from_signext_quirk_needed(void)
bool tsx_support = boot_cpu_has(X86_FEATURE_HLE) ||
boot_cpu_has(X86_FEATURE_RTM);
return !tsx_support && x86_pmu.lbr_has_tsx;
return !tsx_support;
}
static DEFINE_STATIC_KEY_FALSE(lbr_from_quirk_key);
@@ -1609,9 +1609,6 @@ void intel_pmu_lbr_init_hsw(void)
x86_pmu.lbr_sel_map = hsw_lbr_sel_map;
x86_get_pmu(smp_processor_id())->task_ctx_cache = create_lbr_kmem_cache(size, 0);
if (lbr_from_signext_quirk_needed())
static_branch_enable(&lbr_from_quirk_key);
}
/* skylake */
@@ -1702,7 +1699,11 @@ void intel_pmu_lbr_init(void)
switch (x86_pmu.intel_cap.lbr_format) {
case LBR_FORMAT_EIP_FLAGS2:
x86_pmu.lbr_has_tsx = 1;
fallthrough;
x86_pmu.lbr_from_flags = 1;
if (lbr_from_signext_quirk_needed())
static_branch_enable(&lbr_from_quirk_key);
break;
case LBR_FORMAT_EIP_FLAGS:
x86_pmu.lbr_from_flags = 1;
break;

View File

@@ -302,6 +302,7 @@
#define X86_FEATURE_RETPOLINE_LFENCE (11*32+13) /* "" Use LFENCE for Spectre variant 2 */
#define X86_FEATURE_RETHUNK (11*32+14) /* "" Use REturn THUNK */
#define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
#define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */

View File

@@ -297,6 +297,8 @@ do { \
alternative_msr_write(MSR_IA32_SPEC_CTRL, \
spec_ctrl_current() | SPEC_CTRL_IBRS, \
X86_FEATURE_USE_IBRS_FW); \
alternative_msr_write(MSR_IA32_PRED_CMD, PRED_CMD_IBPB, \
X86_FEATURE_USE_IBPB_FW); \
} while (0)
#define firmware_restrict_branch_speculation_end() \

View File

@@ -72,7 +72,6 @@ static inline u64 lower_bits(u64 val, unsigned int bits)
struct real_mode_header;
enum stack_type;
struct ghcb;
/* Early IDT entry points for #VC handler */
extern void vc_no_ghcb(void);
@@ -156,11 +155,7 @@ static __always_inline void sev_es_nmi_complete(void)
__sev_es_nmi_complete();
}
extern int __init sev_es_efi_map_ghcbs(pgd_t *pgd);
extern enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
bool set_ghcb_msr,
struct es_em_ctxt *ctxt,
u64 exit_code, u64 exit_info_1,
u64 exit_info_2);
static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs)
{
int rc;

View File

@@ -2,9 +2,6 @@
#ifndef _ASM_X86_TLB_H
#define _ASM_X86_TLB_H
#define tlb_start_vma(tlb, vma) do { } while (0)
#define tlb_end_vma(tlb, vma) do { } while (0)
#define tlb_flush tlb_flush
static inline void tlb_flush(struct mmu_gather *tlb);

View File

@@ -555,7 +555,9 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end)
dest = addr + insn.length + insn.immediate.value;
if (__static_call_fixup(addr, op, dest) ||
WARN_ON_ONCE(dest != &__x86_return_thunk))
WARN_ONCE(dest != &__x86_return_thunk,
"missing return thunk: %pS-%pS: %*ph",
addr, dest, 5, addr))
continue;
DPRINTK("return thunk at: %pS (%px) len: %d to: %pS",

View File

@@ -975,6 +975,7 @@ static inline const char *spectre_v2_module_string(void) { return ""; }
#define SPECTRE_V2_LFENCE_MSG "WARNING: LFENCE mitigation is not recommended for this CPU, data leaks possible!\n"
#define SPECTRE_V2_EIBRS_EBPF_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks possible via Spectre v2 BHB attacks!\n"
#define SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS+LFENCE mitigation and SMT, data leaks possible via Spectre v2 BHB attacks!\n"
#define SPECTRE_V2_IBRS_PERF_MSG "WARNING: IBRS mitigation selected on Enhanced IBRS CPU, this may cause unnecessary performance loss\n"
#ifdef CONFIG_BPF_SYSCALL
void unpriv_ebpf_notify(int new_state)
@@ -1415,6 +1416,8 @@ static void __init spectre_v2_select_mitigation(void)
case SPECTRE_V2_IBRS:
setup_force_cpu_cap(X86_FEATURE_KERNEL_IBRS);
if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED))
pr_warn(SPECTRE_V2_IBRS_PERF_MSG);
break;
case SPECTRE_V2_LFENCE:
@@ -1516,7 +1519,17 @@ static void __init spectre_v2_select_mitigation(void)
* the CPU supports Enhanced IBRS, kernel might un-intentionally not
* enable IBRS around firmware calls.
*/
if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_ibrs_mode(mode)) {
if (boot_cpu_has_bug(X86_BUG_RETBLEED) &&
boot_cpu_has(X86_FEATURE_IBPB) &&
(boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)) {
if (retbleed_cmd != RETBLEED_CMD_IBPB) {
setup_force_cpu_cap(X86_FEATURE_USE_IBPB_FW);
pr_info("Enabling Speculation Barrier for firmware calls\n");
}
} else if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_ibrs_mode(mode)) {
setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
pr_info("Enabling Restricted Speculation for firmware calls\n");
}

View File

@@ -219,9 +219,10 @@ static enum es_result verify_exception_info(struct ghcb *ghcb, struct es_em_ctxt
return ES_VMM_ERROR;
}
enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
struct es_em_ctxt *ctxt, u64 exit_code,
u64 exit_info_1, u64 exit_info_2)
static enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
struct es_em_ctxt *ctxt,
u64 exit_code, u64 exit_info_1,
u64 exit_info_2)
{
/* Fill in protocol and format specifiers */
ghcb->protocol_version = ghcb_version;
@@ -231,14 +232,7 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
ghcb_set_sw_exit_info_1(ghcb, exit_info_1);
ghcb_set_sw_exit_info_2(ghcb, exit_info_2);
/*
* Hyper-V unenlightened guests use a paravisor for communicating and
* GHCB pages are being allocated and set up by that paravisor. Linux
* should not change the GHCB page's physical address.
*/
if (set_ghcb_msr)
sev_es_wr_ghcb_msr(__pa(ghcb));
sev_es_wr_ghcb_msr(__pa(ghcb));
VMGEXIT();
return verify_exception_info(ghcb, ctxt);
@@ -795,7 +789,7 @@ static enum es_result vc_handle_ioio(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
*/
sw_scratch = __pa(ghcb) + offsetof(struct ghcb, shared_buffer);
ghcb_set_sw_scratch(ghcb, sw_scratch);
ret = sev_es_ghcb_hv_call(ghcb, true, ctxt, SVM_EXIT_IOIO,
ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_IOIO,
exit_info_1, exit_info_2);
if (ret != ES_OK)
return ret;
@@ -837,8 +831,7 @@ static enum es_result vc_handle_ioio(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
ghcb_set_rax(ghcb, rax);
ret = sev_es_ghcb_hv_call(ghcb, true, ctxt,
SVM_EXIT_IOIO, exit_info_1, 0);
ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_IOIO, exit_info_1, 0);
if (ret != ES_OK)
return ret;
@@ -894,7 +887,7 @@ static enum es_result vc_handle_cpuid(struct ghcb *ghcb,
/* xgetbv will cause #GP - use reset value for xcr0 */
ghcb_set_xcr0(ghcb, 1);
ret = sev_es_ghcb_hv_call(ghcb, true, ctxt, SVM_EXIT_CPUID, 0, 0);
ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_CPUID, 0, 0);
if (ret != ES_OK)
return ret;
@@ -919,7 +912,7 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
bool rdtscp = (exit_code == SVM_EXIT_RDTSCP);
enum es_result ret;
ret = sev_es_ghcb_hv_call(ghcb, true, ctxt, exit_code, 0, 0);
ret = sev_es_ghcb_hv_call(ghcb, ctxt, exit_code, 0, 0);
if (ret != ES_OK)
return ret;

View File

@@ -786,7 +786,7 @@ static int vmgexit_psc(struct snp_psc_desc *desc)
ghcb_set_sw_scratch(ghcb, (u64)__pa(data));
/* This will advance the shared buffer data points to. */
ret = sev_es_ghcb_hv_call(ghcb, true, &ctxt, SVM_VMGEXIT_PSC, 0, 0);
ret = sev_es_ghcb_hv_call(ghcb, &ctxt, SVM_VMGEXIT_PSC, 0, 0);
/*
* Page State Change VMGEXIT can pass error code through
@@ -1212,8 +1212,7 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
ghcb_set_rdx(ghcb, regs->dx);
}
ret = sev_es_ghcb_hv_call(ghcb, true, ctxt, SVM_EXIT_MSR,
exit_info_1, 0);
ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_MSR, exit_info_1, 0);
if ((ret == ES_OK) && (!exit_info_1)) {
regs->ax = ghcb->save.rax;
@@ -1452,7 +1451,7 @@ static enum es_result vc_do_mmio(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
ghcb_set_sw_scratch(ghcb, ghcb_pa + offsetof(struct ghcb, shared_buffer));
return sev_es_ghcb_hv_call(ghcb, true, ctxt, exit_code, exit_info_1, exit_info_2);
return sev_es_ghcb_hv_call(ghcb, ctxt, exit_code, exit_info_1, exit_info_2);
}
/*
@@ -1628,7 +1627,7 @@ static enum es_result vc_handle_dr7_write(struct ghcb *ghcb,
/* Using a value of 0 for ExitInfo1 means RAX holds the value */
ghcb_set_rax(ghcb, val);
ret = sev_es_ghcb_hv_call(ghcb, true, ctxt, SVM_EXIT_WRITE_DR7, 0, 0);
ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_WRITE_DR7, 0, 0);
if (ret != ES_OK)
return ret;
@@ -1658,7 +1657,7 @@ static enum es_result vc_handle_dr7_read(struct ghcb *ghcb,
static enum es_result vc_handle_wbinvd(struct ghcb *ghcb,
struct es_em_ctxt *ctxt)
{
return sev_es_ghcb_hv_call(ghcb, true, ctxt, SVM_EXIT_WBINVD, 0, 0);
return sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_WBINVD, 0, 0);
}
static enum es_result vc_handle_rdpmc(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
@@ -1667,7 +1666,7 @@ static enum es_result vc_handle_rdpmc(struct ghcb *ghcb, struct es_em_ctxt *ctxt
ghcb_set_rcx(ghcb, ctxt->regs->cx);
ret = sev_es_ghcb_hv_call(ghcb, true, ctxt, SVM_EXIT_RDPMC, 0, 0);
ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_RDPMC, 0, 0);
if (ret != ES_OK)
return ret;
@@ -1708,7 +1707,7 @@ static enum es_result vc_handle_vmmcall(struct ghcb *ghcb,
if (x86_platform.hyper.sev_es_hcall_prepare)
x86_platform.hyper.sev_es_hcall_prepare(ghcb, ctxt->regs);
ret = sev_es_ghcb_hv_call(ghcb, true, ctxt, SVM_EXIT_VMMCALL, 0, 0);
ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_VMMCALL, 0, 0);
if (ret != ES_OK)
return ret;
@@ -2197,7 +2196,7 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned
ghcb_set_rbx(ghcb, input->data_npages);
}
ret = sev_es_ghcb_hv_call(ghcb, true, &ctxt, exit_code, input->req_gpa, input->resp_gpa);
ret = sev_es_ghcb_hv_call(ghcb, &ctxt, exit_code, input->req_gpa, input->resp_gpa);
if (ret)
goto e_put;

View File

@@ -6029,6 +6029,11 @@ split_irqchip_unlock:
r = 0;
break;
case KVM_CAP_X86_USER_SPACE_MSR:
r = -EINVAL;
if (cap->args[0] & ~(KVM_MSR_EXIT_REASON_INVAL |
KVM_MSR_EXIT_REASON_UNKNOWN |
KVM_MSR_EXIT_REASON_FILTER))
break;
kvm->arch.user_space_msr_mask = cap->args[0];
r = 0;
break;
@@ -6183,6 +6188,9 @@ static int kvm_vm_ioctl_set_msr_filter(struct kvm *kvm, void __user *argp)
if (copy_from_user(&filter, user_msr_filter, sizeof(filter)))
return -EFAULT;
if (filter.flags & ~KVM_MSR_FILTER_DEFAULT_DENY)
return -EINVAL;
for (i = 0; i < ARRAY_SIZE(filter.ranges); i++)
empty &= !filter.ranges[i].nmsrs;

View File

@@ -43,6 +43,7 @@ config SYSTEM_TRUSTED_KEYRING
bool "Provide system-wide ring of trusted keys"
depends on KEYS
depends on ASYMMETRIC_KEY_TYPE
depends on X509_CERTIFICATE_PARSER
help
Provide a system keyring to which trusted keys can be added. Keys in
the keyring are considered to be trusted. Keys may be added at will

View File

@@ -782,7 +782,8 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
if (!osc_cpc_flexible_adr_space_confirmed) {
pr_debug("Flexible address space capability not supported\n");
goto out_free;
if (!cpc_supported_by_cpu())
goto out_free;
}
addr = ioremap(gas_t->address, gas_t->bit_width/8);
@@ -809,7 +810,8 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
}
if (!osc_cpc_flexible_adr_space_confirmed) {
pr_debug("Flexible address space capability not supported\n");
goto out_free;
if (!cpc_supported_by_cpu())
goto out_free;
}
} else {
if (gas_t->space_id != ACPI_ADR_SPACE_FIXED_HARDWARE || !cpc_ffh_supported()) {

View File

@@ -213,7 +213,7 @@ static int lan966x_gate_clk_register(struct device *dev,
hw_data->hws[i] =
devm_clk_hw_register_gate(dev, clk_gate_desc[idx].name,
"lan966x", 0, base,
"lan966x", 0, gate_base,
clk_gate_desc[idx].bit_idx,
0, &clk_gate_lock);

View File

@@ -138,6 +138,7 @@ static struct ccu_common *sun50i_h6_r_ccu_clks[] = {
&r_apb2_rsb_clk.common,
&r_apb1_ir_clk.common,
&r_apb1_w1_clk.common,
&r_apb1_rtc_clk.common,
&ir_clk.common,
&w1_clk.common,
};

View File

@@ -103,9 +103,14 @@ static void dimm_setup_label(struct dimm_info *dimm, u16 handle)
dmi_memdev_name(handle, &bank, &device);
/* both strings must be non-zero */
if (bank && *bank && device && *device)
snprintf(dimm->label, sizeof(dimm->label), "%s %s", bank, device);
/*
* Set to a NULL string when both bank and device are zero. In this case,
* the label assigned by default will be preserved.
*/
snprintf(dimm->label, sizeof(dimm->label), "%s%s%s",
(bank && *bank) ? bank : "",
(bank && *bank && device && *device) ? " " : "",
(device && *device) ? device : "");
}
static void assign_dmi_dimm_info(struct dimm_info *dimm, struct memdev_dmi_entry *entry)

View File

@@ -514,6 +514,28 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
memset(p, 0, sizeof(*p));
}
static void enable_intr(struct synps_edac_priv *priv)
{
/* Enable UE/CE Interrupts */
if (priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)
writel(DDR_UE_MASK | DDR_CE_MASK,
priv->baseaddr + ECC_CLR_OFST);
else
writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
priv->baseaddr + DDR_QOS_IRQ_EN_OFST);
}
static void disable_intr(struct synps_edac_priv *priv)
{
/* Disable UE/CE Interrupts */
if (priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)
writel(0x0, priv->baseaddr + ECC_CLR_OFST);
else
writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
priv->baseaddr + DDR_QOS_IRQ_DB_OFST);
}
/**
* intr_handler - Interrupt Handler for ECC interrupts.
* @irq: IRQ number.
@@ -555,6 +577,9 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
/* v3.0 of the controller does not have this register */
if (!(priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR))
writel(regval, priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
else
enable_intr(priv);
return IRQ_HANDLED;
}
@@ -837,25 +862,6 @@ static void mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
init_csrows(mci);
}
static void enable_intr(struct synps_edac_priv *priv)
{
/* Enable UE/CE Interrupts */
if (priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)
writel(DDR_UE_MASK | DDR_CE_MASK,
priv->baseaddr + ECC_CLR_OFST);
else
writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
priv->baseaddr + DDR_QOS_IRQ_EN_OFST);
}
static void disable_intr(struct synps_edac_priv *priv)
{
/* Disable UE/CE Interrupts */
writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
priv->baseaddr + DDR_QOS_IRQ_DB_OFST);
}
static int setup_irq(struct mem_ctl_info *mci,
struct platform_device *pdev)
{

View File

@@ -351,6 +351,9 @@ static const struct regmap_config pca953x_i2c_regmap = {
.reg_bits = 8,
.val_bits = 8,
.use_single_read = true,
.use_single_write = true,
.readable_reg = pca953x_readable_register,
.writeable_reg = pca953x_writeable_register,
.volatile_reg = pca953x_volatile_register,
@@ -906,15 +909,18 @@ static int pca953x_irq_setup(struct pca953x_chip *chip,
static int device_pca95xx_init(struct pca953x_chip *chip, u32 invert)
{
DECLARE_BITMAP(val, MAX_LINE);
u8 regaddr;
int ret;
ret = regcache_sync_region(chip->regmap, chip->regs->output,
chip->regs->output + NBANK(chip));
regaddr = pca953x_recalc_addr(chip, chip->regs->output, 0);
ret = regcache_sync_region(chip->regmap, regaddr,
regaddr + NBANK(chip) - 1);
if (ret)
goto out;
ret = regcache_sync_region(chip->regmap, chip->regs->direction,
chip->regs->direction + NBANK(chip));
regaddr = pca953x_recalc_addr(chip, chip->regs->direction, 0);
ret = regcache_sync_region(chip->regmap, regaddr,
regaddr + NBANK(chip) - 1);
if (ret)
goto out;
@@ -1127,14 +1133,14 @@ static int pca953x_regcache_sync(struct device *dev)
* sync these registers first and only then sync the rest.
*/
regaddr = pca953x_recalc_addr(chip, chip->regs->direction, 0);
ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip));
ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip) - 1);
if (ret) {
dev_err(dev, "Failed to sync GPIO dir registers: %d\n", ret);
return ret;
}
regaddr = pca953x_recalc_addr(chip, chip->regs->output, 0);
ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip));
ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip) - 1);
if (ret) {
dev_err(dev, "Failed to sync GPIO out registers: %d\n", ret);
return ret;
@@ -1144,7 +1150,7 @@ static int pca953x_regcache_sync(struct device *dev)
if (chip->driver_data & PCA_PCAL) {
regaddr = pca953x_recalc_addr(chip, PCAL953X_IN_LATCH, 0);
ret = regcache_sync_region(chip->regmap, regaddr,
regaddr + NBANK(chip));
regaddr + NBANK(chip) - 1);
if (ret) {
dev_err(dev, "Failed to sync INT latch registers: %d\n",
ret);
@@ -1153,7 +1159,7 @@ static int pca953x_regcache_sync(struct device *dev)
regaddr = pca953x_recalc_addr(chip, PCAL953X_INT_MASK, 0);
ret = regcache_sync_region(chip->regmap, regaddr,
regaddr + NBANK(chip));
regaddr + NBANK(chip) - 1);
if (ret) {
dev_err(dev, "Failed to sync INT mask registers: %d\n",
ret);

View File

@@ -99,7 +99,7 @@ static inline void xgpio_set_value32(unsigned long *map, int bit, u32 v)
const unsigned long offset = (bit % BITS_PER_LONG) & BIT(5);
map[index] &= ~(0xFFFFFFFFul << offset);
map[index] |= v << offset;
map[index] |= (unsigned long)v << offset;
}
static inline int xgpio_regoffset(struct xgpio_instance *chip, int ch)

View File

@@ -421,6 +421,10 @@ out_free_lh:
* @work: the worker that implements software debouncing
* @sw_debounced: flag indicating if the software debouncer is active
* @level: the current debounced physical level of the line
* @hdesc: the Hardware Timestamp Engine (HTE) descriptor
* @raw_level: the line level at the time of event
* @total_discard_seq: the running counter of the discarded events
* @last_seqno: the last sequence number before debounce period expires
*/
struct line {
struct gpio_desc *desc;

View File

@@ -1364,16 +1364,10 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev,
struct amdgpu_vm *vm)
{
struct amdkfd_process_info *process_info = vm->process_info;
struct amdgpu_bo *pd = vm->root.bo;
if (!process_info)
return;
/* Release eviction fence from PD */
amdgpu_bo_reserve(pd, false);
amdgpu_bo_fence(pd, NULL, false);
amdgpu_bo_unreserve(pd);
/* Update process info */
mutex_lock(&process_info->lock);
process_info->n_vms--;

View File

@@ -40,7 +40,7 @@ static void amdgpu_bo_list_free_rcu(struct rcu_head *rcu)
{
struct amdgpu_bo_list *list = container_of(rcu, struct amdgpu_bo_list,
rhead);
mutex_destroy(&list->bo_list_mutex);
kvfree(list);
}
@@ -136,6 +136,7 @@ int amdgpu_bo_list_create(struct amdgpu_device *adev, struct drm_file *filp,
trace_amdgpu_cs_bo_status(list->num_entries, total_size);
mutex_init(&list->bo_list_mutex);
*result = list;
return 0;

View File

@@ -47,6 +47,10 @@ struct amdgpu_bo_list {
struct amdgpu_bo *oa_obj;
unsigned first_userptr;
unsigned num_entries;
/* Protect access during command submission.
*/
struct mutex bo_list_mutex;
};
int amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id,

View File

@@ -519,6 +519,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
return r;
}
mutex_lock(&p->bo_list->bo_list_mutex);
/* One for TTM and one for the CS job */
amdgpu_bo_list_for_each_entry(e, p->bo_list)
e->tv.num_shared = 2;
@@ -651,6 +653,7 @@ out_free_user_pages:
kvfree(e->user_pages);
e->user_pages = NULL;
}
mutex_unlock(&p->bo_list->bo_list_mutex);
}
return r;
}
@@ -690,9 +693,11 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error,
{
unsigned i;
if (error && backoff)
if (error && backoff) {
ttm_eu_backoff_reservation(&parser->ticket,
&parser->validated);
mutex_unlock(&parser->bo_list->bo_list_mutex);
}
for (i = 0; i < parser->num_post_deps; i++) {
drm_syncobj_put(parser->post_deps[i].syncobj);
@@ -832,12 +837,16 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
continue;
r = amdgpu_vm_bo_update(adev, bo_va, false);
if (r)
if (r) {
mutex_unlock(&p->bo_list->bo_list_mutex);
return r;
}
r = amdgpu_sync_fence(&p->job->sync, bo_va->last_pt_update);
if (r)
if (r) {
mutex_unlock(&p->bo_list->bo_list_mutex);
return r;
}
}
r = amdgpu_vm_handle_moved(adev, vm);
@@ -1278,6 +1287,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
mutex_unlock(&p->adev->notifier_lock);
mutex_unlock(&p->bo_list->bo_list_mutex);
return 0;

View File

@@ -6,7 +6,7 @@ config DRM_AMD_DC
bool "AMD DC - Enable new display engine"
default y
select SND_HDA_COMPONENT if SND_HDA_CORE
select DRM_AMD_DC_DCN if X86 && !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS)
select DRM_AMD_DC_DCN if (X86 || PPC_LONG_DOUBLE_128) && !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS)
help
Choose this option if you want to use the new display engine
support for AMDGPU. This adds required support for Vega and

View File

@@ -1653,7 +1653,7 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
adev->dm.crc_rd_wrk = amdgpu_dm_crtc_secure_display_create_work();
#endif
if (dc_enable_dmub_notifications(adev->dm.dc)) {
if (dc_is_dmub_outbox_supported(adev->dm.dc)) {
init_completion(&adev->dm.dmub_aux_transfer_done);
adev->dm.dmub_notify = kzalloc(sizeof(struct dmub_notification), GFP_KERNEL);
if (!adev->dm.dmub_notify) {
@@ -1689,6 +1689,13 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
goto error;
}
/* Enable outbox notification only after IRQ handlers are registered and DMUB is alive.
* It is expected that DMUB will resend any pending notifications at this point, for
* example HPD from DPIA.
*/
if (dc_is_dmub_outbox_supported(adev->dm.dc))
dc_enable_dmub_outbox(adev->dm.dc);
/* create fake encoders for MST */
dm_dp_create_fake_mst_encoders(adev);
@@ -2678,9 +2685,6 @@ static int dm_resume(void *handle)
*/
link_enc_cfg_copy(adev->dm.dc->current_state, dc_state);
if (dc_enable_dmub_notifications(adev->dm.dc))
amdgpu_dm_outbox_init(adev);
r = dm_dmub_hw_init(adev);
if (r)
DRM_ERROR("DMUB interface failed to initialize: status=%d\n", r);
@@ -2698,6 +2702,11 @@ static int dm_resume(void *handle)
}
}
if (dc_is_dmub_outbox_supported(adev->dm.dc)) {
amdgpu_dm_outbox_init(adev);
dc_enable_dmub_outbox(adev->dm.dc);
}
WARN_ON(!dc_commit_state(dm->dc, dc_state));
dm_gpureset_commit_state(dm->cached_dc_state, dm);
@@ -2719,13 +2728,15 @@ static int dm_resume(void *handle)
/* TODO: Remove dc_state->dccg, use dc->dccg directly. */
dc_resource_state_construct(dm->dc, dm_state->context);
/* Re-enable outbox interrupts for DPIA. */
if (dc_enable_dmub_notifications(adev->dm.dc))
amdgpu_dm_outbox_init(adev);
/* Before powering on DC we need to re-initialize DMUB. */
dm_dmub_hw_resume(adev);
/* Re-enable outbox interrupts for DPIA. */
if (dc_is_dmub_outbox_supported(adev->dm.dc)) {
amdgpu_dm_outbox_init(adev);
dc_enable_dmub_outbox(adev->dm.dc);
}
/* power on hardware */
dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);

View File

@@ -64,8 +64,13 @@ int drm_gem_ttm_vmap(struct drm_gem_object *gem,
struct iosys_map *map)
{
struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
int ret;
return ttm_bo_vmap(bo, map);
dma_resv_lock(gem->resv, NULL);
ret = ttm_bo_vmap(bo, map);
dma_resv_unlock(gem->resv);
return ret;
}
EXPORT_SYMBOL(drm_gem_ttm_vmap);
@@ -82,7 +87,9 @@ void drm_gem_ttm_vunmap(struct drm_gem_object *gem,
{
struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
dma_resv_lock(gem->resv, NULL);
ttm_bo_vunmap(bo, map);
dma_resv_unlock(gem->resv);
}
EXPORT_SYMBOL(drm_gem_ttm_vunmap);

View File

@@ -273,10 +273,17 @@ struct intel_context {
u8 child_index;
/** @guc: GuC specific members for parallel submission */
struct {
/** @wqi_head: head pointer in work queue */
/** @wqi_head: cached head pointer in work queue */
u16 wqi_head;
/** @wqi_tail: tail pointer in work queue */
/** @wqi_tail: cached tail pointer in work queue */
u16 wqi_tail;
/** @wq_head: pointer to the actual head in work queue */
u32 *wq_head;
/** @wq_tail: pointer to the actual head in work queue */
u32 *wq_tail;
/** @wq_status: pointer to the status in work queue */
u32 *wq_status;
/**
* @parent_page: page in context state (ce->state) used
* by parent for work queue, process descriptor

View File

@@ -201,6 +201,8 @@ int intel_ring_submission_setup(struct intel_engine_cs *engine);
int intel_engine_stop_cs(struct intel_engine_cs *engine);
void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine);
void intel_engine_wait_for_pending_mi_fw(struct intel_engine_cs *engine);
void intel_engine_set_hwsp_writemask(struct intel_engine_cs *engine, u32 mask);
u64 intel_engine_get_active_head(const struct intel_engine_cs *engine);

View File

@@ -1282,10 +1282,10 @@ static int __intel_engine_stop_cs(struct intel_engine_cs *engine,
intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING));
/*
* Wa_22011802037 : gen12, Prior to doing a reset, ensure CS is
* Wa_22011802037 : gen11, gen12, Prior to doing a reset, ensure CS is
* stopped, set ring stop bit and prefetch disable bit to halt CS
*/
if (GRAPHICS_VER(engine->i915) == 12)
if (IS_GRAPHICS_VER(engine->i915, 11, 12))
intel_uncore_write_fw(uncore, RING_MODE_GEN7(engine->mmio_base),
_MASKED_BIT_ENABLE(GEN12_GFX_PREFETCH_DISABLE));
@@ -1308,6 +1308,18 @@ int intel_engine_stop_cs(struct intel_engine_cs *engine)
return -ENODEV;
ENGINE_TRACE(engine, "\n");
/*
* TODO: Find out why occasionally stopping the CS times out. Seen
* especially with gem_eio tests.
*
* Occasionally trying to stop the cs times out, but does not adversely
* affect functionality. The timeout is set as a config parameter that
* defaults to 100ms. In most cases the follow up operation is to wait
* for pending MI_FORCE_WAKES. The assumption is that this timeout is
* sufficient for any pending MI_FORCEWAKEs to complete. Once root
* caused, the caller must check and handle the return from this
* function.
*/
if (__intel_engine_stop_cs(engine, 1000, stop_timeout(engine))) {
ENGINE_TRACE(engine,
"timed out on STOP_RING -> IDLE; HEAD:%04x, TAIL:%04x\n",
@@ -1334,6 +1346,78 @@ void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine)
ENGINE_WRITE_FW(engine, RING_MI_MODE, _MASKED_BIT_DISABLE(STOP_RING));
}
static u32 __cs_pending_mi_force_wakes(struct intel_engine_cs *engine)
{
static const i915_reg_t _reg[I915_NUM_ENGINES] = {
[RCS0] = MSG_IDLE_CS,
[BCS0] = MSG_IDLE_BCS,
[VCS0] = MSG_IDLE_VCS0,
[VCS1] = MSG_IDLE_VCS1,
[VCS2] = MSG_IDLE_VCS2,
[VCS3] = MSG_IDLE_VCS3,
[VCS4] = MSG_IDLE_VCS4,
[VCS5] = MSG_IDLE_VCS5,
[VCS6] = MSG_IDLE_VCS6,
[VCS7] = MSG_IDLE_VCS7,
[VECS0] = MSG_IDLE_VECS0,
[VECS1] = MSG_IDLE_VECS1,
[VECS2] = MSG_IDLE_VECS2,
[VECS3] = MSG_IDLE_VECS3,
[CCS0] = MSG_IDLE_CS,
[CCS1] = MSG_IDLE_CS,
[CCS2] = MSG_IDLE_CS,
[CCS3] = MSG_IDLE_CS,
};
u32 val;
if (!_reg[engine->id].reg) {
drm_err(&engine->i915->drm,
"MSG IDLE undefined for engine id %u\n", engine->id);
return 0;
}
val = intel_uncore_read(engine->uncore, _reg[engine->id]);
/* bits[29:25] & bits[13:9] >> shift */
return (val & (val >> 16) & MSG_IDLE_FW_MASK) >> MSG_IDLE_FW_SHIFT;
}
static void __gpm_wait_for_fw_complete(struct intel_gt *gt, u32 fw_mask)
{
int ret;
/* Ensure GPM receives fw up/down after CS is stopped */
udelay(1);
/* Wait for forcewake request to complete in GPM */
ret = __intel_wait_for_register_fw(gt->uncore,
GEN9_PWRGT_DOMAIN_STATUS,
fw_mask, fw_mask, 5000, 0, NULL);
/* Ensure CS receives fw ack from GPM */
udelay(1);
if (ret)
GT_TRACE(gt, "Failed to complete pending forcewake %d\n", ret);
}
/*
* Wa_22011802037:gen12: In addition to stopping the cs, we need to wait for any
* pending MI_FORCE_WAKEUP requests that the CS has initiated to complete. The
* pending status is indicated by bits[13:9] (masked by bits[29:25]) in the
* MSG_IDLE register. There's one MSG_IDLE register per reset domain. Since we
* are concerned only with the gt reset here, we use a logical OR of pending
* forcewakeups from all reset domains and then wait for them to complete by
* querying PWRGT_DOMAIN_STATUS.
*/
void intel_engine_wait_for_pending_mi_fw(struct intel_engine_cs *engine)
{
u32 fw_pending = __cs_pending_mi_force_wakes(engine);
if (fw_pending)
__gpm_wait_for_fw_complete(engine->gt, fw_pending);
}
static u32
read_subslice_reg(const struct intel_engine_cs *engine,
int slice, int subslice, i915_reg_t reg)

View File

@@ -661,6 +661,16 @@ static inline void execlists_schedule_out(struct i915_request *rq)
i915_request_put(rq);
}
static u32 map_i915_prio_to_lrc_desc_prio(int prio)
{
if (prio > I915_PRIORITY_NORMAL)
return GEN12_CTX_PRIORITY_HIGH;
else if (prio < I915_PRIORITY_NORMAL)
return GEN12_CTX_PRIORITY_LOW;
else
return GEN12_CTX_PRIORITY_NORMAL;
}
static u64 execlists_update_context(struct i915_request *rq)
{
struct intel_context *ce = rq->context;
@@ -669,7 +679,7 @@ static u64 execlists_update_context(struct i915_request *rq)
desc = ce->lrc.desc;
if (rq->engine->flags & I915_ENGINE_HAS_EU_PRIORITY)
desc |= lrc_desc_priority(rq_prio(rq));
desc |= map_i915_prio_to_lrc_desc_prio(rq_prio(rq));
/*
* WaIdleLiteRestore:bdw,skl
@@ -2958,6 +2968,13 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine)
ring_set_paused(engine, 1);
intel_engine_stop_cs(engine);
/*
* Wa_22011802037:gen11/gen12: In addition to stopping the cs, we need
* to wait for any pending mi force wakeups
*/
if (IS_GRAPHICS_VER(engine->i915, 11, 12))
intel_engine_wait_for_pending_mi_fw(engine);
engine->execlists.reset_ccid = active_ccid(engine);
}

Some files were not shown because too many files have changed in this diff Show More