Compare commits

...

982 Commits

Author SHA1 Message Date
Linus Torvalds
88084a3df1 Linux 5.19-rc5 2022-07-03 15:39:28 -07:00
Linus Torvalds
b8d5109f50 lockref: remove unused 'lockref_get_or_lock()' function
Looking at the conditional lock acquire functions in the kernel due to
the new sparse support (see commit 4a557a5d1a "sparse: introduce
conditional lock acquire function attribute"), it became obvious that
the lockref code has a couple of them, but they don't match the usual
naming convention for the other ones, and their return value logic is
also reversed.

In the other very similar places, the naming pattern is '*_and_lock()'
(eg 'atomic_put_and_lock()' and 'refcount_dec_and_lock()'), and the
function returns true when the lock is taken.

The lockref code is superficially very similar to the refcount code,
only with the special "atomic wrt the embedded lock" semantics.  But
instead of the '*_and_lock()' naming it uses '*_or_lock()'.

And instead of returning true in case it took the lock, it returns true
if it *didn't* take the lock.

Now, arguably the reflock code is quite logical: it really is a "either
decrement _or_ lock" kind of situation - and the return value is about
whether the operation succeeded without any special care needed.

So despite the similarities, the differences do make some sense, and
maybe it's not worth trying to unify the different conditional locking
primitives in this area.

But while looking at this all, it did become obvious that the
'lockref_get_or_lock()' function hasn't actually had any users for
almost a decade.

The only user it ever had was the shortlived 'd_rcu_to_refcount()'
function, and it got removed and replaced with 'lockref_get_not_dead()'
back in 2013 in commits 0d98439ea3 ("vfs: use lockred 'dead' flag to
mark unrecoverably dead dentries") and e5c832d555 ("vfs: fix dentry
RCU to refcounting possibly sleeping dput()")

In fact, that single use was removed less than a week after the whole
function was introduced in commit b3abd80250 ("lockref: add
'lockref_get_or_lock() helper") so this function has been around for a
decade, but only had a user for six days.

Let's just put this mis-designed and unused function out of its misery.

We can think about the naming and semantic oddities of the remaining
'lockref_put_or_lock()' later, but at least that function has users.

And while the naming is different and the return value doesn't match,
that function matches the whole '{atomic,refcount}_dec_and_test()'
pattern much better (ie the magic happens when the count goes down to
zero, not when it is incremented from zero).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-03 14:40:28 -07:00
Linus Torvalds
4a557a5d1a sparse: introduce conditional lock acquire function attribute
The kernel tends to try to avoid conditional locking semantics because
it makes it harder to think about and statically check locking rules,
but we do have a few fundamental locking primitives that take locks
conditionally - most obviously the 'trylock' functions.

That has always been a problem for 'sparse' checking for locking
imbalance, and we've had a special '__cond_lock()' macro that we've used
to let sparse know how the locking works:

    # define __cond_lock(x,c)        ((c) ? ({ __acquire(x); 1; }) : 0)

so that you can then use this to tell sparse that (for example) the
spinlock trylock macro ends up acquiring the lock when it succeeds, but
not when it fails:

    #define raw_spin_trylock(lock)  __cond_lock(lock, _raw_spin_trylock(lock))

and then sparse can follow along the locking rules when you have code like

        if (!spin_trylock(&dentry->d_lock))
                return LRU_SKIP;
	.. sparse sees that the lock is held here..
        spin_unlock(&dentry->d_lock);

and sparse ends up happy about the lock contexts.

However, this '__cond_lock()' use does result in very ugly header files,
and requires you to basically wrap the real function with that macro
that uses '__cond_lock'.  Which has made PeterZ NAK things that try to
fix sparse warnings over the years [1].

To solve this, there is now a very experimental patch to sparse that
basically does the exact same thing as '__cond_lock()' did, but using a
function attribute instead.  That seems to make PeterZ happy [2].

Note that this does not replace existing use of '__cond_lock()', but
only exposes the new proposed attribute and uses it for the previously
unannotated 'refcount_dec_and_lock()' family of functions.

For existing sparse installations, this will make no difference (a
negative output context was ignored), but if you have the experimental
sparse patch it will make sparse now understand code that uses those
functions, the same way '__cond_lock()' makes sparse understand the very
similar 'atomic_dec_and_lock()' uses that have the old '__cond_lock()'
annotations.

Note that in some cases this will silence existing context imbalance
warnings.  But in other cases it may end up exposing new sparse warnings
for code that sparse just didn't see the locking for at all before.

This is a trial, in other words.  I'd expect that if it ends up being
successful, and new sparse releases end up having this new attribute,
we'll migrate the old-style '__cond_lock()' users to use the new-style
'__cond_acquires' function attribute.

The actual experimental sparse patch was posted in [3].

Link: https://lore.kernel.org/all/20130930134434.GC12926@twins.programming.kicks-ass.net/ [1]
Link: https://lore.kernel.org/all/Yr60tWxN4P568x3W@worktop.programming.kicks-ass.net/ [2]
Link: https://lore.kernel.org/all/CAHk-=wjZfO9hGqJ2_hGQG3U_XzSh9_XaXze=HgPdvJbgrvASfA@mail.gmail.com/ [3]
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Aring <aahringo@redhat.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-03 11:32:22 -07:00
Linus Torvalds
20855e4cb3 Merge tag 'xfs-5.19-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
 "This fixes some stalling problems and corrects the last of the
  problems (I hope) observed during testing of the new atomic xattr
  update feature.

   - Fix statfs blocking on background inode gc workers

   - Fix some broken inode lock assertion code

   - Fix xattr leaf buffer leaks when cancelling a deferred xattr update
     operation

   - Clean up xattr recovery to make it easier to understand.

   - Fix xattr leaf block verifiers tripping over empty blocks.

   - Remove complicated and error prone xattr leaf block bholding mess.

   - Fix a bug where an rt extent crossing EOF was treated as "posteof"
     blocks and cleaned unnecessarily.

   - Fix a UAF when log shutdown races with unmount"

* tag 'xfs-5.19-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: prevent a UAF when log IO errors race with unmount
  xfs: dont treat rt extents beyond EOF as eofblocks to be cleared
  xfs: don't hold xattr leaf buffers across transaction rolls
  xfs: empty xattr leaf header blocks are not corruption
  xfs: clean up the end of xfs_attri_item_recover
  xfs: always free xattri_leaf_bp when cancelling a deferred op
  xfs: use invalidate_lock to check the state of mmap_lock
  xfs: factor out the common lock flags assert
  xfs: introduce xfs_inodegc_push()
  xfs: bound maximum wait time for inodegc work
2022-07-03 09:42:17 -07:00
Linus Torvalds
69cb6c6556 Merge tag 'nfsd-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
 "Notable regression fixes:

   - Fix NFSD crash during NFSv4.2 READ_PLUS operation

   - Fix incorrect status code returned by COMMIT operation"

* tag 'nfsd-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Fix READ_PLUS crasher
  NFSD: restore EINVAL error translation in nfsd_commit()
2022-07-02 11:20:56 -07:00
Linus Torvalds
34074da542 Merge tag 'for-5.19/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
 "Two important fixes for bugs in code which was added in 5.18:

   - Fix userspace signal failures on 32-bit kernel due to a bug in vDSO

   - Fix 32-bit load-word unalignment exception handler which returned
     wrong values"

* tag 'for-5.19/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix vDSO signal breakage on 32-bit kernel
  parisc/unaligned: Fix emulate_ldw() breakage
2022-07-02 10:23:36 -07:00
Helge Deller
aa78fa905b parisc: Fix vDSO signal breakage on 32-bit kernel
Addition of vDSO support for parisc in kernel v5.18 suddenly broke glibc
signal testcases on a 32-bit kernel.

The trampoline code (sigtramp.S) which is mapped into userspace includes
an offset to the context data on the stack, which is used by gdb and
glibc to get access to registers.

In a 32-bit kernel we used by mistake the offset into the compat context
(which is valid on a 64-bit kernel only) instead of the offset into the
"native" 32-bit context.

Reported-by: John David Anglin <dave.anglin@bell.net>
Tested-by: John David Anglin <dave.anglin@bell.net>
Fixes: 	df24e1783e ("parisc: Add vDSO support")
CC: stable@vger.kernel.org # 5.18
Signed-off-by: Helge Deller <deller@gmx.de>
2022-07-02 18:36:58 +02:00
Linus Torvalds
bb7c512687 Merge tag 'perf-tools-fixes-for-v5.19-2022-07-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - BPF program info linear (BPIL) data is accessed assuming 64-bit
   alignment resulting in undefined behavior as the data is just byte
   aligned. Fix it, Found using -fsanitize=undefined.

 - Fix 'perf offcpu' build on old kernels wrt task_struct's
   state/__state field.

 - Fix perf_event_attr.sample_type setting on the 'offcpu-time' event
   synthesized by the 'perf offcpu' tool.

 - Don't bail out when synthesizing PERF_RECORD_ events for pre-existing
   threads when one goes away while parsing its procfs entries.

 - Don't sort the task scan result from /proc, its not needed and
   introduces bugs when the main thread isn't the first one to be
   processed.

 - Fix uninitialized 'offset' variable on aarch64 in the unwind code.

 - Sync KVM headers with the kernel sources.

* tag 'perf-tools-fixes-for-v5.19-2022-07-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf synthetic-events: Ignore dead threads during event synthesis
  perf synthetic-events: Don't sort the task scan result from /proc
  perf unwind: Fix unitialized 'offset' variable on aarch64
  tools headers UAPI: Sync linux/kvm.h with the kernel sources
  perf bpf: 8 byte align bpil data
  tools kvm headers arm64: Update KVM headers from the kernel sources
  perf offcpu: Accept allowed sample types only
  perf offcpu: Fix build failure on old kernels
2022-07-02 09:28:36 -07:00
Linus Torvalds
5411de0733 Merge tag 'powerpc-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - Fix BPF uapi confusion about the correct type of bpf_user_pt_regs_t.

 - Fix virt_addr_valid() when memory is hotplugged above the boot-time
   high_memory value.

 - Fix a bug in 64-bit Book3E map_kernel_page() which would incorrectly
   allocate a PMD page at PUD level.

 - Fix a couple of minor issues found since we enabled KASAN for 64-bit
   Book3S.

Thanks to Aneesh Kumar K.V, Cédric Le Goater, Christophe Leroy, Kefeng
Wang, Liam Howlett, Nathan Lynch, and Naveen N. Rao.

* tag 'powerpc-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/memhotplug: Add add_pages override for PPC
  powerpc/bpf: Fix use of user_pt_regs in uapi
  powerpc/prom_init: Fix kernel config grep
  powerpc/book3e: Fix PUD allocation size in map_kernel_page()
  powerpc/xive/spapr: correct bitmap allocation size
2022-07-02 09:11:44 -07:00
Namhyung Kim
ff898552fb perf synthetic-events: Ignore dead threads during event synthesis
When it synthesize various task events, it scans the list of task
first and then accesses later.  There's a window threads can die
between the two and proc entries may not be available.

Instead of bailing out, we can ignore that thread and move on.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220701205458.985106-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-02 09:22:26 -03:00
Namhyung Kim
363afa3aef perf synthetic-events: Don't sort the task scan result from /proc
It should not sort the result as procfs already returns a proper
ordering of tasks.  Actually sorting the order caused problems that it
doesn't guararantee to process the main thread first.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220701205458.985106-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-02 09:21:41 -03:00
Ivan Babrou
5eb502b2e1 perf unwind: Fix unitialized 'offset' variable on aarch64
Commit dc2cf4ca86 ("perf unwind: Fix segbase for ld.lld linked
objects") uncovered the following issue on aarch64:

    util/unwind-libunwind-local.c: In function 'find_proc_info':
    util/unwind-libunwind-local.c:386:28: error: 'offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    386 |                         if (ofs > 0) {
        |                            ^
    util/unwind-libunwind-local.c:199:22: note: 'offset' was declared here
    199 |         u64 address, offset;
        |                      ^~~~~~
    util/unwind-libunwind-local.c:371:20: error: 'offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    371 |                 if (ofs <= 0) {
        |                    ^
    util/unwind-libunwind-local.c:199:22: note: 'offset' was declared here
    199 |         u64 address, offset;
        |                      ^~~~~~
    util/unwind-libunwind-local.c:363:20: error: 'offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    363 |                 if (ofs <= 0) {
        |                    ^
    util/unwind-libunwind-local.c:199:22: note: 'offset' was declared here
    199 |         u64 address, offset;
        |                      ^~~~~~
    In file included from util/libunwind/arm64.c:37:

Fixes: dc2cf4ca86 ("perf unwind: Fix segbase for ld.lld linked objects")
Signed-off-by: Ivan Babrou <ivan@cloudflare.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: kernel-team@cloudflare.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220701182046.12589-1-ivan@cloudflare.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-02 09:16:52 -03:00
Linus Torvalds
0898660614 Merge tag 'libnvdimm-fixes-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fix from Vishal Verma:

 - Fix a bug in the libnvdimm 'BTT' (Block Translation Table) driver
   where accounting for poison blocks to be cleared was off by one,
   causing a failure to clear the the last badblock in an nvdimm region.

* tag 'libnvdimm-fixes-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm: Fix badblocks clear off-by-one error
2022-07-01 16:58:19 -07:00
Linus Torvalds
1ce8c443e9 Merge tag 'thermal-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
 "Add a new CPU ID to the list of supported processors in the
  intel_tcc_cooling driver (Sumeet Pawnikar)"

* tag 'thermal-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: intel_tcc_cooling: Add TCC cooling support for RaptorLake
2022-07-01 13:00:47 -07:00
Linus Torvalds
9ee7827668 Merge tag 'pm-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix some issues in cpufreq drivers and some issues in devfreq:

   - Fix error code path issues related PROBE_DEFER handling in devfreq
     (Christian Marangi)

   - Revert an editing accident in SPDX-License line in the devfreq
     passive governor (Lukas Bulwahn)

   - Fix refcount leak in of_get_devfreq_events() in the exynos-ppmu
     devfreq driver (Miaoqian Lin)

   - Use HZ_PER_KHZ macro in the passive devfreq governor (Yicong Yang)

   - Fix missing of_node_put for qoriq and pmac32 driver (Liang He)

   - Fix issues around throttle interrupt for qcom driver (Stephen Boyd)

   - Add MT8186 to cpufreq-dt-platdev blocklist (AngeloGioacchino Del
     Regno)

   - Make amd-pstate enable CPPC on resume from S3 (Jinzhou Su)"

* tag 'pm-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / devfreq: passive: revert an editing accident in SPDX-License line
  PM / devfreq: Fix kernel warning with cpufreq passive register fail
  PM / devfreq: Rework freq_table to be local to devfreq struct
  PM / devfreq: exynos-ppmu: Fix refcount leak in of_get_devfreq_events
  PM / devfreq: passive: Use HZ_PER_KHZ macro in units.h
  PM / devfreq: Fix cpufreq passive unregister erroring on PROBE_DEFER
  PM / devfreq: Mute warning on governor PROBE_DEFER
  PM / devfreq: Fix kernel panic with cpu based scaling to passive gov
  cpufreq: Add MT8186 to cpufreq-dt-platdev blocklist
  cpufreq: pmac32-cpufreq: Fix refcount leak bug
  cpufreq: qcom-hw: Don't do lmh things without a throttle interrupt
  drivers: cpufreq: Add missing of_node_put() in qoriq-cpufreq.c
  cpufreq: amd-pstate: Add resume and suspend callbacks
2022-07-01 12:55:28 -07:00
Rafael J. Wysocki
bc621588ff Merge branch 'pm-cpufreq'
Merge cpufreq fixes for 5.19-rc5, including ARM cpufreq fixes and the
following one:

 - Make amd-pstate enable CPPC on resume from S3 (Jinzhou Su).

* pm-cpufreq:
  cpufreq: Add MT8186 to cpufreq-dt-platdev blocklist
  cpufreq: pmac32-cpufreq: Fix refcount leak bug
  cpufreq: qcom-hw: Don't do lmh things without a throttle interrupt
  drivers: cpufreq: Add missing of_node_put() in qoriq-cpufreq.c
  cpufreq: amd-pstate: Add resume and suspend callbacks
2022-07-01 21:43:08 +02:00
Linus Torvalds
b336ad598a Merge tag 'hwmon-for-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:

 - Fix error handling in ibmaem driver initialization

 - Fix bad data reported by occ driver after setting power cap

 - Fix typos in pmbus/ucd9200 driver comments

* tag 'hwmon-for-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (ibmaem) don't call platform_device_del() if platform_device_add() fails
  hwmon: (pmbus/ucd9200) fix typos in comments
  hwmon: (occ) Prevent power cap command overwriting poll response
2022-07-01 12:05:27 -07:00
Yang Yingliang
d0e51022a0 hwmon: (ibmaem) don't call platform_device_del() if platform_device_add() fails
If platform_device_add() fails, it no need to call platform_device_del(), split
platform_device_unregister() into platform_device_del/put(), so platform_device_put()
can be called separately.

Fixes: 8808a793f0 ("ibmaem: new driver for power/energy/temp meters in IBM System X hardware")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220701074153.4021556-1-yangyingliang@huawei.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-07-01 11:53:29 -07:00
Linus Torvalds
d0f67adb79 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
 "Restore TLB invalidation for the 'break-before-make' rule on
  contiguous ptes (missed in a recent clean-up)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: hugetlb: Restore TLB invalidation for BBM on contiguous ptes
2022-07-01 11:23:21 -07:00
Linus Torvalds
cec84e7547 Merge tag 's390-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:

 - Fix purgatory build process so bin2c tool does not get built
   unnecessarily and the Makefile is more consistent with other
   architectures.

 - Return earlier simple design of arch_get_random_seed_long|int() and
   arch_get_random_long|int() callbacks as result of changes in generic
   RNG code.

 - Fix minor comment typos and spelling mistakes.

* tag 's390-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/qdio: Fix spelling mistake
  s390/sclp: Fix typo in comments
  s390/archrandom: simplify back to earlier design and initialize earlier
  s390/purgatory: remove duplicated build rule of kexec-purgatory.o
  s390/purgatory: hard-code obj-y in Makefile
  s390: remove unneeded 'select BUILD_BIN2C'
2022-07-01 11:19:14 -07:00
Linus Torvalds
76ff294e16 Merge tag 'nfs-for-5.19-3' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:

 - Allocate a fattr for _nfs4_discover_trunking()

 - Fix module reference count leak in nfs4_run_state_manager()

* tag 'nfs-for-5.19-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFSv4: Add an fattr allocation to _nfs4_discover_trunking()
  NFS: restore module put when manager exits.
2022-07-01 11:11:32 -07:00
Linus Torvalds
6f8693ea2b Merge tag 'ceph-for-5.19-rc5' of https://github.com/ceph/ceph-client
Pull ceph fix from Ilya Dryomov:
 "A ceph filesystem fix, marked for stable.

  There appears to be a deeper issue on the MDS side, but for now we are
  going with this one-liner to avoid busy looping and potential soft
  lockups"

* tag 'ceph-for-5.19-rc5' of https://github.com/ceph/ceph-client:
  ceph: wait on async create before checking caps for syncfs
2022-07-01 11:06:21 -07:00
Linus Torvalds
8300d38030 Merge tag 'for-5.19/dm-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
 "Three fixes for invalid memory accesses discovered by using KASAN
  while running the lvm2 testsuite's dm-raid tests. Includes changes to
  MD's raid5.c given the dependency dm-raid has on the MD code"

* tag 'for-5.19/dm-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm raid: fix KASAN warning in raid5_add_disks
  dm raid: fix KASAN warning in raid5_remove_disk
  dm raid: fix accesses beyond end of raid member array
2022-07-01 10:58:39 -07:00
Linus Torvalds
0a35d1622d Merge tag 'io_uring-5.19-2022-07-01' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "Two minor tweaks:

   - While we still can, adjust the send/recv based flags to be in
     ->ioprio rather than in ->addr2. This is consistent with eg accept,
     and also doesn't waste a full 64-bit field for flags (Pavel)

   - 5.18-stable fix for re-importing provided buffers. Not much real
     world relevance here as it'll only impact non-pollable files gone
     async, which is more of a practical test case rather than something
     that is used in the wild (Dylan)"

* tag 'io_uring-5.19-2022-07-01' of git://git.kernel.dk/linux-block:
  io_uring: fix provided buffer import
  io_uring: keep sendrecv flags in ioprio
2022-07-01 10:52:01 -07:00
Linus Torvalds
d516e221e2 Merge tag 'block-5.19-2022-07-01' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Fix for batch getting of tags in sbitmap (wuchi)

 - NVMe pull request via Christoph:
      - More quirks (Lamarque Vieira Souza, Pablo Greco)
      - Fix a fabrics disconnect regression (Ruozhu Li)
      - Fix a nvmet-tcp data_digest calculation regression (Sagi
        Grimberg)
      - Fix nvme-tcp send failure handling (Sagi Grimberg)
      - Fix a regression with nvmet-loop and passthrough controllers
        (Alan Adamson)

* tag 'block-5.19-2022-07-01' of git://git.kernel.dk/linux-block:
  nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA IM2P33F8ABR1
  nvmet: add a clear_ids attribute for passthru targets
  nvme: fix regression when disconnect a recovering ctrl
  nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG SX6000LNP (AKA SPECTRIX S40G)
  nvme-tcp: always fail a request when sending it failed
  nvmet-tcp: fix regression in data_digest calculation
  lib/sbitmap: Fix invalid loop in __sbitmap_queue_get_batch()
2022-07-01 10:42:10 -07:00
Linus Torvalds
067c227379 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
 "One simple driver fix for a dma overrun"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: hisi_sas: Limit max hw sectors for v3 HW
2022-07-01 10:38:17 -07:00
Linus Torvalds
690685ffcd Merge tag 'ata-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ATA fix from Damien Le Moal:

 - Fix a compilation warning with some versions of gcc/sparse when
   compiling the pata_cs5535 driver, from John.

* tag 'ata-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: pata_cs5535: Fix W=1 warnings
2022-07-01 10:31:44 -07:00
Will Deacon
4109823037 arm64: hugetlb: Restore TLB invalidation for BBM on contiguous ptes
Commit fb396bb459 ("arm64/hugetlb: Drop TLB flush from get_clear_flush()")
removed TLB invalidation from get_clear_flush() [now get_clear_contig()]
on the basis that the core TLB invalidation code is aware of hugetlb
mappings backed by contiguous page-table entries and will cover the
correct virtual address range.

However, this change also resulted in the TLB invalidation being removed
from the "break" step in the break-before-make (BBM) sequence used
internally by huge_ptep_set_{access_flags,wrprotect}(), therefore
making the BBM sequence unsafe irrespective of later invalidation.

Although the architecture is desperately unclear about how exactly
contiguous ptes should be updated in a live page-table, restore TLB
invalidation to our BBM sequence under the assumption that BBM is the
right thing to be doing in the first place.

Fixes: fb396bb459 ("arm64/hugetlb: Drop TLB flush from get_clear_flush()")
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220629095349.25748-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-07-01 18:29:26 +01:00
Linus Torvalds
9650910d05 Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "Two small fixes

   - Initialize a spinlock in the stm32 reset code

   - Add dt bindings to the clk maintainer filepattern"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  MAINTAINERS: add include/dt-bindings/clock to COMMON CLK FRAMEWORK
  clk: stm32: rcc_reset: Fix missing spin_lock_init()
2022-07-01 10:01:32 -07:00
Darrick J. Wong
7561cea5db xfs: prevent a UAF when log IO errors race with unmount
KASAN reported the following use after free bug when running
generic/475:

 XFS (dm-0): Mounting V5 Filesystem
 XFS (dm-0): Starting recovery (logdev: internal)
 XFS (dm-0): Ending recovery (logdev: internal)
 Buffer I/O error on dev dm-0, logical block 20639616, async page read
 Buffer I/O error on dev dm-0, logical block 20639617, async page read
 XFS (dm-0): log I/O error -5
 XFS (dm-0): Filesystem has been shut down due to log error (0x2).
 XFS (dm-0): Unmounting Filesystem
 XFS (dm-0): Please unmount the filesystem and rectify the problem(s).
 ==================================================================
 BUG: KASAN: use-after-free in do_raw_spin_lock+0x246/0x270
 Read of size 4 at addr ffff888109dd84c4 by task 3:1H/136

 CPU: 3 PID: 136 Comm: 3:1H Not tainted 5.19.0-rc4-xfsx #rc4 8e53ab5ad0fddeb31cee5e7063ff9c361915a9c4
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
 Workqueue: xfs-log/dm-0 xlog_ioend_work [xfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x34/0x44
  print_report.cold+0x2b8/0x661
  ? do_raw_spin_lock+0x246/0x270
  kasan_report+0xab/0x120
  ? do_raw_spin_lock+0x246/0x270
  do_raw_spin_lock+0x246/0x270
  ? rwlock_bug.part.0+0x90/0x90
  xlog_force_shutdown+0xf6/0x370 [xfs 4ad76ae0d6add7e8183a553e624c31e9ed567318]
  xlog_ioend_work+0x100/0x190 [xfs 4ad76ae0d6add7e8183a553e624c31e9ed567318]
  process_one_work+0x672/0x1040
  worker_thread+0x59b/0xec0
  ? __kthread_parkme+0xc6/0x1f0
  ? process_one_work+0x1040/0x1040
  ? process_one_work+0x1040/0x1040
  kthread+0x29e/0x340
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>

 Allocated by task 154099:
  kasan_save_stack+0x1e/0x40
  __kasan_kmalloc+0x81/0xa0
  kmem_alloc+0x8d/0x2e0 [xfs]
  xlog_cil_init+0x1f/0x540 [xfs]
  xlog_alloc_log+0xd1e/0x1260 [xfs]
  xfs_log_mount+0xba/0x640 [xfs]
  xfs_mountfs+0xf2b/0x1d00 [xfs]
  xfs_fs_fill_super+0x10af/0x1910 [xfs]
  get_tree_bdev+0x383/0x670
  vfs_get_tree+0x7d/0x240
  path_mount+0xdb7/0x1890
  __x64_sys_mount+0x1fa/0x270
  do_syscall_64+0x2b/0x80
  entry_SYSCALL_64_after_hwframe+0x46/0xb0

 Freed by task 154151:
  kasan_save_stack+0x1e/0x40
  kasan_set_track+0x21/0x30
  kasan_set_free_info+0x20/0x30
  ____kasan_slab_free+0x110/0x190
  slab_free_freelist_hook+0xab/0x180
  kfree+0xbc/0x310
  xlog_dealloc_log+0x1b/0x2b0 [xfs]
  xfs_unmountfs+0x119/0x200 [xfs]
  xfs_fs_put_super+0x6e/0x2e0 [xfs]
  generic_shutdown_super+0x12b/0x3a0
  kill_block_super+0x95/0xd0
  deactivate_locked_super+0x80/0x130
  cleanup_mnt+0x329/0x4d0
  task_work_run+0xc5/0x160
  exit_to_user_mode_prepare+0xd4/0xe0
  syscall_exit_to_user_mode+0x1d/0x40
  entry_SYSCALL_64_after_hwframe+0x46/0xb0

This appears to be a race between the unmount process, which frees the
CIL and waits for in-flight iclog IO; and the iclog IO completion.  When
generic/475 runs, it starts fsstress in the background, waits a few
seconds, and substitutes a dm-error device to simulate a disk falling
out of a machine.  If the fsstress encounters EIO on a pure data write,
it will exit but the filesystem will still be online.

The next thing the test does is unmount the filesystem, which tries to
clean the log, free the CIL, and wait for iclog IO completion.  If an
iclog was being written when the dm-error switch occurred, it can race
with log unmounting as follows:

Thread 1				Thread 2

					xfs_log_unmount
					xfs_log_clean
					xfs_log_quiesce
xlog_ioend_work
<observe error>
xlog_force_shutdown
test_and_set_bit(XLOG_IOERROR)
					xfs_log_force
					<log is shut down, nop>
					xfs_log_umount_write
					<log is shut down, nop>
					xlog_dealloc_log
					xlog_cil_destroy
					<wait for iclogs>
spin_lock(&log->l_cilp->xc_push_lock)
<KABOOM>

Therefore, free the CIL after waiting for the iclogs to complete.  I
/think/ this race has existed for quite a few years now, though I don't
remember the ~2014 era logging code well enough to know if it was a real
threat then or if the actual race was exposed only more recently.

Fixes: ac983517ec ("xfs: don't sleep in xlog_cil_force_lsn on shutdown")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-07-01 09:09:52 -07:00
Linus Torvalds
a175eca0f3 Merge tag 'drm-fixes-2022-07-01' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Bit quieter this week, the main thing is it pulls in the fixes for the
  sysfb resource issue you were seeing. these had been queued for next
  so should have had some decent testing.

  Otherwise amdgpu, i915 and msm each have a few fixes, and vc4 has one.

  fbdev:
   - sysfb fixes/conflicting fb fixes

  amdgpu:
   - GPU recovery fix

   - Fix integer type usage in fourcc header for AMD modifiers

   - KFD TLB flush fix for gfx9 APUs

   - Display fix

  i915:
   - Fix ioctl argument error return

   - Fix d3cold disable to allow PCI upstream bridge D3 transition

   - Fix setting cache_dirty for dma-buf objects on discrete

  msm:
   - Fix to increment vsync_cnt before calling drm_crtc_handle_vblank so
     that userspace sees the value *after* it is incremented if waiting
     for vblank events

   - Fix to reset drm_dev to NULL in dp_display_unbind to avoid a crash
     in probe/bind error paths

   - Fix to resolve the smatch error of de-referencing before NULL check
     in dpu_encoder_phys_wb.c

   - Fix error return to userspace if fence-id allocation fails in
     submit ioctl

  vc4:
   - NULL ptr dereference fix"

* tag 'drm-fixes-2022-07-01' of git://anongit.freedesktop.org/drm/drm:
  Revert "drm/amdgpu/display: set vblank_disable_immediate for DC"
  drm/amdgpu: To flush tlb for MMHUB of RAVEN series
  drm/fourcc: fix integer type usage in uapi header
  drm/amdgpu: fix adev variable used in amdgpu_device_gpu_recover()
  fbdev: Disable sysfb device registration when removing conflicting FBs
  firmware: sysfb: Add sysfb_disable() helper function
  firmware: sysfb: Make sysfb_create_simplefb() return a pdev pointer
  drm/msm/gem: Fix error return on fence id alloc fail
  drm/i915: tweak the ordering in cpu_write_needs_clflush
  drm/i915/dgfx: Disable d3cold at gfx root port
  drm/i915/gem: add missing else
  drm/vc4: perfmon: Fix variable dereferenced before check
  drm/msm/dpu: Fix variable dereferenced before check
  drm/msm/dp: reset drm_dev to NULL at dp_display_unbind()
  drm/msm/dpu: Increment vsync_cnt before waking up userspace
2022-06-30 17:19:19 -07:00
Dave Airlie
b8f0009bc9 Merge tag 'drm-misc-fixes-2022-06-30' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A NULL pointer dereference fix for vc4, and 3 patches to improve the
sysfb device behaviour when removing conflicting framebuffers

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220630072404.2fa4z3nk5h5q34ci@houat
2022-07-01 09:27:55 +10:00
Linus Torvalds
5e8379351d Merge tag 'net-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter.

  Current release - new code bugs:

   - clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()

   - mptcp:
      - invoke MP_FAIL response only when needed
      - fix shutdown vs fallback race
      - consistent map handling on failure

   - octeon_ep: use bitwise AND

  Previous releases - regressions:

   - tipc: move bc link creation back to tipc_node_create, fix NPD

  Previous releases - always broken:

   - tcp: add a missing nf_reset_ct() in 3WHS handling to prevent socket
     buffered skbs from keeping refcount on the conntrack module

   - ipv6: take care of disable_policy when restoring routes

   - tun: make sure to always disable and unlink NAPI instances

   - phy: don't trigger state machine while in suspend

   - netfilter: nf_tables: avoid skb access on nf_stolen

   - asix: fix "can't send until first packet is send" issue

   - usb: asix: do not force pause frames support

   - nxp-nci: don't issue a zero length i2c_master_read()

  Misc:

   - ncsi: allow use of proper "mellanox" DT vendor prefix

   - act_api: add a message for user space if any actions were already
     flushed before the error was hit"

* tag 'net-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits)
  net: dsa: felix: fix race between reading PSFP stats and port stats
  selftest: tun: add test for NAPI dismantle
  net: tun: avoid disabling NAPI twice
  net: sparx5: mdb add/del handle non-sparx5 devices
  net: sfp: fix memory leak in sfp_probe()
  mlxsw: spectrum_router: Fix rollback in tunnel next hop init
  net: rose: fix UAF bugs caused by timer handler
  net: usb: ax88179_178a: Fix packet receiving
  net: bonding: fix use-after-free after 802.3ad slave unbind
  ipv6: fix lockdep splat in in6_dump_addrs()
  net: phy: ax88772a: fix lost pause advertisement configuration
  net: phy: Don't trigger state machine while in suspend
  usbnet: fix memory allocation in helpers
  selftests net: fix kselftest net fatal error
  NFC: nxp-nci: don't print header length mismatch on i2c error
  NFC: nxp-nci: Don't issue a zero length i2c_master_read()
  net: tipc: fix possible refcount leak in tipc_sk_create()
  nfc: nfcmrvl: Fix irq_of_parse_and_map() return value
  net: ipv6: unexport __init-annotated seg6_hmac_net_init()
  ipv6/sit: fix ipip6_tunnel_get_prl return value
  ...
2022-06-30 15:26:55 -07:00
Amir Goldstein
868f9f2f8e vfs: fix copy_file_range() regression in cross-fs copies
A regression has been reported by Nicolas Boichat, found while using the
copy_file_range syscall to copy a tracefs file.

Before commit 5dae222a5f ("vfs: allow copy_file_range to copy across
devices") the kernel would return -EXDEV to userspace when trying to
copy a file across different filesystems.  After this commit, the
syscall doesn't fail anymore and instead returns zero (zero bytes
copied), as this file's content is generated on-the-fly and thus reports
a size of zero.

Another regression has been reported by He Zhe - the assertion of
WARN_ON_ONCE(ret == -EOPNOTSUPP) can be triggered from userspace when
copying from a sysfs file whose read operation may return -EOPNOTSUPP.

Since we do not have test coverage for copy_file_range() between any two
types of filesystems, the best way to avoid these sort of issues in the
future is for the kernel to be more picky about filesystems that are
allowed to do copy_file_range().

This patch restores some cross-filesystem copy restrictions that existed
prior to commit 5dae222a5f ("vfs: allow copy_file_range to copy across
devices"), namely, cross-sb copy is not allowed for filesystems that do
not implement ->copy_file_range().

Filesystems that do implement ->copy_file_range() have full control of
the result - if this method returns an error, the error is returned to
the user.  Before this change this was only true for fs that did not
implement the ->remap_file_range() operation (i.e.  nfsv3).

Filesystems that do not implement ->copy_file_range() still fall-back to
the generic_copy_file_range() implementation when the copy is within the
same sb.  This helps the kernel can maintain a more consistent story
about which filesystems support copy_file_range().

nfsd and ksmbd servers are modified to fall-back to the
generic_copy_file_range() implementation in case vfs_copy_file_range()
fails with -EOPNOTSUPP or -EXDEV, which preserves behavior of
server-side-copy.

fall-back to generic_copy_file_range() is not implemented for the smb
operation FSCTL_DUPLICATE_EXTENTS_TO_FILE, which is arguably a correct
change of behavior.

Fixes: 5dae222a5f ("vfs: allow copy_file_range to copy across devices")
Link: https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chromium.org/
Link: https://lore.kernel.org/linux-fsdevel/CANMq1KDZuxir2LM5jOTm0xx+BnvW=ZmpsG47CyHFJwnw7zSX6Q@mail.gmail.com/
Link: https://lore.kernel.org/linux-fsdevel/20210126135012.1.If45b7cdc3ff707bc1efa17f5366057d60603c45f@changeid/
Link: https://lore.kernel.org/linux-fsdevel/20210630161320.29006-1-lhenriques@suse.de/
Reported-by: Nicolas Boichat <drinkcat@chromium.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Luis Henriques <lhenriques@suse.de>
Fixes: 64bf5ff58d ("vfs: no fallback for ->copy_file_range")
Link: https://lore.kernel.org/linux-fsdevel/20f17f64-88cb-4e80-07c1-85cb96c83619@windriver.com/
Reported-by: He Zhe <zhe.he@windriver.com>
Tested-by: Namjae Jeon <linkinjeon@kernel.org>
Tested-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-30 15:16:38 -07:00
Chuck Lever
a23dd544de SUNRPC: Fix READ_PLUS crasher
Looks like there are still cases when "space_left - frag1bytes" can
legitimately exceed PAGE_SIZE. Ensure that xdr->end always remains
within the current encode buffer.

Reported-by: Bruce Fields <bfields@fieldses.org>
Reported-by: Zorro Lang <zlang@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216151
Fixes: 6c254bf3b6 ("SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-06-30 17:41:08 -04:00
Scott Mayhew
4f40a5b554 NFSv4: Add an fattr allocation to _nfs4_discover_trunking()
This was missed in c3ed222745 ("NFSv4: Fix free of uninitialized
nfs4_label on referral lookup.") and causes a panic when mounting
with '-o trunkdiscovery':

PID: 1604   TASK: ffff93dac3520000  CPU: 3   COMMAND: "mount.nfs"
 #0 [ffffb79140f738f8] machine_kexec at ffffffffaec64bee
 #1 [ffffb79140f73950] __crash_kexec at ffffffffaeda67fd
 #2 [ffffb79140f73a18] crash_kexec at ffffffffaeda76ed
 #3 [ffffb79140f73a30] oops_end at ffffffffaec2658d
 #4 [ffffb79140f73a50] general_protection at ffffffffaf60111e
    [exception RIP: nfs_fattr_init+0x5]
    RIP: ffffffffc0c18265  RSP: ffffb79140f73b08  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff93dac304a800  RCX: 0000000000000000
    RDX: ffffb79140f73bb0  RSI: ffff93dadc8cbb40  RDI: d03ee11cfaf6bd50
    RBP: ffffb79140f73be8   R8: ffffffffc0691560   R9: 0000000000000006
    R10: ffff93db3ffd3df8  R11: 0000000000000000  R12: ffff93dac4040000
    R13: ffff93dac2848e00  R14: ffffb79140f73b60  R15: ffffb79140f73b30
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #5 [ffffb79140f73b08] _nfs41_proc_get_locations at ffffffffc0c73d53 [nfsv4]
 #6 [ffffb79140f73bf0] nfs4_proc_get_locations at ffffffffc0c83e90 [nfsv4]
 #7 [ffffb79140f73c60] nfs4_discover_trunking at ffffffffc0c83fb7 [nfsv4]
 #8 [ffffb79140f73cd8] nfs_probe_fsinfo at ffffffffc0c0f95f [nfs]
 #9 [ffffb79140f73da0] nfs_probe_server at ffffffffc0c1026a [nfs]
    RIP: 00007f6254fce26e  RSP: 00007ffc69496ac8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000000000  RCX: 00007f6254fce26e
    RDX: 00005600220a82a0  RSI: 00005600220a64d0  RDI: 00005600220a6520
    RBP: 00007ffc69496c50   R8: 00005600220a8710   R9: 003035322e323231
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007ffc69496c50
    R13: 00005600220a8440  R14: 0000000000000010  R15: 0000560020650ef9
    ORIG_RAX: 00000000000000a5  CS: 0033  SS: 002b

Fixes: c3ed222745 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-30 16:13:00 -04:00
NeilBrown
080abad71e NFS: restore module put when manager exits.
Commit f49169c97f ("NFSD: Remove svc_serv_ops::svo_module") removed
calls to module_put_and_kthread_exit() from threads that acted as SUNRPC
servers and had a related svc_serv_ops structure.  This was correct.

It ALSO removed the module_put_and_kthread_exit() call from
nfs4_run_state_manager() which is NOT a SUNRPC service.

Consequently every time the NFSv4 state manager runs the module count
increments and won't be decremented.  So the nfsv4 module cannot be
unloaded.

So restore the module_put_and_kthread_exit() call.

Fixes: f49169c97f ("NFSD: Remove svc_serv_ops::svo_module")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-30 16:13:00 -04:00
Jens Axboe
f3163d8567 Merge tag 'nvme-5.19-2022-06-30' of git://git.infradead.org/nvme into block-5.19
Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.19

 - more quirks (Lamarque Vieira Souza, Pablo Greco)
 - fix a fabrics disconnect regression (Ruozhu Li)
 - fix a nvmet-tcp data_digest calculation regression (Sagi Grimberg)
 - fix nvme-tcp send failure handling (Sagi Grimberg)
 - fix a regression with nvmet-loop and passthrough controllers
   (Alan Adamson)"

* tag 'nvme-5.19-2022-06-30' of git://git.infradead.org/nvme:
  nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA IM2P33F8ABR1
  nvmet: add a clear_ids attribute for passthru targets
  nvme: fix regression when disconnect a recovering ctrl
  nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG SX6000LNP (AKA SPECTRIX S40G)
  nvme-tcp: always fail a request when sending it failed
  nvmet-tcp: fix regression in data_digest calculation
2022-06-30 14:00:11 -06:00
Vladimir Oltean
58bf4db695 net: dsa: felix: fix race between reading PSFP stats and port stats
Both PSFP stats and the port stats read by ocelot_check_stats_work() are
indirectly read through the same mechanism - write to STAT_CFG:STAT_VIEW,
read from SYS:STAT:CNT[n].

It's just that for port stats, we write STAT_VIEW with the index of the
port, and for PSFP stats, we write STAT_VIEW with the filter index.

So if we allow them to run concurrently, ocelot_check_stats_work() may
change the view from vsc9959_psfp_counters_get(), and vice versa.

Fixes: 7d4b564d6a ("net: dsa: felix: support psfp filter on vsc9959")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220629183007.3808130-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-30 11:37:09 -07:00
Jakub Kicinski
839b92fede selftest: tun: add test for NAPI dismantle
Being lazy does not pay, add the test for various
ordering of tun queue close / detach / destroy.

Link: https://lore.kernel.org/r/20220629181911.372047-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-30 11:34:10 -07:00
Jakub Kicinski
ff1fa2081d net: tun: avoid disabling NAPI twice
Eric reports that syzbot made short work out of my speculative
fix. Indeed when queue gets detached its tfile->tun remains,
so we would try to stop NAPI twice with a detach(), close()
sequence.

Alternative fix would be to move tun_napi_disable() to
tun_detach_all() and let the NAPI run after the queue
has been detached.

Fixes: a8fc8cb569 ("net: tun: stop NAPI when detaching queues")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220629181911.372047-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-30 11:34:10 -07:00
Casper Andersson
9c5de246c1 net: sparx5: mdb add/del handle non-sparx5 devices
When adding/deleting mdb entries on other net_devices, eg., tap
interfaces, it should not crash.

Fixes: 3bacfccdcb ("net: sparx5: Add mdb handlers")
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Link: https://lore.kernel.org/r/20220630122226.316812-1-casper.casan@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-30 11:32:54 -07:00
Sumeet Pawnikar
62f46fc7b8 thermal: intel_tcc_cooling: Add TCC cooling support for RaptorLake
Add RaptorLake to the list of processor models supported by the Intel
TCC cooling driver.

Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
[ rjw: Subject edits, new changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-30 19:48:44 +02:00
Zhang Jiaming
d7d488f41b s390/qdio: Fix spelling mistake
Change 'defineable' to 'definable'.
Change 'paramater' to 'parameter'.

Signed-off-by: Zhang Jiaming <jiaming@nfschina.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Link: https://lore.kernel.org/r/20220623060543.12870-1-jiaming@nfschina.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-30 19:40:36 +02:00
Jiang Jian
d608f45ed3 s390/sclp: Fix typo in comments
Remove the repeated word 'and' from comments

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220622142713.14187-1-jiangjian@cdjrlc.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-30 19:40:36 +02:00
Jason A. Donenfeld
e4f7440030 s390/archrandom: simplify back to earlier design and initialize earlier
s390x appears to present two RNG interfaces:
- a "TRNG" that gathers entropy using some hardware function; and
- a "DRBG" that takes in a seed and expands it.

Previously, the TRNG was wired up to arch_get_random_{long,int}(), but
it was observed that this was being called really frequently, resulting
in high overhead. So it was changed to be wired up to arch_get_random_
seed_{long,int}(), which was a reasonable decision. Later on, the DRBG
was then wired up to arch_get_random_{long,int}(), with a complicated
buffer filling thread, to control overhead and rate.

Fortunately, none of the performance issues matter much now. The RNG
always attempts to use arch_get_random_seed_{long,int}() first, which
means a complicated implementation of arch_get_random_{long,int}() isn't
really valuable or useful to have around. And it's only used when
reseeding, which means it won't hit the high throughput complications
that were faced before.

So this commit returns to an earlier design of just calling the TRNG in
arch_get_random_seed_{long,int}(), and returning false in arch_get_
random_{long,int}().

Part of what makes the simplification possible is that the RNG now seeds
itself using the TRNG at bootup. But this only works if the TRNG is
detected early in boot, before random_init() is called. So this commit
also causes that check to happen in setup_arch().

Cc: stable@vger.kernel.org
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Ingo Franzki <ifranzki@linux.ibm.com>
Cc: Juergen Christ <jchrist@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20220610222023.378448-1-Jason@zx2c4.com
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-30 19:40:36 +02:00
Dylan Yudaken
09007af2b6 io_uring: fix provided buffer import
io_import_iovec uses the s pointer, but this was changed immediately
after the iovec was re-imported and so it was imported into the wrong
place.

Change the ordering.

Fixes: 2be2eb02e2 ("io_uring: ensure reads re-import for selected buffers")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220630132006.2825668-1-dylany@fb.com
[axboe: ensure we don't half-import as well]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-30 11:34:41 -06:00
Linus Torvalds
1a0e93df1e Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "Three minor bug fixes:

   - qedr not setting the QP timeout properly toward userspace

   - Memory leak on error path in ib_cm

   - Divide by 0 in RDMA interrupt moderation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  linux/dim: Fix divide by 0 in RDMA DIM
  RDMA/cm: Fix memory leak in ib_cm_insert_listen
  RDMA/qedr: Fix reporting QP timeout attribute
2022-06-30 10:03:22 -07:00
Linus Torvalds
9fb3bb25d1 Merge tag 'fsnotify_for_v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fanotify fix from Jan Kara:
 "A fix for recently added fanotify API to have stricter checks and
  refuse some invalid flag combinations to make our life easier in the
  future"

* tag 'fsnotify_for_v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: refine the validation checks on non-dir inode mask
2022-06-30 09:57:18 -07:00
Linus Torvalds
f5da5ddf81 Merge tag 'v5.19-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "Fix a regression that breaks the ccp driver"

* tag 'v5.19-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: ccp - Fix device IRQ counting by using platform_irq_count()
2022-06-30 09:45:42 -07:00
Rafael J. Wysocki
589cb2c0b8 Merge tag 'devfreq-fixes-for-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux
Pull devfreq fixes for 5.19-rc5 from Chanwoo Choi:

"1. Fix devfreq passive governor issue when cpufreq policies are not
    ready during kernel boot because some CPUs turn on after kernel
    booting or others.

    - Re-initialize the vairables of struct devfreq_passive_data when
      PROBE_DEFER happens when cpufreq_get() returns NULL.

    - Use dev_err_probe to mute warning when PROBE_DEFER.

    - Fix cpufreq passive unregister erroring on PROBE_DEFER
      by using the allocated parent_cpu_data list to free resouce
      instead of for_each_possible_cpu().

    - Remove duplicate cpufreq passive unregister and warning when
      PROBE_DEFER.

    - Use HZ_PER_KZH macro in units.h.

    - Fix wrong indentation in SPDX-License line.

 2. Fix reference count leak in exynos-ppmu.c by using of_node_put().

 3. Rework freq_table to be local to devfreq struct

    - struct devfreq_dev_profile includes freq_table array to store
      the supported frequencies. If devfreq driver doesn't initialize
      the freq_table, devfreq core allocates the memory and initializes
      the freq_table.

      On a devfreq PROBE_DEFER, the freq_table in the driver profile
      struct is never reset and may be left in an undefined state. To fix
      this and correctly handle PROBE_DEFER, use a local freq_table and
      max_state in the devfreq struct."

* tag 'devfreq-fixes-for-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
  PM / devfreq: passive: revert an editing accident in SPDX-License line
  PM / devfreq: Fix kernel warning with cpufreq passive register fail
  PM / devfreq: Rework freq_table to be local to devfreq struct
  PM / devfreq: exynos-ppmu: Fix refcount leak in of_get_devfreq_events
  PM / devfreq: passive: Use HZ_PER_KHZ macro in units.h
  PM / devfreq: Fix cpufreq passive unregister erroring on PROBE_DEFER
  PM / devfreq: Mute warning on governor PROBE_DEFER
  PM / devfreq: Fix kernel panic with cpu based scaling to passive gov
2022-06-30 15:30:30 +02:00
Pavel Begunkov
29c1ac230e io_uring: keep sendrecv flags in ioprio
We waste a u64 SQE field for flags even though we don't need as many
bits and it can be used for something more useful later. Store io_uring
specific send/recv flags in sqe->ioprio instead of ->addr2.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Fixes: 0455d4ccec ("io_uring: add POLL_FIRST support for send/sendmsg and recv/recvmsg")
[axboe: change comment in io_uring.h as well]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-30 07:15:50 -06:00
Masahiro Yamada
20159e287a s390/purgatory: remove duplicated build rule of kexec-purgatory.o
This is equivalent to the pattern rule in scripts/Makefile.build.

Having the dependency on $(obj)/purgatory.ro is enough.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20220613170902.1775211-3-masahiroy@kernel.org
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-30 14:18:16 +02:00
Masahiro Yamada
b9a56c113f s390/purgatory: hard-code obj-y in Makefile
The purgatory/ directory is entirely guarded in arch/s390/Kbuild.
CONFIG_ARCH_HAS_KEXEC_PURGATORY is bool type.

$(CONFIG_ARCH_HAS_KEXEC_PURGATORY) is always 'y' when Kbuild visits
this Makefile for building.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20220613170902.1775211-2-masahiroy@kernel.org
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-30 14:18:15 +02:00
Masahiro Yamada
25deecb21c s390: remove unneeded 'select BUILD_BIN2C'
Since commit 4c0f032d49 ("s390/purgatory: Omit use of bin2c"),
s390 builds the purgatory without using bin2c.

Remove 'select BUILD_BIN2C' to avoid the unneeded build of bin2c.

Fixes: 4c0f032d49 ("s390/purgatory: Omit use of bin2c")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20220613170902.1775211-1-masahiroy@kernel.org
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-30 14:18:15 +02:00
Jianglei Nie
0a18d802d6 net: sfp: fix memory leak in sfp_probe()
sfp_probe() allocates a memory chunk from sfp with sfp_alloc(). When
devm_add_action() fails, sfp is not freed, which leads to a memory leak.

We should use devm_add_action_or_reset() instead of devm_add_action().

Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220629075550.2152003-1-niejianglei2021@163.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-30 11:38:16 +02:00
Petr Machata
665030fd0c mlxsw: spectrum_router: Fix rollback in tunnel next hop init
In mlxsw_sp_nexthop6_init(), a next hop is always added to the router
linked list, and mlxsw_sp_nexthop_type_init() is invoked afterwards. When
that function results in an error, the next hop will not have been removed
from the linked list. As the error is propagated upwards and the caller
frees the next hop object, the linked list ends up holding an invalid
object.

A similar issue comes up with mlxsw_sp_nexthop4_init(), where rollback
block does exist, however does not include the linked list removal.

Both IPv6 and IPv4 next hops have a similar issue with next-hop counter
rollbacks. As these were introduced in the same patchset as the next hop
linked list, include the cleanup in this patch.

Fixes: dbe4598c1e ("mlxsw: spectrum_router: Keep nexthops in a linked list")
Fixes: a5390278a5 ("mlxsw: spectrum: Add support for setting counters on nexthops")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220629070205.803952-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-30 11:35:18 +02:00
Duoming Zhou
9cc02ede69 net: rose: fix UAF bugs caused by timer handler
There are UAF bugs in rose_heartbeat_expiry(), rose_timer_expiry()
and rose_idletimer_expiry(). The root cause is that del_timer()
could not stop the timer handler that is running and the refcount
of sock is not managed properly.

One of the UAF bugs is shown below:

    (thread 1)          |        (thread 2)
                        |  rose_bind
                        |  rose_connect
                        |    rose_start_heartbeat
rose_release            |    (wait a time)
  case ROSE_STATE_0     |
  rose_destroy_socket   |  rose_heartbeat_expiry
    rose_stop_heartbeat |
    sock_put(sk)        |    ...
  sock_put(sk) // FREE  |
                        |    bh_lock_sock(sk) // USE

The sock is deallocated by sock_put() in rose_release() and
then used by bh_lock_sock() in rose_heartbeat_expiry().

Although rose_destroy_socket() calls rose_stop_heartbeat(),
it could not stop the timer that is running.

The KASAN report triggered by POC is shown below:

BUG: KASAN: use-after-free in _raw_spin_lock+0x5a/0x110
Write of size 4 at addr ffff88800ae59098 by task swapper/3/0
...
Call Trace:
 <IRQ>
 dump_stack_lvl+0xbf/0xee
 print_address_description+0x7b/0x440
 print_report+0x101/0x230
 ? irq_work_single+0xbb/0x140
 ? _raw_spin_lock+0x5a/0x110
 kasan_report+0xed/0x120
 ? _raw_spin_lock+0x5a/0x110
 kasan_check_range+0x2bd/0x2e0
 _raw_spin_lock+0x5a/0x110
 rose_heartbeat_expiry+0x39/0x370
 ? rose_start_heartbeat+0xb0/0xb0
 call_timer_fn+0x2d/0x1c0
 ? rose_start_heartbeat+0xb0/0xb0
 expire_timers+0x1f3/0x320
 __run_timers+0x3ff/0x4d0
 run_timer_softirq+0x41/0x80
 __do_softirq+0x233/0x544
 irq_exit_rcu+0x41/0xa0
 sysvec_apic_timer_interrupt+0x8c/0xb0
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1b/0x20
RIP: 0010:default_idle+0xb/0x10
RSP: 0018:ffffc9000012fea0 EFLAGS: 00000202
RAX: 000000000000bcae RBX: ffff888006660f00 RCX: 000000000000bcae
RDX: 0000000000000001 RSI: ffffffff843a11c0 RDI: ffffffff843a1180
RBP: dffffc0000000000 R08: dffffc0000000000 R09: ffffed100da36d46
R10: dfffe9100da36d47 R11: ffffffff83cf0950 R12: 0000000000000000
R13: 1ffff11000ccc1e0 R14: ffffffff8542af28 R15: dffffc0000000000
...
Allocated by task 146:
 __kasan_kmalloc+0xc4/0xf0
 sk_prot_alloc+0xdd/0x1a0
 sk_alloc+0x2d/0x4e0
 rose_create+0x7b/0x330
 __sock_create+0x2dd/0x640
 __sys_socket+0xc7/0x270
 __x64_sys_socket+0x71/0x80
 do_syscall_64+0x43/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Freed by task 152:
 kasan_set_track+0x4c/0x70
 kasan_set_free_info+0x1f/0x40
 ____kasan_slab_free+0x124/0x190
 kfree+0xd3/0x270
 __sk_destruct+0x314/0x460
 rose_release+0x2fa/0x3b0
 sock_close+0xcb/0x230
 __fput+0x2d9/0x650
 task_work_run+0xd6/0x160
 exit_to_user_mode_loop+0xc7/0xd0
 exit_to_user_mode_prepare+0x4e/0x80
 syscall_exit_to_user_mode+0x20/0x40
 do_syscall_64+0x4f/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

This patch adds refcount of sock when we use functions
such as rose_start_heartbeat() and so on to start timer,
and decreases the refcount of sock when timer is finished
or deleted by functions such as rose_stop_heartbeat()
and so on. As a result, the UAF bugs could be mitigated.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Tested-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/20220629002640.5693-1-duoming@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-30 11:07:30 +02:00
Jose Alonso
f8ebb3ac88 net: usb: ax88179_178a: Fix packet receiving
This patch corrects packet receiving in ax88179_rx_fixup.

- problem observed:
  ifconfig shows allways a lot of 'RX Errors' while packets
  are received normally.

  This occurs because ax88179_rx_fixup does not recognise properly
  the usb urb received.
  The packets are normally processed and at the end, the code exits
  with 'return 0', generating RX Errors.
  (pkt_cnt==-2 and ptk_hdr over field rx_hdr trying to identify
   another packet there)

  This is a usb urb received by "tcpdump -i usbmon2 -X" on a
  little-endian CPU:
  0x0000:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
           ^         packet 1 start (pkt_len = 0x05ec)
           ^^^^      IP alignment pseudo header
                ^    ethernet packet start
           last byte ethernet packet   v
           padding (8-bytes aligned)     vvvv vvvv
  0x05e0:  c92d d444 1420 8a69 83dd 272f e82b 9811
  0x05f0:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
  ...      ^ packet 2
  0x0be0:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
  ...
  0x1130:  9d41 9171 8a38 0ec5 eeee f8e3 3b19 87a0
  ...
  0x1720:  8cfc 15ff 5e4c e85c eeee f8e3 3b19 87a0
  ...
  0x1d10:  ecfa 2a3a 19ab c78c eeee f8e3 3b19 87a0
  ...
  0x2070:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
  ...      ^ packet 7
  0x2120:  7c88 4ca5 5c57 7dcc 0d34 7577 f778 7e0a
  0x2130:  f032 e093 7489 0740 3008 ec05 0000 0080
                               ====1==== ====2====
           hdr_off             ^
           pkt_len = 0x05ec         ^^^^
           AX_RXHDR_*=0x00830  ^^^^   ^
           pkt_len = 0                        ^^^^
           AX_RXHDR_DROP_ERR=0x80000000  ^^^^   ^
  0x2140:  3008 ec05 0000 0080 3008 5805 0000 0080
  0x2150:  3008 ec05 0000 0080 3008 ec05 0000 0080
  0x2160:  3008 5803 0000 0080 3008 c800 0000 0080
           ===11==== ===12==== ===13==== ===14====
  0x2170:  0000 0000 0e00 3821
                     ^^^^ ^^^^ rx_hdr
                     ^^^^      pkt_cnt=14
                          ^^^^ hdr_off=0x2138
           ^^^^ ^^^^           padding

  The dump shows that pkt_cnt is the number of entrys in the
  per-packet metadata. It is "2 * packet count".
  Each packet have two entrys. The first have a valid
  value (pkt_len and AX_RXHDR_*) and the second have a
  dummy-header 0x80000000 (pkt_len=0 with AX_RXHDR_DROP_ERR).
  Why exists dummy-header for each packet?!?
  My guess is that this was done probably to align the
  entry for each packet to 64-bits and maintain compatibility
  with old firmware.
  There is also a padding (0x00000000) before the rx_hdr to
  align the end of rx_hdr to 64-bit.
  Note that packets have a alignment of 64-bits (8-bytes).

  This patch assumes that the dummy-header and the last
  padding are optional. So it preserves semantics and
  recognises the same valid packets as the current code.

  This patch was made using only the dumpfile information and
  tested with only one device:
  0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet

Fixes: 57bc3d3ae8 ("net: usb: ax88179_178a: Fix out-of-bounds accesses in RX fixup")
Fixes: e2ca90c276 ("ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver")
Signed-off-by: Jose Alonso <joalonsof@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/d6970bb04bf67598af4d316eaeb1792040b18cfd.camel@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-30 10:41:57 +02:00
Lamarque Vieira Souza
e1c70d7934 nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA IM2P33F8ABR1
ADATA IM2P33F8ABR1 reports bogus eui64 values that appear to be the same
across all drives. Quirk them out so they are not marked as "non globally
unique" duplicates.

Co-developed-by: Felipe de Jesus Araujo da Conceição <felipe.conceicao@petrosoftdesign.com>
Signed-off-by: Felipe de Jesus Araujo da Conceição <felipe.conceicao@petrosoftdesign.com>
Signed-off-by: Lamarque V. Souza <lamarque.souza@petrosoftdesign.com>

Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-30 08:24:33 +02:00
Alan Adamson
34ad61514c nvmet: add a clear_ids attribute for passthru targets
If the clear_ids attribute is set to true, the EUI/GUID/UUID is cleared
for the passthru target.  By default, loop targets will set clear_ids to
true.

This resolves an issue where a connect to a passthru target fails when
using a trtype of 'loop' because EUI/GUID/UUID is not unique.

Fixes: 2079f41ec6 ("nvme: check that EUI/GUID/UUID are globally unique")
Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-30 08:23:24 +02:00
Yevhen Orlov
050133e1aa net: bonding: fix use-after-free after 802.3ad slave unbind
commit 0622cab034 ("bonding: fix 802.3ad aggregator reselection"),
resolve case, when there is several aggregation groups in the same bond.
bond_3ad_unbind_slave will invalidate (clear) aggregator when
__agg_active_ports return zero. So, ad_clear_agg can be executed even, when
num_of_ports!=0. Than bond_3ad_unbind_slave can be executed again for,
previously cleared aggregator. NOTE: at this time bond_3ad_unbind_slave
will not update slave ports list, because lag_ports==NULL. So, here we
got slave ports, pointing to freed aggregator memory.

Fix with checking actual number of ports in group (as was before
commit 0622cab034 ("bonding: fix 802.3ad aggregator reselection") ),
before ad_clear_agg().

The KASAN logs are as follows:

[  767.617392] ==================================================================
[  767.630776] BUG: KASAN: use-after-free in bond_3ad_state_machine_handler+0x13dc/0x1470
[  767.638764] Read of size 2 at addr ffff00011ba9d430 by task kworker/u8:7/767
[  767.647361] CPU: 3 PID: 767 Comm: kworker/u8:7 Tainted: G           O 5.15.11 #15
[  767.655329] Hardware name: DNI AmazonGo1 A7040 board (DT)
[  767.660760] Workqueue: lacp_1 bond_3ad_state_machine_handler
[  767.666468] Call trace:
[  767.668930]  dump_backtrace+0x0/0x2d0
[  767.672625]  show_stack+0x24/0x30
[  767.675965]  dump_stack_lvl+0x68/0x84
[  767.679659]  print_address_description.constprop.0+0x74/0x2b8
[  767.685451]  kasan_report+0x1f0/0x260
[  767.689148]  __asan_load2+0x94/0xd0
[  767.692667]  bond_3ad_state_machine_handler+0x13dc/0x1470

Fixes: 0622cab034 ("bonding: fix 802.3ad aggregator reselection")
Co-developed-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20220629012914.361-1-yevhen.orlov@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29 20:52:40 -07:00
Eric Dumazet
4e43e64d0f ipv6: fix lockdep splat in in6_dump_addrs()
As reported by syzbot, we should not use rcu_dereference()
when rcu_read_lock() is not held.

WARNING: suspicious RCU usage
5.19.0-rc2-syzkaller #0 Not tainted

net/ipv6/addrconf.c:5175 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by syz-executor326/3617:
 #0: ffffffff8d5848e8 (rtnl_mutex){+.+.}-{3:3}, at: netlink_dump+0xae/0xc20 net/netlink/af_netlink.c:2223

stack backtrace:
CPU: 0 PID: 3617 Comm: syz-executor326 Not tainted 5.19.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 in6_dump_addrs+0x12d1/0x1790 net/ipv6/addrconf.c:5175
 inet6_dump_addr+0x9c1/0xb50 net/ipv6/addrconf.c:5300
 netlink_dump+0x541/0xc20 net/netlink/af_netlink.c:2275
 __netlink_dump_start+0x647/0x900 net/netlink/af_netlink.c:2380
 netlink_dump_start include/linux/netlink.h:245 [inline]
 rtnetlink_rcv_msg+0x73e/0xc90 net/core/rtnetlink.c:6046
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2501
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:734
 ____sys_sendmsg+0x6eb/0x810 net/socket.c:2492
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2546
 __sys_sendmsg net/socket.c:2575 [inline]
 __do_sys_sendmsg net/socket.c:2584 [inline]
 __se_sys_sendmsg net/socket.c:2582 [inline]
 __x64_sys_sendmsg+0x132/0x220 net/socket.c:2582
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: 88e2ca3080 ("mld: convert ifmcaddr6 to RCU")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220628121248.858695-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29 20:41:09 -07:00
Oleksij Rempel
fa152f626b net: phy: ax88772a: fix lost pause advertisement configuration
In case of asix_ax88772a_link_change_notify() workaround, we run soft
reset which will automatically clear MII_ADVERTISE configuration. The
PHYlib framework do not know about changed configuration state of the
PHY, so we need use phy_init_hw() to reinit PHY configuration.

Fixes: dde2584692 ("net: usb/phy: asix: add support for ax88772A/C PHYs")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220628114349.3929928-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29 20:39:05 -07:00
Lukas Wunner
1758bde2e4 net: phy: Don't trigger state machine while in suspend
Upon system sleep, mdio_bus_phy_suspend() stops the phy_state_machine(),
but subsequent interrupts may retrigger it:

They may have been left enabled to facilitate wakeup and are not
quiesced until the ->suspend_noirq() phase.  Unwanted interrupts may
hence occur between mdio_bus_phy_suspend() and dpm_suspend_noirq(),
as well as between dpm_resume_noirq() and mdio_bus_phy_resume().

Retriggering the phy_state_machine() through an interrupt is not only
undesirable for the reason given in mdio_bus_phy_suspend() (freezing it
midway with phydev->lock held), but also because the PHY may be
inaccessible after it's suspended:  Accesses to USB-attached PHYs are
blocked once usb_suspend_both() clears the can_submit flag and PHYs on
PCI network cards may become inaccessible upon suspend as well.

Amend phy_interrupt() to avoid triggering the state machine if the PHY
is suspended.  Signal wakeup instead if the attached net_device or its
parent has been configured as a wakeup source.  (Those conditions are
identical to mdio_bus_phy_may_suspend().)  Postpone handling of the
interrupt until the PHY has resumed.

Before stopping the phy_state_machine() in mdio_bus_phy_suspend(),
wait for a concurrent phy_interrupt() to run to completion.  That is
necessary because phy_interrupt() may have checked the PHY's suspend
status before the system sleep transition commenced and it may thus
retrigger the state machine after it was stopped.

Likewise, after re-enabling interrupt handling in mdio_bus_phy_resume(),
wait for a concurrent phy_interrupt() to complete to ensure that
interrupts which it postponed are properly rerun.

The issue was exposed by commit 1ce8b37241 ("usbnet: smsc95xx: Forward
PHY interrupts to PHY driver to avoid polling"), but has existed since
forever.

Fixes: 541cd3ee00 ("phylib: Fix deadlock on resume")
Link: https://lore.kernel.org/netdev/a5315a8a-32c2-962f-f696-de9a26d30091@samsung.com/
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org # v2.6.33+
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/b7f386d04e9b5b0e2738f0125743e30676f309ef.1656410895.git.lukas@wunner.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29 20:38:52 -07:00
Oliver Neukum
e65af5403e usbnet: fix memory allocation in helpers
usbnet provides some helper functions that are also used in
the context of reset() operations. During a reset the other
drivers on a device are unable to operate. As that can be block
drivers, a driver for another interface cannot use paging
in its memory allocations without risking a deadlock.
Use GFP_NOIO in the helpers.

Fixes: 877bd862f3 ("usbnet: introduce usbnet 3 command helpers")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20220628093517.7469-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29 20:36:45 -07:00
Coleman Dietsch
7b92aa9e61 selftests net: fix kselftest net fatal error
The incorrect path is causing the following error when trying to run net
kselftests:

In file included from bpf/nat6to4.c:43:
../../../lib/bpf/bpf_helpers.h:11:10: fatal error: 'bpf_helper_defs.h' file not found
         ^~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: cf67838c44 ("selftests net: fix bpf build error")
Signed-off-by: Coleman Dietsch <dietschc@csp.edu>
Link: https://lore.kernel.org/r/20220628174744.7908-1-dietschc@csp.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29 20:15:57 -07:00
Jakub Kicinski
236d59292e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) Restore set counter when one of the CPU loses race to add elements
   to sets.

2) After NF_STOLEN, skb might be there no more, update nftables trace
   infra to avoid access to skb in this case. From Florian Westphal.

3) nftables bridge might register a prerouting hook with zero priority,
   br_netfilter incorrectly skips it. Also from Florian.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: br_netfilter: do not skip all hooks with 0 priority
  netfilter: nf_tables: avoid skb access on nf_stolen
  netfilter: nft_dynset: restore set element counter when failing to update
====================

Link: https://lore.kernel.org/r/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29 20:09:32 -07:00
Dave Airlie
078a3be793 Merge tag 'amd-drm-fixes-5.19-2022-06-29' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.19-2022-06-29:

amdgpu:
- GPU recovery fix
- Fix integer type usage in fourcc header for AMD modifiers
- KFD TLB flush fix for gfx9 APUs
- Display fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629192220.5870-1-alexander.deucher@amd.com
2022-06-30 10:48:55 +10:00
Dave Airlie
8cdf1b56cc Merge tag 'drm-intel-fixes-2022-06-29' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.19-rc5:
- Fix ioctl argument error return
- Fix d3cold disable to allow PCI upstream bridge D3 transition
- Fix setting cache_dirty for dma-buf objects on discrete

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/871qv7rblv.fsf@intel.com
2022-06-30 10:21:14 +10:00
Mikulas Patocka
617b365872 dm raid: fix KASAN warning in raid5_add_disks
There's a KASAN warning in raid5_add_disk when running the LVM testsuite.
The warning happens in the test
lvconvert-raid-reshape-linear_to_raid6-single-type.sh. We fix the warning
by verifying that rdev->saved_raid_disk is within limits.

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-29 19:48:04 -04:00
Mikulas Patocka
1ebc2cec0b dm raid: fix KASAN warning in raid5_remove_disk
There's a KASAN warning in raid5_remove_disk when running the LVM
testsuite. We fix this warning by verifying that the "number" variable is
within limits.

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-29 19:47:36 -04:00
John Garry
32788beb10 ata: pata_cs5535: Fix W=1 warnings
x86_64 allmodconfig build with W=1 gives these warnings:

drivers/ata/pata_cs5535.c: In function ‘cs5535_set_piomode’:
drivers/ata/pata_cs5535.c:93:11: error: variable ‘dummy’ set but not
used [-Werror=unused-but-set-variable]
  u32 reg, dummy;
           ^~~~~
drivers/ata/pata_cs5535.c: In function ‘cs5535_set_dmamode’:
drivers/ata/pata_cs5535.c:132:11: error: variable ‘dummy’ set but not
used [-Werror=unused-but-set-variable]
  u32 reg, dummy;
           ^~~~~
cc1: all warnings being treated as errors

Mark variables 'dummy' as "maybe unused" as they are only ever written
in rdmsr() calls.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-06-30 08:21:43 +09:00
Jiang Jian
f0aa153b6c hwmon: (pmbus/ucd9200) fix typos in comments
Drop the redundant word 'the' in the comments following
    /*
     * Set PHASE registers on all pages to 0xff to ensure that phase
     * specific commands will apply to all phases of a given page (rail).
     * This only affects the READ_IOUT and READ_TEMPERATURE2 registers.
     * READ_IOUT will return the sum of currents of all phases of a rail,
     * and READ_TEMPERATURE2 will return the maximum temperature detected
     * for the [the - DROP] phases of the rail.
     */

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Link: https://lore.kernel.org/r/20220622063231.20612-1-jiangjian@cdjrlc.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-06-29 14:02:08 -07:00
Eddie James
1bbb280904 hwmon: (occ) Prevent power cap command overwriting poll response
Currently, the response to the power cap command overwrites the
first eight bytes of the poll response, since the commands use
the same buffer. This means that user's get the wrong data between
the time of sending the power cap and the next poll response update.
Fix this by specifying a different buffer for the power cap command
response.

Fixes: 5b5513b880 ("hwmon: Add On-Chip Controller (OCC) hwmon driver")
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20220628203029.51747-1-eajames@linux.ibm.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-06-29 13:59:23 -07:00
Lukas Bulwahn
f08fe6fcbe PM / devfreq: passive: revert an editing accident in SPDX-License line
Commit 26984d9d58 ("PM / devfreq: passive: Keep cpufreq_policy for
possible cpus") reworked governor_passive.c, and accidently added a
tab in the first line, i.e., the SPDX-License-Identifier line.

The checkpatch script warns with the SPDX_LICENSE_TAG warning, and hence
pointed this issue out while investigating checkpatch warnings.

Revert this editing accident. No functional change.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2022-06-30 05:11:17 +09:00
Christian Marangi
82c66d2bbb PM / devfreq: Fix kernel warning with cpufreq passive register fail
Remove cpufreq_passive_unregister_notifier from
cpufreq_passive_register_notifier in case of error as devfreq core
already call unregister on GOV_START fail.

This fix the kernel always printing a WARN on governor PROBE_DEFER as
cpufreq_passive_unregister_notifier is called two times and return
error on the second call as the cpufreq is already unregistered.

Fixes: a03dacb031 ("PM / devfreq: Add cpu based scaling support to passive governor")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2022-06-30 05:11:17 +09:00
Christian Marangi
b5d281f6c1 PM / devfreq: Rework freq_table to be local to devfreq struct
On a devfreq PROBE_DEFER, the freq_table in the driver profile struct,
is never reset and may be leaved in an undefined state.

This comes from the fact that we store the freq_table in the driver
profile struct that is commonly defined as static and not reset on
PROBE_DEFER.
We currently skip the reinit of the freq_table if we found
it's already defined since a driver may declare his own freq_table.

This logic is flawed in the case devfreq core generate a freq_table, set
it in the profile struct and then PROBE_DEFER, freeing the freq_table.
In this case devfreq will found a NOT NULL freq_table that has been
freed, skip the freq_table generation and probe the driver based on the
wrong table.

To fix this and correctly handle PROBE_DEFER, use a local freq_table and
max_state in the devfreq struct and never modify the freq_table present
in the profile struct if it does provide it.

Fixes: 0ec09ac2ce ("PM / devfreq: Set the freq_table of devfreq device")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2022-06-30 05:11:17 +09:00
Miaoqian Lin
f44b799603 PM / devfreq: exynos-ppmu: Fix refcount leak in of_get_devfreq_events
of_get_child_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
This function only calls of_node_put() in normal path,
missing it in error paths.
Add missing of_node_put() to avoid refcount leak.

Fixes: f262f28c14 ("PM / devfreq: event: Add devfreq_event class")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2022-06-30 05:11:17 +09:00
Yicong Yang
20e6c3cc90 PM / devfreq: passive: Use HZ_PER_KHZ macro in units.h
HZ macros has been centralized in units.h since [1]. Use it to avoid
duplicated definition.

[1] commit e2c77032fc ("units: add the HZ macros")

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2022-06-30 05:11:17 +09:00
Christian 'Ansuel' Marangi
0cca7e8dcf PM / devfreq: Fix cpufreq passive unregister erroring on PROBE_DEFER
With the passive governor, the cpu based scaling can PROBE_DEFER due to
the fact that CPU policy are not ready.
The cpufreq passive unregister notifier is called both from the
GOV_START errors and for the GOV_STOP and assume the notifier is
successfully registred every time. With GOV_START failing it's wrong to
loop over each possible CPU since the register path has failed for
some CPU policy not ready. Change the logic and unregister the notifer
based on the current allocated parent_cpu_data list to correctly handle
errors and the governor unregister path.

Fixes: a03dacb031 ("PM / devfreq: Add cpu based scaling support to passive governor")
Signed-off-by: Christian 'Ansuel' Marangi <ansuelsmth@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2022-06-30 05:11:17 +09:00
Christian 'Ansuel' Marangi
e52b045fe0 PM / devfreq: Mute warning on governor PROBE_DEFER
Don't print warning when a governor PROBE_DEFER as it's not a real
GOV_START fail.

Fixes: a03dacb031 ("PM / devfreq: Add cpu based scaling support to passive governor")
Signed-off-by: Christian 'Ansuel' Marangi <ansuelsmth@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2022-06-30 05:11:17 +09:00
Christian 'Ansuel' Marangi
57e00b4003 PM / devfreq: Fix kernel panic with cpu based scaling to passive gov
The cpufreq passive register notifier can PROBE_DEFER and the devfreq
struct is freed and then reallocaed on probe retry.
The current logic assume that the code can't PROBE_DEFER so the devfreq
struct in the this variable in devfreq_passive_data is assumed to be
(if already set) always correct.
This cause kernel panic as the code try to access the wrong address.
To correctly handle this, update the this variable in
devfreq_passive_data to the devfreq reallocated struct.

Fixes: a03dacb031 ("PM / devfreq: Add cpu based scaling support to passive governor")
Signed-off-by: Christian 'Ansuel' Marangi <ansuelsmth@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2022-06-30 05:11:17 +09:00
Alex Deucher
a775e4e494 Revert "drm/amdgpu/display: set vblank_disable_immediate for DC"
This reverts commit 92020e81dd.

This causes stuttering and timeouts with DMCUB for some users
so revert it until we understand why and safely enable it
to save power.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1887
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: stable@vger.kernel.org
2022-06-29 14:50:52 -04:00
Ruili Ji
5cb0e3fb2c drm/amdgpu: To flush tlb for MMHUB of RAVEN series
amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:8 pasid:32769, for process test_basic pid 3305 thread test_basic pid 3305)
amdgpu: in page starting at address 0x00007ff990003000 from IH client 0x12 (VMC)
amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00840051
amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
amdgpu: MORE_FAULTS: 0x1
amdgpu: WALKER_ERROR: 0x0
amdgpu: PERMISSION_FAULTS: 0x5
amdgpu: MAPPING_ERROR: 0x0
amdgpu: RW: 0x1

When memory is allocated by kfd, no one triggers the tlb flush for MMHUB0.
There is page fault from MMHUB0.

v2:fix indentation
v3:change subject and fix indentation

Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-06-29 14:50:51 -04:00
Carlos Llamas
20b8264394 drm/fourcc: fix integer type usage in uapi header
Kernel uapi headers are supposed to use __[us]{8,16,32,64} types defined
by <linux/types.h> as opposed to 'uint32_t' and similar. See [1] for the
relevant discussion about this topic. In this particular case, the usage
of 'uint64_t' escaped headers_check as these macros are not being called
here. However, the following program triggers a compilation error:

  #include <drm/drm_fourcc.h>

  int main()
  {
  	unsigned long x = AMD_FMT_MOD_CLEAR(RB);
  	return 0;
  }

gcc error:
  drm.c:5:27: error: ‘uint64_t’ undeclared (first use in this function)
      5 |         unsigned long x = AMD_FMT_MOD_CLEAR(RB);
        |                           ^~~~~~~~~~~~~~~~~

This patch changes AMD_FMT_MOD_{SET,CLEAR} macros to use the correct
integer types, which fixes the above issue.

  [1] https://lkml.org/lkml/2019/6/5/18

Fixes: 8ba16d5993 ("drm/fourcc: Add AMD DRM modifiers.")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-29 14:50:51 -04:00
Alex Deucher
bbba251577 drm/amdgpu: fix adev variable used in amdgpu_device_gpu_recover()
Use the correct adev variable for the drm_fb_helper in
amdgpu_device_gpu_recover().  Noticed by inspection.

Fixes: 087451f372 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-06-29 14:50:42 -04:00
Linus Torvalds
d9b2ba6791 Merge tag 'platform-drivers-x86-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:

 - thinkpad_acpi/ideapad-laptop: mem-leak and platform-profile fixes

 - panasonic-laptop: missing hotkey presses regression fix

 - some hardware-id additions

 - some other small fixes

* tag 'platform-drivers-x86-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: hp-wmi: Ignore Sanitization Mode event
  platform/x86: thinkpad_acpi: do not use PSC mode on Intel platforms
  platform/x86: thinkpad-acpi: profile capabilities as integer
  platform/x86: panasonic-laptop: filter out duplicate volume up/down/mute keypresses
  platform/x86: panasonic-laptop: don't report duplicate brightness key-presses
  platform/x86: panasonic-laptop: revert "Resolve hotkey double trigger bug"
  platform/x86: panasonic-laptop: sort includes alphabetically
  platform/x86: panasonic-laptop: de-obfuscate button codes
  ACPI: video: Change how we determine if brightness key-presses are handled
  platform/x86: ideapad-laptop: Add Ideapad 5 15ITL05 to ideapad_dytc_v4_allow_table[]
  platform/x86: ideapad-laptop: Add allow_v4_dytc module parameter
  platform/x86: thinkpad_acpi: Fix a memory leak of EFCH MMIO resource
  platform/mellanox: nvsw-sn2201: fix error code in nvsw_sn2201_create_static_devices()
  platform/x86: intel/pmc: Add Alder Lake N support to PMC core driver
2022-06-29 09:32:06 -07:00
Linus Torvalds
732f306943 Merge tag '5.19-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull ksmbd server fixes from Steve French:

 - seek null check (don't use f_seek op directly and blindly)

 - offset validation in FSCTL_SET_ZERO_DATA

 - fallocate fix (relates e.g. to xfstests generic/091 and 263)

 - two cleanup fixes

 - fix socket settings on some arch

* tag '5.19-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: use vfs_llseek instead of dereferencing NULL
  ksmbd: check invalid FileOffset and BeyondFinalZero in FSCTL_ZERO_DATA
  ksmbd: set the range of bytes to zero without extending file size in FSCTL_ZERO_DATA
  ksmbd: remove duplicate flag set in smb2_write
  ksmbd: smbd: Remove useless license text when SPDX-License-Identifier is already used
  ksmbd: use SOCK_NONBLOCK type for kernel_accept()
2022-06-29 09:20:40 -07:00
Jeff Layton
8692969e91 ceph: wait on async create before checking caps for syncfs
Currently, we'll call ceph_check_caps, but if we're still waiting
on the reply, we'll end up spinning around on the same inode in
flush_dirty_session_caps. Wait for the async create reply before
flushing caps.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/55823
Fixes: fbed7045f5 ("ceph: wait for async create reply before sending any cap messages")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-06-29 18:02:57 +02:00
Darrick J. Wong
8944c6fb8a xfs: dont treat rt extents beyond EOF as eofblocks to be cleared
On a system with a realtime volume and a 28k realtime extent,
generic/491 fails because the test opens a file on a frozen filesystem
and closing it causes xfs_release -> xfs_can_free_eofblocks to
mistakenly think that the the blocks of the realtime extent beyond EOF
are posteof blocks to be freed.  Realtime extents cannot be partially
unmapped, so this is pointless.  Worse yet, this triggers posteof
cleanup, which stalls on a transaction allocation, which is why the test
fails.

Teach the predicate to account for realtime extents properly.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2022-06-29 08:47:56 -07:00
Darrick J. Wong
e53bcffad0 xfs: don't hold xattr leaf buffers across transaction rolls
Now that we've established (again!) that empty xattr leaf buffers are
ok, we no longer need to bhold them to transactions when we're creating
new leaf blocks.  Get rid of the entire mechanism, which should simplify
the xattr code quite a bit.

The original justification for using bhold here was to prevent the AIL
from trying to write the empty leaf block into the fs during the brief
time that we release the buffer lock.  The reason for /that/ was to
prevent recovery from tripping over the empty ondisk block.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-29 08:47:56 -07:00
Darrick J. Wong
7be3bd8856 xfs: empty xattr leaf header blocks are not corruption
TLDR: Revert commit 51e6104fdb ("xfs: detect empty attr leaf blocks in
xfs_attr3_leaf_verify") because it was wrong.

Every now and then we get a corruption report from the kernel or
xfs_repair about empty leaf blocks in the extended attribute structure.
We've long thought that these shouldn't be possible, but prior to 5.18
one would shake loose in the recoveryloop fstests about once a month.

A new addition to the xattr leaf block verifier in 5.19-rc1 makes this
happen every 7 minutes on my testing cloud.  I added a ton of logging to
detect any time we set the header count on an xattr leaf block to zero.
This produced the following dmesg output on generic/388:

XFS (sda4): ino 0x21fcbaf leaf 0x129bf78 hdcount==0!
Call Trace:
 <TASK>
 dump_stack_lvl+0x34/0x44
 xfs_attr3_leaf_create+0x187/0x230
 xfs_attr_shortform_to_leaf+0xd1/0x2f0
 xfs_attr_set_iter+0x73e/0xa90
 xfs_xattri_finish_update+0x45/0x80
 xfs_attr_finish_item+0x1b/0xd0
 xfs_defer_finish_noroll+0x19c/0x770
 __xfs_trans_commit+0x153/0x3e0
 xfs_attr_set+0x36b/0x740
 xfs_xattr_set+0x89/0xd0
 __vfs_setxattr+0x67/0x80
 __vfs_setxattr_noperm+0x6e/0x120
 vfs_setxattr+0x97/0x180
 setxattr+0x88/0xa0
 path_setxattr+0xc3/0xe0
 __x64_sys_setxattr+0x27/0x30
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

So now we know that someone is creating empty xattr leaf blocks as part
of converting a sf xattr structure into a leaf xattr structure.  The
conversion routine logs any existing sf attributes in the same
transaction that creates the leaf block, so we know this is a setxattr
to a file that has no attributes at all.

Next, g/388 calls the shutdown ioctl and cycles the mount to trigger log
recovery.  I also augmented buffer item recovery to call ->verify_struct
on any attr leaf blocks and complain if it finds a failure:

XFS (sda4): Unmounting Filesystem
XFS (sda4): Mounting V5 Filesystem
XFS (sda4): Starting recovery (logdev: internal)
XFS (sda4): xattr leaf daddr 0x129bf78 hdrcount == 0!
Call Trace:
 <TASK>
 dump_stack_lvl+0x34/0x44
 xfs_attr3_leaf_verify+0x3b8/0x420
 xlog_recover_buf_commit_pass2+0x60a/0x6c0
 xlog_recover_items_pass2+0x4e/0xc0
 xlog_recover_commit_trans+0x33c/0x350
 xlog_recovery_process_trans+0xa5/0xe0
 xlog_recover_process_data+0x8d/0x140
 xlog_do_recovery_pass+0x19b/0x720
 xlog_do_log_recovery+0x62/0xc0
 xlog_do_recover+0x33/0x1d0
 xlog_recover+0xda/0x190
 xfs_log_mount+0x14c/0x360
 xfs_mountfs+0x517/0xa60
 xfs_fs_fill_super+0x6bc/0x950
 get_tree_bdev+0x175/0x280
 vfs_get_tree+0x1a/0x80
 path_mount+0x6f5/0xaa0
 __x64_sys_mount+0x103/0x140
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7fc61e241eae

And a moment later, the _delwri_submit of the recovered buffers trips
the same verifier and recovery fails:

XFS (sda4): Metadata corruption detected at xfs_attr3_leaf_verify+0x393/0x420 [xfs], xfs_attr3_leaf block 0x129bf78
XFS (sda4): Unmount and run xfs_repair
XFS (sda4): First 128 bytes of corrupted metadata buffer:
00000000: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00  ........;.......
00000010: 00 00 00 00 01 29 bf 78 00 00 00 00 00 00 00 00  .....).x........
00000020: a5 1b d0 02 b2 9a 49 df 8e 9c fb 8d f8 31 3e 9d  ......I......1>.
00000030: 00 00 00 00 02 1f cb af 00 00 00 00 10 00 00 00  ................
00000040: 00 50 0f b0 00 00 00 00 00 00 00 00 00 00 00 00  .P..............
00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
XFS (sda4): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply+0x37f/0x3b0 [xfs] (fs/xfs/xfs_buf.c:1518).  Shutting down filesystem.
XFS (sda4): Please unmount the filesystem and rectify the problem(s)
XFS (sda4): log mount/recovery failed: error -117
XFS (sda4): log mount failed

I think I see what's going on here -- setxattr is racing with something
that shuts down the filesystem:

Thread 1				Thread 2
--------				--------
xfs_attr_sf_addname
xfs_attr_shortform_to_leaf
<create empty leaf>
xfs_trans_bhold(leaf)
xattri_dela_state = XFS_DAS_LEAF_ADD
<roll transaction>
					<flush log>
					<shut down filesystem>
xfs_trans_bhold_release(leaf)
<discover fs is dead, bail>

Thread 3
--------
<cycle mount, start recovery>
xlog_recover_buf_commit_pass2
xlog_recover_do_reg_buffer
<replay empty leaf buffer from recovered buf item>
xfs_buf_delwri_queue(leaf)
xfs_buf_delwri_submit
_xfs_buf_ioapply(leaf)
xfs_attr3_leaf_write_verify
<trip over empty leaf buffer>
<fail recovery>

As you can see, the bhold keeps the leaf buffer locked and thus prevents
the *AIL* from tripping over the ichdr.count==0 check in the write
verifier.  Unfortunately, it doesn't prevent the log from getting
flushed to disk, which sets up log recovery to fail.

So.  It's clear that the kernel has always had the ability to persist
attr leaf blocks with ichdr.count==0, which means that it's part of the
ondisk format now.

Unfortunately, this check has been added and removed multiple times
throughout history.  It first appeared in[1] kernel 3.10 as part of the
early V5 format patches.  The check was later discovered to break log
recovery and hence disabled[2] during log recovery in kernel 4.10.
Simultaneously, the check was added[3] to xfs_repair 4.9.0 to try to
weed out the empty leaf blocks.  This was still not correct because log
recovery would recover an empty attr leaf block successfully only for
regular xattr operations to trip over the empty block during of the
block during regular operation.  Therefore, the check was removed
entirely[4] in kernel 5.7 but removal of the xfs_repair check was
forgotten.  The continued complaints from xfs_repair lead to us
mistakenly re-adding[5] the verifier check for kernel 5.19.  Remove it
once again.

[1] 517c22207b ("xfs: add CRCs to attr leaf blocks")
[2] 2e1d23370e ("xfs: ignore leaf attr ichdr.count in verifier
                   during log replay")
[3] f7140161 ("xfs_repair: junk leaf attribute if count == 0")
[4] f28cef9e4d ("xfs: don't fail verifier on empty attr3 leaf
                   block")
[5] 51e6104fdb ("xfs: detect empty attr leaf blocks in
                   xfs_attr3_leaf_verify")

Looking at the rest of the xattr code, it seems that files with empty
leaf blocks behave as expected -- listxattr reports no attributes;
getxattr on any xattr returns nothing as expected; removexattr does
nothing; and setxattr can add attributes just fine.

Original-bug: 517c22207b ("xfs: add CRCs to attr leaf blocks")
Still-not-fixed-by: 2e1d23370e ("xfs: ignore leaf attr ichdr.count in verifier during log replay")
Removed-in: f28cef9e4d ("xfs: don't fail verifier on empty attr3 leaf block")
Fixes: 51e6104fdb ("xfs: detect empty attr leaf blocks in xfs_attr3_leaf_verify")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-06-29 08:47:56 -07:00
Ruozhu Li
f7f70f4aa0 nvme: fix regression when disconnect a recovering ctrl
We encountered a problem that the disconnect command hangs.
After analyzing the log and stack, we found that the triggering
process is as follows:
CPU0                          CPU1
                                nvme_rdma_error_recovery_work
                                  nvme_rdma_teardown_io_queues
nvme_do_delete_ctrl                 nvme_stop_queues
  nvme_remove_namespaces
  --clear ctrl->namespaces
                                    nvme_start_queues
                                    --no ns in ctrl->namespaces
    nvme_ns_remove                  return(because ctrl is deleting)
      blk_freeze_queue
        blk_mq_freeze_queue_wait
        --wait for ns to unquiesce to clean infligt IO, hang forever

This problem was not found in older kernels because we will flush
err work in nvme_stop_ctrl before nvme_remove_namespaces.It does not
seem to be modified for functional reasons, the patch can be revert
to solve the problem.

Revert commit 794a4cb3d2 ("nvme: remove the .stop_ctrl callout")

Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-29 16:13:45 +02:00
Pablo Greco
1629de0e03 nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG SX6000LNP (AKA SPECTRIX S40G)
ADATA XPG SPECTRIX S40G drives report bogus eui64 values that appear to
be the same across drives in one system. Quirk them out so they are
not marked as "non globally unique" duplicates.

Before:
[    2.258919] nvme nvme1: pci function 0000:06:00.0
[    2.264898] nvme nvme2: pci function 0000:05:00.0
[    2.323235] nvme nvme1: failed to set APST feature (2)
[    2.326153] nvme nvme2: failed to set APST feature (2)
[    2.333935] nvme nvme1: allocated 64 MiB host memory buffer.
[    2.336492] nvme nvme2: allocated 64 MiB host memory buffer.
[    2.339611] nvme nvme1: 7/0/0 default/read/poll queues
[    2.341805] nvme nvme2: 7/0/0 default/read/poll queues
[    2.346114]  nvme1n1: p1
[    2.347197] nvme nvme2: globally duplicate IDs for nsid 1
After:
[    2.427715] nvme nvme1: pci function 0000:06:00.0
[    2.427771] nvme nvme2: pci function 0000:05:00.0
[    2.488154] nvme nvme2: failed to set APST feature (2)
[    2.489895] nvme nvme1: failed to set APST feature (2)
[    2.498773] nvme nvme2: allocated 64 MiB host memory buffer.
[    2.500587] nvme nvme1: allocated 64 MiB host memory buffer.
[    2.504113] nvme nvme2: 7/0/0 default/read/poll queues
[    2.507026] nvme nvme1: 7/0/0 default/read/poll queues
[    2.509467] nvme nvme2: Ignoring bogus Namespace Identifiers
[    2.512804] nvme nvme1: Ignoring bogus Namespace Identifiers
[    2.513698]  nvme1n1: p1

Signed-off-by: Pablo Greco <pgreco@centosproject.org>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-29 16:13:45 +02:00
Sagi Grimberg
41d07df7de nvme-tcp: always fail a request when sending it failed
queue stoppage and inflight requests cancellation is fully fenced from
io_work and thus failing a request from this context. Hence we don't
need to try to guess from the socket retcode if this failure is because
the queue is about to be torn down or not.

We are perfectly safe to just fail it, the request will not be cancelled
later on.

This solves possible very long shutdown delays when the users issues a
'nvme disconnect-all'

Reported-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-29 16:13:45 +02:00
Sagi Grimberg
ed0691cf55 nvmet-tcp: fix regression in data_digest calculation
Data digest calculation iterates over command mapped iovec. However
since commit bac04454ef we unmap the iovec before we handle the data
digest, and since commit 69b85e1f1d we clear nr_mapped when we unmap
the iov.

Instead of open-coding the command iov traversal, simply call
crypto_ahash_digest with the command sg that is already allocated (we
already do that for the send path). Rename nvmet_tcp_send_ddgst to
nvmet_tcp_calc_ddgst and call it from send and recv paths.

Fixes: 69b85e1f1d ("nvmet-tcp: add an helper to free the cmd buffers")
Fixes: bac04454ef ("nvmet-tcp: fix kmap leak when data digest in use")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-29 16:13:44 +02:00
Michael Walle
9577fc5fdc NFC: nxp-nci: don't print header length mismatch on i2c error
Don't print a misleading header length mismatch error if the i2c call
returns an error. Instead just return the error code without any error
message.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 14:05:00 +01:00
Michael Walle
eddd95b942 NFC: nxp-nci: Don't issue a zero length i2c_master_read()
There are packets which doesn't have a payload. In that case, the second
i2c_master_read() will have a zero length. But because the NFC
controller doesn't have any data left, it will NACK the I2C read and
-ENXIO will be returned. In case there is no payload, just skip the
second i2c master read.

Fixes: 6be88670fc ("NFC: nxp-nci_i2c: Add I2C support to NXP NCI driver")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 14:05:00 +01:00
Hangyu Hua
00aff3590f net: tipc: fix possible refcount leak in tipc_sk_create()
Free sk in case tipc_sk_insert() fails.

Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:49:06 +01:00
Aneesh Kumar K.V
ac790d0988 powerpc/memhotplug: Add add_pages override for PPC
With commit ffa0b64e3b ("powerpc: Fix virt_addr_valid() for 64-bit Book3E & 32-bit")
the kernel now validate the addr against high_memory value. This results
in the below BUG_ON with dax pfns.

[  635.798741][T26531] kernel BUG at mm/page_alloc.c:5521!
1:mon> e
cpu 0x1: Vector: 700 (Program Check) at [c000000007287630]
    pc: c00000000055ed48: free_pages.part.0+0x48/0x110
    lr: c00000000053ca70: tlb_finish_mmu+0x80/0xd0
    sp: c0000000072878d0
   msr: 800000000282b033
  current = 0xc00000000afabe00
  paca    = 0xc00000037ffff300   irqmask: 0x03   irq_happened: 0x05
    pid   = 26531, comm = 50-landscape-sy
kernel BUG at :5521!
Linux version 5.19.0-rc3-14659-g4ec05be7c2e1 (kvaneesh@ltc-boston8) (gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #625 SMP Thu Jun 23 00:35:43 CDT 2022
1:mon> t
[link register   ] c00000000053ca70 tlb_finish_mmu+0x80/0xd0
[c0000000072878d0] c00000000053ca54 tlb_finish_mmu+0x64/0xd0 (unreliable)
[c000000007287900] c000000000539424 exit_mmap+0xe4/0x2a0
[c0000000072879e0] c00000000019fc1c mmput+0xcc/0x210
[c000000007287a20] c000000000629230 begin_new_exec+0x5e0/0xf40
[c000000007287ae0] c00000000070b3cc load_elf_binary+0x3ac/0x1e00
[c000000007287c10] c000000000627af0 bprm_execve+0x3b0/0xaf0
[c000000007287cd0] c000000000628414 do_execveat_common.isra.0+0x1e4/0x310
[c000000007287d80] c00000000062858c sys_execve+0x4c/0x60
[c000000007287db0] c00000000002c1b0 system_call_exception+0x160/0x2c0
[c000000007287e10] c00000000000c53c system_call_common+0xec/0x250

The fix is to make sure we update high_memory on memory hotplug.
This is similar to what x86 does in commit 3072e413e3 ("mm/memory_hotplug: introduce add_pages")

Fixes: ffa0b64e3b ("powerpc: Fix virt_addr_valid() for 64-bit Book3E & 32-bit")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220629050925.31447-1-aneesh.kumar@linux.ibm.com
2022-06-29 20:43:16 +10:00
Naveen N. Rao
b21bd5a4b1 powerpc/bpf: Fix use of user_pt_regs in uapi
Trying to build a .c file that includes <linux/bpf_perf_event.h>:
  $ cat test_bpf_headers.c
  #include <linux/bpf_perf_event.h>

throws the below error:
  /usr/include/linux/bpf_perf_event.h:14:28: error: field ‘regs’ has incomplete type
     14 |         bpf_user_pt_regs_t regs;
	|                            ^~~~

This is because we typedef bpf_user_pt_regs_t to 'struct user_pt_regs'
in arch/powerpc/include/uaps/asm/bpf_perf_event.h, but 'struct
user_pt_regs' is not exposed to userspace.

Powerpc has both pt_regs and user_pt_regs structures. However, unlike
arm64 and s390, we expose user_pt_regs to userspace as just 'pt_regs'.
As such, we should typedef bpf_user_pt_regs_t to 'struct pt_regs' for
userspace.

Within the kernel though, we want to typedef bpf_user_pt_regs_t to
'struct user_pt_regs'.

Remove arch/powerpc/include/uapi/asm/bpf_perf_event.h so that the
uapi/asm-generic version of the header is exposed to userspace.
Introduce arch/powerpc/include/asm/bpf_perf_event.h so that we can
typedef bpf_user_pt_regs_t to 'struct user_pt_regs' for use within the
kernel.

Note that this was not showing up with the bpf selftest build since
tools/include/uapi/asm/bpf_perf_event.h didn't include the powerpc
variant.

Fixes: a6460b03f9 ("powerpc/bpf: Fix broken uapi for BPF_PROG_TYPE_PERF_EVENT")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Use typical naming for header include guard]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220627191119.142867-1-naveen.n.rao@linux.vnet.ibm.com
2022-06-29 20:43:04 +10:00
Javier Martinez Canillas
ee7a69aa38 fbdev: Disable sysfb device registration when removing conflicting FBs
The platform devices registered by sysfb match with firmware-based DRM or
fbdev drivers, that are used to have early graphics using a framebuffer
provided by the system firmware.

DRM or fbdev drivers later are probed and remove conflicting framebuffers,
leading to these platform devices for generic drivers to be unregistered.

But the current solution has a race, since the sysfb_init() function could
be called after a DRM or fbdev driver is probed and request to unregister
the devices for drivers with conflicting framebuffes.

To prevent this, disable any future sysfb platform device registration by
calling sysfb_disable(), if a driver requests to remove the conflicting
framebuffers.

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607182338.344270-4-javierm@redhat.com
2022-06-29 09:51:50 +02:00
Javier Martinez Canillas
bde376e9de firmware: sysfb: Add sysfb_disable() helper function
This can be used by subsystems to unregister a platform device registered
by sysfb and also to disable future platform device registration in sysfb.

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607182338.344270-3-javierm@redhat.com
2022-06-29 09:51:41 +02:00
Javier Martinez Canillas
9e121040e5 firmware: sysfb: Make sysfb_create_simplefb() return a pdev pointer
This function just returned 0 on success or an errno code on error, but it
could be useful for sysfb_init() callers to have a pointer to the device.

Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607182338.344270-2-javierm@redhat.com
2022-06-29 09:51:31 +02:00
Krzysztof Kozlowski
5a478a653b nfc: nfcmrvl: Fix irq_of_parse_and_map() return value
The irq_of_parse_and_map() returns 0 on failure, not a negative ERRNO.

Reported-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Fixes: caf6e49bf6 ("NFC: nfcmrvl: add spi driver")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220627124048.296253-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 21:27:53 -07:00
YueHaibing
53ad46169f net: ipv6: unexport __init-annotated seg6_hmac_net_init()
As of commit 5801f064e3 ("net: ipv6: unexport __init-annotated seg6_hmac_init()"),
EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.

This remove the EXPORT_SYMBOL to fix modpost warning:

WARNING: modpost: vmlinux.o(___ksymtab+seg6_hmac_net_init+0x0): Section mismatch in reference from the variable __ksymtab_seg6_hmac_net_init to the function .init.text:seg6_hmac_net_init()
The symbol seg6_hmac_net_init is exported and annotated __init
Fix this by removing the __init annotation of seg6_hmac_net_init or drop the export.

Fixes: bf355b8d2c ("ipv6: sr: add core files for SR HMAC support")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20220628033134.21088-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 21:23:30 -07:00
Dave Airlie
76f0544428 Merge tag 'drm-msm-fixes-2022-06-28' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v5.19-rc5

- Fix to increment vsync_cnt before calling drm_crtc_handle_vblank so that
  userspace sees the value *after* it is incremented if waiting for vblank
  events
- Fix to reset drm_dev to NULL in dp_display_unbind to avoid a crash in
  probe/bind error paths
- Fix to resolve the smatch error of de-referencing before NULL check in
  dpu_encoder_phys_wb.c
- Fix error return to userspace if fence-id allocation fails in submit
  ioctl

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvswNKdd02EYKYv5Zjv7f+mcqeWC7hHQ1SBjqYzN_ZHnA@mail.gmail.com
2022-06-29 14:16:46 +10:00
katrinzhou
adabdd8f6a ipv6/sit: fix ipip6_tunnel_get_prl return value
When kcalloc fails, ipip6_tunnel_get_prl() should return -ENOMEM.
Move the position of label "out" to return correctly.

Addresses-Coverity: ("Unused value")
Fixes: 300aaeeaab ("[IPV6] SIT: Add SIOCGETPRL ioctl to get/dump PRL.")
Signed-off-by: katrinzhou <katrinzhou@tencent.com>
Reviewed-by: Eric Dumazet<edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220628035030.1039171-1-zys.zljxml@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 21:00:34 -07:00
Jakub Kicinski
bce3bb30b2 Merge branch 'mptcp-fixes-for-5-19'
Mat Martineau says:

====================
mptcp: Fixes for 5.19

Several categories of fixes from the mptcp tree:

Patches 1-3 are fixes related to MP_FAIL and FASTCLOSE, to make sure
MIBs are accurate, and to handle MP_FAIL transmission and responses at
the correct times. sk_timer conflicts are also resolved.

Patches 4 and 6 handle two separate race conditions, one at socket
shutdown and one with unaccepted subflows.

Patch 5 makes sure read operations are not blocked during fallback to
TCP.

Patch 7 improves the diag selftest, which were incorrectly failing on
slow machines (like the VMs used for CI testing).

Patch 8 avoids possible symbol redefinition errors in the userspace
mptcp.h file.

Patch 9 fixes a selftest build issue with gcc 12.
====================

Link: https://lore.kernel.org/r/20220628010243.166605-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 20:45:47 -07:00
Mat Martineau
fd37c2ecb2 selftests: mptcp: Initialize variables to quiet gcc 12 warnings
In a few MPTCP selftest tools, gcc 12 complains that the 'sock' variable
might be used uninitialized. This is a false positive because the only
code path that could lead to uninitialized access is where getaddrinfo()
fails, but the local xgetaddrinfo() wrapper exits if such a failure
occurs.

Initialize the 'sock' variable anyway to allow the tools to build with
gcc 12.

Fixes: 048d19d444 ("mptcp: add basic kselftest for mptcp")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 20:45:43 -07:00
Ossama Othman
06e445f740 mptcp: fix conflict with <netinet/in.h>
Including <linux/mptcp.h> before the C library <netinet/in.h> header
causes symbol redefinition errors at compile-time due to duplicate
declarations and definitions in the <linux/in.h> header included by
<linux/mptcp.h>.

Explicitly include <netinet/in.h> before <linux/in.h> in
<linux/mptcp.h> when __KERNEL__ is not defined so that the C library
compatibility logic in <linux/libc-compat.h> is enabled when including
<linux/mptcp.h> in user space code.

Fixes: c11c5906bc ("mptcp: add MPTCP_SUBFLOW_ADDRS getsockopt support")
Signed-off-by: Ossama Othman <ossama.othman@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 20:45:43 -07:00
Paolo Abeni
42fb6cddec selftests: mptcp: more stable diag tests
The mentioned test-case still use an hard-coded-len sleep to
wait for a relative large number of connection to be established.

On very slow VM and with debug build such timeout could be exceeded,
causing failures in our CI.

Address the issue polling for the expected condition several times,
up to an unreasonable high amount of time. On reasonably fast system
the self-tests will be faster then before, on very slow one we will
still catch the correct condition.

Fixes: df62f2ec3d ("selftests/mptcp: add diag interface tests")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 20:45:43 -07:00
Paolo Abeni
6aeed90450 mptcp: fix race on unaccepted mptcp sockets
When the listener socket owning the relevant request is closed,
it frees the unaccepted subflows and that causes later deletion
of the paired MPTCP sockets.

The mptcp socket's worker can run in the time interval between such delete
operations. When that happens, any access to msk->first will cause an UaF
access, as the subflow cleanup did not cleared such field in the mptcp
socket.

Address the issue explicitly traversing the listener socket accept
queue at close time and performing the needed cleanup on the pending
msk.

Note that the locking is a bit tricky, as we need to acquire the msk
socket lock, while still owning the subflow socket one.

Fixes: 86e39e0448 ("mptcp: keep track of local endpoint still available for each msk")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 20:45:42 -07:00
Paolo Abeni
f745a3ebdf mptcp: consistent map handling on failure
When the MPTCP receive path reach a non fatal fall-back condition, e.g.
when the MPC sockets must fall-back to TCP, the existing code is a little
self-inconsistent: it reports that new data is available - return true -
but sets the MPC flag to the opposite value.

As the consequence read operations in some exceptional scenario may block
unexpectedly.

Address the issue setting the correct MPC read status. Additionally avoid
some code duplication in the fatal fall-back scenario.

Fixes: 9c81be0dbc ("mptcp: add MP_FAIL response support")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 20:45:42 -07:00
Paolo Abeni
d51991e2e3 mptcp: fix shutdown vs fallback race
If the MPTCP socket shutdown happens before a fallback
to TCP, and all the pending data have been already spooled,
we never close the TCP connection.

Address the issue explicitly checking for critical condition
at fallback time.

Fixes: 1e39e5a32a ("mptcp: infinite mapping sending")
Fixes: 0348c690ed ("mptcp: add the fallback check")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 20:45:42 -07:00
Geliang Tang
76a13b3157 mptcp: invoke MP_FAIL response when needed
mptcp_mp_fail_no_response shouldn't be invoked on each worker run, it
should be invoked only when MP_FAIL response timeout occurs.

This patch refactors the MP_FAIL response logic.

It leverages the fact that only the MPC/first subflow can gracefully
fail to avoid unneeded subflows traversal: the failing subflow can
be only msk->first.

A new 'fail_tout' field is added to the subflow context to record the
MP_FAIL response timeout and use such field to reliably share the
timeout timer between the MP_FAIL event and the MPTCP socket close
timeout.

Finally, a new ack is generated to send out MP_FAIL notification as soon
as we hit the relevant condition, instead of waiting a possibly unbound
time for the next data packet.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/281
Fixes: d9fb797046 ("mptcp: Do not traverse the subflow connection list without lock")
Co-developed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 20:45:42 -07:00
Paolo Abeni
31bf11de14 mptcp: introduce MAPPING_BAD_CSUM
This allow moving a couple of conditional out of the fast path,
making the code more easy to follow and will simplify the next
patch.

Fixes: ae66fb2ba6 ("mptcp: Do TCP fallback on early DSS checksum failure")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 20:45:42 -07:00
Paolo Abeni
0c1f78a49a mptcp: fix error mibs accounting
The current accounting for MP_FAIL and FASTCLOSE is not very
accurate: both can be increased even when the related option is
not really sent. Move the accounting into the correct place.

Fixes: eb7f33654d ("mptcp: add the mibs for MP_FAIL")
Fixes: 1e75629cb9 ("mptcp: add the mibs for MP_FASTCLOSE")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-28 20:45:42 -07:00
Kai-Heng Feng
9ab762a84b platform/x86: hp-wmi: Ignore Sanitization Mode event
After system resume the hp-wmi driver may complain:
[ 702.620180] hp_wmi: Unknown event_id - 23 - 0x0

According to HP it means 'Sanitization Mode' and it's harmless to just
ignore the event.

Cc: Jorge Lopez <jorge.lopez2@hp.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Link: https://lore.kernel.org/r/20220628123726.250062-1-kai.heng.feng@canonical.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-28 22:20:07 +02:00
Mark Pearson
bce6243f76 platform/x86: thinkpad_acpi: do not use PSC mode on Intel platforms
PSC platform profile mode is only supported on Linux for AMD platforms.

Some older Intel platforms (e.g T490) are advertising it's capability
as Windows uses it - but on Linux we should only be using MMC profile
for Intel systems.

Add a check to prevent it being enabled incorrectly.

Signed-off-by: Mark Pearson <markpearson@lenovo.com>
Link: https://lore.kernel.org/r/20220627181449.3537-1-markpearson@lenovo.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-28 22:20:04 +02:00
Mark Pearson
42504af775 platform/x86: thinkpad-acpi: profile capabilities as integer
Currently the active mode (PSC/MMC) is stored in an enum and queried
throughout the driver.

Other driver changes will enumerate additional submodes that are relevant
to be tracked, so instead track PSC/MMC in a single integer variable.

Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mark Pearson <markpearson@lenovo.com>
Link: https://lore.kernel.org/r/20220603170212.164963-1-markpearson@lenovo.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-28 22:20:00 +02:00
Hans de Goede
aacb455dfe platform/x86: panasonic-laptop: filter out duplicate volume up/down/mute keypresses
On some Panasonic models the volume up/down/mute keypresses get
reported both through the Panasonic ACPI HKEY interface as well as
through the atkbd device.

Filter out the atkbd scan-codes for these to avoid reporting presses
twice.

Note normally we would leave the filtering of these to userspace by mapping
the scan-codes to KEY_UNKNOWN through /lib/udev/hwdb.d/60-keyboard.hwdb.
However in this case that would cause regressions since we were filtering
the Panasonic ACPI HKEY events before, so filter these in the kernel.

Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Reported-and-tested-by: Stefan Seyfried <seife+kernel@b1-systems.com>
Reported-and-tested-by: Kenneth Chan <kenneth.t.chan@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-7-hdegoede@redhat.com
2022-06-28 21:54:50 +02:00
Hans de Goede
1f2c9de83a platform/x86: panasonic-laptop: don't report duplicate brightness key-presses
The brightness key-presses might also get reported by the ACPI video bus,
check for this and in this case don't report the presses to avoid reporting
2 presses for a single key-press.

Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Reported-and-tested-by: Stefan Seyfried <seife+kernel@b1-systems.com>
Reported-and-tested-by: Kenneth Chan <kenneth.t.chan@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-6-hdegoede@redhat.com
2022-06-28 21:54:44 +02:00
Hans de Goede
83a5ddc3dc platform/x86: panasonic-laptop: revert "Resolve hotkey double trigger bug"
In hindsight blindly throwing away most of the key-press events is not
a good idea. So revert commit ed83c91718 ("platform/x86:
panasonic-laptop: Resolve hotkey double trigger bug").

Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Reported-and-tested-by: Stefan Seyfried <seife+kernel@b1-systems.com>
Reported-and-tested-by: Kenneth Chan <kenneth.t.chan@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-5-hdegoede@redhat.com
2022-06-28 21:53:47 +02:00
Hans de Goede
fe4326c8d1 platform/x86: panasonic-laptop: sort includes alphabetically
Sort includes alphabetically, small cleanup patch in preparation of
further changes.

Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-4-hdegoede@redhat.com
2022-06-28 21:53:39 +02:00
Stefan Seyfried
65a3e6c8d3 platform/x86: panasonic-laptop: de-obfuscate button codes
In the definition of panasonic_keymap[] the key codes are given in
decimal, later checks are done with hexadecimal values, which does
not help in understanding the code.
Additionally use two helper variables to shorten the code and make
the logic more obvious.

Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Signed-off-by: Stefan Seyfried <seife+kernel@b1-systems.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-3-hdegoede@redhat.com
2022-06-28 21:53:35 +02:00
Hans de Goede
3a0cf7ab8d ACPI: video: Change how we determine if brightness key-presses are handled
Some systems have an ACPI video bus but not ACPI video devices with
backlight capability. On these devices brightness key-presses are
(logically) not reported through the ACPI video bus.

Change how acpi_video_handles_brightness_key_presses() determines if
brightness key-presses are handled by the ACPI video driver to avoid
vendor specific drivers/platform/x86 drivers filtering out their
brightness key-presses even though they are the only ones reporting
these presses.

Fixes: ed83c91718 ("platform/x86: panasonic-laptop: Resolve hotkey double trigger bug")
Reported-and-tested-by: Stefan Seyfried <seife+kernel@b1-systems.com>
Reported-and-tested-by: Kenneth Chan <kenneth.t.chan@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220624112340.10130-2-hdegoede@redhat.com
2022-06-28 21:53:30 +02:00
Arnaldo Carvalho de Melo
7fe718fb8f tools headers UAPI: Sync linux/kvm.h with the kernel sources
To pick the changes in:

  bfbab44568 ("KVM: arm64: Implement PSCI SYSTEM_SUSPEND")
  7b33a09d03 ("KVM: arm64: Add support for userspace to suspend a vCPU")
  ffbb61d09f ("KVM: x86: Accept KVM_[GS]ET_TSC_KHZ as a VM ioctl.")
  661a20fab7 ("KVM: x86/xen: Advertise and document KVM_XEN_HVM_CONFIG_EVTCHN_SEND")
  fde0451be8 ("KVM: x86/xen: Support per-vCPU event channel upcall via local APIC")
  28d1629f75 ("KVM: x86/xen: Kernel acceleration for XENVER_version")
  5363952605 ("KVM: x86/xen: handle PV timers oneshot mode")
  942c2490c2 ("KVM: x86/xen: Add KVM_XEN_VCPU_ATTR_TYPE_VCPU_ID")
  2fd6df2f2b ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
  35025735a7 ("KVM: x86/xen: Support direct injection of event channel events")

That automatically adds support for this new ioctl:

  $ tools/perf/trace/beauty/kvm_ioctl.sh > before
  $ cp include/uapi/linux/kvm.h tools/include/uapi/linux/kvm.h
  $ tools/perf/trace/beauty/kvm_ioctl.sh > after
  $ diff -u before after
  --- before	2022-06-28 12:13:07.281150509 -0300
  +++ after	2022-06-28 12:13:16.423392896 -0300
  @@ -98,6 +98,7 @@
   	[0xcc] = "GET_SREGS2",
   	[0xcd] = "SET_SREGS2",
   	[0xce] = "GET_STATS_FD",
  +	[0xd0] = "XEN_HVM_EVTCHN_SEND",
   	[0xe0] = "CREATE_DEVICE",
   	[0xe1] = "SET_DEVICE_ATTR",
   	[0xe2] = "GET_DEVICE_ATTR",
  $

This silences these perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oupton@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/Yrs4RE+qfgTaWdAt@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-28 14:20:22 -03:00
Rafael J. Wysocki
049b1ed9be Merge tag 'cpufreq-arm-fixes-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull cpufreq ARM fixes for 5.19-rc5 from Viresh Kumar:

 - Fix missing of_node_put for qoriq and pmac32 driver (Liang He).
 - Fix issues around throttle interrupt for qcom driver (Stephen Boyd).
 - Add MT8186 to cpufreq-dt-platdev blocklist (AngeloGioacchino Del Regno).

* tag 'cpufreq-arm-fixes-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  cpufreq: Add MT8186 to cpufreq-dt-platdev blocklist
  cpufreq: pmac32-cpufreq: Fix refcount leak bug
  cpufreq: qcom-hw: Don't do lmh things without a throttle interrupt
  drivers: cpufreq: Add missing of_node_put() in qoriq-cpufreq.c
2022-06-28 17:56:57 +02:00
Ian Rogers
579d6c6d77 perf bpf: 8 byte align bpil data
bpil data is accessed assuming 64-bit alignment resulting in undefined
behavior as the data is just byte aligned. With an -fsanitize=undefined
build the following errors are observed:

  $ sudo perf record -a sleep 1
  util/bpf-event.c:310:22: runtime error: load of misaligned address 0x55f61084520f for type '__u64', which requires 8 byte alignment
  0x55f61084520f: note: pointer points here
   a8 fe ff ff 3c  51 d3 c0 ff ff ff ff 04  84 d3 c0 ff ff ff ff d8  aa d3 c0 ff ff ff ff a4  c0 d3 c0
               ^
  util/bpf-event.c:311:20: runtime error: load of misaligned address 0x55f61084522f for type '__u32', which requires 4 byte alignment
  0x55f61084522f: note: pointer points here
   ff ff ff ff c7  17 00 00 f1 02 00 00 1f  04 00 00 58 04 00 00 00  00 00 00 0f 00 00 00 63  02 00 00
               ^
  util/bpf-event.c:198:33: runtime error: member access within misaligned address 0x55f61084523f for type 'const struct bpf_func_info', which requires 4 byte alignment
  0x55f61084523f: note: pointer points here
   58 04 00 00 00  00 00 00 0f 00 00 00 63  02 00 00 3b 00 00 00 ab  02 00 00 44 00 00 00 14  03 00 00

Correct this by rouding up the data sizes and aligning the pointers.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220614014714.1407239-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-28 12:05:25 -03:00
Arnaldo Carvalho de Melo
117c49505b tools kvm headers arm64: Update KVM headers from the kernel sources
To pick the changes from:

  2cde51f1e1 ("KVM: arm64: Hide KVM_REG_ARM_*_BMAP_BIT_COUNT from userspace")
  b22216e1a6 ("KVM: arm64: Add vendor hypervisor firmware register")
  428fd6788d ("KVM: arm64: Add standard hypervisor firmware register")
  05714cab7d ("KVM: arm64: Setup a framework for hypercall bitmap firmware registers")
  18f3976fdb ("KVM: arm64: uapi: Add kvm_debug_exit_arch.hsr_high")
  a5905d6af4 ("KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated")

That don't causes any changes in tooling (when built on x86), only
addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
  diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/lkml/YrsWcDQyJC+xsfmm@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-28 11:56:01 -03:00
Namhyung Kim
49c692b7df perf offcpu: Accept allowed sample types only
As offcpu-time event is synthesized at the end, it could not get the
all the sample info.  Define OFFCPU_SAMPLE_TYPES for allowed ones and
mask out others in evsel__config() to prevent parse errors.

Because perf sample parsing assumes a specific ordering with the
sample types, setting unsupported one would make it fail to read
data like perf record -d/--data.

Fixes: edc41a1099 ("perf record: Enable off-cpu analysis with BPF")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220624231313.367909-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-28 11:45:45 -03:00
Namhyung Kim
d6838ec44b perf offcpu: Fix build failure on old kernels
Old kernels have a 'struct task_struct' which contains a "state" field
and newer kernels have "__state" instead.

While the get_task_state() in the BPF code handles that in some way, it
assumed the current kernel has the new definition and it caused a build
error on old kernels.

We should not assume anything and access them carefully.  Do not use
'task struct' directly access it instead using new and old definitions
in a row.

Fixes: edc41a1099 ("perf record: Enable off-cpu analysis with BPF")
Reported-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220624231313.367909-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-28 11:41:26 -03:00
Tao Liu
0fe3dbbefb linux/dim: Fix divide by 0 in RDMA DIM
Fix a divide 0 error in rdma_dim_stats_compare() when prev->cpe_ratio ==
0.

CallTrace:
  Hardware name: H3C R4900 G3/RS33M2C9S, BIOS 2.00.37P21 03/12/2020
  task: ffff880194b78000 task.stack: ffffc90006714000
  RIP: 0010:backport_rdma_dim+0x10e/0x240 [mlx_compat]
  RSP: 0018:ffff880c10e83ec0 EFLAGS: 00010202
  RAX: 0000000000002710 RBX: ffff88096cd7f780 RCX: 0000000000000064
  RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000001
  RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 000000001d7c6c09
  R13: ffff88096cd7f780 R14: ffff880b174fe800 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff880c10e80000(0000)
  knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000a0965b00 CR3: 000000000200a003 CR4: 00000000007606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   <IRQ>
   ib_poll_handler+0x43/0x80 [ib_core]
   irq_poll_softirq+0xae/0x110
   __do_softirq+0xd1/0x28c
   irq_exit+0xde/0xf0
   do_IRQ+0x54/0xe0
   common_interrupt+0x8f/0x8f
   </IRQ>
   ? cpuidle_enter_state+0xd9/0x2a0
   ? cpuidle_enter_state+0xc7/0x2a0
   ? do_idle+0x170/0x1d0
   ? cpu_startup_entry+0x6f/0x80
   ? start_secondary+0x1b9/0x210
   ? secondary_startup_64+0xa5/0xb0
  Code: 0f 87 e1 00 00 00 8b 4c 24 14 44 8b 43 14 89 c8 4d 63 c8 44 29 c0 99 31 d0 29 d0 31 d2 48 98 48 8d 04 80 48 8d 04 80 48 c1 e0 02 <49> f7 f1 48 83 f8 0a 0f 86 c1 00 00 00 44 39 c1 7f 10 48 89 df
  RIP: backport_rdma_dim+0x10e/0x240 [mlx_compat] RSP: ffff880c10e83ec0

Fixes: f4915455dc ("linux/dim: Implement RDMA adaptive moderation (DIM)")
Link: https://lore.kernel.org/r/20220627140004.3099-1-thomas.liu@ucloud.cn
Signed-off-by: Tao Liu <thomas.liu@ucloud.cn>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-06-28 10:37:25 -03:00
Eric Dumazet
ab84db251c net: bonding: fix possible NULL deref in rlb code
syzbot has two reports involving the same root cause.

bond_alb_initialize() must not set bond->alb_info.rlb_enabled
if a memory allocation error is detected.

Report 1:

general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
CPU: 0 PID: 12276 Comm: kworker/u4:10 Not tainted 5.19.0-rc3-syzkaller-00132-g3b89b511ea0c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
RIP: 0010:rlb_clear_slave+0x10e/0x690 drivers/net/bonding/bond_alb.c:393
Code: 8e fc 83 fb ff 0f 84 74 02 00 00 e8 cc 2a 8e fc 48 8b 44 24 08 89 dd 48 c1 e5 06 4c 8d 34 28 49 8d 7e 14 48 89 f8 48 c1 e8 03 <42> 0f b6 14 20 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85
RSP: 0018:ffffc90018a8f678 EFLAGS: 00010203
RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88803375bb00 RSI: ffffffff84ec4ac4 RDI: 0000000000000014
RBP: 0000000000000000 R08: 0000000000000005 R09: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
R13: ffff8880ac889000 R14: 0000000000000000 R15: ffff88815a668c80
FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005597077e10b0 CR3: 0000000026668000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
bond_alb_deinit_slave+0x43c/0x6b0 drivers/net/bonding/bond_alb.c:1663
__bond_release_one.cold+0x383/0xd53 drivers/net/bonding/bond_main.c:2370
bond_slave_netdev_event drivers/net/bonding/bond_main.c:3778 [inline]
bond_netdev_event+0x993/0xad0 drivers/net/bonding/bond_main.c:3889
notifier_call_chain+0xb5/0x200 kernel/notifier.c:87
call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1945
call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
call_netdevice_notifiers net/core/dev.c:1997 [inline]
unregister_netdevice_many+0x948/0x18b0 net/core/dev.c:10839
default_device_exit_batch+0x449/0x590 net/core/dev.c:11333
ops_exit_list+0x125/0x170 net/core/net_namespace.c:167
cleanup_net+0x4ea/0xb00 net/core/net_namespace.c:594
process_one_work+0x996/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e9/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
</TASK>

Report 2:

general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
CPU: 1 PID: 5206 Comm: syz-executor.1 Not tainted 5.18.0-syzkaller-12108-g58f9d52ff689 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:rlb_req_update_slave_clients+0x109/0x2f0 drivers/net/bonding/bond_alb.c:502
Code: 5d 18 8f fc 41 80 3e 00 0f 85 a5 01 00 00 89 d8 48 c1 e0 06 49 03 84 24 68 01 00 00 48 8d 78 30 49 89 c7 48 89 fa 48 c1 ea 03 <80> 3c 2a 00 0f 85 98 01 00 00 4d 39 6f 30 75 83 e8 22 18 8f fc 49
RSP: 0018:ffffc9000300ee80 EFLAGS: 00010206
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc90016c11000
RDX: 0000000000000006 RSI: ffffffff84eb6bf3 RDI: 0000000000000030
RBP: dffffc0000000000 R08: 0000000000000005 R09: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000000 R12: ffff888027c80c80
R13: ffff88807d7ff800 R14: ffffed1004f901bd R15: 0000000000000000
FS:  00007f6f46c58700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020010000 CR3: 00000000516cc000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 alb_fasten_mac_swap+0x886/0xa80 drivers/net/bonding/bond_alb.c:1070
 bond_alb_handle_active_change+0x624/0x1050 drivers/net/bonding/bond_alb.c:1765
 bond_change_active_slave+0xfa1/0x29b0 drivers/net/bonding/bond_main.c:1173
 bond_select_active_slave+0x23f/0xa50 drivers/net/bonding/bond_main.c:1253
 bond_enslave+0x3b34/0x53b0 drivers/net/bonding/bond_main.c:2159
 do_set_master+0x1c8/0x220 net/core/rtnetlink.c:2577
 rtnl_newlink_create net/core/rtnetlink.c:3380 [inline]
 __rtnl_newlink+0x13ac/0x17e0 net/core/rtnetlink.c:3580
 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3593
 rtnetlink_rcv_msg+0x43a/0xc90 net/core/rtnetlink.c:6089
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2501
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:734
 ____sys_sendmsg+0x6eb/0x810 net/socket.c:2492
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2546
 __sys_sendmsg net/socket.c:2575 [inline]
 __do_sys_sendmsg net/socket.c:2584 [inline]
 __se_sys_sendmsg net/socket.c:2582 [inline]
 __x64_sys_sendmsg+0x132/0x220 net/socket.c:2582
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f6f45a89109
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6f46c58168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f6f45b9c030 RCX: 00007f6f45a89109
RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000006
RBP: 00007f6f45ae308d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffed99029af R14: 00007f6f46c58300 R15: 0000000000022000
 </TASK>

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20220627102813.126264-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-28 15:23:00 +02:00
Amir Goldstein
8698e3bab4 fanotify: refine the validation checks on non-dir inode mask
Commit ceaf69f8ea ("fanotify: do not allow setting dirent events in
mask of non-dir") added restrictions about setting dirent events in the
mask of a non-dir inode mark, which does not make any sense.

For backward compatibility, these restictions were added only to new
(v5.17+) APIs.

It also does not make any sense to set the flags FAN_EVENT_ON_CHILD or
FAN_ONDIR in the mask of a non-dir inode.  Add these flags to the
dir-only restriction of the new APIs as well.

Move the check of the dir-only flags for new APIs into the helper
fanotify_events_supported(), which is only called for FAN_MARK_ADD,
because there is no need to error on an attempt to remove the dir-only
flags from non-dir inode.

Fixes: ceaf69f8ea ("fanotify: do not allow setting dirent events in mask of non-dir")
Link: https://lore.kernel.org/linux-fsdevel/20220627113224.kr2725conevh53u4@quack3.lan/
Link: https://lore.kernel.org/r/20220627174719.2838175-1-amir73il@gmail.com
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2022-06-28 11:18:13 +02:00
AngeloGioacchino Del Regno
be4b61ec45 cpufreq: Add MT8186 to cpufreq-dt-platdev blocklist
This SoC shall use the mediatek-cpufreq driver, or the system will
crash upon any clock scaling request: add it to the cpufreq-dt-platdev
blocklist.

Fixes: 39b360102f ("cpufreq: mediatek: Add support for MT8186")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-06-28 13:34:56 +05:30
Liang He
ccd7567d4b cpufreq: pmac32-cpufreq: Fix refcount leak bug
In pmac_cpufreq_init_MacRISC3(), we need to add corresponding
of_node_put() for the three node pointers whose refcount have
been incremented by of_find_node_by_name().

Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-06-28 13:34:51 +05:30
Stephen Boyd
668a7a12de cpufreq: qcom-hw: Don't do lmh things without a throttle interrupt
Offlining cpu6 and cpu7 and then onlining cpu6 hangs on
sc7180-trogdor-lazor because the throttle interrupt doesn't exist.
Similarly, things go sideways when suspend/resume runs. That's because
the qcom_cpufreq_hw_cpu_online() and qcom_cpufreq_hw_lmh_exit()
functions are calling genirq APIs with an interrupt value of '-6', i.e.
-ENXIO, and that isn't good.

Check the value of the throttle interrupt like we already do in other
functions in this file and bail out early from lmh code to fix the hang.

Reported-by: Rob Clark <robdclark@chromium.org>
Cc: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: a1eb080a04 ("cpufreq: qcom-hw: provide online/offline operations")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-06-28 13:34:51 +05:30
Liang He
4ff5a9b6d9 drivers: cpufreq: Add missing of_node_put() in qoriq-cpufreq.c
In qoriq_cpufreq_probe(), of_find_matching_node() will return a
node pointer with refcount incremented. We should use of_node_put()
when it is not used anymore.

Fixes: 157f527639 ("cpufreq: qoriq: convert to a platform driver")
[ Viresh: Fixed Author's name in commit log ]
Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-06-28 13:34:45 +05:30
Nicolas Dichtel
3b0dc529f5 ipv6: take care of disable_policy when restoring routes
When routes corresponding to addresses are restored by
fixup_permanent_addr(), the dst_nopolicy parameter was not set.
The typical use case is a user that configures an address on a down
interface and then put this interface up.

Let's take care of this flag in addrconf_f6i_alloc(), so that every callers
benefit ont it.

CC: stable@kernel.org
CC: David Forster <dforster@brocade.com>
Fixes: df789fe752 ("ipv6: Provide ipv6 version of "disable_policy" sysctl")
Reported-by: Siwar Zitouni <siwar.zitouni@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220623120015.32640-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-27 22:24:30 -07:00
Oleksij Rempel
ce95ab775f net: usb: asix: do not force pause frames support
We should respect link partner capabilities and not force flow control
support on every link. Even more, in current state the MAC driver do not
advertises pause support so we should not keep flow control enabled at
all.

Fixes: e532a096be ("net: usb: asix: ax88772: add phylib support")
Reported-by: Anton Lundin <glance@acc.umu.se>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Tested-by: Anton Lundin <glance@acc.umu.se>
Link: https://lore.kernel.org/r/20220624075139.3139300-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-27 22:04:33 -07:00
Oleksij Rempel
805206e66f net: asix: fix "can't send until first packet is send" issue
If cable is attached after probe sequence, the usbnet framework would
not automatically start processing RX packets except at least one
packet was transmitted.

On systems with any kind of address auto configuration this issue was
not detected, because some packets are send immediately after link state
is changed to "running".

With this patch we will notify usbnet about link status change provided by the
PHYlib.

Fixes: e532a096be ("net: usb: asix: ax88772: add phylib support")
Reported-by: Anton Lundin <glance@acc.umu.se>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Tested-by: Anton Lundin <glance@acc.umu.se>
Link: https://lore.kernel.org/r/20220624075139.3139300-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-27 22:04:33 -07:00
Michael Walle
6b9f1d46fd MAINTAINERS: nfc: drop Charles Gorand from NXP-NCI
Mails to Charles get an auto reply, that he is no longer working at
Eff'Innov technologies. Drop the entry and mark the driver as orphaned.

Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220626200039.4062784-1-michael@walle.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-27 21:58:59 -07:00
Shreenidhi Shedi
4bbfed9112 octeon_ep: use bitwise AND
This should be bitwise operator not logical.

Fixes: 862cd659a6 ("octeon_ep: Add driver framework and device initialization")
Signed-off-by: Shreenidhi Shedi <sshedi@vmware.com>
Link: https://lore.kernel.org/r/20220626132947.3992423-1-sshedi@vmware.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-27 21:56:35 -07:00
Jakub Kicinski
cce13b82cf Merge branch 'notify-user-space-if-any-actions-were-flushed-before-error'
Victor Nogueira says:

====================
Notify user space if any actions were flushed before error

This patch series fixes the behaviour of actions flush so that the
kernel always notifies user space whenever it deletes actions during a
flush operation, even if it didn't flush all the actions. This series
also introduces tdc tests to verify this new behaviour.
====================

Link: https://lore.kernel.org/r/20220623140742.684043-1-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-27 21:51:30 -07:00
Victor Nogueira
88153e29c1 selftests: tc-testing: Add testcases to test new flush behaviour
Add tdc test cases to verify new flush behaviour is correct, which do
the following:

- Try to flush only one action which is being referenced by a filter
- Try to flush three actions where the last one (index 3) is being
  referenced by a filter

Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-27 21:51:28 -07:00
Victor Nogueira
76b39b9438 net/sched: act_api: Notify user space if any actions were flushed before error
If during an action flush operation one of the actions is still being
referenced, the flush operation is aborted and the kernel returns to
user space with an error. However, if the kernel was able to flush, for
example, 3 actions and failed on the fourth, the kernel will not notify
user space that it deleted 3 actions before failing.

This patch fixes that behaviour by notifying user space of how many
actions were deleted before flush failed and by setting extack with a
message describing what happened.

Fixes: 55334a5db5 ("net_sched: act: refuse to remove bound action outside")
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-27 21:51:23 -07:00
Tong Zhang
8ee9d82cd0 epic100: fix use after free on rmmod
epic_close() calls epic_rx() and uses dma buffer, but in epic_remove_one()
we already freed the dma buffer. To fix this issue, reorder function calls
like in the .probe function.

BUG: KASAN: use-after-free in epic_rx+0xa6/0x7e0 [epic100]
Call Trace:
 epic_rx+0xa6/0x7e0 [epic100]
 epic_close+0xec/0x2f0 [epic100]
 unregister_netdev+0x18/0x20
 epic_remove_one+0xaa/0xf0 [epic100]

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Yilun Wu <yiluwu@cs.stonybrook.edu>
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Reviewed-by: Francois Romieu <romieu@fr.zoreil.com>
Link: https://lore.kernel.org/r/20220627043351.25615-1-ztong0001@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-27 21:48:51 -07:00
Jakub Kicinski
a8fc8cb569 net: tun: stop NAPI when detaching queues
While looking at a syzbot report I noticed the NAPI only gets
disabled before it's deleted. I think that user can detach
the queue before destroying the device and the NAPI will never
be stopped.

Fixes: 943170998b ("tun: enable NAPI for TUN/TAP driver")
Acked-by: Petar Penkov <ppenkov@aviatrix.com>
Link: https://lore.kernel.org/r/20220623042105.2274812-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-27 21:48:01 -07:00
John Garry
fce54ed027 scsi: hisi_sas: Limit max hw sectors for v3 HW
If the controller is behind an IOMMU then the IOMMU IOVA caching range can
affect performance, as discussed in [0].

Limit the max HW sectors to not exceed this limit. We need to hardcode the
value until a proper DMA mapping API is available.

[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/

Link: https://lore.kernel.org/r/1655988119-223714-1-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-27 22:43:57 -04:00
Heinz Mauelshagen
332bd07787 dm raid: fix accesses beyond end of raid member array
On dm-raid table load (using raid_ctr), dm-raid allocates an array
rs->devs[rs->raid_disks] for the raid device members. rs->raid_disks
is defined by the number of raid metadata and image tupples passed
into the target's constructor.

In the case of RAID layout changes being requested, that number can be
different from the current number of members for existing raid sets as
defined in their superblocks. Example RAID layout changes include:
- raid1 legs being added/removed
- raid4/5/6/10 number of stripes changed (stripe reshaping)
- takeover to higher raid level (e.g. raid5 -> raid6)

When accessing array members, rs->raid_disks must be used in control
loops instead of the potentially larger value in rs->md.raid_disks.
Otherwise it will cause memory access beyond the end of the rs->devs
array.

Fix this by changing code that is prone to out-of-bounds access.
Also fix validate_raid_redundancy() to validate all devices that are
added. Also, use braces to help clean up raid_iterate_devices().

The out-of-bounds memory accesses was discovered using KASAN.

This commit was verified to pass all LVM2 RAID tests (with KASAN
enabled).

Cc: stable@vger.kernel.org
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-27 19:31:47 -04:00
Rob Clark
08de214138 drm/msm/gem: Fix error return on fence id alloc fail
This was a typo, we didn't actually want to return zero.

Fixes: a61acbbe9c ("drm/msm: Track "seqno" fences by idr")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/491145/
Link: https://lore.kernel.org/r/20220624184528.4036837-1-robdclark@gmail.com
2022-06-27 12:48:27 -07:00
Helge Deller
96b80fcd27 parisc/unaligned: Fix emulate_ldw() breakage
The commit e8aa7b17fe broke the 32-bit load-word unalignment exception
handler because it calculated the wrong amount of bits by which the value
should be shifted. This patch fixes it.

Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: e8aa7b17fe ("parisc/unaligned: Rewrite inline assembly of emulate_ldw()")
Cc: stable@vger.kernel.org   # v5.18
2022-06-27 21:30:11 +02:00
Linus Torvalds
941e3e7912 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
 "Fixes all over the place, most notably we are disabling
  IRQ hardening (again!)"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_ring: make vring_create_virtqueue_split prettier
  vhost-vdpa: call vhost_vdpa_cleanup during the release
  virtio_mmio: Restore guest page size on resume
  virtio_mmio: Add missing PM calls to freeze/restore
  caif_virtio: fix race between virtio_device_ready() and ndo_open()
  virtio-net: fix race between ndo_open() and virtio_device_ready()
  virtio: disable notification hardening by default
  virtio: Remove unnecessary variable assignments
  virtio_ring : keep used_wrap_counter in vq->last_used_idx
  vduse: Tie vduse mgmtdev and its device
  vdpa/mlx5: Initialize CVQ vringh only once
  vdpa/mlx5: Update Control VQ callback information
2022-06-27 10:47:34 -07:00
Masahiro Yamada
2390095113 tick/nohz: unexport __init-annotated tick_nohz_full_setup()
EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.

modpost used to detect it, but it had been broken for a decade.

Commit 28438794ab ("modpost: fix section mismatch check for exported
init/exit sections") fixed it so modpost started to warn it again, then
this showed up:

    MODPOST vmlinux.symvers
  WARNING: modpost: vmlinux.o(___ksymtab_gpl+tick_nohz_full_setup+0x0): Section mismatch in reference from the variable __ksymtab_tick_nohz_full_setup to the function .init.text:tick_nohz_full_setup()
  The symbol tick_nohz_full_setup is exported and annotated __init
  Fix this by removing the __init annotation of tick_nohz_full_setup or drop the export.

Drop the export because tick_nohz_full_setup() is only called from the
built-in code in kernel/sched/isolation.c.

Fixes: ae9e557b5b ("time: Export tick start/stop functions for rcutorture")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-27 10:43:12 -07:00
Florian Westphal
c2577862ee netfilter: br_netfilter: do not skip all hooks with 0 priority
When br_netfilter module is loaded, skbs may be diverted to the
ipv4/ipv6 hooks, just like as if we were routing.

Unfortunately, bridge filter hooks with priority 0 may be skipped
in this case.

Example:
1. an nftables bridge ruleset is loaded, with a prerouting
   hook that has priority 0.
2. interface is added to the bridge.
3. no tcp packet is ever seen by the bridge prerouting hook.
4. flush the ruleset
5. load the bridge ruleset again.
6. tcp packets are processed as expected.

After 1) the only registered hook is the bridge prerouting hook, but its
not called yet because the bridge hasn't been brought up yet.

After 2), hook order is:
   0 br_nf_pre_routing // br_netfilter internal hook
   0 chain bridge f prerouting // nftables bridge ruleset

The packet is diverted to br_nf_pre_routing.
If call-iptables is off, the nftables bridge ruleset is called as expected.

But if its enabled, br_nf_hook_thresh() will skip it because it assumes
that all 0-priority hooks had been called previously in bridge context.

To avoid this, check for the br_nf_pre_routing hook itself, we need to
resume directly after it, even if this hook has a priority of 0.

Unfortunately, this still results in different packet flow.
With this fix, the eval order after in 3) is:
1. br_nf_pre_routing
2. ip(6)tables (if enabled)
3. nftables bridge

but after 5 its the much saner:
1. nftables bridge
2. br_nf_pre_routing
3. ip(6)tables (if enabled)

Unfortunately I don't see a solution here:
It would be possible to move br_nf_pre_routing to a higher priority
so that it will be called later in the pipeline, but this also impacts
ebtables evaluation order, and would still result in this very ordering
problem for all nftables-bridge hooks with the same priority as the
br_nf_pre_routing one.

Searching back through the git history I don't think this has
ever behaved in any other way, hence, no fixes-tag.

Reported-by: Radim Hrazdil <rhrazdil@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-27 19:23:27 +02:00
Florian Westphal
e34b9ed96c netfilter: nf_tables: avoid skb access on nf_stolen
When verdict is NF_STOLEN, the skb might have been freed.

When tracing is enabled, this can result in a use-after-free:
1. access to skb->nf_trace
2. access to skb->mark
3. computation of trace id
4. dump of packet payload

To avoid 1, keep a cached copy of skb->nf_trace in the
trace state struct.
Refresh this copy whenever verdict is != STOLEN.

Avoid 2 by skipping skb->mark access if verdict is STOLEN.

3 is avoided by precomputing the trace id.

Only dump the packet when verdict is not "STOLEN".

Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-27 19:22:54 +02:00
Pablo Neira Ayuso
05907f10e2 netfilter: nft_dynset: restore set element counter when failing to update
This patch fixes a race condition.

nft_rhash_update() might fail for two reasons:

- Element already exists in the hashtable.
- Another packet won race to insert an entry in the hashtable.

In both cases, new() has already bumped the counter via atomic_add_unless(),
therefore, decrement the set element counter.

Fixes: 22fe54d5fe ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-27 19:03:37 +02:00
Matthew Auld
79538490fd drm/i915: tweak the ordering in cpu_write_needs_clflush
For imported dma-buf objects we leave the object as cache_coherent = 0
across all platforms, which is reasonable given that have no clue what
the memory underneath is, and its not like the driver can ever manually
clflush the pages anyway (like with i915_gem_clflush_object) for such
objects. However on discrete we choose to treat cache_dirty = true as a
programmer error, leading to a warning. The simplest fix looks to be to
just change the ordering in cpu_write_needs_clflush to prevent ever
setting cache_dirty for dma-buf objects on discrete.

Fixes: d028a7690d ("drm/i915/dmabuf: Fix prime_mmap to work when using LMEM")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5266
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220622155919.355081-1-matthew.auld@intel.com
(cherry picked from commit 563aaf4a92)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-06-27 18:12:10 +03:00
Anshuman Gupta
7d23a80dc9 drm/i915/dgfx: Disable d3cold at gfx root port
Currently i915 disables d3cold for i915 pci dev.
This blocks D3 for i915 gfx pci upstream bridge (VSP).
Let's disable d3cold at gfx root port to make sure that
i915 gfx VSP can transition to D3 to save some power.

We don't need to disable/enable d3cold in rpm, s2idle
suspend/resume handlers. Disabling/Enabling d3cold at
gfx root port in probe/remove phase is sufficient.

Fixes: 1a085e2341 ("drm/i915: Disable D3Cold in s2idle and runtime pm")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220616122249.5007-1-anshuman.gupta@intel.com
(cherry picked from commit 138c2fca6f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-06-27 18:12:05 +03:00
katrinzhou
9efdd519d0 drm/i915/gem: add missing else
Add missing else in set_proto_ctx_param() to fix coverity issue.

Addresses-Coverity: ("Unused value")
Fixes: d4433c7600 ("drm/i915/gem: Use the proto-context to handle create parameters (v5)")
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: katrinzhou <katrinzhou@tencent.com>
[tursulin: fixup alignment]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621124926.615884-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 7482a65664)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-06-27 18:12:03 +03:00
Alexey Khoroshilov
8a9ffb8c85 NFSD: restore EINVAL error translation in nfsd_commit()
commit 555dbf1a9a ("nfsd: Replace use of rwsem with errseq_t")
incidentally broke translation of -EINVAL to nfserr_notsupp.
The patch restores that.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Fixes: 555dbf1a9a ("nfsd: Replace use of rwsem with errseq_t")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-06-27 10:33:05 -04:00
Hans de Goede
8853e8ce9b platform/x86: ideapad-laptop: Add Ideapad 5 15ITL05 to ideapad_dytc_v4_allow_table[]
The Ideapad 5 15ITL05 uses DYTC version 4 for platform-profile
control. This has been tested successfully with the ideapad-laptop
DYTC version 5 code; Add the Ideapad 5 15ITL05 to the
ideapad_dytc_v4_allow_table[].

Fixes: 599482c58e ("platform/x86: ideapad-laptop: Add platform support for Ideapad 5 Pro 16ACH6-82L5")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213297
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220627130850.313537-1-hdegoede@redhat.com
2022-06-27 16:20:27 +02:00
Hans de Goede
a27a1e35f5 platform/x86: ideapad-laptop: Add allow_v4_dytc module parameter
Add an allow_v4_dytc module parameter to allow users to easily test if
DYTC version 4 platform-profiles work on their laptop.

Fixes: 599482c58e ("platform/x86: ideapad-laptop: Add platform support for Ideapad 5 Pro 16ACH6-82L5")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213297
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220623115914.103001-1-hdegoede@redhat.com
2022-06-27 16:20:11 +02:00
Maxime Ripard
5f701324c0 drm/vc4: perfmon: Fix variable dereferenced before check
Commit 30f8c74ca9 ("drm/vc4: Warn if some v3d code is run on BCM2711")
introduced a check in vc4_perfmon_get() that dereferences a pointer before
we checked whether that pointer is valid or not.

Let's rework that function a bit to do things in the proper order.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 30f8c74ca9 ("drm/vc4: Warn if some v3d code is run on BCM2711")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: José Expósito <jose.exposito89@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220622080243.22119-1-maxime@cerno.tech
2022-06-27 15:43:14 +02:00
Deming Wang
c7cc29aaeb virtio_ring: make vring_create_virtqueue_split prettier
Add some spaces to vring_alloc_queue(make it look prettier).

Signed-off-by: Deming Wang <wangdeming@inspur.com>
Message-Id: <20220622192306.4371-1-wangdeming@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-27 08:05:35 -04:00
Stefano Garzarella
037d430556 vhost-vdpa: call vhost_vdpa_cleanup during the release
Before commit 3d56987938 ("vhost-vdpa: introduce asid based IOTLB")
we call vhost_vdpa_iotlb_free() during the release to clean all regions
mapped in the iotlb.

That commit removed vhost_vdpa_iotlb_free() and added vhost_vdpa_cleanup()
to do some cleanup, including deleting all mappings, but we forgot to call
it in vhost_vdpa_release().

This causes that if an application does not remove all mappings explicitly
(or it crashes), the mappings remain in the iotlb and subsequent
applications may fail if they map the same addresses.

Calling vhost_vdpa_cleanup() also fixes a memory leak since we are not
freeing `v->vdev.vqs` during the release from the same commit.

Since vhost_vdpa_cleanup() calls vhost_dev_cleanup() we can remove its
call from vhost_vdpa_release().

Fixes: 3d56987938 ("vhost-vdpa: introduce asid based IOTLB")
Cc: gautam.dawar@xilinx.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220622151407.51232-1-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-27 08:05:35 -04:00
Stephan Gerhold
e0c2ce8217 virtio_mmio: Restore guest page size on resume
Virtio devices might lose their state when the VMM is restarted
after a suspend to disk (hibernation) cycle. This means that the
guest page size register must be restored for the virtio_mmio legacy
interface, since otherwise the virtio queues are not functional.

This is particularly problematic for QEMU that currently still defaults
to using the legacy interface for virtio_mmio. Write the guest page
size register again in virtio_mmio_restore() to make legacy virtio_mmio
devices work correctly after hibernation.

Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Message-Id: <20220621110621.3638025-3-stephan.gerhold@kernkonzept.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-27 08:05:35 -04:00
Stephan Gerhold
ed7ac37fde virtio_mmio: Add missing PM calls to freeze/restore
Most virtio drivers provide freeze/restore callbacks to finish up
device usage before suspend and to reinitialize the virtio device after
resume. However, these callbacks are currently only called when using
virtio_pci. virtio_mmio does not have any PM ops defined.

This causes problems for example after suspend to disk (hibernation),
since the virtio devices might lose their state after the VMM is
restarted. Calling virtio_device_freeze()/restore() ensures that
the virtio devices are re-initialized correctly.

Fix this by implementing the dev_pm_ops for virtio_mmio,
similar to virtio_pci_common.

Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Message-Id: <20220621110621.3638025-2-stephan.gerhold@kernkonzept.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-27 08:05:35 -04:00
Jason Wang
11a37eb668 caif_virtio: fix race between virtio_device_ready() and ndo_open()
We currently depend on probe() calling virtio_device_ready() -
which happens after netdev
registration. Since ndo_open() can be called immediately
after register_netdev, this means there exists a race between
ndo_open() and virtio_device_ready(): the driver may start to use the
device (e.g. TX) before DRIVER_OK which violates the spec.

Fix this by switching to use register_netdevice() and protect the
virtio_device_ready() with rtnl_lock() to make sure ndo_open() can
only be called after virtio_device_ready().

Fixes: 0d2e1a2926 ("caif_virtio: Introduce caif over virtio")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220620051115.3142-3-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-27 08:04:30 -04:00
Jason Wang
50c0ada627 virtio-net: fix race between ndo_open() and virtio_device_ready()
We currently call virtio_device_ready() after netdev
registration. Since ndo_open() can be called immediately
after register_netdev, this means there exists a race between
ndo_open() and virtio_device_ready(): the driver may start to use the
device before DRIVER_OK which violates the spec.

Fix this by switching to use register_netdevice() and protect the
virtio_device_ready() with rtnl_lock() to make sure ndo_open() can
only be called after virtio_device_ready().

Fixes: 4baf1e33d0 ("virtio_net: enable VQs early")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220617072949.30734-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-27 08:02:59 -04:00
Xin Long
cb8092d70a tipc: move bc link creation back to tipc_node_create
Shuang Li reported a NULL pointer dereference crash:

  [] BUG: kernel NULL pointer dereference, address: 0000000000000068
  [] RIP: 0010:tipc_link_is_up+0x5/0x10 [tipc]
  [] Call Trace:
  []  <IRQ>
  []  tipc_bcast_rcv+0xa2/0x190 [tipc]
  []  tipc_node_bc_rcv+0x8b/0x200 [tipc]
  []  tipc_rcv+0x3af/0x5b0 [tipc]
  []  tipc_udp_recv+0xc7/0x1e0 [tipc]

It was caused by the 'l' passed into tipc_bcast_rcv() is NULL. When it
creates a node in tipc_node_check_dest(), after inserting the new node
into hashtable in tipc_node_create(), it creates the bc link. However,
there is a gap between this insert and bc link creation, a bc packet
may come in and get the node from the hashtable then try to dereference
its bc link, which is NULL.

This patch is to fix it by moving the bc link creation before inserting
into the hashtable.

Note that for a preliminary node becoming "real", the bc link creation
should also be called before it's rehashed, as we don't create it for
preliminary nodes.

Fixes: 4cbf8ac2fe ("tipc: enable creating a "preliminary" node")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-27 11:51:56 +01:00
Eric Dumazet
853a761488 tunnels: do not assume mac header is set in skb_tunnel_check_pmtu()
Recently added debug in commit f9aefd6b2a ("net: warn if mac header
was not set") caught a bug in skb_tunnel_check_pmtu(), as shown
in this syzbot report [1].

In ndo_start_xmit() paths, there is really no need to use skb->mac_header,
because skb->data is supposed to point at it.

[1] WARNING: CPU: 1 PID: 8604 at include/linux/skbuff.h:2784 skb_mac_header_len include/linux/skbuff.h:2784 [inline]
WARNING: CPU: 1 PID: 8604 at include/linux/skbuff.h:2784 skb_tunnel_check_pmtu+0x5de/0x2f90 net/ipv4/ip_tunnel_core.c:413
Modules linked in:
CPU: 1 PID: 8604 Comm: syz-executor.3 Not tainted 5.19.0-rc2-syzkaller-00443-g8720bd951b8e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:skb_mac_header_len include/linux/skbuff.h:2784 [inline]
RIP: 0010:skb_tunnel_check_pmtu+0x5de/0x2f90 net/ipv4/ip_tunnel_core.c:413
Code: 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 84 b9 fe ff ff 4c 89 ff e8 7c 0f d7 f9 e9 ac fe ff ff e8 c2 13 8a f9 <0f> 0b e9 28 fc ff ff e8 b6 13 8a f9 48 8b 54 24 70 48 b8 00 00 00
RSP: 0018:ffffc90002e4f520 EFLAGS: 00010212
RAX: 0000000000000324 RBX: ffff88804d5fd500 RCX: ffffc90005b52000
RDX: 0000000000040000 RSI: ffffffff87f05e3e RDI: 0000000000000003
RBP: ffffc90002e4f650 R08: 0000000000000003 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000000 R12: 000000000000ffff
R13: 0000000000000000 R14: 000000000000ffcd R15: 000000000000001f
FS: 00007f3babba9700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000080 CR3: 0000000075319000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
geneve_xmit_skb drivers/net/geneve.c:927 [inline]
geneve_xmit+0xcf8/0x35d0 drivers/net/geneve.c:1107
__netdev_start_xmit include/linux/netdevice.h:4805 [inline]
netdev_start_xmit include/linux/netdevice.h:4819 [inline]
__dev_direct_xmit+0x500/0x730 net/core/dev.c:4309
dev_direct_xmit include/linux/netdevice.h:3007 [inline]
packet_direct_xmit+0x1b8/0x2c0 net/packet/af_packet.c:282
packet_snd net/packet/af_packet.c:3073 [inline]
packet_sendmsg+0x21f4/0x55d0 net/packet/af_packet.c:3104
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:734
____sys_sendmsg+0x6eb/0x810 net/socket.c:2489
___sys_sendmsg+0xf3/0x170 net/socket.c:2543
__sys_sendmsg net/socket.c:2572 [inline]
__do_sys_sendmsg net/socket.c:2581 [inline]
__se_sys_sendmsg net/socket.c:2579 [inline]
__x64_sys_sendmsg+0x132/0x220 net/socket.c:2579
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f3baaa89109
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3babba9168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f3baab9bf60 RCX: 00007f3baaa89109
RDX: 0000000000000000 RSI: 0000000020000a00 RDI: 0000000000000003
RBP: 00007f3baaae305d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffe74f2543f R14: 00007f3babba9300 R15: 0000000000022000
</TASK>

Fixes: 4cb47a8644 ("tunnels: PMTU discovery support for directly bridged IP packets")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-27 11:50:30 +01:00
Jean Delvare
d2f33f0c3a platform/x86: thinkpad_acpi: Fix a memory leak of EFCH MMIO resource
Unlike release_mem_region(), a call to release_resource() does not
free the resource, so it has to be freed explicitly to avoid a memory
leak.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 455cd867b8 ("platform/x86: thinkpad_acpi: Add a s2idle resume quirk for a number of laptops")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Mark Gross <markgross@kernel.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20220621155511.5b266395@endymion.delvare
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-27 09:38:56 +02:00
Dan Carpenter
79e90ca02d platform/mellanox: nvsw-sn2201: fix error code in nvsw_sn2201_create_static_devices()
This should return PTR_ERR() instead of IS_ERR().  Also "dev->client"
has been set to NULL by this point so it returns 0/success so preserve
the error code earlier.

Fixes: 662f24826f ("platform/mellanox: Add support for new SN2201 system")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/YqmUGwmPK7cPolk/@kili
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-27 09:38:56 +02:00
Gayatri Kammela
d63eae6747 platform/x86: intel/pmc: Add Alder Lake N support to PMC core driver
Add Alder Lake N (ADL-N) to the list of the platforms that Intel's
PMC core driver supports. Alder Lake N reuses all the TigerLake PCH IPs.

Cc: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@linux.intel.com>
Reviewed-by: Rajneesh Bhardwaj <irenic.rajneesh@gmail.com>
Link: https://lore.kernel.org/r/20220615002751.3371730-1-gayatri.kammela@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-06-27 09:38:31 +02:00
Darrick J. Wong
f94e08b602 xfs: clean up the end of xfs_attri_item_recover
The end of this function could use some cleanup -- the EAGAIN
conditionals make it harder to figure out what's going on with the
disposal of xattri_leaf_bp, and the dual error/ret variables aren't
needed.  Turn the EAGAIN case into a separate block documenting all the
subtleties of recovering in the middle of an xattr update chain, which
makes the rest of the prologue much simpler.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-06-26 14:43:28 -07:00
Darrick J. Wong
b822ea17fd xfs: always free xattri_leaf_bp when cancelling a deferred op
While running the following fstest with logged xattrs DISabled, I
noticed the following:

# FSSTRESS_AVOID="-z -f unlink=1 -f rmdir=1 -f creat=2 -f mkdir=2 -f
getfattr=3 -f listfattr=3 -f attr_remove=4 -f removefattr=4 -f
setfattr=20 -f attr_set=60" ./check generic/475

INFO: task u9:1:40 blocked for more than 61 seconds.
      Tainted: G           O      5.19.0-rc2-djwx #rc2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:u9:1            state:D stack:12872 pid:   40 ppid:     2 flags:0x00004000
Workqueue: xfs-cil/dm-0 xlog_cil_push_work [xfs]
Call Trace:
 <TASK>
 __schedule+0x2db/0x1110
 schedule+0x58/0xc0
 schedule_timeout+0x115/0x160
 __down_common+0x126/0x210
 down+0x54/0x70
 xfs_buf_lock+0x2d/0xe0 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73]
 xfs_buf_item_unpin+0x227/0x3a0 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73]
 xfs_trans_committed_bulk+0x18e/0x320 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73]
 xlog_cil_committed+0x2ea/0x360 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73]
 xlog_cil_push_work+0x60f/0x690 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73]
 process_one_work+0x1df/0x3c0
 worker_thread+0x53/0x3b0
 kthread+0xea/0x110
 ret_from_fork+0x1f/0x30
 </TASK>

This appears to be the result of shortform_to_leaf creating a new leaf
buffer as part of adding an xattr to a file.  The new leaf buffer is
held and attached to the xfs_attr_intent structure, but then the
filesystem shuts down.  Instead of the usual path (which adds the attr
to the held leaf buffer which releases the hold), we instead cancel the
entire deferred operation.

Unfortunately, xfs_attr_cancel_item doesn't release any attached leaf
buffers, so we leak the locked buffer.  The CIL cannot do anything
about that, and hangs.  Fix this by teaching it to release leaf buffers,
and make XFS a little more careful about not leaving a dangling
reference.

The prologue of xfs_attri_item_recover is (in this author's opinion) a
little hard to figure out, so I'll clean that up in the next patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-06-26 14:43:28 -07:00
Kaixu Xia
82af880639 xfs: use invalidate_lock to check the state of mmap_lock
We should use invalidate_lock and XFS_MMAPLOCK_SHARED to check the state
of mmap_lock rw_semaphore in xfs_isilocked(), rather than i_rwsem and
XFS_IOLOCK_SHARED.

Fixes: 2433480a7e ("xfs: Convert to use invalidate_lock")
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-26 14:43:28 -07:00
Kaixu Xia
ca76a761ea xfs: factor out the common lock flags assert
There are similar lock flags assert in xfs_ilock(), xfs_ilock_nowait(),
xfs_iunlock(), thus we can factor it out into a helper that is clear.

Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-26 14:43:28 -07:00
Linus Torvalds
03c765b0e3 Linux 5.19-rc4 2022-06-26 14:22:10 -07:00
Linus Torvalds
1709b88739 Merge tag 'soc-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "A number of fixes have accumulated, but they are largely for harmless
  issues:

   - Several OF node leak fixes

   - A fix to the Exynos7885 UART clock description

   - DTS fixes to prevent boot failures on TI AM64 and J721s2

   - Bus probe error handling fixes for Baikal-T1

   - A fixup to the way STM32 SoCs use separate dts files for different
     firmware stacks

   - Multiple code fixes for Arm SCMI firmware, all dealing with
     robustness of the implementation

   - Multiple NXP i.MX devicetree fixes, addressing incorrect data in DT
     nodes

   - Three updates to the MAINTAINERS file, including Florian Fainelli
     taking over BCM283x/BCM2711 (Raspberry Pi) from Nicolas Saenz
     Julienne"

* tag 'soc-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (29 commits)
  ARM: dts: aspeed: nuvia: rename vendor nuvia to qcom
  arm: mach-spear: Add missing of_node_put() in time.c
  ARM: cns3xxx: Fix refcount leak in cns3xxx_init
  MAINTAINERS: Update email address
  arm64: dts: ti: k3-am64-main: Remove support for HS400 speed mode
  arm64: dts: ti: k3-j721s2: Fix overlapping GICD memory region
  ARM: dts: bcm2711-rpi-400: Fix GPIO line names
  bus: bt1-axi: Don't print error on -EPROBE_DEFER
  bus: bt1-apb: Don't print error on -EPROBE_DEFER
  ARM: Fix refcount leak in axxia_boot_secondary
  ARM: dts: stm32: move SCMI related nodes in a dedicated file for stm32mp15
  soc: imx: imx8m-blk-ctrl: fix display clock for LCDIF2 power domain
  ARM: dts: imx6qdl-colibri: Fix capacitive touch reset polarity
  ARM: dts: imx6qdl: correct PU regulator ramp delay
  firmware: arm_scmi: Fix incorrect error propagation in scmi_voltage_descriptors_get
  firmware: arm_scmi: Avoid using extended string-buffers sizes if not necessary
  firmware: arm_scmi: Fix SENSOR_AXIS_NAME_GET behaviour when unsupported
  ARM: dts: imx7: Move hsic_phy power domain to HSIC PHY node
  soc: bcm: brcmstb: pm: pm-arm: Fix refcount leak in brcmstb_pm_probe
  MAINTAINERS: Update BCM2711/BCM2835 maintainer
  ...
2022-06-26 14:12:56 -07:00
Linus Torvalds
413c1f1491 Merge tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
 "Minor things, mainly - mailmap updates, MAINTAINERS updates, etc.

  Fixes for this merge window:

   - fix for a damon boot hang, from SeongJae

   - fix for a kfence warning splat, from Jason Donenfeld

   - fix for zero-pfn pinning, from Alex Williamson

   - fix for fallocate hole punch clearing, from Mike Kravetz

  Fixes for previous releases:

   - fix for a performance regression, from Marcelo

   - fix for a hwpoisining BUG from zhenwei pi"

* tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add entry for Christian Marangi
  mm/memory-failure: disable unpoison once hw error happens
  hugetlbfs: zero partial pages during fallocate hole punch
  mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py
  mm: re-allow pinning of zero pfns
  mm/kfence: select random number before taking raw lock
  MAINTAINERS: add maillist information for LoongArch
  MAINTAINERS: update MM tree references
  MAINTAINERS: update Abel Vesa's email
  MAINTAINERS: add MEMORY HOT(UN)PLUG section and add David as reviewer
  MAINTAINERS: add Miaohe Lin as a memory-failure reviewer
  mailmap: add alias for jarkko@profian.com
  mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized
  kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal
  mm: lru_cache_disable: use synchronize_rcu_expedited
  mm/page_isolation.c: fix one kernel-doc comment
2022-06-26 14:00:55 -07:00
Linus Torvalds
893d1eaa56 Merge tag 'perf-tools-fixes-for-v5.19-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Enable ignore_missing_thread in 'perf stat', enabling counting with
   '--pid' when threads disappear during counting session setup

 - Adjust output data offset for backward compatibility in 'perf inject'

 - Fix missing free in copy_kcore_dir() in 'perf inject'

 - Fix caching files with a wrong build ID

 - Sync drm, cpufeatures, vhost and svn headers with the kernel

* tag 'perf-tools-fixes-for-v5.19-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  tools headers UAPI: Synch KVM's svm.h header with the kernel
  tools include UAPI: Sync linux/vhost.h with the kernel sources
  perf stat: Enable ignore_missing_thread
  perf inject: Adjust output data offset for backward compatibility
  perf trace beauty: Fix generation of errno id->str table on ALT Linux
  perf build-id: Fix caching files with a wrong build ID
  tools headers cpufeatures: Sync with the kernel sources
  tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
  perf inject: Fix missing free in copy_kcore_dir()
2022-06-26 12:12:25 -07:00
Linus Torvalds
82708bb1eb Merge tag 'for-5.19-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - zoned relocation fixes:
      - fix critical section end for extent writeback, this could lead
        to out of order write
      - prevent writing to previous data relocation block group if space
        gets low

 - reflink fixes:
      - fix race between reflinking and ordered extent completion
      - proper error handling when block reserve migration fails
      - add missing inode iversion/mtime/ctime updates on each iteration
        when replacing extents

 - fix deadlock when running fsync/fiemap/commit at the same time

 - fix false-positive KCSAN report regarding pid tracking for read locks
   and data race

 - minor documentation update and link to new site

* tag 'for-5.19-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Documentation: update btrfs list of features and link to readthedocs.io
  btrfs: fix deadlock with fsync+fiemap+transaction commit
  btrfs: don't set lock_owner when locking extent buffer for reading
  btrfs: zoned: fix critical section of relocation inode writeback
  btrfs: zoned: prevent allocation from previous data relocation BG
  btrfs: do not BUG_ON() on failure to migrate space when replacing extents
  btrfs: add missing inode updates on each iteration when replacing extents
  btrfs: fix race between reflinking and ordered extent completion
2022-06-26 10:11:36 -07:00
Linus Torvalds
c898c67db6 Merge tag 'dma-mapping-5.19-2022-06-26' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:

 - pass the correct size to dma_set_encrypted() when freeing memory
   (Dexuan Cui)

* tag 'dma-mapping-5.19-2022-06-26' of git://git.infradead.org/users/hch/dma-mapping:
  dma-direct: use the correct size for dma_set_encrypted()
2022-06-26 10:01:40 -07:00
Linus Torvalds
be129fab66 Merge tag 'for-5.19/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev fixes from Helge Deller:
 "Two bug fixes for the pxa3xx and intelfb drivers:

   - pxa3xx-gcu: Fix integer overflow in pxa3xx_gcu_write

   - intelfb: Initialize value of stolen size

  The other changes are small cleanups, simplifications and
  documentation updates to the cirrusfb, skeletonfb, omapfb,
  intelfb, au1100fb and simplefb drivers"

* tag 'for-5.19/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  video: fbdev: omap: Remove duplicate 'the' in comment
  video: fbdev: omapfb: Align '*' in comment
  video: fbdev: simplefb: Check before clk_put() not needed
  video: fbdev: au1100fb: Drop unnecessary NULL ptr check
  video: fbdev: pxa3xx-gcu: Fix integer overflow in pxa3xx_gcu_write
  video: fbdev: skeletonfb: Convert to generic power management
  video: fbdev: cirrusfb: Remove useless reference to PCI power management
  video: fbdev: intelfb: Initialize value of stolen size
  video: fbdev: intelfb: Use aperture size from pci_resource_len
  video: fbdev: skeletonfb: Fix syntax errors in comments
2022-06-26 09:13:51 -07:00
Linus Torvalds
c0c6a7bd4c Merge tag 'for-5.19/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:

 - enable ARCH_HAS_STRICT_MODULE_RWX to prevent a boot crash on c8000
   machines

 - flush all mappings of a shared anonymous page on PA8800/8900 machines
   via flushing the whole data cache. This may slow down such machines
   but makes sure that the cache is consistent

 - Fix duplicate definition build error regarding fb_is_primary_device()

* tag 'for-5.19/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Enable ARCH_HAS_STRICT_MODULE_RWX
  parisc: Fix flush_anon_page on PA8800/PA8900
  parisc: align '*' in comment in math-emu code
  parisc/stifb: Fix fb_is_primary_device() only available with CONFIG_FB_STI
2022-06-26 09:08:30 -07:00
Linus Torvalds
e963d685dd Merge tag 'xtensa-20220626' of https://github.com/jcmvbkbc/linux-xtensa
Pull xtensa fixes from Max Filippov:

 - fix OF reference leaks in xtensa arch code

 - replace '.bss' with '.section .bss' to fix entry.S build with old
   assembler

* tag 'xtensa-20220626' of https://github.com/jcmvbkbc/linux-xtensa:
  xtensa: change '.bss' to '.section .bss'
  xtensa: xtfpga: Fix refcount leak bug in setup
  xtensa: Fix refcount leak bug in time.c
2022-06-26 08:59:21 -07:00
Linus Torvalds
8100775d59 Merge tag 'powerpc-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - A fix for a CMA change that broke booting guests with > 2G RAM on
   Power8 hosts.

 - Fix the RTAS call filter to allow a special case that applications
   rely on.

 - A change to our execve path, to make the execve syscall exit
   tracepoint work.

 - Three fixes to wire up our various RNGs earlier in boot so they're
   available for use in the initial seeding in random_init().

 - A build fix for when KASAN is enabled along with
   STRUCTLEAK_BYREF_ALL.

Thanks to Andrew Donnellan, Aneesh Kumar K.V, Christophe Leroy, Jason
Donenfeld, Nathan Lynch, Naveen N. Rao, Sathvika Vasireddy, Sumit
Dubey2, Tyrel Datwyler, and Zi Yan.

* tag 'powerpc-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powernv: wire up rng during setup_arch
  powerpc/prom_init: Fix build failure with GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and KASAN
  powerpc/rtas: Allow ibm,platform-dump RTAS call with null buffer address
  powerpc: Enable execve syscall exit tracepoint
  powerpc/pseries: wire up rng during setup_arch()
  powerpc/microwatt: wire up rng during setup_arch()
  powerpc/mm: Move CMA reservations after initmem_init()
2022-06-26 08:53:20 -07:00
Linus Torvalds
393ed5d85e Merge tag 'kbuild-fixes-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - Fix modpost to detect EXPORT_SYMBOL marked as __init or__exit

 - Update the supported arch list in the LLVM document

 - Avoid the second link of vmlinux for CONFIG_TRIM_UNUSED_KSYMS

 - Avoid false __KSYM___this_module define in include/generated/autoksyms.h

* tag 'kbuild-fixes-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: Ignore __this_module in gen_autoksyms.sh
  kbuild: link vmlinux only once for CONFIG_TRIM_UNUSED_KSYMS (2nd attempt)
  Documentation/llvm: Update Supported Arch table
  modpost: fix section mismatch check for exported init/exit sections
2022-06-26 08:47:28 -07:00
Linus Torvalds
97d4d02697 Merge tag 'exfat-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat fix from Namjae Jeon:

 - Use updated exfat_chain directly instead of snapshot values in
   rename.

* tag 'exfat-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: use updated exfat_chain directly during renaming
2022-06-26 08:41:04 -07:00
Linus Torvalds
918c30dffd Merge tag '5.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client fixes from Steve French:
 "Fixes addressing important multichannel, and reconnect issues.

  Multichannel mounts when the server network interfaces changed, or ip
  addresses changed, uncovered problems, especially in reconnect, but
  the patches for this were held up until recently due to some lock
  conflicts that are now addressed.

  Included in this set of fixes:

   - three fixes relating to multichannel reconnect, dynamically
     adjusting the list of server interfaces to avoid problems during
     reconnect

   - a lock conflict fix related to the above

   - two important fixes for negotiate on secondary channels (null
     netname can unintentionally cause multichannel to be disabled to
     some servers)

   - a reconnect fix (reporting incorrect IP address in some cases)"

* tag '5.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update cifs_ses::ip_addr after failover
  cifs: avoid deadlocks while updating iface
  cifs: periodically query network interfaces from server
  cifs: during reconnect, update interface if necessary
  cifs: change iface_list from array to sorted linked list
  smb3: use netname when available on secondary channels
  smb3: fix empty netname context on secondary channels
2022-06-26 08:34:52 -07:00
Arnaldo Carvalho de Melo
f8d8661940 tools headers UAPI: Synch KVM's svm.h header with the kernel
To pick up the changes from:

  d5af44dde5 ("x86/sev: Provide support for SNP guest request NAEs")
  0afb6b660a ("x86/sev: Use SEV-SNP AP creation to start secondary CPUs")
  dc3f3d2474 ("x86/mm: Validate memory when changing the C-bit")
  cbd3d4f7c4 ("x86/sev: Check SEV-SNP features support")

That gets these new SVM exit reasons:

+       { SVM_VMGEXIT_PSC,              "vmgexit_page_state_change" }, \
+       { SVM_VMGEXIT_GUEST_REQUEST,    "vmgexit_guest_request" }, \
+       { SVM_VMGEXIT_EXT_GUEST_REQUEST, "vmgexit_ext_guest_request" }, \
+       { SVM_VMGEXIT_AP_CREATION,      "vmgexit_ap_creation" }, \
+       { SVM_VMGEXIT_HV_FEATURES,      "vmgexit_hypervisor_feature" }, \

Addressing this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/svm.h' differs from latest version at 'arch/x86/include/uapi/asm/svm.h'
  diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h

This causes these changes:

  CC      /tmp/build/perf-urgent/arch/x86/util/kvm-stat.o
  LD      /tmp/build/perf-urgent/arch/x86/util/perf-in.o
  LD      /tmp/build/perf-urgent/arch/x86/perf-in.o
  LD      /tmp/build/perf-urgent/arch/perf-in.o
  LD      /tmp/build/perf-urgent/perf-in.o
  LINK    /tmp/build/perf-urgent/perf

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:55 -03:00
Arnaldo Carvalho de Melo
e2213a2dc6 tools include UAPI: Sync linux/vhost.h with the kernel sources
To get the changes in:

  84d7c8fd3a ("vhost-vdpa: introduce uAPI to set group ASID")
  2d1fcb7758 ("vhost-vdpa: uAPI to get virtqueue group id")
  a0c95f2011 ("vhost-vdpa: introduce uAPI to get the number of address spaces")
  3ace88bd37 ("vhost-vdpa: introduce uAPI to get the number of virtqueue groups")
  175d493c3c ("vhost: move the backend feature bits to vhost_types.h")

Silencing this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
  diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h

To pick up these changes and support them:

  $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before
  $ cp include/uapi/linux/vhost.h tools/include/uapi/linux/vhost.h
  $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after
  $ diff -u before after
  --- before	2022-06-26 12:04:35.982003781 -0300
  +++ after	2022-06-26 12:04:43.819972476 -0300
  @@ -28,6 +28,7 @@
   	[0x74] = "VDPA_SET_CONFIG",
   	[0x75] = "VDPA_SET_VRING_ENABLE",
   	[0x77] = "VDPA_SET_CONFIG_CALL",
  +	[0x7C] = "VDPA_SET_GROUP_ASID",
   };
   static const char *vhost_virtio_ioctl_read_cmds[] = {
   	[0x00] = "GET_FEATURES",
  @@ -39,5 +40,8 @@
   	[0x76] = "VDPA_GET_VRING_NUM",
   	[0x78] = "VDPA_GET_IOVA_RANGE",
   	[0x79] = "VDPA_GET_CONFIG_SIZE",
  +	[0x7A] = "VDPA_GET_AS_NUM",
  +	[0x7B] = "VDPA_GET_VRING_GROUP",
   	[0x80] = "VDPA_GET_VQS_COUNT",
  +	[0x81] = "VDPA_GET_GROUP_NUM",
   };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Gautam Dawar <gautam.dawar@xilinx.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Yrh3xMYbfeAD0MFL@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:55 -03:00
Gang Li
448ce0e6ea perf stat: Enable ignore_missing_thread
perf already support ignore_missing_thread for -p, but not yet
applied to `perf stat -p <pid>`. This patch enables ignore_missing_thread
for `perf stat -p <pid>`.

Committer notes:

And here is a refresher about the 'ignore_missing_thread' knob, from a
previous patch using it:

  ca8000684e ("perf evsel: Enable ignore_missing_thread for pid option")

  ---
    While monitoring a multithread process with pid option, perf sometimes
    may return sys_perf_event_open failure with 3(No such process) if any of
    the process's threads die before we open the event. However, we want
    perf continue monitoring the remaining threads and do not exit with
    error.
  ---

Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220622030037.15005-1-ligang.bdlg@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:55 -03:00
Raul Silvera
37ed2cddcb perf inject: Adjust output data offset for backward compatibility
When 'perf inject' creates a new file, it reuses the data offset from
the input file. If there has been a change on the size of the header, as
happened in v5.12 -> v5.13, the new offsets will be wrong, resulting in
a corrupted output file.

This change adds the function perf_session__data_offset to compute the
data offset based on the current header size, and uses that instead of
the offset from the original input file.

Signed-off-by: Raul Silvera <rsilvera@google.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Colin Ian King <colin.king@intel.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220621152725.2668041-1-rsilvera@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:55 -03:00
Arnaldo Carvalho de Melo
3713e2494b perf trace beauty: Fix generation of errno id->str table on ALT Linux
For some reason using:

         cat <<EoFuncBegin
  static const char *errno_to_name__$arch(int err)
  {
         switch (err) {
  EoFuncBegin

In tools/perf/trace/beauty/arch_errno_names.sh isn't working on ALT
Linux sisyphus (development version), which could be some distro
specific glitch, so just get this done in an alternative way that works
everywhere while giving notice to the people working on that distro to
try and figure our what really took place.

Cc: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:55 -03:00
Adrian Hunter
ab66fdace8 perf build-id: Fix caching files with a wrong build ID
Build ID events associate a file name with a build ID.  However, when
using perf inject, there is no guarantee that the file on the current
machine at the current time has that build ID. Fix by comparing the
build IDs and skip adding to the cache if they are different.

Example:

  $ echo "int main() {return 0;}" > prog.c
  $ gcc -o prog prog.c
  $ perf record --buildid-all ./prog
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.019 MB perf.data ]
  $ file-buildid() { file $1 | awk -F= '{print $2}' | awk -F, '{print $1}' ; }
  $ file-buildid prog
  444ad9be165d8058a48ce2ffb4e9f55854a3293e
  $ file-buildid ~/.debug/$(pwd)/prog/444ad9be165d8058a48ce2ffb4e9f55854a3293e/elf
  444ad9be165d8058a48ce2ffb4e9f55854a3293e
  $ echo "int main() {return 1;}" > prog.c
  $ gcc -o prog prog.c
  $ file-buildid prog
  885524d5aaa24008a3e2b06caa3ea95d013c0fc5

Before:

  $ perf buildid-cache --purge $(pwd)/prog
  $ perf inject -i perf.data -o junk
  $ file-buildid ~/.debug/$(pwd)/prog/444ad9be165d8058a48ce2ffb4e9f55854a3293e/elf
  885524d5aaa24008a3e2b06caa3ea95d013c0fc5
  $

After:

  $ perf buildid-cache --purge $(pwd)/prog
  $ perf inject -i perf.data -o junk
  $ file-buildid ~/.debug/$(pwd)/prog/444ad9be165d8058a48ce2ffb4e9f55854a3293e/elf

  $

Fixes: 454c407ec1 ("perf: add perf-inject builtin")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: https://lore.kernel.org/r/20220621125144.5623-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:55 -03:00
Arnaldo Carvalho de Melo
4b3f7644ae tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  d6d0c7f681 ("x86/cpufeatures: Add PerfMonV2 feature bit")
  296d5a17e7 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
  f30903394e ("x86/cpufeatures: Add virtual TSC_AUX feature bit")
  8ad7e8f696 ("x86/fpu/xsave: Support XSAVEC in the kernel")
  59bd54a84d ("x86/tdx: Detect running as a TDX guest in early boot")
  a77d41ac3a ("x86/cpufeatures: Add AMD Fam19h Branch Sampling feature")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
  diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YrDkgmwhLv+nKeOo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:43 -03:00
Arnaldo Carvalho de Melo
0fdd435cb4 tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
To pick up the changes in:

  ecf8eca51f ("drm/i915/xehp: Add compute engine ABI")
  991b4de327 ("drm/i915/uapi: Add kerneldoc for engine class enum")
  c94fde8f51 ("drm/i915/uapi: Add DRM_I915_QUERY_GEOMETRY_SUBSLICES")
  1c671ad753 ("drm/i915/doc: Link query items to their uapi structs")
  a2e5402691 ("drm/i915/doc: Convert perf UAPI comments to kerneldoc")
  462ac1cdf4 ("drm/i915/doc: Convert drm_i915_query_topology_info comment to kerneldoc")
  034d47b25b ("drm/i915/uapi: Document DRM_I915_QUERY_HWCONFIG_BLOB")
  78e1fb3112 ("drm/i915/uapi: Add query for hwconfig blob")

That don't add any new ioctl, so no changes in tooling.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
  diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h

Cc: John Harrison <John.C.Harrison@intel.com>
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://lore.kernel.org/lkml/YrDi4ALYjv9Mdocq@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:43 -03:00
Adrian Hunter
342cb0d806 perf inject: Fix missing free in copy_kcore_dir()
Free string allocated by asprintf().

Fixes: d8fc085509 ("perf inject: Keep a copy of kcore_dir")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220620103904.7960-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:15 -03:00
Helge Deller
0a1355db36 parisc: Enable ARCH_HAS_STRICT_MODULE_RWX
Fix a boot crash on a c8000 machine as reported by Dave.  Basically it changes
patch_map() to return an alias mapping to the to-be-patched code in order to
prevent writing to write-protected memory.

Signed-off-by: Helge Deller <deller@gmx.de>
Suggested-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org   # v5.2+
Link: https://lore.kernel.org/all/e8ec39e8-25f8-e6b4-b7ed-4cb23efc756e@bell.net/
2022-06-26 12:23:15 +02:00
John David Anglin
e9ed22e6e5 parisc: Fix flush_anon_page on PA8800/PA8900
Anonymous pages are allocated with the shared mappings colouring,
SHM_COLOUR. Since the alias boundary on machines with PA8800 and
PA8900 processors is unknown, flush_user_cache_page() might not
flush all mappings of a shared anonymous page. Flushing the whole
data cache flushes all mappings.

This won't fix all coherency issues with shared mappings but it
seems to work well in practice.  I haven't seen any random memory
faults in almost a month on a rp3440 running as a debian buildd
machine.

There is a small preformance hit.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org   # v5.18+
2022-06-26 12:23:15 +02:00
Jason A. Donenfeld
067baa9a37 ksmbd: use vfs_llseek instead of dereferencing NULL
By not checking whether llseek is NULL, this might jump to NULL. Also,
it doesn't check FMODE_LSEEK. Fix this by using vfs_llseek(), which
always does the right thing.

Fixes: f441584858 ("cifsd: add file operations")
Cc: stable@vger.kernel.org
Cc: linux-cifs@vger.kernel.org
Cc: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: Hyunchul Lee <hyc.lee@gmail.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-25 19:52:49 -05:00
Jiang Jian
d16c5c7c92 parisc: align '*' in comment in math-emu code
Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-26 00:19:27 +02:00
Sami Tolvanen
ff13976676 kbuild: Ignore __this_module in gen_autoksyms.sh
Module object files can contain an undefined reference to __this_module,
which isn't resolved until we link the final .ko. The kernel doesn't
export this symbol, so ignore it in gen_autoksyms.sh.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Steve Muckle <smuckle@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Ramji Jiyani <ramjiyani@google.com>
2022-06-26 06:15:05 +09:00
Masahiro Yamada
53632ba87d kbuild: link vmlinux only once for CONFIG_TRIM_UNUSED_KSYMS (2nd attempt)
If CONFIG_TRIM_UNUSED_KSYMS is enabled and the kernel is built from
a pristine state, the vmlinux is linked twice.

Commit 3fdc7d3fe4 ("kbuild: link vmlinux only once for
CONFIG_TRIM_UNUSED_KSYMS") explains why this happens, but it did not fix
the issue at all.

Now I realized I had applied a wrong patch.

In v1 patch [1], the autoksyms_recursive target correctly recurses to
"$(MAKE) -f $(srctree)/Makefile autoksyms_recursive".

In v2 patch [2], I accidentally dropped the diff line, and it recurses to
"$(MAKE) -f $(srctree)/Makefile vmlinux".

Restore the code I intended in v1.

[1]: https://lore.kernel.org/linux-kbuild/1521045861-22418-8-git-send-email-yamada.masahiro@socionext.com/
[2]: https://lore.kernel.org/linux-kbuild/1521166725-24157-8-git-send-email-yamada.masahiro@socionext.com/

Fixes: 3fdc7d3fe4 ("kbuild: link vmlinux only once for CONFIG_TRIM_UNUSED_KSYMS")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-06-26 06:15:05 +09:00
Linus Torvalds
0840a7914c Merge tag 'char-misc-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull IIO driver fixes from Greg KH:
 "Here are a set of IIO driver fixes for 5.19-rc4. Jonathan said it best
  in his pull request to me, so I'll just quote it here below, as that's
  the only changes we have right now for the char-misc driver tree:

  testing:
      - Fix a missing MODULE_LICENSE() warning by restricting possible
        build configs.

  Various drivers:
      - Fix ordering of iio_get_trigger() being called before
        iio_trigger_register()

  adi,admv1014:
      - Fix dubious x & !y warning.

  adi,axi-adc:
      - Fix missing of_node_put() in error and normal paths.

  aspeed,adc:
      - Add missing of_node_put()

  fsl,mma8452:
      - Fix broken probing from device tree.
      - Drop check on return value of i2c write to device to cause reset
        as ACK will be missing (device reset before sending it).

  fsl,vf610:
      - Fix documentation of in_conversion_mode ABI.

  iio-trig-sysfs:
      - Ensure irq work has finished before freeing the trigger.

  invensense,mpu3050:
      - Disable regulators in error path.

  invensense,icm42600:
      - Fix collision of enum value of 0 with error path where 0 is no
        match.

  renesas,rzg2l_Adc:
      - Add missing fwnode_handle_put() in error path.

  rescale:
      - Fix a boolean logic bug for detection of raw + scale affecting
        an obscure corner case.

  semtech,sx9324:
      - Check return value of read of pin_defs

  st,stm32-adc:
      - Fix interaction across ADC instances for some supported devices.
      - Drop false spurious IRQ messages.
      - Fix calibration value handling. If we can't calibrate don't
        expose the vref_int channel.
      - Fix maximum clock rate for stm32pm15x

  ti,ads131e08:
      - Add missing fwnode_handle_put() in error paths.

  xilinx,ams:
      - Fix variable checked for error from platform_get_irq()

  x-powers,axp288:
      - Overide TS_PIN bias current for boards where it is not correctly
        initialized.

  yamaha,yas530:
      - Fix inverted check on calibration data being all zeros.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'char-misc-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (26 commits)
  iio:proximity:sx9324: Check ret value of device_property_read_u32_array()
  iio: accel: mma8452: ignore the return value of reset operation
  iio: adc: stm32: fix maximum clock rate for stm32mp15x
  iio: adc: stm32: fix vrefint wrong calibration value handling
  iio: imu: inv_icm42600: Fix broken icm42600 (chip id 0 value)
  iio: adc: vf610: fix conversion mode sysfs node name
  iio: adc: adi-axi-adc: Fix refcount leak in adi_axi_adc_attach_client
  iio: test: fix missing MODULE_LICENSE for IIO_RESCALE=m
  iio:humidity:hts221: rearrange iio trigger get and register
  iio:chemical:ccs811: rearrange iio trigger get and register
  iio:accel:mxc4005: rearrange iio trigger get and register
  iio:accel:kxcjk-1013: rearrange iio trigger get and register
  iio:accel:bma180: rearrange iio trigger get and register
  iio: afe: rescale: Fix boolean logic bug
  iio: adc: aspeed: Fix refcount leak in aspeed_adc_set_trim_data
  iio: adc: stm32: Fix IRQs on STM32F4 by removing custom spurious IRQs message
  iio: adc: stm32: Fix ADCs iteration in irq handler
  iio: adc: ti-ads131e08: add missing fwnode_handle_put() in ads131e08_alloc_channels()
  iio: adc: rzg2l_adc: add missing fwnode_handle_put() in rzg2l_adc_parse_properties()
  iio: trigger: sysfs: fix use-after-free on remove
  ...
2022-06-25 10:07:36 -07:00
Linus Torvalds
c24eb8d6a5 Merge tag 'usb-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
 "Here are some small USB driver fixes and new device ids for 5.19-rc4
  for a few small reported issues. They include:

   - new usb-serial driver ids

   - MAINTAINERS file update to properly catch the USB dts files

   - dt-bindings fixes for reported build warnings

   - xhci driver fixes for reported problems

   - typec Kconfig dependancy fix

   - raw_gadget fuzzing fixes found by syzbot

   - chipidea driver bugfix

   - usb gadget uvc bugfix

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: chipidea: udc: check request status before setting device address
  USB: gadget: Fix double-free bug in raw_gadget driver
  xhci-pci: Allow host runtime PM as default for Intel Meteor Lake xHCI
  xhci-pci: Allow host runtime PM as default for Intel Raptor Lake xHCI
  xhci: turn off port power in shutdown
  xhci: Keep interrupt disabled in initialization until host is running.
  USB: serial: option: add Quectel RM500K module support
  USB: serial: option: add Quectel EM05-G modem
  USB: serial: pl2303: add support for more HXN (G) types
  usb: typec: wcove: Drop wrong dependency to INTEL_SOC_PMIC
  usb: gadget: uvc: fix list double add in uvcg_video_pump
  dt-bindings: usb: ehci: Increase the number of PHYs
  dt-bindings: usb: ohci: Increase the number of PHYs
  usb: gadget: Fix non-unique driver names in raw-gadget driver
  MAINTAINERS: add include/dt-bindings/usb to USB SUBSYSTEM
  USB: serial: option: add Telit LE910Cx 0x1250 composition
2022-06-25 10:02:05 -07:00
wuchi
fbb564a557 lib/sbitmap: Fix invalid loop in __sbitmap_queue_get_batch()
1. Getting next index before continue branch.
2. Checking free bits when setting the target bits. Otherwise,
it may reuse the busying bits.

Signed-off-by: wuchi <wuchi.zero@gmail.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Link: https://lore.kernel.org/r/20220605145835.26916-1-wuchi.zero@gmail.com
Fixes: 9672b0d437 ("sbitmap: add __sbitmap_queue_get_batch()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-25 10:58:55 -06:00
Linus Torvalds
cb84318baa Merge tag 'loongarch-fixes-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
 "Some bug fixes and a trivial cleanup"

* tag 'loongarch-fixes-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Make compute_return_era() return void
  LoongArch: Fix wrong fpu version
  LoongArch: Fix EENTRY/MERRENTRY setting in setup_tlb_handler()
  LoongArch: Fix sleeping in atomic context in setup_tlb_handler()
  LoongArch: Fix the _stext symbol address
  LoongArch: Fix the !THP build
2022-06-25 09:24:59 -07:00
Linus Torvalds
29eeafc661 Merge tag 'f2fs-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs fixes from Jaegeuk Kim:
 "Some urgent fixes to avoid generating corrupted inodes caused by
  compressed and inline_data files.

  In addition, avoid a wrong error report which prevents a roll-forward
  recovery"

* tag 'f2fs-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
  f2fs: do not count ENOENT for error case
  f2fs: fix iostat related lock protection
  f2fs: attach inline_data after setting compression
2022-06-25 09:19:51 -07:00
Tiezhu Yang
ea18d43478 LoongArch: Make compute_return_era() return void
compute_return_era() always returns 0, make it return void,
and then no need to check its return value for its callers.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-25 18:06:07 +08:00
Tiezhu Yang
ad82eef3ce LoongArch: Fix wrong fpu version
According to the configuration information accessible by the CPUCFG
instruction in LoongArch Reference Manual [1], FP_ver is stored in
bit [5: 3] of CPUCFG2, the current code to get fpu version is wrong,
use CPUCFG2_FPVERS to fix it.

[1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Fixes: 628c3bb40e ("LoongArch: Add boot and setup routines")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-25 18:05:59 +08:00
Huacai Chen
26808cebf1 LoongArch: Fix EENTRY/MERRENTRY setting in setup_tlb_handler()
setup_tlb_handler() is expected to set per-cpu exception handlers, but
it only set the TLBRENTRY successfully because of copy & paste errors,
so fix it.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-25 18:05:58 +08:00
Huacai Chen
bab1c299f3 LoongArch: Fix sleeping in atomic context in setup_tlb_handler()
Since setup_tlb_handler() is executed in atomic context, we should use
GFP_ATOMIC instead of GFP_KERNEL to alloc pages. Otherwise we will get
a "sleeping in atomic context" error:

[    0.013118] BUG: sleeping function called from invalid context at mm/page_alloc.c:5158
[    0.013126] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
[    0.013131] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.19-rc3+ #1008 1a223086d14d07967cc427f15d52139422271360
[    0.013136] Hardware name: Loongson Loongson-3A5000-7A1000-1w-V0.1-CRB/Loongson-LS3A5000-7A1000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V2.0.04082-beta7 04/27
[    0.013140] Stack : 90000000015fc990 9000000100493c18 9000000000df3370 9000000100490000
[    0.013151]         9000000100493b50 0000000000000000 9000000100493b58 9000000001417ef0
[    0.013160]         900000000199e54e 0000000000000040 9000000100493c18 90000000015f7a98
[    0.013168]         ffffffffffffffff 6de72f8b42179d1e 9000000100403b80 90000000015f7890
[    0.013176]         0000000000000001 00000000fffff175 9000000000eb9860 9000000001530b4b
[    0.013184]         9000000000e99e60 0000000000000013 0000000006ecc000 0000000000000001
[    0.013193]         90000000015f7a98 9000000001417ef0 0000000000000004 0000000000000000
[    0.013201]         0000000000000cc0 0000000000000000 0000000000000001 90000000015fc990
[    0.013209]         9000000000217e74 9000000001603b6b 9000000000208640 0000000000000000
[    0.013217]         00000000000000b0 0000000000000004 0000000000000000 0000000000070000
[    0.013225]         ...
[    0.013229] Call Trace:
[    0.013230] [<9000000000208640>] show_stack+0x4c/0x14c
[    0.013240] [<9000000000df3370>] dump_stack_lvl+0x70/0xac
[    0.013246] [<9000000000270c8c>] ___might_sleep+0x104/0x124
[    0.013253] [<9000000000477e84>] __alloc_pages+0x240/0x464
[    0.013260] [<9000000000214214>] setup_tlb_handler+0x104/0x1e8
[    0.013265] [<9000000000214324>] tlb_init+0x2c/0x3c
[    0.013270] [<9000000000208b74>] per_cpu_trap_init+0xec/0x108
[    0.013275] [<9000000000202850>] cpu_probe+0x400/0x8a4
[    0.013279] [<900000000020d160>] start_secondary+0x5c/0x3d4

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-25 18:05:58 +08:00
Huacai Chen
92264f2dae LoongArch: Fix the _stext symbol address
_stext means the start of .text section (see __is_kernel_text()), but we
put its definition in .ref.text by mistake. Fix it by defining it in the
vmlinux.lds.S.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-25 18:05:58 +08:00
Huacai Chen
501dcbe495 LoongArch: Fix the !THP build
Fix the !THP build by making pmd_pfn() available in all configurations.
Because pmd_pfn() is used in mm/page_vma_mapped.c whether or not THP is
configured.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-25 18:05:58 +08:00
Linus Torvalds
8c23f235a6 Merge tag 'gpio-fixes-for-v5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - make the irqchip immutable in gpio-realtek-otto

 - fix error code propagation in gpio-winbond

 - fix device removing in gpio-grgpio

 - fix a typo in gpio-mxs which indicates the driver is for a different
   model

 - documentation fixes

 - MAINTAINERS file updates

* tag 'gpio-fixes-for-v5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: mxs: Fix header comment
  gpio: Fix kernel-doc comments to nested union
  gpio: grgpio: Fix device removing
  gpio: winbond: Fix error code in winbond_gpio_get()
  gpio: realtek-otto: Make the irqchip immutable
  docs: driver-api: gpio: Fix filename mismatch
  MAINTAINERS: add include/dt-bindings/gpio to GPIO SUBSYSTEM
2022-06-24 17:01:31 -07:00
Dan Carpenter
3b89b511ea net: fix IFF_TX_SKB_NO_LINEAR definition
The "1<<31" shift has a sign extension bug so IFF_TX_SKB_NO_LINEAR is
0xffffffff80000000 instead of 0x0000000080000000.

Fixes: c2ff53d804 ("net: Add priv_flags for allow tx skb without linear")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://lore.kernel.org/r/YrRrcGttfEVnf85Q@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24 16:45:40 -07:00
Jakub Kicinski
8cc6838337 Merge branch 'net-dp83822-fix-interrupt-floods'
Enguerrand de Ribaucourt says:

====================
net: dp83822: fix interrupt floods

The false carrier and RX error counters, once half full, produce interrupt
floods. Since we do not use these counters, these interrupts should be disabled.
====================

Link: https://lore.kernel.org/r/20220623134645.1858361-1-enguerrand.de-ribaucourt@savoirfairelinux.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24 16:33:45 -07:00
Enguerrand de Ribaucourt
0e597e2aff net: dp83822: disable rx error interrupt
Some RX errors, notably when disconnecting the cable, increase the RCSR
register. Once half full (0x7fff), an interrupt flood is generated. I
measured ~3k/s interrupts even after the RX errors transfer was
stopped.

Since we don't read and clear the RCSR register, we should disable this
interrupt.

Fixes: 87461f7a58 ("net: phy: DP83822 initial driver submission")
Signed-off-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24 16:33:22 -07:00
Enguerrand de Ribaucourt
c96614eeab net: dp83822: disable false carrier interrupt
When unplugging an Ethernet cable, false carrier events were produced by
the PHY at a very high rate. Once the false carrier counter full, an
interrupt was triggered every few clock cycles until the cable was
replugged. This resulted in approximately 10k/s interrupts.

Since the false carrier counter (FCSCR) is never used, we can safely
disable this interrupt.

In addition to improving performance, this also solved MDIO read
timeouts I was randomly encountering with an i.MX8 fec MAC because of
the interrupt flood. The interrupt count and MDIO timeout fix were
tested on a v5.4.110 kernel.

Fixes: 87461f7a58 ("net: phy: DP83822 initial driver submission")
Signed-off-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24 16:32:34 -07:00
Jakub Kicinski
3b9bc84d31 net: tun: unlink NAPI from device on destruction
Syzbot found a race between tun file and device destruction.
NAPIs live in struct tun_file which can get destroyed before
the netdev so we have to del them explicitly. The current
code is missing deleting the NAPI if the queue was detached
first.

Fixes: 943170998b ("tun: enable NAPI for TUN/TAP driver")
Reported-by: syzbot+b75c138e9286ac742647@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20220623042039.2274708-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24 16:28:08 -07:00
Eric Dumazet
6f0012e351 tcp: add a missing nf_reset_ct() in 3WHS handling
When the third packet of 3WHS connection establishment
contains payload, it is added into socket receive queue
without the XFRM check and the drop of connection tracking
context.

This means that if the data is left unread in the socket
receive queue, conntrack module can not be unloaded.

As most applications usually reads the incoming data
immediately after accept(), bug has been hiding for
quite a long time.

Commit 68822bdf76 ("net: generalize skb freeing
deferral to per-cpu lists") exposed this bug because
even if the application reads this data, the skb
with nfct state could stay in a per-cpu cache for
an arbitrary time, if said cpu no longer process RX softirqs.

Many thanks to Ilya Maximets for reporting this issue,
and for testing various patches:
https://lore.kernel.org/netdev/20220619003919.394622-1-i.maximets@ovn.org/

Note that I also added a missing xfrm4_policy_check() call,
although this is probably not a big issue, as the SYN
packet should have been dropped earlier.

Fixes: b59c270104 ("[NETFILTER]: Keep conntrack reference until IPsec policy checks are done")
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Tested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://lore.kernel.org/r/20220623050436.1290307-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24 16:27:46 -07:00
Linus Torvalds
6a0a17e6c6 Merge tag 'mtd/fixes-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd fixes from Miquel RAynal:
 "NAND controller fix:
   - gpmi: Fix busy timeout setting (wrong calculation)

  NAND chip driver fix:
   - Thoshiba: Revert the commit introducing support for a chip that
     might have been counterfeit"

* tag 'mtd/fixes-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: gpmi: Fix setting busy timeout setting
  Revert "mtd: rawnand: add support for Toshiba TC58NVG0S3HTA00 NAND flash"
2022-06-24 14:04:08 -07:00
Linus Torvalds
4039974f3b Merge tag 'spi-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A bunch of driver specific fixes, plus a fix for spi-mem's status
  polling for devices that use GPIO chip selects and a DT bindings
  examples fix that helps with the validation work"

* tag 'spi-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: rockchip: Unmask IRQ at the final to avoid preemption
  spi: dt-bindings: Fix unevaluatedProperties warnings in examples
  spi: spi-mem: Fix spi_mem_poll_status()
  spi: cadence: Detect transmit FIFO depth
  spi: spi-cadence: Fix SPI CS gets toggling sporadically
2022-06-24 13:54:16 -07:00
Linus Torvalds
bed051817c Merge tag 'regulator-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
 "One fix for an incorrect device description for MP5496"

* tag 'regulator-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: qcom_smd: correct MP5496 ranges
2022-06-24 13:51:07 -07:00
Linus Torvalds
7bc8354607 Merge tag 'regmap-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
 "Two sets of fixes - one for things that were missed with the support
  for custom bulk I/O operations introduced in the last merge window,
  and another for some long standing issues with regmap-irq which affect
  a fairly small subset of devices"

* tag 'regmap-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap-irq: Fix offset/index mismatch in read_sub_irq_data()
  regmap-irq: Fix a bug in regmap_irq_enable() for type_in_mask chips
  regmap: Wire up regmap_config provided bulk write in missed functions
  regmap: Make regmap_noinc_read() return -ENOTSUPP if map->read isn't set
  regmap: Re-introduce bulk read support check in regmap_bulk_read()
2022-06-24 13:49:13 -07:00
Linus Torvalds
bc3b8977e3 Merge tag 'iommu-fixes-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:

 - Add a new IOMMU mailing list to the MAINTAINERS file to prepare for
   the a list migration happening on July 5th. The old list needs to
   stay in place until the switch happens to guarantee seemless
   archiving of list email.

 - Fix compatible device-tree string for rcar-gen4 in Renesas IOMMU
   driver.

* tag 'iommu-fixes-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  MAINTAINERS: Add new IOMMU development mailing list
  iommu/ipmmu-vmsa: Fix compatible for rcar-gen4
2022-06-24 13:41:47 -07:00
Miaoqian Lin
2990f223ff RDMA/cm: Fix memory leak in ib_cm_insert_listen
cm_alloc_id_priv() allocates resource for the cm_id_priv. When
cm_init_listen() fails it doesn't free it, leading to memory leak.

Add the missing error unwind.

Fixes: 98f67156a8 ("RDMA/cm: Simplify establishing a listen cm_id")
Link: https://lore.kernel.org/r/20220621052546.4821-1-linmq006@gmail.com
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-06-24 16:41:03 -03:00
Linus Torvalds
70d605cbee Merge tag 'riscv-for-linus-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fix from Palmer Dabbelt:

 - fix the T-Head memory type errata workaround to avoid behavior
   that is unsupported in the LLVM assembler

* tag 'riscv-for-linus-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix ALT_THEAD_PMA's asm parameters
2022-06-24 12:34:13 -07:00
Linus Torvalds
f6e9d01468 Merge tag 's390-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:

 - Fix perf stat accounting for cryptography counters when multiple
   events are installed concurrently.

 - Prevent installation of unsupported perf events for cryptography
   counters.

 - Treat perf events cpum_cf/CPU_CYCLES/ and cpu_cf/INSTRUCTIONS/
   identical to basic events CPU_CYCLES" and INSTRUCTIONS, since they
   address the same hardware.

 - Restore kcrash operation which was broken by commit 5d8de293c2
   ("vmcore: convert copy_oldmem_page() to take an iov_iter").

* tag 's390-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pai: Fix multiple concurrent event installation
  s390/pai: Prevent invalid event number for pai_crypto PMU
  s390/cpumf: Handle events cycles and instructions identical
  s390/crash: make copy_oldmem_page() return number of bytes copied
  s390/crash: add missing iterator advance in copy_oldmem_page()
2022-06-24 12:27:24 -07:00
Linus Torvalds
2c39d612aa Merge tag 'for-linus-5.19a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:

 - A rare deadlock in Qubes-OS between the i915 driver and Xen grant
   unmapping, solved by making the unmapping fully asynchronous

 - A bug in the Xen blkfront driver caused by incomplete error handling

 - A fix for undefined behavior (shifting a signed int by 31 bits)

 - A fix in the Xen drmfront driver avoiding a WARN()

* tag 'for-linus-5.19a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/gntdev: Avoid blocking in unmap_grant_pages()
  drm/xen: Add missing VM_DONTEXPAND flag in mmap callback
  x86/xen: Remove undefined behavior in setup_features()
  xen-blkfront: Handle NULL gendisk
2022-06-24 12:22:11 -07:00
Linus Torvalds
e946554905 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Fix a regression with pKVM when kmemleak is enabled

   - Add Oliver Upton as an official KVM/arm64 reviewer

  selftests:

   - deal with compiler optimizations around hypervisor exits

  x86:

   - MAINTAINERS reorganization

   - Two SEV fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SEV: Init target VMCBs in sev_migrate_from
  KVM: x86/svm: add __GFP_ACCOUNT to __sev_dbg_{en,de}crypt_user()
  MAINTAINERS: Reorganize KVM/x86 maintainership
  selftests: KVM: Handle compiler optimizations in ucall
  KVM: arm64: Add Oliver as a reviewer
  KVM: arm64: Prevent kmemleak from accessing pKVM memory
  tools/kvm_stat: fix display of error when multiple processes are found
2022-06-24 12:17:47 -07:00
Chris Ye
ef9102004a nvdimm: Fix badblocks clear off-by-one error
nvdimm_clear_badblocks_region() validates badblock clearing requests
against the span of the region, however it compares the inclusive
badblock request range to the exclusive region range. Fix up the
off-by-one error.

Fixes: 23f4984483 ("libnvdimm: rework region badblocks clearing")
Cc: <stable@vger.kernel.org>
Signed-off-by: Chris Ye <chris.ye@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/165404219489.2445897.9792886413715690399.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-24 11:57:19 -07:00
Linus Torvalds
38bc4ac431 Merge tag 'drm-fixes-2022-06-24' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Fixes for this week, bit larger than normal, but I think the last
  couple have been quieter, and it's only rc4.

  There are a lot of small msm fixes, and a slightly larger set of vc4
  fixes. The vc4 fixes clean up a lot of crashes around the rPI4
  hardware differences from earlier ones, and problems in the page flip
  and modeset code which assumed earlier hw, so I thought it would be
  okay to keep them in.

  Otherwise, it's a few amdgpu, i915, sun4i and a panel quirk.

  amdgpu:
   - Adjust GTT size logic
   - eDP fix for RMB
   - DCN 3.15 fix
   - DP training fix
   - Color encoding fix for DCN2+

  sun4i:
   - multiple suspend fixes

  vc4:
   - rework driver split for rpi4, fixes mulitple crashers.

  panel:
   - quirk for Aya Neo Next

  i915:
   - Revert low voltage SKU check removal to fix display issues
   - Apply PLL DCO fraction workaround for ADL-S
   - Don't show engine classes not present in client fdinfo

  msm:
   - Workaround for parade DSI bridge power sequencing
   - Fix for multi-planar YUV format offsets
   - Limiting WB modes to max sspp linewidth
   - Fixing the supported rotations to add 180 back for IGT
   - Fix to handle pm_runtime_get_sync() errors to avoid unclocked
     access in the bind() path for dpu driver
   - Fix the irq_free() without request issue which was a being hit
     frequently in CI.
   - Fix to add minimum ICC vote in the msm_mdss pm_resume path to
     address bootup splats
   - Fix to avoid dereferencing without checking in WB encoder
   - Fix to avoid crash during suspend in DP driver by ensuring
     interrupt mask bits are updated
   - Remove unused code from dpu_encoder_virt_atomic_check()
   - Fix to remove redundant init of dsc variable
   - Fix to ensure mmap offset is initialized to avoid memory corruption
     from unpin/evict
   - Fix double runpm disable in probe-defer path
   - VMA fenced-unpin fixes
   - Fix for WB max-width
   - Fix for rare dp resolution change issue"

* tag 'drm-fixes-2022-06-24' of git://anongit.freedesktop.org/drm/drm: (41 commits)
  amd/display/dc: Fix COLOR_ENCODING and COLOR_RANGE doing nothing for DCN20+
  drm/amd/display: Fix typo in override_lane_settings
  drm/amd/display: Fix DC warning at driver load
  drm/amd: Revert "drm/amd/display: keep eDP Vdd on when eDP stream is already enabled"
  drm/amdgpu: Adjust logic around GTT size (v3)
  drm/sun4i: Return if frontend is not present
  drm/vc4: fix error code in vc4_check_tex_size()
  drm/sun4i: Add DMA mask and segment size
  drm/vc4: hdmi: Fixed possible integer overflow
  drm/i915/display: Re-add check for low voltage sku for max dp source rate
  drm/i915/fdinfo: Don't show engine classes not present
  drm/i915: Implement w/a 22010492432 for adl-s
  drm: panel-orientation-quirks: Add quirk for Aya Neo Next
  drm/msm/dp: force link training for display resolution change
  drm/msm/dpu: limit wb modes based on max_mixer_width
  drm/msm/dp: check core_initialized before disable interrupts at dp_display_unbind()
  drm/msm/mdp4: Fix refcount leak in mdp4_modeset_init_intf
  drm/msm: Don't overwrite hw fence in hw_init
  drm/msm: Drop update_fences()
  drm/vc4: Warn if some v3d code is run on BCM2711
  ...
2022-06-24 11:43:49 -07:00
Paulo Alcantara
af3a6d1018 cifs: update cifs_ses::ip_addr after failover
cifs_ses::ip_addr wasn't being updated in cifs_session_setup() when
reconnecting SMB sessions thus returning wrong value in
/proc/fs/cifs/DebugData.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Cc: stable@kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-24 13:34:28 -05:00
Jakub Sitnicki
935336c191 selftests/bpf: Test sockmap update when socket has ULP
Cover the scenario when we cannot insert a socket into the sockmap, because
it has it is using ULP. Failed insert should not have any effect on the ULP
state. This is a regression test.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20220623091231.417138-1-jakub@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24 11:21:50 -07:00
Linus Torvalds
cbe232ab07 Merge tag 'for-5.19/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:

 - Fix DM era to commit metadata during suspend using drain_workqueue
   instead of flush_workqueue.

 - Fix DM core's dm_io_complete to not return early if io error is
   BLK_STS_AGAIN but bio polling is not in use.

 - Fix DM core's dm_io_complete BLK_STS_DM_REQUEUE handling when dm_io
   represents a split bio.

 - Fix recent DM mirror log regression by clearing bits up to
   BITS_PER_LONG boundary.

* tag 'for-5.19/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm mirror log: clear log bits up to BITS_PER_LONG boundary
  dm: fix BLK_STS_DM_REQUEUE handling when dm_io represents split bio
  dm: do not return early from dm_io_complete if BLK_STS_AGAIN without polling
  dm era: commit metadata in postsuspend after worker stops
2022-06-24 11:16:27 -07:00
Linus Torvalds
43627618a0 Merge tag 'ata-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ATA fix from Damien Le Moal:

 - a single patch to fix tracing of command completion (Edward)

* tag 'ata-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: libata: add qc->flags in ata_qc_complete_template tracepoint
2022-06-24 11:12:34 -07:00
Linus Torvalds
a237cfd6b7 Merge tag 'block-5.19-2022-06-24' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Series fixing issues with sysfs locking and name reuse (Christoph)

 - NVMe pull request via Christoph:
      - Fix the mixed up CRIMS/CRWMS constants (Joel Granados)
      - Add another broken identifier quirk (Leo Savernik)
      - Fix up a quirk because Samsung reuses PCI IDs over different
        products (Christoph Hellwig)

 - Remove old WARN_ON() that doesn't apply anymore (Li)

 - Fix for using a stale cached request value for rq-qos throttling
   mechanisms that may schedule(), like iocost (me)

 - Remove unused parameter to blk_independent_access_range() (Damien)

* tag 'block-5.19-2022-06-24' of git://git.kernel.dk/linux-block:
  block: remove WARN_ON() from bd_link_disk_holder
  nvme: move the Samsung X5 quirk entry to the core quirks
  nvme: fix the CRIMS and CRWMS definitions to match the spec
  nvme: add a bogus subsystem NQN quirk for Micron MTFDKBA2T0TFH
  block: pop cached rq before potentially blocking rq_qos_throttle()
  block: remove queue from struct blk_independent_access_range
  block: freeze the queue earlier in del_gendisk
  block: remove per-disk debugfs files in blk_unregister_queue
  block: serialize all debugfs operations using q->debugfs_mutex
  block: disable the elevator int del_gendisk
2022-06-24 11:07:54 -07:00
Linus Torvalds
598f240487 Merge tag 'io_uring-5.19-2022-06-24' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "A few fixes that should go into the 5.19 release. All are fixing
  issues that either happened in this release, or going to stable.

  In detail:

   - A small series of fixlets for the poll handling, all destined for
     stable (Pavel)

   - Fix a merge error from myself that caused a potential -EINVAL for
     the recv/recvmsg flag setting (me)

   - Fix a kbuf recycling issue for partial IO (me)

   - Use the original request for the inflight tracking (me)

   - Fix an issue introduced this merge window with trace points using a
     custom decoder function, which won't work for perf (Dylan)"

* tag 'io_uring-5.19-2022-06-24' of git://git.kernel.dk/linux-block:
  io_uring: use original request task for inflight tracking
  io_uring: move io_uring_get_opcode out of TP_printk
  io_uring: fix double poll leak on repolling
  io_uring: fix wrong arm_poll error handling
  io_uring: fail links when poll fails
  io_uring: fix req->apoll_events
  io_uring: fix merge error in checking send/recv addr2 flags
  io_uring: mark reissue requests with REQ_F_PARTIAL_IO
2022-06-24 11:02:26 -07:00
Linus Torvalds
9d882352ba Merge tag 'printk-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk kernel thread revert from Petr Mladek:
 "Revert printk console kthreads.

  The testing of 5.19 release candidates revealed issues that did not
  happen when all consoles were serialized using the console semaphore.

  More time is needed to check expectations of the existing console
  drivers and be confident that they can be safely used in parallel"

* tag 'printk-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  Revert "printk: add functions to prefer direct printing"
  Revert "printk: add kthread console printers"
  Revert "printk: extend console_lock for per-console locking"
  Revert "printk: remove @console_locked"
  Revert "printk: Block console kthreads when direct printing will be required"
  Revert "printk: Wait for the global console lock when the system is going down"
2022-06-24 10:54:07 -07:00
Jae Hyun Yoo
7f05811287 ARM: dts: aspeed: nuvia: rename vendor nuvia to qcom
Nuvia has been acquired by Qualcomm and the vendor name 'nuvia' will
not be used anymore so rename aspeed-bmc-nuvia-dc-scm.dts to
aspeed-bmc-qcom-dc-scm-v1.dts and change 'nuvia' to 'qcom' as its vendor
name in the file.

Fixes: 7b46aa7c00 ("ARM: dts: aspeed: Add Nuvia DC-SCM BMC")
Signed-off-by: Jae Hyun Yoo <quic_jaehyoo@quicinc.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20220523175640.60155-1-quic_jaehyoo@quicinc.com
Link: https://lore.kernel.org/r/20220624070511.4070659-1-joel@jms.id.au'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24 17:57:13 +02:00
Arnd Bergmann
60192dd85c Merge tag 'memory-controller-drv-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl into arm/fixes
Memory controller drivers - fixes for v5.19

Broken in current cycle:
1. OMAP GPMC: fix Kconfig dependency for OMAP_GPMC, so it will not be visible
   for everyone (driver is specific to OMAP).

Broken before:
1. Mediatek SMI: fix missing put_device() in error paths.
2. Exynos DMC: fix OF node leaks in error paths.

* tag 'memory-controller-drv-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl:
  memory: samsung: exynos5422-dmc: Fix refcount leak in of_get_dram_timings
  memory: mtk-smi: add missing put_device() call in mtk_smi_device_link_common
  memory: omap-gpmc: OMAP_GPMC should depend on ARCH_OMAP2PLUS || ARCH_KEYSTONE || ARCH_K3

Link: https://lore.kernel.org/r/20220624081819.33617-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24 17:21:23 +02:00
Arnd Bergmann
416e95a479 Merge tag 'samsung-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/fixes
Samsung fixes for v5.19

Both fixes are for issues present before v5.19 merge window:
1. Correct UART clocks on Exynos7885.  Although the initial, fixed
   DTS commit is from v5.18, the issue will be exposed with a upcoming fix
   to Exynos7885 clock driver, so we need to correct the DTS earlier.
2. Fix theoretical OF node leak in Exynos machine code.

* tag 'samsung-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  ARM: exynos: Fix refcount leak in exynos_map_pmu
  arm64: dts: exynos: Correct UART clocks on Exynos7885

Link: https://lore.kernel.org/r/20220624080423.31427-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24 17:20:24 +02:00
Arnd Bergmann
b262b3b571 Merge tag 'arm-soc/for-5.19/devicetree-fixes' of https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM-SoC Device Tree fixes for 5.19,
please pull the following:

- Stefan fixes the Raspberry Pi 400 GPIO expander line names to match
  that of the downstream Raspberry Pi Linux tree

* tag 'arm-soc/for-5.19/devicetree-fixes' of https://github.com/Broadcom/stblinux:
  ARM: dts: bcm2711-rpi-400: Fix GPIO line names

Link: https://lore.kernel.org/r/20220620170745.2485199-1-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24 17:19:50 +02:00
Arnd Bergmann
db6b92459f Merge tag 'ti-k3-dt-fixes-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ti/linux into arm/fixes
Devicetree fixes for TI K3 platforms for v5.19

Critical fixes for the following:
* Boot failure on J721s2 (overlap GIC memory map)
* AM64 boot fails on highspeed cards (SoC characterization updates)

* tag 'ti-k3-dt-fixes-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ti/linux:
  arm64: dts: ti: k3-am64-main: Remove support for HS400 speed mode
  arm64: dts: ti: k3-j721s2: Fix overlapping GICD memory region

Link: https://lore.kernel.org/r/20220618031627.xxvscc22c6doaa3t@kahuna
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24 17:19:21 +02:00
Liang He
2c629dd2d1 arm: mach-spear: Add missing of_node_put() in time.c
In spear_setup_of_timer(), of_find_matching_node() will return a
node pointer with refcount incrementd. We should use of_node_put()
in each fail path or when it is not used anymore.

Signed-off-by: Liang He <windhl@126.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20220616093027.3984903-1-windhl@126.com'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24 17:18:55 +02:00
Miaoqian Lin
1ba904b6b1 ARM: cns3xxx: Fix refcount leak in cns3xxx_init
of_find_compatible_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.

Fixes: 415f59142d ("ARM: cns3xxx: initial DT support")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Acked-by: Krzysztof Halasa <khalasa@piap.pl>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24 17:18:30 +02:00
Thara Gopinath
17b1362d49 MAINTAINERS: Update email address
Update my email address in the MAINTAINERS file as  the current
one will stop functioning in a while.

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24 17:18:30 +02:00
Shyam Prasad N
8da33fd11c cifs: avoid deadlocks while updating iface
We use cifs_tcp_ses_lock to protect a lot of things.
Not only does it protect the lists of connections, sessions,
tree connects, open file lists, etc., we also use it to
protect some fields in each of it's entries.

In this case, cifs_mark_ses_for_reconnect takes the
cifs_tcp_ses_lock to traverse the lists, and then calls
cifs_update_iface. However, that can end up calling
cifs_put_tcp_session, which picks up the same lock again.

Avoid this by taking a ref for the session, drop the lock,
and then call update iface.

Also, in cifs_update_iface, avoid nested locking of iface_lock
and chan_lock, as much as possible. When unavoidable, we need
to pick iface_lock first.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-24 09:17:56 -05:00
Joerg Roedel
c242507c1b MAINTAINERS: Add new IOMMU development mailing list
The IOMMU mailing list will move from lists.linux-foundation.org to
lists.linux.dev. The hard switch of the archive will happen on July
5th, but add the new list now already so that people start using the
list when sending patches. After July 5th the old list will disappear.

Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220624125139.412-1-joro@8bytes.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-06-24 15:36:11 +02:00
Xu Yang
b24346a240 usb: chipidea: udc: check request status before setting device address
The complete() function may be called even though request is not
completed. In this case, it's necessary to check request status so
as not to set device address wrongly.

Fixes: 10775eb17b ("usb: chipidea: udc: update gadget states according to ch9")
cc: <stable@vger.kernel.org>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20220623030242.41796-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-24 13:45:23 +02:00
Alan Stern
90bc2af246 USB: gadget: Fix double-free bug in raw_gadget driver
Re-reading a recently merged fix to the raw_gadget driver showed that
it inadvertently introduced a double-free bug in a failure pathway.
If raw_ioctl_init() encounters an error after the driver ID number has
been allocated, it deallocates the ID number before returning.  But
when dev_free() runs later on, it will then try to deallocate the ID
number a second time.

Closely related to this issue is another error in the recent fix: The
ID number is stored in the raw_dev structure before the code checks to
see whether the structure has already been initialized, in which case
the new ID number would overwrite the earlier value.

The solution to both bugs is to keep the new ID number in a local
variable, and store it in the raw_dev structure only after the check
for prior initialization.  No errors can occur after that point, so
the double-free will never happen.

Fixes: f2d8c26068 ("usb: gadget: Fix non-unique driver names in raw-gadget driver")
CC: Andrey Konovalov <andreyknvl@gmail.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/YrMrRw5AyIZghN0v@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-24 13:45:21 +02:00
Tom Lendacky
87d044096e crypto: ccp - Fix device IRQ counting by using platform_irq_count()
The ccp driver loops through the platform device resources array to get
the IRQ count for the device. With commit a1a2b7125e ("of/platform: Drop
static setup of IRQ resource from DT core"), the IRQ resources are no
longer stored in the platform device resource array. As a result, the IRQ
count is now always zero. This causes the driver to issue a second call to
platform_get_irq(), which fails if the IRQ count is really 1, causing the
loading of the driver to fail.

Replace looping through the resources array to count the number of IRQs
with a call to platform_irq_count().

Fixes: a1a2b7125e ("of/platform: Drop static setup of IRQ resource from DT core")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-06-24 17:09:01 +08:00
Peter Gonda
6defa24d3b KVM: SEV: Init target VMCBs in sev_migrate_from
The target VMCBs during an intra-host migration need to correctly setup
for running SEV and SEV-ES guests. Add sev_init_vmcb() function and make
sev_es_init_vmcb() static. sev_init_vmcb() uses the now private function
to init SEV-ES guests VMCBs when needed.

Fixes: 0b020f5af0 ("KVM: SEV: Add support for SEV-ES intra host migration")
Fixes: b56639318b ("KVM: SEV: Add support for SEV intra host migration")
Signed-off-by: Peter Gonda <pgonda@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: <20220623173406.744645-1-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24 04:10:18 -04:00
Mingwei Zhang
ebdec859fa KVM: x86/svm: add __GFP_ACCOUNT to __sev_dbg_{en,de}crypt_user()
Adding the accounting flag when allocating pages within the SEV function,
since these memory pages should belong to individual VM.

No functional change intended.

Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220623171858.2083637-1-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24 03:58:06 -04:00
Jason Wang
c346dae4f3 virtio: disable notification hardening by default
We try to harden virtio device notifications in 8b4ec69d7e ("virtio:
harden vring IRQ"). It works with the assumption that the driver or
core can properly call virtio_device_ready() at the right
place. Unfortunately, this seems to be not true and uncover various
bugs of the existing drivers, mainly the issue of using
virtio_device_ready() incorrectly.

So let's add a Kconfig option and disable it by default. It gives
us time to fix the drivers and then we can consider re-enabling it.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220622012940.21441-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2022-06-24 02:49:48 -04:00
Bo Liu
03d9571706 virtio: Remove unnecessary variable assignments
In function vp_modern_probe(), "pci_dev" is initialized with the
value of "mdev->pci_dev", so assigning "pci_dev" to "mdev->pci_dev"
is unnecessary since they store the same value.

Signed-off-by: Bo Liu <liubo03@inspur.com>
Message-Id: <20220617055952.5364-1-liubo03@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-06-24 02:49:48 -04:00
huangjie.albert
a7722890fd virtio_ring : keep used_wrap_counter in vq->last_used_idx
the used_wrap_counter and the vq->last_used_idx may get
out of sync if they are separate assignment,and interrupt
might use an incorrect value to check for the used index.

for example:OOB access
ksoftirqd may consume the packet and it will call:
virtnet_poll
	-->virtnet_receive
		-->virtqueue_get_buf_ctx
			-->virtqueue_get_buf_ctx_packed
and in virtqueue_get_buf_ctx_packed:

vq->last_used_idx += vq->packed.desc_state[id].num;
if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
         vq->last_used_idx -= vq->packed.vring.num;
         vq->packed.used_wrap_counter ^= 1;
}

if at the same time, there comes a vring interrupt,in vring_interrupt:
we will call:
vring_interrupt
	-->more_used
		-->more_used_packed
			-->is_used_desc_packed
in is_used_desc_packed, the last_used_idx maybe >= vq->packed.vring.num.
so this could case a memory out of bounds bug.

this patch is to keep the used_wrap_counter in vq->last_used_idx
so we can get the correct value to check for used index in interrupt.

v3->v4:
- use READ_ONCE/WRITE_ONCE to get/set vq->last_used_idx

v2->v3:
- add inline function to get used_wrap_counter and last_used
- when use vq->last_used_idx, only read once
  if vq->last_used_idx is read twice, the values can be inconsistent.
- use last_used_idx & ~(-(1 << VRING_PACKED_EVENT_F_WRAP_CTR))
  to get the all bits below VRING_PACKED_EVENT_F_WRAP_CTR

v1->v2:
- reuse the VRING_PACKED_EVENT_F_WRAP_CTR
- Remove parameter judgment in is_used_desc_packed,
because it can't be illegal

Signed-off-by: huangjie.albert <huangjie.albert@bytedance.com>
Message-Id: <20220617020411.80367-1-huangjie.albert@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-24 02:49:48 -04:00
Parav Pandit
0e0348ac3f vduse: Tie vduse mgmtdev and its device
vduse devices are not backed by any real devices such as PCI. Hence it
doesn't have any parent device linked to it.

Kernel driver model in [1] suggests to avoid an empty device
release callback.

Hence tie the mgmtdevice object's life cycle to an allocate dummy struct
device instead of static one.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/core-api/kobject.rst?h=v5.18-rc7#n284

Signed-off-by: Parav Pandit <parav@nvidia.com>
Message-Id: <20220613195223.473966-1-parav@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-24 02:49:48 -04:00
Eli Cohen
ace9252446 vdpa/mlx5: Initialize CVQ vringh only once
Currently, CVQ vringh is initialized inside setup_virtqueues() which is
called every time a memory update is done. This is undesirable since it
resets all the context of the vring, including the available and used
indices.

Move the initialization to mlx5_vdpa_set_status() when
VIRTIO_CONFIG_S_DRIVER_OK is set.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220613075958.511064-2-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-06-24 02:49:47 -04:00
Eli Cohen
40f2f3e941 vdpa/mlx5: Update Control VQ callback information
The control VQ specific information is stored in the dedicated struct
mlx5_control_vq. When the callback is updated through
mlx5_vdpa_set_vq_cb(), make sure to update the control VQ struct.

Fixes: 5262912ef3 ("vdpa/mlx5: Add support for control VQ and MAC setting")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220613075958.511064-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com)
2022-06-24 02:49:47 -04:00
Namjae Jeon
b5e5f9dfc9 ksmbd: check invalid FileOffset and BeyondFinalZero in FSCTL_ZERO_DATA
FileOffset should not be greater than BeyondFinalZero in FSCTL_ZERO_DATA.
And don't call ksmbd_vfs_zero_data() if length is zero.

Cc: stable@vger.kernel.org
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-23 23:30:46 -05:00
Namjae Jeon
18e39fb960 ksmbd: set the range of bytes to zero without extending file size in FSCTL_ZERO_DATA
generic/091, 263 test failed since commit f66f8b94e7 ("cifs: when
extending a file with falloc we should make files not-sparse").
FSCTL_ZERO_DATA sets the range of bytes to zero without extending file
size. The VFS_FALLOCATE_FL_KEEP_SIZE flag should be used even on
non-sparse files.

Cc: stable@vger.kernel.org
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-23 23:30:46 -05:00
Hyunchul Lee
745bbc0995 ksmbd: remove duplicate flag set in smb2_write
The writethrough flag is set again if is_rdma_channel is false.

Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-23 23:30:46 -05:00
Dimitris Michailidis
b968080808 selftests/net: pass ipv6_args to udpgso_bench's IPv6 TCP test
udpgso_bench.sh has been running its IPv6 TCP test with IPv4 arguments
since its initial conmit. Looks like a typo.

Fixes: 3a687bef14 ("selftests: udp gso benchmark")
Cc: willemb@google.com
Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20220623000234.61774-1-dmichail@fungible.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23 21:19:03 -07:00
Eric Dumazet
1228b34c8d net: clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()
syzbot reported uninit-value in tcp_recvmsg() [1]

Issue here is that msg->msg_get_inq should have been cleared,
otherwise tcp_recvmsg() might read garbage and perform
more work than needed, or have undefined behavior.

Given CONFIG_INIT_STACK_ALL_ZERO=y is probably going to be
the default soon, I chose to change __sys_recvfrom() to clear
all fields but msghdr.addr which might be not NULL.

For __copy_msghdr_from_user(), I added an explicit clear
of kmsg->msg_get_inq.

[1]
BUG: KMSAN: uninit-value in tcp_recvmsg+0x6cf/0xb60 net/ipv4/tcp.c:2557
tcp_recvmsg+0x6cf/0xb60 net/ipv4/tcp.c:2557
inet_recvmsg+0x13a/0x5a0 net/ipv4/af_inet.c:850
sock_recvmsg_nosec net/socket.c:995 [inline]
sock_recvmsg net/socket.c:1013 [inline]
__sys_recvfrom+0x696/0x900 net/socket.c:2176
__do_sys_recvfrom net/socket.c:2194 [inline]
__se_sys_recvfrom net/socket.c:2190 [inline]
__x64_sys_recvfrom+0x122/0x1c0 net/socket.c:2190
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0

Local variable msg created at:
__sys_recvfrom+0x81/0x900 net/socket.c:2154
__do_sys_recvfrom net/socket.c:2194 [inline]
__se_sys_recvfrom net/socket.c:2190 [inline]
__x64_sys_recvfrom+0x122/0x1c0 net/socket.c:2190

CPU: 0 PID: 3493 Comm: syz-executor170 Not tainted 5.19.0-rc3-syzkaller-30868-g4b28366af7d9 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: f94fd25cb0 ("tcp: pass back data left in socket after receive")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Alexander Potapenko<glider@google.com>
Link: https://lore.kernel.org/r/20220622150220.1091182-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23 20:56:23 -07:00
Krzysztof Kozlowski
ad887a507d net/ncsi: use proper "mellanox" DT vendor prefix
"mlx" Devicetree vendor prefix is not documented and instead "mellanox"
should be used.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220622115416.7400-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23 20:51:06 -07:00
Liam Howlett
6886da5f49 powerpc/prom_init: Fix kernel config grep
When searching for config options, use the KCONFIG_CONFIG shell variable
so that builds using non-standard config locations work.

Fixes: 26deb04342 ("powerpc: prepare string/mem functions for KASAN")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220624011745.4060795-1-Liam.Howlett@oracle.com
2022-06-24 13:47:26 +10:00
Doug Berger
7c97bc0128 net: dsa: bcm_sf2: force pause link settings
The pause settings reported by the PHY should also be applied to the GMII port
status override otherwise the switch will not generate pause frames towards the
link partner despite the advertisement saying otherwise.

Fixes: 246d7f773c ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220623030204.1966851-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23 20:46:39 -07:00
Liang He
16d584d2fc net/dsa/hirschmann: Add missing of_node_get() in hellcreek_led_setup()
of_find_node_by_name() will decrease the refcount of its first arg and
we need a of_node_get() to keep refcount balance.

Fixes: 7d9ee2e8ff ("net: dsa: hellcreek: Add PTP status LEDs")
Signed-off-by: Liang He <windhl@126.com>
Link: https://lore.kernel.org/r/20220622040621.4094304-1-windhl@126.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23 20:39:22 -07:00
Christophe Leroy
9864816180 powerpc/book3e: Fix PUD allocation size in map_kernel_page()
Commit 2fb4706057 ("powerpc: add support for folded p4d page tables")
erroneously changed PUD setup to a mix of PMD and PUD. Fix it.

While at it, use PTE_TABLE_SIZE instead of PAGE_SIZE for PTE tables
in order to avoid any confusion.

Fixes: 2fb4706057 ("powerpc: add support for folded p4d page tables")
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/95ddfd6176d53e6c85e13bd1c358359daa56775f.1655974558.git.christophe.leroy@csgroup.eu
2022-06-24 12:42:45 +10:00
Nathan Lynch
19fc5bb93c powerpc/xive/spapr: correct bitmap allocation size
kasan detects access beyond the end of the xibm->bitmap allocation:

BUG: KASAN: slab-out-of-bounds in _find_first_zero_bit+0x40/0x140
Read of size 8 at addr c00000001d1d0118 by task swapper/0/1

CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc2-00001-g90df023b36dd #28
Call Trace:
[c00000001d98f770] [c0000000012baab8] dump_stack_lvl+0xac/0x108 (unreliable)
[c00000001d98f7b0] [c00000000068faac] print_report+0x37c/0x710
[c00000001d98f880] [c0000000006902c0] kasan_report+0x110/0x354
[c00000001d98f950] [c000000000692324] __asan_load8+0xa4/0xe0
[c00000001d98f970] [c0000000011c6ed0] _find_first_zero_bit+0x40/0x140
[c00000001d98f9b0] [c0000000000dbfbc] xive_spapr_get_ipi+0xcc/0x260
[c00000001d98fa70] [c0000000000d6d28] xive_setup_cpu_ipi+0x1e8/0x450
[c00000001d98fb30] [c000000004032a20] pSeries_smp_probe+0x5c/0x118
[c00000001d98fb60] [c000000004018b44] smp_prepare_cpus+0x944/0x9ac
[c00000001d98fc90] [c000000004009f9c] kernel_init_freeable+0x2d4/0x640
[c00000001d98fd90] [c0000000000131e8] kernel_init+0x28/0x1d0
[c00000001d98fe10] [c00000000000cd54] ret_from_kernel_thread+0x5c/0x64

Allocated by task 0:
 kasan_save_stack+0x34/0x70
 __kasan_kmalloc+0xb4/0xf0
 __kmalloc+0x268/0x540
 xive_spapr_init+0x4d0/0x77c
 pseries_init_irq+0x40/0x27c
 init_IRQ+0x44/0x84
 start_kernel+0x2a4/0x538
 start_here_common+0x1c/0x20

The buggy address belongs to the object at c00000001d1d0118
 which belongs to the cache kmalloc-8 of size 8
The buggy address is located 0 bytes inside of
 8-byte region [c00000001d1d0118, c00000001d1d0120)

The buggy address belongs to the physical page:
page:c00c000000074740 refcount:1 mapcount:0 mapping:0000000000000000 index:0xc00000001d1d0558 pfn:0x1d1d
flags: 0x7ffff000000200(slab|node=0|zone=0|lastcpupid=0x7ffff)
raw: 007ffff000000200 c00000001d0003c8 c00000001d0003c8 c00000001d010480
raw: c00000001d1d0558 0000000001e1000a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 c00000001d1d0000: fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 c00000001d1d0080: fc fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
>c00000001d1d0100: fc fc fc 02 fc fc fc fc fc fc fc fc fc fc fc fc
                            ^
 c00000001d1d0180: fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc fc
 c00000001d1d0200: fc fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc

This happens because the allocation uses the wrong unit (bits) when it
should pass (BITS_TO_LONGS(count) * sizeof(long)) or equivalent. With small
numbers of bits, the allocated object can be smaller than sizeof(long),
which results in invalid accesses.

Use bitmap_zalloc() to allocate and initialize the irq bitmap, paired with
bitmap_free() for consistency.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220623182509.3985625-1-nathanl@linux.ibm.com
2022-06-24 12:40:38 +10:00
Dave Airlie
1e9124df8b Merge tag 'drm-msm-fixes-2022-06-20' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v5.19-rc4

- Workaround for parade DSI bridge power sequencing
- Fix for multi-planar YUV format offsets
- Limiting WB modes to max sspp linewidth
- Fixing the supported rotations to add 180 back for IGT
- Fix to handle pm_runtime_get_sync() errors to avoid unclocked access
  in the bind() path for dpu driver
- Fix the irq_free() without request issue which was a being hit frequently
  in CI.
- Fix to add minimum ICC vote in the msm_mdss pm_resume path to address
  bootup splats
- Fix to avoid dereferencing without checking in WB encoder
- Fix to avoid crash during suspend in DP driver by ensuring interrupt
  mask bits are updated
- Remove unused code from dpu_encoder_virt_atomic_check()
- Fix to remove redundant init of dsc variable
- Fix to ensure mmap offset is initialized to avoid memory corruption
  from unpin/evict
- Fix double runpm disable in probe-defer path
- VMA fenced-unpin fixes
- Fix for WB max-width
- Fix for rare dp resolution change issue

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvdsOF1-+WfTWyEyu33XPcvxOCU00G-dz7EF2J+fdyUHg@mail.gmail.com
2022-06-24 10:11:27 +10:00
Dave Airlie
08d27daaaa Merge tag 'drm-intel-fixes-2022-06-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.19-rc4:
- Revert low voltage SKU check removal to fix display issues
- Apply PLL DCO fraction workaround for ADL-S
- Don't show engine classes not present in client fdinfo

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87a6a4syrr.fsf@intel.com
2022-06-24 09:58:06 +10:00
Dave Airlie
0a86b0db38 Merge tag 'drm-misc-fixes-2022-06-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Multiple fixes in sun4i for suspend, DDC, DMA setup; A rework of vc4 to
properly split the driver between hardware capabilities that wasn't done
properly causing multiple crashes; and a panel quirk for Aya Neo Next

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220623064152.ubjmnpj7tdejdcw6@houat
2022-06-24 09:45:50 +10:00
Dave Airlie
382cf35f25 Merge tag 'amd-drm-fixes-5.19-2022-06-22' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.19-2022-06-22:

amdgpu:
- Adjust GTT size logic
- eDP fix for RMB
- DCN 3.15 fix
- DP training fix
- Color encoding fix for DCN2+

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220622214106.5984-1-alexander.deucher@amd.com
2022-06-24 09:36:30 +10:00
Stefan Wahren
b0d473185b gpio: mxs: Fix header comment
This driver is about MXS GPIO support. MXC is a different platform.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-06-23 23:18:13 +02:00
Dave Chinner
5e672cd69f xfs: introduce xfs_inodegc_push()
The current blocking mechanism for pushing the inodegc queue out to
disk can result in systems becoming unusable when there is a long
running inodegc operation. This is because the statfs()
implementation currently issues a blocking flush of the inodegc
queue and a significant number of common system utilities will call
statfs() to discover something about the underlying filesystem.

This can result in userspace operations getting stuck on inodegc
progress, and when trying to remove a heavily reflinked file on slow
storage with a full journal, this can result in delays measuring in
hours.

Avoid this problem by adding "push" function that expedites the
flushing of the inodegc queue, but doesn't wait for it to complete.

Convert xfs_fs_statfs() and xfs_qm_scall_getquota() to use this
mechanism so they don't block but still ensure that queued
operations are expedited.

Fixes: ab23a77687 ("xfs: per-cpu deferred inode inactivation queues")
Reported-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
[djwong: fix _getquota_next to use _inodegc_push too]
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-23 13:34:38 -07:00
Dave Chinner
7cf2b0f961 xfs: bound maximum wait time for inodegc work
Currently inodegc work can sit queued on the per-cpu queue until
the workqueue is either flushed of the queue reaches a depth that
triggers work queuing (and later throttling). This means that we
could queue work that waits for a long time for some other event to
trigger flushing.

Hence instead of just queueing work at a specific depth, use a
delayed work that queues the work at a bound time. We can still
schedule the work immediately at a given depth, but we no long need
to worry about leaving a number of items on the list that won't get
processed until external events prevail.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-23 13:34:38 -07:00
Akira Yokosawa
c7e1c44358 gpio: Fix kernel-doc comments to nested union
Commit 48ec13d36d ("gpio: Properly document parent data union")
is supposed to have fixed a warning from "make htmldocs" regarding
kernel-doc comments to union members.  However, the same warning
still remains [1].

Fix the issue by following the example found in section "Nested
structs/unions" of Documentation/doc-guide/kernel-doc.rst.

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 48ec13d36d ("gpio: Properly document parent data union")
Link: https://lore.kernel.org/r/20220606093302.21febee3@canb.auug.org.au/ [1]
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-06-23 22:17:43 +02:00
Jinzhou Su
b376471fb4 cpufreq: amd-pstate: Add resume and suspend callbacks
When system resumes from S3, the CPPC enable register will be
cleared and reset to 0.

So enable the CPPC interface by writing 1 to this register on
system resume and disable it during system suspend.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
[ rjw: Subject and changelog edits ]
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-23 21:19:52 +02:00
Linus Torvalds
92f20ff720 Merge tag 'pm-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
 "Fix a recent regression preventing some systems from powering off
  after saving a hibernation image (Dmitry Osipenko)"

* tag 'pm-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: hibernate: Use kernel_can_power_off()
2022-06-23 14:17:15 -05:00
Linus Torvalds
ba461afbef Merge tag 'random-5.19-rc4-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator fixes from Jason Donenfeld:

 - A change to schedule the interrupt randomness mixing less often, yet
   credit a little more each time, to reduce overhead during interrupt
   storms.

 - Squelch an undesired pr_warn() from __ratelimit(), which was causing
   problems in the reporters' CI.

 - A trivial comment fix.

* tag 'random-5.19-rc4-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  random: update comment from copy_to_user() -> copy_to_iter()
  random: quiet urandom warning ratelimit suppression message
  random: schedule mix_interrupt_randomness() less often
2022-06-23 14:00:49 -05:00
Mikulas Patocka
90736eb323 dm mirror log: clear log bits up to BITS_PER_LONG boundary
Commit 85e123c27d ("dm mirror log: round up region bitmap size to
BITS_PER_LONG") introduced a regression on 64-bit architectures in the
lvm testsuite tests: lvcreate-mirror, mirror-names and vgsplit-operation.

If the device is shrunk, we need to clear log bits beyond the end of the
device. The code clears bits up to a 32-bit boundary and then calculates
lc->sync_count by summing set bits up to a 64-bit boundary (the commit
changed that; previously, this boundary was 32-bit too). So, it was using
some non-zeroed bits in the calculation and this caused misbehavior.

Fix this regression by clearing bits up to BITS_PER_LONG boundary.

Fixes: 85e123c27d ("dm mirror log: round up region bitmap size to BITS_PER_LONG")
Cc: stable@vger.kernel.org
Reported-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-23 14:55:43 -04:00
Ming Lei
61b6e2e532 dm: fix BLK_STS_DM_REQUEUE handling when dm_io represents split bio
Commit 7dd76d1fee ("dm: improve bio splitting and associated IO
accounting") removed using cloned bio when dm io splitting is needed.
Using bio_trim()+bio_inc_remaining() rather than bio_split()+bio_chain()
causes multiple dm_io instances to share the same original bio, and it
works fine if IOs are completed successfully.

But a regression was caused for the case when BLK_STS_DM_REQUEUE is
returned from any one of DM's cloned bios (whose dm_io share the same
orig_bio). In this BLK_STS_DM_REQUEUE case only the mapped subset of
the original bio for the current exact dm_io needs to be re-submitted.
However, since the original bio is shared among all dm_io instances,
the ->orig_bio actually only represents the last dm_io instance, so
requeue can't work as expected. Also when more than one dm_io is
requeued, the same original bio is requeued from all dm_io's
completion handler, then race is caused.

Fix this issue by still allocating one clone bio for completing io
only, then io accounting can rely on ->orig_bio being unmodified. This
is needed because the dm_io's sector_offset and sectors members are
recorded relative to an unmodified ->orig_bio.

In the future, we can go back to using bio_trim()+bio_inc_remaining()
for dm's io splitting but then delay needing a bio clone only when
handling BLK_STS_DM_REQUEUE, but that approach is a bit complicated
(so it needs a development cycle):
1) bio clone needs to be done in task context
2) a block interface for unwinding bio is required

Fixes: 7dd76d1fee ("dm: improve bio splitting and associated IO accounting")
Reported-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-23 14:33:13 -04:00
sunliming
eb174bd875 drm/msm/dpu: Fix variable dereferenced before check
Fixes the following smatch warning:

drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c:261
dpu_encoder_phys_wb_atomic_check() warn: variable dereferenced before check 'conn_state'

Fixes: d7d0e73f7d ("drm/msm/dpu: introduce the dpu_encoder_phys_* for writeback")
Signed-off-by: sunliming <sunliming@kylinos.cn>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/490850/
Link: https://lore.kernel.org/r/20220623012707.453972-1-sunliming@kylinos.cn
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2022-06-23 10:34:36 -07:00
Kuogee Hsieh
0769d0a7ae drm/msm/dp: reset drm_dev to NULL at dp_display_unbind()
During msm initialize phase, dp_display_unbind() will be called to undo
initializations had been done by dp_display_bind() previously if there is
error happen at msm_drm_bind. Under this kind of circumstance, drm_device
may not be populated completed which causes system crash at drm_dev_dbg().
This patch reset drm_dev to NULL so that following drm_dev_dbg() will not
refer to any internal fields of drm_device to prevent system from crashing.
Below are panic stack trace,

[   53.584904] Unable to handle kernel paging request at virtual address 0000000070018001
.
[   53.702212] Hardware name: Qualcomm Technologies, Inc. sc7280 CRD platform (rev5+) (DT)
[   53.710445] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   53.717596] pc : string_nocheck+0x1c/0x64
[   53.721738] lr : string+0x54/0x60
[   53.725162] sp : ffffffc013d6b650
[   53.728590] pmr_save: 000000e0
[   53.731743] x29: ffffffc013d6b650 x28: 0000000000000002 x27: 0000000000ffffff
[   53.739083] x26: ffffffc013d6b710 x25: ffffffd07a066ae0 x24: ffffffd07a419f97
[   53.746420] x23: ffffffd07a419f99 x22: ffffff81fef360d4 x21: ffffff81fef364d4
[   53.753760] x20: ffffffc013d6b6f8 x19: ffffffd07a06683c x18: 0000000000000000
[   53.761093] x17: 4020386678302f30 x16: 00000000000000b0 x15: ffffffd0797523c8
[   53.768429] x14: 0000000000000004 x13: ffff0000ffffff00 x12: ffffffd07a066b2c
[   53.775780] x11: 0000000000000000 x10: 000000000000013c x9 : 0000000000000000
[   53.783117] x8 : ffffff81fef364d4 x7 : 0000000000000000 x6 : 0000000000000000
[   53.790445] x5 : 0000000000000000 x4 : ffff0a00ffffff04 x3 : ffff0a00ffffff04
[   53.797783] x2 : 0000000070018001 x1 : ffffffffffffffff x0 : ffffff81fef360d4
[   53.805136] Call trace:
[   53.807667]  string_nocheck+0x1c/0x64
[   53.811439]  string+0x54/0x60
[   53.814498]  vsnprintf+0x374/0x53c
[   53.818009]  pointer+0x3dc/0x40c
[   53.821340]  vsnprintf+0x398/0x53c
[   53.824854]  vscnprintf+0x3c/0x88
[   53.828274]  __trace_array_vprintk+0xcc/0x2d4
[   53.832768]  trace_array_printk+0x8c/0xb4
[   53.836900]  drm_trace_printf+0x74/0x9c
[   53.840875]  drm_dev_dbg+0xfc/0x1b8
[   53.844480]  dp_pm_suspend+0x70/0xf8
[   53.848164]  dpm_run_callback+0x60/0x1a0
[   53.852222]  __device_suspend+0x304/0x3f4
[   53.856363]  dpm_suspend+0xf8/0x3a8
[   53.859959]  dpm_suspend_start+0x8c/0xc0

Fixes: 570d3e5d28 ("drm/msm/dp: stop event kernel thread when DP unbind")
Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/490756/
Link: https://lore.kernel.org/r/1655927731-22396-1-git-send-email-quic_khsieh@quicinc.com
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2022-06-23 10:28:53 -07:00
Stephen Boyd
c28d76d360 drm/msm/dpu: Increment vsync_cnt before waking up userspace
The 'vsync_cnt' is used to count the number of frames for a crtc.
Unfortunately, we increment the count after waking up userspace via
dpu_crtc_vblank_callback() calling drm_crtc_handle_vblank().
drm_crtc_handle_vblank() wakes up userspace processes that have called
drm_wait_vblank_ioctl(), and if that ioctl is expecting the count to
increase it won't.

Increment the count before calling into the drm APIs so that we don't
have to worry about ordering the increment with anything else in drm.
This fixes a software video decode test that fails to see frame counts
increase on Trogdor boards.

Cc: Mark Yacoub <markyacoub@chromium.org>
Cc: Jessica Zhang <quic_jesszhan@quicinc.com>
Fixes: 885455d6bf ("drm/msm: Change dpu_crtc_get_vblank_counter to use vsync count.")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Tested-by: Jessica Zhang <quic_jesszhan@quicinc.com> # Trogdor (sc7180)
Patchwork: https://patchwork.freedesktop.org/patch/490531/
Link: https://lore.kernel.org/r/20220622023855.2970913-1-swboyd@chromium.org
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2022-06-23 10:25:32 -07:00
Linus Torvalds
fa1796a835 Merge tag 'trace-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Check for NULL in kretprobe_dispatcher()

   NULL can now be passed in, make sure it can handle it

 - Clean up unneeded #endif #ifdef of the same preprocessor
   check in the middle of the block.

 - Comment clean up

 - Remove unneeded initialization of the "ret" variable in
   __trace_uprobe_create()

* tag 'trace-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/uprobes: Remove unwanted initialization in __trace_uprobe_create()
  tracefs: Fix syntax errors in comments
  tracing: Simplify conditional compilation code in tracing_set_tracer()
  tracing/kprobes: Check whether get_kretprobe() returns NULL in kretprobe_dispatcher()
2022-06-23 12:24:49 -05:00
Linus Torvalds
16e4bce6de Merge tag 'folio-5.19b' of git://git.infradead.org/users/willy/pagecache
Pull pagecache fixes from Matthew Wilcox:
 "Four folio-related fixes for 5.19:

   - Mark a folio accessed at the right time (Yu Kuai)

   - Fix a race for folios being replaced in the middle of a read (Brian
     Foster)

   - Clear folio->private in more places (Xiubo Li)

   - Take the invalidate_lock in page_cache_ra_order() (Alistair Popple)"

* tag 'folio-5.19b' of git://git.infradead.org/users/willy/pagecache:
  filemap: Fix serialization adding transparent huge pages to page cache
  mm: Clear page->private when splitting or migrating a page
  filemap: Handle sibling entries in filemap_get_read_batch()
  filemap: Correct the conditions for marking a folio as accessed
2022-06-23 12:16:14 -05:00
Petr Mladek
51889d225c Merge branch 'rework/kthreads' into for-linus 2022-06-23 19:11:28 +02:00
Linus Torvalds
599d16912d Merge tag 'mips-fixes_5.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:

 - several refcount fixes

 - added missing clock for ingenic

 - fix wrong irq_err_count for vr41xx

* tag 'mips-fixes_5.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  mips: lantiq: Add missing of_node_put() in irq.c
  mips: dts: ingenic: Add TCU clock to x1000/x1830 tcu device node
  mips/pic32/pic32mzda: Fix refcount leak bugs
  mips: lantiq: xway: Fix refcount leak bug in sysctrl
  mips: lantiq: falcon: Fix refcount leak bug in sysctrl
  mips: ralink: Fix refcount leak in of.c
  mips: mti-malta: Fix refcount leak in malta-time.c
  arch: mips: generic: Add missing of_node_put() in board-ranchu.c
  MIPS: Remove repetitive increase irq_err_count
2022-06-23 12:11:26 -05:00
Jens Axboe
386e4fb696 io_uring: use original request task for inflight tracking
In prior kernels, we did file assignment always at prep time. This meant
that req->task == current. But after deferring that assignment and then
pushing the inflight tracking back in, we've got the inflight tracking
using current when it should in fact now be using req->task.

Fixup that error introduced by adding the inflight tracking back after
file assignments got modifed.

Fixes: 9cae36a094 ("io_uring: reinstate the inflight tracking")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-23 11:06:43 -06:00
Paolo Bonzini
6945b2141f MAINTAINERS: Reorganize KVM/x86 maintainership
For the last few years I have been the sole maintainer of KVM, albeit
getting serious help from all the people who have reviewed hundreds of
patches.  The volume of KVM x86 alone has gotten to the point where one
maintainer is not enough; especially if that maintainer is not doing it
full time and if they want to keep up with the evolution of ARM64 and
RISC-V at both the architecture and the hypervisor level.

So, this patch is the first step in restoring double maintainership
or even transitioning to the submaintainer model of other architectures.

The changes here were mostly proposed by Sean offlist and they are twofold:

- revisiting the set of KVM x86 reviewers.  It's important to have an
  an accurate list of people that are actively reviewing patches ("R"),
  as well as people that are able to act on bug reports ("M").  Otherwise,
  voids to be filled are not easily visible.  The proposal is to split
  KVM on Hyper-V, which is where Vitaly has been the main contributor
  for quite some time now; likewise for KVM paravirt support, which
  has been the main interest of Wanpeng and to which Vitaly has also
  contributed (e.g., for async page faults).  Jim and Joerg have not been
  particularly active (though Joerg has worked on guest support for AMD
  SEV); knowing them a bit, I can't imagine they would object to their
  removal or even be surprised, but please speak up if you do.

- promoting Sean to maintainer for KVM x86 host support.  While for
  now this changes little, let's treat it as a harbinger for future
  changes.  The plan is that I would keep the final integration testing
  for quite some time, and probably focus more on -rc work.  This will
  give me more time to clean up my ad hoc setup and moving towards a
  more public CI, with Sean focusing instead on next-release patches,
  and the testing up to where kvm-unit-tests and selftests pass.  In
  order to facilitate collaboration between Sean and myself, we'll
  also formalize a bit more the various branches of kvm.git.

Nothing is going to change with respect to handling pull requests to Linus
and from other architectures, as well as maintainance of the generic code
(which I expect and hope to be more important as architectures try to
share more code) and documentation.  However, it's not a coincidence
that my entry is now the last for x86, ready to be demoted to reviewer
if/when the right time comes.

Suggested-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org
Cc: Sean Christopherson <seanjc@google.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-23 12:57:49 -04:00
Petr Mladek
07a22b6194 Revert "printk: add functions to prefer direct printing"
This reverts commit 2bb2b7b57f.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-7-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek
5831788afb Revert "printk: add kthread console printers"
This reverts commit 09c5ba0aa2.

This reverts commit b87f02307d.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-6-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek
2d9ef940f8 Revert "printk: extend console_lock for per-console locking"
This reverts commit 8e27473211.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-5-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek
007eeab7e9 Revert "printk: remove @console_locked"
This reverts commit ab406816fc.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-4-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek
05c96b3713 Revert "printk: Block console kthreads when direct printing will be required"
This reverts commit c3230283e2.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-3-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Petr Mladek
20fb0c8272 Revert "printk: Wait for the global console lock when the system is going down"
This reverts commit b87f02307d.

The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.

It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.

printk() is crucial for debugging kernel issues and console output is
very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it need to be reverted for 5.19.

Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220623145157.21938-2-pmladek@suse.com
2022-06-23 18:41:40 +02:00
Alistair Popple
00fa15e0d5 filemap: Fix serialization adding transparent huge pages to page cache
Commit 793917d997 ("mm/readahead: Add large folio readahead")
introduced support for using large folios for filebacked pages if the
filesystem supports it.

page_cache_ra_order() was introduced to allocate and add these large
folios to the page cache. However adding pages to the page cache should
be serialized against truncation and hole punching by taking
invalidate_lock. Not doing so can lead to data races resulting in stale
data getting added to the page cache and marked up-to-date. See commit
730633f0b7 ("mm: Protect operations adding pages to page cache with
invalidate_lock") for more details.

This issue was found by inspection but a testcase revealed it was
possible to observe in practice on XFS. Fix this by taking
invalidate_lock in page_cache_ra_order(), to mirror what is done for the
non-thp case in page_cache_ra_unbounded().

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Fixes: 793917d997 ("mm/readahead: Add large folio readahead")
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-23 12:22:00 -04:00
Matthew Wilcox (Oracle)
b653db7735 mm: Clear page->private when splitting or migrating a page
In our efforts to remove uses of PG_private, we have found folios with
the private flag clear and folio->private not-NULL.  That is the root
cause behind 642d51fb07 ("ceph: check folio PG_private bit instead
of folio->private").  It can also affect a few other filesystems that
haven't yet reported a problem.

compaction_alloc() can return a page with uninitialised page->private,
and rather than checking all the callers of migrate_pages(), just zero
page->private after calling get_new_page().  Similarly, the tail pages
from split_huge_page() may also have an uninitialised page->private.

Reported-by: Xiubo Li <xiubli@redhat.com>
Tested-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-23 12:21:44 -04:00
Thomas Richter
21e8764487 s390/pai: Fix multiple concurrent event installation
Two different events such as pai_crypto/KM_AES_128/ and
pai_crypto/KM_AES_192/ can be installed multiple times on the same CPU
and the events are executed concurrently:

  # perf stat -e pai_crypto/KM_AES_128/  -C0 -a -- sleep 5 &
  # sleep 2
  # perf stat -e pai_crypto/KM_AES_192/ -C0 -a -- true

This results in the first event being installed two times with two seconds
delay. The kernel does install the second event after the first
event has been deleted and re-added, as can be seen in the traces:

 13:48:47.600350  paicrypt_start event 0x1007 (event KM_AES_128)
 13:48:49.599359  paicrypt_stop event 0x1007  (event KM_AES_128)
 13:48:49.599198  paicrypt_start event 0x1007
 13:48:49.599199  paicrypt_start event 0x1008
 13:48:49.599921  paicrypt_event_destroy event 0x1008
 13:48:52.601507  paicrypt_event_destroy event 0x1007

This is caused by functions event_sched_in() and event_sched_out() which
call the PMU's add() and start() functions on schedule_in and the PMU's
stop() and del() functions on schedule_out. This is correct for events
attached to processes.  The pai_crypto events are system-wide events
and not attached to processes.

Since the kernel common code can not be changed easily, fix this issue
and do not reset the event count value to zero each time the event is
added and started. Instead use a flag and zero the event count value
only when called immediately after the event has been initialized.
Therefore only the first invocation of the the event's add() function
initializes the event count value to zero. The following invocations
of the event's add() function leave the current event count value
untouched.

Fixes: 39d62336f5 ("s390/pai: add support for cryptography counters")

Reported-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-23 17:24:00 +02:00
Thomas Richter
541a496644 s390/pai: Prevent invalid event number for pai_crypto PMU
The pai_crypto PMU has to check the event number. It has to be in
the supported range. This is not the case, the lower limit is not
checked. Fix this and obey the lower limit.

Fixes: 39d62336f5 ("s390/pai: add support for cryptography counters")

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Suggested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-23 17:23:59 +02:00
Thomas Richter
be857b7f77 s390/cpumf: Handle events cycles and instructions identical
Events CPU_CYCLES and INSTRUCTIONS can be submitted with two different
perf_event attribute::type values:
 - PERF_TYPE_HARDWARE: when invoked via perf tool predefined events name
   cycles or cpu-cycles or instructions.
 - pmu->type: when invoked via perf tool event name cpu_cf/CPU_CYLCES/ or
   cpu_cf/INSTRUCTIONS/. This invocation also selects the PMU to which
   the event belongs.
Handle both type of invocations identical for events CPU_CYLCES and
INSTRUCTIONS. They address the same hardware.
The result is different when event modifier exclude_kernel is also set.
Invocation with event modifier for user space event counting fails.

Output before:

 # perf stat -e cpum_cf/cpu_cycles/u -- true

 Performance counter stats for 'true':

   <not supported>      cpum_cf/cpu_cycles/u

       0.000761033 seconds time elapsed

       0.000076000 seconds user
       0.000725000 seconds sys

 #

Output after:
 # perf stat -e cpum_cf/cpu_cycles/u -- true

 Performance counter stats for 'true':

           349,613      cpum_cf/cpu_cycles/u

       0.000844143 seconds time elapsed

       0.000079000 seconds user
       0.000800000 seconds sys
 #

Fixes: 6a82e23f45 ("s390/cpumf: Adjust registration of s390 PMU device drivers")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
[agordeev@linux.ibm.com corrected commit ID of Fixes commit]
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-23 17:23:59 +02:00
Alexander Gordeev
af2debd58b s390/crash: make copy_oldmem_page() return number of bytes copied
Callback copy_oldmem_page() returns either error code or zero.
Instead, it should return the error code or number of bytes copied.

Fixes: df9694c797 ("s390/dump: streamline oldmem copy functions")
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-23 17:23:30 +02:00
Alexander Gordeev
cc02e6e21a s390/crash: add missing iterator advance in copy_oldmem_page()
In case old memory was successfully copied the passed iterator
should be advanced as well. Currently copy_oldmem_page() is
always called with single-segment iterator. Should that ever
change - copy_oldmem_user and copy_oldmem_kernel() functions
would need a rework to deal with multi-segment iterators.

Fixes: 5d8de293c2 ("vmcore: convert copy_oldmem_page() to take an iov_iter")
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-23 17:23:30 +02:00
Uwe Kleine-König
c1c2a15c2b gpio: grgpio: Fix device removing
If a platform device's remove callback returns non-zero, the device core
emits a warning and still removes the device and calls the devm cleanup
callbacks.

So it's not save to not unregister the gpiochip because on the next request
to a GPIO the driver accesses kfree()'d memory. Also if an IRQ triggers,
the freed memory is accessed.

Instead rely on the GPIO framework to ensure that after gpiochip_remove()
all GPIOs are freed and so the corresponding IRQs are unmapped.

This is a preparation for making platform remove callbacks return void.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-06-23 16:42:27 +02:00
Dylan Yudaken
e70b64a3f2 io_uring: move io_uring_get_opcode out of TP_printk
The TP_printk macro's are not supposed to use custom code ([1]) or else
tools such as perf cannot use these events.

Convert the opcode string representation to use the __string wiring that
the event framework provides ([2]).

[1]: https://lwn.net/Articles/379903/
[2]: https://lwn.net/Articles/381064/

Fixes: 033b87d24f ("io_uring: use the text representation of ops in trace")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220623083743.2648321-1-dylany@fb.com
[axboe: fixup spurious removal of sq_thread assignment]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-23 08:40:36 -06:00
Dan Carpenter
9ca766eaea gpio: winbond: Fix error code in winbond_gpio_get()
This error path returns 1, but it should instead propagate the negative
error code from winbond_sio_enter().

Fixes: a0d6500941 ("gpio: winbond: Add driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-06-23 16:29:55 +02:00
Utkarsh Patel
8ffdc53a60 xhci-pci: Allow host runtime PM as default for Intel Meteor Lake xHCI
Meteor Lake TCSS(Type-C Subsystem) xHCI needs to be runtime suspended
whenever possible to allow the TCSS hardware block to enter D3cold and
thus save energy.

Cc: stable@kernel.org
Signed-off-by: Utkarsh Patel <utkarsh.h.patel@intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20220623111945.1557702-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-23 16:27:28 +02:00
Tanveer Alam
7516da47a3 xhci-pci: Allow host runtime PM as default for Intel Raptor Lake xHCI
In the same way as Intel Alder Lake TCSS (Type-C Subsystem) the Raptor
Lake TCSS xHCI needs to be runtime suspended whenever possible to
allow the TCSS hardware block to enter D3cold and thus save energy.

Cc: stable@kernel.org
Signed-off-by: Tanveer Alam <tanveer1.alam@intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20220623111945.1557702-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-23 16:27:28 +02:00
Mathias Nyman
83810f84ec xhci: turn off port power in shutdown
If ports are not turned off in shutdown then runtime suspended
self-powered USB devices may survive in U3 link state over S5.

During subsequent boot, if firmware sends an IPC command to program
the port in DISCONNECT state, it will time out, causing significant
delay in the boot time.

Turning off roothub port power is also recommended in xhci
specification 4.19.4 "Port Power" in the additional note.

Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20220623111945.1557702-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-23 16:27:28 +02:00
Hongyu Xie
a808925075 xhci: Keep interrupt disabled in initialization until host is running.
irq is disabled in xhci_quiesce(called by xhci_halt, with bit:2 cleared
in USBCMD register), but xhci_run(called by usb_add_hcd) re-enable it.
It's possible that you will receive thousands of interrupt requests
after initialization for 2.0 roothub. And you will get a lot of
warning like, "xHCI dying, ignoring interrupt. Shouldn't IRQs be
disabled?". This amount of interrupt requests will cause the entire
system to freeze.
This problem was first found on a device with ASM2142 host controller
on it.

[tidy up old code while moving it, reword header -Mathias]

Cc: stable@kernel.org
Signed-off-by: Hongyu Xie <xiehongyu1@kylinos.cn>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20220623111945.1557702-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-23 16:27:28 +02:00
Paolo Bonzini
922d4578cf Merge tag 'kvmarm-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.19, take #2

- Fix a regression with pKVM when kmemleak is enabled

- Add Oliver Upton as an official KVM/arm64 reviewer
2022-06-23 10:27:21 -04:00
Raghavendra Rao Ananta
9e2f6498ef selftests: KVM: Handle compiler optimizations in ucall
The selftests, when built with newer versions of clang, is found
to have over optimized guests' ucall() function, and eliminating
the stores for uc.cmd (perhaps due to no immediate readers). This
resulted in the userspace side always reading a value of '0', and
causing multiple test failures.

As a result, prevent the compiler from optimizing the stores in
ucall() with WRITE_ONCE().

Suggested-by: Ricardo Koller <ricarkol@google.com>
Suggested-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Message-Id: <20220615185706.1099208-1-rananta@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-23 10:26:41 -04:00
Greg Kroah-Hartman
2bdc2bcd9a Merge tag 'usb-serial-5.19-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:

USB-serial fixes for 5.19-rc4

Here are some new modem device ids and support for further PL2303
device types.

All but the final commit (RM500K device id) have been in linux-next and
with no reported issues.

* tag 'usb-serial-5.19-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: option: add Quectel RM500K module support
  USB: serial: option: add Quectel EM05-G modem
  USB: serial: pl2303: add support for more HXN (G) types
  USB: serial: option: add Telit LE910Cx 0x1250 composition
2022-06-23 16:25:55 +02:00
Linus Torvalds
399bd66e21 Merge tag 'net-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf and netfilter.

  Current release - regressions:

   - netfilter: cttimeout: fix slab-out-of-bounds read in
     cttimeout_net_exit

Current release - new code bugs:

   - bpf: ftrace: keep address offset in ftrace_lookup_symbols

   - bpf: force cookies array to follow symbols sorting

  Previous releases - regressions:

   - ipv4: ping: fix bind address validity check

   - tipc: fix use-after-free read in tipc_named_reinit

   - eth: veth: add updating of trans_start

  Previous releases - always broken:

   - sock: redo the psock vs ULP protection check

   - netfilter: nf_dup_netdev: fix skb_under_panic

   - bpf: fix request_sock leak in sk lookup helpers

   - eth: igb: fix a use-after-free issue in igb_clean_tx_ring

   - eth: ice: prohibit improper channel config for DCB

   - eth: at803x: fix null pointer dereference on AR9331 phy

   - eth: virtio_net: fix xdp_rxq_info bug after suspend/resume

  Misc:

   - eth: hinic: replace memcpy() with direct assignment"

* tag 'net-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
  net: openvswitch: fix parsing of nw_proto for IPv6 fragments
  sock: redo the psock vs ULP protection check
  Revert "net/tls: fix tls_sk_proto_close executed repeatedly"
  virtio_net: fix xdp_rxq_info bug after suspend/resume
  igb: Make DMA faster when CPU is active on the PCIe link
  net: dsa: qca8k: reduce mgmt ethernet timeout
  net: dsa: qca8k: reset cpu port on MTU change
  MAINTAINERS: Add a maintainer for OCP Time Card
  hinic: Replace memcpy() with direct assignment
  Revert "drivers/net/ethernet/neterion/vxge: Fix a use-after-free bug in vxge-main.c"
  net: phy: smsc: Disable Energy Detect Power-Down in interrupt mode
  ice: ethtool: Prohibit improper channel config for DCB
  ice: ethtool: advertise 1000M speeds properly
  ice: Fix switchdev rules book keeping
  ice: ignore protocol field in GTP offload
  netfilter: nf_dup_netdev: add and use recursion counter
  netfilter: nf_dup_netdev: do not push mac header a second time
  selftests: netfilter: correct PKTGEN_SCRIPT_PATHS in nft_concat_range.sh
  net/tls: fix tls_sk_proto_close executed repeatedly
  erspan: do not assume transport header is always set
  ...
2022-06-23 09:01:01 -05:00
Linus Torvalds
f410c3e000 Merge tag 'mmc-v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:

 - mtk-sd: Fix dma hang issues

 - sdhci-pci-o2micro: Fix card detect by dealing with debouncing

* tag 'mmc-v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: mediatek: wait dma stop bit reset to 0
  mmc: sdhci-pci-o2micro: Fix card detect by dealing with debouncing
2022-06-23 08:55:37 -05:00
Jens Axboe
e531485a0a Merge tag 'nvme-5.19-2022-06-23' of git://git.infradead.org/nvme into block-5.19
Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.19

 - fix the mixed up CRIMS/CRWMS constants (Joel Granados)
 - add another broken identifier quirk (Leo Savernik)
 - fix up a quirk because Samsung reuses PCI IDs over different products
   (Christoph Hellwig)"

* tag 'nvme-5.19-2022-06-23' of git://git.infradead.org/nvme:
  nvme: move the Samsung X5 quirk entry to the core quirks
  nvme: fix the CRIMS and CRWMS definitions to match the spec
  nvme: add a bogus subsystem NQN quirk for Micron MTFDKBA2T0TFH
2022-06-23 07:55:07 -06:00
Li Nan
ca2a3343d6 block: remove WARN_ON() from bd_link_disk_holder
Since commit 83cbce957446("block: add error handling for device_add_disk /
add_disk"), bdev->bd_holder_dir can not be empty now, so remove WARN_ON()
from bd_link_disk_holder.

Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220623074100.2251301-1-linan122@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-23 07:48:05 -06:00
Linus Torvalds
ddfe80311b Merge tag 'sound-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "All small changes, mostly device-specific:

   - A regression fix for PCM WC-page allocation on x86

   - A regression fix for i915 audio component binding

   - Fixes for (longstanding) beep handling bug

   - Runtime PM fixes for Intel LPE HDMI audio

   - A couple of pending FireWire fixes

   - Usual HD-audio and USB-audio quirks, new Intel dspconf entries"

* tag 'sound-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: Add quirk for Clevo NS50PU
  ALSA: hda: Fix discovery of i915 graphics PCI device
  ALSA: hda/via: Fix missing beep setup
  ALSA: hda/conexant: Fix missing beep setup
  ALSA: memalloc: Drop x86-specific hack for WC allocations
  ALSA: hda/realtek: Add quirk for Clevo PD70PNT
  ALSA: x86: intel_hdmi_audio: use pm_runtime_resume_and_get()
  ALSA: x86: intel_hdmi_audio: enable pm_runtime and set autosuspend delay
  ALSA: hda: intel-nhlt: remove use of __func__ in dev_dbg
  ALSA: hda: intel-dspcfg: use SOF for UpExtreme and UpExtreme11 boards
  firewire: convert sysfs sprintf/snprintf family to sysfs_emit
  firewire: cdev: fix potential leak of kernel stack due to uninitialized value
  ALSA: hda/realtek: Apply fixup for Lenovo Yoga Duet 7 properly
  ALSA: hda/realtek - ALC897 headset MIC no sound
  ALSA: usb-audio: US16x08: Move overflow check before array access
  ALSA: hda/realtek: Add mute LED quirk for HP Omen laptop
2022-06-23 08:44:00 -05:00
Demi Marie Obenour
dbe97cff7d xen/gntdev: Avoid blocking in unmap_grant_pages()
unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages.  I also
believe this is responsible for various deadlocks I have experienced in
the past.

Avoid these problems by making unmap_grant_pages async.  This requires
making it return void, as any errors will not be available when the
function returns.  Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is
detected.  Additionally, a failed call will not prevent further calls
from being made, but this is harmless.

Because unmap_grant_pages is now async, the grant handle will be sent to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle.  Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile.  Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.

It is also necessary to guard against reentrancy in gntdev_map_put(),
and to handle the case where userspace tries to map a mapping whose
contents have not all been freed yet.

Fixes: 745282256c ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable@vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-23 15:29:18 +02:00
Dexuan Cui
3be4562584 dma-direct: use the correct size for dma_set_encrypted()
The third parameter of dma_set_encrypted() is a size in bytes rather than
the number of pages.

Fixes: 4d0564785b ("dma-direct: factor out dma_set_{de,en}crypted helpers")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-23 15:26:59 +02:00
Christoph Hellwig
e648783318 nvme: move the Samsung X5 quirk entry to the core quirks
This device shares the PCI ID with the Samsung 970 Evo Plus that
does not need or want the quirks.  Move the the quirk entry to the
core table based on the model number instead.

Fixes: bc360b0b16 ("nvme-pci: add quirks for Samsung X5 SSDs")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
2022-06-23 15:22:22 +02:00
Joel Granados
23c9cd5600 nvme: fix the CRIMS and CRWMS definitions to match the spec
Adjust the values of NVME_CAP_CRMS_CRIMS and NVME_CAP_CRMS_CRWMS masks as
they are different from the ones in TP4084 - Time-to-ready.

Fixes: 354201c53e ("nvme: add support for TP4084 - Time-to-Ready Enhancements").
Signed-off-by: Joel Granados <j.granados@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-23 15:22:22 +02:00
Leo Savernik
41f38043f8 nvme: add a bogus subsystem NQN quirk for Micron MTFDKBA2T0TFH
The Micron MTFDKBA2T0TFH device reports the same subsysem NQN for
all devices.  Add a quick to ignore it.

Signed-off-by: Leo Savernik <l.savernik@aon.at>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-23 15:22:18 +02:00
Macpaul Lin
15b694e96c USB: serial: option: add Quectel RM500K module support
Add usb product id of the Quectel RM500K module.

RM500K provides 2 mandatory interfaces to Linux host after enumeration.
 - /dev/ttyUSB5: this is a serial interface for control path. User needs
   to write AT commands to this device node to query status, set APN,
   set PIN code, and enable/disable the data connection to 5G network.
 - ethX: this is the data path provided as a RNDIS devices. After the
   data connection has been established, Linux host can access 5G data
   network via this interface.

"RNDIS": RNDIS + ADB + AT (/dev/ttyUSB5) + MODEM COMs

usb-devices output for 0x7001:
T:  Bus=05 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  3 Spd=480 MxCh= 0
D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=7001 Rev=00.01
S:  Manufacturer=MediaTek Inc.
S:  Product=USB DATA CARD
S:  SerialNumber=869206050009672
C:  #Ifs=10 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=02 Prot=ff Driver=rndis_host
E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=125us
I:  If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=08(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=09(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=8a(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Co-developed-by: Ballon Shi <ballon.shi@quectel.com>
Signed-off-by: Ballon Shi <ballon.shi@quectel.com>
Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2022-06-23 13:58:05 +02:00
Rosemarie O'Riorden
12378a5a75 net: openvswitch: fix parsing of nw_proto for IPv6 fragments
When a packet enters the OVS datapath and does not match any existing
flows installed in the kernel flow cache, the packet will be sent to
userspace to be parsed, and a new flow will be created. The kernel and
OVS rely on each other to parse packet fields in the same way so that
packets will be handled properly.

As per the design document linked below, OVS expects all later IPv6
fragments to have nw_proto=44 in the flow key, so they can be correctly
matched on OpenFlow rules. OpenFlow controllers create pipelines based
on this design.

This behavior was changed by the commit in the Fixes tag so that
nw_proto equals the next_header field of the last extension header.
However, there is no counterpart for this change in OVS userspace,
meaning that this field is parsed differently between OVS and the
kernel. This is a problem because OVS creates actions based on what is
parsed in userspace, but the kernel-provided flow key is used as a match
criteria, as described in Documentation/networking/openvswitch.rst. This
leads to issues such as packets incorrectly matching on a flow and thus
the wrong list of actions being applied to the packet. Such changes in
packet parsing cannot be implemented without breaking the userspace.

The offending commit is partially reverted to restore the expected
behavior.

The change technically made sense and there is a good reason that it was
implemented, but it does not comply with the original design of OVS.
If in the future someone wants to implement such a change, then it must
be user-configurable and disabled by default to preserve backwards
compatibility with existing OVS versions.

Cc: stable@vger.kernel.org
Fixes: fa642f0883 ("openvswitch: Derive IP protocol number for IPv6 later frags")
Link: https://docs.openvswitch.org/en/latest/topics/design/#fragments
Signed-off-by: Rosemarie O'Riorden <roriorden@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/20220621204845.9721-1-roriorden@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-23 11:44:01 +02:00
Jakub Kicinski
e34a07c0ae sock: redo the psock vs ULP protection check
Commit 8a59f9d1e3 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
has moved the inet_csk_has_ulp(sk) check from sk_psock_init() to
the new tcp_bpf_update_proto() function. I'm guessing that this
was done to allow creating psocks for non-inet sockets.

Unfortunately the destruction path for psock includes the ULP
unwind, so we need to fail the sk_psock_init() itself.
Otherwise if ULP is already present we'll notice that later,
and call tcp_update_ulp() with the sk_proto of the ULP
itself, which will most likely result in the ULP looping
its callbacks.

Fixes: 8a59f9d1e3 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20220620191353.1184629-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-23 10:08:30 +02:00
Jakub Kicinski
1b205d948f Revert "net/tls: fix tls_sk_proto_close executed repeatedly"
This reverts commit 69135c572d.

This commit was just papering over the issue, ULP should not
get ->update() called with its own sk_prot. Each ULP would
need to add this check.

Fixes: 69135c572d ("net/tls: fix tls_sk_proto_close executed repeatedly")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20220620191353.1184629-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-23 10:08:30 +02:00
Stephan Gerhold
8af52fe9fd virtio_net: fix xdp_rxq_info bug after suspend/resume
The following sequence currently causes a driver bug warning
when using virtio_net:

  # ip link set eth0 up
  # echo mem > /sys/power/state (or e.g. # rtcwake -s 10 -m mem)
  <resume>
  # ip link set eth0 down

  Missing register, driver bug
  WARNING: CPU: 0 PID: 375 at net/core/xdp.c:138 xdp_rxq_info_unreg+0x58/0x60
  Call trace:
   xdp_rxq_info_unreg+0x58/0x60
   virtnet_close+0x58/0xac
   __dev_close_many+0xac/0x140
   __dev_change_flags+0xd8/0x210
   dev_change_flags+0x24/0x64
   do_setlink+0x230/0xdd0
   ...

This happens because virtnet_freeze() frees the receive_queue
completely (including struct xdp_rxq_info) but does not call
xdp_rxq_info_unreg(). Similarly, virtnet_restore() sets up the
receive_queue again but does not call xdp_rxq_info_reg().

Actually, parts of virtnet_freeze_down() and virtnet_restore_up()
are almost identical to virtnet_close() and virtnet_open(): only
the calls to xdp_rxq_info_(un)reg() are missing. This means that
we can fix this easily and avoid such problems in the future by
just calling virtnet_close()/open() from the freeze/restore handlers.

Aside from adding the missing xdp_rxq_info calls the only difference
is that the refill work is only cancelled if netif_running(). However,
this should not make any functional difference since the refill work
should only be active if the network interface is actually up.

Fixes: 754b8a21a9 ("virtio_net: setup xdp_rxq_info")
Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20220621114845.3650258-1-stephan.gerhold@kernkonzept.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22 19:09:13 -07:00
Jakub Kicinski
448ad88f80 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-06-21

This series contains updates to ice driver only.

Marcin fixes GTP filters by allowing ignoring of the inner ethertype field.

Wojciech adds VSI handle tracking in order to properly distinguish similar
filters for removal.

Anatolii removes ability to set 1000baseT and 1000baseX fields
concurrently which caused link issues. He also disallows setting
channels to less than the number of Traffic Classes which would cause
NULL pointer dereference.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: ethtool: Prohibit improper channel config for DCB
  ice: ethtool: advertise 1000M speeds properly
  ice: Fix switchdev rules book keeping
  ice: ignore protocol field in GTP offload
====================

Link: https://lore.kernel.org/r/20220621224756.631765-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22 18:59:29 -07:00
Kai-Heng Feng
4e0effd900 igb: Make DMA faster when CPU is active on the PCIe link
Intel I210 on some Intel Alder Lake platforms can only achieve ~750Mbps
Tx speed via iperf. The RR2DCDELAY shows around 0x2xxx DMA delay, which
will be significantly lower when 1) ASPM is disabled or 2) SoC package
c-state stays above PC3. When the RR2DCDELAY is around 0x1xxx the Tx
speed can reach to ~950Mbps.

According to the I210 datasheet "8.26.1 PCIe Misc. Register - PCIEMISC",
"DMA Idle Indication" doesn't seem to tie to DMA coalesce anymore, so
set it to 1b for "DMA is considered idle when there is no Rx or Tx AND
when there are no TLPs indicating that CPU is active detected on the
PCIe link (such as the host executes CSR or Configuration register read
or write operation)" and performing Tx should also fall under "active
CPU on PCIe link" case.

In addition to that, commit b6e0c419f0 ("igb: Move DMA Coalescing init
code to separate function.") seems to wrongly changed from enabling
E1000_PCIEMISC_LX_DECISION to disabling it, also fix that.

Fixes: b6e0c419f0 ("igb: Move DMA Coalescing init code to separate function.")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220621221056.604304-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22 18:46:24 -07:00
Christian Marangi
85467f7da1 net: dsa: qca8k: reduce mgmt ethernet timeout
The current mgmt ethernet timeout is set to 100ms. This value is too
big and would slow down any mdio command in case the mgmt ethernet
packet have some problems on the receiving part.
Reduce it to just 5ms to handle case when some operation are done on the
master port that would cause the mgmt ethernet to not work temporarily.

Fixes: 5950c7c0a6 ("net: dsa: qca8k: add support for mgmt read/write in Ethernet packet")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20220621151633.11741-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22 18:33:51 -07:00
Christian Marangi
386228c694 net: dsa: qca8k: reset cpu port on MTU change
It was discovered that the Documentation lacks of a fundamental detail
on how to correctly change the MAX_FRAME_SIZE of the switch.

In fact if the MAX_FRAME_SIZE is changed while the cpu port is on, the
switch panics and cease to send any packet. This cause the mgmt ethernet
system to not receive any packet (the slow fallback still works) and
makes the device not reachable. To recover from this a switch reset is
required.

To correctly handle this, turn off the cpu ports before changing the
MAX_FRAME_SIZE and turn on again after the value is applied.

Fixes: f58d2598cf ("net: dsa: qca8k: implement the port MTU callbacks")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20220621151122.10220-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22 18:32:58 -07:00
Shyam Prasad N
6e1c1c08cd cifs: periodically query network interfaces from server
Currently, we only query the server for network interfaces
information at the time of mount, and never afterwards.
This can be a problem, especially for services like Azure,
where the IP address of the channel endpoints can change
over time.

With this change, we schedule a 600s polling of this info
from the server for each tree connect.

An alternative for periodic polling was to do this only at
the time of reconnect. But this could delay the reconnect
time slightly. Also, there are some challenges w.r.t how
we have cifs_reconnect implemented today.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-22 19:51:43 -05:00
Shyam Prasad N
b54034a73b cifs: during reconnect, update interface if necessary
Going forward, the plan is to periodically query the server
for it's interfaces (when multichannel is enabled).

This change allows checking for inactive interfaces during
reconnect, and reconnect to a new interface if necessary.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-22 19:51:43 -05:00
Shyam Prasad N
aa45dadd34 cifs: change iface_list from array to sorted linked list
A server's published interface list can change over time, and needs
to be updated. We've storing iface_list as a simple array, which
makes it difficult to manipulate an existing list.

With this change, iface_list is modified into a linked list of
interfaces, which is kept sorted by speed.

Also added a reference counter for an iface entry, so that each
channel can maintain a backpointer to the iface and drop it
easily when needed.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-22 19:51:43 -05:00
Shyam Prasad N
9de74996a7 smb3: use netname when available on secondary channels
Some servers do not allow null netname contexts, which would cause
multichannel to revert to single channel when mounting to some
servers (e.g. Azure xSMB). The previous patch fixed that by avoiding
incorrectly sending the netname context when there would be a null
hostname sent in the netname context, while this patch fixes the null
hostname for the secondary channel by using the hostname of the
primary channel for the secondary channel.

Fixes: 4c14d7043f ("cifs: populate empty hostnames for extra channels")
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-22 19:46:53 -05:00
Vadim Fedorenko
13f28c2cf0 MAINTAINERS: Add a maintainer for OCP Time Card
I've been contributing and reviewing patches for ptp_ocp driver for
some time and I'm taking care of it's github mirror. On Jakub's
suggestion, I would like to step forward and become a maintainer for
this driver. This patch adds a dedicated entry to MAINTAINERS.

Signed-off-by: Vadim Fedorenko <vadfed@fb.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/r/20220621233131.21240-1-vfedorenko@novek.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22 17:22:11 -07:00
Joshua Ashton
e84131a88a amd/display/dc: Fix COLOR_ENCODING and COLOR_RANGE doing nothing for DCN20+
For DCN20 and above, the code that actually hooks up the provided
input_color_space got lost at some point.

Fixes COLOR_ENCODING and COLOR_RANGE doing nothing on DCN20+.
Tested using Steam Remote Play Together + gamescope.

Update other DCNs the same wasy DCN1.x was updates in
commit a1e07ba89d ("drm/amd/display: Use plane->color_space for dpp if specified")

Fixes: a1e07ba89d ("drm/amd/display: Use plane->color_space for dpp if specified")
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-06-22 17:17:16 -04:00
George Shen
98b02e9f00 drm/amd/display: Fix typo in override_lane_settings
[Why]
The function currently skips overriding the drive
settings of the first lane.

[How]
Change for loop to start at 0 instead of 1.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-06-22 17:16:23 -04:00
Qingqing Zhuo
235870f659 drm/amd/display: Fix DC warning at driver load
[Why]
Wrong index was checked for dcfclk_mhz, causing false warning.

[How]
Fix the assertion index.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x
2022-06-22 17:15:41 -04:00
Mario Limonciello
937e24b7f5 drm/amd: Revert "drm/amd/display: keep eDP Vdd on when eDP stream is already enabled"
A variety of Lenovo machines with Rembrandt APUs and OLED panels have
stopped showing the display at login.  This behavior clears up after
leaving it idle and moving the mouse or touching keyboard.

It was bisected to be caused by commit 559e265522 ("drm/amd/display:
keep eDP Vdd on when eDP stream is already enabled").  Revert this commit
to fix the issue.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2047
Reported-by: Aaron Ma <aaron.ma@canonical.com>
Fixes: 559e265522 ("drm/amd/display: keep eDP Vdd on when eDP stream is already enabled")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mark Pearson <markpearson@lenovo.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-22 17:11:28 -04:00
Alex Deucher
f15345a377 drm/amdgpu: Adjust logic around GTT size (v3)
Certain GL unit tests for large textures can cause problems
with the OOM killer since there is no way to link this memory
to a process.  This was originally mitigated (but not necessarily
eliminated) by limiting the GTT size.  The problem is this limit
is often too low for many modern games so just make the limit 1/2
of system memory. The OOM accounting needs to be addressed, but
we shouldn't prevent common 3D applications from being usable
just to potentially mitigate that corner case.

Set default GTT size to max(3G, 1/2 of system ram) by default.

v2: drop previous logic and default to 3/4 of ram
v3: default to half of ram to align with ttm
v4: fix spelling in comment (Kent)

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1942
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-22 17:11:16 -04:00
Linus Torvalds
de5c208d53 Merge tag 'linux-kselftest-fixes-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fixes from Shuah Khan:
 "Compile time fixes and run-time resources leaks:

   - Fix clang cross compilation

   - Fix resource leak when return error

   - fix compile error for dma_map_benchmark

   - Fix regression - make use of GUP_TEST_FILE macro"

* tag 'linux-kselftest-fixes-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: make use of GUP_TEST_FILE macro
  selftests: vm: Fix resource leak when return error
  selftests dma: fix compile error for dma_map_benchmark
  selftests: Fix clang cross compilation
2022-06-22 14:08:06 -05:00
Kees Cook
1e70212e03 hinic: Replace memcpy() with direct assignment
Under CONFIG_FORTIFY_SOURCE=y and CONFIG_UBSAN_BOUNDS=y, Clang is bugged
here for calculating the size of the destination buffer (0x10 instead of
0x14). This copy is a fixed size (sizeof(struct fw_section_info_st)), with
the source and dest being struct fw_section_info_st, so the memcpy should
be safe, assuming the index is within bounds, which is UBSAN_BOUNDS's
responsibility to figure out.

Avoid the whole thing and just do a direct assignment. This results in
no change to the executable code.

[This is a duplicate of commit 2c0ab32b73 ("hinic: Replace memcpy()
 with direct assignment") which was applied to net-next.]

Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://github.com/ClangBuiltLinux/linux/issues/1592
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build
Link: https://lore.kernel.org/r/20220616052312.292861-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22 11:04:32 -07:00
Tim Crawford
627ce0d68e ALSA: hda/realtek: Add quirk for Clevo NS50PU
Fixes headset detection on Clevo NS50PU.

Signed-off-by: Tim Crawford <tcrawford@system76.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220622150017.9897-1-tcrawford@system76.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-22 17:19:57 +02:00
Jiang Jian
cb5177336e video: fbdev: omap: Remove duplicate 'the' in comment
Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-22 16:55:51 +02:00
Jiang Jian
bdc48fd571 video: fbdev: omapfb: Align '*' in comment
Consider '*' alignment in comments

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-22 16:53:55 +02:00
Saud Farooqui
85016f66af drm/sun4i: Return if frontend is not present
Added return statement in sun4i_layer_format_mod_supported()
in case frontend is not present.

Signed-off-by: Saud Farooqui <farooqui_saud@hotmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/PA4P189MB1421E93EF5F8E8E00E71B7878BB29@PA4P189MB1421.EURP189.PROD.OUTLOOK.COM
2022-06-22 16:42:25 +02:00
Dan Carpenter
3026b5ca06 drm/vc4: fix error code in vc4_check_tex_size()
The vc4_check_tex_size() function is supposed to return false on error
but this error path accidentally returns -ENODEV (which means true).

Fixes: 30f8c74ca9 ("drm/vc4: Warn if some v3d code is run on BCM2711")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/YrMKK89/viQiaiAg@kili
2022-06-22 16:41:30 +02:00
Yoshihiro Shimoda
9f7d09fe23 iommu/ipmmu-vmsa: Fix compatible for rcar-gen4
Fix compatible string for R-Car Gen4.

Fixes: ae684caf46 ("iommu/ipmmu-vmsa: Add support for R-Car Gen4")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220617010107.3229784-1-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-06-22 15:45:56 +02:00
Linus Torvalds
3abc3ae553 Merge tag '9p-for-5.19-rc4' of https://github.com/martinetd/linux
Pull 9pfs fixes from Dominique Martinet:
 "A couple of fid refcount and fscache fixes:

   - fid refcounting was incorrect in some corner cases and would leak
     resources, only freed at umount time. The first three commits fix
     three such cases

   - 'cache=loose' or fscache was broken when trying to write a partial
     page to a file with no read permission since the rework a few
     releases ago.

     The fix taken here is just to restore old behavior of using the
     special 'writeback_fid' for such reads, which is open as root/RDWR
     and such not get complains that we try to read on a WRONLY fid.

     Long-term it'd be nice to get rid of this and not issue the read at
     all (skip cache?) in such cases, but that direction hasn't
     progressed"

* tag '9p-for-5.19-rc4' of https://github.com/martinetd/linux:
  9p: fix EBADF errors in cached mode
  9p: Fix refcounting during full path walks for fid lookups
  9p: fix fid refcount leak in v9fs_vfs_get_link
  9p: fix fid refcount leak in v9fs_vfs_atomic_open_dotl
2022-06-22 08:09:49 -05:00
Jakub Kicinski
877fe9d49b Revert "drivers/net/ethernet/neterion/vxge: Fix a use-after-free bug in vxge-main.c"
This reverts commit 8fc74d1863.

BAR0 is the main (only?) register bank for this device. We most
obviously can't unmap it before the netdev is unregistered.
This was pointed out in review but the patch got reposted and
merged, anyway.

The author of the patch was only testing it with a QEMU model,
which I presume does not emulate enough for the netdev to be brought
up (author's replies are not visible in lore because they kept sending
their emails in HTML).

Link: https://lore.kernel.org/all/20220616085059.680dc215@kernel.org/
Fixes: 8fc74d1863 ("drivers/net/ethernet/neterion/vxge: Fix a use-after-free bug in vxge-main.c")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22 13:15:49 +01:00
Aidan MacDonald
3f05010f24 regmap-irq: Fix offset/index mismatch in read_sub_irq_data()
We need to divide the sub-irq status register offset by register
stride to get an index for the status buffer to avoid an out of
bounds write when the register stride is greater than 1.

Fixes: a2d21848d9 ("regmap: regmap-irq: Add main status register support")
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220620200644.1961936-3-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-22 11:59:52 +01:00
Aidan MacDonald
485037ae9a regmap-irq: Fix a bug in regmap_irq_enable() for type_in_mask chips
When enabling a type_in_mask irq, the type_buf contents must be
AND'd with the mask of the IRQ we're enabling to avoid enabling
other IRQs by accident, which can happen if several type_in_mask
irqs share a mask register.

Fixes: bc998a7303 ("regmap: irq: handle HW using separate rising/falling edge interrupts")
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220620200644.1961936-2-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-22 11:59:51 +01:00
Jason A. Donenfeld
f3eac42665 powerpc/powernv: wire up rng during setup_arch
The platform's RNG must be available before random_init() in order to be
useful for initial seeding, which in turn means that it needs to be
called from setup_arch(), rather than from an init call.

Complicating things, however, is that POWER8 systems need some per-cpu
state and kmalloc, which isn't available at this stage. So we split
things up into an early phase and a later opportunistic phase. This
commit also removes some noisy log messages that don't add much.

Fixes: a4da0d50b2 ("powerpc: Implement arch_get_random_long/int() for powernv")
Cc: stable@vger.kernel.org # v3.13+
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Add of_node_put(), use pnv naming, minor change log editing]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220621140849.127227-1-Jason@zx2c4.com
2022-06-22 19:47:22 +10:00
Jernej Skrabec
f5aa16807a drm/sun4i: Add DMA mask and segment size
Kernel occasionally complains that there is mismatch in segment size
when trying to render HW decoded videos and rendering them directly with
sun4i DRM driver. Following message can be observed on H6 SoC:

[  184.298308] ------------[ cut here ]------------
[  184.298326] DMA-API: sun4i-drm display-engine: mapping sg segment longer than device claims to support [len=6144000] [max=65536]
[  184.298364] WARNING: CPU: 1 PID: 382 at kernel/dma/debug.c:1162 debug_dma_map_sg+0x2b0/0x350
[  184.322997] CPU: 1 PID: 382 Comm: ffmpeg Not tainted 5.19.0-rc1+ #1331
[  184.329533] Hardware name: Tanix TX6 (DT)
[  184.333544] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  184.340512] pc : debug_dma_map_sg+0x2b0/0x350
[  184.344882] lr : debug_dma_map_sg+0x2b0/0x350
[  184.349250] sp : ffff800009f33a50
[  184.352567] x29: ffff800009f33a50 x28: 0000000000010000 x27: ffff000001b86c00
[  184.359725] x26: ffffffffffffffff x25: ffff000005d8cc80 x24: 0000000000000000
[  184.366879] x23: ffff80000939ab18 x22: 0000000000000001 x21: 0000000000000001
[  184.374031] x20: 0000000000000000 x19: ffff0000018a7410 x18: ffffffffffffffff
[  184.381186] x17: 0000000000000000 x16: 0000000000000000 x15: ffffffffffffffff
[  184.388338] x14: 0000000000000001 x13: ffff800009534e86 x12: 6f70707573206f74
[  184.395493] x11: 20736d69616c6320 x10: 000000000000000a x9 : 0000000000010000
[  184.402647] x8 : ffff8000093b6d40 x7 : ffff800009f33850 x6 : 000000000000000c
[  184.409800] x5 : ffff0000bf997940 x4 : 0000000000000000 x3 : 0000000000000027
[  184.416953] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000003960e80
[  184.424106] Call trace:
[  184.426556]  debug_dma_map_sg+0x2b0/0x350
[  184.430580]  __dma_map_sg_attrs+0xa0/0x110
[  184.434687]  dma_map_sgtable+0x28/0x4c
[  184.438447]  vb2_dc_dmabuf_ops_map+0x60/0xcc
[  184.442729]  __map_dma_buf+0x2c/0xd4
[  184.446321]  dma_buf_map_attachment+0xa0/0x130
[  184.450777]  drm_gem_prime_import_dev+0x7c/0x18c
[  184.455410]  drm_gem_prime_fd_to_handle+0x1b8/0x214
[  184.460300]  drm_prime_fd_to_handle_ioctl+0x2c/0x40
[  184.465190]  drm_ioctl_kernel+0xc4/0x174
[  184.469123]  drm_ioctl+0x204/0x420
[  184.472534]  __arm64_sys_ioctl+0xac/0xf0
[  184.476474]  invoke_syscall+0x48/0x114
[  184.480240]  el0_svc_common.constprop.0+0x44/0xec
[  184.484956]  do_el0_svc+0x2c/0xc0
[  184.488283]  el0_svc+0x2c/0x84
[  184.491354]  el0t_64_sync_handler+0x11c/0x150
[  184.495723]  el0t_64_sync+0x18c/0x190
[  184.499397] ---[ end trace 0000000000000000 ]---

Fix that by setting DMA mask and segment size.

Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Reviewed-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220620181333.650301-1-jernej.skrabec@gmail.com
2022-06-22 09:31:22 +02:00
Saud Farooqui
5f940e528d drm/vc4: hdmi: Fixed possible integer overflow
Multiplying ints and saving it in unsigned long long
could lead to integer overflow before being type casted to
unsigned long long.

Addresses-Coverity:  1505113: Unintentional integer overflow.

Signed-off-by: Saud Farooqui <farooqui_saud@hotmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/PA4P189MB1421E63C0FF3EBF234A80AB38BA79@PA4P189MB1421.EURP189.PROD.OUTLOOK.COM
2022-06-22 09:30:17 +02:00
Yonglin Tan
33b29dbb39 USB: serial: option: add Quectel EM05-G modem
The EM05-G modem has 2 USB configurations that are configurable via the AT
command AT+QCFG="usbnet",[ 0 | 2 ] which make the modem enumerate with
the following interfaces, respectively:

"RMNET"	: AT + DIAG + NMEA + Modem + QMI
"MBIM"	: MBIM + AT + DIAG + NMEA + Modem

The detailed description of the USB configuration for each mode as follows:

RMNET Mode
--------------
T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 21 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=030a Rev= 3.18
S:  Manufacturer=Quectel
S:  Product=Quectel EM05-G
C:* #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 6 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E:  Ad=89(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

MBIM Mode
--------------
T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 16 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=030a Rev= 3.18
S:  Manufacturer=Quectel
S:  Product=Quectel EM05-G
C:* #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
A:  FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00
I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim
E:  Ad=89(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
I:  If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Signed-off-by: Yonglin Tan <yonglin.tan@outlook.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2022-06-22 08:54:39 +02:00
Johan Hovold
ae60aac59a USB: serial: pl2303: add support for more HXN (G) types
Add support for further HXN (G) type devices (GT variant, GL variant, GS
variant and GR) and document the bcdDevice mapping.

Note that the TA and TB types use the same bcdDevice as some GT and GE
variants, respectively, but that the HX status request can be used to
determine which is which.

Also note that we currently do not distinguish between the various HXN
(G) types in the driver but that this may change eventually (e.g. when
adding GPIO support).

Reported-by: Charles Yeh <charlesyeh522@gmail.com>
Link: https://lore.kernel.org/r/YrF77b9DdeumUAee@hovoldconsulting.com
Cc: stable@vger.kernel.org	# 5.13
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2022-06-22 08:52:03 +02:00
Jakub Kicinski
53664d51d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) Use get_random_u32() instead of prandom_u32_state() in nft_meta
   and nft_numgen, from Florian Westphal.

2) Incorrect list head in nfnetlink_cttimeout in recent update coming
   from previous development cycle. Also from Florian.

3) Incorrect path to pktgen scripts for nft_concat_range.sh selftest.
   From Jie2x Zhou.

4) Two fixes for the for nft_fwd and nft_dup egress support, from Florian.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_dup_netdev: add and use recursion counter
  netfilter: nf_dup_netdev: do not push mac header a second time
  selftests: netfilter: correct PKTGEN_SCRIPT_PATHS in nft_concat_range.sh
  netfilter: cttimeout: fix slab-out-of-bounds read typo in cttimeout_net_exit
  netfilter: use get_random_u32 instead of prandom
====================

Link: https://lore.kernel.org/r/20220621085618.3975-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-21 22:41:41 -07:00
Lukas Wunner
2642cc6c3b net: phy: smsc: Disable Energy Detect Power-Down in interrupt mode
Simon reports that if two LAN9514 USB adapters are directly connected
without an intermediate switch, the link fails to come up and link LEDs
remain dark.  The issue was introduced by commit 1ce8b37241 ("usbnet:
smsc95xx: Forward PHY interrupts to PHY driver to avoid polling").

The PHY suffers from a known erratum wherein link detection becomes
unreliable if Energy Detect Power-Down is used.  In poll mode, the
driver works around the erratum by briefly disabling EDPD for 640 msec
to detect a neighbor, then re-enabling it to save power.

In interrupt mode, no interrupt is signaled if EDPD is used by both link
partners, so it must not be enabled at all.

We'll recoup the power savings by enabling SUSPEND1 mode on affected
LAN95xx chips in a forthcoming commit.

Fixes: 1ce8b37241 ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling")
Reported-by: Simon Han <z.han@kunbus.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/439a3f3168c2f9d44b5fd9bb8d2b551711316be6.1655714438.git.lukas@wunner.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-21 21:59:47 -07:00
Pavel Begunkov
c0737fa9a5 io_uring: fix double poll leak on repolling
We have re-polling for partial IO, so a request can be polled twice. If
it used two poll entries the first time then on the second
io_arm_poll_handler() it will find the old apoll entry and NULL
kmalloc()'ed second entry, i.e. apoll->double_poll, so leaking it.

Fixes: 10c873334f ("io_uring: allow re-poll if we made progress")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/fee2452494222ecc7f1f88c8fb659baef971414a.1655852245.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-21 17:24:37 -06:00
Pavel Begunkov
9d2ad2947a io_uring: fix wrong arm_poll error handling
Leaving ip.error set when a request was punted to task_work execution is
problematic, don't forget to clear it.

Fixes: aa43477b04 ("io_uring: poll rework")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a6c84ef4182c6962380aebe11b35bdcb25b0ccfb.1655852245.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-21 17:24:37 -06:00
Pavel Begunkov
c487a5ad48 io_uring: fail links when poll fails
Don't forget to cancel all linked requests of poll request when
__io_arm_poll_handler() failed.

Fixes: aa43477b04 ("io_uring: poll rework")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a78aad962460f9fdfe4aa4c0b62425c88f9415bc.1655852245.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-21 17:24:37 -06:00
Anatolii Gerasymenko
a632b2a4c9 ice: ethtool: Prohibit improper channel config for DCB
Do not allow setting less channels, than Traffic Classes there are
via ethtool. There must be at least one channel per Traffic Class.

If you set less channels, than Traffic Classes there are, then during
ice_vsi_rebuild there would be allocated only the requested amount
of tx/rx rings in ice_vsi_alloc_arrays. But later in ice_vsi_setup_q_map
there would be requested at least one channel per Traffic Class. This
results in setting num_rxq > alloc_rxq and num_txq > alloc_txq.
Later, there would be a NULL pointer dereference in
ice_vsi_map_rings_to_vectors, because we go beyond of rx_rings or
tx_rings arrays.

Change ice_set_channels() to return error if you try to allocate less
channels, than Traffic Classes there are.
Change ice_vsi_setup_q_map() and ice_vsi_setup_q_map_mqprio() to return
status code instead of void.
Add error handling for ice_vsi_setup_q_map() and
ice_vsi_setup_q_map_mqprio() in ice_vsi_init() and ice_vsi_cfg_tc().

[53753.889983] INFO: Flow control is disabled for this traffic class (0) on this vsi.
[53763.984862] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[53763.992915] PGD 14b45f5067 P4D 0
[53763.996444] Oops: 0002 [#1] SMP NOPTI
[53764.000312] CPU: 12 PID: 30661 Comm: ethtool Kdump: loaded Tainted: GOE    --------- -  - 4.18.0-240.el8.x86_64 #1
[53764.011825] Hardware name: Intel Corporation WilsonCity/WilsonCity, BIOS WLYDCRB1.SYS.0020.P21.2012150710 12/15/2020
[53764.022584] RIP: 0010:ice_vsi_map_rings_to_vectors+0x7e/0x120 [ice]
[53764.029089] Code: 41 0d 0f b7 b7 12 05 00 00 0f b6 d0 44 29 de 44 0f b7 c6 44 01 c2 41 39 d0 7d 2d 4c 8b 47 28 44 0f b7 ce 83 c6 01 4f 8b 04 c8 <49> 89 48 28 4                           c 8b 89 b8 01 00 00 4d 89 08 4c 89 81 b8 01 00 00 44
[53764.048379] RSP: 0018:ff550dd88ea47b20 EFLAGS: 00010206
[53764.053884] RAX: 0000000000000002 RBX: 0000000000000004 RCX: ff385ea42fa4a018
[53764.061301] RDX: 0000000000000006 RSI: 0000000000000005 RDI: ff385e9baeedd018
[53764.068717] RBP: 0000000000000010 R08: 0000000000000000 R09: 0000000000000004
[53764.076133] R10: 0000000000000002 R11: 0000000000000004 R12: 0000000000000000
[53764.083553] R13: 0000000000000000 R14: ff385e658fdd9000 R15: ff385e9baeedd018
[53764.090976] FS:  000014872c5b5740(0000) GS:ff385e847f100000(0000) knlGS:0000000000000000
[53764.099362] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[53764.105409] CR2: 0000000000000028 CR3: 0000000a820fa002 CR4: 0000000000761ee0
[53764.112851] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[53764.120301] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[53764.127747] PKRU: 55555554
[53764.130781] Call Trace:
[53764.133564]  ice_vsi_rebuild+0x611/0x870 [ice]
[53764.138341]  ice_vsi_recfg_qs+0x94/0x100 [ice]
[53764.143116]  ice_set_channels+0x1a8/0x3e0 [ice]
[53764.147975]  ethtool_set_channels+0x14e/0x240
[53764.152667]  dev_ethtool+0xd74/0x2a10
[53764.156665]  ? __mod_lruvec_state+0x44/0x110
[53764.161280]  ? __mod_lruvec_state+0x44/0x110
[53764.165893]  ? page_add_file_rmap+0x15/0x170
[53764.170518]  ? inet_ioctl+0xd1/0x220
[53764.174445]  ? netdev_run_todo+0x5e/0x290
[53764.178808]  dev_ioctl+0xb5/0x550
[53764.182485]  sock_do_ioctl+0xa0/0x140
[53764.186512]  sock_ioctl+0x1a8/0x300
[53764.190367]  ? selinux_file_ioctl+0x161/0x200
[53764.195090]  do_vfs_ioctl+0xa4/0x640
[53764.199035]  ksys_ioctl+0x60/0x90
[53764.202722]  __x64_sys_ioctl+0x16/0x20
[53764.206845]  do_syscall_64+0x5b/0x1a0
[53764.210887]  entry_SYSCALL_64_after_hwframe+0x65/0xca

Fixes: 87324e747f ("ice: Implement ethtool ops for channels")
Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-21 15:20:24 -07:00
Anatolii Gerasymenko
c3d184c83f ice: ethtool: advertise 1000M speeds properly
In current implementation ice_update_phy_type enables all link modes
for selected speed. This approach doesn't work for 1000M speeds,
because both copper (1000baseT) and optical (1000baseX) standards
cannot be enabled at once.

Fix this, by adding the function `ice_set_phy_type_from_speed()`
for 1000M speeds.

Fixes: 48cb27f2fd ("ice: Implement handlers for ethtool PHY/link operations")
Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-21 13:48:57 -07:00
Liang He
3748d2185a mips: lantiq: Add missing of_node_put() in irq.c
In icu_of_init(), of_find_compatible_node() will return a node
pointer with refcount incremented. We should use of_node_put()
when it is not used anymore.

Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21 22:34:03 +02:00
Wojciech Drewek
3578dc9001 ice: Fix switchdev rules book keeping
Adding two filters with same matching criteria ends up with
one rule in hardware with act = ICE_FWD_TO_VSI_LIST.
In order to remove them properly we have to keep the
information about vsi handle which is used in VSI bitmap
(ice_adv_fltr_mgmt_list_entry::vsi_list_info::vsi_map).

Fixes: 0d08a441fb ("ice: ndo_setup_tc implementation for PF")
Reported-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-21 13:09:03 -07:00
Dmitry Osipenko
2027732600 PM: hibernate: Use kernel_can_power_off()
Use new kernel_can_power_off() API instead of legacy pm_power_off global
variable to fix regressed hibernation to disk where machine no longer
powers off when it should because ACPI power driver transitioned to the
new sys-off based API and it doesn't use pm_power_off anymore.

Fixes: 98f30d0ecf ("ACPI: power: Switch to sys-off handler API")
Tested-by: Ken Moffat <zarniwhoop@ntlworld.com>
Reported-by: Ken Moffat <zarniwhhop@ntlworld.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-21 20:57:30 +02:00
Marcin Szycik
d4ea6f6373 ice: ignore protocol field in GTP offload
Commit 34a897758e ("ice: Add support for inner etype in switchdev")
added the ability to match on inner ethertype. A side effect of that change
is that it is now impossible to add some filters for protocols which do not
contain inner ethtype field. tc requires the protocol field to be specified
when providing certain other options, e.g. src_ip. This is a problem in
case of GTP - when user wants to specify e.g. src_ip, they also need to
specify protocol in tc command (otherwise tc fails with: Illegal "src_ip").
Because GTP is a tunnel, the protocol field is treated as inner protocol.
GTP does not contain inner ethtype field and the filter cannot be added.

To fix this, ignore the ethertype field in case of GTP filters.

Fixes: 9a225f81f5 ("ice: Support GTP-U and GTP-C offload in switchdev")
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-21 11:14:36 -07:00
Mike Snitzer
78ccef9123 dm: do not return early from dm_io_complete if BLK_STS_AGAIN without polling
Commit 5291984004 ("dm: fix bio polling to handle possibile
BLK_STS_AGAIN") inadvertently introduced an early return from
dm_io_complete() without first queueing the bio to DM if BLK_STS_AGAIN
occurs and bio-polling is _not_ being used.

Fix this by only returning early from dm_io_complete() if the bio has
first been properly queued to DM. Otherwise, the bio will never finish
via bio_endio.

Fixes: 5291984004 ("dm: fix bio polling to handle possibile BLK_STS_AGAIN")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-21 13:48:52 -04:00
Nikos Tsironis
9ae6e8b1c9 dm era: commit metadata in postsuspend after worker stops
During postsuspend dm-era does the following:

1. Archives the current era
2. Commits the metadata, as part of the RPC call for archiving the
   current era
3. Stops the worker

Until the worker stops, it might write to the metadata again. Moreover,
these writes are not flushed to disk immediately, but are cached by the
dm-bufio client, which writes them back asynchronously.

As a result, the committed metadata of a suspended dm-era device might
not be consistent with the in-core metadata.

In some cases, this can result in the corruption of the on-disk
metadata. Suppose the following sequence of events:

1. Load a new table, e.g. a snapshot-origin table, to a device with a
   dm-era table
2. Suspend the device
3. dm-era commits its metadata, but the worker does a few more metadata
   writes until it stops, as part of digesting an archived writeset
4. These writes are cached by the dm-bufio client
5. Load the dm-era table to another device.
6. The new instance of the dm-era target loads the committed, on-disk
   metadata, which don't include the extra writes done by the worker
   after the metadata commit.
7. Resume the new device
8. The new dm-era target instance starts using the metadata
9. Resume the original device
10. The destructor of the old dm-era target instance is called and
    destroys the dm-bufio client, which results in flushing the cached
    writes to disk
11. These writes might overwrite the writes done by the new dm-era
    instance, hence corrupting its metadata.

Fix this by committing the metadata after the worker stops running.

stop_worker uses flush_workqueue to flush the current work. However, the
work item may re-queue itself and flush_workqueue doesn't wait for
re-queued works to finish.

This could result in the worker changing the metadata after they have
been committed, or writing to the metadata concurrently with the commit
in the postsuspend thread.

Use drain_workqueue instead, which waits until the work and all
re-queued works finish.

Fixes: eec40579d8 ("dm: add era target")
Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-21 13:35:01 -04:00
Linus Torvalds
ca1fdab7fd Merge tag 'efi-urgent-for-v5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:

 - remove pointless include of asm/efi.h, which does not exist on ia64

 - fix DXE service marshalling prototype for mixed mode

* tag 'efi-urgent-for-v5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi/x86: libstub: Fix typo in __efi64_argmap* name
  efi: sysfb_efi: remove unnecessary <asm/efi.h> include
2022-06-21 12:20:11 -05:00
Linus Torvalds
0273fd423b Merge tag 'certs-20220621' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull signature checking selftest from David Howells:
 "The signature checking code, as used by module signing, kexec, etc.,
  is non-FIPS compliant as there is no selftest.

  For a kernel to be FIPS-compliant, signature checking would have to be
  tested before being used, and the box would need to panic if it's not
  available (probably reasonable as simply disabling signature checking
  would prevent you from loading any driver modules).

  Deal with this by adding a minimal test.

  This is split into two patches: the first moves load_certificate_list()
  to the same place as the X.509 code to make it more accessible
  internally; the second adds a selftest"

* tag 'certs-20220621' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  certs: Add FIPS selftests
  certs: Move load_certificate_list() to be with the asymmetric keys code
2022-06-21 12:13:53 -05:00
Linus Torvalds
ff872b76b3 Merge tag 'for-5.19-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - print more error messages for invalid mount option values

 - prevent remount with v1 space cache for subpage filesystem

 - fix hang during unmount when block group reclaim task is running

* tag 'for-5.19-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: add error messages to all unrecognized mount options
  btrfs: prevent remounting to v1 space cache for subpage mount
  btrfs: fix hang during unmount when block group reclaim task is running
2022-06-21 12:06:04 -05:00
Jens Axboe
2645672ffe block: pop cached rq before potentially blocking rq_qos_throttle()
If rq_qos_throttle() ends up blocking, then we will have invalidated and
flushed our current plug. Since blk_mq_get_cached_request() hasn't
popped the cached request off the plug list just yet, we end holding a
pointer to a request that is no longer valid. This insta-crashes with
rq->mq_hctx being NULL in the validity checks just after.

Pop the request off the cached list before doing rq_qos_throttle() to
avoid using a potentially stale request.

Fixes: 0a5aa8d161 ("block: fix blk_mq_attempt_bio_merge and rq_qos_throttle protection")
Reported-by: Dylan Yudaken <dylany@fb.com>
Tested-by: Dylan Yudaken <dylany@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-21 10:59:58 -06:00
David Howells
cb78d1b5ef afs: Fix dynamic root getattr
The recent patch to make afs_getattr consult the server didn't account
for the pseudo-inodes employed by the dynamic root-type afs superblock
not having a volume or a server to access, and thus an oops occurs if
such a directory is stat'd.

Fix this by checking to see if the vnode->volume pointer actually points
anywhere before following it in afs_getattr().

This can be tested by stat'ing a directory in /afs.  It may be
sufficient just to do "ls /afs" and the oops looks something like:

        BUG: kernel NULL pointer dereference, address: 0000000000000020
        ...
        RIP: 0010:afs_getattr+0x8b/0x14b
        ...
        Call Trace:
         <TASK>
         vfs_statx+0x79/0xf5
         vfs_fstatat+0x49/0x62

Fixes: 2aeb8c86d4 ("afs: Fix afs_getattr() to refetch file status if callback break occurred")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/165408450783.1031787.7941404776393751186.stgit@warthog.procyon.org.uk/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-21 11:47:30 -05:00
Evgeniy Baskov
aa6d1ed107 efi/x86: libstub: Fix typo in __efi64_argmap* name
The actual name of the DXE services function used
is set_memory_space_attributes(), not set_memory_space_descriptor().

Change EFI mixed mode helper macro name to match the function name.

Fixes: 31f1a0edff ("efi/x86: libstub: Make DXE calls mixed mode safe")
Signed-off-by: Evgeniy Baskov <baskov@ispras.ru>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-06-21 18:11:46 +02:00
Javier Martinez Canillas
34705a57e7 efi: sysfb_efi: remove unnecessary <asm/efi.h> include
Nothing defined in the header is used by drivers/firmware/efi/sysfb_efi.c
but also, including it can lead to build errors when built on arches that
don't have an asm/efi.h header file.

This can happen for example if a driver that is built when COMPILE_TEST is
enabled selects the SYSFB symbol, e.g. on powerpc with allyesconfig:

drivers/firmware/efi/sysfb_efi.c:29:10: fatal error: asm/efi.h: No such file or directory
   29 | #include <asm/efi.h>
      |          ^~~~~~~~~~~

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-06-21 18:11:43 +02:00
Jaegeuk Kim
82c7863ed9 f2fs: do not count ENOENT for error case
Otherwise, we can get a wrong cp_error mark.

Cc: <stable@vger.kernel.org>
Fixes: a7b8618aa2 ("f2fs: avoid infinite loop to flush node pages")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-06-21 08:29:56 -07:00
Aidan MacDonald
db30dc1a52 mips: dts: ingenic: Add TCU clock to x1000/x1830 tcu device node
This clock is a gate for the TCU hardware block on these SoCs, but
it wasn't included in the device tree since the ingenic-tcu driver
erroneously did not request it.

Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21 17:18:39 +02:00
David Howells
3cde3174eb certs: Add FIPS selftests
Add some selftests for signature checking when FIPS mode is enabled.  These
need to be done before we start actually using the signature checking for
things and must panic the kernel upon failure.

Note that the tests must not check the blacklist lest this provide a way to
prevent a kernel from booting by installing a hash of a test key in the
appropriate UEFI table.

Reported-by: Simo Sorce <simo@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Simo Sorce <simo@redhat.com>
Reviewed-by: Herbert Xu <herbert@gondor.apana.org.au>
cc: keyrings@vger.kernel.org
cc: linux-crypto@vger.kernel.org
Link: https://lore.kernel.org/r/165515742832.1554877.2073456606206090838.stgit@warthog.procyon.org.uk/
2022-06-21 16:05:12 +01:00
David Howells
60050ffe3d certs: Move load_certificate_list() to be with the asymmetric keys code
Move load_certificate_list(), which loads a series of binary X.509
certificates from a blob and inserts them as keys into a keyring, to be
with the asymmetric keys code that it drives.

This makes it easier to add FIPS selftest code in which we need to load up
a private keyring for the tests to use.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Simo Sorce <simo@redhat.com>
Reviewed-by: Herbert Xu <herbert@gondor.apana.org.au>
cc: keyrings@vger.kernel.org
cc: linux-crypto@vger.kernel.org
Link: https://lore.kernel.org/r/165515742145.1554877.13488098107542537203.stgit@warthog.procyon.org.uk/
2022-06-21 16:05:06 +01:00
Liang He
eb9e9bc4fa mips/pic32/pic32mzda: Fix refcount leak bugs
of_find_matching_node(), of_find_compatible_node() and
of_find_node_by_path() will return node pointers with refcout
incremented. We should call of_node_put() when they are not
used anymore.

Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21 17:04:54 +02:00
Liang He
7669559271 mips: lantiq: xway: Fix refcount leak bug in sysctrl
In ltq_soc_init(), of_find_compatible_node() will return a node
pointer with refcount incremented. We should use of_node_put() when
it is not used anymore.

Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21 17:04:30 +02:00
Liang He
72a2af539f mips: lantiq: falcon: Fix refcount leak bug in sysctrl
In ltq_soc_init(), of_find_compatible_node() will return a node pointer
with refcount incremented. We should use of_node_put() when it is not
used anymore.

Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21 17:04:30 +02:00
Liang He
48ca54e391 mips: ralink: Fix refcount leak in of.c
In plat_of_remap_node(), plat_of_remap_node() will return a node
pointer with refcount incremented. We should use of_node_put()
when it is not used anymore.

Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21 17:04:30 +02:00
Liang He
608d94cb84 mips: mti-malta: Fix refcount leak in malta-time.c
In update_gic_frequency_dt(), of_find_compatible_node() will return
a node pointer with refcount incremented. We should use of_node_put()
when it is not used anymore.

Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21 17:04:30 +02:00
Liang He
4becf6417b arch: mips: generic: Add missing of_node_put() in board-ranchu.c
In ranchu_measure_hpt_freq(), of_find_compatible_node() will return
a node pointer with refcount incremented. We should use of_put_node()
when it is not used anymore.

Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21 17:04:30 +02:00
huhai
c81aba8fde MIPS: Remove repetitive increase irq_err_count
commit 979934da9e ("[PATCH] mips: update IRQ handling for vr41xx") added
a function irq_dispatch, and it'll increase irq_err_count when the get_irq
callback returns a negative value, but increase irq_err_count in get_irq
was not removed.

And also, modpost complains once gpio-vr41xx drivers become modules.
  ERROR: modpost: "irq_err_count" [drivers/gpio/gpio-vr41xx.ko] undefined!

So it would be a good idea to remove repetitive increase irq_err_count in
get_irq callback.

Fixes: 27fdd325da ("MIPS: Update VR41xx GPIO driver to use gpiolib")
Fixes: 979934da9e ("[PATCH] mips: update IRQ handling for vr41xx")
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: huhai <huhai@kylinos.cn>
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21 16:50:58 +02:00
Oleksandr Tyshchenko
ca6969013d drm/xen: Add missing VM_DONTEXPAND flag in mmap callback
With Xen PV Display driver in use the "expected" VM_DONTEXPAND flag
is not set (neither explicitly nor implicitly), so the driver hits
the code path in drm_gem_mmap_obj() which triggers the WARNING.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Link: https://lore.kernel.org/r/1652104303-5098-1-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-21 16:36:13 +02:00
Julien Grall
ecb6237fa3 x86/xen: Remove undefined behavior in setup_features()
1 << 31 is undefined. So switch to 1U << 31.

Fixes: 5ead97c84f ("xen: Core Xen implementation")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220617103037.57828-1-julien@xen.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-21 16:36:11 +02:00
Jason Andryuk
f9710c357e xen-blkfront: Handle NULL gendisk
When a VBD is not fully created and then closed, the kernel can have a
NULL pointer dereference:

The reproducer is trivial:

[user@dom0 ~]$ sudo xl block-attach work backend=sys-usb vdev=xvdi target=/dev/sdz
[user@dom0 ~]$ xl block-list work
Vdev  BE  handle state evt-ch ring-ref BE-path
51712 0   241    4     -1     -1       /local/domain/0/backend/vbd/241/51712
51728 0   241    4     -1     -1       /local/domain/0/backend/vbd/241/51728
51744 0   241    4     -1     -1       /local/domain/0/backend/vbd/241/51744
51760 0   241    4     -1     -1       /local/domain/0/backend/vbd/241/51760
51840 3   241    3     -1     -1       /local/domain/3/backend/vbd/241/51840
                 ^ note state, the /dev/sdz doesn't exist in the backend

[user@dom0 ~]$ sudo xl block-detach work xvdi
[user@dom0 ~]$ xl block-list work
Vdev  BE  handle state evt-ch ring-ref BE-path
work is an invalid domain identifier

And its console has:

BUG: kernel NULL pointer dereference, address: 0000000000000050
PGD 80000000edebb067 P4D 80000000edebb067 PUD edec2067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 52 Comm: xenwatch Not tainted 5.16.18-2.43.fc32.qubes.x86_64 #1
RIP: 0010:blk_mq_stop_hw_queues+0x5/0x40
Code: 00 48 83 e0 fd 83 c3 01 48 89 85 a8 00 00 00 41 39 5c 24 50 77 c0 5b 5d 41 5c 41 5d c3 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <8b> 47 50 85 c0 74 32 41 54 49 89 fc 55 53 31 db 49 8b 44 24 48 48
RSP: 0018:ffffc90000bcfe98 EFLAGS: 00010293
RAX: ffffffffc0008370 RBX: 0000000000000005 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000
RBP: ffff88800775f000 R08: 0000000000000001 R09: ffff888006e620b8
R10: ffff888006e620b0 R11: f000000000000000 R12: ffff8880bff39000
R13: ffff8880bff39000 R14: 0000000000000000 R15: ffff88800604be00
FS:  0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000050 CR3: 00000000e932e002 CR4: 00000000003706e0
Call Trace:
 <TASK>
 blkback_changed+0x95/0x137 [xen_blkfront]
 ? read_reply+0x160/0x160
 xenwatch_thread+0xc0/0x1a0
 ? do_wait_intr_irq+0xa0/0xa0
 kthread+0x16b/0x190
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x22/0x30
 </TASK>
Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore ipt_REJECT nf_reject_ipv4 xt_state xt_conntrack nft_counter nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel xen_netfront pcspkr xen_scsiback target_core_mod xen_netback xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn ipmi_devintf ipmi_msghandler fuse bpf_preload ip_tables overlay xen_blkfront
CR2: 0000000000000050
---[ end trace 7bc9597fd06ae89d ]---
RIP: 0010:blk_mq_stop_hw_queues+0x5/0x40
Code: 00 48 83 e0 fd 83 c3 01 48 89 85 a8 00 00 00 41 39 5c 24 50 77 c0 5b 5d 41 5c 41 5d c3 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <8b> 47 50 85 c0 74 32 41 54 49 89 fc 55 53 31 db 49 8b 44 24 48 48
RSP: 0018:ffffc90000bcfe98 EFLAGS: 00010293
RAX: ffffffffc0008370 RBX: 0000000000000005 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000
RBP: ffff88800775f000 R08: 0000000000000001 R09: ffff888006e620b8
R10: ffff888006e620b0 R11: f000000000000000 R12: ffff8880bff39000
R13: ffff8880bff39000 R14: 0000000000000000 R15: ffff88800604be00
FS:  0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000050 CR3: 00000000e932e002 CR4: 00000000003706e0
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled

info->rq and info->gd are only set in blkfront_connect(), which is
called for state 4 (XenbusStateConnected).  Guard against using NULL
variables in blkfront_closing() to avoid the issue.

The rest of blkfront_closing looks okay.  If info->nr_rings is 0, then
for_each_rinfo won't do anything.

blkfront_remove also needs to check for non-NULL pointers before
cleaning up the gendisk and request queue.

Fixes: 05d69d950d "xen-blkfront: sanitize the removal state machine"
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220601195341.28581-1-jandryuk@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-21 16:36:09 +02:00
Andy Shevchenko
9ef1654063 usb: typec: wcove: Drop wrong dependency to INTEL_SOC_PMIC
Intel SoC PMIC is a generic name for all PMICs that are used
on Intel platforms. In particular, INTEL_SOC_PMIC kernel configuration
option refers to Crystal Cove PMIC, which has never been a part
of any Intel Broxton hardware. Drop wrong dependency from Kconfig.

Note, the correct dependency is satisfied via ACPI PMIC OpRegion driver,
which the Type-C depends on.

Fixes: d2061f9cc3 ("usb: typec: add driver for Intel Whiskey Cove PMIC USB Type-C PHY")
Reported-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220620104316.57592-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-21 16:28:05 +02:00
Dan Vacura
96163f835e usb: gadget: uvc: fix list double add in uvcg_video_pump
A panic can occur if the endpoint becomes disabled and the
uvcg_video_pump adds the request back to the req_free list after it has
already been queued to the endpoint. The endpoint complete will add the
request back to the req_free list. Invalidate the local request handle
once it's been queued.

<6>[  246.796704][T13726] configfs-gadget gadget: uvc: uvc_function_set_alt(1, 0)
<3>[  246.797078][   T26] list_add double add: new=ffffff878bee5c40, prev=ffffff878bee5c40, next=ffffff878b0f0a90.
<6>[  246.797213][   T26] ------------[ cut here ]------------
<2>[  246.797224][   T26] kernel BUG at lib/list_debug.c:31!
<6>[  246.807073][   T26] Call trace:
<6>[  246.807180][   T26]  uvcg_video_pump+0x364/0x38c
<6>[  246.807366][   T26]  process_one_work+0x2a4/0x544
<6>[  246.807394][   T26]  worker_thread+0x350/0x784
<6>[  246.807442][   T26]  kthread+0x2ac/0x320

Fixes: f9897ec0f6 ("usb: gadget: uvc: only pump video data if necessary")
Cc: stable@vger.kernel.org
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Dan Vacura <w36195@motorola.com>
Link: https://lore.kernel.org/r/20220617163154.16621-1-w36195@motorola.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-21 16:27:26 +02:00
Geert Uytterhoeven
9faa1c8f92 dt-bindings: usb: ehci: Increase the number of PHYs
"make dtbs_check":

    arch/arm/boot/dts/r8a77470-iwg23s-sbc.dtb: usb@ee080100: phys: [[17, 0], [31]] is too long
	    From schema: Documentation/devicetree/bindings/usb/generic-ehci.yaml
    arch/arm/boot/dts/r8a77470-iwg23s-sbc.dtb: usb@ee0c0100: phys: [[17, 1], [33], [21, 0]] is too long
	    From schema: Documentation/devicetree/bindings/usb/generic-ehci.yaml

Some USB EHCI controllers (e.g. on the Renesas RZ/G1C SoC) have multiple
PHYs.  Increase the maximum number of PHYs to 3, which is sufficient for
now.

Fixes: 0499220d6d ("dt-bindings: Add missing array size constraints")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/c5d19e2f9714f43effd90208798fc1936098078f.1655301043.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-21 16:26:54 +02:00
Geert Uytterhoeven
0f074c1c95 dt-bindings: usb: ohci: Increase the number of PHYs
"make dtbs_check":

    arch/arm/boot/dts/r8a77470-iwg23s-sbc.dtb: usb@ee080000: phys: [[17, 0], [31]] is too long
	    From schema: Documentation/devicetree/bindings/usb/generic-ohci.yaml
    arch/arm/boot/dts/r8a77470-iwg23s-sbc.dtb: usb@ee0c0000: phys: [[17, 1], [33], [21, 0]] is too long
	    From schema: Documentation/devicetree/bindings/usb/generic-ohci.yaml

Some USB OHCI controllers (e.g. on the Renesas RZ/G1C SoC) have multiple
PHYs.  Increase the maximum number of PHYs to 3, which is sufficient for
now.

Fixes: 0499220d6d ("dt-bindings: Add missing array size constraints")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/0112f9c8881513cb33bf7b66bc743dd08b35a2f5.1655301203.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-21 16:26:43 +02:00
Pavel Begunkov
aacf2f9f38 io_uring: fix req->apoll_events
apoll_events should be set once in the beginning of poll arming just as
poll->events and not change after. However, currently io_uring resets it
on each __io_poll_execute() for no clear reason. There is also a place
in __io_arm_poll_handler() where we add EPOLLONESHOT to downgrade a
multishot, but forget to do the same thing with ->apoll_events, which is
buggy.

Fixes: 81459350d5 ("io_uring: cache req->apoll->events in req->cflags")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Link: https://lore.kernel.org/r/0aef40399ba75b1a4d2c2e85e6e8fd93c02fc6e4.1655814213.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-21 07:49:05 -06:00
Jens Axboe
b60cac14bb io_uring: fix merge error in checking send/recv addr2 flags
With the dropping of the IOPOLL checking in the per-opcode handlers,
we inadvertently left two checks in the recv/recvmsg and send/sendmsg
prep handlers for the same thing, and one of them includes addr2 which
holds the flags for these opcodes.

Fix it up and kill the redundant checks.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-21 07:47:13 -06:00
David Sterba
037e127452 Documentation: update btrfs list of features and link to readthedocs.io
The btrfs documentation in kernel is only meant as a starting point, so
update the list of features and add link to btrfs.readthedocs.io page
that is most up-to-date. The wiki is still used but information is
migrated from there.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-21 14:47:19 +02:00
Josef Bacik
bf7ba8ee75 btrfs: fix deadlock with fsync+fiemap+transaction commit
We are hitting the following deadlock in production occasionally

Task 1		Task 2		Task 3		Task 4		Task 5
		fsync(A)
		 start trans
						start commit
				falloc(A)
				 lock 5m-10m
				 start trans
				  wait for commit
fiemap(A)
 lock 0-10m
  wait for 5m-10m
   (have 0-5m locked)

		 have btrfs_need_log_full_commit
		  !full_sync
		  wait_ordered_extents
								finish_ordered_io(A)
								lock 0-5m
								DEADLOCK

We have an existing dependency of file extent lock -> transaction.
However in fsync if we tried to do the fast logging, but then had to
fall back to committing the transaction, we will be forced to call
btrfs_wait_ordered_range() to make sure all of our extents are updated.

This creates a dependency of transaction -> file extent lock, because
btrfs_finish_ordered_io() will need to take the file extent lock in
order to run the ordered extents.

Fix this by stopping the transaction if we have to do the full commit
and we attempted to do the fast logging.  Then attach to the transaction
and commit it if we need to.

CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-21 14:47:08 +02:00
Zygo Blaxell
97e86631bc btrfs: don't set lock_owner when locking extent buffer for reading
In 196d59ab9c "btrfs: switch extent buffer tree lock to rw_semaphore"
the functions for tree read locking were rewritten, and in the process
the read lock functions started setting eb->lock_owner = current->pid.
Previously lock_owner was only set in tree write lock functions.

Read locks are shared, so they don't have exclusive ownership of the
underlying object, so setting lock_owner to any single value for a
read lock makes no sense.  It's mostly harmless because write locks
and read locks are mutually exclusive, and none of the existing code
in btrfs (btrfs_init_new_buffer and print_eb_refs_lock) cares what
nonsense is written in lock_owner when no writer is holding the lock.

KCSAN does care, and will complain about the data race incessantly.
Remove the assignments in the read lock functions because they're
useless noise.

Fixes: 196d59ab9c ("btrfs: switch extent buffer tree lock to rw_semaphore")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-21 14:46:56 +02:00
Naohiro Aota
19ab78ca86 btrfs: zoned: fix critical section of relocation inode writeback
We use btrfs_zoned_data_reloc_{lock,unlock} to allow only one process to
write out to the relocation inode. That critical section must include all
the IO submission for the inode. However, flush_write_bio() in
extent_writepages() is out of the critical section, causing an IO
submission outside of the lock. This leads to an out of the order IO
submission and fail the relocation process.

Fix it by extending the critical section.

Fixes: 35156d8527 ("btrfs: zoned: only allow one process to add pages to a relocation inode")
CC: stable@vger.kernel.org # 5.16+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-21 14:46:30 +02:00
Naohiro Aota
343d8a3085 btrfs: zoned: prevent allocation from previous data relocation BG
After commit 5f0addf7b8 ("btrfs: zoned: use dedicated lock for data
relocation"), we observe IO errors on e.g, btrfs/232 like below.

  [09.0][T4038707] WARNING: CPU: 3 PID: 4038707 at fs/btrfs/extent-tree.c:2381 btrfs_cross_ref_exist+0xfc/0x120 [btrfs]
  <snip>
  [09.9][T4038707] Call Trace:
  [09.5][T4038707]  <TASK>
  [09.3][T4038707]  run_delalloc_nocow+0x7f1/0x11a0 [btrfs]
  [09.6][T4038707]  ? test_range_bit+0x174/0x320 [btrfs]
  [09.2][T4038707]  ? fallback_to_cow+0x980/0x980 [btrfs]
  [09.3][T4038707]  ? find_lock_delalloc_range+0x33e/0x3e0 [btrfs]
  [09.5][T4038707]  btrfs_run_delalloc_range+0x445/0x1320 [btrfs]
  [09.2][T4038707]  ? test_range_bit+0x320/0x320 [btrfs]
  [09.4][T4038707]  ? lock_downgrade+0x6a0/0x6a0
  [09.2][T4038707]  ? orc_find.part.0+0x1ed/0x300
  [09.5][T4038707]  ? __module_address.part.0+0x25/0x300
  [09.0][T4038707]  writepage_delalloc+0x159/0x310 [btrfs]
  <snip>
  [09.4][    C3] sd 10:0:1:0: [sde] tag#2620 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
  [09.5][    C3] sd 10:0:1:0: [sde] tag#2620 Sense Key : Illegal Request [current]
  [09.9][    C3] sd 10:0:1:0: [sde] tag#2620 Add. Sense: Unaligned write command
  [09.5][    C3] sd 10:0:1:0: [sde] tag#2620 CDB: Write(16) 8a 00 00 00 00 00 02 f3 63 87 00 00 00 2c 00 00
  [09.4][    C3] critical target error, dev sde, sector 396041272 op 0x1:(WRITE) flags 0x800 phys_seg 3 prio class 0
  [09.9][    C3] BTRFS error (device dm-1): bdev /dev/mapper/dml_102_2 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0

The IO errors occur when we allocate a regular extent in previous data
relocation block group.

On zoned btrfs, we use a dedicated block group to relocate a data
extent. Thus, we allocate relocating data extents (pre-alloc) only from
the dedicated block group and vice versa. Once the free space in the
dedicated block group gets tight, a relocating extent may not fit into
the block group. In that case, we need to switch the dedicated block
group to the next one. Then, the previous one is now freed up for
allocating a regular extent. The BG is already not enough to allocate
the relocating extent, but there is still room to allocate a smaller
extent. Now the problem happens. By allocating a regular extent while
nocow IOs for the relocation is still on-going, we will issue WRITE IOs
(for relocation) and ZONE APPEND IOs (for the regular writes) at the
same time. That mixed IOs confuses the write pointer and arises the
unaligned write errors.

This commit introduces a new bit 'zoned_data_reloc_ongoing' to the
btrfs_block_group. We set this bit before releasing the dedicated block
group, and no extent are allocated from a block group having this bit
set. This bit is similar to setting block_group->ro, but is different from
it by allowing nocow writes to start.

Once all the nocow IO for relocation is done (hooked from
btrfs_finish_ordered_io), we reset the bit to release the block group for
further allocation.

Fixes: c2707a2556 ("btrfs: zoned: add a dedicated data relocation block group")
CC: stable@vger.kernel.org # 5.16+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-21 14:43:48 +02:00
Filipe Manana
650c9caba3 btrfs: do not BUG_ON() on failure to migrate space when replacing extents
At btrfs_replace_file_extents(), if we fail to migrate reserved metadata
space from the transaction block reserve into the local block reserve,
we trigger a BUG_ON(). This is because it should not be possible to have
a failure here, as we reserved more space when we started the transaction
than the space we want to migrate. However having a BUG_ON() is way too
drastic, we can perfectly handle the failure and return the error to the
caller. So just do that instead, and add a WARN_ON() to make it easier
to notice the failure if it ever happens (which is particularly useful
for fstests, and the warning will trigger a failure of a test case).

Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-21 14:43:27 +02:00
Filipe Manana
983d8209c6 btrfs: add missing inode updates on each iteration when replacing extents
When replacing file extents, called during fallocate, hole punching,
clone and deduplication, we may not be able to replace/drop all the
target file extent items with a single transaction handle. We may get
-ENOSPC while doing it, in which case we release the transaction handle,
balance the dirty pages of the btree inode, flush delayed items and get
a new transaction handle to operate on what's left of the target range.

By dropping and replacing file extent items we have effectively modified
the inode, so we should bump its iversion and update its mtime/ctime
before we update the inode item. This is because if the transaction
we used for partially modifying the inode gets committed by someone after
we release it and before we finish the rest of the range, a power failure
happens, then after mounting the filesystem our inode has an outdated
iversion and mtime/ctime, corresponding to the values it had before we
changed it.

So add the missing iversion and mtime/ctime updates.

Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-21 14:43:21 +02:00
Filipe Manana
d4597898ba btrfs: fix race between reflinking and ordered extent completion
While doing a reflink operation, if an ordered extent for a file range
that does not overlap with the source and destination ranges of the
reflink operation happens, we can end up having a failure in the reflink
operation and return -EINVAL to user space.

The following sequence of steps explains how this can happen:

1) We have the page at file offset 315392 dirty (under delalloc);

2) A reflink operation for this file starts, using the same file as both
   source and destination, the source range is [372736, 409600) (length of
   36864 bytes) and the destination range is [208896, 245760);

3) At btrfs_remap_file_range_prep(), we flush all delalloc in the source
   and destination ranges, and wait for any ordered extents in those range
   to complete;

4) Still at btrfs_remap_file_range_prep(), we then flush all delalloc in
   the inode, but we neither wait for it to complete nor any ordered
   extents to complete. This results in starting delalloc for the page at
   file offset 315392 and creating an ordered extent for that single page
   range;

5) We then move to btrfs_clone() and enter the loop to find file extent
   items to copy from the source range to destination range;

6) In the first iteration we end up at last file extent item stored in
   leaf A:

   (...)
   item 131 key (143616 108 315392) itemoff 5101 itemsize 53
            extent data disk bytenr 1903988736 nr 73728
            extent data offset 12288 nr 61440 ram 73728

   This represents the file range [315392, 376832), which overlaps with
   the source range to clone.

   @datal is set to 61440, key.offset is 315392 and @next_key_min_offset
   is therefore set to 376832 (315392 + 61440).

   @off (372736) is > key.offset (315392), so @new_key.offset is set to
   the value of @destoff (208896).

   @new_key.offset == @last_dest_end (208896) so @drop_start is set to
   208896 (@new_key.offset).

   @datal is adjusted to 4096, as @off is > @key.offset.

   So in this iteration we call btrfs_replace_file_extents() for the range
   [208896, 212991] (a single page, which is
   [@drop_start, @new_key.offset + @datal - 1]).

   @last_dest_end is set to 212992 (@new_key.offset + @datal =
   208896 + 4096 = 212992).

   Before the next iteration of the loop, @key.offset is set to the value
   376832, which is @next_key_min_offset;

7) On the second iteration btrfs_search_slot() leaves us again at leaf A,
   but this time pointing beyond the last slot of leaf A, as that's where
   a key with offset 376832 should be at if it existed. So end up calling
   btrfs_next_leaf();

8) btrfs_next_leaf() releases the path, but before it searches again the
   tree for the next key/leaf, the ordered extent for the single page
   range at file offset 315392 completes. That results in trimming the
   file extent item we processed before, adjusting its key offset from
   315392 to 319488, reducing its length from 61440 to 57344 and inserting
   a new file extent item for that single page range, with a key offset of
   315392 and a length of 4096.

   Leaf A now looks like:

     (...)
     item 132 key (143616 108 315392) itemoff 4995 itemsize 53
              extent data disk bytenr 1801666560 nr 4096
              extent data offset 0 nr 4096 ram 4096
     item 133 key (143616 108 319488) itemoff 4942 itemsize 53
              extent data disk bytenr 1903988736 nr 73728
              extent data offset 16384 nr 57344 ram 73728

9) When btrfs_next_leaf() returns, it gives us a path pointing to leaf A
   at slot 133, since it's the first key that follows what was the last
   key we saw (143616 108 315392). In fact it's the same item we processed
   before, but its key offset was changed, so it counts as a new key;

10) So now we have:

    @key.offset == 319488
    @datal == 57344

    @off (372736) is > key.offset (319488), so @new_key.offset is set to
    208896 (@destoff value).

    @new_key.offset (208896) != @last_dest_end (212992), so @drop_start
    is set to 212992 (@last_dest_end value).

    @datal is adjusted to 4096 because @off > @key.offset.

    So in this iteration we call btrfs_replace_file_extents() for the
    invalid range of [212992, 212991] (which is
    [@drop_start, @new_key.offset + @datal - 1]).

    This range is empty, the end offset is smaller than the start offset
    so btrfs_replace_file_extents() returns -EINVAL, which we end up
    returning to user space and fail the reflink operation.

    This all happens because the range of this file extent item was
    already processed in the previous iteration.

This scenario can be triggered very sporadically by fsx from fstests, for
example with test case generic/522.

So fix this by having btrfs_clone() skip file extent items that cover a
file range that we have already processed.

CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-21 14:43:13 +02:00
Takashi Iwai
36a38c53b4 ALSA: hda: Fix discovery of i915 graphics PCI device
It's been reported that the recent fix for skipping the
component-binding with D-GPU caused a regression on some systems; it
resulted in the completely missing component binding with i915 GPU.

The problem was the use of pci_get_class() function.  It matches with
the full PCI class bits, while we want to match only partially the PCI
base class bits.  So, when a system has an i915 graphics device with
the PCI class 0380, it won't hit because we're looking for only the
PCI class 0300.

This patch fixes i915_gfx_present() to look up each PCI device and
match with PCI base class explicitly instead of pci_get_class().

Fixes: c9db8a30d9 ("ALSA: hda/i915 - skip acomp init if no matching display")
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Tested-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1200611
Link: https://lore.kernel.org/r/87bkunztec.wl-tiwai@suse.de
Link: https://lore.kernel.org/r/20220621120044.11573-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-21 14:05:12 +02:00
Alan Stern
f2d8c26068 usb: gadget: Fix non-unique driver names in raw-gadget driver
In a report for a separate bug (which has already been fixed by commit
5f0b5f4d50 "usb: gadget: fix race when gadget driver register via
ioctl") in the raw-gadget driver, the syzbot console log included
error messages caused by attempted registration of a new driver with
the same name as an existing driver:

> kobject_add_internal failed for raw-gadget with -EEXIST, don't try to register things with the same name in the same directory.
> UDC core: USB Raw Gadget: driver registration failed: -17
> misc raw-gadget: fail, usb_gadget_register_driver returned -17

These errors arise because raw_gadget.c registers a separate UDC
driver for each of the UDC instances it creates, but these drivers all
have the same name: "raw-gadget".  Until recently this wasn't a
problem, but when the "gadget" bus was added and UDC drivers were
registered on this bus, it became possible for name conflicts to cause
the registrations to fail.  The reason is simply that the bus code in
the driver core uses the driver name as a sysfs directory name (e.g.,
/sys/bus/gadget/drivers/raw-gadget/), and you can't create two
directories with the same pathname.

To fix this problem, the driver names used by raw-gadget are made
distinct by appending a unique ID number: "raw-gadget.N", with a
different value of N for each driver instance.  And to avoid the
proliferation of error handling code in the raw_ioctl_init() routine,
the error return paths are refactored into the common pattern (goto
statements leading to cleanup code at the end of the routine).

Link: https://lore.kernel.org/all/0000000000008c664105dffae2eb@google.com/
Fixes: fc274c1e99 "USB: gadget: Add a new bus for gadgets"
CC: Andrey Konovalov <andreyknvl@gmail.com>
CC: <stable@vger.kernel.org>
Reported-and-tested-by: syzbot+02b16343704b3af1667e@syzkaller.appspotmail.com
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/YqdG32w+3h8c1s7z@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-21 10:51:09 +02:00
Lukas Bulwahn
dbab764ed5 MAINTAINERS: add include/dt-bindings/usb to USB SUBSYSTEM
Maintainers of the directory Documentation/devicetree/bindings/usb
are also the maintainers of the corresponding directory
include/dt-bindings/usb.

Add the file entry for include/dt-bindings/usb to the appropriate
section in MAINTAINERS.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20220613124647.32019-1-lukas.bulwahn@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-21 10:50:52 +02:00
Florian Westphal
fcd53c51d0 netfilter: nf_dup_netdev: add and use recursion counter
Now that the egress function can be called from egress hook, we need
to avoid recursive calls into the nf_tables traverser, else crash.

Fixes: f87b9464d1 ("netfilter: nft_fwd_netdev: Support egress hook")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-21 10:50:41 +02:00
Florian Westphal
574a5b85dc netfilter: nf_dup_netdev: do not push mac header a second time
Eric reports skb_under_panic when using dup/fwd via bond+egress hook.
Before pushing mac header, we should make sure that we're called from
ingress to put back what was pulled earlier.

In egress case, the MAC header is already there; we should leave skb
alone.

While at it be more careful here: skb might have been altered and
headroom reduced, so add a skb_cow() before so that headroom is
increased if necessary.

nf_do_netdev_egress() assumes skb ownership (it normally ends with
a call to dev_queue_xmit), so we must free the packet on error.

Fixes: f87b9464d1 ("netfilter: nft_fwd_netdev: Support egress hook")
Reported-by: Eric Garver <eric@garver.life>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-21 10:50:40 +02:00
Jie2x Zhou
5d79d8af8d selftests: netfilter: correct PKTGEN_SCRIPT_PATHS in nft_concat_range.sh
Before change:
make -C netfilter
 TEST: performance
   net,port                                                      [SKIP]
   perf not supported
   port,net                                                      [SKIP]
   perf not supported
   net6,port                                                     [SKIP]
   perf not supported
   port,proto                                                    [SKIP]
   perf not supported
   net6,port,mac                                                 [SKIP]
   perf not supported
   net6,port,mac,proto                                           [SKIP]
   perf not supported
   net,mac                                                       [SKIP]
   perf not supported

After change:
   net,mac                                                       [ OK ]
     baseline (drop from netdev hook):               2061098pps
     baseline hash (non-ranged entries):             1606741pps
     baseline rbtree (match on first field only):    1191607pps
     set with  1000 full, ranged entries:            1639119pps
ok 8 selftests: netfilter: nft_concat_range.sh

Fixes: 611973c1e0 ("selftests: netfilter: Introduce tests for sets with range concatenation")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jie2x Zhou <jie2x.zhou@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-21 10:50:40 +02:00
Steve French
73130a7b1a smb3: fix empty netname context on secondary channels
Some servers do not allow null netname contexts, which would cause
multichannel to revert to single channel when mounting to some
servers (e.g. Azure xSMB).

Fixes: 4c14d7043f ("cifs: populate empty hostnames for extra channels")
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-20 16:23:50 -05:00
Matthew Wilcox (Oracle)
cb995f4eeb filemap: Handle sibling entries in filemap_get_read_batch()
If a read races with an invalidation followed by another read, it is
possible for a folio to be replaced with a higher-order folio.  If that
happens, we'll see a sibling entry for the new folio in the next iteration
of the loop.  This manifests as a NULL pointer dereference while holding
the RCU read lock.

Handle this by simply returning.  The next call will find the new folio
and handle it correctly.  The other ways of handling this rare race are
more complex and it's just not worth it.

Reported-by: Dave Chinner <david@fromorbit.com>
Reported-by: Brian Foster <bfoster@redhat.com>
Debugged-by: Brian Foster <bfoster@redhat.com>
Tested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Fixes: cbd59c48ae ("mm/filemap: use head pages in generic_file_buffered_read")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-20 16:37:45 -04:00
Matthew Wilcox (Oracle)
5ccc944dce filemap: Correct the conditions for marking a folio as accessed
We had an off-by-one error which meant that we never marked the first page
in a read as accessed.  This was visible as a slowdown when re-reading
a file as pages were being evicted from cache too soon.  In reviewing
this code, we noticed a second bug where a multi-page folio would be
marked as accessed multiple times when doing reads that were less than
the size of the folio.

Abstract the comparison of whether two file positions are in the same
folio into a new function, fixing both of these bugs.

Reported-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-20 16:37:45 -04:00
Yihao Han
5491424d17 video: fbdev: simplefb: Check before clk_put() not needed
clk_put() already checks the clk ptr using !clk and IS_ERR()
so there is no need to check it again before calling it.

Signed-off-by: Yihao Han <hanyihao@vivo.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-20 20:22:16 +02:00
Yihao Han
b5c525abe7 video: fbdev: au1100fb: Drop unnecessary NULL ptr check
clk_disable() already checks the clk ptr using IS_ERR_OR_NULL(clk) and
clk_enable() checks the clk ptr using !clk, so there is no need to check clk
ptr again before calling them.

Signed-off-by: Yihao Han <hanyihao@vivo.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-20 20:19:50 +02:00
Hyunwoo Kim
a09d2d00af video: fbdev: pxa3xx-gcu: Fix integer overflow in pxa3xx_gcu_write
In pxa3xx_gcu_write, a count parameter of type size_t is passed to words of
type int.  Then, copy_from_user() may cause a heap overflow because it is used
as the third argument of copy_from_user().

Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-20 20:12:17 +02:00
Jason A. Donenfeld
c7b28f52f4 drm/i915/display: Re-add check for low voltage sku for max dp source rate
This reverts commit 73867c8709 ("drm/i915/display: Remove check for
low voltage sku for max dp source rate"), which, on an i7-11850H iGPU
with a Thinkpad X1 Extreme Gen 4, attached to a LG LP160UQ1-SPB1
embedded panel, causes wild flickering glitching technicolor
pyrotechnics on resumption from suspend. The display shows strobing
colors in an utter disaster explosion of pantone, as though bombs were
dropped on the leprechauns at the base of the rainbow.

Rebooting the machine fixes the issue, presumably because the display is
initialized by firmware rather than by i915. Otherwise, the GPU appears
to work fine.

Bisection traced it back to this commit, which makes sense given the
issues.

Note: This re-opens, and puts back to the drawing board,
https://gitlab.freedesktop.org/drm/intel/-/issues/5272 which was fixed
by the regressing commit.

Fixes: 73867c8709 ("drm/i915/display: Remove check for low voltage sku for max dp source rate")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6205
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Jani Saarinen <jani.saarinen@intel.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220613102241.9236-1-Jason@zx2c4.com
(cherry picked from commit d592983508)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-06-20 19:39:00 +03:00
Javier Martinez Canillas
2a166929bc regmap: Wire up regmap_config provided bulk write in missed functions
There are some functions that were missed by commit d77e745613 ("regmap:
Add bulk read/write callbacks into regmap_config") when support to define
bulk read/write callbacks in regmap_config was introduced.

The regmap_bulk_write() and regmap_noinc_write() functions weren't changed
to use the added map->write instead of the map->bus->write handler.

Also, the regmap_can_raw_write() was not modified to take map->write into
account. So will only return true if a bus with a .write callback is set.

Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-4-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-20 16:51:29 +01:00
Javier Martinez Canillas
c42e99a3f9 regmap: Make regmap_noinc_read() return -ENOTSUPP if map->read isn't set
Before adding support to define bulk read/write callbacks in regmap_config
by the commit d77e745613 ("regmap: Add bulk read/write callbacks into
regmap_config"), the regmap_noinc_read() function returned an errno early
a map->bus->read callback wasn't set.

But that commit dropped the check and now a call to _regmap_raw_read() is
attempted even when bulk read operations are not supported. That function
checks for map->read anyways but there's no point to continue if the read
can't succeed.

Also is a fragile assumption to make so is better to make it fail earlier.

Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-3-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-20 16:51:28 +01:00
Javier Martinez Canillas
ea50e2a154 regmap: Re-introduce bulk read support check in regmap_bulk_read()
Support for drivers to define bulk read/write callbacks in regmap_config
was introduced by the commit d77e745613 ("regmap: Add bulk read/write
callbacks into regmap_config"), but this commit wrongly dropped a check
in regmap_bulk_read() to determine whether bulk reads can be done or not.

Before that commit, it was checked if map->bus was set. Now has to check
if a map->read callback has been set.

Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-2-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-20 16:51:26 +01:00
Linus Torvalds
78ca55889a Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Eight fixes, all in drivers (ufs, scsi_debug, storvsc, iscsi, ibmvfc).

  Apart from the ufs command clearing updates, these are mostly minor
  and obvious fixes"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ibmvfc: Store vhost pointer during subcrq allocation
  scsi: ibmvfc: Allocate/free queue resource only during probe/remove
  scsi: storvsc: Correct reporting of Hyper-V I/O size limits
  scsi: ufs: Fix a race between the interrupt handler and the reset handler
  scsi: ufs: Support clearing multiple commands at once
  scsi: ufs: Simplify ufshcd_clear_cmd()
  scsi: iscsi: Exclude zero from the endpoint ID range
  scsi: scsi_debug: Fix zone transition to full condition
2022-06-20 09:35:04 -05:00
Linus Torvalds
c5b3a0946b Merge tag 'perf-tools-fixes-for-v5.19-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool fixes from Arnaldo Carvalho de Melo:

 - Don't set data source if it's not a memory operation in ARM SPE
   (Statistical Profiling Extensions).

 - Fix handling of exponent floating point values in perf stat
   expressions.

 - Don't leak fd on failure on libperf open.

 - Fix 'perf test' CPU topology test for PPC guest systems.

 - Fix undefined behaviour on breakpoint account 'perf test' entry.

 - Record only user callchains on the "Check ARM64 callgraphs are
   complete in FP mode" 'perf test' entry.

 - Fix "perf stat CSV output linter" test on s390.

 - Sync batch of kernel headers with tools/perf/.

* tag 'perf-tools-fixes-for-v5.19-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  tools headers UAPI: Sync linux/prctl.h with the kernel sources
  perf metrics: Ensure at least 1 id per metric
  tools headers arm64: Sync arm64's cputype.h with the kernel sources
  tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
  perf arm-spe: Don't set data source if it's not a memory operation
  perf expr: Allow exponents on floating point values
  perf test topology: Use !strncmp(right platform) to fix guest PPC comparision check
  perf test: Record only user callchains on the "Check Arm64 callgraphs are complete in fp mode" test
  perf beauty: Update copy of linux/socket.h with the kernel sources
  perf test: Fix variable length array undefined behavior in bp_account
  libperf evsel: Open shouldn't leak fd on failure
  perf test: Fix "perf stat CSV output linter" test on s390
  perf unwind: Fix uninitialized variable
2022-06-20 09:31:03 -05:00
Linus Torvalds
59b785fe2a Merge tag 'slab-for-5.19-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fixes from Vlastimil Babka:

 - A slub fix for PREEMPT_RT locking semantics from Sebastian.

 - A slub fix for state corruption due to a possible race scenario from
   Jann.

* tag 'slab-for-5.19-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slub: add missing TID updates on slab deactivation
  mm/slub: Move the stackdepot related allocation out of IRQ-off section.
2022-06-20 09:28:51 -05:00
Gerd Hoffmann
05b252cccb udmabuf: add back sanity check
Check vm_fault->pgoff before using it.  When we removed the warning, we
also removed the check.

Fixes: 7b26e4e211 ("udmabuf: drop WARN_ON() check.")
Reported-by: zdi-disclosures@trendmicro.com
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-20 08:38:29 -05:00
Jens Axboe
1bacd264d3 io_uring: mark reissue requests with REQ_F_PARTIAL_IO
If we mark for reissue, we assume that the buffer will remain stable.
Hence if are using a provided buffer, we need to ensure that we stick
with it for the duration of that request.

This only affects block devices that use provided buffers, as those are
the only ones that get marked with REQ_F_REISSUE.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-20 06:39:27 -06:00
Bjorn Helgaas
267173cbf4 video: fbdev: skeletonfb: Convert to generic power management
PCI-specific power management (pci_driver.suspend and pci_driver.resume) is
deprecated.  If drivers implement power management, they should use the
generic power management framework, not the PCI-specific hooks.

Convert the sample code to use the generic power management framework.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-20 14:38:27 +02:00
Bjorn Helgaas
e146a09621 video: fbdev: cirrusfb: Remove useless reference to PCI power management
PCI-specific power management (pci_driver.suspend and pci_driver.resume) is
deprecated.  The cirrusfb driver has never implemented power management at
all, but if it ever does, it should use the generic power management
framework, not the PCI-specific hooks.

Remove the commented-out references to the PCI-specific power management
hooks.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-20 14:38:27 +02:00
Petr Cvek
d36a869e0d video: fbdev: intelfb: Initialize value of stolen size
Variable stolen_size can be left uninitialized in a code path with
INTEL_855_GMCH_GMS_DISABLED. Fix this by initializing the variable to 0.

Also fix indentation of function arguments.

Signed-off-by: Petr Cvek <petrcvekcz@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-20 14:33:05 +02:00
Petr Cvek
25c9a15fb7 video: fbdev: intelfb: Use aperture size from pci_resource_len
Aperture size for i9x5 variants is determined from PCI base address.

	if (pci_resource_start(pdev, 2) & 0x08000000)
		*aperture_size = MB(128);
	...

This condition is incorrect as 128 MiB address can have the address
set as 0x?8000000 or 0x?0000000. Also the code can be simplified to just
use pci_resource_len().

The true settings of the aperture size is in the MSAC register, which
could be used instead. However the value is used only as an info message,
so it doesn't matter.

Signed-off-by: Petr Cvek <petrcvekcz@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-20 14:33:05 +02:00
Xiang wangx
fc378794a2 video: fbdev: skeletonfb: Fix syntax errors in comments
Delete the redundant word 'its'.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-06-20 14:27:19 +02:00
Takashi Iwai
c7807b27d5 ALSA: hda/via: Fix missing beep setup
Like the previous fix for Conexant codec, the beep_nid has to be set
up before calling snd_hda_gen_parse_auto_config(); otherwise it'd miss
the path setup.

Fix the call order for addressing the missing beep setup.

Fixes: 0e8f986249 ("ALSA: hda/via - Simplify control management")
Cc: <stable@vger.kernel.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216152
Link: https://lore.kernel.org/r/20220620104008.1994-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-20 12:40:42 +02:00
Takashi Iwai
5faa0bc691 ALSA: hda/conexant: Fix missing beep setup
Currently the Conexant codec driver sets up the beep NID after calling
snd_hda_gen_parse_auto_config().  It turned out that this results in
the insufficient setup for the beep control, as the generic parser
handles the fake path in snd_hda_gen_parse_auto_config() only if the
beep_nid is set up beforehand.

For dealing with the beep widget properly, call cx_auto_parse_beep()
before snd_hda_gen_parse_auto_config() call.

Fixes: 51e19ca5f7 ("ALSA: hda/conexant - Clean up beep code")
Cc: <stable@vger.kernel.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216152
Link: https://lore.kernel.org/r/20220620104008.1994-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-20 12:40:41 +02:00
Jon Lin
419bc8f681 spi: rockchip: Unmask IRQ at the final to avoid preemption
Avoid pio_write process is preempted, resulting in abnormal state.

Signed-off-by: Jon Lin <jon.lin@rock-chips.com>
Signed-off-by: Jon <jon.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20220617124251.5051-1-jon.lin@rock-chips.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-20 11:35:43 +01:00
Carlo Lobrano
342fc0c3b3 USB: serial: option: add Telit LE910Cx 0x1250 composition
Add support for the following Telit LE910Cx composition:

0x1250: rmnet, tty, tty, tty, tty

Reviewed-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Carlo Lobrano <c.lobrano@gmail.com>
Link: https://lore.kernel.org/r/20220614075623.2392607-1-c.lobrano@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2022-06-20 12:28:35 +02:00
Tvrtko Ursulin
3828296ad6 drm/i915/fdinfo: Don't show engine classes not present
Stop displaying engine classes with no engines - it is not a huge problem
if they are shown, since the values will correctly be all zeroes, but it
does count as misleading.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 055634e4b6 ("drm/i915: Expose client engine utilisation via fdinfo")
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220616140056.559074-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 9f1b1d0b22)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-06-20 13:07:49 +03:00
Ville Syrjälä
13bd259b64 drm/i915: Implement w/a 22010492432 for adl-s
adl-s needs the combo PLL DCO fraction w/a as well.
Gets us slightly more accurate clock out of the PLL.

Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220613201439.23341-1-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit d36bdd77b9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-06-20 13:07:44 +03:00
Max Filippov
a2d9b75b19 xtensa: change '.bss' to '.section .bss'
For some reason (ancient assembler?) the following build error is
reported by the kisskb:

  kisskb/src/arch/xtensa/kernel/entry.S: Error: unknown pseudo-op: `.bss':
  => 2176

Change abbreviated '.bss' to the full '.section .bss, "aw"' to fix this
error.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2022-06-20 02:50:34 -07:00
Jason A. Donenfeld
63b8ea5e4f random: update comment from copy_to_user() -> copy_to_iter()
This comment wasn't updated when we moved from read() to read_iter(), so
this patch makes the trivial fix.

Fixes: 1b388e7765 ("random: convert to using fops->read_iter()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-20 11:06:17 +02:00
Ziyang Xuan
69135c572d net/tls: fix tls_sk_proto_close executed repeatedly
After setting the sock ktls, update ctx->sk_proto to sock->sk_prot by
tls_update(), so now ctx->sk_proto->close is tls_sk_proto_close(). When
close the sock, tls_sk_proto_close() is called for sock->sk_prot->close
is tls_sk_proto_close(). But ctx->sk_proto->close() will be executed later
in tls_sk_proto_close(). Thus tls_sk_proto_close() executed repeatedly
occurred. That will trigger the following bug.

=================================================================
KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
RIP: 0010:tls_sk_proto_close+0xd8/0xaf0 net/tls/tls_main.c:306
Call Trace:
 <TASK>
 tls_sk_proto_close+0x356/0xaf0 net/tls/tls_main.c:329
 inet_release+0x12e/0x280 net/ipv4/af_inet.c:428
 __sock_release+0xcd/0x280 net/socket.c:650
 sock_close+0x18/0x20 net/socket.c:1365

Updating a proto which is same with sock->sk_prot is incorrect. Add proto
and sock->sk_prot equality check at the head of tls_update() to fix it.

Fixes: 95fa145479 ("bpf: sockmap/tls, close can race with map free")
Reported-by: syzbot+29c3c12f3214b85ad081@syzkaller.appspotmail.com
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20 10:01:52 +01:00
Eric Dumazet
301bd140ed erspan: do not assume transport header is always set
Rewrite tests in ip6erspan_tunnel_xmit() and
erspan_fb_xmit() to not assume transport header is set.

syzbot reported:

WARNING: CPU: 0 PID: 1350 at include/linux/skbuff.h:2911 skb_transport_header include/linux/skbuff.h:2911 [inline]
WARNING: CPU: 0 PID: 1350 at include/linux/skbuff.h:2911 ip6erspan_tunnel_xmit+0x15af/0x2eb0 net/ipv6/ip6_gre.c:963
Modules linked in:
CPU: 0 PID: 1350 Comm: aoe_tx0 Not tainted 5.19.0-rc2-syzkaller-00160-g274295c6e53f #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
RIP: 0010:skb_transport_header include/linux/skbuff.h:2911 [inline]
RIP: 0010:ip6erspan_tunnel_xmit+0x15af/0x2eb0 net/ipv6/ip6_gre.c:963
Code: 0f 47 f0 40 88 b5 7f fe ff ff e8 8c 16 4b f9 89 de bf ff ff ff ff e8 a0 12 4b f9 66 83 fb ff 0f 85 1d f1 ff ff e8 71 16 4b f9 <0f> 0b e9 43 f0 ff ff e8 65 16 4b f9 48 8d 85 30 ff ff ff ba 60 00
RSP: 0018:ffffc90005daf910 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 000000000000ffff RCX: 0000000000000000
RDX: ffff88801f032100 RSI: ffffffff882e8d3f RDI: 0000000000000003
RBP: ffffc90005dafab8 R08: 0000000000000003 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000000 R12: ffff888024f21d40
R13: 000000000000a288 R14: 00000000000000b0 R15: ffff888025a2e000
FS: 0000000000000000(0000) GS:ffff88802c800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2e425000 CR3: 000000006d099000 CR4: 0000000000152ef0
Call Trace:
<TASK>
__netdev_start_xmit include/linux/netdevice.h:4805 [inline]
netdev_start_xmit include/linux/netdevice.h:4819 [inline]
xmit_one net/core/dev.c:3588 [inline]
dev_hard_start_xmit+0x188/0x880 net/core/dev.c:3604
sch_direct_xmit+0x19f/0xbe0 net/sched/sch_generic.c:342
__dev_xmit_skb net/core/dev.c:3815 [inline]
__dev_queue_xmit+0x14a1/0x3900 net/core/dev.c:4219
dev_queue_xmit include/linux/netdevice.h:2994 [inline]
tx+0x6a/0xc0 drivers/block/aoe/aoenet.c:63
kthread+0x1e7/0x3b0 drivers/block/aoe/aoecmd.c:1229
kthread+0x2e9/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
</TASK>

Fixes: d5db21a3e6 ("erspan: auto detect truncated ipv6 packets.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20 10:00:55 +01:00
Riccardo Paolo Bestetti
313c502fa3 ipv4: fix bind address validity regression tests
Commit 8ff978b8b2 ("ipv4/raw: support binding to nonlocal addresses")
introduces support for binding to nonlocal addresses, as well as some
basic test coverage for some of the related cases.

Commit b4a028c4d0 ("ipv4: ping: fix bind address validity check")
fixes a regression which incorrectly removed some checks for bind
address validation. In addition, it introduces regression tests for
those specific checks. However, those regression tests are defective, in
that they perform the tests using an incorrect combination of bind
flags. As a result, those tests fail when they should succeed.

This commit introduces additional regression tests for nonlocal binding
and fixes the defective regression tests. It also introduces new
set_sysctl calls for the ipv4_bind test group, as to perform the ICMP
binding tests it is necessary to allow ICMP socket creation by setting
the net.ipv4.ping_group_range knob.

Fixes: b4a028c4d0 ("ipv4: ping: fix bind address validity check")
Reported-by: Riccardo Paolo Bestetti <pbl@bestov.io>
Signed-off-by: Riccardo Paolo Bestetti <pbl@bestov.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20 09:58:12 +01:00
Greg Kroah-Hartman
315f7e15c2 Merge tag 'iio-fixes-for-5.19a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next
Jonathan writes:

1st set of IIO fixes for the 5.19 cycle.

Most of these have been in next for a long time. Unfortunately there
was one stray patch in the branch (wasn't a fix), so I've just rebased
to remove that.

* testing
  - Fix a missing MODULE_LICENSE() warning by restricting possible build
    configs.
* Various drivers
  - Fix ordering of iio_get_trigger() being called before
    iio_trigger_register()
* adi,admv1014
  - Fix dubious x & !y warning.
* adi,axi-adc
  - Fix missing of_node_put() in error and normal paths.
* aspeed,adc
  - Add missing of_node_put()
* fsl,mma8452
  - Fix broken probing from device tree.
  - Drop check on return value of i2c write to device to cause reset as
    ACK will be missing (device reset before sending it).
* fsl,vf610
  - Fix documentation of in_conversion_mode ABI.
* iio-trig-sysfs
  - Ensure irq work has finished before freeing the trigger.
* invensense,mpu3050
 - Disable regulators in error path.
* invensense,icm42600
  - Fix collision of enum value of 0 with error path where 0 is no match.
* renesas,rzg2l_Adc
  - Add missing fwnode_handle_put() in error path.
* rescale
  - Fix a boolean logic bug for detection of raw + scale affecting an
    obscure corner case.
* semtech,sx9324
  - Check return value of read of pin_defs
* st,stm32-adc:
  - Fix interaction across ADC instances for some supported devices.
  - Drop false spurious IRQ messages.
  - Fix calibration value handling.  If we can't calibrate don't expose the
    vref_int channel.
  - Fix maximum clock rate for stm32pm15x
* ti,ads131e08
  - Add missing fwnode_handle_put() in error paths.
* xilinx,ams
  - Fix variable checked for error from platform_get_irq()
* x-powers,axp288
  - Overide TS_PIN bias current for boards where it is not correctly
    initialized.
* yamaha,yas530
  - Fix inverted check on calibration data being all zeros.

* tag 'iio-fixes-for-5.19a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (26 commits)
  iio:proximity:sx9324: Check ret value of device_property_read_u32_array()
  iio: accel: mma8452: ignore the return value of reset operation
  iio: adc: stm32: fix maximum clock rate for stm32mp15x
  iio: adc: stm32: fix vrefint wrong calibration value handling
  iio: imu: inv_icm42600: Fix broken icm42600 (chip id 0 value)
  iio: adc: vf610: fix conversion mode sysfs node name
  iio: adc: adi-axi-adc: Fix refcount leak in adi_axi_adc_attach_client
  iio: test: fix missing MODULE_LICENSE for IIO_RESCALE=m
  iio:humidity:hts221: rearrange iio trigger get and register
  iio:chemical:ccs811: rearrange iio trigger get and register
  iio:accel:mxc4005: rearrange iio trigger get and register
  iio:accel:kxcjk-1013: rearrange iio trigger get and register
  iio:accel:bma180: rearrange iio trigger get and register
  iio: afe: rescale: Fix boolean logic bug
  iio: adc: aspeed: Fix refcount leak in aspeed_adc_set_trim_data
  iio: adc: stm32: Fix IRQs on STM32F4 by removing custom spurious IRQs message
  iio: adc: stm32: Fix ADCs iteration in irq handler
  iio: adc: ti-ads131e08: add missing fwnode_handle_put() in ads131e08_alloc_channels()
  iio: adc: rzg2l_adc: add missing fwnode_handle_put() in rzg2l_adc_parse_properties()
  iio: trigger: sysfs: fix use-after-free on remove
  ...
2022-06-20 09:49:52 +02:00
Takashi Iwai
9882d63bea ALSA: memalloc: Drop x86-specific hack for WC allocations
The recent report for a crash on Haswell machines implied that the
x86-specific (rather hackish) implementation for write-cache memory
buffer allocation in ALSA core is buggy with the recent kernel in some
corner cases.  This patch drops the x86-specific implementation and
uses the standard dma_alloc_wc() & co generically for avoiding the bug
and also for simplification.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216112
Cc: <stable@vger.kernel.org> # v5.18+
Link: https://lore.kernel.org/r/20220620073440.7514-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-20 09:35:23 +02:00
Damien Le Moal
9243fc4cd2 block: remove queue from struct blk_independent_access_range
The request queue pointer in struct blk_independent_access_range is
unused. Remove it.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Fixes: 41e46b3c2a ("block: Fix potential deadlock in blk_ia_range_sysfs_show()")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220603053529.76405-1-damien.lemoal@opensource.wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-19 18:40:11 -06:00
Nick Desaulniers
291810be42 Documentation/llvm: Update Supported Arch table
While watching Michael's new talk on Clang-built-Linux, I noticed the
arch table in our docs that he refers to is outdated.

Add hexagon and User Mode.  Bump MIPS and RISCV to LLVM=1.  PowerPC is
almost LLVM=1 capable; ppc64le works, but ppc64 (big endian) and ppc32
still need more work.

Link: https://youtu.be/W4zdEDpvR5c?t=399
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-06-20 08:21:29 +09:00
Masahiro Yamada
28438794ab modpost: fix section mismatch check for exported init/exit sections
Since commit f02e8a6596 ("module: Sort exported symbols"),
EXPORT_SYMBOL* is placed in the individual section ___ksymtab(_gpl)+<sym>
(3 leading underscores instead of 2).

Since then, modpost cannot detect the bad combination of EXPORT_SYMBOL
and __init/__exit.

Fix the .fromsec field.

Fixes: f02e8a6596 ("module: Sort exported symbols")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-06-20 08:18:03 +09:00
Daeho Jeong
61803e9843 f2fs: fix iostat related lock protection
Made iostat related locks safe to be called from irq context again.

Cc: <stable@vger.kernel.org>
Fixes: a1e09b03e6 ("f2fs: use iomap for direct I/O")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Tested-by: Eddie Huang <eddie.huang@mediatek.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-06-19 15:16:12 -07:00
Jaegeuk Kim
4cde00d507 f2fs: attach inline_data after setting compression
This fixes the below corruption.

[345393.335389] F2FS-fs (vdb): sanity_check_inode: inode (ino=6d0, mode=33206) should not have inline_data, run fsck to fix

Cc: <stable@vger.kernel.org>
Fixes: 677a82b44e ("f2fs: fix to do sanity check for inline inode")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-06-19 15:16:10 -07:00
Jason A. Donenfeld
c01d4d0a82 random: quiet urandom warning ratelimit suppression message
random.c ratelimits how much it warns about uninitialized urandom reads
using __ratelimit(). When the RNG is finally initialized, it prints the
number of missed messages due to ratelimiting.

It has been this way since that functionality was introduced back in
2018. Recently, cc1e127bfa ("random: remove ratelimiting for in-kernel
unseeded randomness") put a bit more stress on the urandom ratelimiting,
which teased out a bug in the implementation.

Specifically, when under pressure, __ratelimit() will print its own
message and reset the count back to 0, making the final message at the
end less useful. Secondly, it does so as a pr_warn(), which apparently
is undesirable for people's CI.

Fortunately, __ratelimit() has the RATELIMIT_MSG_ON_RELEASE flag exactly
for this purpose, so we set the flag.

Fixes: 4e00b339e2 ("random: rate limit unseeded randomness warnings")
Cc: stable@vger.kernel.org
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Reported-by: Ron Economos <re@w6rz.net>
Tested-by: Ron Economos <re@w6rz.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-19 23:50:46 +02:00
Jason A. Donenfeld
534d2eaf19 random: schedule mix_interrupt_randomness() less often
It used to be that mix_interrupt_randomness() would credit 1 bit each
time it ran, and so add_interrupt_randomness() would schedule mix() to
run every 64 interrupts, a fairly arbitrary number, but nonetheless
considered to be a decent enough conservative estimate.

Since e3e33fc2ea ("random: do not use input pool from hard IRQs"),
mix() is now able to credit multiple bits, depending on the number of
calls to add(). This was done for reasons separate from this commit, but
it has the nice side effect of enabling this patch to schedule mix()
less often.

Currently the rules are:
a) Credit 1 bit for every 64 calls to add().
b) Schedule mix() once a second that add() is called.
c) Schedule mix() once every 64 calls to add().

Rules (a) and (c) no longer need to be coupled. It's still important to
have _some_ value in (c), so that we don't "over-saturate" the fast
pool, but the once per second we get from rule (b) is a plenty enough
baseline. So, by increasing the 64 in rule (c) to something larger, we
avoid calling queue_work_on() as frequently during irq storms.

This commit changes that 64 in rule (c) to be 1024, which means we
schedule mix() 16 times less often. And it does *not* need to change the
64 in rule (a).

Fixes: 58340f8e95 ("random: defer fast pool mixing to worker")
Cc: stable@vger.kernel.org
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-19 23:50:45 +02:00
Linus Torvalds
a111daf0c5 Linux 5.19-rc3 2022-06-19 15:06:47 -05:00
Aashish Sharma
70171ed6dc iio:proximity:sx9324: Check ret value of device_property_read_u32_array()
0-day reports:

drivers/iio/proximity/sx9324.c:868:3: warning: Value stored
to 'ret' is never read [clang-analyzer-deadcode.DeadStores]

Put an if condition to break out of switch if ret is non-zero.

Signed-off-by: Aashish Sharma <shraash@google.com>
Fixes: a8ee3b32f5 ("iio:proximity:sx9324: Add dt_binding support")
Reported-by: kernel test robot <lkp@intel.com>
[swboyd@chromium.org: Reword commit subject, add fixes tag]
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
Link: https://lore.kernel.org/r/20220613232224.2466278-1-swboyd@chromium.org
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:49 +01:00
Haibo Chen
bf745142cc iio: accel: mma8452: ignore the return value of reset operation
On fxls8471, after set the reset bit, the device will reset immediately,
will not give ACK. So ignore the return value of this reset operation,
let the following code logic to check whether the reset operation works.

Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Fixes: ecabae7131 ("iio: mma8452: Initialise before activating")
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/1655292718-14287-1-git-send-email-haibo.chen@nxp.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:49 +01:00
Olivier Moysan
990539486e iio: adc: stm32: fix maximum clock rate for stm32mp15x
Change maximum STM32 ADC input clock rate to 36MHz, as specified
in STM32MP15x datasheets.

Fixes: d58c67d1d8 ("iio: adc: stm32-adc: add support for STM32MP1")
Signed-off-by: Olivier Moysan <olivier.moysan@foss.st.com>
Reviewed-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/20220609095234.375925-1-olivier.moysan@foss.st.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:49 +01:00
Olivier Moysan
bc05f30fc2 iio: adc: stm32: fix vrefint wrong calibration value handling
If the vrefint calibration is zero, the vrefint channel output value
cannot be computed. Currently, in such case, the raw conversion value
is returned, which is not relevant.
Do not expose the vrefint channel when the output value cannot be
computed, instead.

Fixes: 0e346b2cfa ("iio: adc: stm32-adc: add vrefint calibration support")
Signed-off-by: Olivier Moysan <olivier.moysan@foss.st.com>
Reviewed-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/20220609095856.376961-1-olivier.moysan@foss.st.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:49 +01:00
Jean-Baptiste Maneyrol
106b391e1b iio: imu: inv_icm42600: Fix broken icm42600 (chip id 0 value)
The 0 value used for INV_CHIP_ICM42600 was not working since the
match in i2c/spi was checking against NULL value.

To keep this check, add a first INV_CHIP_INVALID 0 value as safe
guard.

Fixes: 31c24c1e93 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver")
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Link: https://lore.kernel.org/r/20220609102301.4794-1-jmaneyrol@invensense.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:49 +01:00
Baruch Siach
f1a633b15c iio: adc: vf610: fix conversion mode sysfs node name
The documentation missed the "in_" prefix for this IIO_SHARED_BY_DIR
entry.

Fixes: bf04c1a367 ("iio: adc: vf610: implement configurable conversion modes")
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by: Haibo Chen <haibo.chen@nxp.com>
Link: https://lore.kernel.org/r/560dc93fafe5ef7e9a409885fd20b6beac3973d8.1653900626.git.baruch@tkos.co.il
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:49 +01:00
Miaoqian Lin
ada7b0c0de iio: adc: adi-axi-adc: Fix refcount leak in adi_axi_adc_attach_client
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: ef04070692 ("iio: adc: adi-axi-adc: add support for AXI ADC IP core")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220524074517.45268-1-linmq006@gmail.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:49 +01:00
Liam Beguin
7a2f6f61e8 iio: test: fix missing MODULE_LICENSE for IIO_RESCALE=m
When IIO_RESCALE_KUNIT_TEST=y and IIO_RESCALE=m,
drivers/iio/afe/iio-rescale.o is built twice causing the
MODULE_LICENSE() to be lost, as shown by:

  ERROR: modpost: missing MODULE_LICENSE() in drivers/iio/afe/iio-rescale.o

Rework the build configuration to have the dependency specified in the
Kconfig.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: 8e74a48d17 ("iio: test: add basic tests for the iio-rescale driver")
Signed-off-by: Liam Beguin <liambeguin@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20220601142138.3331278-1-liambeguin@gmail.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:49 +01:00
Dmitry Rokosov
10b9c2c33a iio:humidity:hts221: rearrange iio trigger get and register
IIO trigger interface function iio_trigger_get() should be called after
iio_trigger_register() (or its devm analogue) strictly, because of
iio_trigger_get() acquires module refcnt based on the trigger->owner
pointer, which is initialized inside iio_trigger_register() to
THIS_MODULE.
If this call order is wrong, the next iio_trigger_put() (from sysfs
callback or "delete module" path) will dereference "default" module
refcnt, which is incorrect behaviour.

Fixes: e4a70e3e7d ("iio: humidity: add support to hts221 rh/temp combo device")
Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220524181150.9240-6-ddrokosov@sberdevices.ru
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:49 +01:00
Dmitry Rokosov
d710359c0b iio:chemical:ccs811: rearrange iio trigger get and register
IIO trigger interface function iio_trigger_get() should be called after
iio_trigger_register() (or its devm analogue) strictly, because of
iio_trigger_get() acquires module refcnt based on the trigger->owner
pointer, which is initialized inside iio_trigger_register() to
THIS_MODULE.
If this call order is wrong, the next iio_trigger_put() (from sysfs
callback or "delete module" path) will dereference "default" module
refcnt, which is incorrect behaviour.

Fixes: f1f065d7ac ("iio: chemical: ccs811: Add support for data ready trigger")
Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220524181150.9240-5-ddrokosov@sberdevices.ru
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:49 +01:00
Dmitry Rokosov
9354c224c9 iio:accel:mxc4005: rearrange iio trigger get and register
IIO trigger interface function iio_trigger_get() should be called after
iio_trigger_register() (or its devm analogue) strictly, because of
iio_trigger_get() acquires module refcnt based on the trigger->owner
pointer, which is initialized inside iio_trigger_register() to
THIS_MODULE.
If this call order is wrong, the next iio_trigger_put() (from sysfs
callback or "delete module" path) will dereference "default" module
refcnt, which is incorrect behaviour.

Fixes: 47196620c8 ("iio: mxc4005: add data ready trigger for mxc4005")
Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220524181150.9240-4-ddrokosov@sberdevices.ru
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Dmitry Rokosov
ed302925d7 iio:accel:kxcjk-1013: rearrange iio trigger get and register
IIO trigger interface function iio_trigger_get() should be called after
iio_trigger_register() (or its devm analogue) strictly, because of
iio_trigger_get() acquires module refcnt based on the trigger->owner
pointer, which is initialized inside iio_trigger_register() to
THIS_MODULE.
If this call order is wrong, the next iio_trigger_put() (from sysfs
callback or "delete module" path) will dereference "default" module
refcnt, which is incorrect behaviour.

Fixes: c1288b8338 ("iio: accel: kxcjk-1013: Increment ref counter for indio_dev->trig")
Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220524181150.9240-3-ddrokosov@sberdevices.ru
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Dmitry Rokosov
e5f3205b04 iio:accel:bma180: rearrange iio trigger get and register
IIO trigger interface function iio_trigger_get() should be called after
iio_trigger_register() (or its devm analogue) strictly, because of
iio_trigger_get() acquires module refcnt based on the trigger->owner
pointer, which is initialized inside iio_trigger_register() to
THIS_MODULE.
If this call order is wrong, the next iio_trigger_put() (from sysfs
callback or "delete module" path) will dereference "default" module
refcnt, which is incorrect behaviour.

Fixes: 0668a4e4d2 ("iio: accel: bma180: Fix indio_dev->trig assignment")
Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220524181150.9240-2-ddrokosov@sberdevices.ru
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Linus Walleij
9decacd8b3 iio: afe: rescale: Fix boolean logic bug
When introducing support for processed channels I needed
to invert the expression:

  if (!iio_channel_has_info(schan, IIO_CHAN_INFO_RAW) ||
      !iio_channel_has_info(schan, IIO_CHAN_INFO_SCALE))
        dev_err(dev, "source channel does not support raw/scale\n");

To the inverse, meaning detect when we can usse raw+scale
rather than when we can not. This was the result:

  if (iio_channel_has_info(schan, IIO_CHAN_INFO_RAW) ||
      iio_channel_has_info(schan, IIO_CHAN_INFO_SCALE))
       dev_info(dev, "using raw+scale source channel\n");

Ooops. Spot the error. Yep old George Boole came up and bit me.
That should be an &&.

The current code "mostly works" because we have not run into
systems supporting only raw but not scale or only scale but not
raw, and I doubt there are few using the rescaler on anything
such, but let's fix the logic.

Cc: Liam Beguin <liambeguin@gmail.com>
Cc: stable@vger.kernel.org
Fixes: 53ebee9499 ("iio: afe: iio-rescale: Support processed channels")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Liam Beguin <liambeguin@gmail.com>
Acked-by: Peter Rosin <peda@axentia.se>
Link: https://lore.kernel.org/r/20220524075448.140238-1-linus.walleij@linaro.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Miaoqian Lin
8a2b6b5687 iio: adc: aspeed: Fix refcount leak in aspeed_adc_set_trim_data
of_find_node_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.

Fixes: d0a4c17b40 ("iio: adc: aspeed: Get and set trimming data.")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220516075206.34580-1-linmq006@gmail.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Yannick Brosseau
99bded02da iio: adc: stm32: Fix IRQs on STM32F4 by removing custom spurious IRQs message
The check for spurious IRQs introduced in 695e2f5c28 assumed that the bits
in the control and status registers are aligned. This is true for the H7 and MP1
version, but not the F4. The interrupt was then never handled on the F4.

Instead of increasing the complexity of the comparison and check each bit specifically,
we remove this check completely and rely on the generic handler for spurious IRQs.

Fixes: 695e2f5c28 ("iio: adc: stm32-adc: fix a regression when using dma and irq")
Signed-off-by: Yannick Brosseau <yannick.brosseau@gmail.com>
Reviewed-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/20220516203939.3498673-3-yannick.brosseau@gmail.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Yannick Brosseau
d2214cca4d iio: adc: stm32: Fix ADCs iteration in irq handler
The irq handler was only checking the mask for the first ADCs in the case of the
F4 and H7 generation, since it was iterating up to the num_irq value. This patch add
the maximum number of ADC in the common register, which map to the number of entries of
eoc_msk and ovr_msk in stm32_adc_common_regs. This allow the handler to check all ADCs in
that module.

Tested on a STM32F429NIH6.

Fixes: 695e2f5c28 ("iio: adc: stm32-adc: fix a regression when using dma and irq")
Signed-off-by: Yannick Brosseau <yannick.brosseau@gmail.com>
Reviewed-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/20220516203939.3498673-2-yannick.brosseau@gmail.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Jialin Zhang
47dcf770ab iio: adc: ti-ads131e08: add missing fwnode_handle_put() in ads131e08_alloc_channels()
fwnode_handle_put() should be used when terminating
device_for_each_child_node() iteration with break or return to prevent
stale device node references from being left behind.

Fixes: d935eddd27 ("iio: adc: Add driver for Texas Instruments ADS131E0x ADC family")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
Link: https://lore.kernel.org/r/20220517033020.2033324-1-zhangjialin11@huawei.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Jialin Zhang
d836715f58 iio: adc: rzg2l_adc: add missing fwnode_handle_put() in rzg2l_adc_parse_properties()
fwnode_handle_put() should be used when terminating
device_for_each_child_node() iteration with break or return to prevent
stale device node references from being left behind.

Fixes: d484c21bac ("iio: adc: Add driver for Renesas RZ/G2L A/D converter")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220517033526.2035735-1-zhangjialin11@huawei.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Vincent Whitchurch
78601726d4 iio: trigger: sysfs: fix use-after-free on remove
Ensure that the irq_work has completed before the trigger is freed.

 ==================================================================
 BUG: KASAN: use-after-free in irq_work_run_list
 Read of size 8 at addr 0000000064702248 by task python3/25

 Call Trace:
  irq_work_run_list
  irq_work_tick
  update_process_times
  tick_sched_handle
  tick_sched_timer
  __hrtimer_run_queues
  hrtimer_interrupt

 Allocated by task 25:
  kmem_cache_alloc_trace
  iio_sysfs_trig_add
  dev_attr_store
  sysfs_kf_write
  kernfs_fop_write_iter
  new_sync_write
  vfs_write
  ksys_write
  sys_write

 Freed by task 25:
  kfree
  iio_sysfs_trig_remove
  dev_attr_store
  sysfs_kf_write
  kernfs_fop_write_iter
  new_sync_write
  vfs_write
  ksys_write
  sys_write

 ==================================================================

Fixes: f38bc926d0 ("staging:iio:sysfs-trigger: Use irq_work to properly active trigger")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20220519091925.1053897-1-vincent.whitchurch@axis.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Zheyu Ma
b2f5ad9764 iio: gyro: mpu3050: Fix the error handling in mpu3050_power_up()
The driver should disable regulators when fails at regmap_update_bits().

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220510092431.1711284-1-zheyuma97@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Antoniu Miclaus
6f6bd75919 iio: freq: admv1014: Fix warning about dubious x & !y and improve readability
The warning comes from __BF_FIELD_CHECK()
specifically

BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ?		\
		 ~((_mask) >> __bf_shf(_mask)) & (_val) : 0, \
		 _pfx "value too large for the field"); \

The code was using !(enum value) which is not particularly easy to follow
so replace that with explicit matching and use of ? 0 : 1; or ? 1 : 0;
to improve readability.

Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com>
Link: https://lore.kernel.org/r/20220511090006.90502-1-antoniu.miclaus@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-06-19 17:22:48 +01:00
Maya Matuszczyk
be33d52ef5 drm: panel-orientation-quirks: Add quirk for Aya Neo Next
The device is identified by "NEXT" in board name, however there are
different versions of it, "Next Advance" and "Next Pro", that have
different DMI board names.
Due to a production error a batch or two have their board names prefixed
by "AYANEO", this makes it 6 different DMI board names. To save some
space in final kernel image DMI_MATCH is used instead of
DMI_EXACT_MATCH.

Signed-off-by: Maya Matuszczyk <maccraft123mc@gmail.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220619111952.8487-1-maccraft123mc@gmail.com
2022-06-19 17:37:16 +02:00
Linus Torvalds
05c6ca8512 Merge tag 'x86-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:

 - Make RESERVE_BRK() work again with older binutils. The recent
   'simplification' broke that.

 - Make early #VE handling increment RIP when successful.

 - Make the #VE code consistent vs. the RIP adjustments and add
   comments.

 - Handle load_unaligned_zeropad() across page boundaries correctly in
   #VE when the second page is shared.

* tag 'x86-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  x86/tdx: Clarify RIP adjustments in #VE handler
  x86/tdx: Fix early #VE handling
  x86/mm: Fix RESERVE_BRK() for older binutils
2022-06-19 09:58:28 -05:00
Linus Torvalds
5d770f11a1 Merge tag 'objtool-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull build tooling updates from Thomas Gleixner:

 - Remove obsolete CONFIG_X86_SMAP reference from objtool

 - Fix overlapping text section failures in faddr2line for real

 - Remove OBJECT_FILES_NON_STANDARD usage from x86 ftrace and replace it
   with finegrained annotations so objtool can validate that code
   correctly.

* tag 'objtool-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ftrace: Remove OBJECT_FILES_NON_STANDARD usage
  faddr2line: Fix overlapping text section failures, the sequel
  objtool: Fix obsolete reference to CONFIG_X86_SMAP
2022-06-19 09:54:16 -05:00
Linus Torvalds
727c3991df Merge tag 'sched-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Thomas Gleixner:
 "A single scheduler fix plugging a race between sched_setscheduler()
  and balance_push().

  sched_setscheduler() spliced the balance callbacks accross a lock
  break which makes it possible for an interleaving schedule() to
  observe an empty list"

* tag 'sched-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix balance_push() vs __sched_setscheduler()
2022-06-19 09:51:00 -05:00
Linus Torvalds
4afb65156a Merge tag 'locking-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull lockdep fix from Thomas Gleixner:
 "A RT fix for lockdep.

  lockdep invokes prandom_u32() to create cookies. This worked until
  prandom_u32() was switched to the real random generator, which takes a
  spinlock for extraction, which does not work on RT when invoked from
  atomic contexts.

  lockdep has no requirement for real random numbers and it turns out
  sched_clock() is good enough to create the cookie. That works
  everywhere and is faster"

* tag 'locking-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/lockdep: Use sched_clock() for random numbers
2022-06-19 09:47:41 -05:00
Linus Torvalds
36da9f5fb6 Merge tag 'irq-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A set of interrupt subsystem updates:

  Core:

   - Ensure runtime power management for chained interrupts

  Drivers:

   - A collection of OF node refcount fixes

   - Unbreak MIPS uniprocessor builds

   - Fix xilinx interrupt controller Kconfig dependencies

   - Add a missing compatible string to the Uniphier driver"

* tag 'irq-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/loongson-liointc: Use architecture register to get coreid
  irqchip/uniphier-aidet: Add compatible string for NX1 SoC
  dt-bindings: interrupt-controller/uniphier-aidet: Add bindings for NX1 SoC
  irqchip/realtek-rtl: Fix refcount leak in map_interrupts
  irqchip/gic-v3: Fix refcount leak in gic_populate_ppi_partitions
  irqchip/gic-v3: Fix error handling in gic_populate_ppi_partitions
  irqchip/apple-aic: Fix refcount leak in aic_of_ic_init
  irqchip/apple-aic: Fix refcount leak in build_fiq_affinity
  irqchip/gic/realview: Fix refcount leak in realview_gic_of_init
  irqchip/xilinx: Remove microblaze+zynq dependency
  genirq: PM: Use runtime PM for chained interrupts
2022-06-19 09:45:16 -05:00
Arnaldo Carvalho de Melo
140cd9ec8f tools headers UAPI: Sync linux/prctl.h with the kernel sources
To pick the changes in:

  9e4ab6c891 ("arm64/sme: Implement vector length configuration prctl()s")

That don't result in any changes in tooling:

  $ tools/perf/trace/beauty/prctl_option.sh > before
  $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
  $ tools/perf/trace/beauty/prctl_option.sh > after
  $ diff -u before after
  $

Just silences this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
  diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Link: http://lore.kernel.org/lkml/Yq81we+XFOqlBWyu@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 11:42:25 -03:00
Linus Torvalds
bc94632ceb Merge tag 'char-misc-5.19-rc3-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes for real from Greg KH:
 "Let's tag the proper branch this time...

  Here are some small char/misc driver fixes for 5.19-rc3 that resolve
  some reported issues.

  They include:

   - mei driver fixes

   - comedi driver fix

   - rtsx build warning fix

   - fsl-mc-bus driver fix

  All of these have been in linux-next for a while with no reported
  issues"

This is what the merge in commit f0ec9c65a8 _should_ have merged, but
Greg fat-fingered the pull request and I got some small changes from
linux-next instead there. Credit to Nathan Chancellor for eagle-eyes.

Link: https://lore.kernel.org/all/Yqywy+Md2AfGDu8v@dev-arch.thelio-3990X/

* tag 'char-misc-5.19-rc3-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  bus: fsl-mc-bus: fix KASAN use-after-free in fsl_mc_bus_remove()
  mei: me: add raptor lake point S DID
  mei: hbm: drop capability response on early shutdown
  mei: me: set internal pg flag to off on hardware reset
  misc: rtsx: Fix clang -Wsometimes-uninitialized in rts5261_init_from_hw()
  comedi: vmk80xx: fix expression for tx buffer size
2022-06-19 09:37:29 -05:00
Linus Torvalds
ee4eb6eeaf Merge tag 'i2c-for-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "MAINTAINERS rectifications and a few minor driver fixes"

* tag 'i2c-for-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: mediatek: Fix an error handling path in mtk_i2c_probe()
  i2c: designware: Use standard optional ref clock implementation
  MAINTAINERS: core DT include belongs to core
  MAINTAINERS: add include/dt-bindings/i2c to I2C SUBSYSTEM HOST DRIVERS
  i2c: npcm7xx: Add check for platform_driver_register
  MAINTAINERS: Update Synopsys DesignWare I2C to Supported
2022-06-19 09:35:09 -05:00
Linus Torvalds
063232b6c4 Merge tag 'xfs-5.19-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
 "There's not a whole lot this time around (I'm still on vacation) but
  here are some important fixes for new features merged in -rc1:

   - Fix a bug where inode flag changes would accidentally drop nrext64

   - Fix a race condition when toggling LARP mode"

* tag 'xfs-5.19-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: preserve DIFLAG2_NREXT64 when setting other inode attributes
  xfs: fix variable state usage
  xfs: fix TOCTOU race involving the new logged xattrs control knob
2022-06-19 09:24:49 -05:00
Ian Rogers
c788ef61ef perf metrics: Ensure at least 1 id per metric
We may have no events for a metric evaluated to a constant. In such a
case ensure a tool event is at least evaluated for metric parsing and
displaying.

Fixes: 8586d2744f ("perf metrics: Don't add all tool events for sharing")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220618013957.999321-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 11:24:05 -03:00
Arnaldo Carvalho de Melo
37402d5d06 tools headers arm64: Sync arm64's cputype.h with the kernel sources
To get the changes in:

  cae889302e ("KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/lkml/Yq8w7p4omYKNwOij@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 11:23:04 -03:00
Arnaldo Carvalho de Melo
2e323f360a tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
To pick the changes in:

  f1a9761fbb ("KVM: x86: Allow userspace to opt out of hypercall patching")

That just rebuilds kvm-stat.c on x86, no change in functionality.

This silences these perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
  diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

Cc: Oliver Upton <oupton@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/lkml/Yq8qgiMwRcl9ds+f@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 11:22:59 -03:00
Leo Yan
51ba539f5b perf arm-spe: Don't set data source if it's not a memory operation
Except for memory load and store operations, ARM SPE records also can
support other operation types, bug when set the data source field the
current code assumes a record is a either load operation or store
operation, this leads to wrongly synthesize memory samples.

This patch strictly checks the record operation type, it only sets data
source only for the operation types ARM_SPE_LD and ARM_SPE_ST,
otherwise, returns zero for data source.  Therefore, we can synthesize
memory samples only when data source is a non-zero value, the function
arm_spe__is_memory_event() is useless and removed.

Fixes: e55ed3423c ("perf arm-spe: Synthesize memory event")
Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: alisaidi@amazon.com
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lore.kernel.org/lkml/20220517020326.18580-5-alisaidi@amazon.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 10:41:43 -03:00
Ian Rogers
e5287e6dd3 perf expr: Allow exponents on floating point values
Pass the optional exponent component through to strtod that already
supports it. We already have exponents in ScaleUnit and so this adds
uniformity.

Reported-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Reviewed-By: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20220527020653.4160884-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 10:41:43 -03:00
Athira Rajeev
b236371421 perf test topology: Use !strncmp(right platform) to fix guest PPC comparision check
commit cfd7092c31 ("perf test session topology: Fix test to skip
the test in guest environment") added check to skip the testcase if the
socket_id can't be fetched from topology info.

But the condition check uses strncmp which should be changed to !strncmp
and to correctly match platform.

Fix this condition check.

Fixes: cfd7092c31 ("perf test session topology: Fix test to skip the test in guest environment")
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20220610135939.63361-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 10:41:43 -03:00
Michael Petlan
72dcae8efd perf test: Record only user callchains on the "Check Arm64 callgraphs are complete in fp mode" test
The testcase 'Check Arm64 callgraphs are complete in fp mode' wants to
see the following output:

    610 leaf
    62f parent
    648 main

However, without excluding kernel callchains, the output might look like:

	ffffc2ff40ef1b5c arch_local_irq_enable
	ffffc2ff419d032c __schedule
	ffffc2ff419d06c0 schedule
	ffffc2ff40e4da30 do_notify_resume
	ffffc2ff40e421b0 work_pending
	             610 leaf
	             62f parent
	             648 main

Adding '--user-callchains' leaves only the wanted symbols in the chain.

Fixes: cd6382d827 ("perf test arm64: Test unwinding using fame-pointer (fp) mode")
Suggested-by: German Gomez <german.gomez@arm.com>
Reviewed-by: German Gomez <german.gomez@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220614105207.26223-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 10:41:43 -03:00
Arnaldo Carvalho de Melo
67e7d77158 perf beauty: Update copy of linux/socket.h with the kernel sources
To pick the changes in:

  f94fd25cb0 ("tcp: pass back data left in socket after receive")

That don't result in any changes in the tables generated from that
header.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/all/YqORj9d58AiGYl8b@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 10:41:43 -03:00
Ian Rogers
cc2145526c perf test: Fix variable length array undefined behavior in bp_account
Fix:

  tests/bp_account.c:154:9: runtime error: variable length array bound evaluates to non-positive value 0

by switching from a variable length to an allocated array.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220610180247.444798-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 10:41:43 -03:00
Ian Rogers
94725994cf libperf evsel: Open shouldn't leak fd on failure
If perf_event_open() fails the fd is opened but it is only freed by
closing (not by delete).

Typically when an open fails you don't call close and so this results in
a memory leak. To avoid this, add a close when open fails.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-By: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220609052355.1300162-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 10:41:43 -03:00
Thomas Richter
ec906102e5 perf test: Fix "perf stat CSV output linter" test on s390
perf test -F 83 ("perf stat CSV output linter") fails on s390.

Reason is the wrong number of fields for certain CPU core/die/socket
related output.

On x84_64 the output of command:

  # ./perf stat -x, -A -a --no-merge true
  CPU0,1.50,msec,cpu-clock,1502781,100.00,1.052,CPUs utilized
  CPU1,1.48,msec,cpu-clock,1476113,100.00,1.034,CPUs utilized
  ...

results in 8 fields with 7 comma separators.

On s390 the output of command:

  #  ./perf stat -x, -A -a --no-merge -- true
  0.95,msec,cpu-clock,949800,100.00,1.060,CPUs utilized
  ...

results in 7 fields with 6 comma separators. Therefore this tests
fails on s390. Similar issues exist for per-die and per-socket output
which is not supported on s390.

I have rewritten the python program to count commas in each output line
into a bash function to achieve the same result. I hope this makes it a
bit easier.

Output before:

  # ./perf test -F 83
  83: perf stat CSV output linter  :
  Checking CSV output: no args [Success]
  Checking CSV output: system wide [Success]
  Checking CSV output: system wide Checking CSV output: \
	  system wide no aggregation 6.92,msec,cpu-clock,\
	  6918131,100.00,6.972,CPUs utilized
  ...
  RuntimeError: wrong number of fields. expected 7 in \
	  6.92,msec,cpu-clock,6918131,100.00,6.972,CPUs utilized

  FAILED!
  #

Output after:

  # ./perf test -F 83
  83: perf stat CSV output linter             :
  Checking CSV output: no args [Success]
  Checking CSV output: system wide [Success]
  Checking CSV output: system wide Checking CSV output:\
	  system wide no aggregation [Success]
  Checking CSV output: interval [Success]
  Checking CSV output: event [Success]
  Checking CSV output: per core [Success]
  Checking CSV output: per thread [Success]
  Checking CSV output: per die [Success]
  Checking CSV output: per node [Success]
  Checking CSV output: per socket [Success]
  Ok
  #

Committer notes:

Continues to work on x86_64

  $ perf test lint
   89: perf stat CSV output linter                                     : Ok
  $ perf test -v lint
  Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
   89: perf stat CSV output linter                                     :
  --- start ---
  test child forked, pid 53133
  Checking CSV output: no args [Success]
  Checking CSV output: system wide [Skip] paranoid and not root
  Checking CSV output: system wide [Skip] paranoid and not root
  Checking CSV output: interval [Success]
  Checking CSV output: event [Success]
  Checking CSV output: per core [Skip] paranoid and not root
  Checking CSV output: per thread [Skip] paranoid and not root
  Checking CSV output: per die [Skip] paranoid and not root
  Checking CSV output: per node [Skip] paranoid and not root
  Checking CSV output: per socket [Skip] paranoid and not root
  test child finished with 0
  ---- end ----
  perf stat CSV output linter: Ok
  $

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: linux390-list@tuxmaker.boeblingen.de.ibm.com
Link: https://lore.kernel.org/r/20220603113034.2009728-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 10:41:43 -03:00
Ian Rogers
1d98cdf7fa perf unwind: Fix uninitialized variable
The 'ret' variable may be uninitialized on error goto paths.

Fixes: dc2cf4ca86 ("perf unwind: Fix segbase for ld.lld linked objects")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Fangrui Song <maskray@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
Cc: Fangrui Song <maskray@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: llvm@lists.linux.dev
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Ullrich <sebasti@nullri.ch>
Link: https://lore.kernel.org/r/20220607000851.39798-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 10:41:43 -03:00
Christophe Leroy
ca5dabcff1 powerpc/prom_init: Fix build failure with GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and KASAN
When CONFIG_KASAN is selected, we expect prom_init to use __memset()
because it is too early to use memset().

But with CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL, the compiler adds calls
to memset() to clear objects on stack, hence the following failure:

	  PROMCHK arch/powerpc/kernel/prom_init_check
	Error: External symbol 'memset' referenced from prom_init.c
	make[2]: *** [arch/powerpc/kernel/Makefile:204 : arch/powerpc/kernel/prom_init_check] Erreur 1

prom_find_machine_type() is called from prom_init() and is called only
once, so lets put compat[] in BSS instead of stack to avoid that.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3802811f7cf94f730be44688539c01bba3a3b5c0.1654875808.git.christophe.leroy@csgroup.eu
2022-06-19 21:57:56 +10:00
Oleksij Rempel
9926de7315 net: phy: at803x: fix NULL pointer dereference on AR9331 PHY
Latest kernel will explode on the PHY interrupt config, since it depends
now on allocated priv. So, run probe to allocate priv to fix it.

 ar9331_switch ethernet.1:10 lan0 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:00] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13)
 CPU 0 Unable to handle kernel paging request at virtual address 0000000a, epc == 8050e8a8, ra == 80504b34
         ...
 Call Trace:
 [<8050e8a8>] at803x_config_intr+0x5c/0xd0
 [<80504b34>] phy_request_interrupt+0xa8/0xd0
 [<8050289c>] phylink_bringup_phy+0x2d8/0x3ac
 [<80502b68>] phylink_fwnode_phy_connect+0x118/0x130
 [<8074d8ec>] dsa_slave_create+0x270/0x420
 [<80743b04>] dsa_port_setup+0x12c/0x148
 [<8074580c>] dsa_register_switch+0xaf0/0xcc0
 [<80511344>] ar9331_sw_probe+0x370/0x388
 [<8050cb78>] mdio_probe+0x44/0x70
 [<804df300>] really_probe+0x200/0x424
 [<804df7b4>] __driver_probe_device+0x290/0x298
 [<804df810>] driver_probe_device+0x54/0xe4
 [<804dfd50>] __device_attach_driver+0xe4/0x130
 [<804dcb00>] bus_for_each_drv+0xb4/0xd8
 [<804dfac4>] __device_attach+0x104/0x1a4
 [<804ddd24>] bus_probe_device+0x48/0xc4
 [<804deb44>] deferred_probe_work_func+0xf0/0x10c
 [<800a0ffc>] process_one_work+0x314/0x4d4
 [<800a17fc>] worker_thread+0x2a4/0x354
 [<800a9a54>] kthread+0x134/0x13c
 [<8006306c>] ret_from_kernel_thread+0x14/0x1c

Same Issue would affect some other PHYs (QCA8081, QCA9561), so fix it
too.

Fixes: 3265f42188 ("net: phy: at803x: add fiber support")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-19 11:52:23 +01:00
Wentao_Liang
8fc74d1863 drivers/net/ethernet/neterion/vxge: Fix a use-after-free bug in vxge-main.c
The pointer vdev points to a memory region adjacent to a net_device
structure ndev, which is a field of hldev. At line 4740, the invocation
to vxge_device_unregister unregisters device hldev, and it also releases
the memory region pointed by vdev->bar0. At line 4743, the freed memory
region is referenced (i.e., iounmap(vdev->bar0)), resulting in a
use-after-free vulnerability. We can fix the bug by calling iounmap
before vxge_device_unregister.

4721.      static void vxge_remove(struct pci_dev *pdev)
4722.      {
4723.             struct __vxge_hw_device *hldev;
4724.             struct vxgedev *vdev;
…
4731.             vdev = netdev_priv(hldev->ndev);
…
4740.             vxge_device_unregister(hldev);
4741.             /* Do not call pci_disable_sriov here, as it
						will break child devices */
4742.             vxge_hw_device_terminate(hldev);
4743.             iounmap(vdev->bar0);
…
4749              vxge_debug_init(vdev->level_trace, "%s:%d
								Device unregistered",
4750                            __func__, __LINE__);
4751              vxge_debug_entryexit(vdev->level_trace, "%s:%d
								Exiting...", __func__,
4752                          __LINE__);
4753.      }

This is the screenshot when the vulnerability is triggered by using
KASAN. We can see that there is a use-after-free reported by KASAN.

/***************************start**************************/

root@kernel:~# echo 1 > /sys/bus/pci/devices/0000:00:03.0/remove
[  178.296316] vxge_remove
[  182.057081]
 ==================================================================
[  182.057548] BUG: KASAN: use-after-free in vxge_remove+0xe0/0x15c
[  182.057760] Read of size 8 at addr ffff888006c76598 by task bash/119
[  182.057983]
[  182.058747] CPU: 0 PID: 119 Comm: bash Not tainted 5.18.0 #5
[  182.058919] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[  182.059463] Call Trace:
[  182.059726]  <TASK>
[  182.060017]  dump_stack_lvl+0x34/0x44
[  182.060316]  print_report.cold+0xb2/0x6b7
[  182.060401]  ? kfree+0x89/0x290
[  182.060478]  ? vxge_remove+0xe0/0x15c
[  182.060545]  kasan_report+0xa9/0x120
[  182.060629]  ? vxge_remove+0xe0/0x15c
[  182.060706]  vxge_remove+0xe0/0x15c
[  182.060793]  pci_device_remove+0x5d/0xe0
[  182.060968]  device_release_driver_internal+0xf1/0x180
[  182.061063]  pci_stop_bus_device+0xae/0xe0
[  182.061150]  pci_stop_and_remove_bus_device_locked+0x11/0x20
[  182.061236]  remove_store+0xc6/0xe0
[  182.061297]  ? subordinate_bus_number_show+0xc0/0xc0
[  182.061359]  ? __mutex_lock_slowpath+0x10/0x10
[  182.061438]  ? sysfs_kf_write+0x6d/0xa0
[  182.061525]  kernfs_fop_write_iter+0x1b0/0x260
[  182.061610]  ? sysfs_kf_bin_read+0xf0/0xf0
[  182.061695]  new_sync_write+0x209/0x310
[  182.061789]  ? new_sync_read+0x310/0x310
[  182.061865]  ? cgroup_rstat_updated+0x5c/0x170
[  182.061937]  ? preempt_count_sub+0xf/0xb0
[  182.061995]  ? pick_next_entity+0x13a/0x220
[  182.062063]  ? __inode_security_revalidate+0x44/0x80
[  182.062155]  ? security_file_permission+0x46/0x2a0
[  182.062230]  vfs_write+0x33f/0x3e0
[  182.062303]  ksys_write+0xb4/0x150
[  182.062369]  ? __ia32_sys_read+0x40/0x40
[  182.062451]  do_syscall_64+0x3b/0x90
[  182.062531]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  182.062894] RIP: 0033:0x7f3f37d17274
[  182.063558] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f
80 00 00 00 00 48 8d 05 89 54 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f
05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
[  182.063797] RSP: 002b:00007ffd5ba9e178 EFLAGS: 00000246
ORIG_RAX: 0000000000000001
[  182.064117] RAX: ffffffffffffffda RBX: 0000000000000002
RCX: 00007f3f37d17274
[  182.064219] RDX: 0000000000000002 RSI: 000055bbec327180
RDI: 0000000000000001
[  182.064315] RBP: 000055bbec327180 R08: 000000000000000a
R09: 00007f3f37de7cf0
[  182.064414] R10: 000000000000000a R11: 0000000000000246
R12: 00007f3f37de8760
[  182.064513] R13: 0000000000000002 R14: 00007f3f37de3760
R15: 0000000000000002
[  182.064691]  </TASK>
[  182.064916]
[  182.065224] The buggy address belongs to the physical page:
[  182.065804] page:00000000ef31e4f4 refcount:0 mapcount:0
mapping:0000000000000000 index:0x0 pfn:0x6c76
[  182.067419] flags: 0x100000000000000(node=0|zone=1)
[  182.068997] raw: 0100000000000000 0000000000000000
ffffea00001b1d88 0000000000000000
[  182.069118] raw: 0000000000000000 0000000000000000
00000000ffffffff 0000000000000000
[  182.069294] page dumped because: kasan: bad access detected
[  182.069331]
[  182.069360] Memory state around the buggy address:
[  182.070006]  ffff888006c76480: ff ff ff ff ff ff ff ff ff ff ff
 ff ff ff ff ff
[  182.070136]  ffff888006c76500: ff ff ff ff ff ff ff ff ff ff ff
 ff ff ff ff ff
[  182.070230] >ffff888006c76580: ff ff ff ff ff ff ff ff ff ff ff
 ff ff ff ff ff
[  182.070305]                             ^
[  182.070456]  ffff888006c76600: ff ff ff ff ff ff ff ff ff ff ff
 ff ff ff ff ff
[  182.070505]  ffff888006c76680: ff ff ff ff ff ff ff ff ff ff ff
 ff ff ff ff ff
[  182.070606]
==================================================================
[  182.071374] Disabling lock debugging due to kernel taint

/*****************************end*****************************/

After fixing the bug as done in the patch, we can find KASAN do not report
 the bug and the device(00:03.0) has been successfully removed.

/*****************************start***************************/

root@kernel:~# echo 1 > /sys/bus/pci/devices/0000:00:03.0/remove
root@kernel:~#

/******************************end****************************/

Signed-off-by: Wentao_Liang <Wentao_Liang_g@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-19 11:47:28 +01:00
Linus Torvalds
354c6e071b Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
 "Fix a variety of bugs, many of which were found by folks using fuzzing
  or error injection.

  Also fix up how test_dummy_encryption mount option is handled for the
  new mount API.

  Finally, fix/cleanup a number of comments and ext4 Documentation
  files"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix a doubled word "need" in a comment
  ext4: add reserved GDT blocks check
  ext4: make variable "count" signed
  ext4: correct the judgment of BUG in ext4_mb_normalize_request
  ext4: fix bug_on ext4_mb_use_inode_pa
  ext4: fix up test_dummy_encryption handling for new mount API
  ext4: use kmemdup() to replace kmalloc + memcpy
  ext4: fix super block checksum incorrect after mount
  ext4: improve write performance with disabled delalloc
  ext4: fix warning when submitting superblock in ext4_commit_super()
  ext4, doc: remove unnecessary escaping
  ext4: fix incorrect comment in ext4_bio_write_page()
  fs: fix jbd2_journal_try_to_free_buffers() kernel-doc comment
2022-06-18 21:51:12 -05:00
Linus Torvalds
ace2045ed5 Merge tag '5.19-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client fixes from Steve French:
 "Two cifs debugging improvements - one found to deal with debugging a
  multichannel problem and one for a recent fallocate issue

  This does include the two larger multichannel reconnect (dynamically
  adjusting interfaces on reconnect) patches, because we recently found
  an additional problem with multichannel to one server type that I want
  to include at the same time"

* tag '5.19-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: when a channel is not found for server, log its connection id
  smb3: add trace point for SMB2_set_eof
2022-06-18 21:44:44 -05:00
Xiang wangx
1f3ddff375 ext4: fix a doubled word "need" in a comment
Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Link: https://lore.kernel.org/r/20220605091503.12513-1-wangxiang@cdjrlc.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-18 19:36:20 -04:00
Zhang Yi
b55c3cd102 ext4: add reserved GDT blocks check
We capture a NULL pointer issue when resizing a corrupt ext4 image which
is freshly clear resize_inode feature (not run e2fsck). It could be
simply reproduced by following steps. The problem is because of the
resize_inode feature was cleared, and it will convert the filesystem to
meta_bg mode in ext4_resize_fs(), but the es->s_reserved_gdt_blocks was
not reduced to zero, so could we mistakenly call reserve_backup_gdb()
and passing an uninitialized resize_inode to it when adding new group
descriptors.

 mkfs.ext4 /dev/sda 3G
 tune2fs -O ^resize_inode /dev/sda #forget to run requested e2fsck
 mount /dev/sda /mnt
 resize2fs /dev/sda 8G

 ========
 BUG: kernel NULL pointer dereference, address: 0000000000000028
 CPU: 19 PID: 3243 Comm: resize2fs Not tainted 5.18.0-rc7-00001-gfde086c5ebfd #748
 ...
 RIP: 0010:ext4_flex_group_add+0xe08/0x2570
 ...
 Call Trace:
  <TASK>
  ext4_resize_fs+0xbec/0x1660
  __ext4_ioctl+0x1749/0x24e0
  ext4_ioctl+0x12/0x20
  __x64_sys_ioctl+0xa6/0x110
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f2dd739617b
 ========

The fix is simple, add a check in ext4_resize_begin() to make sure that
the es->s_reserved_gdt_blocks is zero when the resize_inode feature is
disabled.

Cc: stable@kernel.org
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220601092717.763694-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-18 19:36:08 -04:00
Ding Xiang
bc75a6eb85 ext4: make variable "count" signed
Since dx_make_map() may return -EFSCORRUPTED now, so change "count" to
be a signed integer so we can correctly check for an error code returned
by dx_make_map().

Fixes: 46c116b920 ("ext4: verify dir block before splitting it")
Cc: stable@kernel.org
Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20220530100047.537598-1-dingxiang@cmss.chinamobile.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-18 19:35:57 -04:00
Baokun Li
cf4ff938b4 ext4: correct the judgment of BUG in ext4_mb_normalize_request
ext4_mb_normalize_request() can move logical start of allocated blocks
to reduce fragmentation and better utilize preallocation. However logical
block requested as a start of allocation (ac->ac_o_ex.fe_logical) should
always be covered by allocated blocks so we should check that by
modifying and to or in the assertion.

Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20220528110017.354175-3-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-18 19:35:57 -04:00
Baokun Li
a08f789d2a ext4: fix bug_on ext4_mb_use_inode_pa
Hulk Robot reported a BUG_ON:
==================================================================
kernel BUG at fs/ext4/mballoc.c:3211!
[...]
RIP: 0010:ext4_mb_mark_diskspace_used.cold+0x85/0x136f
[...]
Call Trace:
 ext4_mb_new_blocks+0x9df/0x5d30
 ext4_ext_map_blocks+0x1803/0x4d80
 ext4_map_blocks+0x3a4/0x1a10
 ext4_writepages+0x126d/0x2c30
 do_writepages+0x7f/0x1b0
 __filemap_fdatawrite_range+0x285/0x3b0
 file_write_and_wait_range+0xb1/0x140
 ext4_sync_file+0x1aa/0xca0
 vfs_fsync_range+0xfb/0x260
 do_fsync+0x48/0xa0
[...]
==================================================================

Above issue may happen as follows:
-------------------------------------
do_fsync
 vfs_fsync_range
  ext4_sync_file
   file_write_and_wait_range
    __filemap_fdatawrite_range
     do_writepages
      ext4_writepages
       mpage_map_and_submit_extent
        mpage_map_one_extent
         ext4_map_blocks
          ext4_mb_new_blocks
           ext4_mb_normalize_request
            >>> start + size <= ac->ac_o_ex.fe_logical
           ext4_mb_regular_allocator
            ext4_mb_simple_scan_group
             ext4_mb_use_best_found
              ext4_mb_new_preallocation
               ext4_mb_new_inode_pa
                ext4_mb_use_inode_pa
                 >>> set ac->ac_b_ex.fe_len <= 0
           ext4_mb_mark_diskspace_used
            >>> BUG_ON(ac->ac_b_ex.fe_len <= 0);

we can easily reproduce this problem with the following commands:
	`fallocate -l100M disk`
	`mkfs.ext4 -b 1024 -g 256 disk`
	`mount disk /mnt`
	`fsstress -d /mnt -l 0 -n 1000 -p 1`

The size must be smaller than or equal to EXT4_BLOCKS_PER_GROUP.
Therefore, "start + size <= ac->ac_o_ex.fe_logical" may occur
when the size is truncated. So start should be the start position of
the group where ac_o_ex.fe_logical is located after alignment.
In addition, when the value of fe_logical or EXT4_BLOCKS_PER_GROUP
is very large, the value calculated by start_off is more accurate.

Cc: stable@kernel.org
Fixes: cd648b8a8f ("ext4: trim allocation requests to group size")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20220528110017.354175-2-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-18 19:35:43 -04:00
Eric Biggers
85456054e1 ext4: fix up test_dummy_encryption handling for new mount API
Since ext4 was converted to the new mount API, the test_dummy_encryption
mount option isn't being handled entirely correctly, because the needed
fscrypt_set_test_dummy_encryption() helper function combines
parsing/checking/applying into one function.  That doesn't work well
with the new mount API, which split these into separate steps.

This was sort of okay anyway, due to the parsing logic that was copied
from fscrypt_set_test_dummy_encryption() into ext4_parse_param(),
combined with an additional check in ext4_check_test_dummy_encryption().
However, these overlooked the case of changing the value of
test_dummy_encryption on remount, which isn't allowed but ext4 wasn't
detecting until ext4_apply_options() when it's too late to fail.
Another bug is that if test_dummy_encryption was specified multiple
times with an argument, memory was leaked.

Fix this up properly by using the new helper functions that allow
splitting up the parse/check/apply steps for test_dummy_encryption.

Fixes: cebe85d570 ("ext4: switch to the new mount api")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20220526040412.173025-1-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-18 19:35:43 -04:00
Shuqi Zhang
4efd9f0d12 ext4: use kmemdup() to replace kmalloc + memcpy
Replace kmalloc + memcpy with kmemdup()

Signed-off-by: Shuqi Zhang <zhangshuqi3@huawei.com>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20220525030120.803330-1-zhangshuqi3@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-18 19:35:43 -04:00
Ye Bin
9b6641dd95 ext4: fix super block checksum incorrect after mount
We got issue as follows:
[home]# mount  /dev/sda  test
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
[home]# dmesg
EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (sda): Errors on filesystem, clearing orphan list.
EXT4-fs (sda): recovery complete
EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none.
[home]# debugfs /dev/sda
debugfs 1.46.5 (30-Dec-2021)
Checksum errors in superblock!  Retrying...

Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update
super block checksum.

To solve above issue, defer update super block checksum after
ext4_orphan_cleanup.

Signed-off-by: Ye Bin <yebin10@huawei.com>
Cc: stable@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-18 19:35:24 -04:00
Liang He
173940b3ae xtensa: xtfpga: Fix refcount leak bug in setup
In machine_setup(), of_find_compatible_node() will return a node
pointer with refcount incremented. We should use of_node_put() when
it is not used anymore.

Cc: stable@vger.kernel.org
Signed-off-by: Liang He <windhl@126.com>
Message-Id: <20220617115323.4046905-1-windhl@126.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2022-06-18 14:47:41 -07:00
Liang He
a0117dc956 xtensa: Fix refcount leak bug in time.c
In calibrate_ccount(), of_find_compatible_node() will return a node
pointer with refcount incremented. We should use of_node_put() when
it is not used anymore.

Cc: stable@vger.kernel.org
Signed-off-by: Liang He <windhl@126.com>
Message-Id: <20220617124432.4049006-1-windhl@126.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2022-06-18 14:46:59 -07:00
Shyam Prasad N
5d24968f5b cifs: when a channel is not found for server, log its connection id
cifs_ses_get_chan_index gets the index for a given server pointer.
When a match is not found, we warn about a possible bug.
However, printing details about the non-matching server could be
more useful to debug here.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-18 14:55:06 -05:00
Kuogee Hsieh
a6e2af64a7 drm/msm/dp: force link training for display resolution change
Display resolution change is implemented through drm modeset. Older
modeset (resolution) has to be disabled first before newer modeset
(resolution) can be enabled. Display disable will turn off both
pixel clock and main link clock so that main link have to be
re-trained during display enable to have new video stream flow
again. At current implementation, display enable function manually
kicks up irq_hpd_handle which will read panel link status and start
link training if link status is not in sync state.

However, there is rare case that a particular panel links status keep
staying in sync for some period of time after main link had been shut
down previously at display disabled. In this case, main link retraining
will not be executed by irq_hdp_handle(). Hence video stream of newer
display resolution will fail to be transmitted to panel due to main
link is not in sync between host and panel.

This patch will bypass irq_hpd_handle() in favor of directly call
dp_ctrl_on_stream() to always perform link training in regardless of
main link status. So that no unexpected exception resolution change
failure cases will happen. Also this implementation are more efficient
than manual kicking off irq_hpd_handle function.

Changes in v2:
-- set force_link_train flag on DP only (is_edp == false)

Changes in v3:
-- revise commit  text
-- add Fixes tag

Changes in v4:
-- revise commit  text

Changes in v5:
-- fix spelling at commit text

Changes in v6:
-- split dp_ctrl_on_stream() for phy test case
-- revise commit text for modeset

Changes in v7:
-- drop 0 assignment at local variable (ret = 0)

Changes in v8:
-- add patch to remove pixel_rate from dp_ctrl

Changes in v9:
-- forward declare dp_ctrl_on_stream_phy_test_report()

Fixes: 62671d2ef2 ("drm/msm/dp: fixes wrong connection state caused by failure of link train")
Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/489895/
Link: https://lore.kernel.org/r/1655411200-7255-1-git-send-email-quic_khsieh@quicinc.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-18 09:14:06 -07:00
Abhinav Kumar
2211e34a9d drm/msm/dpu: limit wb modes based on max_mixer_width
As explained in [1], using max_linewidth to limit the modes
does not seem to remove 4K modes on chipsets such as
sm8250 where the max_linewidth actually supports 4k.

This would have been alright if dual SSPP support was
present but otherwise fails the per SSPP bandwidth check.

The ideal way to implement this would be to filter out
the modes which will exceed the bandwidth check by computing
it.

But this would be an exhaustive solution till we have
dual SSPP support.

Let's instead use max_mixer_width to limit the modes.

max_mixer_width still remains 2560 on sm8250 so even if
the max_linewidth is 4096, the only way 4k modes could have
been supported is to have source split enabled on the SSPP.

Since source split support is not enabled yet in DPU driver,
enforce max_mixer_width as the upper limit on the modes.

[1] https://patchwork.freedesktop.org/patch/489662/

Fixes: e67dcecda0 ("drm/msm/dpu: limit writeback modes according to max_linewidth")
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/489893/
Link: https://lore.kernel.org/r/1655407606-21760-1-git-send-email-quic_abhinavk@quicinc.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-18 09:14:06 -07:00
Kuogee Hsieh
d80c3ba0ac drm/msm/dp: check core_initialized before disable interrupts at dp_display_unbind()
During msm initialize phase, dp_display_unbind() will be called to undo
initializations had been done by dp_display_bind() previously if there is
error happen at msm_drm_bind. In this case, core_initialized flag had to
be check to make sure clocks is on before update DP controller register
to disable HPD interrupts. Otherwise system will crash due to below NOC
fatal error.

QTISECLIB [01f01a7ad]CNOC2 ERROR: ERRLOG0_LOW = 0x00061007
QTISECLIB [01f01a7ad]GEM_NOC ERROR: ERRLOG0_LOW = 0x00001007
QTISECLIB [01f0371a0]CNOC2 ERROR: ERRLOG0_HIGH = 0x00000003
QTISECLIB [01f055297]GEM_NOC ERROR: ERRLOG0_HIGH = 0x00000003
QTISECLIB [01f072beb]CNOC2 ERROR: ERRLOG1_LOW = 0x00000024
QTISECLIB [01f0914b8]GEM_NOC ERROR: ERRLOG1_LOW = 0x00000042
QTISECLIB [01f0ae639]CNOC2 ERROR: ERRLOG1_HIGH = 0x00004002
QTISECLIB [01f0cc73f]GEM_NOC ERROR: ERRLOG1_HIGH = 0x00004002
QTISECLIB [01f0ea092]CNOC2 ERROR: ERRLOG2_LOW = 0x0009020c
QTISECLIB [01f10895f]GEM_NOC ERROR: ERRLOG2_LOW = 0x0ae9020c
QTISECLIB [01f125ae1]CNOC2 ERROR: ERRLOG2_HIGH = 0x00000000
QTISECLIB [01f143be7]GEM_NOC ERROR: ERRLOG2_HIGH = 0x00000000
QTISECLIB [01f16153a]CNOC2 ERROR: ERRLOG3_LOW = 0x00000000
QTISECLIB [01f17fe07]GEM_NOC ERROR: ERRLOG3_LOW = 0x00000000
QTISECLIB [01f19cf89]CNOC2 ERROR: ERRLOG3_HIGH = 0x00000000
QTISECLIB [01f1bb08e]GEM_NOC ERROR: ERRLOG3_HIGH = 0x00000000
QTISECLIB [01f1d8a31]CNOC2 ERROR: SBM1 FAULTINSTATUS0_LOW = 0x00000002
QTISECLIB [01f1f72a4]GEM_NOC ERROR: SBM0 FAULTINSTATUS0_LOW = 0x00000001
QTISECLIB [01f21a217]CNOC3 ERROR: ERRLOG0_LOW = 0x00000006
QTISECLIB [01f23dfd3]NOC error fatal

changes in v2:
-- drop the first patch (drm/msm: enable msm irq after all initializations are done successfully at msm_drm_init()) since the problem had been fixed by other patch

Fixes: 570d3e5d28 ("drm/msm/dp: stop event kernel thread when DP unbind")
Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/488387/
Link: https://lore.kernel.org/r/1654538139-7450-1-git-send-email-quic_khsieh@quicinc.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-18 09:14:05 -07:00
Miaoqian Lin
b9cc459860 drm/msm/mdp4: Fix refcount leak in mdp4_modeset_init_intf
of_graph_get_remote_node() returns remote device node pointer with
refcount incremented, we should use of_node_put() on it
when not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: 86418f90a4 ("drm: convert drivers to use of_graph_get_remote_node")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/488473/
Link: https://lore.kernel.org/r/20220607110841.53889-1-linmq006@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-18 09:14:05 -07:00
Rob Clark
c8af219d18 drm/msm: Don't overwrite hw fence in hw_init
Prior to the last commit, this could result in setting the GPU
written fence value back to an older value, if we had missed
updating completed_fence prior to suspend.  This was mostly
harmless as the GPU would eventually overwrite it again with
the correct value.  But we should just not do this.  Instead
just leave a sanity check that the fence looks plausible (in
case the GPU scribbled on memory).

Reported-by: Steev Klimaszewski <steev@kali.org>
Fixes: 95d1deb02a ("drm/msm/gem: Add fenced vma unpin")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Patchwork: https://patchwork.freedesktop.org/patch/490138/
Link: https://lore.kernel.org/r/20220618161120.3451993-2-robdclark@gmail.com
2022-06-18 09:13:33 -07:00
Rob Clark
3c7a52217a drm/msm: Drop update_fences()
I noticed while looking at some traces, that we could miss calls to
msm_update_fence(), as the irq could have raced with retire_submits()
which could have already popped the last submit on a ring out of the
queue of in-flight submits.  But walking the list of submits in the
irq handler isn't really needed, as dma_fence_is_signaled() will dtrt.
So lets just drop it entirely.

v2: use spin_lock_irqsave/restore as we are no longer protected by the
    spin_lock_irqsave/restore() in update_fences()

Reported-by: Steev Klimaszewski <steev@kali.org>
Fixes: 95d1deb02a ("drm/msm/gem: Add fenced vma unpin")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Patchwork: https://patchwork.freedesktop.org/patch/490136/
Link: https://lore.kernel.org/r/20220618161120.3451993-1-robdclark@gmail.com
2022-06-18 09:13:32 -07:00
Peilin Ye
a2b1a5d40b net/sched: sch_netem: Fix arithmetic in netem_dump() for 32-bit platforms
As reported by Yuming, currently tc always show a latency of UINT_MAX
for netem Qdisc's on 32-bit platforms:

    $ tc qdisc add dev dummy0 root netem latency 100ms
    $ tc qdisc show dev dummy0
    qdisc netem 8001: root refcnt 2 limit 1000 delay 275s  275s
                                               ^^^^^^^^^^^^^^^^

Let us take a closer look at netem_dump():

        qopt.latency = min_t(psched_tdiff_t, PSCHED_NS2TICKS(q->latency,
                             UINT_MAX);

qopt.latency is __u32, psched_tdiff_t is signed long,
(psched_tdiff_t)(UINT_MAX) is negative for 32-bit platforms, so
qopt.latency is always UINT_MAX.

Fix it by using psched_time_t (u64) instead.

Note: confusingly, users have two ways to specify 'latency':

  1. normally, via '__u32 latency' in struct tc_netem_qopt;
  2. via the TCA_NETEM_LATENCY64 attribute, which is s64.

For the second case, theoretically 'latency' could be negative.  This
patch ignores that corner case, since it is broken (i.e. assigning a
negative s64 to __u32) anyways, and should be handled separately.

Thanks Ted Lin for the analysis [1] .

[1] https://github.com/raspberrypi/linux/issues/3512

Reported-by: Yuming Chen <chenyuming.junnan@bytedance.com>
Fixes: 112f9cb656 ("netem: convert to qdisc_watchdog_schedule_ns")
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/r/20220616234336.2443-1-yepeilin.cs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-17 20:29:38 -07:00
Ivan Vecera
a3bb7b6381 ethtool: Fix get module eeprom fallback
Function fallback_set_params() checks if the module type returned
by a driver is ETH_MODULE_SFF_8079 and in this case it assumes
that buffer returns a concatenated content of page  A0h and A2h.
The check is wrong because the correct type is ETH_MODULE_SFF_8472.

Fixes: 96d971e307 ("ethtool: Add fallback to get_module_eeprom from netlink command")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220616160856.3623273-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-17 20:22:16 -07:00
Jay Vosburgh
7a9214f3d8 bonding: ARP monitor spams NETDEV_NOTIFY_PEERS notifiers
The bonding ARP monitor fails to decrement send_peer_notif, the
number of peer notifications (gratuitous ARP or ND) to be sent. This
results in a continuous series of notifications.

Correct this by decrementing the counter for each notification.

Reported-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Fixes: b0929915e0 ("bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitor")
Link: https://lore.kernel.org/netdev/b2fd4147-8f50-bebd-963a-1a3e8d1d9715@redhat.com/
Tested-by: Jonathan Toppins <jtoppins@redhat.com>
Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
Link: https://lore.kernel.org/r/9400.1655407960@famine
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-17 20:18:44 -07:00
Lorenzo Bianconi
3f6a57ee85 igb: fix a use-after-free issue in igb_clean_tx_ring
Fix the following use-after-free bug in igb_clean_tx_ring routine when
the NIC is running in XDP mode. The issue can be triggered redirecting
traffic into the igb NIC and then closing the device while the traffic
is flowing.

[   73.322719] CPU: 1 PID: 487 Comm: xdp_redirect Not tainted 5.18.3-apu2 #9
[   73.330639] Hardware name: PC Engines APU2/APU2, BIOS 4.0.7 02/28/2017
[   73.337434] RIP: 0010:refcount_warn_saturate+0xa7/0xf0
[   73.362283] RSP: 0018:ffffc9000081f798 EFLAGS: 00010282
[   73.367761] RAX: 0000000000000000 RBX: ffffc90000420f80 RCX: 0000000000000000
[   73.375200] RDX: ffff88811ad22d00 RSI: ffff88811ad171e0 RDI: ffff88811ad171e0
[   73.382590] RBP: 0000000000000900 R08: ffffffff82298f28 R09: 0000000000000058
[   73.390008] R10: 0000000000000219 R11: ffffffff82280f40 R12: 0000000000000090
[   73.397356] R13: ffff888102343a40 R14: ffff88810359e0e4 R15: 0000000000000000
[   73.404806] FS:  00007ff38d31d740(0000) GS:ffff88811ad00000(0000) knlGS:0000000000000000
[   73.413129] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   73.419096] CR2: 000055cff35f13f8 CR3: 0000000106391000 CR4: 00000000000406e0
[   73.426565] Call Trace:
[   73.429087]  <TASK>
[   73.431314]  igb_clean_tx_ring+0x43/0x140 [igb]
[   73.436002]  igb_down+0x1d7/0x220 [igb]
[   73.439974]  __igb_close+0x3c/0x120 [igb]
[   73.444118]  igb_xdp+0x10c/0x150 [igb]
[   73.447983]  ? igb_pci_sriov_configure+0x70/0x70 [igb]
[   73.453362]  dev_xdp_install+0xda/0x110
[   73.457371]  dev_xdp_attach+0x1da/0x550
[   73.461369]  do_setlink+0xfd0/0x10f0
[   73.465166]  ? __nla_validate_parse+0x89/0xc70
[   73.469714]  rtnl_setlink+0x11a/0x1e0
[   73.473547]  rtnetlink_rcv_msg+0x145/0x3d0
[   73.477709]  ? rtnl_calcit.isra.0+0x130/0x130
[   73.482258]  netlink_rcv_skb+0x8d/0x110
[   73.486229]  netlink_unicast+0x230/0x340
[   73.490317]  netlink_sendmsg+0x215/0x470
[   73.494395]  __sys_sendto+0x179/0x190
[   73.498268]  ? move_addr_to_user+0x37/0x70
[   73.502547]  ? __sys_getsockname+0x84/0xe0
[   73.506853]  ? netlink_setsockopt+0x1c1/0x4a0
[   73.511349]  ? __sys_setsockopt+0xc8/0x1d0
[   73.515636]  __x64_sys_sendto+0x20/0x30
[   73.519603]  do_syscall_64+0x3b/0x80
[   73.523399]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   73.528712] RIP: 0033:0x7ff38d41f20c
[   73.551866] RSP: 002b:00007fff3b945a68 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[   73.559640] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff38d41f20c
[   73.567066] RDX: 0000000000000034 RSI: 00007fff3b945b30 RDI: 0000000000000003
[   73.574457] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
[   73.581852] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff3b945ab0
[   73.589179] R13: 0000000000000000 R14: 0000000000000003 R15: 00007fff3b945b30
[   73.596545]  </TASK>
[   73.598842] ---[ end trace 0000000000000000 ]---

Fixes: 9cbc948b5a ("igb: add XDP support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/r/e5c01d549dc37bff18e46aeabd6fb28a7bcf84be.1655388571.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-17 20:14:07 -07:00
Jakub Kicinski
582573f1b2 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2022-06-17

We've added 12 non-merge commits during the last 4 day(s) which contain
a total of 14 files changed, 305 insertions(+), 107 deletions(-).

The main changes are:

1) Fix x86 JIT tailcall count offset on BPF-2-BPF call, from Jakub Sitnicki.

2) Fix a kprobe_multi link bug which misplaces BPF cookies, from Jiri Olsa.

3) Fix an infinite loop when processing a module's BTF, from Kumar Kartikeya Dwivedi.

4) Fix getting a rethook only in RCU available context, from Masami Hiramatsu.

5) Fix request socket refcount leak in sk lookup helpers, from Jon Maxwell.

6) Fix xsk xmit behavior which wrongly adds skb to already full cq, from Ciara Loftus.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  rethook: Reject getting a rethook if RCU is not watching
  fprobe, samples: Add use_trace option and show hit/missed counter
  bpf, docs: Update some of the JIT/maintenance entries
  selftest/bpf: Fix kprobe_multi bench test
  bpf: Force cookies array to follow symbols sorting
  ftrace: Keep address offset in ftrace_lookup_symbols
  selftests/bpf: Shuffle cookies symbols in kprobe multi test
  selftests/bpf: Test tail call counting with bpf2bpf and data on stack
  bpf, x86: Fix tail call count offset calculation on bpf2bpf call
  bpf: Limit maximum modifier chain length in btf_check_type_tags
  bpf: Fix request_sock leak in sk lookup helpers
  xsk: Fix generic transmit when completion queue reservation fails
====================

Link: https://lore.kernel.org/r/20220617202119.2421-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-17 18:30:01 -07:00
Aswath Govindraju
0c0af88f3f arm64: dts: ti: k3-am64-main: Remove support for HS400 speed mode
AM64 SoC, does not support HS400 and HS200 is the maximum supported speed
mode[1]. Therefore, fix the device tree node to reflect the same.

[1] - https://www.ti.com/lit/ds/symlink/am6442.pdf
      (SPRSP56C – JANUARY 2021 – REVISED FEBRUARY 2022)

Fixes: 8abae9389b ("arm64: dts: ti: Add support for AM642 SoC")
Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Link: https://lore.kernel.org/r/20220512064859.32059-1-a-govindraju@ti.com
2022-06-17 20:24:01 -05:00
Matt Ranostay
856216b70a arm64: dts: ti: k3-j721s2: Fix overlapping GICD memory region
GICD region was overlapping with GICR causing the latter to not map
successfully, and in turn the gic-v3 driver would fail to initialize.

This issue was hidden till commit 2b2cd74a06 ("irqchip/gic-v3: Claim
iomem resources") replaced of_iomap() calls with of_io_request_and_map()
that internally called request_mem_region().

Respective console output before this patchset:

[    0.000000] GICv3: /bus@100000/interrupt-controller@1800000: couldn't map region 0

Fixes: b8545f9d3a ("arm64: dts: ti: Add initial support for J721S2 SoC")
Cc: linux-stable@vger.kernel.org
Cc: Marc Zyngier <maz@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Nishanth Menon <nm@ti.com>
Signed-off-by: Matt Ranostay <mranostay@ti.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Nishanth Menon <nm@ti.com>
Link: https://lore.kernel.org/r/20220617151304.446607-1-mranostay@ti.com
2022-06-17 20:23:56 -05:00
Andrew Donnellan
7bc08056a6 powerpc/rtas: Allow ibm,platform-dump RTAS call with null buffer address
Add a special case to block_rtas_call() to allow the ibm,platform-dump RTAS
call through the RTAS filter if the buffer address is 0.

According to PAPR, ibm,platform-dump is called with a null buffer address
to notify the platform firmware that processing of a particular dump is
finished.

Without this, on a pseries machine with CONFIG_PPC_RTAS_FILTER enabled, an
application such as rtas_errd that is attempting to retrieve a dump will
encounter an error at the end of the retrieval process.

Fixes: bd59380c5b ("powerpc/rtas: Restrict RTAS requests from userspace")
Cc: stable@vger.kernel.org
Reported-by: Sathvika Vasireddy <sathvika@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220614134952.156010-1-ajd@linux.ibm.com
2022-06-18 10:19:10 +10:00
Naveen N. Rao
ec6d0dde71 powerpc: Enable execve syscall exit tracepoint
On execve[at], we are zero'ing out most of the thread register state
including gpr[0], which contains the syscall number. Due to this, we
fail to trigger the syscall exit tracepoint properly. Fix this by
retaining gpr[0] in the thread register state.

Before this patch:
  # tail /sys/kernel/debug/tracing/trace
	       cat-123     [000] .....    61.449351: sys_execve(filename:
  7fffa6b23448, argv: 7fffa6b233e0, envp: 7fffa6b233f8)
	       cat-124     [000] .....    62.428481: sys_execve(filename:
  7fffa6b23448, argv: 7fffa6b233e0, envp: 7fffa6b233f8)
	      echo-125     [000] .....    65.813702: sys_execve(filename:
  7fffa6b23378, argv: 7fffa6b233a0, envp: 7fffa6b233b0)
	      echo-125     [000] .....    65.822214: sys_execveat(fd: 0,
  filename: 1009ac48, argv: 7ffff65d0c98, envp: 7ffff65d0ca8, flags: 0)

After this patch:
  # tail /sys/kernel/debug/tracing/trace
	       cat-127     [000] .....   100.416262: sys_execve(filename:
  7fffa41b3448, argv: 7fffa41b33e0, envp: 7fffa41b33f8)
	       cat-127     [000] .....   100.418203: sys_execve -> 0x0
	      echo-128     [000] .....   103.873968: sys_execve(filename:
  7fffa41b3378, argv: 7fffa41b33a0, envp: 7fffa41b33b0)
	      echo-128     [000] .....   103.875102: sys_execve -> 0x0
	      echo-128     [000] .....   103.882097: sys_execveat(fd: 0,
  filename: 1009ac48, argv: 7fffd10d2148, envp: 7fffd10d2158, flags: 0)
	      echo-128     [000] .....   103.883225: sys_execveat -> 0x0

Cc: stable@vger.kernel.org
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Sumit Dubey2 <Sumit.Dubey2@ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220609103328.41306-1-naveen.n.rao@linux.vnet.ibm.com
2022-06-18 10:19:10 +10:00
Jason A. Donenfeld
e561e472a3 powerpc/pseries: wire up rng during setup_arch()
The platform's RNG must be available before random_init() in order to be
useful for initial seeding, which in turn means that it needs to be
called from setup_arch(), rather than from an init call. Fortunately,
each platform already has a setup_arch function pointer, which means
it's easy to wire this up. This commit also removes some noisy log
messages that don't add much.

Fixes: a489043f46 ("powerpc/pseries: Implement arch_get_random_long() based on H_RANDOM")
Cc: stable@vger.kernel.org # v3.13+
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220611151015.548325-4-Jason@zx2c4.com
2022-06-18 10:19:10 +10:00
Jason A. Donenfeld
20a9689b36 powerpc/microwatt: wire up rng during setup_arch()
The platform's RNG must be available before random_init() in order to be
useful for initial seeding, which in turn means that it needs to be
called from setup_arch(), rather than from an init call. Fortunately,
each platform already has a setup_arch function pointer, which means
it's easy to wire this up. This commit also removes some noisy log
messages that don't add much.

Fixes: c25769fdda ("powerpc/microwatt: Add support for hardware random number generator")
Cc: stable@vger.kernel.org # v5.14+
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220611151015.548325-2-Jason@zx2c4.com
2022-06-18 10:19:10 +10:00
Michael Ellerman
6cf06c17e9 powerpc/mm: Move CMA reservations after initmem_init()
After commit 11ac3e87ce ("mm: cma: use pageblock_order as the single
alignment") there is an error at boot about the KVM CMA reservation
failing, eg:

  kvm_cma_reserve: reserving 6553 MiB for global area
  cma: Failed to reserve 6553 MiB

That makes it impossible to start KVM guests using the hash MMU with
more than 2G of memory, because the VM is unable to allocate a large
enough region for the hash page table, eg:

  $ qemu-system-ppc64 -enable-kvm -M pseries -m 4G ...
  qemu-system-ppc64: Failed to allocate KVM HPT of order 25: Cannot allocate memory

Aneesh pointed out that this happens because when kvm_cma_reserve() is
called, pageblock_order has not been initialised yet, and is still zero,
causing the checks in cma_init_reserved_mem() against
CMA_MIN_ALIGNMENT_PAGES to fail.

Fix it by moving the call to kvm_cma_reserve() after initmem_init(). The
pageblock_order is initialised in sparse_init() which is called from
initmem_init().

Also move the hugetlb CMA reservation.

Fixes: 11ac3e87ce ("mm: cma: use pageblock_order as the single alignment")
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220616120033.1976732-1-mpe@ellerman.id.au
2022-06-18 10:18:55 +10:00
Gautam Menghani
12c3e0c92f tracing/uprobes: Remove unwanted initialization in __trace_uprobe_create()
Remove the unwanted initialization of variable 'ret'. This fixes the clang
scan warning: Value stored to 'ret' is never read [deadcode.DeadStores]

Link: https://lkml.kernel.org/r/20220612144232.145209-1-gautammenghani201@gmail.com

Signed-off-by: Gautam Menghani <gautammenghani201@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-06-17 19:12:07 -04:00
Xiang wangx
93a8c044b9 tracefs: Fix syntax errors in comments
Delete the redundant word 'to'.

Link: https://lkml.kernel.org/r/20220605092729.13010-1-wangxiang@cdjrlc.com

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-06-17 19:01:28 -04:00
sunliming
f4b0d31809 tracing: Simplify conditional compilation code in tracing_set_tracer()
Two conditional compilation directives "#ifdef CONFIG_TRACER_MAX_TRACE"
are used consecutively, and no other code in between. Simplify conditional
the compilation code and only use one "#ifdef CONFIG_TRACER_MAX_TRACE".

Link: https://lkml.kernel.org/r/20220602140613.545069-1-sunliming@kylinos.cn

Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-06-17 18:42:17 -04:00
Kirill A. Shutemov
1e7769653b x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
The unwanted loads are typically harmless. But, they might be made to
totally unrelated or even unmapped memory. load_unaligned_zeropad()
relies on exception fixup (#PF, #GP and now #VE) to recover from these
unwanted loads.

In TDX guests, the second page can be shared page and a VMM may configure
it to trigger #VE.

The kernel assumes that #VE on a shared page is an MMIO access and tries to
decode instruction to handle it. In case of load_unaligned_zeropad() it
may result in confusion as it is not MMIO access.

Fix it by detecting split page MMIO accesses and failing them.
load_unaligned_zeropad() will recover using exception fixups.

The issue was discovered by analysis and reproduced artificially. It was
not triggered during testing.

[ dhansen: fix up changelogs and comments for grammar and clarity,
	   plus incorporate Kirill's off-by-one fix]

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220614120135.14812-4-kirill.shutemov@linux.intel.com
2022-06-17 15:37:33 -07:00
Stefan Wahren
b9b6d4c925 ARM: dts: bcm2711-rpi-400: Fix GPIO line names
The GPIO expander line names has been fixed in the vendor tree last year,
so upstream these changes.

Fixes: 1c701accec ("ARM: dts: Add Raspberry Pi 400 support")
Reported-by: Ivan T. Ivanov <iivanov@suse.de>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2022-06-17 14:53:06 -07:00
Masami Hiramatsu (Google)
cc72b72073 tracing/kprobes: Check whether get_kretprobe() returns NULL in kretprobe_dispatcher()
There is a small chance that get_kretprobe(ri) returns NULL in
kretprobe_dispatcher() when another CPU unregisters the kretprobe
right after __kretprobe_trampoline_handler().

To avoid this issue, kretprobe_dispatcher() checks the get_kretprobe()
return value again. And if it is NULL, it returns soon because that
kretprobe is under unregistering process.

This issue has been introduced when the kretprobe is decoupled
from the struct kretprobe_instance by commit d741bf41d7
("kprobes: Remove kretprobe hash"). Before that commit, the
struct kretprob_instance::rp directly points the kretprobe
and it is never be NULL.

Link: https://lkml.kernel.org/r/165366693881.797669.16926184644089588731.stgit@devnote2

Reported-by: Yonghong Song <yhs@fb.com>
Fixes: d741bf41d7 ("kprobes: Remove kretprobe hash")
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: bpf <bpf@vger.kernel.org>
Cc: Kernel Team <kernel-team@fb.com>
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-06-17 17:40:06 -04:00
Florian Westphal
394e771684 netfilter: cttimeout: fix slab-out-of-bounds read typo in cttimeout_net_exit
syzbot reports:
  BUG: KASAN: slab-out-of-bounds in __list_del_entry_valid+0xcc/0xf0 lib/list_debug.c:42
  [..]
  list_del include/linux/list.h:148 [inline]
  cttimeout_net_exit+0x211/0x540 net/netfilter/nfnetlink_cttimeout.c:617

Problem is the wrong name of the list member, so container_of() result is wrong.

Reported-by: <syzbot+92968395eedbdbd3617d@syzkaller.appspotmail.com>
Fixes: 78222bacfc ("netfilter: cttimeout: decouple unlink and free on netns destruction")
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-06-17 23:31:20 +02:00
Linus Torvalds
4b35035bcf Merge tag 'nfs-for-5.19-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:

 - Add FMODE_CAN_ODIRECT support to NFSv4 so opens don't fail

 - Fix trunking detection & cl_max_connect setting

 - Avoid pnfs_update_layout() livelocks

 - Don't keep retrying pNFS if the server replies with NFS4ERR_UNAVAILABLE

* tag 'nfs-for-5.19-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFSv4: Add FMODE_CAN_ODIRECT after successful open of a NFS4.x file
  sunrpc: set cl_max_connect when cloning an rpc_clnt
  pNFS: Avoid a live lock condition in pnfs_update_layout()
  pNFS: Don't keep retrying if the server replied NFS4ERR_LAYOUTUNAVAILABLE
2022-06-17 15:17:57 -05:00
Linus Torvalds
32efdbffff Merge tag 'pci-v5.19-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fix from Bjorn Helgaas:
 "Revert clipping of PCI host bridge windows to avoid E820 regions,
  which broke several machines by forcing unnecessary BAR reassignments
  (Hans de Goede)"

* tag 'pci-v5.19-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  x86/PCI: Revert "x86/PCI: Clip only host bridge windows for E820 regions"
2022-06-17 15:12:20 -05:00
Linus Torvalds
93d17c1c8c Merge tag 'printk-for-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fixes from Petr Mladek:
 "Make the global console_sem available for CPU that is handling panic()
  or shutdown.

  This is an old problem when an existing console lock owner might block
  console output, but it became more visible with the kthreads"

* tag 'printk-for-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Wait for the global console lock when the system is going down
  printk: Block console kthreads when direct printing will be required
2022-06-17 14:57:42 -05:00
Masami Hiramatsu (Google)
c0f3bb4054 rethook: Reject getting a rethook if RCU is not watching
Since the rethook_recycle() will involve the call_rcu() for reclaiming
the rethook_instance, the rethook must be set up at the RCU available
context (non idle). This rethook_recycle() in the rethook trampoline
handler is inevitable, thus the RCU available check must be done before
setting the rethook trampoline.

This adds a rcu_is_watching() check in the rethook_try_get() so that
it will return NULL if it is called when !rcu_is_watching().

Fixes: 54ecbe6f1e ("rethook: Add a generic return hook")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/165461827269.280167.7379263615545598958.stgit@devnote2
2022-06-17 21:53:35 +02:00
Masami Hiramatsu (Google)
c88dbbcd88 fprobe, samples: Add use_trace option and show hit/missed counter
Add use_trace option to use trace_printk() instead of pr_info()
so that the handler doesn't involve the RCU operations.
And show the hit and missed counter so that the user can check
how many times the probe handler hit and missed.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/165461826247.280167.11939123218334322352.stgit@devnote2
2022-06-17 21:53:29 +02:00
Daniel Borkmann
63ce81d1c4 bpf, docs: Update some of the JIT/maintenance entries
Various minor updates around some of the BPF-related entries:

JITs for ARM32/NFP/SPARC/X86-32 haven't seen updates in quite a while, thus
for now, mark them as 'Odd Fixes' until they become more actively developed.

JITs for POWERPC/S390 are in good shape and receive active development and
review, thus bump to 'Supported' similar as we have with X86-64/ARM64.

JITs for MIPS/RISC-V are in similar good shape as the ones mentioned above,
but looked after mostly in spare time, thus leave for now in 'Maintained' state.

Add Michael to PPC JIT given he's picking up the patches there, so it better
reflects today's state.

Also, I haven't done much reviewing around BPF sockmap/kTLS after John and I
did the big rework back in the days to integrate sockmap with kTLS.

These days, most of this is taken care by John, Jakub {Sitnicki,Kicinski} and
others in the community, so remove myself from these two.

Lastly, move all BPF-related entries into one place, that is, move the sockmap
one over near rest of BPF.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/f9b8a63a0b48dc764bd4c50f87632889f5813f69.1655494758.git.daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-17 12:47:53 -07:00
Hans de Goede
a2b36ffbf5 x86/PCI: Revert "x86/PCI: Clip only host bridge windows for E820 regions"
This reverts commit 4c5e242d3e.

Prior to 4c5e242d3e ("x86/PCI: Clip only host bridge windows for E820
regions"), E820 regions did not affect PCI host bridge windows.  We only
looked at E820 regions and avoided them when allocating new MMIO space.
If firmware PCI bridge window and BAR assignments used E820 regions, we
left them alone.

After 4c5e242d3e, we removed E820 regions from the PCI host bridge
windows before looking at BARs, so firmware assignments in E820 regions
looked like errors, and we moved things around to fit in the space left
(if any) after removing the E820 regions.  This unnecessary BAR
reassignment broke several machines.

Guilherme reported that Steam Deck fails to boot after 4c5e242d3e.  We
clipped the window that contained most 32-bit BARs:

  BIOS-e820: [mem 0x00000000a0000000-0x00000000a00fffff] reserved
  acpi PNP0A08:00: clipped [mem 0x80000000-0xf7ffffff window] to [mem 0xa0100000-0xf7ffffff window] for e820 entry [mem 0xa0000000-0xa00fffff]

which forced us to reassign all those BARs, for example, this NVMe BAR:

  pci 0000:00:01.2: PCI bridge to [bus 01]
  pci 0000:00:01.2:   bridge window [mem 0x80600000-0x806fffff]
  pci 0000:01:00.0: BAR 0: [mem 0x80600000-0x80603fff 64bit]
  pci 0000:00:01.2: can't claim window [mem 0x80600000-0x806fffff]: no compatible bridge window
  pci 0000:01:00.0: can't claim BAR 0 [mem 0x80600000-0x80603fff 64bit]: no compatible bridge window

  pci 0000:00:01.2: bridge window: assigned [mem 0xa0100000-0xa01fffff]
  pci 0000:01:00.0: BAR 0: assigned [mem 0xa0100000-0xa0103fff 64bit]

All the reassignments were successful, so the devices should have been
functional at the new addresses, but some were not.

Andy reported a similar failure on an Intel MID platform.  Benjamin
reported a similar failure on a VMWare Fusion VM.

Note: this is not a clean revert; this revert keeps the later change to
make the clipping dependent on a new pci_use_e820 bool, moving the checking
of this bool to arch_remove_reservations().

[bhelgaas: commit log, add more reporters and testers]
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216109
Reported-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Reported-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reported-by: Benjamin Coddington <bcodding@redhat.com>
Reported-by: Jongman Heo <jongman.heo@gmail.com>
Fixes: 4c5e242d3e ("x86/PCI: Clip only host bridge windows for E820 regions")
Link: https://lore.kernel.org/r/20220612144325.85366-1-hdegoede@redhat.com
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-06-17 14:24:14 -05:00
Linus Torvalds
ef06e68290 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:

 - Revert the moving of the jump labels initialisation before
   setup_machine_fdt(). The bug was fixed in drivers/char/random.c.

 - Ftrace fixes: branch range check and consistent handling of PLTs.

 - Clean rather than invalidate FROM_DEVICE buffers at start of DMA
   transfer (safer if such buffer is mapped in user space). A cache
   invalidation is done already at the end of the transfer.

 - A couple of clean-ups (unexport symbol, remove unused label).

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: Don't invalidate FROM_DEVICE buffers at start of DMA transfer
  arm64/cpufeature: Unexport set_cpu_feature()
  arm64: ftrace: remove redundant label
  arm64: ftrace: consistently handle PLTs.
  arm64: ftrace: fix branch range checks
  Revert "arm64: Initialize jump labels before setup_machine_fdt()"
2022-06-17 13:55:19 -05:00
Linus Torvalds
cc2fb31d49 Merge tag 'loongarch-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
 "Add missing ELF_DETAILS in vmlinux.lds.S and fix document rendering"

* tag 'loongarch-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  docs/zh_CN/LoongArch: Fix notes rendering by using reST directives
  docs/LoongArch: Fix notes rendering by using reST directives
  LoongArch: vmlinux.lds.S: Add missing ELF_DETAILS
2022-06-17 13:50:24 -05:00
Linus Torvalds
f10516322d Merge tag 'riscv-for-linus-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for the PolarFire SOC's device tree

 - A handful of fixes for the recently added Svpmbt support

 - An improvement to the Kconfig text for Svpbmt

* tag 'riscv-for-linus-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Improve description for RISCV_ISA_SVPBMT Kconfig symbol
  riscv: drop cpufeature_apply_feature tracking variable
  riscv: fix dependency for t-head errata
  riscv: dts: microchip: re-add pdma to mpfs device tree
2022-06-17 13:45:47 -05:00
Linus Torvalds
2d806a688f Merge tag 'hyperv-fixes-signed-20220617' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:

 - Fix hv_init_clocksource annotation (Masahiro Yamada)

 - Two bug fixes for vmbus driver (Saurabh Sengar)

 - Fix SEV negotiation (Tianyu Lan)

 - Fix comments in code (Xiang Wang)

 - One minor fix to HID driver (Michael Kelley)

* tag 'hyperv-fixes-signed-20220617' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/Hyper-V: Add SEV negotiate protocol support in Isolation VM
  Drivers: hv: vmbus: Release cpu lock in error case
  HID: hyperv: Correctly access fields declared as __le16
  clocksource: hyper-v: unexport __init-annotated hv_init_clocksource()
  Drivers: hv: Fix syntax errors in comments
  Drivers: hv: vmbus: Don't assign VMbus channel interrupts to isolated CPUs
2022-06-17 13:39:12 -05:00
Linus Torvalds
462abc9de7 Merge tag 'block-5.19-2022-06-16' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - NVMe pull request from Christoph
      - Quirks, quirks, quirks to work around buggy consumer grade
        devices (Keith Bush, Ning Wang, Stefan Reiter, Rasheed Hsueh)
      - Better kernel messages for devices that need quirking (Keith
        Bush)
      - Make a kernel message more useful (Thomas Weißschuh)

 - MD pull request from Song, with a few fixes

 - blk-mq sysfs locking fixes (Ming)

 - BFQ stats fix (Bart)

 - blk-mq offline queue fix (Bart)

 - blk-mq flush request tag fix (Ming)

* tag 'block-5.19-2022-06-16' of git://git.kernel.dk/linux-block:
  block/bfq: Enable I/O statistics
  blk-mq: don't clear flush_rq from tags->rqs[]
  blk-mq: avoid to touch q->elevator without any protection
  blk-mq: protect q->elevator by ->sysfs_lock in blk_mq_elv_switch_none
  block: Fix handling of offline queues in blk_mq_alloc_request_hctx()
  md/raid5-ppl: Fix argument order in bio_alloc_bioset()
  Revert "md: don't unregister sync_thread with reconfig_mutex held"
  nvme-pci: disable write zeros support on UMIC and Samsung SSDs
  nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs
  nvme-pci: sk hynix p31 has bogus namespace ids
  nvme-pci: smi has bogus namespace ids
  nvme-pci: phison e12 has bogus namespace ids
  nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50
  nvme-pci: add trouble shooting steps for timeouts
  nvme: add bug report info for global duplicate id
  nvme: add device name to warning in uuid_show()
2022-06-17 11:22:58 -07:00
Linus Torvalds
f8e174c307 Merge tag 'io_uring-5.19-2022-06-16' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "Bigger than usual at this time, both because we missed -rc2, but also
  because of some reverts that we chose to do. In detail:

   - Adjust mapped buffer API while we still can (Dylan)

   - Mapped buffer fixes (Dylan, Hao, Pavel, me)

   - Fix for uring_cmd wrong API usage for task_work (Dylan)

   - Fix for bug introduced in fixed file closing (Hao)

   - Fix race in buffer/file resource handling (Pavel)

   - Revert the NOP support for CQE32 and buffer selection that was
     brought up during the merge window (Pavel)

   - Remove IORING_CLOSE_FD_AND_FILE_SLOT introduced in this merge
     window. The API needs further refining, so just yank it for now and
     we'll revisit for a later kernel.

   - Series cleaning up the CQE32 support added in this merge window,
     making it more integrated rather than sitting on the side (Pavel)"

* tag 'io_uring-5.19-2022-06-16' of git://git.kernel.dk/linux-block: (21 commits)
  io_uring: recycle provided buffer if we punt to io-wq
  io_uring: do not use prio task_work_add in uring_cmd
  io_uring: commit non-pollable provided mapped buffers upfront
  io_uring: make io_fill_cqe_aux honour CQE32
  io_uring: remove __io_fill_cqe() helper
  io_uring: fix ->extra{1,2} misuse
  io_uring: fill extra big cqe fields from req
  io_uring: unite fill_cqe and the 32B version
  io_uring: get rid of __io_fill_cqe{32}_req()
  io_uring: remove IORING_CLOSE_FD_AND_FILE_SLOT
  Revert "io_uring: add buffer selection support to IORING_OP_NOP"
  Revert "io_uring: support CQE32 for nop operation"
  io_uring: limit size of provided buffer ring
  io_uring: fix types in provided buffer ring
  io_uring: fix index calculation
  io_uring: fix double unlock for pbuf select
  io_uring: kbuf: fix bug of not consuming ring buffer in partial io case
  io_uring: openclose: fix bug of closing wrong fixed file
  io_uring: fix not locked access to fixed buf table
  io_uring: fix races with buffer table unregister
  ...
2022-06-17 11:14:07 -07:00
Will Deacon
c50f11c619 arm64: mm: Don't invalidate FROM_DEVICE buffers at start of DMA transfer
Invalidating the buffer memory in arch_sync_dma_for_device() for
FROM_DEVICE transfers

When using the streaming DMA API to map a buffer prior to inbound
non-coherent DMA (i.e. DMA_FROM_DEVICE), we invalidate any dirty CPU
cachelines so that they will not be written back during the transfer and
corrupt the buffer contents written by the DMA. This, however, poses two
potential problems:

  (1) If the DMA transfer does not write to every byte in the buffer,
      then the unwritten bytes will contain stale data once the transfer
      has completed.

  (2) If the buffer has a virtual alias in userspace, then stale data
      may be visible via this alias during the period between performing
      the cache invalidation and the DMA writes landing in memory.

Address both of these issues by cleaning (aka writing-back) the dirty
lines in arch_sync_dma_for_device(DMA_FROM_DEVICE) instead of discarding
them using invalidation.

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220606152150.GA31568@willie-the-truck
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220610151228.4562-2-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-06-17 19:06:06 +01:00
Linus Torvalds
5c0cd3d4a9 Merge tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull writeback and ext2 fixes from Jan Kara:
 "A fix for writeback bug which prevented machines with kdevtmpfs from
  booting and also one small ext2 bugfix in IO error handling"

* tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  init: Initialize noop_backing_dev_info early
  ext2: fix fs corruption when trying to remove a non-empty directory with IO error
2022-06-17 10:09:24 -07:00
Linus Torvalds
274295c6e5 Merge tag 'for-5.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:

 - Fix a race in DM core's dm_start_io_acct that could result in double
   accounting for abnormal IO (e.g. discards, write zeroes, etc).

 - Fix a use-after-free in DM core's dm_put_live_table_bio.

 - Fix a race for REQ_NOWAIT bios being issued despite no support from
   underlying DM targets (due to DM table reload at an "unlucky" time)

 - Fix access beyond allocated bitmap in DM mirror's log.

* tag 'for-5.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm mirror log: round up region bitmap size to BITS_PER_LONG
  dm: fix narrow race for REQ_NOWAIT bios being issued despite no support
  dm: fix use-after-free in dm_put_live_table_bio
  dm: fix race in dm_start_io_acct
2022-06-17 10:03:53 -07:00
Linus Torvalds
a96e902ba9 Merge tag 'hwmon-for-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:

 - Add missing lock protection in occ driver

 - Add missing comma in board name list in asus-ec-sensors driver

 - Fix devicetree bindings for ti,tmp401

* tag 'hwmon-for-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (asus-ec-sensors) add missing comma in board name list.
  hwmon: (occ) Lock mutex in shutdown to prevent race with occ_active
  dt-bindings: hwmon: ti,tmp401: Drop 'items' from 'ti,n-factor' property
2022-06-17 10:02:26 -07:00
Linus Torvalds
7c2d03f15f Merge tag 'linux-watchdog-5.19-rc3' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog fix from Wim Van Sebroeck:
 "Add missing MODULE_LICENSE in gxp driver"

* tag 'linux-watchdog-5.19-rc3' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: gxp: Add missing MODULE_LICENSE
2022-06-17 10:00:25 -07:00
Linus Torvalds
79fe0f863f Merge tag 'v5.19-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "This fixes a potential build failure when CRYPTO=m"

* tag 'v5.19-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: memneq - move into lib/
2022-06-17 08:27:27 -07:00
Linus Torvalds
f0ec9c65a8 Merge tag 'char-misc-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are some small char/misc driver fixes for 5.19-rc3 that resolve
  some reported issues.

  They include:

   - mei driver fixes

   - comedi driver fix

   - rtsx build warning fix

   - fsl-mc-bus driver fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  eeprom: at25: Split reads into chunks and cap write size
  misc: atmel-ssc: Fix IRQ check in ssc_probe
  char: lp: remove redundant initialization of err
2022-06-17 07:58:39 -07:00
Linus Torvalds
9afc441c3c Merge tag 'staging-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
 "Here are some small staging driver fixes for 5.19-rc3 that resolve
  reported issues:

   - remove visorbus.h which was forgotten in the -rc1 merge where the
     code that used it was removed

   - olpc_dcon: mark as broken to allow the DRM developers to evolve the
     fbdev api properly without having to deal with this obsolete
     driver. It will be removed soon if no one steps up to adopt it and
     fix the issues with it.

   - rtl8723bs driver fix

   - r8188eu driver fix to resolve many reports of the driver being
     broken with -rc1.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'staging-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: Also remove the Unisys visorbus.h
  staging: rtl8723bs: Allocate full pwep structure
  staging: olpc_dcon: mark driver as broken
  staging: r8188eu: Fix warning of array overflow in ioctl_linux.c
  staging: r8188eu: fix rtw_alloc_hwxmits error detection for now
2022-06-17 07:55:24 -07:00
Linus Torvalds
62dcd5e198 Merge tag 'tty-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
 "Here are some small tty and serial driver fixes for 5.19-rc3 to
  resolve some reported problems:

   - 8250 lsr read bugfix

   - n_gsm line discipline allocation fix

   - qcom serial driver fix for reported lockups that happened in -rc1

   - goldfish tty driver fix

  All have been in linux-next for a while now with no reported issues"

* tag 'tty-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: 8250: Store to lsr_save_flags after lsr read
  tty: goldfish: Fix free_irq() on remove
  tty: serial: qcom-geni-serial: Implement start_rx callback
  serial: core: Introduce callback for start_rx and do stop_rx in suspend only if this callback implementation is present.
  tty: n_gsm: Debug output allocation must use GFP_ATOMIC
2022-06-17 07:52:43 -07:00
Linus Torvalds
9057a64644 Merge tag 'usb-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
 "Here are some small USB driver fixes and new device ids for 5.19-rc3

  They include:

   - new usb-serial driver device ids

   - usb gadget driver fixes for reported problems

   - cdnsp driver fix

   - dwc3 driver fixes for reported problems

   - dwc3 driver fix for merge problem that I caused in 5.18

   - xhci driver fixes

   - dwc2 memory leak fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: gadget: f_fs: change ep->ep safe in ffs_epfile_io()
  usb: gadget: f_fs: change ep->status safe in ffs_epfile_io()
  xhci: Fix null pointer dereference in resume if xhci has only one roothub
  USB: fixup for merge issue with "usb: dwc3: Don't switch OTG -> peripheral if extcon is present"
  usb: cdnsp: Fixed setting last_trb incorrectly
  usb: gadget: u_ether: fix regression in setting fixed MAC address
  usb: gadget: lpc32xx_udc: Fix refcount leak in lpc32xx_udc_probe
  usb: dwc2: Fix memory leak in dwc2_hcd_init
  usb: dwc3: pci: Restore line lost in merge conflict resolution
  usb: dwc3: gadget: Fix IN endpoint max packet size allocation
  USB: serial: option: add support for Cinterion MV31 with new baseline
  USB: serial: io_ti: add Agilent E5805A support
2022-06-17 07:50:41 -07:00
Tim Crawford
d49951219b ALSA: hda/realtek: Add quirk for Clevo PD70PNT
Fixes speaker output and headset detection on Clevo PD70PNT.

Signed-off-by: Tim Crawford <tcrawford@system76.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220617133028.50568-1-tcrawford@system76.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-17 16:37:49 +02:00
Petr Mladek
38335cc5ff Merge branch 'rework/kthreads' into for-linus 2022-06-17 16:36:48 +02:00
Yanteng Si
03dfb4a3ab docs/zh_CN/LoongArch: Fix notes rendering by using reST directives
Notes are better expressed with reST admonitions.

Fixes: f23b22599f ("Documentation/zh_CN: Add basic LoongArch documentations")
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-17 22:09:05 +08:00
Yanteng Si
a667e4d3d0 docs/LoongArch: Fix notes rendering by using reST directives
Notes are better expressed with reST admonitions.

Fixes: 0ea8ce61cb ("Documentation: LoongArch: Add basic documentations")
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-17 22:09:05 +08:00
Youling Tang
b672332ef9 LoongArch: vmlinux.lds.S: Add missing ELF_DETAILS
Commit c604abc3f6 ("vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG")
splits ELF_DETAILS from STABS_DEBUG, resulting in missing ELF_DETAILS
information in LoongArch architecture, so add it.

Fixes: c604abc3f6 ("vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG")
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-17 22:09:05 +08:00
Christoph Hellwig
a09b314005 block: freeze the queue earlier in del_gendisk
Freeze the queue earlier in del_gendisk so that the state does not
change while we remove debugfs and sysfs files.

Ming mentioned that being able to observer request in debugfs might
be useful while the queue is being frozen in del_gendisk, which is
made possible by this change.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220614074827.458955-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-17 07:31:05 -06:00
Christoph Hellwig
99d055b4fd block: remove per-disk debugfs files in blk_unregister_queue
The block debugfs files are created in blk_register_queue, which is
called by add_disk and use a naming scheme based on the disk_name.
After del_gendisk returns that name can be reused and thus we must not
leave these debugfs files around, otherwise the kernel is unhappy
and spews messages like:

	Directory XXXXX with parent 'block' already present!

and the newly created devices will not have working debugfs files.

Move the unregistration to blk_unregister_queue instead (which matches
the sysfs unregistration) to make sure the debugfs life time rules match
those of the disk name.

As part of the move also make sure the whole debugfs unregistration is
inside a single debugfs_mutex critical section.

Note that this breaks blktests block/002, which checks that the debugfs
directory has not been removed while blktests is running, but that
particular check should simply be removed from the test case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220614074827.458955-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-17 07:31:05 -06:00
Christoph Hellwig
5cf9c91ba9 block: serialize all debugfs operations using q->debugfs_mutex
Various places like I/O schedulers or the QOS infrastructure try to
register debugfs files on demans, which can race with creating and
removing the main queue debugfs directory.  Use the existing
debugfs_mutex to serialize all debugfs operations that rely on
q->debugfs_dir or the directories hanging off it.

To make the teardown code a little simpler declare all debugfs dentry
pointers and not just the main one uncoditionally in blkdev.h.

Move debugfs_mutex next to the dentries that it protects and document
what it is used for.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220614074827.458955-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-17 07:31:05 -06:00
Christoph Hellwig
50e34d7881 block: disable the elevator int del_gendisk
The elevator is only used for file system requests, which are stopped in
del_gendisk.  Move disabling the elevator and freeing the scheduler tags
to the end of del_gendisk instead of doing that work in disk_release and
blk_cleanup_queue to avoid a use after free on q->tag_set from
disk_release as the tag_set might not be alive at that point.

Move the blk_qos_exit call as well, as it just depends on the elevator
exit and would be the only reason to keep the not exactly cheap queue
freeze in disk_release.

Fixes: e155b0c238 ("blk-mq: Use shared tags for shared sbitmap support")
Reported-by: syzbot+3e3f419f4a7816471838@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: syzbot+3e3f419f4a7816471838@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20220614074827.458955-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-17 07:31:05 -06:00
Nathan Chancellor
e830315641 riscv: Fix ALT_THEAD_PMA's asm parameters
After commit a35707c3d8 ("riscv: add memory-type errata for T-Head"),
builds with LLVM's integrated assembler fail like:

  In file included from arch/riscv/kernel/asm-offsets.c:10:
  In file included from ./include/linux/mm.h:29:
  In file included from ./include/linux/pgtable.h:6:
  In file included from ./arch/riscv/include/asm/pgtable.h:114:
  ./arch/riscv/include/asm/pgtable-64.h:210:2: error: invalid input constraint '0' in asm
          ALT_THEAD_PMA(prot_val);
          ^
  ./arch/riscv/include/asm/errata_list.h:88:4: note: expanded from macro 'ALT_THEAD_PMA'
          : "0"(_val),                                                    \
            ^

This was reported upstream to LLVM where Jessica pointed out a couple of
issues with the existing implementation of ALT_THEAD_PMA:

* t3 is modified but not listed in the clobbers list.

* "+r"(_val) marks _val as both an input and output of the asm but then
  "0"(_val) marks _val as an input matching constraint, which does not
  make much sense in this situation, as %1 is not actually used in the
  asm and matching constraints are designed to be used for different
  inputs that need to use the same register.

Drop the matching contraint and shift all the operands by one, as %1 is
unused, and mark t3 as clobbered. This resolves the build error and goes
not cause any problems with GNU as.

Fixes: a35707c3d8 ("riscv: add memory-type errata for T-Head")
Link: https://github.com/ClangBuiltLinux/linux/issues/1641
Link: https://github.com/llvm/llvm-project/issues/55514
Link: https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html
Suggested-by: Jessica Clarke <jrtc27@jrtc27.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220518184529.454008-1-nathan@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-06-17 06:16:46 -07:00
Jens Axboe
6436c770f1 io_uring: recycle provided buffer if we punt to io-wq
io_arm_poll_handler() will recycle the buffer appropriately if we end
up arming poll (or if we're ready to retry), but not for the io-wq case
if we have attempted poll first.

Explicitly recycle the buffer to avoid both hanging on to it too long,
but also to avoid multiple reads grabbing the same one. This can happen
for ring mapped buffers, since it hasn't necessarily been committed.

Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Link: https://github.com/axboe/liburing/issues/605
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-17 06:24:26 -06:00
Riccardo Paolo Bestetti
b4a028c4d0 ipv4: ping: fix bind address validity check
Commit 8ff978b8b2 ("ipv4/raw: support binding to nonlocal addresses")
introduced a helper function to fold duplicated validity checks of bind
addresses into inet_addr_valid_or_nonlocal(). However, this caused an
unintended regression in ping_check_bind_addr(), which previously would
reject binding to multicast and broadcast addresses, but now these are
both incorrectly allowed as reported in [1].

This patch restores the original check. A simple reordering is done to
improve readability and make it evident that multicast and broadcast
addresses should not be allowed. Also, add an early exit for INADDR_ANY
which replaces lost behavior added by commit 0ce779a9f5 ("net: Avoid
unnecessary inet_addr_type() call when addr is INADDR_ANY").

Furthermore, this patch introduces regression selftests to catch these
specific cases.

[1] https://lore.kernel.org/netdev/CANP3RGdkAcDyAZoT1h8Gtuu0saq+eOrrTiWbxnOs+5zn+cpyKg@mail.gmail.com/

Fixes: 8ff978b8b2 ("ipv4/raw: support binding to nonlocal addresses")
Cc: Miaohe Lin <linmiaohe@huawei.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Riccardo Paolo Bestetti <pbl@bestov.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-17 11:41:34 +01:00
Xu Jia
2b04495e21 hamradio: 6pack: fix array-index-out-of-bounds in decode_std_command()
Hulk Robot reports incorrect sp->rx_count_cooked value in decode_std_command().
This should be caused by the subtracting from sp->rx_count_cooked before.
It seems that sp->rx_count_cooked value is changed to 0, which bypassed the
previous judgment.

The situation is shown below:

         (Thread 1)			|  (Thread 2)
decode_std_command()		| resync_tnc()
...					|
if (rest == 2)			|
	sp->rx_count_cooked -= 2;	|
else if (rest == 3)			| ...
					| sp->rx_count_cooked = 0;
	sp->rx_count_cooked -= 1;	|
for (i = 0; i < sp->rx_count_cooked; i++) // report error
	checksum += sp->cooked_buf[i];

sp->rx_count_cooked is a shared variable but is not protected by a lock.
The same applies to sp->rx_count. This patch adds a lock to fix the bug.

The fail log is shown below:
=======================================================================
UBSAN: array-index-out-of-bounds in drivers/net/hamradio/6pack.c:925:31
index 400 is out of range for type 'unsigned char [400]'
CPU: 3 PID: 7433 Comm: kworker/u10:1 Not tainted 5.18.0-rc5-00163-g4b97bac0756a #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Workqueue: events_unbound flush_to_ldisc
Call Trace:
 <TASK>
 dump_stack_lvl+0xcd/0x134
 ubsan_epilogue+0xb/0x50
 __ubsan_handle_out_of_bounds.cold+0x62/0x6c
 sixpack_receive_buf+0xfda/0x1330
 tty_ldisc_receive_buf+0x13e/0x180
 tty_port_default_receive_buf+0x6d/0xa0
 flush_to_ldisc+0x213/0x3f0
 process_one_work+0x98f/0x1620
 worker_thread+0x665/0x1080
 kthread+0x2e9/0x3a0
 ret_from_fork+0x1f/0x30
 ...

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xu Jia <xujia39@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-17 11:39:46 +01:00
Hoang Le
911600bf5a tipc: fix use-after-free Read in tipc_named_reinit
syzbot found the following issue on:
==================================================================
BUG: KASAN: use-after-free in tipc_named_reinit+0x94f/0x9b0
net/tipc/name_distr.c:413
Read of size 8 at addr ffff88805299a000 by task kworker/1:9/23764

CPU: 1 PID: 23764 Comm: kworker/1:9 Not tainted
5.18.0-rc4-syzkaller-00878-g17d49e6e8012 #0
Hardware name: Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Workqueue: events tipc_net_finalize_work
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0xeb/0x495
mm/kasan/report.c:313
 print_report mm/kasan/report.c:429 [inline]
 kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491
 tipc_named_reinit+0x94f/0x9b0 net/tipc/name_distr.c:413
 tipc_net_finalize+0x234/0x3d0 net/tipc/net.c:138
 process_one_work+0x996/0x1610 kernel/workqueue.c:2289
 worker_thread+0x665/0x1080 kernel/workqueue.c:2436
 kthread+0x2e9/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
 </TASK>
[...]
==================================================================

In the commit
d966ddcc38 ("tipc: fix a deadlock when flushing scheduled work"),
the cancel_work_sync() function just to make sure ONLY the work
tipc_net_finalize_work() is executing/pending on any CPU completed before
tipc namespace is destroyed through tipc_exit_net(). But this function
is not guaranteed the work is the last queued. So, the destroyed instance
may be accessed in the work which will try to enqueue later.

In order to completely fix, we re-order the calling of cancel_work_sync()
to make sure the work tipc_net_finalize_work() was last queued and it
must be completed by calling cancel_work_sync().

Reported-by: syzbot+47af19f3307fc9c5c82e@syzkaller.appspotmail.com
Fixes: d966ddcc38 ("tipc: fix a deadlock when flushing scheduled work")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-17 11:39:10 +01:00
Jay Vosburgh
e66e257a5d veth: Add updating of trans_start
Since commit 21a75f0915 ("bonding: Fix ARP monitor validation"),
the bonding ARP / ND link monitors depend on the trans_start time to
determine link availability.  NETIF_F_LLTX drivers must update trans_start
directly, which veth does not do.  This prevents use of the ARP or ND link
monitors with veth interfaces in a bond.

	Resolve this by having veth_xmit update the trans_start time.

Reported-by: Jonathan Toppins <jtoppins@redhat.com>
Tested-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Fixes: 21a75f0915 ("bonding: Fix ARP monitor validation")
Link: https://lore.kernel.org/netdev/b2fd4147-8f50-bebd-963a-1a3e8d1d9715@redhat.com/
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-17 11:38:09 +01:00
Eric Dumazet
cc26c2661f net: fix data-race in dev_isalive()
dev_isalive() is called under RTNL or dev_base_lock protection.

This means that changes to dev->reg_state should be done with both locks held.

syzbot reported:

BUG: KCSAN: data-race in register_netdevice / type_show

write to 0xffff888144ecf518 of 1 bytes by task 20886 on cpu 0:
register_netdevice+0xb9f/0xdf0 net/core/dev.c:10050
lapbeth_new_device drivers/net/wan/lapbether.c:414 [inline]
lapbeth_device_event+0x4a0/0x6c0 drivers/net/wan/lapbether.c:456
notifier_call_chain kernel/notifier.c:87 [inline]
raw_notifier_call_chain+0x53/0xb0 kernel/notifier.c:455
__dev_notify_flags+0x1d6/0x3a0
dev_change_flags+0xa2/0xc0 net/core/dev.c:8607
do_setlink+0x778/0x2230 net/core/rtnetlink.c:2780
__rtnl_newlink net/core/rtnetlink.c:3546 [inline]
rtnl_newlink+0x114c/0x16a0 net/core/rtnetlink.c:3593
rtnetlink_rcv_msg+0x811/0x8c0 net/core/rtnetlink.c:6089
netlink_rcv_skb+0x13e/0x240 net/netlink/af_netlink.c:2501
rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:6107
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x58a/0x660 net/netlink/af_netlink.c:1345
netlink_sendmsg+0x661/0x750 net/netlink/af_netlink.c:1921
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg net/socket.c:734 [inline]
__sys_sendto+0x21e/0x2c0 net/socket.c:2119
__do_sys_sendto net/socket.c:2131 [inline]
__se_sys_sendto net/socket.c:2127 [inline]
__x64_sys_sendto+0x74/0x90 net/socket.c:2127
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0

read to 0xffff888144ecf518 of 1 bytes by task 20423 on cpu 1:
dev_isalive net/core/net-sysfs.c:38 [inline]
netdev_show net/core/net-sysfs.c:50 [inline]
type_show+0x24/0x90 net/core/net-sysfs.c:112
dev_attr_show+0x35/0x90 drivers/base/core.c:2095
sysfs_kf_seq_show+0x175/0x240 fs/sysfs/file.c:59
kernfs_seq_show+0x75/0x80 fs/kernfs/file.c:162
seq_read_iter+0x2c3/0x8e0 fs/seq_file.c:230
kernfs_fop_read_iter+0xd1/0x2f0 fs/kernfs/file.c:235
call_read_iter include/linux/fs.h:2052 [inline]
new_sync_read fs/read_write.c:401 [inline]
vfs_read+0x5a5/0x6a0 fs/read_write.c:482
ksys_read+0xe8/0x1a0 fs/read_write.c:620
__do_sys_read fs/read_write.c:630 [inline]
__se_sys_read fs/read_write.c:628 [inline]
__x64_sys_read+0x3e/0x50 fs/read_write.c:628
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0

value changed: 0x00 -> 0x01

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 20423 Comm: udevd Tainted: G W 5.19.0-rc2-syzkaller-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-17 10:59:31 +01:00
Marc Zyngier
cbc6d44867 KVM: arm64: Add Oliver as a reviewer
Oliver Upton has agreed to help with reviewing the KVM/arm64
patches, and has been doing so for a while now, so adding him
as to the reviewer list.

Note that Oliver is using a different email address for this
purpose, rather than the one his been using for his other
contributions.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Oliver Upton <oupton@google.com>
Link: https://lore.kernel.org/r/20220616085318.1303657-1-maz@kernel.org
2022-06-17 09:49:41 +01:00
Quentin Perret
56961c6331 KVM: arm64: Prevent kmemleak from accessing pKVM memory
Commit a7259df767 ("memblock: make memblock_find_in_range method
private") changed the API using which memory is reserved for the pKVM
hypervisor. However, memblock_phys_alloc() differs from the original API in
terms of kmemleak semantics -- the old one didn't report the reserved
regions to kmemleak while the new one does. Unfortunately, when protected
KVM is enabled, all kernel accesses to pKVM-private memory result in a
fatal exception, which can now happen because of kmemleak scans:

$ echo scan > /sys/kernel/debug/kmemleak
[   34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290!
[   34.991580] kvm [304]: Hyp Offset: 0xfffe8be807e00000
[   34.991813] Kernel panic - not syncing: HYP panic:
[   34.991813] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800
[   34.991813] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000
[   34.991813] VCPU:0000000000000000
[   34.993660] CPU: 0 PID: 304 Comm: bash Not tainted 5.19.0-rc2 #102
[   34.994059] Hardware name: linux,dummy-virt (DT)
[   34.994452] Call trace:
[   34.994641]  dump_backtrace.part.0+0xcc/0xe0
[   34.994932]  show_stack+0x18/0x6c
[   34.995094]  dump_stack_lvl+0x68/0x84
[   34.995276]  dump_stack+0x18/0x34
[   34.995484]  panic+0x16c/0x354
[   34.995673]  __hyp_pgtable_total_pages+0x0/0x60
[   34.995933]  scan_block+0x74/0x12c
[   34.996129]  scan_gray_list+0xd8/0x19c
[   34.996332]  kmemleak_scan+0x2c8/0x580
[   34.996535]  kmemleak_write+0x340/0x4a0
[   34.996744]  full_proxy_write+0x60/0xbc
[   34.996967]  vfs_write+0xc4/0x2b0
[   34.997136]  ksys_write+0x68/0xf4
[   34.997311]  __arm64_sys_write+0x20/0x2c
[   34.997532]  invoke_syscall+0x48/0x114
[   34.997779]  el0_svc_common.constprop.0+0x44/0xec
[   34.998029]  do_el0_svc+0x2c/0xc0
[   34.998205]  el0_svc+0x2c/0x84
[   34.998421]  el0t_64_sync_handler+0xf4/0x100
[   34.998653]  el0t_64_sync+0x18c/0x190
[   34.999252] SMP: stopping secondary CPUs
[   35.000034] Kernel Offset: disabled
[   35.000261] CPU features: 0x800,00007831,00001086
[   35.000642] Memory Limit: none
[   35.001329] ---[ end Kernel panic - not syncing: HYP panic:
[   35.001329] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800
[   35.001329] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000
[   35.001329] VCPU:0000000000000000 ]---

Fix this by explicitly excluding the hypervisor's memory pool from
kmemleak like we already do for the hyp BSS.

Cc: Mike Rapoport <rppt@kernel.org>
Fixes: a7259df767 ("memblock: make memblock_find_in_range method private")
Signed-off-by: Quentin Perret <qperret@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220616161135.3997786-1-qperret@google.com
2022-06-17 09:48:38 +01:00
Pierre-Louis Bossart
bb30b453fe ALSA: x86: intel_hdmi_audio: use pm_runtime_resume_and_get()
The current code does not check for errors and does not release the
reference on errors.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://lore.kernel.org/r/20220616222910.136854-3-pierre-louis.bossart@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-17 10:46:38 +02:00
Pierre-Louis Bossart
e87c65aeb4 ALSA: x86: intel_hdmi_audio: enable pm_runtime and set autosuspend delay
The existing code uses pm_runtime_get_sync/put_autosuspend, but
pm_runtime was not explicitly enabled. The autosuspend delay was not
set either, the value is set to 5s since HDMI is rather painful to
resume.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://lore.kernel.org/r/20220616222910.136854-2-pierre-louis.bossart@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-17 10:46:22 +02:00
Pierre-Louis Bossart
6376ab0237 ALSA: hda: intel-nhlt: remove use of __func__ in dev_dbg
The module and function information can be added with
'modprobe foo dyndbg=+pmf'

Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://lore.kernel.org/r/20220616220559.136160-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-17 10:45:41 +02:00
Pierre-Louis Bossart
33fa35db89 ALSA: hda: intel-dspcfg: use SOF for UpExtreme and UpExtreme11 boards
The UpExtreme BIOS reports microphones that are not physically
present, so this module ends-up selecting SOF, while the UpExtreme11
BIOS does not report microphones so the snd-hda-intel driver is
selected.

For consistency use SOF unconditionally in autodetection mode. The use
of the snd-hda-intel driver can still be enabled with
'options snd-intel-dspcfg dsp_driver=1'

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20220616201029.130477-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-17 10:44:52 +02:00
Jiapeng Chong
2328fe7a98 firewire: convert sysfs sprintf/snprintf family to sysfs_emit
Fix the following coccicheck warning:

./drivers/firewire/core-device.c:375:8-16: WARNING: use scnprintf or
sprintf.

Reported-by: Abaci Robot<abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20220615121505.61412-2-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-17 10:43:20 +02:00
Takashi Sakamoto
dda8ad0aa8 firewire: cdev: fix potential leak of kernel stack due to uninitialized value
Recent change brings potential leak of value on kernel stack to userspace
due to uninitialized value.

This commit fixes the bug.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: baa914cd81 ("firewire: add kernel API to access CYCLE_TIME register")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20220512112037.103142-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-17 10:43:11 +02:00
Edward Wu
540a92bfe6 ata: libata: add qc->flags in ata_qc_complete_template tracepoint
Add flags value to check the result of ata completion

Fixes: 255c03d15a ("libata: Add tracepoints")
Cc: stable@vger.kernel.org
Signed-off-by: Edward Wu <edwardwu@realtek.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-06-17 16:30:03 +09:00
Linus Torvalds
47700948a4 Merge tag 'drm-fixes-2022-06-17' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Regular drm fixes for rc3. Nothing too serious, i915, amdgpu and
  exynos all have a few small driver fixes, and two ttm fixes, and one
  compiler warning.

  atomic:
   - fix spurious compiler warning

  ttm:
   - add NULL ptr check in swapout code
   - fix bulk move handling

  i915:
   - Fix page fault on error state read
   - Fix memory leaks in per-gt sysfs
   - Fix multiple fence handling
   - Remove accidental static from a local variable

  amdgpu:
   - Fix regression in GTT size reporting
   - OLED backlight fix

  exynos:
   - Check a null pointer instead of IS_ERR()
   - Rework initialization code of Exynos MIC driver"

* tag 'drm-fixes-2022-06-17' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: Cap OLED brightness per max frame-average luminance
  drm/amdgpu: Fix GTT size reporting in amdgpu_ioctl
  drm/exynos: mic: Rework initialization
  drm/exynos: fix IS_ERR() vs NULL check in probe
  drm/ttm: fix bulk move handling v2
  drm/i915/uc: remove accidental static from a local variable
  drm/i915: Individualize fences before adding to dma_resv obj
  drm/i915/gt: Fix memory leaks in per-gt sysfs
  drm/i915/reset: Fix error_state_read ptr + offset use
  drm/ttm: fix missing NULL check in ttm_device_swapout
  drm/atomic: fix warning of unused variable
2022-06-16 21:39:51 -07:00
Claudiu Manoil
9b7fd1670a phy: aquantia: Fix AN when higher speeds than 1G are not advertised
Even when the eth port is resticted to work with speeds not higher than 1G,
and so the eth driver is requesting the phy (via phylink) to advertise up
to 1000BASET support, the aquantia phy device is still advertising for 2.5G
and 5G speeds.
Clear these advertising defaults when requested.

Cc: Ondrej Spacek <ondrej.spacek@nxp.com>
Fixes: 09c4c57f7b ("net: phy: aquantia: add support for auto-negotiation configuration")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20220610084037.7625-1-claudiu.manoil@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-16 20:25:55 -07:00
Alexei Starovoitov
a4a8b2eea4 Merge branch 'bpf: Fix cookie values for kprobe multi'
Jiri Olsa says:

====================

hi,
there's bug in kprobe_multi link that makes cookies misplaced when
using symbols to attach. The reason is that we sort symbols by name
but not adjacent cookie values. Current test did not find it because
bpf_fentry_test* are already sorted by name.

v3 changes:
  - fixed kprobe_multi bench test to filter out invalid entries
    from available_filter_functions

v2 changes:
  - rebased on top of bpf/master
  - checking if cookies are defined later in swap function [Andrii]
  - added acks

thanks,
jirka
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-16 19:42:22 -07:00
Jiri Olsa
730067022c selftest/bpf: Fix kprobe_multi bench test
With [1] the available_filter_functions file contains records
starting with __ftrace_invalid_address___ and marking disabled
entries.

We need to filter them out for the bench test to pass only
resolvable symbols to kernel.

[1] commit b39181f7c6 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function")

Fixes: b39181f7c6 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220615112118.497303-5-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-16 19:42:21 -07:00
Jiri Olsa
eb5fb03256 bpf: Force cookies array to follow symbols sorting
When user specifies symbols and cookies for kprobe_multi link
interface it's very likely the cookies will be misplaced and
returned to wrong functions (via get_attach_cookie helper).

The reason is that to resolve the provided functions we sort
them before passing them to ftrace_lookup_symbols, but we do
not do the same sort on the cookie values.

Fixing this by using sort_r function with custom swap callback
that swaps cookie values as well.

Fixes: 0236fec57a ("bpf: Resolve symbols with ftrace_lookup_symbols for kprobe multi link")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220615112118.497303-4-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-16 19:42:21 -07:00
Jiri Olsa
eb1b2985fe ftrace: Keep address offset in ftrace_lookup_symbols
We want to store the resolved address on the same index as
the symbol string, because that's the user (bpf kprobe link)
code assumption.

Also making sure we don't store duplicates that might be
present in kallsyms.

Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Fixes: bed0d9a50d ("ftrace: Add ftrace_lookup_symbols function")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220615112118.497303-3-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-16 19:42:21 -07:00
Jiri Olsa
ad8848535e selftests/bpf: Shuffle cookies symbols in kprobe multi test
There's a kernel bug that causes cookies to be misplaced and
the reason we did not catch this with this test is that we
provide bpf_fentry_test* functions already sorted by name.

Shuffling function bpf_fentry_test2 deeper in the list and
keeping the current cookie values as before will trigger
the bug.

The kernel fix is coming in following changes.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220615112118.497303-2-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-16 19:42:21 -07:00
Christian Marangi
e67679cc42 mailmap: add entry for Christian Marangi
Add entry to map ansuelsmth@gmail.com to the unique identity of
Christian Marangi.

Link: https://lkml.kernel.org/r/20220615225012.18782-1-ansuelsmth@gmail.com
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:32 -07:00
zhenwei pi
67f22ba775 mm/memory-failure: disable unpoison once hw error happens
Currently unpoison_memory(unsigned long pfn) is designed for soft
poison(hwpoison-inject) only.  Since 17fae1294a, the KPTE gets cleared
on a x86 platform once hardware memory corrupts.

Unpoisoning a hardware corrupted page puts page back buddy only, the
kernel has a chance to access the page with *NOT PRESENT* KPTE.  This
leads BUG during accessing on the corrupted KPTE.

Suggested by David&Naoya, disable unpoison mechanism when a real HW error
happens to avoid BUG like this:

 Unpoison: Software-unpoisoned page 0x61234
 BUG: unable to handle page fault for address: ffff888061234000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 2c01067 P4D 2c01067 PUD 107267063 PMD 10382b063 PTE 800fffff9edcb062
 Oops: 0002 [#1] PREEMPT SMP NOPTI
 CPU: 4 PID: 26551 Comm: stress Kdump: loaded Tainted: G   M       OE     5.18.0.bm.1-amd64 #7
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ...
 RIP: 0010:clear_page_erms+0x7/0x10
 Code: ...
 RSP: 0000:ffffc90001107bc8 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000901 RCX: 0000000000001000
 RDX: ffffea0001848d00 RSI: ffffea0001848d40 RDI: ffff888061234000
 RBP: ffffea0001848d00 R08: 0000000000000901 R09: 0000000000001276
 R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001
 R13: 0000000000000000 R14: 0000000000140dca R15: 0000000000000001
 FS:  00007fd8b2333740(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff888061234000 CR3: 00000001023d2005 CR4: 0000000000770ee0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  prep_new_page+0x151/0x170
  get_page_from_freelist+0xca0/0xe20
  ? sysvec_apic_timer_interrupt+0xab/0xc0
  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
  __alloc_pages+0x17e/0x340
  __folio_alloc+0x17/0x40
  vma_alloc_folio+0x84/0x280
  __handle_mm_fault+0x8d4/0xeb0
  handle_mm_fault+0xd5/0x2a0
  do_user_addr_fault+0x1d0/0x680
  ? kvm_read_and_reset_apf_flags+0x3b/0x50
  exc_page_fault+0x78/0x170
  asm_exc_page_fault+0x27/0x30

Link: https://lkml.kernel.org/r/20220615093209.259374-2-pizhenwei@bytedance.com
Fixes: 847ce401df ("HWPOISON: Add unpoisoning support")
Fixes: 17fae1294a ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned")
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>	[5.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:32 -07:00
Mike Kravetz
68d32527d3 hugetlbfs: zero partial pages during fallocate hole punch
hugetlbfs fallocate support was originally added with commit 70c3547e36
("hugetlbfs: add hugetlbfs_fallocate()").  Initial support only operated
on whole hugetlb pages.  This makes sense for populating files as other
interfaces such as mmap and truncate require hugetlb page size alignment. 
Only operating on whole hugetlb pages for the hole punch case was a
simplification and there was no compelling use case to zero partial pages.

In a recent discussion[1] it was assumed that hugetlbfs hole punch would
zero partial hugetlb pages as that is in line with the man page
description saying 'partial filesystem blocks are zeroed'.  However, the
hugetlbfs hole punch code actually does this:

        hole_start = round_up(offset, hpage_size);
        hole_end = round_down(offset + len, hpage_size);

Modify code to zero partial hugetlb pages in hole punch range.  It is
possible that application code could note a change in behavior.  However,
that would imply the code is passing in an unaligned range and expecting
only whole pages be removed.  This is unlikely as the fallocate
documentation states the opposite.

The current hugetlbfs fallocate hole punch behavior is tested with the
libhugetlbfs test fallocate_align[2].  This test will be updated to
validate partial page zeroing.

[1] https://lore.kernel.org/linux-mm/20571829-9d3d-0b48-817c-b6b15565f651@redhat.com/
[2] https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/fallocate_align.c

Link: https://lkml.kernel.org/r/YqeiMlZDKI1Kabfe@monkey
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:32 -07:00
Yang Yang
df4ae285a3 mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py
There is no slabinfo.py in tools/cgroup, but has memcg_slabinfo.py instead.

Link: https://lkml.kernel.org/r/20220610024451.744135-1-yang.yang29@zte.com.cn
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:32 -07:00
Alex Williamson
034e5afad9 mm: re-allow pinning of zero pfns
The commit referenced below subtly and inadvertently changed the logic to
disallow pinning of zero pfns.  This breaks device assignment with vfio
and potentially various other users of gup.  Exclude the zero page test
from the negation.

Link: https://lkml.kernel.org/r/165490039431.944052.12458624139225785964.stgit@omen
Fixes: 1c56343258 ("mm: fix is_pinnable_page against a cma page")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Reported-by: Yishai Hadas <yishaih@nvidia.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: John Dias <joaodias@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:32 -07:00
Jason A. Donenfeld
327b18b7aa mm/kfence: select random number before taking raw lock
The RNG uses vanilla spinlocks, not raw spinlocks, so kfence should pick
its random numbers before taking its raw spinlocks.  This also has the
nice effect of doing less work inside the lock.  It should fix a splat
that Geert saw with CONFIG_PROVE_RAW_LOCK_NESTING:

     dump_backtrace.part.0+0x98/0xc0
     show_stack+0x14/0x28
     dump_stack_lvl+0xac/0xec
     dump_stack+0x14/0x2c
     __lock_acquire+0x388/0x10a0
     lock_acquire+0x190/0x2c0
     _raw_spin_lock_irqsave+0x6c/0x94
     crng_make_state+0x148/0x1e4
     _get_random_bytes.part.0+0x4c/0xe8
     get_random_u32+0x4c/0x140
     __kfence_alloc+0x460/0x5c4
     kmem_cache_alloc_trace+0x194/0x1dc
     __kthread_create_on_node+0x5c/0x1a8
     kthread_create_on_node+0x58/0x7c
     printk_start_kthread.part.0+0x34/0xa8
     printk_activate_kthreads+0x4c/0x54
     do_one_initcall+0xec/0x278
     kernel_init_freeable+0x11c/0x214
     kernel_init+0x24/0x124
     ret_from_fork+0x10/0x20

Link: https://lkml.kernel.org/r/20220609123319.17576-1-Jason@zx2c4.com
Fixes: d4150779e6 ("random32: use real rng for non-deterministic randomness")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:31 -07:00
Huacai Chen
8a6f62a26d MAINTAINERS: add maillist information for LoongArch
Now there is a dedicated maillist (loongarch@lists.linux.dev) for
LoongArch, add it for better collaboration.

Link: https://lkml.kernel.org/r/20220616121456.3613470-1-chenhuacai@loongson.cn
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Cc: Huacai Chen <chenhuacai@loongson.cn>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Xuerui Wang <kernel@xen0n.name>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:31 -07:00
Andrew Morton
f0a7d33a71 MAINTAINERS: update MM tree references
Describe the new kernel.org location of the MM trees.

Suggested-by: David Hildenbrand <david@redhat.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:31 -07:00
Abel Vesa
8585c3971d MAINTAINERS: update Abel Vesa's email
Use Abel Vesa's kernel.org account in maintainer entry and mailmap.

Link: https://lkml.kernel.org/r/20220611093142.202271-1-abelvesa@kernel.org
Signed-off-by: Abel Vesa <abelvesa@nxp.com>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Dong Aisheng <aisheng.dong@nxp.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:31 -07:00
David Hildenbrand
7757e7627a MAINTAINERS: add MEMORY HOT(UN)PLUG section and add David as reviewer
There are certainly a lot more files that partially fall into the memory
hot(un)plug category, including parts of mm/sparse.c, mm/page_isolation.c
and mm/page_alloc.c.  Let's only add what's almost completely memory
hot(un)plug related.

Add myself as reviewer so it's easier for contributors to figure out
whom to CC.

Link: https://lkml.kernel.org/r/20220610101258.75738-1-david@redhat.com
Link: https://lkml.kernel.org/r/YqlaE/LYHwB0gpaW@localhost.localdomain
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:31 -07:00
Miaohe Lin
6901c0b6df MAINTAINERS: add Miaohe Lin as a memory-failure reviewer
I have been focusing on mm for the past two years.  e.g.  fixing bugs,
cleaning up the code and reviewing.  I would like to help maintainers and
people working on memory-failure by reviewing their work.

Let me be Cc'd on patches related to memory-failure.

Link: https://lkml.kernel.org/r/20220607145135.38670-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:31 -07:00
Jarkko Sakkinen
515e1d86c9 mailmap: add alias for jarkko@profian.com
Add alias for patches that I contribute on behalf of Profian
(my current employer).

Link: https://lkml.kernel.org/r/20220607164140.1230876-1-jarkko@kernel.org
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:30 -07:00
SeongJae Park
2949282938 mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized
Commit 059342d1dd ("mm/damon/reclaim: fix the timer always stays
active") made DAMON_RECLAIM's 'enabled' parameter store callback,
'enabled_store()', to schedule 'damon_reclaim_timer'.  The scheduling uses
'system_wq', which is initialized in 'workqueue_init_early()'.  As kernel
parameters parsing function ('parse_args()') is called before
'workqueue_init_early()', 'enabled_store()' can be executed before
'workqueue_init_early()' and end up accessing the uninitialized
'system_wq'.  As a result, the booting hang[1].  This commit fixes the
issue by checking if the initialization is done before scheduling the
timer.

[1] https://lkml.kernel.org/20220604192222.1488-1-sj@kernel.org/

Link: https://lkml.kernel.org/r/20220604195051.1589-1-sj@kernel.org
Fixes: 059342d1dd ("mm/damon/reclaim: fix the timer always stays active")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: Greg White <gwhite@kupulau.com>
Cc: Hailong Tu <tuhailong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:30 -07:00
Petr Mladek
d25c83c660 kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal
The comments in kernel/kthread.c create a feeling that only SIGKILL is
able to terminate the creation of kernel kthreads by
kthread_create()/_on_node()/_on_cpu() APIs.

In reality, wait_for_completion_killable() might be killed by any fatal
signal that does not have a custom handler:

	(!siginmask(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \
	 (t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL)

static inline void signal_wake_up(struct task_struct *t, bool resume)
{
	signal_wake_up_state(t, resume ? TASK_WAKEKILL : 0);
}

static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
{
[...]
	/*
	 * Found a killable thread.  If the signal will be fatal,
	 * then start taking the whole group down immediately.
	 */
	if (sig_fatal(p, sig) ...) {
		if (!sig_kernel_coredump(sig)) {
		[...]
			do {
				task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
				sigaddset(&t->pending.signal, SIGKILL);
				signal_wake_up(t, 1);
			} while_each_thread(p, t);
			return;
		}
	}
}

Update the comments in kernel/kthread.c to make this more obvious.

The motivation for this change was debugging why a module initialization
failed.  The module was being loaded from initrd.  It "magically" failed
when systemd was switching to the real root.  The clean up operations sent
SIGTERM to various pending processed that were started from initrd.

Link: https://lkml.kernel.org/r/20220315102444.2380-1-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:30 -07:00
Marcelo Tosatti
3173346337 mm: lru_cache_disable: use synchronize_rcu_expedited
commit ff042f4a9b ("mm: lru_cache_disable: replace work queue
synchronization with synchronize_rcu") replaced lru_cache_disable's usage
of work queues with synchronize_rcu.

Some users reported large performance regressions due to this commit, for
example:
https://lore.kernel.org/all/20220521234616.GO1790663@paulmck-ThinkPad-P17-Gen-1/T/

Switching to synchronize_rcu_expedited fixes the problem.

Link: https://lkml.kernel.org/r/YpToHCmnx/HEcVyR@fuller.cnet
Fixes: ff042f4a9b ("mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu")
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Michael Larabel <Michael@MichaelLarabel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Phil Elwell <phil@raspberrypi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:30 -07:00
Yang Li
042999388e mm/page_isolation.c: fix one kernel-doc comment
Remove one warning found by running scripts/kernel-doc, which is caused by
using 'make W=1':

mm/page_isolation.c:304: warning: Function parameter or member
'skip_isolation' not described in 'isolate_single_pageblock'

Link: https://lkml.kernel.org/r/20220602062116.61199-1-yang.lee@linux.alibaba.com
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:11:30 -07:00
Tyrel Datwyler
aeaadcde1a scsi: ibmvfc: Store vhost pointer during subcrq allocation
Currently the back pointer from a queue to the vhost adapter isn't set
until after subcrq interrupt registration. The value is available when a
queue is first allocated and can/should be also set for primary and async
queues as well as subcrqs.

This fixes a crash observed during kexec/kdump on Power 9 with legacy XICS
interrupt controller where a pending subcrq interrupt from the previous
kernel can be replayed immediately upon IRQ registration resulting in
dereference of a garbage backpointer in ibmvfc_interrupt_scsi().

Kernel attempted to read user page (58) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000058
Faulting instruction address: 0xc008000003216a08
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c008000003216a08] ibmvfc_interrupt_scsi+0x40/0xb0 [ibmvfc]
LR [c0000000082079e8] __handle_irq_event_percpu+0x98/0x270
Call Trace:
[c000000047fa3d80] [c0000000123e6180] 0xc0000000123e6180 (unreliable)
[c000000047fa3df0] [c0000000082079e8] __handle_irq_event_percpu+0x98/0x270
[c000000047fa3ea0] [c000000008207d18] handle_irq_event+0x98/0x188
[c000000047fa3ef0] [c00000000820f564] handle_fasteoi_irq+0xc4/0x310
[c000000047fa3f40] [c000000008205c60] generic_handle_irq+0x50/0x80
[c000000047fa3f60] [c000000008015c40] __do_irq+0x70/0x1a0
[c000000047fa3f90] [c000000008016d7c] __do_IRQ+0x9c/0x130
[c000000014622f60] [0000000020000000] 0x20000000
[c000000014622ff0] [c000000008016e50] do_IRQ+0x40/0xa0
[c000000014623020] [c000000008017044] replay_soft_interrupts+0x194/0x2f0
[c000000014623210] [c0000000080172a8] arch_local_irq_restore+0x108/0x170
[c000000014623240] [c000000008eb1008] _raw_spin_unlock_irqrestore+0x58/0xb0
[c000000014623270] [c00000000820b12c] __setup_irq+0x49c/0x9f0
[c000000014623310] [c00000000820b7c0] request_threaded_irq+0x140/0x230
[c000000014623380] [c008000003212a50] ibmvfc_register_scsi_channel+0x1e8/0x2f0 [ibmvfc]
[c000000014623450] [c008000003213d1c] ibmvfc_init_sub_crqs+0xc4/0x1f0 [ibmvfc]
[c0000000146234d0] [c0080000032145a8] ibmvfc_reset_crq+0x150/0x210 [ibmvfc]
[c000000014623550] [c0080000032147c8] ibmvfc_init_crq+0x160/0x280 [ibmvfc]
[c0000000146235f0] [c00800000321a9cc] ibmvfc_probe+0x2a4/0x530 [ibmvfc]

Link: https://lore.kernel.org/r/20220616191126.1281259-2-tyreld@linux.ibm.com
Fixes: 3034ebe263 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Cc: stable@vger.kernel.org
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-16 21:42:04 -04:00
Tyrel Datwyler
72ea7fe0db scsi: ibmvfc: Allocate/free queue resource only during probe/remove
Currently, the sub-queues and event pool resources are allocated/freed for
every CRQ connection event such as reset and LPM. This exposes the driver
to a couple issues. First the inefficiency of freeing and reallocating
memory that can simply be resued after being sanitized. Further, a system
under memory pressue runs the risk of allocation failures that could result
in a crippled driver. Finally, there is a race window where command
submission/compeletion can try to pull/return elements from/to an event
pool that is being deleted or already has been deleted due to the lack of
host state around freeing/allocating resources. The following is an example
of list corruption following a live partition migration (LPM):

Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: vfat fat isofs cdrom ext4 mbcache jbd2 nft_counter nft_compat nf_tables nfnetlink rpadlpar_io rpaphp xsk_diag nfsv3 nfs_acl nfs lockd grace fscache netfs rfkill bonding tls sunrpc pseries_rng drm drm_panel_orientation_quirks xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc scsi_transport_fc ibmveth vmx_crypto dm_multipath dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse
CPU: 0 PID: 2108 Comm: ibmvfc_0 Kdump: loaded Not tainted 5.14.0-70.9.1.el9_0.ppc64le #1
NIP: c0000000007c4bb0 LR: c0000000007c4bac CTR: 00000000005b9a10
REGS: c00000025c10b760 TRAP: 0700  Not tainted (5.14.0-70.9.1.el9_0.ppc64le)
MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 2800028f XER: 0000000f
CFAR: c0000000001f55bc IRQMASK: 0
        GPR00: c0000000007c4bac c00000025c10ba00 c000000002a47c00 000000000000004e
        GPR04: c0000031e3006f88 c0000031e308bd00 c00000025c10b768 0000000000000027
        GPR08: 0000000000000000 c0000031e3009dc0 00000031e0eb0000 0000000000000000
        GPR12: c0000031e2ffffa8 c000000002dd0000 c000000000187108 c00000020fcee2c0
        GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR20: 0000000000000000 0000000000000000 0000000000000000 c008000002f81300
        GPR24: 5deadbeef0000100 5deadbeef0000122 c000000263ba6910 c00000024cc88000
        GPR28: 000000000000003c c0000002430a0000 c0000002430ac300 000000000000c300
NIP [c0000000007c4bb0] __list_del_entry_valid+0x90/0x100
LR [c0000000007c4bac] __list_del_entry_valid+0x8c/0x100
Call Trace:
[c00000025c10ba00] [c0000000007c4bac] __list_del_entry_valid+0x8c/0x100 (unreliable)
[c00000025c10ba60] [c008000002f42284] ibmvfc_free_queue+0xec/0x210 [ibmvfc]
[c00000025c10bb10] [c008000002f4246c] ibmvfc_deregister_scsi_channel+0xc4/0x160 [ibmvfc]
[c00000025c10bba0] [c008000002f42580] ibmvfc_release_sub_crqs+0x78/0x130 [ibmvfc]
[c00000025c10bc20] [c008000002f4f6cc] ibmvfc_do_work+0x5c4/0xc70 [ibmvfc]
[c00000025c10bce0] [c008000002f4fdec] ibmvfc_work+0x74/0x1e8 [ibmvfc]
[c00000025c10bda0] [c0000000001872b8] kthread+0x1b8/0x1c0
[c00000025c10be10] [c00000000000cd64] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
40820034 38600001 38210060 4e800020 7c0802a6 7c641b78 3c62fe7a 7d254b78
3863b590 f8010070 4ba309cd 60000000 <0fe00000> 7c0802a6 3c62fe7a 3863b640
---[ end trace 11a2b65a92f8b66c ]---
ibmvfc 30000003: Send warning. Receive queue closed, will retry.

Add registration/deregistration helpers that are called instead during
connection resets to sanitize and reconfigure the queues.

Link: https://lore.kernel.org/r/20220616191126.1281259-3-tyreld@linux.ibm.com
Fixes: 3034ebe263 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Cc: stable@vger.kernel.org
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-16 21:40:10 -04:00
Saurabh Sengar
1d3e098078 scsi: storvsc: Correct reporting of Hyper-V I/O size limits
Current code is based on the idea that the max number of SGL entries
also determines the max size of an I/O request.  While this idea was
true in older versions of the storvsc driver when SGL entry length
was limited to 4 Kbytes, commit 3d9c3dcc58 ("scsi: storvsc: Enable
scatterlist entry lengths > 4Kbytes") removed that limitation. It's
now theoretically possible for the block layer to send requests that
exceed the maximum size supported by Hyper-V. This problem doesn't
currently happen in practice because the block layer defaults to a
512 Kbyte maximum, while Hyper-V in Azure supports 2 Mbyte I/O sizes.
But some future configuration of Hyper-V could have a smaller max I/O
size, and the block layer could exceed that max.

Fix this by correctly setting max_sectors as well as sg_tablesize to
reflect the maximum I/O size that Hyper-V reports. While allowing
I/O sizes larger than the block layer default of 512 Kbytes doesn’t
provide any noticeable performance benefit in the tests we ran, it's
still appropriate to report the correct underlying Hyper-V capabilities
to the Linux block layer.

Also tweak the virt_boundary_mask to reflect that the required
alignment derives from Hyper-V communication using a 4 Kbyte page size,
and not on the guest page size, which might be bigger (eg. ARM64).

Link: https://lore.kernel.org/r/1655190355-28722-1-git-send-email-ssengar@linux.microsoft.com
Fixes: 3d9c3dcc58 ("scsi: storvsc: Enable scatter list entry lengths > 4Kbytes")
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-16 21:36:03 -04:00
Dave Airlie
65cf7c02cf Merge tag 'exynos-drm-fixes-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes
two regression fixups
- Check a null pointer instead of IS_ERR().
- Rework initialization code of Exynos MIC driver.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220614141336.88614-1-inki.dae@samsung.com
2022-06-17 11:32:35 +10:00
Bart Van Assche
2acd76e7b8 scsi: ufs: Fix a race between the interrupt handler and the reset handler
Prevent that both the interrupt handler and the reset handler try to
complete a request at the same time. This patch is the result of an
analysis of the following crash:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000120
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           OE     5.10.107-android13-4-00051-g1e48e8970cca-ab8664745 #1
pc : ufshcd_release_scsi_cmd+0x30/0x46c
lr : __ufshcd_transfer_req_compl+0x4fc/0x9c0
Call trace:
 ufshcd_release_scsi_cmd+0x30/0x46c
 __ufshcd_transfer_req_compl+0x4fc/0x9c0
 ufshcd_poll+0xf0/0x208
 ufshcd_sl_intr+0xb8/0xf0
 ufshcd_intr+0x168/0x2f4
 __handle_irq_event_percpu+0xa0/0x30c
 handle_irq_event+0x84/0x178
 handle_fasteoi_irq+0x150/0x2e8
 __handle_domain_irq+0x114/0x1e4
 gic_handle_irq.31846+0x58/0x300
 el1_irq+0xe4/0x1c0
 cpuidle_enter_state+0x3ac/0x8c4
 do_idle+0x2fc/0x55c
 cpu_startup_entry+0x84/0x90
 kernel_init+0x0/0x310
 start_kernel+0x0/0x608
 start_kernel+0x4ec/0x608

Link: https://lore.kernel.org/r/20220613214442.212466-4-bvanassche@acm.org
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-16 21:32:09 -04:00
Bart Van Assche
d1a7644648 scsi: ufs: Support clearing multiple commands at once
Modify ufshcd_clear_cmd() such that it supports clearing multiple commands
at once instead of one command at a time. This change will be used in a
later patch to reduce the time spent in the reset handler.

Link: https://lore.kernel.org/r/20220613214442.212466-3-bvanassche@acm.org
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-16 21:32:09 -04:00
Bart Van Assche
da8badd7d3 scsi: ufs: Simplify ufshcd_clear_cmd()
Remove the local variable 'err'. This patch does not change any
functionality.

Link: https://lore.kernel.org/r/20220613214442.212466-2-bvanassche@acm.org
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-16 21:32:09 -04:00
Dave Airlie
d08227a8b1 Merge tag 'amd-drm-fixes-5.19-2022-06-15' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.19-2022-06-15:

amdgpu:
- Fix regression in GTT size reporting
- OLED backlight fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220615205609.28763-1-alexander.deucher@amd.com
2022-06-17 11:17:37 +10:00
Dave Airlie
3f0acf259a Merge tag 'drm-intel-fixes-2022-06-16' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.19-rc3:
- Fix page fault on error state read
- Fix memory leaks in per-gt sysfs
- Fix multiple fence handling
- Remove accidental static from a local variable

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8735g5xd25.fsf@intel.com
2022-06-17 10:24:42 +10:00
Mikulas Patocka
85e123c27d dm mirror log: round up region bitmap size to BITS_PER_LONG
The code in dm-log rounds up bitset_size to 32 bits. It then uses
find_next_zero_bit_le on the allocated region. find_next_zero_bit_le
accesses the bitmap using unsigned long pointers. So, on 64-bit
architectures, it may access 4 bytes beyond the allocated size.

Fix this bug by rounding up bitset_size to BITS_PER_LONG.

This bug was found by running the lvm2 testsuite with kasan.

Fixes: 29121bd0b0 ("[PATCH] dm mirror log: bitset_size fix")
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-16 19:39:29 -04:00
Mikulas Patocka
1ee88de395 dm: fix narrow race for REQ_NOWAIT bios being issued despite no support
Starting with the commit 63a225c9fd20, device mapper has an optimization
that it will take cheaper table lock (dm_get_live_table_fast instead of
dm_get_live_table) if the bio has REQ_NOWAIT. The bios with REQ_NOWAIT
must not block in the target request routine, if they did, we would be
blocking while holding rcu_read_lock, which is prohibited.

The targets that are suitable for REQ_NOWAIT optimization (and that don't
block in the map routine) have the flag DM_TARGET_NOWAIT set. Device
mapper will test if all the targets and all the devices in a table
support nowait (see the function dm_table_supports_nowait) and it will set
or clear the QUEUE_FLAG_NOWAIT flag on its request queue according to
this check.

There's a test in submit_bio_noacct: "if ((bio->bi_opf & REQ_NOWAIT) &&
!blk_queue_nowait(q)) goto not_supported" - this will make sure that
REQ_NOWAIT bios can't enter a request queue that doesn't support them.

This mechanism works to prevent REQ_NOWAIT bios from reaching dm targets
that don't support the REQ_NOWAIT flag (and that may block in the map
routine) - except that there is a small race condition:

submit_bio_noacct checks if the queue has the QUEUE_FLAG_NOWAIT without
holding any locks. Immediatelly after this check, the device mapper table
may be reloaded with a table that doesn't support REQ_NOWAIT (for example,
if we start moving the logical volume or if we activate a snapshot).
However the REQ_NOWAIT bio that already passed the check in
submit_bio_noacct would be sent to device mapper, where it could be
redirected to a dm target that doesn't support REQ_NOWAIT - the result is
sleeping while we hold rcu_read_lock.

In order to fix this race, we double-check if the target supports
REQ_NOWAIT while we hold the table lock (so that the table can't change
under us).

Fixes: 563a225c9f ("dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-16 19:39:02 -04:00
Mikulas Patocka
5d7362d0d5 dm: fix use-after-free in dm_put_live_table_bio
dm_put_live_table_bio is called from the end of dm_submit_bio.
However, at this point, the bio may be already finished and the caller
may have freed the bio. Consequently, dm_put_live_table_bio accesses
the stale "bio" pointer.

Fix this bug by loading the bi_opf value and passing it to
dm_get_live_table_bio and dm_put_live_table_bio instead of the bio.

This bug was found by running the lvm2 testsuite with kasan.

Fixes: 563a225c9f ("dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-16 19:38:49 -04:00
Dave Airlie
2f90ec1271 Merge tag 'drm-misc-fixes-2022-06-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Two fixes for TTM, one for a NULL pointer dereference and one to make sure
the buffer is pinned prior to a bulk move, and a fix for a spurious
compiler warning.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220616072519.qwrsefsemejefowu@houat
2022-06-17 09:31:37 +10:00
Steve French
7c05eae8db smb3: add trace point for SMB2_set_eof
In order to debug problems with file size being reported incorrectly
temporarily (in this case xfstest generic/584 intermittent failure)
we need to add trace point for the non-compounded code path where
we set the file size (SMB2_set_eof).  The new trace point is:
   "smb3_set_eof"

Here is sample output from the tracepoint:

            TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
              | |         |   |||||     |         |
          xfs_io-75403   [002] ..... 95219.189835: smb3_set_eof: xid=221 sid=0xeef1cbd2 tid=0x27079ee6 fid=0x52edb58c offset=0x100000
 aio-dio-append--75418   [010] ..... 95219.242402: smb3_set_eof: xid=226 sid=0xeef1cbd2 tid=0x27079ee6 fid=0xae89852d offset=0x0

Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-16 18:07:10 -05:00
Joel Savitz
9b4d5c01eb selftests: make use of GUP_TEST_FILE macro
Commit 17de1e559c ("selftests: clarify common error when running
gup_test") had most of its hunks dropped due to a conflict with another
patch accepted into Linux around the same time that implemented the same
behavior as a subset of other changes.

However, the remaining hunk defines the GUP_TEST_FILE macro without
making use of it. This patch makes use of the macro in the two relevant
places.

Furthermore, the above mentioned commit's log message erroneously describes
the changes that were dropped from the patch.

This patch corrects the record.

Fixes: 17de1e559c ("selftests: clarify common error when running gup_test")

Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Nico Pache <npache@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-06-16 17:05:50 -06:00
Bart Van Assche
b96f3cab59 block/bfq: Enable I/O statistics
BFQ uses io_start_time_ns. That member variable is only set if I/O
statistics are enabled. Hence this patch that enables I/O statistics
at the time BFQ is associated with a request queue.

Compile-tested only.

Reported-by: Cixi Geng <cixi.geng1@unisoc.com>
Cc: Cixi Geng <cixi.geng1@unisoc.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Paolo Valente <paolo.valente@unimore.it>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-16 16:59:28 -06:00
Linus Torvalds
0639b599f6 Merge tag 'audit-pr-20220616' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit fix from Paul Moore:
 "A single audit patch to fix a problem where we were not properly
  freeing memory allocated when recording information related to a
  module load"

* tag 'audit-pr-20220616' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: free module name
2022-06-16 15:53:38 -07:00
Linus Torvalds
6decbf75c9 Merge tag 'selinux-pr-20220616' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
 "A single SELinux patch to fix memory leaks when mounting filesystems
  with SELinux mount options"

* tag 'selinux-pr-20220616' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: free contexts previously transferred in selinux_add_opt()
2022-06-16 15:50:36 -07:00
Palmer Dabbelt
c836d9d17a RISC-V: Some Svpbmt fixes
Some additionals comments and notes from autobuilders received after the
series got applied, warranted some changes.

* commit '924cbb8cbe3460ea192e6243017ceb0ceb255b1b':
  riscv: Improve description for RISCV_ISA_SVPBMT Kconfig symbol
  riscv: drop cpufeature_apply_feature tracking variable
  riscv: fix dependency for t-head errata
2022-06-16 15:48:39 -07:00
Heiko Stuebner
924cbb8cbe riscv: Improve description for RISCV_ISA_SVPBMT Kconfig symbol
This improves the symbol's description to make it easier for
people to understand what it is about.

Suggested-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20220526205646.258337-3-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-06-16 15:47:39 -07:00
Heiko Stuebner
237c0ee474 riscv: drop cpufeature_apply_feature tracking variable
The variable was tracking which feature patches got applied
but that information was never actually used - and thus resulted
in a warning as well.

Drop the variable.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20220526205646.258337-2-heiko@sntech.de
Fixes: ff689fd21c ("riscv: add RISC-V Svpbmt extension support")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-06-16 15:47:31 -07:00
Heiko Stuebner
21f356f990 riscv: fix dependency for t-head errata
alternatives only work correctly on non-xip-kernels and while the
selected alternative-symbol has the correct dependency the symbol
selecting it also needs that dependency.

So add the missing dependency to the T-Head errata Kconfig symbol.

Reported-by: kernel test robot <yujie.liu@intel.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220526205646.258337-5-heiko@sntech.de
Fixes: a35707c3d8 ("riscv: add memory-type errata for T-Head")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-06-16 15:42:55 -07:00
Palmer Dabbelt
a7c1c97fb1 Merge tag 'dt-fixes-for-palmer-5.19-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/conor/linux into fixes
Microchip RISC-V devicetree fixes for 5.19-rc3

A single fix for mpfs.dtsi:
- The sifive pdma entry fell through the cracks between versions of my
  dt patches & I gave Zong the wrong conflict resolution, so it is
  added back.

* tag 'dt-fixes-for-palmer-5.19-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: microchip: re-add pdma to mpfs device tree
2022-06-16 15:13:10 -07:00
Dominique Martinet
b0017602fd 9p: fix EBADF errors in cached mode
cached operations sometimes need to do invalid operations (e.g. read
on a write only file)
Historic fscache had added a "writeback fid", a special handle opened
RW as root, for this. The conversion to new fscache missed that bit.

This commit reinstates a slightly lesser variant of the original code
that uses the writeback fid for partial pages backfills if the regular
user fid had been open as WRONLY, and thus would lack read permissions.

Link: https://lkml.kernel.org/r/20220614033802.1606738-1-asmadeus@codewreck.org
Fixes: eb497943fa ("9p: Convert to using the netfs helper lib to do reads and caching")
Cc: stable@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>
Reported-By: Christian Schoenebeck <linux_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Tested-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-06-17 06:03:30 +09:00
Ming Lei
6cfeadbff3 blk-mq: don't clear flush_rq from tags->rqs[]
commit 364b61818f ("blk-mq: clearing flush request reference in
tags->rqs[]") is added to clear the to-be-free flush request from
tags->rqs[] for avoiding use-after-free on the flush rq.

Yu Kuai reported that blk_mq_clear_flush_rq_mapping() slows down boot time
by ~8s because running scsi probe which may create and remove lots of
unpresent LUNs on megaraid-sas which uses BLK_MQ_F_TAG_HCTX_SHARED and
each request queue has lots of hw queues.

Improve the situation by not running blk_mq_clear_flush_rq_mapping if
disk isn't added when there can't be any flush request issued.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220616014401.817001-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-16 14:45:15 -06:00
Ming Lei
4d337cebcb blk-mq: avoid to touch q->elevator without any protection
q->elevator is referred in blk_mq_has_sqsched() without any protection,
no .q_usage_counter is held, no queue srcu and rcu read lock is held,
so potential use-after-free may be triggered.

Fix the issue by adding one queue flag for checking if the elevator
uses single queue style dispatch. Meantime the elevator feature flag
of ELEVATOR_F_MQ_AWARE isn't needed any more.

Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220616014401.817001-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-16 14:45:15 -06:00
Ming Lei
5fd7a84a09 blk-mq: protect q->elevator by ->sysfs_lock in blk_mq_elv_switch_none
elevator can be tore down by sysfs switch interface or disk release, so
hold ->sysfs_lock before referring to q->elevator, then potential
use-after-free can be avoided.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220616014401.817001-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-16 14:45:15 -06:00
Bart Van Assche
14dc7a18ab block: Fix handling of offline queues in blk_mq_alloc_request_hctx()
This patch prevents that test nvme/004 triggers the following:

UBSAN: array-index-out-of-bounds in block/blk-mq.h:135:9
index 512 is out of range for type 'long unsigned int [512]'
Call Trace:
 show_stack+0x52/0x58
 dump_stack_lvl+0x49/0x5e
 dump_stack+0x10/0x12
 ubsan_epilogue+0x9/0x3b
 __ubsan_handle_out_of_bounds.cold+0x44/0x49
 blk_mq_alloc_request_hctx+0x304/0x310
 __nvme_submit_sync_cmd+0x70/0x200 [nvme_core]
 nvmf_connect_io_queue+0x23e/0x2a0 [nvme_fabrics]
 nvme_loop_connect_io_queues+0x8d/0xb0 [nvme_loop]
 nvme_loop_create_ctrl+0x58e/0x7d0 [nvme_loop]
 nvmf_create_ctrl+0x1d7/0x4d0 [nvme_fabrics]
 nvmf_dev_write+0xae/0x111 [nvme_fabrics]
 vfs_write+0x144/0x560
 ksys_write+0xb7/0x140
 __x64_sys_write+0x42/0x50
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Fixes: 20e4d81393 ("blk-mq: simplify queue mapping & schedule with each possisble CPU")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220615210004.1031820-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-16 14:43:31 -06:00
Ding Xiang
3084a4ec7f selftests: vm: Fix resource leak when return error
When return on an error path, file handle need to be closed
to prevent resource leak

Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-06-16 14:14:08 -06:00
Yu Liao
12a29115be selftests dma: fix compile error for dma_map_benchmark
When building selftests/dma:
$ make -C tools/testing/selftests TARGETS=dma
I hit the following compilation error:

dma_map_benchmark.c:13:10: fatal error: linux/map_benchmark.h: No such file or directory
 #include <linux/map_benchmark.h>
          ^~~~~~~~~~~~~~~~~~~~~~~

dma/Makefile does not include the map_benchmark.h path, so add
more including path, and fix include order in dma_map_benchmark.c

Fixes: 8ddde07a3d ("dma-mapping: benchmark: extract a common header file for map_benchmark definition")
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-06-16 14:03:21 -06:00
Jakub Sitnicki
5e0b0a4c52 selftests/bpf: Test tail call counting with bpf2bpf and data on stack
Cover the case when tail call count needs to be passed from BPF function to
BPF function, and the caller has data on stack. Specifically when the size
of data allocated on BPF stack is not a multiple on 8.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220616162037.535469-3-jakub@cloudflare.com
2022-06-16 21:49:05 +02:00
Jakub Sitnicki
ff672c67ee bpf, x86: Fix tail call count offset calculation on bpf2bpf call
On x86-64 the tail call count is passed from one BPF function to another
through %rax. Additionally, on function entry, the tail call count value
is stored on stack right after the BPF program stack, due to register
shortage.

The stored count is later loaded from stack either when performing a tail
call - to check if we have not reached the tail call limit - or before
calling another BPF function call in order to pass it via %rax.

In the latter case, we miscalculate the offset at which the tail call count
was stored on function entry. The JIT does not take into account that the
allocated BPF program stack is always a multiple of 8 on x86, while the
actual stack depth does not have to be.

This leads to a load from an offset that belongs to the BPF stack, as shown
in the example below:

SEC("tc")
int entry(struct __sk_buff *skb)
{
	/* Have data on stack which size is not a multiple of 8 */
	volatile char arr[1] = {};
	return subprog_tail(skb);
}

int entry(struct __sk_buff * skb):
   0: (b4) w2 = 0
   1: (73) *(u8 *)(r10 -1) = r2
   2: (85) call pc+1#bpf_prog_ce2f79bb5f3e06dd_F
   3: (95) exit

int entry(struct __sk_buff * skb):
   0xffffffffa0201788:  nop    DWORD PTR [rax+rax*1+0x0]
   0xffffffffa020178d:  xor    eax,eax
   0xffffffffa020178f:  push   rbp
   0xffffffffa0201790:  mov    rbp,rsp
   0xffffffffa0201793:  sub    rsp,0x8
   0xffffffffa020179a:  push   rax
   0xffffffffa020179b:  xor    esi,esi
   0xffffffffa020179d:  mov    BYTE PTR [rbp-0x1],sil
   0xffffffffa02017a1:  mov    rax,QWORD PTR [rbp-0x9]	!!! tail call count
   0xffffffffa02017a8:  call   0xffffffffa02017d8       !!! is at rbp-0x10
   0xffffffffa02017ad:  leave
   0xffffffffa02017ae:  ret

Fix it by rounding up the BPF stack depth to a multiple of 8, when
calculating the tail call count offset on stack.

Fixes: ebf7d1f508 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220616162037.535469-2-jakub@cloudflare.com
2022-06-16 21:48:24 +02:00
Linus Torvalds
48a23ec6ff Merge tag 'net-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Mostly driver fixes.

  Current release - regressions:

   - Revert "net: Add a second bind table hashed by port and address",
     needs more work

   - amd-xgbe: use platform_irq_count(), static setup of IRQ resources
     had been removed from DT core

   - dts: at91: ksz9477_evb: add phy-mode to fix port/phy validation

  Current release - new code bugs:

   - hns3: modify the ring param print info

  Previous releases - always broken:

   - axienet: make the 64b addressable DMA depends on 64b architectures

   - iavf: fix issue with MAC address of VF shown as zero

   - ice: fix PTP TX timestamp offset calculation

   - usb: ax88179_178a needs FLAG_SEND_ZLP

  Misc:

   - document some net.sctp.* sysctls"

* tag 'net-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (31 commits)
  net: axienet: add missing error return code in axienet_probe()
  Revert "net: Add a second bind table hashed by port and address"
  net: ax25: Fix deadlock caused by skb_recv_datagram in ax25_recvmsg
  net: usb: ax88179_178a needs FLAG_SEND_ZLP
  MAINTAINERS: add include/dt-bindings/net to NETWORKING DRIVERS
  ARM: dts: at91: ksz9477_evb: fix port/phy validation
  net: bgmac: Fix an erroneous kfree() in bgmac_remove()
  ice: Fix memory corruption in VF driver
  ice: Fix queue config fail handling
  ice: Sync VLAN filtering features for DVM
  ice: Fix PTP TX timestamp offset calculation
  mlxsw: spectrum_cnt: Reorder counter pools
  docs: networking: phy: Fix a typo
  amd-xgbe: Use platform_irq_count()
  octeontx2-vf: Add support for adaptive interrupt coalescing
  xilinx:  Fix build on x86.
  net: axienet: Use iowrite64 to write all 64b descriptor pointers
  net: axienet: make the 64b addresable DMA depends on 64b archectures
  net: hns3: fix tm port shapping of fibre port is incorrect after driver initialization
  net: hns3: fix PF rss size initialization bug
  ...
2022-06-16 11:51:32 -07:00
Yang Yingliang
2e7bf4a6af net: axienet: add missing error return code in axienet_probe()
It should return error code in error path in axienet_probe().

Fixes: 00be43a74c ("net: axienet: make the 64b addresable DMA depends on 64b archectures")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220616062917.3601-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-16 11:08:38 -07:00
Joanne Koong
593d1ebe00 Revert "net: Add a second bind table hashed by port and address"
This reverts:

commit d5a42de8bd ("net: Add a second bind table hashed by port and address")
commit 538aaf9b23 ("selftests: Add test for timing a bind request to a port with a populated bhash entry")
Link: https://lore.kernel.org/netdev/20220520001834.2247810-1-kuba@kernel.org/

There are a few things that need to be fixed here:
* Updating bhash2 in cases where the socket's rcv saddr changes
* Adding bhash2 hashbucket locks

Links to syzbot reports:
https://lore.kernel.org/netdev/00000000000022208805e0df247a@google.com/
https://lore.kernel.org/netdev/0000000000003f33bc05dfaf44fe@google.com/

Fixes: d5a42de8bd ("net: Add a second bind table hashed by port and address")
Reported-by: syzbot+015d756bbd1f8b5c8f09@syzkaller.appspotmail.com
Reported-by: syzbot+98fd2d1422063b0f8c44@syzkaller.appspotmail.com
Reported-by: syzbot+0a847a982613c6438fba@syzkaller.appspotmail.com
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://lore.kernel.org/r/20220615193213.2419568-1-joannelkoong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-16 11:07:59 -07:00
Mark Brown
3f77a1d057 arm64/cpufeature: Unexport set_cpu_feature()
We currently export set_cpu_feature() to modules but there are no in tree
users that can be built as modules and it is hard to see cases where it
would make sense for there to be any such users. Remove the export to avoid
anyone else having to worry about why it is there and ensure that any users
that do get added get a bit more visiblity.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220615191504.626604-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-06-16 18:42:26 +01:00
Jan Kara
8d5459c11f ext4: improve write performance with disabled delalloc
When delayed allocation is disabled (either through mount option or
because we are running low on free space), ext4_write_begin() allocates
blocks with EXT4_GET_BLOCKS_IO_CREATE_EXT flag. With this flag extent
merging is disabled and since ext4_write_begin() is called for each page
separately, we end up with a *lot* of 1 block extents in the extent tree
and following writeback is writing 1 block at a time which results in
very poor write throughput (4 MB/s instead of 200 MB/s). These days when
ext4_get_block_unwritten() is used only by ext4_write_begin(),
ext4_page_mkwrite() and inline data conversion, we can safely allow
extent merging to happen from these paths since following writeback will
happen on different boundaries anyway. So use
EXT4_GET_BLOCKS_CREATE_UNRIT_EXT instead which restores the performance.

Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220520111402.4252-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-16 12:17:56 -04:00
Zhang Yi
15baa7dcad ext4: fix warning when submitting superblock in ext4_commit_super()
We have already check the io_error and uptodate flag before submitting
the superblock buffer, and re-set the uptodate flag if it has been
failed to write out. But it was lockless and could be raced by another
ext4_commit_super(), and finally trigger '!uptodate' WARNING when
marking buffer dirty. Fix it by submit buffer directly.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20220520023216.3065073-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-16 11:50:48 -04:00
Dylan Yudaken
32fc810b36 io_uring: do not use prio task_work_add in uring_cmd
io_req_task_prio_work_add has a strict assumption that it will only be
used with io_req_task_complete. There is a codepath that assumes this is
the case and will not even call the completion function if it is hit.

For uring_cmd with an arbitrary completion function change the call to the
correct non-priority version.

Fixes: ee692a21e9 ("fs,io_uring: add infrastructure for uring-cmd")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/20220616135011.441980-1-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-16 09:10:26 -06:00
Wang Jianjian
3103084afc ext4, doc: remove unnecessary escaping
Signed-off-by: Wang Jianjian <wangjianjian3@huawei.com>
Link: https://lore.kernel.org/r/20220520022255.2120576-2-wangjianjian3@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-16 11:03:17 -04:00
Wang Jianjian
48e02e6113 ext4: fix incorrect comment in ext4_bio_write_page()
Signed-off-by: Wang Jianjian <wangjianjian3@huawei.com>
Link: https://lore.kernel.org/r/20220520022255.2120576-1-wangjianjian3@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-16 11:03:16 -04:00
Sascha Hauer
06781a5026 mtd: rawnand: gpmi: Fix setting busy timeout setting
The DEVICE_BUSY_TIMEOUT value is described in the Reference Manual as:

| Timeout waiting for NAND Ready/Busy or ATA IRQ. Used in WAIT_FOR_READY
| mode. This value is the number of GPMI_CLK cycles multiplied by 4096.

So instead of multiplying the value in cycles with 4096, we have to
divide it by that value. Use DIV_ROUND_UP to make sure we are on the
safe side, especially when the calculated value in cycles is smaller
than 4096 as typically the case.

This bug likely never triggered because any timeout != 0 usually will
do. In my case the busy timeout in cycles was originally calculated as
2408, which multiplied with 4096 is 0x968000. The lower 16 bits were
taken for the 16 bit wide register field, so the register value was
0x8000. With 2970bf5a32 ("mtd: rawnand: gpmi: fix controller timings
setting") however the value in cycles became 2384, which multiplied
with 4096 is 0x950000. The lower 16 bit are 0x0 now resulting in an
intermediate timeout when reading from NAND.

Fixes: b120612206 ("mtd: rawnand: gpmi: use core timings instead of an empirical derivation")
Cc: stable@vger.kernel.org
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220614083138.3455683-1-s.hauer@pengutronix.de
2022-06-16 16:46:08 +02:00
Yang Li
4f5bf12732 fs: fix jbd2_journal_try_to_free_buffers() kernel-doc comment
Add the description of @folio and remove @page in function kernel-doc
comment to remove warnings found by running scripts/kernel-doc, which
is caused by using 'make W=1'.

fs/jbd2/transaction.c:2149: warning: Function parameter or member
'folio' not described in 'jbd2_journal_try_to_free_buffers'
fs/jbd2/transaction.c:2149: warning: Excess function parameter 'page'
description in 'jbd2_journal_try_to_free_buffers'

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220512075432.31763-1-yang.lee@linux.alibaba.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-16 10:36:09 -04:00
Jens Axboe
a76c0b31ee io_uring: commit non-pollable provided mapped buffers upfront
For recv/recvmsg, IO either completes immediately or gets queued for a
retry. This isn't the case for read/readv, if eg a normal file or a block
device is used. Here, an operation can get queued with the block layer.
If this happens, ring mapped buffers must get committed immediately to
avoid that the next read can consume the same buffer.

Check if we're dealing with pollable file, when getting a new ring mapped
provided buffer. If it's not, commit it immediately rather than wait post
issue. If we don't wait, we can race with completions coming in, or just
plain buffer reuse by committing after a retry where others could have
grabbed the same buffer.

Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-16 07:14:44 -06:00
Maxime Ripard
30f8c74ca9 drm/vc4: Warn if some v3d code is run on BCM2711
The BCM2711 has a separate driver for the v3d, and thus we can't call
into any of the driver entrypoints that rely on the v3d being there.

Let's add a bunch of checks and complain loudly if that ever happen.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-15-maxime@cerno.tech
2022-06-16 11:07:52 +02:00
Maxime Ripard
d19e00ee06 drm/vc4: crtc: Fix out of order frames during asynchronous page flips
When doing an asynchronous page flip (PAGE_FLIP ioctl with the
DRM_MODE_PAGE_FLIP_ASYNC flag set), the current code waits for the
possible GPU buffer being rendered through a call to
vc4_queue_seqno_cb().

On the BCM2835-37, the GPU driver is part of the vc4 driver and that
function is defined in vc4_gem.c to wait for the buffer to be rendered,
and once it's done, call a callback.

However, on the BCM2711 used on the RaspberryPi4, the GPU driver is
separate (v3d) and that function won't do anything. This was working
because we were going into a path, due to uninitialized variables, that
was always scheduling the callback.

However, we were never actually waiting for the buffer to be rendered
which was resulting in frames being displayed out of order.

The generic API to signal those kind of completion in the kernel are the
DMA fences, and fortunately the v3d drivers supports them and signal
when its job is done. That API also provides an equivalent function that
allows to have a callback being executed when the fence is signalled as
done.

Let's change our driver a bit to rely on the previous function for the
older SoCs, and on DMA fences for the BCM2711.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://lore.kernel.org/r/20220610115149.964394-14-maxime@cerno.tech
2022-06-16 11:07:52 +02:00
Maxime Ripard
d87db1c79d drm/vc4: crtc: Don't call into BO Handling on Async Page-Flips on BCM2711
The BCM2711 doesn't have a v3d GPU so we don't want to call into its BO
management code. Let's create an asynchronous page-flip handler for the
BCM2711 that just calls into the common code.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-13-maxime@cerno.tech
2022-06-16 11:07:52 +02:00
Maxime Ripard
f6766fb265 drm/vc4: crtc: Move the BO Handling out of Common Page-Flip Handler
The function vc4_async_page_flip() handles asynchronous page-flips in
the vc4 driver.

However, it mixes some generic code with code that should only be run on
older generations that have the GPU handled by the vc4 driver.

Let's split the generic part out of vc4_async_page_flip() and into a
common function that we be reusable by an handler made for the BCM2711.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-12-maxime@cerno.tech
2022-06-16 11:07:52 +02:00
Maxime Ripard
4d12c36fb7 drm/vc4: crtc: Move the BO handling out of common page-flip callback
We'll soon introduce another completion callback source that won't need
to use the BO reference counting, so let's move it around to create a
function we will be able to share between both callbacks.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-11-maxime@cerno.tech
2022-06-16 11:07:52 +02:00
Maxime Ripard
2523e9dcc3 drm/vc4: crtc: Use an union to store the page flip callback
We'll need to extend the vc4_async_flip_state structure to rely on
another callback implementation, so let's move the current one into a
union.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-10-maxime@cerno.tech
2022-06-16 11:07:52 +02:00
Maxime Ripard
257add942a drm/vc4: drv: Skip BO Backend Initialization on BCM2711
On the BCM2711, we currently call the vc4_bo_cache_init() and
vc4_gem_init() functions. These functions initialize the BO and GEM
backends.

However, this code was initially created to accomodate the requirements
of the GPU on the older SoCs, while the BCM2711 has a separate driver
for it. So let's just skip these calls when we're on a newer hardware.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-9-maxime@cerno.tech
2022-06-16 11:07:52 +02:00
Maxime Ripard
2095848661 drm/vc4: plane: Register a different drm_plane_helper_funcs on BCM2711
On the BCM2711, our current definition of drm_plane_helper_funcs uses
the custom vc4_prepare_fb() and vc4_cleanup_fb().

Those functions rely on the buffer allocation path that was relying on
the GPU, and is no longer relevant.

Let's create another drm_plane_helper_funcs structure that we will
register on the BCM2711.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-8-maxime@cerno.tech
2022-06-16 11:07:51 +02:00
Maxime Ripard
39a30ec645 drm/vc4: kms: Register a different drm_mode_config_funcs on BCM2711
On the BCM2711, our current definition of drm_mode_config_funcs uses the
custom vc4_fb_create().

However, that function relies on the buffer allocation path that was
relying on the GPU, and is no longer relevant.

Let's create another drm_mode_config_funcs structure that we will
register on the BCM2711.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-7-maxime@cerno.tech
2022-06-16 11:07:51 +02:00
Maxime Ripard
538f111160 drm/vc4: drv: Register a different driver on BCM2711
Prior to the BCM2711/RaspberryPi4, the GPU was a part of the display
components of the SoC. It was thus a part of the vc4 driver.

However, with the BCM2711, it got split out and thus the v3d driver was
created. The vc4 driver now only handles the display part.

We didn't properly split out the code when doing the BCM2711 support
though, and most of the code around buffer allocations is still
involved, even though it doesn't have the backing hardware anymore.

Let's start the split out by creating a new drm_driver that only reports
and uses what we support on the BCM2711. The ioctl were properly
filtered already, but we were still exposing a .gem_create_object hook,
as well as having an .open and .postclose hooks which are only relevant
on older generations.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-6-maxime@cerno.tech
2022-06-16 11:07:51 +02:00
Maxime Ripard
3d7637423b drm/vc4: bo: Split out Dumb buffers fixup
The vc4_bo_dumb_create() both fixes up the allocation arguments to match
the hardware constraints and actually performs the allocation.

Since we're going to introduce a new function that uses a different
allocator, let's split the arguments fixup to a separate function we
will be able to reuse.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-5-maxime@cerno.tech
2022-06-16 11:07:51 +02:00
Maxime Ripard
dd2dfd44ed drm/vc4: bo: Rename vc4_dumb_create
We're going to add a new variant of the dumb BO allocation function, so
let's rename vc4_dumb_create() to something a bit more specific.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-4-maxime@cerno.tech
2022-06-16 11:07:51 +02:00
Maxime Ripard
1cbc91eb7b drm/vc4: Consolidate Hardware Revision Check
A new generation of controller has been introduced with the
BCM2711/RaspberryPi4. This generation needs a bunch of quirks, and over
time we've piled on a number of checks in most parts of the drivers.

All these checks are performed several times, and are not always
consistent. Let's create a single, global, variable to hold it and use
it everywhere.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-3-maxime@cerno.tech
2022-06-16 11:07:51 +02:00
Maxime Ripard
cb468c7d84 drm/vc4: plane: Prevent async update if we don't have a dlist
The vc4 planes are setup in hardware by creating a hardware descriptor
in a dedicated RAM. As part of the process to setup a plane in KMS, we
thus need to allocate some part of that dedicated RAM to store our
descriptor there.

The async update path will just reuse the descriptor already allocated
for that plane and will modify it directly in RAM to match whatever has
been asked for.

In order to do that, it will compare the descriptor for the old plane
state and the new plane state, will make sure they fit in the same size,
and check that only the position or buffer address have changed.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220610115149.964394-2-maxime@cerno.tech
2022-06-16 11:07:51 +02:00
Jan Kara
4bca7e80b6 init: Initialize noop_backing_dev_info early
noop_backing_dev_info is used by superblocks of various
pseudofilesystems such as kdevtmpfs. After commit 10e1407310
("writeback: Fix inode->i_io_list not be protected by inode->i_lock
error") this broke because __mark_inode_dirty() started to access more
fields from noop_backing_dev_info and this led to crashes inside
locked_inode_to_wb_and_lock_list() called from __mark_inode_dirty().
Fix the problem by initializing noop_backing_dev_info before the
filesystems get mounted.

Fixes: 10e1407310 ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error")
Reported-and-tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reported-and-tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2022-06-16 10:55:57 +02:00
Ye Bin
27cfa25895 ext2: fix fs corruption when trying to remove a non-empty directory with IO error
We got issue as follows:
[home]# mount  /dev/sdd  test
[home]# cd test
[test]# ls
dir1  lost+found
[test]# rmdir  dir1
ext2_empty_dir: inject fault
[test]# ls
lost+found
[test]# cd ..
[home]# umount test
[home]# fsck.ext2 -fn  /dev/sdd
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Inode 4065, i_size is 0, should be 1024.  Fix? no

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Unconnected directory inode 4065 (/???)
Connect to /lost+found? no

'..' in ... (4065) is / (2), should be <The NULL inode> (0).
Fix? no

Pass 4: Checking reference counts
Inode 2 ref count is 3, should be 4.  Fix? no

Inode 4065 ref count is 2, should be 3.  Fix? no

Pass 5: Checking group summary information

/dev/sdd: ********** WARNING: Filesystem still has errors **********

/dev/sdd: 14/128016 files (0.0% non-contiguous), 18477/512000 blocks

Reason is same with commit 7aab5c84a0. We can't assume directory
is empty when read directory entry failed.

Link: https://lore.kernel.org/r/20220615090010.1544152-1-yebin10@huawei.com
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2022-06-16 10:55:45 +02:00
Samuel Holland
1342b5b23d drm/sun4i: Fix crash during suspend after component bind failure
If the component driver fails to bind, or is unbound, the driver data
for the top-level platform device points to a freed drm_device. If the
system is then suspended, the driver passes this dangling pointer to
drm_mode_config_helper_suspend(), which crashes.

Fix this by only setting the driver data while the platform driver holds
a reference to the drm_device.

Fixes: 624b4b48d9 ("drm: sun4i: Add support for suspending the display driver")
Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220615054254.16352-1-samuel@sholland.org
2022-06-16 09:49:28 +02:00
Samuel Holland
920169041b drm/sun4i: dw-hdmi: Fix ddc-en GPIO consumer conflict
commit 6de79dd3a9 ("drm/bridge: display-connector: add ddc-en gpio
support") added a consumer for this GPIO in the HDMI connector device.
This new consumer conflicts with the pre-existing GPIO consumer in the
sun8i HDMI controller driver, which prevents the driver from probing:

  [    4.983358] display-connector connector: GPIO lookup for consumer ddc-en
  [    4.983364] display-connector connector: using device tree for GPIO lookup
  [    4.983392] gpio-226 (ddc-en): gpiod_request: status -16
  [    4.983399] sun8i-dw-hdmi 6000000.hdmi: Couldn't get ddc-en gpio
  [    4.983618] sun4i-drm display-engine: failed to bind 6000000.hdmi (ops sun8i_dw_hdmi_ops [sun8i_drm_hdmi]): -16
  [    4.984082] sun4i-drm display-engine: Couldn't bind all pipelines components
  [    4.984171] sun4i-drm display-engine: adev bind failed: -16
  [    4.984179] sun8i-dw-hdmi: probe of 6000000.hdmi failed with error -16

Both drivers have the same behavior: they leave the GPIO active for the
life of the device. Let's take advantage of the new implementation, and
drop the now-obsolete code from the HDMI controller driver.

Fixes: 6de79dd3a9 ("drm/bridge: display-connector: add ddc-en gpio support")
Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220614073100.11550-1-samuel@sholland.org
2022-06-16 09:48:44 +02:00
Darrick J. Wong
e89ab76d7e xfs: preserve DIFLAG2_NREXT64 when setting other inode attributes
It is vitally important that we preserve the state of the NREXT64 inode
flag when we're changing the other flags2 fields.

Fixes: 9b7d16e34b ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-15 23:13:33 -07:00
Darrick J. Wong
10930b254d xfs: fix variable state usage
The variable @args is fed to a tracepoint, and that's the only place
it's used.  This is fine for the kernel, but for userspace, tracepoints
are #define'd out of existence, which results in this warning on gcc
11.2:

xfs_attr.c: In function ‘xfs_attr_node_try_addname’:
xfs_attr.c:1440:42: warning: unused variable ‘args’ [-Wunused-variable]
 1440 |         struct xfs_da_args              *args = attr->xattri_da_args;
      |                                          ^~~~

Clean this up.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-15 23:13:32 -07:00
Darrick J. Wong
f4288f0182 xfs: fix TOCTOU race involving the new logged xattrs control knob
I found a race involving the larp control knob, aka the debugging knob
that lets developers enable logging of extended attribute updates:

Thread 1			Thread 2

echo 0 > /sys/fs/xfs/debug/larp
				setxattr(REPLACE)
				xfs_has_larp (returns false)
				xfs_attr_set

echo 1 > /sys/fs/xfs/debug/larp

				xfs_attr_defer_replace
				xfs_attr_init_replace_state
				xfs_has_larp (returns true)
				xfs_attr_init_remove_state

				<oops, wrong DAS state!>

This isn't a particularly severe problem right now because xattr logging
is only enabled when CONFIG_XFS_DEBUG=y, and developers *should* know
what they're doing.

However, the eventual intent is that callers should be able to ask for
the assistance of the log in persisting xattr updates.  This capability
might not be required for /all/ callers, which means that dynamic
control must work correctly.  Once an xattr update has decided whether
or not to use logged xattrs, it needs to stay in that mode until the end
of the operation regardless of what subsequent parallel operations might
do.

Therefore, it is an error to continue sampling xfs_globals.larp once
xfs_attr_change has made a decision about larp, and it was not correct
for me to have told Allison that ->create_intent functions can sample
the global log incompat feature bitfield to decide to elide a log item.

Instead, create a new op flag for the xfs_da_args structure, and convert
all other callers of xfs_has_larp and xfs_sb_version_haslogxattrs within
the attr update state machine to look for the operations flag.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-15 23:13:32 -07:00
Christian Göttsche
cad140d008 selinux: free contexts previously transferred in selinux_add_opt()
`selinux_add_opt()` stopped taking ownership of the passed context since
commit 70f4169ab4 ("selinux: parse contexts for mount options early").

    unreferenced object 0xffff888114dfd140 (size 64):
      comm "mount", pid 15182, jiffies 4295687028 (age 796.340s)
      hex dump (first 32 bytes):
        73 79 73 74 65 6d 5f 75 3a 6f 62 6a 65 63 74 5f  system_u:object_
        72 3a 74 65 73 74 5f 66 69 6c 65 73 79 73 74 65  r:test_filesyste
      backtrace:
        [<ffffffffa07dbef4>] kmemdup_nul+0x24/0x80
        [<ffffffffa0d34253>] selinux_sb_eat_lsm_opts+0x293/0x560
        [<ffffffffa0d13f08>] security_sb_eat_lsm_opts+0x58/0x80
        [<ffffffffa0af1eb2>] generic_parse_monolithic+0x82/0x180
        [<ffffffffa0a9c1a5>] do_new_mount+0x1f5/0x550
        [<ffffffffa0a9eccb>] path_mount+0x2ab/0x1570
        [<ffffffffa0aa019e>] __x64_sys_mount+0x20e/0x280
        [<ffffffffa1f47124>] do_syscall_64+0x34/0x80
        [<ffffffffa200007e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

    unreferenced object 0xffff888108e71640 (size 64):
      comm "fsmount", pid 7607, jiffies 4295044974 (age 1601.016s)
      hex dump (first 32 bytes):
        73 79 73 74 65 6d 5f 75 3a 6f 62 6a 65 63 74 5f  system_u:object_
        72 3a 74 65 73 74 5f 66 69 6c 65 73 79 73 74 65  r:test_filesyste
      backtrace:
        [<ffffffff861dc2b1>] memdup_user+0x21/0x90
        [<ffffffff861dc367>] strndup_user+0x47/0xa0
        [<ffffffff864f6965>] __do_sys_fsconfig+0x485/0x9f0
        [<ffffffff87940124>] do_syscall_64+0x34/0x80
        [<ffffffff87a0007e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

Cc: stable@vger.kernel.org
Fixes: 70f4169ab4 ("selinux: parse contexts for mount options early")
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-06-15 21:20:45 -04:00
Lukas Bulwahn
a79e69c871 MAINTAINERS: add include/dt-bindings/clock to COMMON CLK FRAMEWORK
Maintainers of the directory Documentation/devicetree/bindings/clock
are also the maintainers of the corresponding directory in
include/dt-bindings/clock.

Add the file entry for include/dt-bindings/clock to the appropriate
section in MAINTAINERS.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20220613085100.402-1-lukas.bulwahn@gmail.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-06-15 17:19:02 -07:00
Christian Göttsche
ef79c396c6 audit: free module name
Reset the type of the record last as the helper `audit_free_module()`
depends on it.

    unreferenced object 0xffff888153b707f0 (size 16):
      comm "modprobe", pid 1319, jiffies 4295110033 (age 1083.016s)
      hex dump (first 16 bytes):
        62 69 6e 66 6d 74 5f 6d 69 73 63 00 6b 6b 6b a5  binfmt_misc.kkk.
      backtrace:
        [<ffffffffa07dbf9b>] kstrdup+0x2b/0x50
        [<ffffffffa04b0a9d>] __audit_log_kern_module+0x4d/0xf0
        [<ffffffffa03b6664>] load_module+0x9d4/0x2e10
        [<ffffffffa03b8f44>] __do_sys_finit_module+0x114/0x1b0
        [<ffffffffa1f47124>] do_syscall_64+0x34/0x80
        [<ffffffffa200007e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

Cc: stable@vger.kernel.org
Fixes: 12c5e81d3f ("audit: prepare audit_context for use in calling contexts beyond syscalls")
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-06-15 19:28:44 -04:00
Linus Torvalds
30306f6194 Merge tag 'hardening-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:

 - Correctly handle vm_map areas in hardened usercopy (Matthew Wilcox)

 - Adjust CFI RCU usage to avoid boot splats with cpuidle (Sami Tolvanen)

* tag 'hardening-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  usercopy: Make usercopy resilient against ridiculously large copies
  usercopy: Cast pointer to an integer once
  usercopy: Handle vm_map_ram() areas
  cfi: Fix __cfi_slowpath_diag RCU usage with cpuidle
2022-06-15 14:20:26 -07:00
Rob Clark
b4d329c451 drm/msm/gem: Drop early returns in close/purge vma
Keep the warn, but drop the early return.  If we do manage to hit this
sort of issue, skipping the cleanup just makes things worse (dangling
drm_mm_nodes when the msm_gem_vma is freed, etc).  Whereas the worst
that happens if we tear down a mapping the GPU is accessing is that we
get GPU iova faults, but otherwise the world keeps spinning.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Reported-by: Steev Klimaszewski <steev@kali.org>
Patchwork: https://patchwork.freedesktop.org/patch/489115/
Link: https://lore.kernel.org/r/20220610172055.2337977-1-robdclark@gmail.com
2022-06-15 13:22:45 -07:00
Rob Clark
311e03c29c drm/msm/gem: Separate object and vma unpin
Previously the BO_PINNED state in the submit was tracking two related
but different things: (1) that the buffer object was pinned, and (2)
that the vma (mapping within a set of pagetables) was pinned.  But with
fenced vma unpin (needed so that userspace couldn't race with retire
path for releasing a vma) these two were decoupled.  The fact that the
BO_PINNED flag was already cleared meant that we leaked the bo pin count
which should have been dropped when the submit was retired.

So split this state into BO_OBJ_PINNED and BO_VMA_PINNED, so they can be
dropped independently.

Fixes: 95d1deb02a ("drm/msm/gem: Add fenced vma unpin")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/487559/
Link: https://lore.kernel.org/r/20220527172341.2151005-1-robdclark@gmail.com
2022-06-15 13:06:54 -07:00
Petr Mladek
b87f02307d printk: Wait for the global console lock when the system is going down
There are reports that the console kthreads block the global console
lock when the system is going down, for example, reboot, panic.

First part of the solution was to block kthreads in these problematic
system states so they stopped handling newly added messages.

Second part of the solution is to wait when for the kthreads when
they are actively printing. It solves the problem when a message
was printed before the system entered the problematic state and
the kthreads managed to step in.

A busy waiting has to be used because panic() can be called in any
context and in an unknown state of the scheduler.

There must be a timeout because the kthread might get stuck or sleeping
and never release the lock. The timeout 10s is an arbitrary value
inspired by the softlockup timeout.

Link: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
Link: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20220615162805.27962-3-pmladek@suse.com
2022-06-15 22:04:15 +02:00
Petr Mladek
c3230283e2 printk: Block console kthreads when direct printing will be required
There are known situations when the console kthreads are not
reliable or does not work in principle, for example, early boot,
panic, shutdown.

For these situations there is the direct (legacy) mode when printk() tries
to get console_lock() and flush the messages directly. It works very well
during the early boot when the console kthreads are not available at all.
It gets more complicated in the other situations when console kthreads
might be actively printing and block console_trylock() in printk().

The same problem is in the legacy code as well. Any console_lock()
owner could block console_trylock() in printk(). It is solved by
a trick that the current console_lock() owner is responsible for
printing all pending messages. It is actually the reason why there
is the risk of softlockups and why the console kthreads were
introduced.

The console kthreads use the same approach. They are responsible
for printing the messages by definition. So that they handle
the messages anytime when they are awake and see new ones.
The global console_lock is available when there is nothing
to do.

It should work well when the problematic context is correctly
detected and printk() switches to the direct mode. But it seems
that it is not enough in practice. There are reports that
the messages are not printed during panic() or shutdown()
even though printk() tries to use the direct mode here.

The problem seems to be that console kthreads become active in these
situation as well. They steel the job before other CPUs are stopped.
Then they are stopped in the middle of the job and block the global
console_lock.

First part of the solution is to block console kthreads when
the system is in a problematic state and requires the direct
printk() mode.

Link: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1
Link: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com
Suggested-by: John Ogness <john.ogness@linutronix.de>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220615162805.27962-2-pmladek@suse.com
2022-06-15 22:03:38 +02:00
Linus Torvalds
afe9eb14ea Merge tag 'tpmdd-next-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm fixes from Jarkko Sakkinen:
 "Two fixes for this merge window"

* tag 'tpmdd-next-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  certs: fix and refactor CONFIG_SYSTEM_BLACKLIST_HASH_LIST build
  certs/blacklist_hashes.c: fix const confusion in certs blacklist
2022-06-15 12:34:19 -07:00
Dave Wysochanski
5ee3d10f84 NFSv4: Add FMODE_CAN_ODIRECT after successful open of a NFS4.x file
Commit a2ad63daa8 ("VFS: add FMODE_CAN_ODIRECT file flag")
added the FMODE_CAN_ODIRECT flag for NFSv3 but neglected to add
it for NFSv4.x.  This causes direct io on NFSv4.x to fail open
with EINVAL:
  mount -o vers=4.2 127.0.0.1:/export /mnt/nfs4
  dd if=/dev/zero of=/mnt/nfs4/file.bin bs=128k count=1 oflag=direct
  dd: failed to open '/mnt/nfs4/file.bin': Invalid argument
  dd of=/dev/null if=/mnt/nfs4/file.bin bs=128k count=1 iflag=direct
  dd: failed to open '/mnt/dir1/file1.bin': Invalid argument

Fixes: a2ad63daa8 ("VFS: add FMODE_CAN_ODIRECT file flag")
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-15 15:03:12 -04:00
Masahiro Yamada
27b5b22d25 certs: fix and refactor CONFIG_SYSTEM_BLACKLIST_HASH_LIST build
Commit addf466389 ("certs: Check that builtin blacklist hashes are
valid") was applied 8 months after the submission.

In the meantime, the base code had been removed by commit b8c96a6b46
("certs: simplify $(srctree)/ handling and remove config_filename
macro").

Fix the Makefile.

Create a local copy of $(CONFIG_SYSTEM_BLACKLIST_HASH_LIST). It is
included from certs/blacklist_hashes.c and also works as a timestamp.

Send error messages from check-blacklist-hashes.awk to stderr instead
of stdout.

Fixes: addf466389 ("certs: Check that builtin blacklist hashes are valid")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Mickaël Salaün <mic@linux.microsoft.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2022-06-15 21:52:32 +03:00
Masahiro Yamada
6a1c3767d8 certs/blacklist_hashes.c: fix const confusion in certs blacklist
This file fails to compile as follows:

  CC      certs/blacklist_hashes.o
certs/blacklist_hashes.c:4:1: error: ignoring attribute ‘section (".init.data")’ because it conflicts with previous ‘section (".init.rodata")’ [-Werror=attributes]
    4 | const char __initdata *const blacklist_hashes[] = {
      | ^~~~~
In file included from certs/blacklist_hashes.c:2:
certs/blacklist.h:5:38: note: previous declaration here
    5 | extern const char __initconst *const blacklist_hashes[];
      |                                      ^~~~~~~~~~~~~~~~

Apply the same fix as commit 2be04df566 ("certs/blacklist_nohashes.c:
fix const confusion in certs blacklist").

Fixes: 734114f878 ("KEYS: Add a system blacklist keyring")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Mickaël Salaün <mic@linux.microsoft.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2022-06-15 21:52:32 +03:00
Tianyu Lan
49d6a3c062 x86/Hyper-V: Add SEV negotiate protocol support in Isolation VM
Hyper-V Isolation VM current code uses sev_es_ghcb_hv_call()
to read/write MSR via GHCB page and depends on the sev code.
This may cause regression when sev code changes interface
design.

The latest SEV-ES code requires to negotiate GHCB version before
reading/writing MSR via GHCB page and sev_es_ghcb_hv_call() doesn't
work for Hyper-V Isolation VM. Add Hyper-V ghcb related implementation
to decouple SEV and Hyper-V code. Negotiate GHCB version in the
hyperv_init() and use the version to communicate with Hyper-V
in the ghcb hv call function.

Fixes: 2ea29c5abb ("x86/sev: Save the negotiated GHCB version")
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220614014553.1915929-1-ltykernel@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-06-15 18:27:40 +00:00
Kirill A. Shutemov
cdd85786f4 x86/tdx: Clarify RIP adjustments in #VE handler
After successful #VE handling, tdx_handle_virt_exception() has to move
RIP to the next instruction. The handler needs to know the length of the
instruction.

If the #VE happened due to instruction execution, the GET_VEINFO TDX
module call provides info on the instruction in R10, including its length.

For #VE due to EPT violation, the info in R10 is not populand and the
kernel must decode the instruction manually to find out its length.

Restructure the code to make it explicit that the instruction length
depends on the type of #VE. Make individual #VE handlers return
the instruction length on success or -errno on failure.

[ dhansen: fix up changelog and comments ]

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220614120135.14812-3-kirill.shutemov@linux.intel.com
2022-06-15 11:05:16 -07:00
Jens Axboe
04cb45b495 Merge branch 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.19
Pull MD fixes from Song.

* 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  md/raid5-ppl: Fix argument order in bio_alloc_bioset()
  Revert "md: don't unregister sync_thread with reconfig_mutex held"
2022-06-15 11:56:07 -06:00
Kirill A. Shutemov
60428d8bc2 x86/tdx: Fix early #VE handling
tdx_early_handle_ve() does not increment RIP after successfully
handling the exception.  That leads to infinite loop of exceptions.

Move RIP when exceptions are successfully handled.

[ dhansen: make problem statement more clear ]

Fixes: 32e72854fa ("x86/tdx: Port I/O: Add early boot support")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lkml.kernel.org/r/20220614120135.14812-2-kirill.shutemov@linux.intel.com
2022-06-15 10:52:59 -07:00
Logan Gunthorpe
f34fdcd4a0 md/raid5-ppl: Fix argument order in bio_alloc_bioset()
bio_alloc_bioset() takes a block device, number of vectors, the
OP flags, the GFP mask and the bio set. However when the prototype
was changed, the callisite in ppl_do_flush() had the OP flags and
the GFP flags reversed. This introduced some sparse error:

  drivers/md/raid5-ppl.c:632:57: warning: incorrect type in argument 3
				    (different base types)
  drivers/md/raid5-ppl.c:632:57:    expected unsigned int opf
  drivers/md/raid5-ppl.c:632:57:    got restricted gfp_t [usertype]
  drivers/md/raid5-ppl.c:633:61: warning: incorrect type in argument 4
  				    (different base types)
  drivers/md/raid5-ppl.c:633:61:    expected restricted gfp_t [usertype]
				    gfp_mask
  drivers/md/raid5-ppl.c:633:61:    got unsigned long long

The sparse error introduction may not have been reported correctly by
0day due to other work that was cleaning up other sparse errors in this
area.

Fixes: 609be10667 ("block: pass a block_device and opf to bio_alloc_bioset")
Cc: stable@vger.kernel.org # 5.18+
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
2022-06-15 10:32:48 -07:00
Kumar Kartikeya Dwivedi
d1a374a1ae bpf: Limit maximum modifier chain length in btf_check_type_tags
On processing a module BTF of module built for an older kernel, we might
sometimes find that some type points to itself forming a loop. If such a
type is a modifier, btf_check_type_tags's while loop following modifier
chain will be caught in an infinite loop.

Fix this by defining a maximum chain length and bailing out if we spin
any longer than that.

Fixes: eb596b0905 ("bpf: Ensure type tags precede modifiers in BTF")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220615042151.2266537-1-memxor@gmail.com
2022-06-15 19:32:12 +02:00
Guoqing Jiang
d0a180341f Revert "md: don't unregister sync_thread with reconfig_mutex held"
The 07reshape5intr test is broke because of below path.

    md_reap_sync_thread
            -> mddev_unlock
            -> md_unregister_thread(&mddev->sync_thread)

And md_check_recovery is triggered by,

mddev_unlock -> md_wakeup_thread(mddev->thread)

then mddev->reshape_position is set to MaxSector in raid5_finish_reshape
since MD_RECOVERY_INTR is cleared in md_check_recovery, which means
feature_map is not set with MD_FEATURE_RESHAPE_ACTIVE and superblock's
reshape_position can't be updated accordingly.

Fixes: 8b48ec23cc ("md: don't unregister sync_thread with reconfig_mutex held")
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Song Liu <song@kernel.org>
2022-06-15 10:30:14 -07:00
Mengqi Zhang
89bcd9a64b mmc: mediatek: wait dma stop bit reset to 0
MediaTek IP requires that after dma stop, it need to wait this dma stop
bit auto-reset to 0. When bus is in high loading state, it will take a
while for the dma stop complete. If there is no waiting operation here,
when program runs to clear fifo and reset, bus will hang.

In addition, there should be no return in msdc_data_xfer_next() if
there is data need be transferred, because no matter what error occurs
here, it should continue to excute to the following mmc_request_done.
Otherwise the core layer may wait complete forever.

Signed-off-by: Mengqi Zhang <mengqi.zhang@mediatek.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220609112239.18911-1-mengqi.zhang@mediatek.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-06-15 10:05:56 -07:00
Linus Torvalds
979086f5e0 Merge tag 'fs.fixes.v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull vfs idmapping fix from Christian Brauner:
 "This fixes an issue where we fail to change the group of a file when
  the caller owns the file and is a member of the group to change to.

  This is only relevant on idmapped mounts.

  There's a detailed description in the commit message and regression
  tests have been added to xfstests"

* tag 'fs.fixes.v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  fs: account for group membership
2022-06-15 09:04:55 -07:00
Benjamin Marzinski
10eb3a0d51 dm: fix race in dm_start_io_acct
After commit 82f6cdcc36 ("dm: switch dm_io booleans over to proper
flags") dm_start_io_acct stopped atomically checking and setting
was_accounted, which turned into the DM_IO_ACCOUNTED flag. This opened
the possibility for a race where IO accounting is started twice for
duplicate bios. To remove the race, check the flag while holding the
io->lock.

Fixes: 82f6cdcc36 ("dm: switch dm_io booleans over to proper flags")
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-15 11:51:41 -04:00
Jens Axboe
2396e958c8 Merge tag 'nvme-5.19-2022-06-15' of git://git.infradead.org/nvme into block-5.19
Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.19

 - quirks, quirks, quirks to work around buggy consumer grade devices
   (Keith Bush, Ning Wang, Stefan Reiter, Rasheed Hsueh)
 - better kernel messages for devices that need quirking (Keith Bush)
 - make a kernel message more useful (Thomas Weißschuh)"

* tag 'nvme-5.19-2022-06-15' of git://git.infradead.org/nvme:
  nvme-pci: disable write zeros support on UMIC and Samsung SSDs
  nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs
  nvme-pci: sk hynix p31 has bogus namespace ids
  nvme-pci: smi has bogus namespace ids
  nvme-pci: phison e12 has bogus namespace ids
  nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50
  nvme-pci: add trouble shooting steps for timeouts
  nvme: add bug report info for global duplicate id
  nvme: add device name to warning in uuid_show()
2022-06-15 09:39:05 -06:00
Mark Rutland
0d8116ccd8 arm64: ftrace: remove redundant label
Since commit:

  c4a0ebf87c ("arm64/ftrace: Make function graph use ftrace directly")

The 'ftrace_common_return' label has been unused.

Remove it.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Chengming Zhou <zhouchengming@bytedance.com>
Cc: Will Deacon <will@kernel.org>
Tested-by: "Ivan T. Ivanov" <iivanov@suse.de>
Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220614080944.1349146-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-06-15 16:14:47 +01:00
Mark Rutland
a625357997 arm64: ftrace: consistently handle PLTs.
Sometimes it is necessary to use a PLT entry to call an ftrace
trampoline. This is handled by ftrace_make_call() and ftrace_make_nop(),
with each having *almost* identical logic, but this is not handled by
ftrace_modify_call() since its introduction in commit:

  3b23e4991f ("arm64: implement ftrace with regs")

Due to this, if we ever were to call ftrace_modify_call() for a callsite
which requires a PLT entry for a trampoline, then either:

a) If the old addr requires a trampoline, ftrace_modify_call() will use
   an out-of-range address to generate the 'old' branch instruction.
   This will result in warnings from aarch64_insn_gen_branch_imm() and
   ftrace_modify_code(), and no instructions will be modified. As
   ftrace_modify_call() will return an error, this will result in
   subsequent internal ftrace errors.

b) If the old addr does not require a trampoline, but the new addr does,
   ftrace_modify_call() will use an out-of-range address to generate the
   'new' branch instruction. This will result in warnings from
   aarch64_insn_gen_branch_imm(), and ftrace_modify_code() will replace
   the 'old' branch with a BRK. This will result in a kernel panic when
   this BRK is later executed.

Practically speaking, case (a) is vastly more likely than case (b), and
typically this will result in internal ftrace errors that don't
necessarily affect the rest of the system. This can be demonstrated with
an out-of-tree test module which triggers ftrace_modify_call(), e.g.

| # insmod test_ftrace.ko
| test_ftrace: Function test_function raw=0xffffb3749399201c, callsite=0xffffb37493992024
| branch_imm_common: offset out of range
| branch_imm_common: offset out of range
| ------------[ ftrace bug ]------------
| ftrace failed to modify
| [<ffffb37493992024>] test_function+0x8/0x38 [test_ftrace]
|  actual:   1d:00:00:94
| Updating ftrace call site to call a different ftrace function
| ftrace record flags: e0000002
|  (2) R
|  expected tramp: ffffb374ae42ed54
| ------------[ cut here ]------------
| WARNING: CPU: 0 PID: 165 at kernel/trace/ftrace.c:2085 ftrace_bug+0x280/0x2b0
| Modules linked in: test_ftrace(+)
| CPU: 0 PID: 165 Comm: insmod Not tainted 5.19.0-rc2-00002-g4d9ead8b45ce #13
| Hardware name: linux,dummy-virt (DT)
| pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : ftrace_bug+0x280/0x2b0
| lr : ftrace_bug+0x280/0x2b0
| sp : ffff80000839ba00
| x29: ffff80000839ba00 x28: 0000000000000000 x27: ffff80000839bcf0
| x26: ffffb37493994180 x25: ffffb374b0991c28 x24: ffffb374b0d70000
| x23: 00000000ffffffea x22: ffffb374afcc33b0 x21: ffffb374b08f9cc8
| x20: ffff572b8462c000 x19: ffffb374b08f9000 x18: ffffffffffffffff
| x17: 6c6c6163202c6331 x16: ffffb374ae5ad110 x15: ffffb374b0d51ee4
| x14: 0000000000000000 x13: 3435646532346561 x12: 3437336266666666
| x11: 203a706d61727420 x10: 6465746365707865 x9 : ffffb374ae5149e8
| x8 : 336266666666203a x7 : 706d617274206465 x6 : 00000000fffff167
| x5 : ffff572bffbc4a08 x4 : 00000000fffff167 x3 : 0000000000000000
| x2 : 0000000000000000 x1 : ffff572b84461e00 x0 : 0000000000000022
| Call trace:
|  ftrace_bug+0x280/0x2b0
|  ftrace_replace_code+0x98/0xa0
|  ftrace_modify_all_code+0xe0/0x144
|  arch_ftrace_update_code+0x14/0x20
|  ftrace_startup+0xf8/0x1b0
|  register_ftrace_function+0x38/0x90
|  test_ftrace_init+0xd0/0x1000 [test_ftrace]
|  do_one_initcall+0x50/0x2b0
|  do_init_module+0x50/0x1f0
|  load_module+0x17c8/0x1d64
|  __do_sys_finit_module+0xa8/0x100
|  __arm64_sys_finit_module+0x2c/0x3c
|  invoke_syscall+0x50/0x120
|  el0_svc_common.constprop.0+0xdc/0x100
|  do_el0_svc+0x3c/0xd0
|  el0_svc+0x34/0xb0
|  el0t_64_sync_handler+0xbc/0x140
|  el0t_64_sync+0x18c/0x190
| ---[ end trace 0000000000000000 ]---

We can solve this by consistently determining whether to use a PLT entry
for an address.

Note that since (the earlier) commit:

  f1a54ae9af ("arm64: module/ftrace: intialize PLT at load time")

... we can consistently determine the PLT address that a given callsite
will use, and therefore ftrace_make_nop() does not need to skip
validation when a PLT is in use.

This patch factors the existing logic out of ftrace_make_call() and
ftrace_make_nop() into a common ftrace_find_callable_addr() helper
function, which is used by ftrace_make_call(), ftrace_make_nop(), and
ftrace_modify_call(). In ftrace_make_nop() the patching is consistently
validated by ftrace_modify_code() as we can always determine what the
old instruction should have been.

Fixes: 3b23e4991f ("arm64: implement ftrace with regs")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Will Deacon <will@kernel.org>
Tested-by: "Ivan T. Ivanov" <iivanov@suse.de>
Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220614080944.1349146-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-06-15 16:14:47 +01:00
Mark Rutland
3eefdf9d1e arm64: ftrace: fix branch range checks
The branch range checks in ftrace_make_call() and ftrace_make_nop() are
incorrect, erroneously permitting a forwards branch of 128M and
erroneously rejecting a backwards branch of 128M.

This is because both functions calculate the offset backwards,
calculating the offset *from* the target *to* the branch, rather than
the other way around as the later comparisons expect.

If an out-of-range branch were erroeously permitted, this would later be
rejected by aarch64_insn_gen_branch_imm() as branch_imm_common() checks
the bounds correctly, resulting in warnings and the placement of a BRK
instruction. Note that this can only happen for a forwards branch of
exactly 128M, and so the caller would need to be exactly 128M bytes
below the relevant ftrace trampoline.

If an in-range branch were erroeously rejected, then:

* For modules when CONFIG_ARM64_MODULE_PLTS=y, this would result in the
  use of a PLT entry, which is benign.

  Note that this is the common case, as this is selected by
  CONFIG_RANDOMIZE_BASE (and therefore RANDOMIZE_MODULE_REGION_FULL),
  which distributions typically seelct. This is also selected by
  CONFIG_ARM64_ERRATUM_843419.

* For modules when CONFIG_ARM64_MODULE_PLTS=n, this would result in
  internal ftrace failures.

* For core kernel text, this would result in internal ftrace failues.

  Note that for this to happen, the kernel text would need to be at
  least 128M bytes in size, and typical configurations are smaller tha
  this.

Fix this by calculating the offset *from* the branch *to* the target in
both functions.

Fixes: f8af0b364e ("arm64: ftrace: don't validate branch via PLT in ftrace_make_nop()")
Fixes: e71a4e1beb ("arm64: ftrace: add support for far branches to dynamic ftrace")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Will Deacon <will@kernel.org>
Tested-by: "Ivan T. Ivanov" <iivanov@suse.de>
Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220614080944.1349146-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-06-15 16:14:46 +01:00
Michael Carns
ec41c6d820 hwmon: (asus-ec-sensors) add missing comma in board name list.
This fixes a regression where coma lead to concatenating board names
and broke module loading for C8H.

Fixes: 5b4285c57b ("hwmon: (asus-ec-sensors) fix Formula VIII definition")

Signed-off-by: Michael Carns <mike@carns.com>
Signed-off-by: Eugene Shalygin <eugene.shalygin@gmail.com>
Link: https://lore.kernel.org/r/20220615122544.140340-1-eugene.shalygin@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-06-15 08:14:38 -07:00
Catalin Marinas
27d8fa2078 Revert "arm64: Initialize jump labels before setup_machine_fdt()"
This reverts commit 73e2d827a5.

The reverted patch was needed as a fix after commit f5bda35fba
("random: use static branch for crng_ready()"). However, this was
already fixed by 60e5b2886b ("random: do not use jump labels before
they are initialized") and hence no longer necessary to initialise jump
labels before setup_machine_fdt().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-06-15 16:14:32 +01:00
Jon Maxwell
3046a82731 bpf: Fix request_sock leak in sk lookup helpers
A customer reported a request_socket leak in a Calico cloud environment. We
found that a BPF program was doing a socket lookup with takes a refcnt on
the socket and that it was finding the request_socket but returning the parent
LISTEN socket via sk_to_full_sk() without decrementing the child request socket
1st, resulting in request_sock slab object leak. This patch retains the
existing behaviour of returning full socks to the caller but it also decrements
the child request_socket if one is present before doing so to prevent the leak.

Thanks to Curtis Taylor for all the help in diagnosing and testing this. And
thanks to Antoine Tenart for the reproducer and patch input.

v2 of this patch contains, refactor as per Daniel Borkmann's suggestions to
validate RCU flags on the listen socket so that it balances with bpf_sk_release()
and update comments as per Martin KaFai Lau's suggestion. One small change to
Daniels suggestion, put "sk = sk2" under "if (sk2 != sk)" to avoid an extra
instruction.

Fixes: f7355a6c04 ("bpf: Check sk_fullsock() before returning from bpf_sk_lookup()")
Fixes: edbf8c01de ("bpf: add skc_lookup_tcp helper")
Co-developed-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Curtis Taylor <cutaylor-pub@yahoo.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/56d6f898-bde0-bb25-3427-12a330b29fb8@iogearbox.net
Link: https://lore.kernel.org/bpf/20220615011540.813025-1-jmaxwell37@gmail.com
2022-06-15 16:10:07 +02:00
Dmitry Klochkov
933b5f9f98 tools/kvm_stat: fix display of error when multiple processes are found
Instead of printing an error message, kvm_stat script fails when we
restrict statistics to a guest by its name and there are multiple guests
with such name:

  # kvm_stat -g my_vm
  Traceback (most recent call last):
    File "/usr/bin/kvm_stat", line 1819, in <module>
      main()
    File "/usr/bin/kvm_stat", line 1779, in main
      options = get_options()
    File "/usr/bin/kvm_stat", line 1718, in get_options
      options = argparser.parse_args()
    File "/usr/lib64/python3.10/argparse.py", line 1825, in parse_args
      args, argv = self.parse_known_args(args, namespace)
    File "/usr/lib64/python3.10/argparse.py", line 1858, in parse_known_args
      namespace, args = self._parse_known_args(args, namespace)
    File "/usr/lib64/python3.10/argparse.py", line 2067, in _parse_known_args
      start_index = consume_optional(start_index)
    File "/usr/lib64/python3.10/argparse.py", line 2007, in consume_optional
      take_action(action, args, option_string)
    File "/usr/lib64/python3.10/argparse.py", line 1935, in take_action
      action(self, namespace, argument_values, option_string)
    File "/usr/bin/kvm_stat", line 1649, in __call__
      ' to specify the desired pid'.format(" ".join(pids)))
  TypeError: sequence item 0: expected str instance, int found

To avoid this, it's needed to convert pids int values to strings before
pass them to join().

Signed-off-by: Dmitry Klochkov <kdmitry556@gmail.com>
Message-Id: <20220614121141.160689-1-kdmitry556@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15 08:14:20 -04:00
Duoming Zhou
219b51a6f0 net: ax25: Fix deadlock caused by skb_recv_datagram in ax25_recvmsg
The skb_recv_datagram() in ax25_recvmsg() will hold lock_sock
and block until it receives a packet from the remote. If the client
doesn`t connect to server and calls read() directly, it will not
receive any packets forever. As a result, the deadlock will happen.

The fail log caused by deadlock is shown below:

[  369.606973] INFO: task ax25_deadlock:157 blocked for more than 245 seconds.
[  369.608919] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  369.613058] Call Trace:
[  369.613315]  <TASK>
[  369.614072]  __schedule+0x2f9/0xb20
[  369.615029]  schedule+0x49/0xb0
[  369.615734]  __lock_sock+0x92/0x100
[  369.616763]  ? destroy_sched_domains_rcu+0x20/0x20
[  369.617941]  lock_sock_nested+0x6e/0x70
[  369.618809]  ax25_bind+0xaa/0x210
[  369.619736]  __sys_bind+0xca/0xf0
[  369.620039]  ? do_futex+0xae/0x1b0
[  369.620387]  ? __x64_sys_futex+0x7c/0x1c0
[  369.620601]  ? fpregs_assert_state_consistent+0x19/0x40
[  369.620613]  __x64_sys_bind+0x11/0x20
[  369.621791]  do_syscall_64+0x3b/0x90
[  369.622423]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  369.623319] RIP: 0033:0x7f43c8aa8af7
[  369.624301] RSP: 002b:00007f43c8197ef8 EFLAGS: 00000246 ORIG_RAX: 0000000000000031
[  369.625756] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f43c8aa8af7
[  369.626724] RDX: 0000000000000010 RSI: 000055768e2021d0 RDI: 0000000000000005
[  369.628569] RBP: 00007f43c8197f00 R08: 0000000000000011 R09: 00007f43c8198700
[  369.630208] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff845e6afe
[  369.632240] R13: 00007fff845e6aff R14: 00007f43c8197fc0 R15: 00007f43c8198700

This patch replaces skb_recv_datagram() with an open-coded variant of it
releasing the socket lock before the __skb_wait_for_more_packets() call
and re-acquiring it after such call in order that other functions that
need socket lock could be executed.

what's more, the socket lock will be released only when recvmsg() will
block and that should produce nicer overall behavior.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Suggested-by: Thomas Osterried <thomas@osterried.de>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reported-by: Thomas Habets <thomas@@habets.se>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 13:00:22 +01:00
Pavel Begunkov
c5595975b5 io_uring: make io_fill_cqe_aux honour CQE32
Don't let io_fill_cqe_aux() post 16B cqes for CQE32 rings, neither the
kernel nor the userspace expect this to happen.

Fixes: 76c68fbf1a ("io_uring: enable CQE32")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/64fae669fae1b7083aa15d0cd807f692b0880b9a.1655287457.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-15 05:06:56 -06:00
Pavel Begunkov
cd94903d3b io_uring: remove __io_fill_cqe() helper
In preparation for the following patch, inline __io_fill_cqe(), there is
only one user.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/71dab9afc3cde3f8b64d26f20d3b60bdc40726ff.1655287457.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-15 05:06:42 -06:00
Pavel Begunkov
2caf9822f0 io_uring: fix ->extra{1,2} misuse
We don't really know the state of req->extra{1,2] fields in
__io_fill_cqe_req(), if an opcode handler is not aware of CQE32 option,
it never sets them up properly. Track the state of those fields with a
request flag.

Fixes: 76c68fbf1a ("io_uring: enable CQE32")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4b3e5be512fbf4debec7270fd485b8a3b014d464.1655287457.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-15 05:06:09 -06:00
Pavel Begunkov
29ede2014c io_uring: fill extra big cqe fields from req
The only user of io_req_complete32()-like functions is cmd
requests. Instead of keeping the whole complete32 family, remove them
and provide the extras in already added for inline completions
req->extra{1,2}. When fill_cqe_res() finds CQE32 option enabled
it'll use those fields to fill a 32B cqe.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/af1319eb661b1f9a0abceb51cbbf72b8002e019d.1655287457.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-15 05:06:09 -06:00
Pavel Begunkov
f43de1f888 io_uring: unite fill_cqe and the 32B version
We want just one function that will handle both normal cqes and 32B
cqes. Combine __io_fill_cqe_req() and __io_fill_cqe_req32(). It's still
not entirely correct yet, but saves us from cases when we fill an CQE of
a wrong size.

Fixes: 76c68fbf1a ("io_uring: enable CQE32")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8085c5b2f74141520f60decd45334f87e389b718.1655287457.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-15 05:06:09 -06:00
Pavel Begunkov
91ef75a7db io_uring: get rid of __io_fill_cqe{32}_req()
There are too many cqe filling helpers, kill __io_fill_cqe{32}_req(),
use __io_fill_cqe{32}_req_filled() instead, and then rename it. It'll
simplify fixing in following patches.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c18e0d191014fb574f24721245e4e3fddd0b6917.1655287457.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-15 05:06:09 -06:00
Jose Alonso
36a15e1cb1 net: usb: ax88179_178a needs FLAG_SEND_ZLP
The extra byte inserted by usbnet.c when
 (length % dev->maxpacket == 0) is causing problems to device.

This patch sets FLAG_SEND_ZLP to avoid this.

Tested with: 0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet

Problems observed:
======================================================================
1) Using ssh/sshfs. The remote sshd daemon can abort with the message:
   "message authentication code incorrect"
   This happens because the tcp message sent is corrupted during the
   USB "Bulk out". The device calculate the tcp checksum and send a
   valid tcp message to the remote sshd. Then the encryption detects
   the error and aborts.
2) NETDEV WATCHDOG: ... (ax88179_178a): transmit queue 0 timed out
3) Stop normal work without any log message.
   The "Bulk in" continue receiving packets normally.
   The host sends "Bulk out" and the device responds with -ECONNRESET.
   (The netusb.c code tx_complete ignore -ECONNRESET)
Under normal conditions these errors take days to happen and in
intense usage take hours.

A test with ping gives packet loss, showing that something is wrong:
ping -4 -s 462 {destination}	# 462 = 512 - 42 - 8
Not all packets fail.
My guess is that the device tries to find another packet starting
at the extra byte and will fail or not depending on the next
bytes (old buffer content).
======================================================================

Signed-off-by: Jose Alonso <joalonsof@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:32:12 +01:00
David S. Miller
371de1aa00 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-06-14

This series contains updates to ice driver only.

Michal fixes incorrect Tx timestamp offset calculation for E822 devices.

Roman enforces required VLAN filtering settings for double VLAN mode.

Przemyslaw fixes memory corruption issues with VFs by ensuring
queues are disabled in the error path of VF queue configuration and to
disabled VFs during reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:15:33 +01:00
Lukas Bulwahn
b60377de77 MAINTAINERS: add include/dt-bindings/net to NETWORKING DRIVERS
Maintainers of the directory Documentation/devicetree/bindings/net
are also the maintainers of the corresponding directory
include/dt-bindings/net.

Add the file entry for include/dt-bindings/net to the appropriate
section in MAINTAINERS.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20220613121826.11484-1-lukas.bulwahn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 22:32:10 -07:00
Takashi Iwai
56ec3e755b ALSA: hda/realtek: Apply fixup for Lenovo Yoga Duet 7 properly
It turned out that Lenovo shipped two completely different products
with the very same PCI SSID, where both require different quirks;
namely, Lenovo C940 has already the fixup for its speaker
(ALC298_FIXUP_LENOVO_SPK_VOLUME) with the PCI SSID 17aa:3818, while
Yoga Duet 7 has also the very same PCI SSID but requires a different
quirk, ALC287_FIXUP_YOGA7_14TIL_SPEAKERS.

Fortunately, both are with different codecs (C940 with ALC298 and Duet
7 with ALC287), hence we can apply different fixes by checking the
codec ID.  This patch implements that special fixup function.

For easier handling, the internal function for applying a specific
fixup entry is exported as __snd_hda_apply_fixup(), so that it can be
called from the codec driver.  The rest is simply calling it with a
different fixup ID depending on the codec ID.

Reported-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: nikitashvets@flyium.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/5ca147d1-3a2d-60c6-c491-8aa844183222@redhat.com
Link: https://lore.kernel.org/r/20220614054831.14648-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-15 07:28:51 +02:00
Oleksij Rempel
56315b6bf7 ARM: dts: at91: ksz9477_evb: fix port/phy validation
Latest drivers version requires phy-mode to be set. Otherwise we will
use "NA" mode and the switch driver will invalidate this port mode.

Fixes: 65ac79e181 ("net: dsa: microchip: add the phylink get_caps")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20220610081621.584393-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 22:11:02 -07:00
Tyler Hicks
2a3dcbccd6 9p: Fix refcounting during full path walks for fid lookups
Decrement the refcount of the parent dentry's fid after walking
each path component during a full path walk for a lookup. Failure to do
so can lead to fids that are not clunked until the filesystem is
unmounted, as indicated by this warning:

 9pnet: found fid 3 not clunked

The improper refcounting after walking resulted in open(2) returning
-EIO on any directories underneath the mount point when using the virtio
transport. When using the fd transport, there's no apparent issue until
the filesytem is unmounted and the warning above is emitted to the logs.

In some cases, the user may not yet be attached to the filesystem and a
new root fid, associated with the user, is created and attached to the
root dentry before the full path walk is performed. Increment the new
root fid's refcount to two in that situation so that it can be safely
decremented to one after it is used for the walk operation. The new fid
will still be attached to the root dentry when
v9fs_fid_lookup_with_uid() returns so a final refcount of one is
correct/expected.

Link: https://lkml.kernel.org/r/20220527000003.355812-2-tyhicks@linux.microsoft.com
Link: https://lkml.kernel.org/r/20220612085330.1451496-4-asmadeus@codewreck.org
Fixes: 6636b6dcc3 ("9p: add refcount to p9_fid struct")
Cc: stable@vger.kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
[Dominique: fix clunking fid multiple times discussed in second link]
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-06-15 12:05:33 +09:00
Dominique Martinet
e5690f2632 9p: fix fid refcount leak in v9fs_vfs_get_link
we check for protocol version later than required, after a fid has
been obtained. Just move the version check earlier.

Link: https://lkml.kernel.org/r/20220612085330.1451496-3-asmadeus@codewreck.org
Fixes: 6636b6dcc3 ("9p: add refcount to p9_fid struct")
Cc: stable@vger.kernel.org
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-06-15 12:05:29 +09:00
Dominique Martinet
beca774fc5 9p: fix fid refcount leak in v9fs_vfs_atomic_open_dotl
We need to release directory fid if we fail halfway through open

This fixes fid leaking with xfstests generic 531

Link: https://lkml.kernel.org/r/20220612085330.1451496-2-asmadeus@codewreck.org
Fixes: 6636b6dcc3 ("9p: add refcount to p9_fid struct")
Cc: stable@vger.kernel.org
Reported-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-06-15 12:05:21 +09:00
Christophe JAILLET
d7dd6eccfb net: bgmac: Fix an erroneous kfree() in bgmac_remove()
'bgmac' is part of a managed resource allocated with bgmac_alloc(). It
should not be freed explicitly.

Remove the erroneous kfree() from the .remove() function.

Fixes: 34a5102c32 ("net: bgmac: allocate struct bgmac just once & don't copy it")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/a026153108dd21239036a032b95c25b5cece253b.1655153616.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 19:16:36 -07:00
Chevron Li
e591fcf6b4 mmc: sdhci-pci-o2micro: Fix card detect by dealing with debouncing
The result from ->get_cd() may be incorrect as the card detect debouncing
isn't managed correctly. Let's fix it.

Signed-off-by: Chevron Li<chevron.li@bayhubtech.com>
Fixes: 7d44061704 ("mmc: sdhci-pci-o2micro: Fix O2 Host data read/write DLL Lock phase shift issue")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220602132543.596-1-chevron.li@bayhubtech.com
[Ulf: Updated the commit message]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-06-14 13:52:53 -07:00
Christophe JAILLET
de87b603b0 i2c: mediatek: Fix an error handling path in mtk_i2c_probe()
The clsk are prepared, enabled, then disabled. So if an error occurs after
the disable step, they are still prepared.

Add an error handling path to unprepare the clks in such a case, as already
done in the .remove function.

Fixes: 8b4fc246c3 ("i2c: mediatek: Optimize master_xfer() and avoid circular locking")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Qii Wang <qii.wang@mediatek.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-06-14 22:11:54 +02:00
Jonathan Marek
62b5e322fb drm/msm: use for_each_sgtable_sg to iterate over scatterlist
The dma_map_sgtable() call (used to invalidate cache) overwrites sgt->nents
with 1, so msm_iommu_pagetable_map maps only the first physical segment.

To fix this problem use for_each_sgtable_sg(), which uses orig_nents.

Fixes: b145c6e65e ("drm/msm: Add support to create a local pagetable")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Link: https://lore.kernel.org/r/20220613221019.11399-1-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-14 11:54:38 -07:00
Linus Torvalds
018ab4fabd netfs: fix up netfs_inode_init() docbook comment
Commit e81fb4198e ("netfs: Further cleanups after struct netfs_inode
wrapper introduced") changed the argument types and names, and actually
updated the comment too (although that was thanks to David Howells, not
me: my original patch only changed the code).

But the comment fixup didn't go quite far enough, and didn't change the
argument name in the comment, resulting in

  include/linux/netfs.h:314: warning: Function parameter or member 'ctx' not described in 'netfs_inode_init'
  include/linux/netfs.h:314: warning: Excess function parameter 'inode' description in 'netfs_inode_init'

during htmldoc generation.

Fixes: e81fb4198e ("netfs: Further cleanups after struct netfs_inode wrapper introduced")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-14 10:36:11 -07:00
Mark Brown
795285ef24 selftests: Fix clang cross compilation
Unlike GCC clang uses a single compiler image to support multiple target
architectures meaning that we can't simply rely on CROSS_COMPILE to select
the output architecture. Instead we must pass --target to the compiler to
tell it what to output, kselftest was not doing this so cross compilation
of kselftest using clang resulted in kselftest being built for the host
architecture.

More work is required to fix tests using custom rules but this gets the
bulk of things building.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-06-14 11:24:24 -06:00
Roman Li
4fd17f2ac0 drm/amd/display: Cap OLED brightness per max frame-average luminance
[Why]
For OLED eDP the Display Manager uses max_cll value as a limit
for brightness control.
max_cll defines the content light luminance for individual pixel.
Whereas max_fall defines frame-average level luminance.
The user may not observe the difference in brightness in between
max_fall and max_cll.
That negatively impacts the user experience.

[How]
Use max_fall value instead of max_cll as a limit for brightness control.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-06-14 13:11:11 -04:00
Michel Dänzer
c904e3acba drm/amdgpu: Fix GTT size reporting in amdgpu_ioctl
The commit below changed the TTM manager size unit from pages to
bytes, but failed to adjust the corresponding calculations in
amdgpu_ioctl.

Fixes: dfa714b88e ("drm/amdgpu: remove GTT accounting v2")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1930
Bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6642
Tested-by: Martin Roukala <martin.roukala@mupuf.org>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x
2022-06-14 13:10:06 -04:00
Pavel Begunkov
d884b6498d io_uring: remove IORING_CLOSE_FD_AND_FILE_SLOT
This partially reverts a7c41b4687

Even though IORING_CLOSE_FD_AND_FILE_SLOT might save cycles for some
users, but it tries to do two things at a time and it's not clear how to
handle errors and what to return in a single result field when one part
fails and another completes well. Kill it for now.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/837c745019b3795941eee4fcfd7de697886d645b.1655224415.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-14 10:57:40 -06:00
Pavel Begunkov
aa165d6d2b Revert "io_uring: add buffer selection support to IORING_OP_NOP"
This reverts commit 3d200242a6.

Buffer selection with nops was used for debugging and benchmarking but
is useless in real life. Let's revert it before it's released.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c5012098ca6b51dfbdcb190f8c4e3c0bf1c965dc.1655224415.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-14 10:57:40 -06:00
Pavel Begunkov
8899ce4b2f Revert "io_uring: support CQE32 for nop operation"
This reverts commit 2bb04df7c2.

CQE32 nops were used for debugging and benchmarking but it doesn't
target any real use case. Revert it, we can return it back if someone
finds a good way to use it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/5ff623d84ccb4b3f3b92a3ea41cdcfa612f3d96f.1655224415.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-14 10:57:40 -06:00
Przemyslaw Patynowski
efe4186000 ice: Fix memory corruption in VF driver
Disable VF's RX/TX queues, when it's disabled. VF can have queues enabled,
when it requests a reset. If PF driver assumes that VF is disabled,
while VF still has queues configured, VF may unmap DMA resources.
In such scenario device still can map packets to memory, which ends up
silently corrupting it.
Previously, VF driver could experience memory corruption, which lead to
crash:
[ 5119.170157] BUG: unable to handle kernel paging request at 00001b9780003237
[ 5119.170166] PGD 0 P4D 0
[ 5119.170173] Oops: 0002 [#1] PREEMPT_RT SMP PTI
[ 5119.170181] CPU: 30 PID: 427592 Comm: kworker/u96:2 Kdump: loaded Tainted: G        W I      --------- -  - 4.18.0-372.9.1.rt7.166.el8.x86_64 #1
[ 5119.170189] Hardware name: Dell Inc. PowerEdge R740/014X06, BIOS 2.3.10 08/15/2019
[ 5119.170193] Workqueue: iavf iavf_adminq_task [iavf]
[ 5119.170219] RIP: 0010:__page_frag_cache_drain+0x5/0x30
[ 5119.170238] Code: 0f 0f b6 77 51 85 f6 74 07 31 d2 e9 05 df ff ff e9 90 fe ff ff 48 8b 05 49 db 33 01 eb b4 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <f0> 29 77 34 74 01 c3 48 8b 07 f6 c4 80 74 0f 0f b6 77 51 85 f6 74
[ 5119.170244] RSP: 0018:ffffa43b0bdcfd78 EFLAGS: 00010282
[ 5119.170250] RAX: ffffffff896b3e40 RBX: ffff8fb282524000 RCX: 0000000000000002
[ 5119.170254] RDX: 0000000049000000 RSI: 0000000000000000 RDI: 00001b9780003203
[ 5119.170259] RBP: ffff8fb248217b00 R08: 0000000000000022 R09: 0000000000000009
[ 5119.170262] R10: 2b849d6300000000 R11: 0000000000000020 R12: 0000000000000000
[ 5119.170265] R13: 0000000000001000 R14: 0000000000000009 R15: 0000000000000000
[ 5119.170269] FS:  0000000000000000(0000) GS:ffff8fb1201c0000(0000) knlGS:0000000000000000
[ 5119.170274] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5119.170279] CR2: 00001b9780003237 CR3: 00000008f3e1a003 CR4: 00000000007726e0
[ 5119.170283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5119.170286] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5119.170290] PKRU: 55555554
[ 5119.170292] Call Trace:
[ 5119.170298]  iavf_clean_rx_ring+0xad/0x110 [iavf]
[ 5119.170324]  iavf_free_rx_resources+0xe/0x50 [iavf]
[ 5119.170342]  iavf_free_all_rx_resources.part.51+0x30/0x40 [iavf]
[ 5119.170358]  iavf_virtchnl_completion+0xd8a/0x15b0 [iavf]
[ 5119.170377]  ? iavf_clean_arq_element+0x210/0x280 [iavf]
[ 5119.170397]  iavf_adminq_task+0x126/0x2e0 [iavf]
[ 5119.170416]  process_one_work+0x18f/0x420
[ 5119.170429]  worker_thread+0x30/0x370
[ 5119.170437]  ? process_one_work+0x420/0x420
[ 5119.170445]  kthread+0x151/0x170
[ 5119.170452]  ? set_kthread_struct+0x40/0x40
[ 5119.170460]  ret_from_fork+0x35/0x40
[ 5119.170477] Modules linked in: iavf sctp ip6_udp_tunnel udp_tunnel mlx4_en mlx4_core nfp tls vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM ipt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc intel_rapl_msr iTCO_wdt iTCO_vendor_support dell_smbios wmi_bmof dell_wmi_descriptor dcdbas kvm_intel kvm irqbypass intel_rapl_common isst_if_common skx_edac irdma nfit libnvdimm x86_pkg_temp_thermal i40e intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ib_uverbs rapl ipmi_ssif intel_cstate intel_uncore mei_me pcspkr acpi_ipmi ib_core mei lpc_ich i2c_i801 ipmi_si ipmi_devintf wmi ipmi_msghandler acpi_power_meter xfs libcrc32c sd_mod t10_pi sg mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ice ahci drm libahci crc32c_intel libata tg3 megaraid_sas
[ 5119.170613]  i2c_algo_bit dm_mirror dm_region_hash dm_log dm_mod fuse [last unloaded: iavf]
[ 5119.170627] CR2: 00001b9780003237

Fixes: ec4f5a436b ("ice: Check if VF is disabled for Opcode and other operations")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Co-developed-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-14 09:38:57 -07:00
Przemyslaw Patynowski
be2af71496 ice: Fix queue config fail handling
Disable VF's RX/TX queues, when VIRTCHNL_OP_CONFIG_VSI_QUEUES fail.
Not disabling them might lead to scenario, where PF driver leaves VF
queues enabled, when VF's VSI failed queue config.
In this scenario VF should not have RX/TX queues enabled. If PF failed
to set up VF's queues, VF will reset due to TX timeouts in VF driver.
Initialize iterator 'i' to -1, so if error happens prior to configuring
queues then error path code will not disable queue 0. Loop that
configures queues will is using same iterator, so error path code will
only disable queues that were configured.

Fixes: 77ca27c417 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap")
Suggested-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-14 09:38:57 -07:00
Roman Storozhenko
9542ef4fba ice: Sync VLAN filtering features for DVM
VLAN filtering features, that is C-Tag and S-Tag, in DVM mode must be
both enabled or disabled.
In case of turning off/on only one of the features, another feature must
be turned off/on automatically with issuing an appropriate message to
the kernel log.

Fixes: 1babaf77f4 ("ice: Advertise 802.1ad VLAN filtering and offloads for PF netdev")
Signed-off-by: Roman Storozhenko <roman.storozhenko@intel.com>
Co-developed-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-14 09:38:57 -07:00
Michal Michalik
71a579f0d3 ice: Fix PTP TX timestamp offset calculation
The offset was being incorrectly calculated for E822 - that led to
collisions in choosing TX timestamp register location when more than
one port was trying to use timestamping mechanism.

In E822 one quad is being logically split between ports, so quad 0 is
having trackers for ports 0-3, quad 1 ports 4-7 etc. Each port should
have separate memory location for tracking timestamps. Due to error for
example ports 1 and 2 had been assigned to quad 0 with same offset (0),
while port 1 should have offset 0 and 1 offset 16.

Fix it by correctly calculating quad offset.

Fixes: 3a7496234d ("ice: implement basic E822 PTP support")
Signed-off-by: Michal Michalik <michal.michalik@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-14 09:35:57 -07:00
Linus Torvalds
24625f7d91 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "While last week's pull request contained miscellaneous fixes for x86,
  this one covers other architectures, selftests changes, and a bigger
  series for APIC virtualization bugs that were discovered during 5.20
  development. The idea is to base 5.20 development for KVM on top of
  this tag.

  ARM64:

   - Properly reset the SVE/SME flags on vcpu load

   - Fix a vgic-v2 regression regarding accessing the pending state of a
     HW interrupt from userspace (and make the code common with vgic-v3)

   - Fix access to the idreg range for protected guests

   - Ignore 'kvm-arm.mode=protected' when using VHE

   - Return an error from kvm_arch_init_vm() on allocation failure

   - A bunch of small cleanups (comments, annotations, indentation)

  RISC-V:

   - Typo fix in arch/riscv/kvm/vmid.c

   - Remove broken reference pattern from MAINTAINERS entry

  x86-64:

   - Fix error in page tables with MKTME enabled

   - Dirty page tracking performance test extended to running a nested
     guest

   - Disable APICv/AVIC in cases that it cannot implement correctly"

[ This merge also fixes a misplaced end parenthesis bug introduced in
  commit 3743c2f025 ("KVM: x86: inhibit APICv/AVIC on changes to APIC
  ID or APIC base") pointed out by Sean Christopherson ]

Link: https://lore.kernel.org/all/20220610191813.371682-1-seanjc@google.com/

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (34 commits)
  KVM: selftests: Restrict test region to 48-bit physical addresses when using nested
  KVM: selftests: Add option to run dirty_log_perf_test vCPUs in L2
  KVM: selftests: Clean up LIBKVM files in Makefile
  KVM: selftests: Link selftests directly with lib object files
  KVM: selftests: Drop unnecessary rule for STATIC_LIBS
  KVM: selftests: Add a helper to check EPT/VPID capabilities
  KVM: selftests: Move VMX_EPT_VPID_CAP_AD_BITS to vmx.h
  KVM: selftests: Refactor nested_map() to specify target level
  KVM: selftests: Drop stale function parameter comment for nested_map()
  KVM: selftests: Add option to create 2M and 1G EPT mappings
  KVM: selftests: Replace x86_page_size with PG_LEVEL_XX
  KVM: x86: SVM: fix nested PAUSE filtering when L0 intercepts PAUSE
  KVM: x86: SVM: drop preempt-safe wrappers for avic_vcpu_load/put
  KVM: x86: disable preemption around the call to kvm_arch_vcpu_{un|}blocking
  KVM: x86: disable preemption while updating apicv inhibition
  KVM: x86: SVM: fix avic_kick_target_vcpus_fast
  KVM: x86: SVM: remove avic's broken code that updated APIC ID
  KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base
  KVM: x86: document AVIC/APICv inhibit reasons
  KVM: x86/mmu: Set memory encryption "value", not "mask", in shadow PDPTRs
  ...
2022-06-14 07:57:18 -07:00
Ciara Loftus
a6e944f25c xsk: Fix generic transmit when completion queue reservation fails
Two points of potential failure in the generic transmit function are:

  1. completion queue (cq) reservation failure.
  2. skb allocation failure

Originally the cq reservation was performed first, followed by the skb
allocation. Commit 675716400d ("xdp: fix possible cq entry leak")
reversed the order because at the time there was no mechanism available
to undo the cq reservation which could have led to possible cq entry leaks
in the event of skb allocation failure. However if the skb allocation is
performed first and the cq reservation then fails, the xsk skb destructor
is called which blindly adds the skb address to the already full cq leading
to undefined behavior.

This commit restores the original order (cq reservation followed by skb
allocation) and uses the xskq_prod_cancel helper to undo the cq reserve
in event of skb allocation failure.

Fixes: 675716400d ("xdp: fix possible cq entry leak")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20220614070746.8871-1-ciara.loftus@intel.com
2022-06-14 16:49:29 +02:00
Linus Torvalds
8e8afafb0b Merge tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 MMIO stale data fixes from Thomas Gleixner:
 "Yet another hw vulnerability with a software mitigation: Processor
  MMIO Stale Data.

  They are a class of MMIO-related weaknesses which can expose stale
  data by propagating it into core fill buffers. Data which can then be
  leaked using the usual speculative execution methods.

  Mitigations include this set along with microcode updates and are
  similar to MDS and TAA vulnerabilities: VERW now clears those buffers
  too"

* tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation/mmio: Print SMT warning
  KVM: x86/speculation: Disable Fill buffer clear within guests
  x86/speculation/mmio: Reuse SRBDS mitigation for SBDS
  x86/speculation/srbds: Update SRBDS mitigation selection
  x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data
  x86/speculation/mmio: Enable CPU Fill buffer clearing on idle
  x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations
  x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data
  x86/speculation: Add a common function for MD_CLEAR mitigation update
  x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
  Documentation: Add documentation for Processor MMIO Stale Data
2022-06-14 07:43:15 -07:00
Petr Machata
4b7a632ac4 mlxsw: spectrum_cnt: Reorder counter pools
Both RIF and ACL flow counters use a 24-bit SW-managed counter address to
communicate which counter they want to bind.

In a number of Spectrum FW releases, binding a RIF counter is broken and
slices the counter index to 16 bits. As a result, on Spectrum-2 and above,
no more than about 410 RIF counters can be effectively used. This
translates to 205 netdevices for which L3 HW stats can be enabled. (This
does not happen on Spectrum-1, because there are fewer counters available
overall and the counter index never exceeds 16 bits.)

Binding counters to ACLs does not have this issue. Therefore reorder the
counter allocation scheme so that RIF counters come first and therefore get
lower indices that are below the 16-bit barrier.

Fixes: 98e60dce4d ("Merge branch 'mlxsw-Introduce-initial-Spectrum-2-support'")
Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220613125017.2018162-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-14 16:00:37 +02:00
Marek Szyprowski
7d787184a1 drm/exynos: mic: Rework initialization
Commit dd8b6803bc ("exynos: drm: dsi: Attach in_bridge in MIC driver")
moved Exynos MIC attaching from DSI to MIC driver. However the method
proposed there is incomplete and cannot really work. To properly attach
it to the bridge chain, access to the respective encoder is needed. The
Exynos MIC driver always attaches to the encoder created by the Exynos
DSI driver, so grab it via available helpers for getting access to the
CRTC and encoders. This also requires to change the order of driver
component binding to let DSI to be bound before MIC.

Fixes: dd8b6803bc ("exynos: drm: dsi: Attach in_bridge in MIC driver")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixed merge conflict.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-06-14 22:32:16 +09:00
Dan Carpenter
5c2b745173 drm/exynos: fix IS_ERR() vs NULL check in probe
The of_drm_find_bridge() does not return error pointers, it returns
NULL on error.

Fixes: dd8b6803bc ("exynos: drm: dsi: Attach in_bridge in MIC driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2022-06-14 22:32:03 +09:00
Serge Semin
5e93207e96 bus: bt1-axi: Don't print error on -EPROBE_DEFER
The Baikal-T1 AXI bus driver correctly handles the deferred probe
situation, but still pollutes the system log with a misleading error
message. Let's fix that by using the dev_err_probe() method to print the
log message in case of the clocks/resets request errors.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Link: https://lore.kernel.org/r/20220610104030.28399-2-Sergey.Semin@baikalelectronics.ru'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-14 12:21:44 +02:00
Serge Semin
be5cddef05 bus: bt1-apb: Don't print error on -EPROBE_DEFER
The Baikal-T1 APB bus driver correctly handles the deferred probe
situation, but still pollutes the system log with a misleading error
message. Let's fix that by using the dev_err_probe() method to print the
log message in case of the clocks/resets request errors.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Link: https://lore.kernel.org/r/20220610104030.28399-1-Sergey.Semin@baikalelectronics.ru'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-14 12:21:44 +02:00
Arnd Bergmann
9658904252 Merge tag 'arm-soc/for-5.19/maintainers-fixes' of https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains MAINTAINERS file update for Broadcom SoCs,
please pull the following:

- Nicolas steps down from maintaining the BCM283x/BCM2711 (Raspberry
Pi) architecture and designates Florian to take care of that

* tag 'arm-soc/for-5.19/maintainers-fixes' of https://github.com/Broadcom/stblinux:
  MAINTAINERS: Update BCM2711/BCM2835 maintainer

Link: https://lore.kernel.org/r/20220608093132.1465703-2-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-14 12:20:47 +02:00
Arnd Bergmann
11bb764fbf Merge tag 'arm-soc/for-5.19/drivers-fixes' of https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM-based SoCs driver fixes for
5.19, please pull the following:

- Miaoqian fixes a device tree node reference count in the system sleep
  code for Broadcom STB chips

* tag 'arm-soc/for-5.19/drivers-fixes' of https://github.com/Broadcom/stblinux:
  soc: bcm: brcmstb: pm: pm-arm: Fix refcount leak in brcmstb_pm_probe

Link: https://lore.kernel.org/r/20220608093132.1465703-1-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-14 12:20:31 +02:00
Arnd Bergmann
2d2cb31bd3 Merge tag 's32g2-fixes-5.19' of https://github.com/chesterlintw/linux-s32g into arm/fixes
S32G2 fixes for 5.19

- MAINTAINERS: Add s32@nxp.com as a review group.
- dts: Pass unit name to soc node to fix a W=1 build warning.

* tag 's32g2-fixes-5.19' of https://github.com/chesterlintw/linux-s32g:
  MAINTAINERS: add a new reviewer for S32G
  arm64: s32g2: Pass unit name to soc node

Link: https://lore.kernel.org/r/Yp9Y4nGrQ2kVKV6S@linux-8mug
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-14 12:19:58 +02:00
Miaoqian Lin
7c7ff68daa ARM: Fix refcount leak in axxia_boot_secondary
of_find_compatible_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.

Fixes: 1d22924e1c ("ARM: Add platform support for LSI AXM55xx SoC")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220601090548.47616-1-linmq006@gmail.com'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-14 12:19:13 +02:00
Christian Brauner
168f912893 fs: account for group membership
When calling setattr_prepare() to determine the validity of the
attributes the ia_{g,u}id fields contain the value that will be written
to inode->i_{g,u}id. This is exactly the same for idmapped and
non-idmapped mounts and allows callers to pass in the values they want
to see written to inode->i_{g,u}id.

When group ownership is changed a caller whose fsuid owns the inode can
change the group of the inode to any group they are a member of. When
searching through the caller's groups we need to use the gid mapped
according to the idmapped mount otherwise we will fail to change
ownership for unprivileged users.

Consider a caller running with fsuid and fsgid 1000 using an idmapped
mount that maps id 65534 to 1000 and 65535 to 1001. Consequently, a file
owned by 65534:65535 in the filesystem will be owned by 1000:1001 in the
idmapped mount.

The caller now requests the gid of the file to be changed to 1000 going
through the idmapped mount. In the vfs we will immediately map the
requested gid to the value that will need to be written to inode->i_gid
and place it in attr->ia_gid. Since this idmapped mount maps 65534 to
1000 we place 65534 in attr->ia_gid.

When we check whether the caller is allowed to change group ownership we
first validate that their fsuid matches the inode's uid. The
inode->i_uid is 65534 which is mapped to uid 1000 in the idmapped mount.
Since the caller's fsuid is 1000 we pass the check.

We now check whether the caller is allowed to change inode->i_gid to the
requested gid by calling in_group_p(). This will compare the passed in
gid to the caller's fsgid and search the caller's additional groups.

Since we're dealing with an idmapped mount we need to pass in the gid
mapped according to the idmapped mount. This is akin to checking whether
a caller is privileged over the future group the inode is owned by. And
that needs to take the idmapped mount into account. Note, all helpers
are nops without idmapped mounts.

New regression test sent to xfstests.

Link: https://github.com/lxc/lxd/issues/10537
Link: https://lore.kernel.org/r/20220613111517.2186646-1-brauner@kernel.org
Fixes: 2f221d6f7b ("attr: handle idmapped mounts")
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org # 5.15+
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-06-14 12:18:47 +02:00
Alexandre Torgue
89931cb463 ARM: dts: stm32: move SCMI related nodes in a dedicated file for stm32mp15
Adding a "secure" version of STM32 boards (DK1/DK2/ED1/EV1), SCMI (clock/
reset) protocol and OP-TEE node have been added in SoC dtsi file
(stm32mp151.dtsi). They have been added with a status disabled in order to
keep our legacy unchanged. It is actually not enough to keep our legacy
unchanged.

First, just a reminder about our use case: TF-A (BL2) loads and starts
OP-TEE, then loads and runs U-Boot. U-Boot code checks if an OP-TEE is
running, if yes it searches in Kernel device tree if an OP-TEE node is
present:

-If the OP-TEE node is not present then U-Boot copies OP-TEE node and its
reserved memory region from U-Boot device tree to the kernel device tree.

-If the OP-TEE node is present then it does nothing (this OP-TEE node will
be used by Linux). So U-Boot lets the kernel device tree unchanged thinking
it is correct for an OP-TEE usage. It is the case for our legacy boards,
the OP-TEE node is present (although disabled) but the reserved memory
region is not declared. As no memory region has been reserved for OP-TEE,
the end of DDR is seen by the kernel as free and then used for CMA. But as
OP-TEE is running, this end of DDR is already used by OP-TEE. So as soon as
kernel tries to access to the CMA region OP-TEE raises an error.

To fix it, all OP-TEE node and SCMI is moved in a dedicated file.

Fixes: 40b4157dbd ("ARM: dts: stm32: enable optee firmware and SCMI support on STM32MP15")
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Link: https://lore.kernel.org/r/20220613071920.5463-1-alexandre.torgue@foss.st.com'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-14 12:17:54 +02:00
Arnd Bergmann
002ec15747 Merge tag 'scmi-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm SCMI firmware driver fixes for v5.19

Bunch of fixes to address:
1. Issues reported on RK3568 EVB1 and BPI-R2 pro platforms using SCMI.
   More checks were added to validate the firmware response but that
   resulted in breaking above platforms, so the checks are relaxed when
   for cases where there is no potential memory corruption issues.

2. Possible data leak by reading more than required length from the firmware.
   Recent addition of support for v3.1 extended names used larger buffers
   in the kernel and used their size to read response from the firmware even
   for cases where shorter formats are used. While that is mostly harmless
   except when firmware sends malformed non-NULL terminated buffers.

3. Possible issues sending unsupported commands to the firmware.
   SENSOR_AXIS_NAME_GET added in v3.1 needs to be used only if the firmware
   supports it. While the firmware conformant to the spec must return not
   supported error for any unsupported features, it is always safer to
   avoid issuing commands that are known to be unsupported.

4. Incorrect error propagation in scmi_voltage_descriptors_get.
   Since the return value is not reset for each iteration of the loop, the
   error value in the previous iteration will be carried for the current one.
   Fix that by not saving the return values into local variable.

5. Some warnings reported by cppcheck

* tag 'scmi-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
  firmware: arm_scmi: Fix incorrect error propagation in scmi_voltage_descriptors_get
  firmware: arm_scmi: Avoid using extended string-buffers sizes if not necessary
  firmware: arm_scmi: Fix SENSOR_AXIS_NAME_GET behaviour when unsupported
  firmware: arm_scmi: Remove all the unused local variables
  firmware: arm_scmi: Relax base protocol sanity checks on the protocol list

Link: https://lore.kernel.org/r/20220614100007.1029881-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-14 12:16:35 +02:00
Arnd Bergmann
2916bf2233 Merge tag 'imx-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 5.19:

- Correct i.MX7 power domain for HSIC USB PHY node to fix an USB Host
  issue, that is all downstream events will be lost if USB host is
  runtime suspended.
- Fix i.MX8M blk-ctrl LCDIF2 power domain to point to refer to the
  correct clock.
- Correct i.MX6Q/DL PU regulator ramp delay to fix some peripherals
  power-up failure especially when the chip is at a low temperature.
- Fix capacitive touch reset polarity for imx6qdl-colibri board.

* tag 'imx-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  soc: imx: imx8m-blk-ctrl: fix display clock for LCDIF2 power domain
  ARM: dts: imx6qdl-colibri: Fix capacitive touch reset polarity
  ARM: dts: imx6qdl: correct PU regulator ramp delay
  ARM: dts: imx7: Move hsic_phy power domain to HSIC PHY node

Link: https://lore.kernel.org/r/20220614095515.GU254723@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-14 12:15:44 +02:00
Christian König
0f9cd1ea10 drm/ttm: fix bulk move handling v2
The resource must be on the LRU before ttm_lru_bulk_move_add() is called
and we need to check if the BO is pinned or not before adding it.

Additional to that we missed taking the LRU spinlock in ttm_bo_unpin().

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220613080816.4965-1-christian.koenig@amd.com
Fixes: fee2ede155 ("drm/ttm: rework bulk move handling v5")
2022-06-14 11:15:19 +02:00
Jonathan Neuschäfer
9cc8ea99bf docs: networking: phy: Fix a typo
Write "to be operated" instead of "to be operate".

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220610072809.352962-1-j.neuschaefer@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-13 23:12:44 -07:00
Jean-Philippe Brucker
884c65e4da amd-xgbe: Use platform_irq_count()
The AMD XGbE driver currently counts the number of interrupts assigned
to the device by inspecting the pdev->resource array. Since commit
a1a2b7125e ("of/platform: Drop static setup of IRQ resource from DT
core") removed IRQs from this array, the driver now attempts to get all
interrupts from 1 to -1U and gives up probing once it reaches an invalid
interrupt index.

Obtain the number of IRQs with platform_irq_count() instead.

Fixes: a1a2b7125e ("of/platform: Drop static setup of IRQ resource from DT core")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20220609161457.69614-1-jean-philippe@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-13 23:12:39 -07:00
Alexander Stein
7c7eaeefb0 soc: imx: imx8m-blk-ctrl: fix display clock for LCDIF2 power domain
LCDIF2 has its own display clock, use this one.

Fixes: 07614fed00 ("soc: imx: imx8m-blk-ctrl: Add i.MX8MP media blk-ctrl")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Paul Elder <paul.elder@ideasonboard.com>
Tested-by: Martyn Welch <martyn.welch@collabora.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2022-06-14 11:53:42 +08:00
Max Krummenacher
b426310e50 ARM: dts: imx6qdl-colibri: Fix capacitive touch reset polarity
The commit feedaacdad ("Input: atmel_mxt_ts - fix up inverted RESET
handler") requires the reset GPIO to have GPIO_ACTIVE_LOW.

Fixes: 1524b27c94 ("ARM: dts: imx6dl-colibri: Move common nodes to SoM dtsi")
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Max Krummenacher <max.krummenacher@toradex.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2022-06-14 11:53:42 +08:00
Lucas Stach
93a8ba2a61 ARM: dts: imx6qdl: correct PU regulator ramp delay
Contrary to what was believed at the time, the ramp delay of 150us is not
plenty for the PU LDO with the default step time of 512 pulses of the 24MHz
clock. Measurements have shown that after enabling the LDO the voltage on
VDDPU_CAP jumps to ~750mV in the first step and after that the regulator
executes the normal ramp up as defined by the step size control.

This means it takes the regulator between 360us and 370us to ramp up to
the nominal 1.15V voltage for this power domain. With the old setting of
the ramp delay the power up of the PU GPC domain would happen in the middle
of the regulator ramp with the voltage being at around 900mV. Apparently
this was enough for most units to properly power up the peripherals in the
domain and execute the reset. Some units however, fail to power up properly,
especially when the chip is at a low temperature. In that case any access
to the GPU registers would yield an incorrect result with no way to recover
from this situation.

Change the ramp delay to 380us to cover the measured ramp up time with a
bit of additional slack.

Fixes: 40130d327f ("ARM: dts: imx6qdl: Allow disabling the PU regulator, add a enable ramp delay")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2022-06-14 11:53:34 +08:00
Sergey Gorenko
f6eed15f3e scsi: iscsi: Exclude zero from the endpoint ID range
The kernel returns an endpoint ID as r.ep_connect_ret.handle in the
iscsi_uevent. The iscsid validates a received endpoint ID and treats zero
as an error. The commit referenced in the fixes line changed the endpoint
ID range, and zero is always assigned to the first endpoint ID.  So, the
first attempt to create a new iSER connection always fails.

Link: https://lore.kernel.org/r/20220613123854.55073-1-sergeygo@nvidia.com
Fixes: 3c6ae371b8 ("scsi: iscsi: Release endpoint ID when its freed")
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Sergey Gorenko <sergeygo@nvidia.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-13 22:11:36 -04:00
Rob Clark
49e4776100 drm/msm: Switch ordering of runpm put vs devfreq_idle
In msm_devfreq_suspend() we cancel idle_work synchronously so that it
doesn't run after we power of the hw or in the resume path.  But this
means that we want to ensure that idle_work is not scheduled *after* we
no longer hold a runpm ref.  So switch the ordering of pm_runtime_put()
vs msm_devfreq_idle().

v2. Only move the runpm _put_autosuspend, and not the _mark_last_busy()

Fixes: 9bc9557017 ("drm/msm: Devfreq tuning")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20210927152928.831245-1-robdclark@gmail.com
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20220608161334.2140611-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-13 15:32:32 -07:00
rasheed.hsueh
43047e082b nvme-pci: disable write zeros support on UMIC and Samsung SSDs
Like commit 5611ec2b98 ("nvme-pci: prevent SK hynix PC400 from using
Write Zeroes command"), UMIS and Samsung has the same issue:
[ 6305.633887] blk_update_request: operation not supported error,
dev nvme0n1, sector 340812032 op 0x9:(WRITE_ZEROES) flags 0x0
phys_seg 0 prio class 0

So also disable Write Zeroes command on UMIS and Samsung.

Signed-off-by: rasheed.hsueh <rasheed.hsueh@lcfc.corp-partner.google.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-13 19:56:57 +02:00
Ning Wang
6b961bce50 nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs
When ZHITAI TiPro7000 SSDs entered deepest power state(ps4)
it has the same APST sleep problem as Kingston A2000.
by chance the system crashes and displays the same dmesg info:

https://bugzilla.kernel.org/show_bug.cgi?id=195039#c65

As the Archlinux wiki suggest (enlat + exlat) < 25000 is fine
and my testing shows no system crashes ever since.
Therefore disabling the deepest power state will fix the APST sleep issue.

https://wiki.archlinux.org/title/Solid_state_drive/NVMe

This is the APST data from 'nvme id-ctrl /dev/nvme1'

NVME Identify Controller:
vid       : 0x1e49
ssvid     : 0x1e49
sn        : [...]
mn        : ZHITAI TiPro7000 1TB
fr        : ZTA32F3Y
[...]
ps    0 : mp:3.50W operational enlat:5 exlat:5 rrt:0 rrl:0
          rwt:0 rwl:0 idle_power:- active_power:-
ps    1 : mp:3.30W operational enlat:50 exlat:100 rrt:1 rrl:1
          rwt:1 rwl:1 idle_power:- active_power:-
ps    2 : mp:2.80W operational enlat:50 exlat:200 rrt:2 rrl:2
          rwt:2 rwl:2 idle_power:- active_power:-
ps    3 : mp:0.1500W non-operational enlat:500 exlat:5000 rrt:3 rrl:3
          rwt:3 rwl:3 idle_power:- active_power:-
ps    4 : mp:0.0200W non-operational enlat:2000 exlat:60000 rrt:4 rrl:4
          rwt:4 rwl:4 idle_power:- active_power:-

Signed-off-by: Ning Wang <ningwang35@outlook.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-13 19:56:56 +02:00
Keith Busch
c4f01a776b nvme-pci: sk hynix p31 has bogus namespace ids
Add the quirk.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216049
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-13 19:56:56 +02:00
Keith Busch
c98a879312 nvme-pci: smi has bogus namespace ids
Add the quirk.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216096
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-13 19:56:56 +02:00
Keith Busch
2cf7a77ed5 nvme-pci: phison e12 has bogus namespace ids
Add the quirk.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216049
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-13 19:56:56 +02:00
Stefan Reiter
3765fad508 nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50
ADATA XPG GAMMIX S50 drives report bogus eui64 values that appear to
be the same across drives in one system. Quirk them out so they are
not marked as "non globally unique" duplicates.

Signed-off-by: Stefan Reiter <stefan@pimaker.at>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-13 19:56:56 +02:00
Keith Busch
4641a8e6e1 nvme-pci: add trouble shooting steps for timeouts
Many users have encountered IO timeouts with a CSTS value of 0xffffffff,
which indicates a failure to read the register. While there are various
potential causes for this observation, faulty NVMe APST has been the
culprit quite frequently. Add the recommended troubleshooting steps in
the error output when this condition occurs.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-13 19:56:56 +02:00
Keith Busch
2f0dad1719 nvme: add bug report info for global duplicate id
The recent global id check is finding poorly implemented devices in the
wild. Include relavant device information in the output to help quicken
an appropriate quirk patch.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-13 19:54:14 +02:00
Thomas Weißschuh
1fc766b5c0 nvme: add device name to warning in uuid_show()
This provides more context to users.

Old message:

[   00.000000] No UUID available providing old NGUID

New message:

[   00.000000] block nvme0n1: No UUID available providing old NGUID

Fixes: d934f9848a ("nvme: provide UUID value to userspace")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-06-13 19:54:04 +02:00
Matthew Wilcox (Oracle)
1dfbe9fcda usercopy: Make usercopy resilient against ridiculously large copies
If 'n' is so large that it's negative, we might wrap around and mistakenly
think that the copy is OK when it's not.  Such a copy would probably
crash, but just doing the arithmetic in a more simple way lets us detect
and refuse this case.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Tested-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220612213227.3881769-4-willy@infradead.org
2022-06-13 09:54:52 -07:00
Matthew Wilcox (Oracle)
35fb9ae4aa usercopy: Cast pointer to an integer once
Get rid of a lot of annoying casts by setting 'addr' once at the top
of the function.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Tested-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220612213227.3881769-3-willy@infradead.org
2022-06-13 09:54:52 -07:00
Matthew Wilcox (Oracle)
993d0b287e usercopy: Handle vm_map_ram() areas
vmalloc does not allocate a vm_struct for vm_map_ram() areas.  That causes
us to deny usercopies from those areas.  This affects XFS which uses
vm_map_ram() for its directories.

Fix this by calling find_vmap_area() instead of find_vm_area().

Fixes: 0aef499f31 ("mm/usercopy: Detect vmalloc overruns")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Tested-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220612213227.3881769-2-willy@infradead.org
2022-06-13 09:54:52 -07:00
Sami Tolvanen
57cd6d157e cfi: Fix __cfi_slowpath_diag RCU usage with cpuidle
RCU_NONIDLE usage during __cfi_slowpath_diag can result in an invalid
RCU state in the cpuidle code path:

  WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:613 rcu_eqs_enter+0xe4/0x138
  ...
  Call trace:
    rcu_eqs_enter+0xe4/0x138
    rcu_idle_enter+0xa8/0x100
    cpuidle_enter_state+0x154/0x3a8
    cpuidle_enter+0x3c/0x58
    do_idle.llvm.6590768638138871020+0x1f4/0x2ec
    cpu_startup_entry+0x28/0x2c
    secondary_start_kernel+0x1b8/0x220
    __secondary_switched+0x94/0x98

Instead, call rcu_irq_enter/exit to wake up RCU only when needed and
disable interrupts for the entire CFI shadow/module check when we do.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20220531175910.890307-1-samitolvanen@google.com
Fixes: cf68fffb66 ("add support for Clang CFI")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-06-13 09:18:46 -07:00
Sander Vanheule
a01a40e334 gpio: realtek-otto: Make the irqchip immutable
Since commit 6c846d026d ("gpio: Don't fiddle with irqchips marked as
immutable") a warning is issued for the realtek-otto driver:

    gpio gpiochip0: (18003500.gpio): not an immutable chip, please consider fixing it!

Make the driver's irqchip immutable to fix this.

Signed-off-by: Sander Vanheule <sander@svanheule.net>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-06-13 18:13:36 +02:00
Tom Schwindl
30756cc164 docs: driver-api: gpio: Fix filename mismatch
The filenames were changed a while ago, but board.rst, consumer.rst and
intro.rst still refer to the old names. Fix those references to match the
Actual names and avoid possible confusion.

Signed-off-by: Tom Schwindl <schwindl@posteo.de>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-06-13 18:12:24 +02:00
Lukas Bulwahn
97a4087a36 MAINTAINERS: add include/dt-bindings/gpio to GPIO SUBSYSTEM
Maintainers of the directory Documentation/devicetree/bindings/gpio
are also the maintainers of the corresponding directory
include/dt-bindings/gpio.

Add the file entry for include/dt-bindings/gpio to the appropriate
section in MAINTAINERS.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-06-13 18:11:24 +02:00
Kailang Yang
fe6900bd81 ALSA: hda/realtek - ALC897 headset MIC no sound
There is not have Headset Mic verb table in BIOS default.
So, it will have recording issue from headset MIC.
Add the verb table value without jack detect. It will turn on Headset Mic.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/719133a27d8844a890002cb817001dfa@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-13 18:01:05 +02:00
Jann Horn
eeaa345e12 mm/slub: add missing TID updates on slab deactivation
The fastpath in slab_alloc_node() assumes that c->slab is stable as long as
the TID stays the same. However, two places in __slab_alloc() currently
don't update the TID when deactivating the CPU slab.

If multiple operations race the right way, this could lead to an object
getting lost; or, in an even more unlikely situation, it could even lead to
an object being freed onto the wrong slab's freelist, messing up the
`inuse` counter and eventually causing a page to be freed to the page
allocator while it still contains slab objects.

(I haven't actually tested these cases though, this is just based on
looking at the code. Writing testcases for this stuff seems like it'd be
a pain...)

The race leading to state inconsistency is (all operations on the same CPU
and kmem_cache):

 - task A: begin do_slab_free():
    - read TID
    - read pcpu freelist (==NULL)
    - check `slab == c->slab` (true)
 - [PREEMPT A->B]
 - task B: begin slab_alloc_node():
    - fastpath fails (`c->freelist` is NULL)
    - enter __slab_alloc()
    - slub_get_cpu_ptr() (disables preemption)
    - enter ___slab_alloc()
    - take local_lock_irqsave()
    - read c->freelist as NULL
    - get_freelist() returns NULL
    - write `c->slab = NULL`
    - drop local_unlock_irqrestore()
    - goto new_slab
    - slub_percpu_partial() is NULL
    - get_partial() returns NULL
    - slub_put_cpu_ptr() (enables preemption)
 - [PREEMPT B->A]
 - task A: finish do_slab_free():
    - this_cpu_cmpxchg_double() succeeds()
    - [CORRUPT STATE: c->slab==NULL, c->freelist!=NULL]

From there, the object on c->freelist will get lost if task B is allowed to
continue from here: It will proceed to the retry_load_slab label,
set c->slab, then jump to load_freelist, which clobbers c->freelist.

But if we instead continue as follows, we get worse corruption:

 - task A: run __slab_free() on object from other struct slab:
    - CPU_PARTIAL_FREE case (slab was on no list, is now on pcpu partial)
 - task A: run slab_alloc_node() with NUMA node constraint:
    - fastpath fails (c->slab is NULL)
    - call __slab_alloc()
    - slub_get_cpu_ptr() (disables preemption)
    - enter ___slab_alloc()
    - c->slab is NULL: goto new_slab
    - slub_percpu_partial() is non-NULL
    - set c->slab to slub_percpu_partial(c)
    - [CORRUPT STATE: c->slab points to slab-1, c->freelist has objects
      from slab-2]
    - goto redo
    - node_match() fails
    - goto deactivate_slab
    - existing c->freelist is passed into deactivate_slab()
    - inuse count of slab-1 is decremented to account for object from
      slab-2

At this point, the inuse count of slab-1 is 1 lower than it should be.
This means that if we free all allocated objects in slab-1 except for one,
SLUB will think that slab-1 is completely unused, and may free its page,
leading to use-after-free.

Fixes: c17dda40a6 ("slub: Separate out kmem_cache_cpu processing from deactivate_slab")
Fixes: 03e404af26 ("slub: fast release on full slab")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220608182205.2945720-1-jannh@google.com
2022-06-13 17:41:36 +02:00
Sebastian Andrzej Siewior
c4cf678559 mm/slub: Move the stackdepot related allocation out of IRQ-off section.
The set_track() invocation in free_debug_processing() is invoked with
acquired slab_lock(). The lock disables interrupts on PREEMPT_RT and
this forbids to allocate memory which is done in stack_depot_save().

Split set_track() into two parts: set_track_prepare() which allocate
memory and set_track_update() which only performs the assignment of the
trace data structure. Use set_track_prepare() before disabling
interrupts.

[ vbabka@suse.cz: make set_track() call set_track_update() instead of
  open-coded assignments ]

Fixes: 5cf909c553 ("mm/slub: use stackdepot to save stack trace in objects")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/Yp9sqoUi4fVa5ExF@linutronix.de
2022-06-13 17:26:20 +02:00
Serge Semin
27071b5cbc i2c: designware: Use standard optional ref clock implementation
Even though the DW I2C controller reference clock source is requested by
the method devm_clk_get() with non-optional clock requirement the way the
clock handler is used afterwards has a pure optional clock semantic
(though in some circumstances we can get a warning about the clock missing
printed in the system console). There is no point in reimplementing that
functionality seeing the kernel clock framework already supports the
optional interface from scratch. Thus let's convert the platform driver to
using it.

Note by providing this commit we get to fix two problems. The first one
was introduced in commit c62ebb3d5f ("i2c: designware: Add support for
an interface clock"). It causes not having the interface clock (pclk)
enabled/disabled in case if the reference clock isn't provided. The second
problem was first introduced in commit b33af11de2 ("i2c: designware: Do
not require clock when SSCN and FFCN are provided"). Since that
modification the deferred probe procedure has been unsupported in case if
the interface clock isn't ready.

Fixes: c62ebb3d5f ("i2c: designware: Add support for an interface clock")
Fixes: b33af11de2 ("i2c: designware: Do not require clock when SSCN and FFCN are provided")
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-06-13 16:50:27 +02:00
Wolfram Sang
5edc99f0c5 MAINTAINERS: core DT include belongs to core
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-06-13 16:45:19 +02:00
Lukas Bulwahn
6e21408774 MAINTAINERS: add include/dt-bindings/i2c to I2C SUBSYSTEM HOST DRIVERS
Maintainers of the directory Documentation/devicetree/bindings/i2c
are also the maintainers of the corresponding directory
include/dt-bindings/i2c.

Add the file entry for include/dt-bindings/i2c to the appropriate
section in MAINTAINERS.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-06-13 16:43:25 +02:00
Jens Axboe
feaf625e70 Merge branch 'io_uring/io_uring-5.19' of https://github.com/isilence/linux into io_uring-5.19
Pull io_uring fixes from Pavel.

* 'io_uring/io_uring-5.19' of https://github.com/isilence/linux:
  io_uring: fix double unlock for pbuf select
  io_uring: kbuf: fix bug of not consuming ring buffer in partial io case
  io_uring: openclose: fix bug of closing wrong fixed file
  io_uring: fix not locked access to fixed buf table
  io_uring: fix races with buffer table unregister
  io_uring: fix races with file table unregister
2022-06-13 06:52:52 -06:00
Suman Ghosh
619c010a65 octeontx2-vf: Add support for adaptive interrupt coalescing
Fixes: 6e144b47f5 (octeontx2-pf: Add support for adaptive interrupt coalescing)
Added support for VF interfaces as well.

Signed-off-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 13:42:24 +01:00
David S. Miller
5f7b84151a xilinx: Fix build on x86.
CONFIG_64BIT is not sufficient for checking for availability of
iowrite64() and friends.

Also, the out_addr helpers need to be inline.

Fixes: b690f8df64 ("net: axienet: Use iowrite64 to write all 64b descriptor pointers")
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:49:21 +01:00
David S. Miller
a7ffce959c Merge branch 'axienet-fixes'
Andy Chiu says:

====================
net: axienet: fix DMA Tx error

We ran into multiple DMA TX errors while writing files over a network
block device running on top of a DMA-connected AXI Ethernet device on
64-bit RISC-V machines. The errors indicated that the DMA had fetched a
null descriptor and we found that the reason for this is that AXI DMA had
unexpectedly processed a partially updated tail descriptor pointer. To
fix it, we suggest that the driver should use one 64-bit write instead
of two 32-bit writes to perform such update if possible. For those
archectures where double-word load/stores are unavailable, e.g. 32-bit
archectures, force a driver probe failure if the driver finds 64-bit
capability on DMA.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:36:56 +01:00
Andy Chiu
b690f8df64 net: axienet: Use iowrite64 to write all 64b descriptor pointers
According to commit f735c40ed9 ("net: axienet: Autodetect 64-bit DMA
capability") and AXI-DMA spec (pg021), on 64-bit capable dma, only
writing MSB part of tail descriptor pointer causes DMA engine to start
fetching descriptors. However, we found that it is true only if dma is in
idle state. In other words, dma would use a tailp even if it only has LSB
updated, when the dma is running.

The non-atomicity of this behavior could be problematic if enough
delay were introduced in between the 2 writes. For example, if an
interrupt comes right after the LSB write and the cpu spends long
enough time in the handler for the dma to get back into idle state by
completing descriptors, then the seconcd write to MSB would treat dma
to start fetching descriptors again. Since the descriptor next to the
one pointed by current tail pointer is not filled by the kernel yet,
fetching a null descriptor here causes a dma internal error and halt
the dma engine down.

We suggest that the dma engine should start process a 64-bit MMIO write
to the descriptor pointer only if ONE 32-bit part of it is written on all
states. Or we should restrict the use of 64-bit addressable dma on 32-bit
platforms, since those devices have no instruction to guarantee the write
to LSB and MSB part of tail pointer occurs atomically to the dma.

initial condition:
curp =  x-3;
tailp = x-2;
LSB = x;
MSB = 0;

cpu:                       |dma:
 iowrite32(LSB, tailp)     |  completes #(x-3) desc, curp = x-3
 ...                       |  tailp updated
 => irq                    |  completes #(x-2) desc, curp = x-2
    ...                    |  completes #(x-1) desc, curp = x-1
    ...                    |  ...
    ...                    |  completes #x desc, curp = tailp = x
 <= irqreturn              |  reaches tailp == curp = x, idle
 iowrite32(MSB, tailp + 4) |  ...
                           |  tailp updated, starts fetching...
                           |  fetches #(x + 1) desc, sees cntrl = 0
                           |  post Tx error, halt

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reported-by: Max Hsu <max.hsu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:36:55 +01:00
Andy Chiu
00be43a74c net: axienet: make the 64b addresable DMA depends on 64b archectures
Currently it is not safe to config the IP as 64-bit addressable on 32-bit
archectures, which cannot perform a double-word store on its descriptor
pointers. The pointer is 64-bit wide if the IP is configured as 64-bit,
and the device would process the partially updated pointer on some
states if the pointer was updated via two store-words. To prevent such
condition, we force a probe fail if we discover that the IP has 64-bit
capability but it is not running on a 64-Bit kernel.

This is a series of patch (1/2). The next patch must be applied in order
to make 64b DMA safe on 64b archectures.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reported-by: Max Hsu <max.hsu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:36:55 +01:00
Dylan Yudaken
f9437ac0f8 io_uring: limit size of provided buffer ring
The type of head and tail do not allow more than 2^15 entries in a
provided buffer ring, so do not allow this.
At 2^16 while each entry can be indexed, there is no way to
disambiguate full vs empty.

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220613101157.3687-4-dylany@fb.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-13 05:13:33 -06:00
Dylan Yudaken
c6e9fa5c0a io_uring: fix types in provided buffer ring
The type of head needs to match that of tail in order for rollover and
comparisons to work correctly.

Without this change the comparison of tail to head might incorrectly allow
io_uring to use a buffer that userspace had not given it.

Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220613101157.3687-3-dylany@fb.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-13 05:13:31 -06:00
Dylan Yudaken
97da4a5379 io_uring: fix index calculation
When indexing into a provided buffer ring, do not subtract 1 from the
index.

Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220613101157.3687-2-dylany@fb.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-13 05:13:09 -06:00
David S. Miller
a5b00f5b78 Merge branch 'hns3-fixres'
Guangbin Huang says:

====================
net: hns3: add some fixes for -net

This series adds some fixes for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Guangbin Huang
12a3670887 net: hns3: fix tm port shapping of fibre port is incorrect after driver initialization
Currently in driver initialization process, driver will set shapping
parameters of tm port to default speed read from firmware. However, the
speed of SFP module may not be default speed, so shapping parameters of
tm port may be incorrect.

To fix this problem, driver sets new shapping parameters for tm port
after getting exact speed of SFP module in this case.

Fixes: 88d10bd6f7 ("net: hns3: add support for multiple media type")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Jie Wang
71b215f36d net: hns3: fix PF rss size initialization bug
Currently hns3 driver misuses the VF rss size to initialize the PF rss size
in hclge_tm_vport_tc_info_update. So this patch fix it by checking the
vport id before initialization.

Fixes: 7347255ea3 ("net: hns3: refactor PF rss get APIs with new common rss get APIs")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Guangbin Huang
e93530ae0e net: hns3: restore tm priority/qset to default settings when tc disabled
Currently, settings parameters of schedule mode, dwrr, shaper of tm
priority or qset of one tc are only be set when tc is enabled, they are
not restored to the default settings when tc is disabled. It confuses
users when they cat tm_priority or tm_qset files of debugfs. So this
patch fixes it.

Fixes: 848440544b ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Jie Wang
cfd80687a5 net: hns3: modify the ring param print info
Currently tx push is also a ring param. So the original ring param print
info in hns3_is_ringparam_changed should be adjusted.

Fixes: 07fdc163ac ("net: hns3: refactor hns3_set_ringparam()")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Jian Shen
283847e3ef net: hns3: don't push link state to VF if unalive
It's unnecessary to push link state to unalive VF, and the VF will
query link state from PF when it being start works.

Fixes: 18b6e31f8b ("net: hns3: PF add support for pushing link status to VFs")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Guangbin Huang
9eda7d8bcb net: hns3: set port base vlan tbl_sta to false before removing old vlan
When modify port base vlan, the port base vlan tbl_sta needs to set to
false before removing old vlan, to indicate this operation is not finish.

Fixes: c0f46de30c ("net: hns3: fix port base vlan add fail when concurrent with reset")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Jani Nikula
2636e00811 drm/i915/uc: remove accidental static from a local variable
The arrays are static const, but the pointer shouldn't be static.

Fixes: 3d832f370d ("drm/i915/uc: Allow platforms to have GuC but not HuC")
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220511094619.27889-1-jani.nikula@intel.com
(cherry picked from commit 5821a0bbb4)
2022-06-13 13:53:35 +03:00
Pavel Begunkov
fc9375e3f7 io_uring: fix double unlock for pbuf select
io_buffer_select(), which is the only caller of io_ring_buffer_select(),
fully handles locking, mutex unlock in io_ring_buffer_select() will lead
to double unlock.

Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2022-06-13 11:37:41 +01:00
Hao Xu
42db0c00e2 io_uring: kbuf: fix bug of not consuming ring buffer in partial io case
When we use ring-mapped provided buffer, we should consume it before
arm poll if partial io has been done. Otherwise the buffer may be used
by other requests and thus we lost the data.

Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Hao Xu <howeyxu@tencent.com>
[pavel: 5.19 rebase]
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2022-06-13 11:37:30 +01:00
Hao Xu
e71d7c56dd io_uring: openclose: fix bug of closing wrong fixed file
Don't update ret until fixed file is closed, otherwise the file slot
becomes the error code.

Fixes: a7c41b4687 ("io_uring: let IORING_OP_FILES_UPDATE support choosing fixed file slots")
Signed-off-by: Hao Xu <howeyxu@tencent.com>
[pavel: 5.19 rebase]
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2022-06-13 11:37:03 +01:00
Nirmoy Das
842d9346b2 drm/i915: Individualize fences before adding to dma_resv obj
_i915_vma_move_to_active() can receive > 1 fences for
multiple batch buffers submission. Because dma_resv_add_fence()
can only accept one fence at a time, change _i915_vma_move_to_active()
to be aware of multiple fences so that it can add individual
fences to the dma resv object.

v6: fix multi-line comment.
v5: remove double fence reservation for batch VMAs.
v4: Reserve fences for composite_fence on multi-batch contexts and
    also reserve fence slots to composite_fence for each VMAs.
v3: dma_resv_reserve_fences is not cumulative so pass num_fences.
v2: make sure to reserve enough fence slots before adding.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5614
Fixes: 544460c338 ("drm/i915: Multi-BB execbuf")
Cc: <stable@vger.kernel.org> # v5.16+
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220525095955.15371-1-nirmoy.das@intel.com
(cherry picked from commit 420a07b841)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-06-13 13:04:40 +03:00
Ashutosh Dixit
6e3f3c239e drm/i915/gt: Fix memory leaks in per-gt sysfs
All kmalloc'd kobjects need a kobject_put() to free memory. For example in
previous code, kobj_gt_release() never gets called. The requirement of
kobject_put() now results in a slightly different code organization.

v2: s/gtn/gt/ (Andi)

Fixes: b770bcfae9 ("drm/i915/gt: create per-tile sysfs interface")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a6f6686517c85fba61a0c45097f5bb4fe7e257fb.1653484574.git.ashutosh.dixit@intel.com
(cherry picked from commit 69d6bf5c37)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-06-13 13:04:31 +03:00
Alan Previn
c9b576d0c7 drm/i915/reset: Fix error_state_read ptr + offset use
Fix our pointer offset usage in error_state_read
when there is no i915_gpu_coredump but buf offset
is non-zero.

This fixes a kernel page fault can happen when
multiple tests are running concurrently in a loop
and one is producing engine resets and consuming
the i915 error_state dump while the other is
forcing full GT resets. (takes a while to trigger).

The dmesg call trace:

[ 5590.803000] BUG: unable to handle page fault for address:
               ffffffffa0b0e000
[ 5590.803009] #PF: supervisor read access in kernel mode
[ 5590.803013] #PF: error_code(0x0000) - not-present page
[ 5590.803016] PGD 5814067 P4D 5814067 PUD 5815063 PMD 109de4067
               PTE 0
[ 5590.803022] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 5590.803026] CPU: 5 PID: 13656 Comm: i915_hangman Tainted: G U
                    5.17.0-rc5-ups69-guc-err-capt-rev6+ #136
[ 5590.803033] Hardware name: Intel Corporation Alder Lake Client
                    Platform/AlderLake-M LP4x RVP, BIOS ADLPFWI1.R00.
                    3031.A02.2201171222	01/17/2022
[ 5590.803039] RIP: 0010:memcpy_erms+0x6/0x10
[ 5590.803045] Code: fe ff ff cc eb 1e 0f 1f 00 48 89 f8 48 89 d1
                     48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3
                     66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4
                     c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20
                     72 7e 40 38 fe
[ 5590.803054] RSP: 0018:ffffc90003a8fdf0 EFLAGS: 00010282
[ 5590.803057] RAX: ffff888107ee9000 RBX: ffff888108cb1a00
               RCX: 0000000000000f8f
[ 5590.803061] RDX: 0000000000001000 RSI: ffffffffa0b0e000
               RDI: ffff888107ee9071
[ 5590.803065] RBP: 0000000000000000 R08: 0000000000000001
               R09: 0000000000000001
[ 5590.803069] R10: 0000000000000001 R11: 0000000000000002
               R12: 0000000000000019
[ 5590.803073] R13: 0000000000174fff R14: 0000000000001000
               R15: ffff888107ee9000
[ 5590.803077] FS: 00007f62a99bee80(0000) GS:ffff88849f880000(0000)
               knlGS:0000000000000000
[ 5590.803082] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5590.803085] CR2: ffffffffa0b0e000 CR3: 000000010a1a8004
               CR4: 0000000000770ee0
[ 5590.803089] PKRU: 55555554
[ 5590.803091] Call Trace:
[ 5590.803093] <TASK>
[ 5590.803096] error_state_read+0xa1/0xd0 [i915]
[ 5590.803175] kernfs_fop_read_iter+0xb2/0x1b0
[ 5590.803180] new_sync_read+0x116/0x1a0
[ 5590.803185] vfs_read+0x114/0x1b0
[ 5590.803189] ksys_read+0x63/0xe0
[ 5590.803193] do_syscall_64+0x38/0xc0
[ 5590.803197] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 5590.803201] RIP: 0033:0x7f62aaea5912
[ 5590.803204] Code: c0 e9 b2 fe ff ff 50 48 8d 3d 5a b9 0c 00 e8 05
                     19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25
                     18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff
                     ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 5590.803213] RSP: 002b:00007fff5b659ae8 EFLAGS: 00000246
               ORIG_RAX: 0000000000000000
[ 5590.803218] RAX: ffffffffffffffda RBX: 0000000000100000
               RCX: 00007f62aaea5912
[ 5590.803221] RDX: 000000000008b000 RSI: 00007f62a8c4000f
               RDI: 0000000000000006
[ 5590.803225] RBP: 00007f62a8bcb00f R08: 0000000000200010
               R09: 0000000000101000
[ 5590.803229] R10: 0000000000000001 R11: 0000000000000246
               R12: 0000000000000006
[ 5590.803233] R13: 0000000000075000 R14: 00007f62a8acb010
               R15: 0000000000200000
[ 5590.803238] </TASK>
[ 5590.803240] Modules linked in: i915 ttm drm_buddy drm_dp_helper
                        drm_kms_helper syscopyarea sysfillrect sysimgblt
                        fb_sys_fops prime_numbers nfnetlink br_netfilter
                        overlay mei_pxp mei_hdcp x86_pkg_temp_thermal
                        coretemp kvm_intel snd_hda_codec_hdmi snd_hda_intel
                        snd_intel_dspcfg snd_hda_codec snd_hwdep
                        snd_hda_core snd_pcm mei_me mei fuse ip_tables
                        x_tables crct10dif_pclmul e1000e crc32_pclmul ptp
                        i2c_i801 ghash_clmulni_intel i2c_smbus pps_core
                        [last unloa ded: ttm]
[ 5590.803277] CR2: ffffffffa0b0e000
[ 5590.803280] ---[ end trace 0000000000000000 ]---

Fixes: 0e39037b31 ("drm/i915: Cache the error string")
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220311004311.514198-2-alan.previn.teres.alexis@intel.com
(cherry picked from commit 3304033a1e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-06-13 13:04:23 +03:00
Pavel Begunkov
05b538c176 io_uring: fix not locked access to fixed buf table
We can look inside the fixed buffer table only while holding
->uring_lock, however in some cases we don't do the right async prep for
IORING_OP_{WRITE,READ}_FIXED ending up with NULL req->imu forcing making
an io-wq worker to try to resolve the fixed buffer without proper
locking.

Move req->imu setup into early req init paths, i.e. io_prep_rw(), which
is called unconditionally for rw requests and under uring_lock.

Fixes: 634d00df5e ("io_uring: add full-fledged dynamic buffers support")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2022-06-13 09:53:41 +01:00
Pavel Begunkov
d11d31fc5d io_uring: fix races with buffer table unregister
Fixed buffer table quiesce might unlock ->uring_lock, potentially
letting new requests to be submitted, don't allow those requests to
use the table as they will race with unregistration.

Reported-and-tested-by: van fantasy <g1042620637@gmail.com>
Fixes: bd54b6fe33 ("io_uring: implement fixed buffers registration similar to fixed files")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2022-06-13 09:53:27 +01:00
Pavel Begunkov
b0380bf6da io_uring: fix races with file table unregister
Fixed file table quiesce might unlock ->uring_lock, potentially letting
new requests to be submitted, don't allow those requests to use the
table as they will race with unregistration.

Reported-and-tested-by: van fantasy <g1042620637@gmail.com>
Fixes: 05f3fb3c53 ("io_uring: avoid ring quiesce for fixed file set unregister and update")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2022-06-13 09:53:07 +01:00
Sebastian Andrzej Siewior
4051a81774 locking/lockdep: Use sched_clock() for random numbers
Since the rewrote of prandom_u32(), in the commit mentioned below, the
function uses sleeping locks which extracing random numbers and filling
the batch.
This breaks lockdep on PREEMPT_RT because lock_pin_lock() disables
interrupts while calling __lock_pin_lock(). This can't be moved earlier
because the main user of the function (rq_pin_lock()) invokes that
function after disabling interrupts in order to acquire the lock.

The cookie does not require random numbers as its goal is to provide a
random value in order to notice unexpected "unlock + lock" sites.

Use sched_clock() to provide random numbers.

Fixes: a0103f4d86f88 ("random32: use real rng for non-deterministic randomness")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/YoNn3pTkm5+QzE5k@linutronix.de
2022-06-13 10:29:57 +02:00
Peter Zijlstra
04193d590b sched: Fix balance_push() vs __sched_setscheduler()
The purpose of balance_push() is to act as a filter on task selection
in the case of CPU hotplug, specifically when taking the CPU out.

It does this by (ab)using the balance callback infrastructure, with
the express purpose of keeping all the unlikely/odd cases in a single
place.

In order to serve its purpose, the balance_push_callback needs to be
(exclusively) on the callback list at all times (noting that the
callback always places itself back on the list the moment it runs,
also noting that when the CPU goes down, regular balancing concerns
are moot, so ignoring them is fine).

And here-in lies the problem, __sched_setscheduler()'s use of
splice_balance_callbacks() takes the callbacks off the list across a
lock-break, making it possible for, an interleaving, __schedule() to
see an empty list and not get filtered.

Fixes: ae79270232 ("sched: Optimize finish_lock_switch()")
Reported-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
Link: https://lkml.kernel.org/r/20220519134706.GH2578@worktop.programming.kicks-ass.net
2022-06-13 10:15:07 +02:00
Josh Poimboeuf
e32683c6f7 x86/mm: Fix RESERVE_BRK() for older binutils
With binutils 2.26, RESERVE_BRK() causes a build failure:

  /tmp/ccnGOKZ5.s: Assembler messages:
  /tmp/ccnGOKZ5.s:98: Error: missing ')'
  /tmp/ccnGOKZ5.s:98: Error: missing ')'
  /tmp/ccnGOKZ5.s:98: Error: missing ')'
  /tmp/ccnGOKZ5.s:98: Error: junk at end of line, first unrecognized
  character is `U'

The problem is this line:

  RESERVE_BRK(early_pgt_alloc, INIT_PGT_BUF_SIZE)

Specifically, the INIT_PGT_BUF_SIZE macro which (via PAGE_SIZE's use
_AC()) has a "1UL", which makes older versions of the assembler unhappy.
Unfortunately the _AC() macro doesn't work for inline asm.

Inline asm was only needed here to convince the toolchain to add the
STT_NOBITS flag.  However, if a C variable is placed in a section whose
name is prefixed with ".bss", GCC and Clang automatically set
STT_NOBITS.  In fact, ".bss..page_aligned" already relies on this trick.

So fix the build failure (and simplify the macro) by allocating the
variable in C.

Also, add NOLOAD to the ".brk" output section clause in the linker
script.  This is a failsafe in case the ".bss" prefix magic trick ever
stops working somehow.  If there's a section type mismatch, the GNU
linker will force the ".brk" output section to be STT_NOBITS.  The LLVM
linker will fail with a "section type mismatch" error.

Note this also changes the name of the variable from .brk.##name to
__brk_##name.  The variable names aren't actually used anywhere, so it's
harmless.

Fixes: a1e2c031ec ("x86/mm: Simplify RESERVE_BRK()")
Reported-by: Joe Damato <jdamato@fastly.com>
Reported-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Joe Damato <jdamato@fastly.com>
Link: https://lore.kernel.org/r/22d07a44c80d8e8e1e82b9a806ddc8c6bbb2606e.1654759036.git.jpoimboe@kernel.org
2022-06-13 10:15:04 +02:00
Thomas Gleixner
6872fcac71 Merge tag 'irqchip-fixes-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
Pull irqchip/genirq fixes from Marc Zyngier:

 - Invoke runtime PM for chained interrupts, aligning the behaviour
   with that of 'normal' interrupts

 - A flurry of of_node refcounting fixes

 - A fix for the recently merged loongarch that broke UP MIPS

 - A configuration fix for the Xilinx interrupt controller

 - Yet another new compat string for the Uniphier interrupt controller

Link: https://lore.kernel.org/lkml/20220610083628.1205136-1-maz@kernel.org
2022-06-13 09:10:49 +02:00
Daniil Dementev
3ddbe35d9a ALSA: usb-audio: US16x08: Move overflow check before array access
Buffer overflow could occur in the loop "while", due to accessing an
array element before checking the index.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Daniil Dementev <d.dementev@ispras.ru>
Reviewed-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Link: https://lore.kernel.org/r/20220610165732.2904-1-d.dementev@ispras.ru
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-13 07:40:08 +02:00
Ludvig Pärsson
44dbdf3bb3 firmware: arm_scmi: Fix incorrect error propagation in scmi_voltage_descriptors_get
scmi_voltage_descriptors_get() will incorrecly return an error code if
the last iteration of the for loop that retrieves the descriptors is
skipped due to an error. Skipping an iteration in the loop is not an
error, but the `ret` value from the last iteration will be propagated
when the function returns.

Fix by not saving return values that should not be propagated. This
solution also minimizes the risk of future patches accidentally
re-introducing this bug.

Link: https://lore.kernel.org/r/20220610140055.31491-1-ludvig.parsson@axis.com
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Ludvig Pärsson <ludvig.parsson@axis.com>
[sudeep.holla: Removed unneeded reset_rx_to_maxsz and check for return
value from scmi_voltage_levels_get as suggested by Cristian]
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-06-12 19:59:55 +01:00
Conor Dooley
5e757deddd riscv: dts: microchip: re-add pdma to mpfs device tree
PolarFire SoC /does/ have a SiFive pdma, despite what I suggested as a
conflict resolution to Zong. Somehow the entry fell through the cracks
between versions of my dt patches, so re-add it with Zong's updated
compatible & dma-channels property.

Fixes: c5094f3710 ("riscv: dts: microchip: refactor icicle kit device tree")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2022-06-12 19:33:52 +01:00
Jason A. Donenfeld
abfed87e2a crypto: memneq - move into lib/
This is used by code that doesn't need CONFIG_CRYPTO, so move this into
lib/ with a Kconfig option so that it can be selected by whatever needs
it.

This fixes a linker error Zheng pointed out when
CRYPTO_MANAGER_DISABLE_TESTS!=y and CRYPTO=m:

  lib/crypto/curve25519-selftest.o: In function `curve25519_selftest':
  curve25519-selftest.c:(.init.text+0x60): undefined reference to `__crypto_memneq'
  curve25519-selftest.c:(.init.text+0xec): undefined reference to `__crypto_memneq'
  curve25519-selftest.c:(.init.text+0x114): undefined reference to `__crypto_memneq'
  curve25519-selftest.c:(.init.text+0x154): undefined reference to `__crypto_memneq'

Reported-by: Zheng Bin <zhengbin13@huawei.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: stable@vger.kernel.org
Fixes: aa127963f1 ("crypto: lib/curve25519 - re-add selftests")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-06-12 14:51:51 +08:00
Christophe JAILLET
06ee1c0aeb ksmbd: smbd: Remove useless license text when SPDX-License-Identifier is already used
An SPDX-License-Identifier is already in place. There is no need to
duplicate part of the corresponding license.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-11 11:18:26 -05:00
Namjae Jeon
fe0fde09e1 ksmbd: use SOCK_NONBLOCK type for kernel_accept()
I found that normally it is O_NONBLOCK but there are different value
for some arch.

/include/linux/net.h:
#ifndef SOCK_NONBLOCK
#define SOCK_NONBLOCK   O_NONBLOCK
#endif

/arch/alpha/include/asm/socket.h:
#define SOCK_NONBLOCK   0x40000000

Use SOCK_NONBLOCK instead of O_NONBLOCK for kernel_accept().

Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Namjae Jeon <linkinjeon@kerne.org>
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-11 11:18:26 -05:00
Jakub Kicinski
6f0e1efc88 Merge branch 'documentation-add-description-for-a-couple-of-sctp-sysctl-options'
Xin Long says:

====================
Documentation: add description for a couple of sctp sysctl options

These are a couple of sysctl options I recently added, but missed adding
documents for them. Especially for net.sctp.intl_enable, it's hard for
users to setup stream interleaving, as it also needs to call some socket
options.

This patchset is to add documents for them.
====================

Link: https://lore.kernel.org/r/cover.1654787716.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 22:18:23 -07:00
Xin Long
249eddaf65 Documentation: add description for net.sctp.ecn_enable
Describe it in networking/ip-sysctl.rst like other SCTP options.

Fixes: 2f5268a924 ("sctp: allow users to set netns ecn flag with sysctl")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 22:18:20 -07:00
Xin Long
e65775fdd3 Documentation: add description for net.sctp.intl_enable
Describe it in networking/ip-sysctl.rst like other SCTP options.
We need to document this especially as when using the feature
of User Message Interleaving, some socket options also needs
to be set.

Fixes: 463118c34a ("sctp: support sysctl to allow users to use stream interleave")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 22:18:20 -07:00
Xin Long
c349ae5f83 Documentation: add description for net.sctp.reconf_enable
Describe it in networking/ip-sysctl.rst like other SCTP options.

Fixes: c0d8bab6ae ("sctp: add get and set sockopt for reconf_enable")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 22:18:20 -07:00
Jakub Kicinski
145684d9f9 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-06-09

Grzegorz prevents addition of TC flower filters to TC0 and fixes queue
iteration for VF ADQ to number of actual queues for i40e.

Aleksandr prevents running of ethtool tests when device is being reset
for i40e.

Michal resolves an issue where iavf does not report its MAC address
properly.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  iavf: Fix issue with MAC address of VF shown as zero
  i40e: Fix call trace in setup_tx_descriptors
  i40e: Fix calculating the number of queue pairs
  i40e: Fix adding ADQ filter to TC0
====================

Link: https://lore.kernel.org/r/20220609162620.2619258-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 22:10:30 -07:00
Cristian Marussi
4314f9f4f8 firmware: arm_scmi: Avoid using extended string-buffers sizes if not necessary
Commit b260fccaeb ("firmware: arm_scmi: Add SCMI v3.1 protocol extended
names support") moved all the name string buffers to use the extended buffer
size of 64 instead of the required 16 bytes. While that should be fine if
the firmware terminates the string before 16 bytes, there is possibility
of copying random data if the name is not NULL terminated by the firmware.

SCMI base protocol agent_name/vendor_id/sub_vendor_id are defined by the
specification as NULL-terminated ASCII strings up to 16-bytes in length.

The underlying buffers and message descriptors are currently bigger than
needed; resize them to fit only the strictly needed 16 bytes to avoid
any possible leaks when reading data from the firmware.

Change the size argument of strlcpy to use SCMI_SHORT_NAME_MAX_SIZE always
when dealing with short domain names, so as to limit the possibility that
an ill-formed non-NULL terminated short reply from the SCMI platform
firmware can leak stale content laying in the underlying transport shared
memory area.

While at that, convert all strings handling routines to use the preferred
strscpy.

Link: https://lore.kernel.org/r/20220608095530.497879-1-cristian.marussi@arm.com
Fixes: b260fccaeb ("firmware: arm_scmi: Add SCMI v3.1 protocol extended names support")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-06-10 17:55:29 +01:00
Cristian Marussi
8e60294c80 firmware: arm_scmi: Fix SENSOR_AXIS_NAME_GET behaviour when unsupported
Avoid to invoke SENSOR_AXIS_NAME_GET on sensors that have not declared at
least one of their axes as supporting extended names.

Since the returned list of axes supporting extended names is not
necessarily comprising all the existing axes of the specified sensor,
take care also to properly pick the axis descriptor from the ID embedded
in the response.

Link: https://lore.kernel.org/r/20220608164051.2326087-1-cristian.marussi@arm.com
Fixes: 802b0bed01 ("firmware: arm_scmi: Add SCMI v3.1 SENSOR_AXIS_NAME_GET support")
Cc: Peter Hilber <peter.hilber@opensynergy.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Peter Hilber <peter.hilber@opensynergy.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-06-10 17:50:29 +01:00
Damien Le Moal
566d3c57eb scsi: scsi_debug: Fix zone transition to full condition
When a write command to a sequential write required or sequential write
preferred zone result in the zone write pointer reaching the end of the
zone, the zone condition must be set to full AND the number of implicitly
or explicitly open zones updated to have a correct accounting for zone
resources. However, the function zbc_inc_wp() only sets the zone condition
to full without updating the open zone counters, resulting in a zone state
machine breakage.

Introduce the helper function zbc_set_zone_full() and use it in
zbc_inc_wp() to correctly transition zones to the full condition.

Link: https://lore.kernel.org/r/20220608011302.92061-1-damien.lemoal@opensource.wdc.com
Fixes: f0d1cf9378 ("scsi: scsi_debug: Add ZBC zone commands")
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-06-10 12:50:05 -04:00
Brad Bishop
0a35780c75 eeprom: at25: Split reads into chunks and cap write size
Make use of spi_max_transfer_size to avoid requesting transfers that are
too large for some spi controllers.

Signed-off-by: Brad Bishop <bradleyb@fuzziesquirrel.com>
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20220524215142.60047-1-eajames@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 16:42:48 +02:00
Shin'ichiro Kawasaki
928ea98252 bus: fsl-mc-bus: fix KASAN use-after-free in fsl_mc_bus_remove()
In fsl_mc_bus_remove(), mc->root_mc_bus_dev->mc_io is passed to
fsl_destroy_mc_io(). However, mc->root_mc_bus_dev is already freed in
fsl_mc_device_remove(). Then reference to mc->root_mc_bus_dev->mc_io
triggers KASAN use-after-free. To avoid the use-after-free, keep the
reference to mc->root_mc_bus_dev->mc_io in a local variable and pass to
fsl_destroy_mc_io().

This patch needs rework to apply to kernels older than v5.15.

Fixes: f93627146f ("staging: fsl-mc: fix asymmetry in destroy of mc_io")
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20220601105159.87752-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 15:53:12 +02:00
Alexander Usyskin
3ed8c7d39c mei: me: add raptor lake point S DID
Add Raptor (Point) Lake S device id.

Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20220606144225.282375-3-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 15:39:24 +02:00
Alexander Usyskin
68553650bc mei: hbm: drop capability response on early shutdown
Drop HBM responses also in the early shutdown phase where
the usual traffic is allowed.
Extend the rule that drop HBM responses received during the shutdown
phase by also in MEI_DEV_POWERING_DOWN state.
This resolves the stall if the driver is stopping in the middle
of the link initialization or link reset.

Drop the capabilities response on early shutdown.

Fixes: 6d7163f2c4 ("mei: hbm: drop hbm responses on early shutdown")
Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20220606144225.282375-2-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 15:39:24 +02:00
Alexander Usyskin
9f4639373e mei: me: set internal pg flag to off on hardware reset
Link reset flow is always performed in the runtime resumed state.
The internal PG state may be left as ON after the suspend
and will not be updated upon the resume if the D0i3 is not supported.

Ensure that the internal PG state is set to the right value on the flow
entrance in case the firmware does not support D0i3.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20220606144225.282375-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 15:39:24 +02:00
Peter Robinson
cd756dafd8 staging: Also remove the Unisys visorbus.h
The commit that removed the Unisys s-Par and visorbus drivers
left around the include/linux/visorbus.h file mentioned in the
MAINTAINERS entry, we can also remove that too.

Fixes: e5f45b011e ("staging: Remove the drivers for the Unisys s-Par")
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Link: https://lore.kernel.org/r/20220606132200.2873243-1-pbrobinson@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 15:36:49 +02:00
Miaoqian Lin
1c245358ce misc: atmel-ssc: Fix IRQ check in ssc_probe
platform_get_irq() returns negative error number instead 0 on failure.
And the doc of platform_get_irq() provides a usage example:

    int irq = platform_get_irq(pdev, 0);
    if (irq < 0)
        return irq;

Fix the check of return value to catch errors correctly.

Fixes: eb1f293060 ("Driver for the Atmel on-chip SSC on AT32AP and AT91")
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220601123026.7119-1-linmq006@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 15:29:56 +02:00
Shreenidhi Shedi
6497e77764 char: lp: remove redundant initialization of err
err is getting assigned with an appropriate value before returning,
hence this initialization is unnecessary.

Signed-off-by: Shreenidhi Shedi <sshedi@vmware.com>
Link: https://lore.kernel.org/r/20220603130040.601673-2-sshedi@vmware.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 15:29:50 +02:00
Nathan Chancellor
bd476c1306 misc: rtsx: Fix clang -Wsometimes-uninitialized in rts5261_init_from_hw()
Clang warns:

  drivers/misc/cardreader/rts5261.c:406:13: error: variable 'setting_reg2' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
          } else if (efuse_valid == 0) {
                     ^~~~~~~~~~~~~~~~
  drivers/misc/cardreader/rts5261.c:412:30: note: uninitialized use occurs here
          pci_read_config_dword(pdev, setting_reg2, &lval2);
                                      ^~~~~~~~~~~~

efuse_valid == 1 is not a valid value so just return early from the
function to avoid using setting_reg2 uninitialized.

Fixes: b1c5f30851 ("misc: rtsx: add rts5261 efuse function")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Tom Rix <trix@redhat.com>
Suggested-by: Ricky WU <ricky_wu@realtek.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20220523150521.2947108-1-nathan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 15:29:17 +02:00
Ian Abbott
242439f7e2 comedi: vmk80xx: fix expression for tx buffer size
The expression for setting the size of the allocated bulk TX buffer
(`devpriv->usb_tx_buf`) is calling `usb_endpoint_maxp(devpriv->ep_rx)`,
which is using the wrong endpoint (should be `devpriv->ep_tx`).  Fix it.

Fixes: a23461c474 ("comedi: vmk80xx: fix transfer-buffer overflow")
Cc: Johan Hovold <johan@kernel.org>
Cc: stable@vger.kernel.org # 4.9+
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20220607171819.4121-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 15:21:23 +02:00
Linyu Yuan
0698f0209d usb: gadget: f_fs: change ep->ep safe in ffs_epfile_io()
In ffs_epfile_io(), when read/write data in blocking mode, it will wait
the completion in interruptible mode, if task receive a signal, it will
terminate the wait, at same time, if function unbind occurs,
ffs_func_unbind() will kfree all eps, ffs_epfile_io() still try to
dequeue request by dereferencing ep which may become invalid.

Fix it by add ep spinlock and will not dereference ep if it is not valid.

Cc: <stable@vger.kernel.org> # 5.15
Reported-by: Michael Wu <michael@allwinnertech.com>
Tested-by: Michael Wu <michael@allwinnertech.com>
Reviewed-by: John Keeping <john@metanate.com>
Signed-off-by: Linyu Yuan <quic_linyyuan@quicinc.com>
Link: https://lore.kernel.org/r/1654863478-26228-3-git-send-email-quic_linyyuan@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 14:45:38 +02:00
Linyu Yuan
fb1f16d74e usb: gadget: f_fs: change ep->status safe in ffs_epfile_io()
If a task read/write data in blocking mode, it will wait the completion
in ffs_epfile_io(), if function unbind occurs, ffs_func_unbind() will
kfree ffs ep, once the task wake up, it still dereference the ffs ep to
obtain the request status.

Fix it by moving the request status to io_data which is stack-safe.

Cc: <stable@vger.kernel.org> # 5.15
Reported-by: Michael Wu <michael@allwinnertech.com>
Tested-by: Michael Wu <michael@allwinnertech.com>
Reviewed-by: John Keeping <john@metanate.com>
Signed-off-by: Linyu Yuan <quic_linyyuan@quicinc.com>
Link: https://lore.kernel.org/r/1654863478-26228-2-git-send-email-quic_linyyuan@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 14:45:38 +02:00
Mathias Nyman
802dcafc42 xhci: Fix null pointer dereference in resume if xhci has only one roothub
In the re-init path xhci_resume() passes 'hcd->primary_hcd' to hci_init(),
however this field isn't initialized by __usb_create_hcd() for a HCD
without secondary controller.

xhci_resume() is called once per xHC device, not per hcd, so the extra
checking for primary hcd can be removed.

Fixes: e0fe986972 ("usb: host: xhci-plat: prepare operation w/o shared hcd")
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20220610115338.863152-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 13:57:20 +02:00
Ilpo Järvinen
be03b0651f serial: 8250: Store to lsr_save_flags after lsr read
Not all LSR register flags are preserved across reads. Therefore, LSR
readers must store the non-preserved bits into lsr_save_flags.

This fix was initially mixed into feature commit f6f586102a ("serial:
8250: Handle UART without interrupt on TEMT using em485"). However,
that feature change had a flaw and it was reverted to make room for
simpler approach providing the same feature. The embedded fix got
reverted with the feature change.

Re-add the lsr_save_flags fix and properly mark it's a fix.

Link: https://lore.kernel.org/all/1d6c31d-d194-9e6a-ddf9-5f29af829f3@linux.intel.com/T/#m1737eef986bd20cf19593e344cebd7b0244945fc
Fixes: e490c9144c ("tty: Add software emulated RS485 support for 8250")
Cc: stable <stable@kernel.org>
Acked-by: Uwe Kleine-König <u.kleine-koenig@penugtronix.de>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/f4d774be-1437-a550-8334-19d8722ab98c@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 13:52:19 +02:00
Vincent Whitchurch
499e13aac6 tty: goldfish: Fix free_irq() on remove
Pass the correct dev_id to free_irq() to fix this splat when the driver
is unbound:

 WARNING: CPU: 0 PID: 30 at kernel/irq/manage.c:1895 free_irq
 Trying to free already-free IRQ 65
 Call Trace:
  warn_slowpath_fmt
  free_irq
  goldfish_tty_remove
  platform_remove
  device_remove
  device_release_driver_internal
  device_driver_detach
  unbind_store
  drv_attr_store
  ...

Fixes: 465893e188 ("tty: goldfish: support platform_device with id -1")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/r/20220609141704.1080024-1-vincent.whitchurch@axis.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 13:31:31 +02:00
Vijaya Krishna Nivarthi
654a8d6c93 tty: serial: qcom-geni-serial: Implement start_rx callback
In suspend sequence stop_rx will be performed only if implementation for
start_rx callback is present.

Set qcom_geni_serial_start_rx as callback for start_rx so that stop_rx is
performed.

Fixes: c9d2325cdb ("serial: core: Do stop_rx in suspend path for console if console_suspend is disabled")
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com>
Link: https://lore.kernel.org/r/1654627965-1461-3-git-send-email-quic_vnivarth@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 13:30:41 +02:00
Vijaya Krishna Nivarthi
cfab87c2c2 serial: core: Introduce callback for start_rx and do stop_rx in suspend only if this callback implementation is present.
In suspend sequence there is a need to perform stop_rx during suspend
sequence to prevent any asynchronous data over rx line. However this
can cause problem to drivers which dont do re-start_rx during set_termios.

Add new callback start_rx and perform stop_rx only when implementation of
start_rx is present. Also add call to start_rx in resume sequence so that
drivers who come across this problem can make use of this framework.

Fixes: c9d2325cdb ("serial: core: Do stop_rx in suspend path for console if console_suspend is disabled")
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com>
Link: https://lore.kernel.org/r/1654627965-1461-2-git-send-email-quic_vnivarth@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 13:30:41 +02:00
Tony Lindgren
e74024b2ec tty: n_gsm: Debug output allocation must use GFP_ATOMIC
Dan Carpenter <dan.carpenter@oracle.com> reported the following Smatch
warning:

drivers/tty/n_gsm.c:720 gsm_data_kick()
warn: sleeping in atomic context

This is because gsm_control_message() is holding a spin lock so
gsm_hex_dump_bytes() needs to use GFP_ATOMIC instead of GFP_KERNEL.

Fixes: 925ea0fa52 ("tty: n_gsm: Fix packet data hex dump output")
Cc: stable <stable@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20220523155052.57129-1-tony@atomide.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 13:30:11 +02:00
Christian König
81b0d0e4f8 drm/ttm: fix missing NULL check in ttm_device_swapout
Resources about to be destructed are not tied to BOs any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Fixes: 6a9b028994 ("drm/ttm: move the LRU into resource handling v4")
Link: https://patchwork.freedesktop.org/patch/msgid/20220603104604.456991-1-christian.koenig@amd.com
2022-06-10 13:20:21 +02:00
Stephen Rothwell
8bd6b8c4b1 USB: fixup for merge issue with "usb: dwc3: Don't switch OTG -> peripheral if extcon is present"
Today's linux-next merge of the extcon tree got a conflict in:

  drivers/usb/dwc3/drd.c

between commit:

  0f01017191 ("usb: dwc3: Don't switch OTG -> peripheral if extcon is present")

from the usb tree and commit:

  88490c7f43c4 ("extcon: Fix extcon_get_extcon_dev() error handling")

from the extcon tree.

I fixed it up (the former moved the code modified by the latter, so I
used the former version of this files and added the following merge fix
patch) and can carry the fix as necessary.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/r/20220426152739.62f6836e@canb.auug.org.au
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 11:19:42 +02:00
Jing Leng
5c7578c39c usb: cdnsp: Fixed setting last_trb incorrectly
When ZLP occurs in bulk transmission, currently cdnsp will set last_trb
for the last two TRBs, it will trigger an error "ERROR Transfer event TRB
DMA ptr not part of current TD ...".

Fixes: e913aada06 ("usb: cdnsp: Fixed issue with ZLP")
Cc: stable <stable@kernel.org>
Acked-by: Pawel Laszczak <pawell@cadence.com>
Signed-off-by: Jing Leng <jleng@ambarella.com>
Link: https://lore.kernel.org/r/20220609021134.1606-1-3090101217@zju.edu.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 11:15:23 +02:00
Marian Postevca
b337af3a4d usb: gadget: u_ether: fix regression in setting fixed MAC address
In systemd systems setting a fixed MAC address through
the "dev_addr" module argument fails systematically.
When checking the MAC address after the interface is created
it always has the same but different MAC address to the one
supplied as argument.

This is partially caused by systemd which by default will
set an internally generated permanent MAC address for interfaces
that are marked as having a randomly generated address.

Commit 890d5b4090 ("usb: gadget: u_ether: fix race in
setting MAC address in setup phase") didn't take into account
the fact that the interface must be marked as having a set
MAC address when it's set as module argument.

Fixed by marking the interface with NET_ADDR_SET when
the "dev_addr" module argument is supplied.

Fixes: 890d5b4090 ("usb: gadget: u_ether: fix race in setting MAC address in setup phase")
Cc: stable@vger.kernel.org
Signed-off-by: Marian Postevca <posteuca@mutex.one>
Link: https://lore.kernel.org/r/20220603153459.32722-1-posteuca@mutex.one
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 11:12:53 +02:00
Miaoqian Lin
4757c9ade3 usb: gadget: lpc32xx_udc: Fix refcount leak in lpc32xx_udc_probe
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
of_node_put() will check NULL pointer.

Fixes: 24a28e4283 ("USB: gadget driver for LPC32xx")
Cc: stable <stable@kernel.org>
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220603140246.64529-1-linmq006@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 11:12:39 +02:00
Miaoqian Lin
3755278f07 usb: dwc2: Fix memory leak in dwc2_hcd_init
usb_create_hcd will alloc memory for hcd, and we should
call usb_put_hcd to free it when platform_get_resource()
fails to prevent memory leak.
goto error2 label instead error1 to fix this.

Fixes: 856e6e8e0f ("usb: dwc2: check return value after calling platform_get_resource()")
Cc: stable <stable@kernel.org>
Acked-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220530085413.44068-1-linmq006@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 11:12:20 +02:00
Stephan Gerhold
7ddda2614d usb: dwc3: pci: Restore line lost in merge conflict resolution
Commit 582ab24e09 ("usb: dwc3: pci: Set "linux,phy_charger_detect"
property on some Bay Trail boards") added a new swnode similar to the
existing ones for boards where the PHY handles charger detection.

Unfortunately, the "linux,sysdev_is_parent" property got lost in the
merge conflict resolution of commit ca9400ef7f ("Merge 5.17-rc6 into
usb-next"). Now dwc3_pci_intel_phy_charger_detect_properties is the
only swnode in dwc3-pci that is missing "linux,sysdev_is_parent".

It does not seem to cause any obvious functional issues, but it's
certainly unintended so restore the line to make the properties
consistent again.

Fixes: ca9400ef7f ("Merge 5.17-rc6 into usb-next")
Cc: stable@vger.kernel.org
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Link: https://lore.kernel.org/r/20220528170913.9240-1-stephan@gerhold.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 11:11:01 +02:00
Wesley Cheng
9c1e916960 usb: dwc3: gadget: Fix IN endpoint max packet size allocation
The current logic to assign the max packet limit for IN endpoints attempts
to take the default HW value and apply the optimal endpoint settings based
on it.  However, if the default value reports a TxFIFO size large enough
for only one max packet, it will divide the value and assign a smaller ep
max packet limit.

For example, if the default TxFIFO size fits 1024B, current logic will
assign 1024/3 = 341B to ep max packet size.  If function drivers attempt to
request for an endpoint with a wMaxPacketSize of 1024B (SS BULK max packet
size) then it will fail, as the gadget is unable to find an endpoint which
can fit the requested size.

Functionally, if the TxFIFO has enough space to fit one max packet, it will
be sufficient, at least when initializing the endpoints.

Fixes: d94ea53198 ("usb: dwc3: gadget: Properly set maxpacket limit")
Cc: stable <stable@kernel.org>
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
Link: https://lore.kernel.org/r/20220523213948.22142-1-quic_wcheng@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 11:10:43 +02:00
Saurabh Sengar
656c5ba50b Drivers: hv: vmbus: Release cpu lock in error case
In case of invalid sub channel, release cpu lock before returning.

Fixes: a949e86c0d ("Drivers: hv: vmbus: Resolve race between init_vp_index() and CPU hotplug")
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1654794996-13244-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-06-10 08:41:28 +00:00
Alexander Stein
552ca27929 ARM: dts: imx7: Move hsic_phy power domain to HSIC PHY node
Move the power domain to its actual user. This keeps the power domain
enabled even when the USB host is runtime suspended. This is necessary
to detect any downstream events, like device attach.

Fixes: 02f8eb40ef ("ARM: dts: imx7s: Add power domain for imx7d HSIC")
Suggested-by: Jun Li <jun.li@nxp.com>
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2022-06-10 16:29:13 +08:00
Greg Kroah-Hartman
1d9e615f1a Merge tag 'usb-serial-5.19-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:

USB-serial fixes for 5.19-rc2

Here are some new device ids for a modem and an Edgeport device.

All have been in linux-next with no reported issues.

* tag 'usb-serial-5.19-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: option: add support for Cinterion MV31 with new baseline
  USB: serial: io_ti: add Agilent E5805A support
2022-06-10 10:27:30 +02:00
Soham Sen
b2e6b3d9bb ALSA: hda/realtek: Add mute LED quirk for HP Omen laptop
The HP Omen 15 laptop needs a quirk to toggle the mute LED. It already is implemented for a different variant of the HP Omen laptop so a fixup entry is needed for this variant.

Signed-off-by: Soham Sen <contact@sohamsen.me>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220609181919.45535-1-contact@sohamsen.me
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-10 09:59:20 +02:00
Jiaxun Yang
6fac824f40 irqchip/loongson-liointc: Use architecture register to get coreid
fa84f89395 ("irqchip/loongson-liointc: Fix build error for
LoongArch") replaced get_ebase_cpunum with physical processor
id from SMP facilities. However that breaks MIPS non-SMP build
and makes booting from other cores inpossible on non-SMP kernel.

Thus we revert get_ebase_cpunum back and use get_csr_cpuid for
LoongArch.

Fixes: fa84f89395 ("irqchip/loongson-liointc: Fix build error for LoongArch")
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609175242.977-1-jiaxun.yang@flygoat.com
2022-06-10 08:57:19 +01:00
Kees Cook
67ea0a2adb staging: rtl8723bs: Allocate full pwep structure
The pwep allocation was always being allocated smaller than the true
structure size. Avoid this by always allocating the full structure.
Found with GCC 12 and -Warray-bounds:

../drivers/staging/rtl8723bs/os_dep/ioctl_linux.c: In function 'rtw_set_encryption':
../drivers/staging/rtl8723bs/os_dep/ioctl_linux.c:591:29: warning: array subscript 'struct ndis_802_11_wep[0]' is partly outside array bounds of 'void[25]' [-Warray-bounds]
  591 |                         pwep->length = wep_total_len;
      |                             ^~

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Fabio Aiuto <fabioaiuto83@gmail.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: linux-staging@lists.linux.dev
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220608215512.1070847-1-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 09:10:16 +02:00
Javier Martinez Canillas
de0952f267 staging: olpc_dcon: mark driver as broken
The commit eecb3e4e5d ("staging: olpc_dcon: add OLPC display controller
(DCON) support") added this driver in 2010, and has been in staging since
then. It was marked as broken at some point because it didn't even build
but that got removed once the build issues were addressed.

But it seems that the work to move this driver out of staging has stalled,
the last non-trivial change to fix one of the items mentioned in its todo
file was commit e40219d5e4 ("staging: olpc_dcon: allow simultaneous XO-1
and XO-1.5 support") in 2019.

And even if work to destage the driver is resumed, the fbdev subsystem has
been deprecated for a long time and instead it should be ported to DRM.

Now this driver is preventing to land a kernel wide change, that makes the
num_registered_fb symbol to be private to the fbmem.c file.

So let's just mark the driver as broken. Someone can then work on making
it not depend on the num_registered_fb symbol, allowing to drop the broken
dependency again.

Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://lore.kernel.org/r/20220609223424.907174-1-javierm@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 09:09:47 +02:00
Wei Yongjun
a1ea0857b5 clk: stm32: rcc_reset: Fix missing spin_lock_init()
The driver allocates the spinlock but not initialize it.
Use spin_lock_init() on it to initialize it correctly.

Fixes: 637cee5ffc ("clk: stm32: Introduce STM32MP13 RCC drivers (Reset Clock Controller)")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20220608021154.990347-1-weiyongjun1@huawei.com
Tested-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Reviewed-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-06-09 15:34:08 -07:00
Kunihiko Hayashi
e3f056a7aa irqchip/uniphier-aidet: Add compatible string for NX1 SoC
Add the compatible string to support UniPhier NX1 SoC, which has the same
kinds of controls as the other UniPhier SoCs.

Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1653023822-19229-3-git-send-email-hayashi.kunihiko@socionext.com
2022-06-09 17:41:57 +01:00
Kunihiko Hayashi
df089e6f07 dt-bindings: interrupt-controller/uniphier-aidet: Add bindings for NX1 SoC
Update uniphier-aidet binding document for UniPhier NX1 SoC.

Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1653023822-19229-2-git-send-email-hayashi.kunihiko@socionext.com
2022-06-09 17:41:57 +01:00
Miaoqian Lin
eff4780f83 irqchip/realtek-rtl: Fix refcount leak in map_interrupts
of_find_node_by_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
This function doesn't call of_node_put() in error path.
Call of_node_put() directly after of_property_read_u32() to cover
both normal path and error path.

Fixes: 9f3a0f34b8 ("irqchip: Add support for Realtek RTL838x/RTL839x interrupt controller")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220601080930.31005-7-linmq006@gmail.com
2022-06-09 17:36:57 +01:00
Miaoqian Lin
fa1ad9d4cc irqchip/gic-v3: Fix refcount leak in gic_populate_ppi_partitions
of_find_node_by_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: e3825ba1af ("irqchip/gic-v3: Add support for partitioned PPIs")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220601080930.31005-6-linmq006@gmail.com
2022-06-09 17:36:57 +01:00
Miaoqian Lin
ec8401a429 irqchip/gic-v3: Fix error handling in gic_populate_ppi_partitions
of_get_child_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
When kcalloc fails, it missing of_node_put() and results in refcount
leak. Fix this by goto out_put_node label.

Fixes: 52085d3f20 ("irqchip/gic-v3: Dynamically allocate PPI partition descriptors")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220601080930.31005-5-linmq006@gmail.com
2022-06-09 17:36:57 +01:00
Miaoqian Lin
3d45670fab irqchip/apple-aic: Fix refcount leak in aic_of_ic_init
of_get_child_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: a5e8801202 ("irqchip/apple-aic: Parse FIQ affinities from device-tree")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220601080930.31005-4-linmq006@gmail.com
2022-06-09 17:36:57 +01:00
Miaoqian Lin
b1ac803f47 irqchip/apple-aic: Fix refcount leak in build_fiq_affinity
of_find_node_by_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: a5e8801202 ("irqchip/apple-aic: Parse FIQ affinities from device-tree")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220601080930.31005-3-linmq006@gmail.com
2022-06-09 17:36:57 +01:00
Miaoqian Lin
f4b98e3148 irqchip/gic/realview: Fix refcount leak in realview_gic_of_init
of_find_matching_node_and_match() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: 82b0a434b4 ("irqchip/gic/realview: Support more RealView DCC variants")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220601080930.31005-2-linmq006@gmail.com
2022-06-09 17:36:57 +01:00
Jamie Iles
b84dc7f0e3 irqchip/xilinx: Remove microblaze+zynq dependency
The Xilinx IRQ controller doesn't really have any architecture
dependencies - it's a generic AXI component that can be used for any
FPGA core from Zynq hard processor systems to microblaze+riscv soft
cores and more.

Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220606213952.298686-1-jamie@jamieiles.com
2022-06-09 17:34:56 +01:00
Michal Wilczynski
6456038442 iavf: Fix issue with MAC address of VF shown as zero
After reinitialization of iavf, ice driver gets VIRTCHNL_OP_ADD_ETH_ADDR
message with incorrectly set type of MAC address. Hardware address should
have is_primary flag set as true. This way ice driver knows what it has
to set as a MAC address.

Check if the address is primary in iavf_add_filter function and set flag
accordingly.

To test set all-zero MAC on a VF. This triggers iavf re-initialization
and VIRTCHNL_OP_ADD_ETH_ADDR message gets sent to PF.
For example:

ip link set dev ens785 vf 0 mac 00:00:00:00:00:00

This triggers re-initialization of iavf. New MAC should be assigned.
Now check if MAC is non-zero:

ip link show dev ens785

Fixes: a3e839d539 ("iavf: Add usage of new virtchnl format to set default MAC")
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09 08:58:15 -07:00
Aleksandr Loktionov
fd5855e6b1 i40e: Fix call trace in setup_tx_descriptors
After PF reset and ethtool -t there was call trace in dmesg
sometimes leading to panic. When there was some time, around 5
seconds, between reset and test there were no errors.

Problem was that pf reset calls i40e_vsi_close in prep_for_reset
and ethtool -t calls i40e_vsi_close in diag_test. If there was not
enough time between those commands the second i40e_vsi_close starts
before previous i40e_vsi_close was done which leads to crash.

Add check to diag_test if pf is in reset and don't start offline
tests if it is true.
Add netif_info("testing failed") into unhappy path of i40e_diag_test()

Fixes: e17bc411ae ("i40e: Disable offline diagnostics if VFs are enabled")
Fixes: 510efb2682 ("i40e: Fix ethtool offline diagnostic with netqueues")
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09 08:54:19 -07:00
Grzegorz Szczurek
0bb050670a i40e: Fix calculating the number of queue pairs
If ADQ is enabled for a VF, then actual number of queue pair
is a number of currently available traffic classes for this VF.

Without this change the configuration of the Rx/Tx queues
fails with error.

Fixes: d29e0d233e ("i40e: missing input validation on VF message handling by the PF")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09 08:54:03 -07:00
Grzegorz Szczurek
c3238d36c3 i40e: Fix adding ADQ filter to TC0
Procedure of configure tc flower filters erroneously allows to create
filters on TC0 where unfiltered packets are also directed by default.
Issue was caused by insufficient checks of hw_tc parameter specifying
the hardware traffic class to pass matching packets to.

Fix checking hw_tc parameter which blocks creation of filters on TC0.

Fixes: 2f4b411a3d ("i40e: Enable cloud filters via tc-flower")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09 08:53:43 -07:00
Marc Zyngier
668a9fe5c6 genirq: PM: Use runtime PM for chained interrupts
When requesting an interrupt, we correctly call into the runtime
PM framework to guarantee that the underlying interrupt controller
is up and running.

However, we fail to do so for chained interrupt controllers, as
the mux interrupt is not requested along the same path.

Augment __irq_do_set_handler() to call into the runtime PM code
in this case, making sure the PM flow is the same for all interrupts.

Reported-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Liu Ying <victor.liu@nxp.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/26973cddee5f527ea17184c0f3fccb70bc8969a0.camel@pengutronix.de
2022-06-09 15:58:13 +01:00
David Matlack
e0f3f46e42 KVM: selftests: Restrict test region to 48-bit physical addresses when using nested
The selftests nested code only supports 4-level paging at the moment.
This means it cannot map nested guest physical addresses with more than
48 bits. Allow perf_test_util nested mode to work on hosts with more
than 48 physical addresses by restricting the guest test region to
48-bits.

While here, opportunistically fix an off-by-one error when dealing with
vm_get_max_gfn(). perf_test_util.c was treating this as the maximum
number of GFNs, rather than the maximum allowed GFN. This didn't result
in any correctness issues, but it did end up shifting the test region
down slightly when using huge pages.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-12-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:27 -04:00
David Matlack
71d4896619 KVM: selftests: Add option to run dirty_log_perf_test vCPUs in L2
Add an option to dirty_log_perf_test that configures the vCPUs to run in
L2 instead of L1. This makes it possible to benchmark the dirty logging
performance of nested virtualization, which is particularly interesting
because KVM must shadow L1's EPT/NPT tables.

For now this support only works on x86_64 CPUs with VMX. Otherwise
passing -n results in the test being skipped.

Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-11-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:27 -04:00
David Matlack
cf97d5e99f KVM: selftests: Clean up LIBKVM files in Makefile
Break up the long lines for LIBKVM and alphabetize each architecture.
This makes reading the Makefile easier, and will make reading diffs to
LIBKVM easier.

No functional change intended.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-10-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:26 -04:00
David Matlack
cdc979dae2 KVM: selftests: Link selftests directly with lib object files
The linker does obey strong/weak symbols when linking static libraries,
it simply resolves an undefined symbol to the first-encountered symbol.
This means that defining __weak arch-generic functions and then defining
arch-specific strong functions to override them in libkvm will not
always work.

More specifically, if we have:

lib/generic.c:

  void __weak foo(void)
  {
          pr_info("weak\n");
  }

  void bar(void)
  {
          foo();
  }

lib/x86_64/arch.c:

  void foo(void)
  {
          pr_info("strong\n");
  }

And a selftest that calls bar(), it will print "weak". Now if you make
generic.o explicitly depend on arch.o (e.g. add function to arch.c that
is called directly from generic.c) it will print "strong". In other
words, it seems that the linker is free to throw out arch.o when linking
because generic.o does not explicitly depend on it, which causes the
linker to lose the strong symbol.

One solution is to link libkvm.a with --whole-archive so that the linker
doesn't throw away object files it thinks are unnecessary. However that
is a bit difficult to plumb since we are using the common selftests
makefile rules. An easier solution is to drop libkvm.a just link
selftests with all the .o files that were originally in libkvm.a.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-9-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:25 -04:00
David Matlack
acf57736e7 KVM: selftests: Drop unnecessary rule for STATIC_LIBS
Drop the "all: $(STATIC_LIBS)" rule. The KVM selftests already depend
on $(STATIC_LIBS), so there is no reason to have an extra "all" rule.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-8-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:25 -04:00
David Matlack
c363d95986 KVM: selftests: Add a helper to check EPT/VPID capabilities
Create a small helper function to check if a given EPT/VPID capability
is supported. This will be re-used in a follow-up commit to check for 1G
page support.

No functional change intended.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-7-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:24 -04:00
David Matlack
b6c086d04c KVM: selftests: Move VMX_EPT_VPID_CAP_AD_BITS to vmx.h
This is a VMX-related macro so move it to vmx.h. While here, open code
the mask like the rest of the VMX bitmask macros.

No functional change intended.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-6-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:24 -04:00
David Matlack
ce690e9c17 KVM: selftests: Refactor nested_map() to specify target level
Refactor nested_map() to specify that it explicityl wants 4K mappings
(the existing behavior) and push the implementation down into
__nested_map(), which can be used in subsequent commits to create huge
page mappings.

No function change intended.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-5-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:23 -04:00
David Matlack
b8ca01ea19 KVM: selftests: Drop stale function parameter comment for nested_map()
nested_map() does not take a parameter named eptp_memslot. Drop the
comment referring to it.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-4-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:23 -04:00
David Matlack
c5a0ccec4c KVM: selftests: Add option to create 2M and 1G EPT mappings
The current EPT mapping code in the selftests only supports mapping 4K
pages. This commit extends that support with an option to map at 2M or
1G. This will be used in a future commit to create large page mappings
to test eager page splitting.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:22 -04:00
David Matlack
4ee602e78d KVM: selftests: Replace x86_page_size with PG_LEVEL_XX
x86_page_size is an enum used to communicate the desired page size with
which to map a range of memory. Under the hood they just encode the
desired level at which to map the page. This ends up being clunky in a
few ways:

 - The name suggests it encodes the size of the page rather than the
   level.
 - In other places in x86_64/processor.c we just use a raw int to encode
   the level.

Simplify this by adopting the kernel style of PG_LEVEL_XX enums and pass
around raw ints when referring to the level. This makes the code easier
to understand since these macros are very common in KVM MMU code.

Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:22 -04:00
Paolo Bonzini
e3cdaab5ff KVM: x86: SVM: fix nested PAUSE filtering when L0 intercepts PAUSE
Commit 74fd41ed16 ("KVM: x86: nSVM: support PAUSE filtering when L0
doesn't intercept PAUSE") introduced passthrough support for nested pause
filtering, (when the host doesn't intercept PAUSE) (either disabled with
kvm module param, or disabled with '-overcommit cpu-pm=on')

Before this commit, L1 KVM didn't intercept PAUSE at all; afterwards,
the feature was exposed as supported by KVM cpuid unconditionally, thus
if L1 could try to use it even when the L0 KVM can't really support it.

In this case the fallback caused KVM to intercept each PAUSE instruction;
in some cases, such intercept can slow down the nested guest so much
that it can fail to boot.  Instead, before the problematic commit KVM
was already setting both thresholds to 0 in vmcb02, but after the first
userspace VM exit shrink_ple_window was called and would reset the
pause_filter_count to the default value.

To fix this, change the fallback strategy - ignore the guest threshold
values, but use/update the host threshold values unless the guest
specifically requests disabling PAUSE filtering (either simple or
advanced).

Also fix a minor bug: on nested VM exit, when PAUSE filter counter
were copied back to vmcb01, a dirty bit was not set.

Thanks a lot to Suravee Suthikulpanit for debugging this!

Fixes: 74fd41ed16 ("KVM: x86: nSVM: support PAUSE filtering when L0 doesn't intercept PAUSE")
Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220518072709.730031-1-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:21 -04:00
Maxim Levitsky
ba8ec27324 KVM: x86: SVM: drop preempt-safe wrappers for avic_vcpu_load/put
Now that these functions are always called with preemption disabled,
remove the preempt_disable()/preempt_enable() pair inside them.

No functional change intended.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-8-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:20 -04:00
Maxim Levitsky
18869f26df KVM: x86: disable preemption around the call to kvm_arch_vcpu_{un|}blocking
On SVM, if preemption happens right after the call to finish_rcuwait
but before call to kvm_arch_vcpu_unblocking on SVM/AVIC, it itself
will re-enable AVIC, and then we will try to re-enable it again
in kvm_arch_vcpu_unblocking which will lead to a warning
in __avic_vcpu_load.

The same problem can happen if the vCPU is preempted right after the call
to kvm_arch_vcpu_blocking but before the call to prepare_to_rcuwait
and in this case, we will end up with AVIC enabled during sleep -
Ooops.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-7-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:20 -04:00
Maxim Levitsky
66c768d30e KVM: x86: disable preemption while updating apicv inhibition
Currently nothing prevents preemption in kvm_vcpu_update_apicv.

On SVM, If the preemption happens after we update the
vcpu->arch.apicv_active, the preemption itself will
'update' the inhibition since the AVIC will be first disabled
on vCPU unload and then enabled, when the current task
is loaded again.

Then we will try to update it again, which will lead to a warning
in __avic_vcpu_load, that the AVIC is already enabled.

Fix this by disabling preemption in this code.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:19 -04:00
Maxim Levitsky
603ccef42c KVM: x86: SVM: fix avic_kick_target_vcpus_fast
There are two issues in avic_kick_target_vcpus_fast

1. It is legal to issue an IPI request with APIC_DEST_NOSHORT
   and a physical destination of 0xFF (or 0xFFFFFFFF in case of x2apic),
   which must be treated as a broadcast destination.

   Fix this by explicitly checking for it.
   Also don’t use ‘index’ in this case as it gives no new information.

2. It is legal to issue a logical IPI request to more than one target.
   Index field only provides index in physical id table of first
   such target and therefore can't be used before we are sure
   that only a single target was addressed.

   Instead, parse the ICRL/ICRH, double check that a unicast interrupt
   was requested, and use that info to figure out the physical id
   of the target vCPU.
   At that point there is no need to use the index field as well.

In addition to fixing the above	issues,	also skip the call to
kvm_apic_match_dest.

It is possible to do this now, because now as long as AVIC is not
inhibited, it is guaranteed that none of the vCPUs changed their
apic id from its default value.

This fixes boot of windows guest with AVIC enabled because it uses
IPI with 0xFF destination and no destination shorthand.

Fixes: 7223fd2d53 ("KVM: SVM: Use target APIC ID to complete AVIC IRQs when possible")
Cc: stable@vger.kernel.org

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:19 -04:00
Maxim Levitsky
f5f9089f76 KVM: x86: SVM: remove avic's broken code that updated APIC ID
AVIC is now inhibited if the guest changes the apic id,
and therefore this code is no longer needed.

There are several ways this code was broken, including:

1. a vCPU was only allowed to change its apic id to an apic id
of an existing vCPU.

2. After such change, the vCPU whose apic id entry was overwritten,
could not correctly change its own apic id, because its own
entry is already overwritten.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:18 -04:00
Maxim Levitsky
3743c2f025 KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base
Neither of these settings should be changed by the guest and it is
a burden to support it in the acceleration code, so just inhibit
this code instead.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:18 -04:00
Maxim Levitsky
a9603ae0e4 KVM: x86: document AVIC/APICv inhibit reasons
These days there are too many AVIC/APICv inhibit
reasons, and it doesn't hurt to have some documentation
for them.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:17 -04:00
Yuan Yao
d2263de137 KVM: x86/mmu: Set memory encryption "value", not "mask", in shadow PDPTRs
Assign shadow_me_value, not shadow_me_mask, to PAE root entries,
a.k.a. shadow PDPTRs, when host memory encryption is supported.  The
"mask" is the set of all possible memory encryption bits, e.g. MKTME
KeyIDs, whereas "value" holds the actual value that needs to be
stuffed into host page tables.

Using shadow_me_mask results in a failed VM-Entry due to setting
reserved PA bits in the PDPTRs, and ultimately causes an OOPS due to
physical addresses with non-zero MKTME bits sending to_shadow_page()
into the weeds:

set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
BUG: unable to handle page fault for address: ffd43f00063049e8
PGD 86dfd8067 P4D 0
Oops: 0000 [#1] PREEMPT SMP
RIP: 0010:mmu_free_root_page+0x3c/0x90 [kvm]
 kvm_mmu_free_roots+0xd1/0x200 [kvm]
 __kvm_mmu_unload+0x29/0x70 [kvm]
 kvm_mmu_unload+0x13/0x20 [kvm]
 kvm_arch_destroy_vm+0x8a/0x190 [kvm]
 kvm_put_kvm+0x197/0x2d0 [kvm]
 kvm_vm_release+0x21/0x30 [kvm]
 __fput+0x8e/0x260
 ____fput+0xe/0x10
 task_work_run+0x6f/0xb0
 do_exit+0x327/0xa90
 do_group_exit+0x35/0xa0
 get_signal+0x911/0x930
 arch_do_signal_or_restart+0x37/0x720
 exit_to_user_mode_prepare+0xb2/0x140
 syscall_exit_to_user_mode+0x16/0x30
 do_syscall_64+0x4e/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: e54f1ff244 ("KVM: x86/mmu: Add shadow_me_value and repurpose shadow_me_mask")
Signed-off-by: Yuan Yao <yuan.yao@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Message-Id: <20220608012015.19566-1-yuan.yao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09 10:52:16 -04:00
Paolo Bonzini
76599a4761 Merge tag 'kvmarm-fixes-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.19, take #1

- Properly reset the SVE/SME flags on vcpu load

- Fix a vgic-v2 regression regarding accessing the pending
  state of a HW interrupt from userspace (and make the code
  common with vgic-v3)

- Fix access to the idreg range for protected guests

- Ignore 'kvm-arm.mode=protected' when using VHE

- Return an error from kvm_arch_init_vm() on allocation failure

- A bunch of small cleanups (comments, annotations, indentation)
2022-06-09 10:32:17 -04:00
GONG, Ruiqi
4527d47bb6 drm/atomic: fix warning of unused variable
Fix the `unused-but-set-variable` warning as how other iteration
wrappers do.

Link: https://lore.kernel.org/all/202206071049.pofHsRih-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607110848.941486-1-gongruiqi1@huawei.com
2022-06-09 16:09:46 +02:00
Paolo Bonzini
66da65005a Merge tag 'kvm-riscv-fixes-5.19-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 5.19, take #1

- Typo fix in arch/riscv/kvm/vmid.c

- Remove broken reference pattern from MAINTAINERS entry
2022-06-09 09:45:00 -04:00
Christian Lamparter
2c5947cffd Revert "mtd: rawnand: add support for Toshiba TC58NVG0S3HTA00 NAND flash"
This reverts commit 3380557fc7.

It turned out this "4-byte" ID might have been an honest mistake.
Regrettably, the chip Andreas has might be a counterfeit or is
damaged in some other way and shouldn't have ended up in a router.

Andreas reported his chip is returning just four bytes:
"98 f1 80 15 00 00 00 00".

However, according to Kioxia/Toshiba's datasheet, there should
have been at least another byte that would have contained the
correct OOB size that Andreas needed.

Miquel and Andreas are both favoring reverting the patch over
further, possibly hacky modifications:
"[Reverting] is the safest option here. Apart from this device, we
do not know how many devices have these damaged/counterfeit chips.
If it is just a couple and only on Fritzboxes, as suggested in the
Github issue the patch could be carried through OpenWrt[...]"

Thanks to several users on the openwrt forum and github issue,
who stayed along for the ride:
 - Peter-vdL for reporting the issue and testing patches.
 - neg2led and Hannu Nyman who did all the
   datasheet digging and debugging.

Cc: Andreas Boehler <dev@aboehler.at>
Suggested-by: Andreas Boehler <dev@aboehler.at>
Suggested-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://github.com/openwrt/openwrt/issues/9962
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220607185918.1048204-1-chunkeey@gmail.com
2022-06-09 15:07:07 +02:00
Slark Xiao
158f7585bf USB: serial: option: add support for Cinterion MV31 with new baseline
Adding support for Cinterion device MV31 with Qualcomm
new baseline. Use different PIDs to separate it from
previous base line products.
All interfaces settings keep same as previous.

Below is test evidence:
T:  Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  6 Spd=480 MxCh= 0
D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1e2d ProdID=00b8 Rev=04.14
S:  Manufacturer=Cinterion
S:  Product=Cinterion PID 0x00B8 USB Mobile Broadband
S:  SerialNumber=90418e79
C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
I:  If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
I:  If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option

T:  Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  7 Spd=480 MxCh= 0
D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1e2d ProdID=00b9 Rev=04.14
S:  Manufacturer=Cinterion
S:  Product=Cinterion PID 0x00B9 USB Mobile Broadband
S:  SerialNumber=90418e79
C:  #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option

For PID 00b8, interface 3 is GNSS port which don't use serial driver.

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Link: https://lore.kernel.org/r/20220601034740.5438-1-slark_xiao@163.com
[ johan: rename defines using a "2" infix ]
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2022-06-09 14:32:42 +02:00
Sungjong Seo
204e6ceaa1 exfat: use updated exfat_chain directly during renaming
In order for a file to access its own directory entry set,
exfat_inode_info(ei) has two copied values. One is ei->dir, which is
a snapshot of exfat_chain of the parent directory, and the other is
ei->entry, which is the offset of the start of the directory entry set
in the parent directory.

Since the parent directory can be updated after the snapshot point,
it should be used only for accessing one's own directory entry set.

However, as of now, during renaming, it could try to traverse or to
allocate clusters via snapshot values, it does not make sense.

This potential problem has been revealed when exfat_update_parent_info()
was removed by commit d8dad2588a ("exfat: fix referencing wrong parent
directory information after renaming"). However, I don't think it's good
idea to bring exfat_update_parent_info() back.

Instead, let's use the updated exfat_chain of parent directory diectly.

Fixes: d8dad2588a ("exfat: fix referencing wrong parent directory information after renaming")
Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Tested-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2022-06-09 21:26:32 +09:00
Marc Zyngier
bcbfb588cf KVM: arm64: Drop stale comment
The layout of 'struct kvm_vcpu_arch' has evolved significantly since
the initial port of KVM/arm64, so remove the stale comment suggesting
that a prefix of the structure is used exclusively from assembly code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-7-will@kernel.org
2022-06-09 13:24:02 +01:00
Will Deacon
5879c97f37 KVM: arm64: Remove redundant hyp_assert_lock_held() assertions
host_stage2_try() asserts that the KVM host lock is held, so there's no
need to duplicate the assertion in its wrappers.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-6-will@kernel.org
2022-06-09 13:24:02 +01:00
Will Deacon
112f3bab41 KVM: arm64: Extend comment in has_vhe()
has_vhe() expands to a compile-time constant when evaluated from the VHE
or nVHE code, alternatively checking a static key when called from
elsewhere in the kernel. On face value, this looks like a case of
premature optimization, but in fact this allows symbol references on
VHE-specific code paths to be dropped from the nVHE object.

Expand the comment in has_vhe() to make this clearer, hopefully
discouraging anybody from simplifying the code.

Cc: David Brazdil <dbrazdil@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-5-will@kernel.org
2022-06-09 13:24:02 +01:00
Will Deacon
cde5042adf KVM: arm64: Ignore 'kvm-arm.mode=protected' when using VHE
Ignore 'kvm-arm.mode=protected' when using VHE so that kvm_get_mode()
only returns KVM_MODE_PROTECTED on systems where the feature is available.

Cc: David Brazdil <dbrazdil@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-4-will@kernel.org
2022-06-09 13:24:02 +01:00
Marc Zyngier
fa7a172144 KVM: arm64: Handle all ID registers trapped for a protected VM
A protected VM accessing ID_AA64ISAR2_EL1 gets punished with an UNDEF,
while it really should only get a zero back if the register is not
handled by the hypervisor emulation (as mandated by the architecture).

Introduce all the missing ID registers (including the unallocated ones),
and have them to return 0.

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-3-will@kernel.org
2022-06-09 13:24:02 +01:00
Will Deacon
ae187fec75 KVM: arm64: Return error from kvm_arch_init_vm() on allocation failure
If we fail to allocate the 'supported_cpus' cpumask in kvm_arch_init_vm()
then be sure to return -ENOMEM instead of success (0) on the failure
path.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-2-will@kernel.org
2022-06-09 13:24:02 +01:00
Robert Eckelmann
908e698f21 USB: serial: io_ti: add Agilent E5805A support
Add support for Agilent E5805A (rebranded ION Edgeport/4) to io_ti.

Signed-off-by: Robert Eckelmann <longnoserob@gmail.com>
Link: https://lore.kernel.org/r/20220521230808.30931eca@octoberrain
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2022-06-09 14:13:28 +02:00
Guenter Roeck
b6c8cd80ac watchdog: gxp: Add missing MODULE_LICENSE
The build system says:

ERROR: modpost: missing MODULE_LICENSE() in drivers/watchdog/gxp-wdt.o

Add the missing MODULE_LICENSE.

Signed-off-by: Nick Hawkins <nick.hawkins@hpe.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/all/20220603131419.2948578-1-linux@roeck-us.net/
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2022-06-09 12:20:34 +02:00
Lukas Bulwahn
1a12b25274 MAINTAINERS: Limit KVM RISC-V entry to existing selftests
Commit fed9b26b25 ("MAINTAINERS: Update KVM RISC-V entry to cover
selftests support") optimistically adds a file entry for
tools/testing/selftests/kvm/riscv/, but this directory does not exist.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
broken reference. The script is very useful to keep MAINTAINERS up to date
and MAINTAINERS can be kept in a state where the script emits no warning.

So, just drop the non-matching file entry rather than starting to collect
exceptions of entries that may match in some close or distant future.

Fixes: fed9b26b25 ("MAINTAINERS: Update KVM RISC-V entry to cover selftests support")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-06-09 09:18:22 +05:30
Julia Lawall
ea6c121321 RISC-V: KVM: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-06-09 09:18:15 +05:30
Jiasheng Jiang
6ba12b56b9 i2c: npcm7xx: Add check for platform_driver_register
As platform_driver_register() could fail, it should be better
to deal with the return value in order to maintain the code
consisitency.

Fixes: 56a1485b10 ("i2c: npcm7xx: Add Nuvoton NPCM I2C controller driver")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Tali Perry <tali.perry1@gmail.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-06-08 22:15:37 +02:00
Andy Shevchenko
8c4811e7a5 MAINTAINERS: Update Synopsys DesignWare I2C to Supported
The actual status of the code is Supported (from x86 perspective).

Reported-by: dave.hansen@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
[wsa: fixed "DesignWare" spelling]
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-06-08 21:44:09 +02:00
Michael Kelley
f5f93d7f5a HID: hyperv: Correctly access fields declared as __le16
Add the use of le16_to_cpu() for fields declared as __le16. Because
Hyper-V only runs in Little Endian mode, there's no actual bug.
The change is made in the interest of general correctness in
addition to making sparse happy. No functional change.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1654660177-115463-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-06-08 12:28:13 +00:00
Masahiro Yamada
245b993d8f clocksource: hyper-v: unexport __init-annotated hv_init_clocksource()
EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.

modpost used to detect it, but it has been broken for a decade.

Recently, I fixed modpost so it started to warn it again, then this
showed up in linux-next builds.

There are two ways to fix it:

  - Remove __init
  - Remove EXPORT_SYMBOL

I chose the latter for this case because the only in-tree call-site,
arch/x86/kernel/cpu/mshyperv.c is never compiled as modular.
(CONFIG_HYPERVISOR_GUEST is boolean)

Fixes: dd2cb34861 ("clocksource/drivers: Continue making Hyper-V clocksource ISA agnostic")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220606050238.4162200-1-masahiroy@kernel.org
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-06-08 12:27:08 +00:00
Xiang wangx
92ec746bce Drivers: hv: Fix syntax errors in comments
Delete the redundant word 'in'.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220605085524.11289-1-wangxiang@cdjrlc.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-06-08 12:26:28 +00:00
Saurabh Sengar
6640b5df1a Drivers: hv: vmbus: Don't assign VMbus channel interrupts to isolated CPUs
When initially assigning a VMbus channel interrupt to a CPU, don’t choose
a managed IRQ isolated CPU (as specified on the kernel boot line with
parameter 'isolcpus=managed_irq,<#cpu>'). Also, when using sysfs to change
the CPU that a VMbus channel will interrupt, don't allow changing to a
managed IRQ isolated CPU.

Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1653637439-23060-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-06-08 12:24:50 +00:00
Florian Westphal
b1fd94e704 netfilter: use get_random_u32 instead of prandom
bh might occur while updating per-cpu rnd_state from user context,
ie. local_out path.

BUG: using smp_processor_id() in preemptible [00000000] code: nginx/2725
caller is nft_ng_random_eval+0x24/0x54 [nft_numgen]
Call Trace:
 check_preemption_disabled+0xde/0xe0
 nft_ng_random_eval+0x24/0x54 [nft_numgen]

Use the random driver instead, this also avoids need for local prandom
state. Moreover, prandom now uses the random driver since d4150779e6
("random32: use real rng for non-deterministic randomness").

Based on earlier patch from Pablo Neira.

Fixes: 6b2faee0ca ("netfilter: nft_meta: place prandom handling in a helper")
Fixes: 978d8f9055 ("netfilter: nft_numgen: add map lookups for numgen random operations")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-08 12:30:59 +02:00
Miaoqian Lin
37d838de36 soc: bcm: brcmstb: pm: pm-arm: Fix refcount leak in brcmstb_pm_probe
of_find_matching_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.

In brcmstb_init_sram, it pass dn to of_address_to_resource(),
of_address_to_resource() will call of_find_device_by_node() to take
reference, so we should release the reference returned by
of_find_matching_node().

Fixes: 0b741b8234 ("soc: bcm: brcmstb: Add support for S2/S3/S5 suspend states (ARM)")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2022-06-08 02:21:17 -07:00
Marc Zyngier
efedd01de4 KVM: arm64: Warn if accessing timer pending state outside of vcpu context
A recurrent bug in the KVM/arm64 code base consists in trying to
access the timer pending state outside of the vcpu context, which
makes zero sense (the pending state only exists when the vcpu
is loaded).

In order to avoid more embarassing crashes and catch the offenders
red-handed, add a warning to kvm_arch_timer_get_input_level() and
return the state as non-pending. This avoids taking the system down,
and still helps tracking down silly bugs.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220607131427.1164881-4-maz@kernel.org
2022-06-08 10:16:23 +01:00
Marc Zyngier
98432ccdec KVM: arm64: Replace vgic_v3_uaccess_read_pending with vgic_uaccess_read_pending
Now that GICv2 has a proper userspace accessor for the pending state,
switch GICv3 over to it, dropping the local version, moving over the
specific behaviours that CGIv3 requires (such as the distinction
between pending latch and line level which were never enforced
with GICv2).

We also gain extra locking that isn't really necessary for userspace,
but that's a small price to pay for getting rid of superfluous code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20220607131427.1164881-3-maz@kernel.org
2022-06-08 10:16:15 +01:00
Nicolas Saenz Julienne
46d6e11320 MAINTAINERS: Update BCM2711/BCM2835 maintainer
I haven't been able to find time to maintain BCM2711/BCM2835 these last
months, so it's only fair to pass the baton to Florian who's been doing
the work.

Signed-off-by: Nicolas Saenz Julienne <nsaenz@kernel.org>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2022-06-08 01:00:47 -07:00
Maximilian Luz
ce0db505bc drm/msm: Fix double pm_runtime_disable() call
Following commit 17e822f759 ("drm/msm: fix unbalanced
pm_runtime_enable in adreno_gpu_{init, cleanup}"), any call to
adreno_unbind() will disable runtime PM twice, as indicated by the call
trees below:

  adreno_unbind()
   -> pm_runtime_force_suspend()
   -> pm_runtime_disable()

  adreno_unbind()
   -> gpu->funcs->destroy() [= aNxx_destroy()]
   -> adreno_gpu_cleanup()
   -> pm_runtime_disable()

Note that pm_runtime_force_suspend() is called right before
gpu->funcs->destroy() and both functions are called unconditionally.

With recent addition of the eDP AUX bus code, this problem manifests
itself when the eDP panel cannot be found yet and probing is deferred.
On the first probe attempt, we disable runtime PM twice as described
above. This then causes any later probe attempt to fail with

  [drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -13

preventing the driver from loading.

As there seem to be scenarios where the aNxx_destroy() functions are not
called from adreno_unbind(), simply removing pm_runtime_disable() from
inside adreno_unbind() does not seem to be the proper fix. This is what
commit 17e822f759 ("drm/msm: fix unbalanced pm_runtime_enable in
adreno_gpu_{init, cleanup}") intended to fix. Therefore, instead check
whether runtime PM is still enabled, and only disable it in that case.

Fixes: 17e822f759 ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}")
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Link: https://lore.kernel.org/r/20220606211305.189585-1-luzmaximilian@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-07 13:13:05 -07:00
Robert Marko
122e951eb8 regulator: qcom_smd: correct MP5496 ranges
Currently set MP5496 Buck and LDO ranges dont match its datasheet[1].
According to the datasheet:
Buck range is 0.6-2.1875V with a 12.5mV step
LDO range is 0.8-3.975V with a 25mV step.

So, correct the ranges according to the datasheet[1].

[1] https://www.monolithicpower.com/en/documentview/productdocument/index/version/2/document_type/Datasheet/lang/en/sku/MP5496GR/document_id/6906/

Signed-off-by: Robert Marko <robimarko@gmail.com>
Link: https://lore.kernel.org/r/20220604193300.125758-2-robimarko@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-07 20:38:09 +01:00
David Sterba
e3a4167c88 btrfs: add error messages to all unrecognized mount options
Almost none of the errors stemming from a valid mount option but wrong
value prints a descriptive message which would help to identify why
mount failed. Like in the linked report:

  $ uname -r
  v4.19
  $ mount -o compress=zstd /dev/sdb /mnt
  mount: /mnt: wrong fs type, bad option, bad superblock on
  /dev/sdb, missing codepage or helper program, or other error.
  $ dmesg
  ...
  BTRFS error (device sdb): open_ctree failed

Errors caused by memory allocation failures are left out as it's not a
user error so reporting that would be confusing.

Link: https://lore.kernel.org/linux-btrfs/9c3fec36-fc61-3a33-4977-a7e207c3fa4e@gmx.de/
CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-07 17:29:50 +02:00
Marc Zyngier
2cdea19a34 KVM: arm64: Don't read a HW interrupt pending state in user context
Since 5bfa685e62 ("KVM: arm64: vgic: Read HW interrupt pending state
from the HW"), we're able to source the pending bit for an interrupt
that is stored either on the physical distributor or on a device.

However, this state is only available when the vcpu is loaded,
and is not intended to be accessed from userspace. Unfortunately,
the GICv2 emulation doesn't provide specific userspace accessors,
and we fallback with the ones that are intended for the guest,
with fatal consequences.

Add a new vgic_uaccess_read_pending() accessor for userspace
to use, build on top of the existing vgic_mmio_read_pending().

Reported-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: 5bfa685e62 ("KVM: arm64: vgic: Read HW interrupt pending state from the HW")
Link: https://lore.kernel.org/r/20220607131427.1164881-2-maz@kernel.org
Cc: stable@vger.kernel.org
2022-06-07 16:28:19 +01:00
Scott Mayhew
304791255a sunrpc: set cl_max_connect when cloning an rpc_clnt
If the initial attempt at trunking detection using the krb5i auth flavor
fails with -EACCES, -NFS4ERR_CLID_INUSE, or -NFS4ERR_WRONGSEC, then the
NFS client tries again using auth_sys, cloning the rpc_clnt in the
process.  If this second attempt at trunking detection succeeds, then
the resulting nfs_client->cl_rpcclient winds up having cl_max_connect=0
and subsequent attempts to add additional transport connections to the
rpc_clnt will fail with a message similar to the following being logged:

[502044.312640] SUNRPC: reached max allowed number (0) did not add
transport to server: 192.168.122.3

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Fixes: dc48e0abee ("SUNRPC enforce creation of no more than max_connect xprts")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-07 10:36:33 -04:00
sunliming
e3fe65e0d3 KVM: arm64: Fix inconsistent indenting
Fix the following smatch warnings:

arch/arm64/kvm/vmid.c:62 flush_context() warn: inconsistent indenting

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220602024805.511457-1-sunliming@kylinos.cn
2022-06-07 15:27:05 +01:00
Marc Zyngier
039f49c4ca KVM: arm64: Always start with clearing SME flag on load
On each vcpu load, we set the KVM_ARM64_HOST_SME_ENABLED
flag if SME is enabled for EL0 on the host. This is used to
restore the correct state on vpcu put.

However, it appears that nothing ever clears this flag. Once
set, it will stick until the vcpu is destroyed, which has the
potential to spuriously enable SME for userspace. As it turns
out, this is due to the SME code being more or less copied from
SVE, and inheriting the same shortcomings.

We never saw the issue because nothing uses SME, and the amount
of testing is probably still pretty low.

Fixes: 861262ab86 ("KVM: arm64: Handle SME host state when running guests")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviwed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220528113829.1043361-3-maz@kernel.org
2022-06-07 14:31:30 +01:00
Marc Zyngier
d52d165d67 KVM: arm64: Always start with clearing SVE flag on load
On each vcpu load, we set the KVM_ARM64_HOST_SVE_ENABLED
flag if SVE is enabled for EL0 on the host. This is used to restore
the correct state on vpcu put.

However, it appears that nothing ever clears this flag. Once
set, it will stick until the vcpu is destroyed, which has the
potential to spuriously enable SVE for userspace.

We probably never saw the issue because no VMM uses SVE, but
that's still pretty bad. Unconditionally clearing the flag
on vcpu load addresses the issue.

Fixes: 8383741ab2 ("KVM: arm64: Get rid of host SVE tracking/saving")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220528113829.1043361-2-maz@kernel.org
2022-06-07 14:19:23 +01:00
Eddie James
ac6888ac5a hwmon: (occ) Lock mutex in shutdown to prevent race with occ_active
Unbinding the driver or removing the parent device at the same time
as using the OCC active sysfs file can cause the driver to unregister
the hwmon device twice. Prevent this by locking the occ mutex in the
shutdown function.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20220606185455.21126-1-eajames@linux.ibm.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-06-07 05:45:42 -07:00
Rob Herring
5e3f89ad8e dt-bindings: hwmon: ti,tmp401: Drop 'items' from 'ti,n-factor' property
'ti,n-factor' is a scalar type, so 'items' should not be used as that is
for arrays/matrix.

A pending meta-schema change will catch future cases.

Fixes: bd90c5b939 ("dt-bindings: hwmon: Add TMP401, TMP411 and TMP43x")
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220606212223.1360395-1-robh@kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-06-07 05:45:29 -07:00
Helge Deller
1d0811b03e parisc/stifb: Fix fb_is_primary_device() only available with CONFIG_FB_STI
Fix this build error noticed by the kernel test robot:

drivers/video/console/sticore.c:1132:5: error: redefinition of 'fb_is_primary_device'
 arch/parisc/include/asm/fb.h:18:19: note: previous definition of 'fb_is_primary_device'

Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
Cc: stable@vger.kernel.org   # v5.10+
2022-06-07 13:01:17 +02:00
Kamal Heib
118f767413 RDMA/qedr: Fix reporting QP timeout attribute
Make sure to save the passed QP timeout attribute when the QP gets modified,
so when calling query QP the right value is reported and not the
converted value that is required by the firmware. This issue was found
while running the pyverbs tests.

Fixes: cecbcddf64 ("qedr: Add support for QP verbs")
Link: https://lore.kernel.org/r/20220525132029.84813-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-06-07 11:46:56 +03:00
Chester Lin
680c0aee97 MAINTAINERS: add a new reviewer for S32G
Add the NXP S32 Linux team as a designated review group of s32g.

Signed-off-by: Chester Lin <clin@suse.com>
Link: https://lore.kernel.org/r/20220525161422.14156-1-clin@suse.com
2022-06-07 10:26:05 +08:00
Fabio Estevam
4266e2f70d arm64: s32g2: Pass unit name to soc node
Pass unit name to soc node to fix the following W=1 build warning:

arch/arm64/boot/dts/freescale/s32g2.dtsi:82.6-123.4: Warning (unit_address_vs_reg): /soc: node has a reg or ranges property, but no unit name

Signed-off-by: Fabio Estevam <festevam@denx.de>
Reviewed-by: Chester Lin <clin@suse.com>
Signed-off-by: Chester Lin <clin@suse.com>
Link: https://lore.kernel.org/r/20220514143505.1554813-1-festevam@gmail.com
2022-06-07 10:24:57 +08:00
Josh Poimboeuf
7b6c7a877c x86/ftrace: Remove OBJECT_FILES_NON_STANDARD usage
The file-wide OBJECT_FILES_NON_STANDARD annotation is used with
CONFIG_FRAME_POINTER to tell objtool to skip the entire file when frame
pointers are enabled.  However that annotation is now deprecated because
it doesn't work with IBT, where objtool runs on vmlinux.o instead of
individual translation units.

Instead, use more fine-grained function-specific annotations:

- The 'save_mcount_regs' macro does funny things with the frame pointer.
  Use STACK_FRAME_NON_STANDARD_FP to tell objtool to ignore the
  functions using it.

- The return_to_handler() "function" isn't actually a callable function.
  Instead of being called, it's returned to.  The real return address
  isn't on the stack, so unwinding is already doomed no matter which
  unwinder is used.  So just remove the STT_FUNC annotation, telling
  objtool to ignore it.  That also removes the implicit
  ANNOTATE_NOENDBR, which now needs to be made explicit.

Fixes the following warning:

  vmlinux.o: warning: objtool: __fentry__+0x16: return with modified stack frame

Fixes: ed53a0d971 ("x86/alternative: Use .ibt_endbr_seal to seal indirect calls")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/b7a7a42fe306aca37826043dac89e113a1acdbac.1654268610.git.jpoimboe@kernel.org
2022-06-06 11:50:22 -07:00
Josh Poimboeuf
dcea997bee faddr2line: Fix overlapping text section failures, the sequel
If a function lives in a section other than .text, but .text also exists
in the object, faddr2line may wrongly assume .text.  This can result in
comically wrong output.  For example:

  $ scripts/faddr2line vmlinux.o enter_from_user_mode+0x1c
  enter_from_user_mode+0x1c/0x30:
  find_next_bit at /home/jpoimboe/git/linux/./include/linux/find.h:40
  (inlined by) perf_clear_dirty_counters at /home/jpoimboe/git/linux/arch/x86/events/core.c:2504

Fix it by passing the section name to addr2line, unless the object file
is vmlinux, in which case the symbol table uses absolute addresses.

Fixes: 1d1a0e7c51 ("scripts/faddr2line: Fix overlapping text section failures")
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/7d25bc1408bd3a750ac26e60d2f2815a5f4a8363.1654130536.git.jpoimboe@kernel.org
2022-06-06 11:50:11 -07:00
Josh Poimboeuf
c2f75a43f5 objtool: Fix obsolete reference to CONFIG_X86_SMAP
CONFIG_X86_SMAP no longer exists.  For objtool's purposes it has been
replaced with CONFIG_HAVE_UACCESS_VALIDATION.

Fixes: 03f16cd020 ("objtool: Add CONFIG_OBJTOOL")
Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/44c57668768c1ba1b4ba1ff541ec54781636e07c.1654101721.git.jpoimboe@kernel.org
2022-06-06 11:49:55 -07:00
Trond Myklebust
880265c77a pNFS: Avoid a live lock condition in pnfs_update_layout()
If we're about to send the first layoutget for an empty layout, we want
to make sure that we drain out the existing pending layoutget calls
first. The reason is that these layouts may have been already implicitly
returned to the server by a recall to which the client gave a
NFS4ERR_NOMATCHING_LAYOUT response.

The problem is that wait_var_event_killable() could in principle see the
plh_outstanding count go back to '1' when the first process to wake up
starts sending a new layoutget. If it fails to get a layout, then this
loop can continue ad infinitum...

Fixes: 0b77f97a7e ("NFSv4/pnfs: Fix layoutget behaviour after invalidation")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-06 11:53:55 -04:00
Trond Myklebust
fe44fb23d6 pNFS: Don't keep retrying if the server replied NFS4ERR_LAYOUTUNAVAILABLE
If the server tells us that a pNFS layout is not available for a
specific file, then we should not keep pounding it with further
layoutget requests.

Fixes: 183d9e7b11 ("pnfs: rework LAYOUTGET retry handling")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-06 11:53:54 -04:00
Cristian Marussi
d0c94bef70 firmware: arm_scmi: Remove all the unused local variables
While using SCMI iterators helpers a few local automatic variables are
defined but then used only as input for sizeof operators.

cppcheck is fooled to complain about this with:

 | drivers/firmware/arm_scmi/sensors.c:341:48: warning: Variable 'msg' is
 |				not assigned a value. [unassignedVariable]
 |	 struct scmi_msg_sensor_list_update_intervals *msg;

Even though this is an innocuos warning, since the uninitialized variable
is at the end never used in the reported cases, fix these occurences all
over SCMI stack to avoid keeping unneeded objects on the stack.

Link: https://lore.kernel.org/r/20220530115237.277077-1-cristian.marussi@arm.com
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-06-06 15:47:04 +01:00
Cristian Marussi
122839b58a firmware: arm_scmi: Relax base protocol sanity checks on the protocol list
Even though malformed replies from firmware must be treated carefully to
avoid memory corruption in the kernel, some out-of-spec SCMI replies can
be tolerated to avoid breaking existing deployed system, as long as they
won't cause memory issues.

Relax the sanity checks on the recieved protocol list in the base protocol
to avoid breaking one of the deployed platform whose firmware is not easily
upgradable currently.

Link: https://lore.kernel.org/r/20220523171559.472112-1-cristian.marussi@arm.com
Cc: Etienne Carriere <etienne.carriere@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com>
Tested-By: Frank Wunderlich <frank-w@public-files.de>
Acked-by: Michael Riesch <michael.riesch@wolfvision.net>
Acked-by: Etienne Carriere <etienne.carriere@linaro.org>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-06-06 15:46:35 +01:00
Qu Wenruo
0591f04036 btrfs: prevent remounting to v1 space cache for subpage mount
Upstream commit 9f73f1aef9 ("btrfs: force v2 space cache usage for
subpage mount") forces subpage mount to use v2 cache, to avoid
deprecated v1 cache which doesn't support subpage properly.

But there is a loophole that user can still remount to v1 cache.

The existing check will only give users a warning, but does not really
prevent to do the remount.

Although remounting to v1 will not cause any problems since the v1 cache
will always be marked invalid when mounted with a different page size,
it's still better to prevent v1 cache at all for subpage mounts.

Fixes: 9f73f1aef9 ("btrfs: force v2 space cache usage for subpage mount")
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-06 16:18:59 +02:00
Filipe Manana
31e70e5278 btrfs: fix hang during unmount when block group reclaim task is running
When we start an unmount, at close_ctree(), if we have the reclaim task
running and in the middle of a data block group relocation, we can trigger
a deadlock when stopping an async reclaim task, producing a trace like the
following:

[629724.498185] task:kworker/u16:7   state:D stack:    0 pid:681170 ppid:     2 flags:0x00004000
[629724.499760] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
[629724.501267] Call Trace:
[629724.501759]  <TASK>
[629724.502174]  __schedule+0x3cb/0xed0
[629724.502842]  schedule+0x4e/0xb0
[629724.503447]  btrfs_wait_on_delayed_iputs+0x7c/0xc0 [btrfs]
[629724.504534]  ? prepare_to_wait_exclusive+0xc0/0xc0
[629724.505442]  flush_space+0x423/0x630 [btrfs]
[629724.506296]  ? rcu_read_unlock_trace_special+0x20/0x50
[629724.507259]  ? lock_release+0x220/0x4a0
[629724.507932]  ? btrfs_get_alloc_profile+0xb3/0x290 [btrfs]
[629724.508940]  ? do_raw_spin_unlock+0x4b/0xa0
[629724.509688]  btrfs_async_reclaim_metadata_space+0x139/0x320 [btrfs]
[629724.510922]  process_one_work+0x252/0x5a0
[629724.511694]  ? process_one_work+0x5a0/0x5a0
[629724.512508]  worker_thread+0x52/0x3b0
[629724.513220]  ? process_one_work+0x5a0/0x5a0
[629724.514021]  kthread+0xf2/0x120
[629724.514627]  ? kthread_complete_and_exit+0x20/0x20
[629724.515526]  ret_from_fork+0x22/0x30
[629724.516236]  </TASK>
[629724.516694] task:umount          state:D stack:    0 pid:719055 ppid:695412 flags:0x00004000
[629724.518269] Call Trace:
[629724.518746]  <TASK>
[629724.519160]  __schedule+0x3cb/0xed0
[629724.519835]  schedule+0x4e/0xb0
[629724.520467]  schedule_timeout+0xed/0x130
[629724.521221]  ? lock_release+0x220/0x4a0
[629724.521946]  ? lock_acquired+0x19c/0x420
[629724.522662]  ? trace_hardirqs_on+0x1b/0xe0
[629724.523411]  __wait_for_common+0xaf/0x1f0
[629724.524189]  ? usleep_range_state+0xb0/0xb0
[629724.524997]  __flush_work+0x26d/0x530
[629724.525698]  ? flush_workqueue_prep_pwqs+0x140/0x140
[629724.526580]  ? lock_acquire+0x1a0/0x310
[629724.527324]  __cancel_work_timer+0x137/0x1c0
[629724.528190]  close_ctree+0xfd/0x531 [btrfs]
[629724.529000]  ? evict_inodes+0x166/0x1c0
[629724.529510]  generic_shutdown_super+0x74/0x120
[629724.530103]  kill_anon_super+0x14/0x30
[629724.530611]  btrfs_kill_super+0x12/0x20 [btrfs]
[629724.531246]  deactivate_locked_super+0x31/0xa0
[629724.531817]  cleanup_mnt+0x147/0x1c0
[629724.532319]  task_work_run+0x5c/0xa0
[629724.532984]  exit_to_user_mode_prepare+0x1a6/0x1b0
[629724.533598]  syscall_exit_to_user_mode+0x16/0x40
[629724.534200]  do_syscall_64+0x48/0x90
[629724.534667]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[629724.535318] RIP: 0033:0x7fa2b90437a7
[629724.535804] RSP: 002b:00007ffe0b7e4458 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[629724.536912] RAX: 0000000000000000 RBX: 00007fa2b9182264 RCX: 00007fa2b90437a7
[629724.538156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000555d6cf20dd0
[629724.539053] RBP: 0000555d6cf20ba0 R08: 0000000000000000 R09: 00007ffe0b7e3200
[629724.539956] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[629724.540883] R13: 0000555d6cf20dd0 R14: 0000555d6cf20cb0 R15: 0000000000000000
[629724.541796]  </TASK>

This happens because:

1) Before entering close_ctree() we have the async block group reclaim
   task running and relocating a data block group;

2) There's an async metadata (or data) space reclaim task running;

3) We enter close_ctree() and park the cleaner kthread;

4) The async space reclaim task is at flush_space() and runs all the
   existing delayed iputs;

5) Before the async space reclaim task calls
   btrfs_wait_on_delayed_iputs(), the block group reclaim task which is
   doing the data block group relocation, creates a delayed iput at
   replace_file_extents() (called when COWing leaves that have file extent
   items pointing to relocated data extents, during the merging phase
   of relocation roots);

6) The async reclaim space reclaim task blocks at
   btrfs_wait_on_delayed_iputs(), since we have a new delayed iput;

7) The task at close_ctree() then calls cancel_work_sync() to stop the
   async space reclaim task, but it blocks since that task is waiting for
   the delayed iput to be run;

8) The delayed iput is never run because the cleaner kthread is parked,
   and no one else runs delayed iputs, resulting in a hang.

So fix this by stopping the async block group reclaim task before we
park the cleaner kthread.

Fixes: 18bb8bbf13 ("btrfs: zoned: automatically reclaim zones")
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-06-06 16:18:52 +02:00
Rob Herring
6aa27071e4 spi: dt-bindings: Fix unevaluatedProperties warnings in examples
The 'unevaluatedProperties' schema checks is not fully working and doesn't
catch some cases where there's a $ref to another schema. A fix is pending,
but results in new warnings in examples.

'spi-max-frequency' is supposed to be a per SPI peripheral device property,
not a SPI controller property, so drop it.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220526014141.2872567-1-robh@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-06 12:32:28 +01:00
Patrice Chotard
2283679f4c spi: spi-mem: Fix spi_mem_poll_status()
In spi_mem_exec_op(), in case cs_gpiod descriptor is set, exec_op()
callback can't be used.
The same must be applied in spi_mem_poll_status(), poll_status()
callback can't be used, we must use the legacy path using
read_poll_timeout().

Tested on STM32mp257c-ev1 specific evaluation board on which a
spi-nand was mounted instead of a spi-nor.

Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>
Tested-by: Patrice Chotard <patrice.chotard@foss.st.com>
Link: https://lore.kernel.org/r/20220602091022.358127-1-patrice.chotard@foss.st.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-06 12:32:27 +01:00
Lars-Peter Clausen
7b40322f71 spi: cadence: Detect transmit FIFO depth
The depth of the transmit FIFO for the Cadence SPI controller is currently
hardcoded to 128. But the depth is a synthesis configuration parameter of
the core and can vary between different SoCs.

If the configured FIFO size is less than 128 the driver will busy loop in
the cdns_spi_fill_tx_fifo() function waiting for FIFO space to become
available.

Depending on the length and speed of the transfer it can spin for a
significant amount of time. The cdns_spi_fill_tx_fifo() function is called
from the drivers interrupt handler, so it can leave interrupts disabled for
a prolonged amount of time.

In addition the read FIFO will also overflow and data will be discarded.

To avoid this detect the actual size of the FIFO and use that rather than
the hardcoded value.

To detect the FIFO size the FIFO threshold register is used. The register
is sized so that it can hold FIFO size - 1 as its maximum value. Bits that
are not needed to hold the threshold value will always read 0. By writing
0xffff to the register and then reading back the value in the register we
get the FIFO size.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20220527091143.3780378-1-lars@metafoo.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-06 12:32:25 +01:00
Sai Krishna Potthuri
21b511ddee spi: spi-cadence: Fix SPI CS gets toggling sporadically
As part of unprepare_transfer_hardware, SPI controller will be disabled
which will indirectly deassert the CS line. This will create a problem
in some of the devices where message will be transferred with
cs_change flag set(CS should not be deasserted).
As per SPI controller implementation, if SPI controller is disabled then
all output enables are inactive and all pins are set to input mode which
means CS will go to default state high(deassert). This leads to an issue
when core explicitly ask not to deassert the CS (cs_change = 1). This
patch fix the above issue by checking the Slave select status bits from
configuration register before disabling the SPI.

Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com>
Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@xilinx.com>
Link: https://lore.kernel.org/r/20220606062525.18447-1-amit.kumar-mahapatra@xilinx.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-06 12:32:24 +01:00
Miaoqian Lin
1332661e09 memory: samsung: exynos5422-dmc: Fix refcount leak in of_get_dram_timings
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
This function doesn't call of_node_put() in some error paths.
To unify the structure, Add put_node label and goto it on errors.

Fixes: 6e7674c3c6 ("memory: Add DMC driver for Exynos5422")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220602041721.64348-1-linmq006@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2022-06-06 11:18:20 +02:00
Miaoqian Lin
038ae37c51 memory: mtk-smi: add missing put_device() call in mtk_smi_device_link_common
The reference taken by 'of_find_device_by_node()' must be released when
not needed anymore.
Add the corresponding 'put_device()' in the error handling paths.

Fixes: 4740475770 ("memory: mtk-smi: Add device link for smi-sub-common")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220601120118.60225-1-linmq006@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2022-06-06 11:17:14 +02:00
Geert Uytterhoeven
67c7fc6cd9 memory: omap-gpmc: OMAP_GPMC should depend on ARCH_OMAP2PLUS || ARCH_KEYSTONE || ARCH_K3
The Texas Instruments OMAP General Purpose Memory Controller (GPMC) is
only present on TI OMAP2/3/4/5, Keystone, AM33xx, AM43x, DRA7xx, TI81xx,
and K3 SoCs.  Hence add a dependency on ARCH_OMAP2PLUS || ARCH_KEYSTONE
|| ARCH_K3, to prevent asking the user about this driver when
configuring a kernel without OMAP2+, Keystone, or K3 SoC family support.

Fixes: be34f45f0d ("memory: omap-gpmc: Make OMAP_GPMC config visible and selectable")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Roger Quadros <rogerq@kernel.org>
Link: https://lore.kernel.org/r/f6780f572f882ed6ab5934321942cf2b412bf8d1.1652174849.git.geert+renesas@glider.be
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2022-06-06 11:10:53 +02:00
Miaoqian Lin
c4c7952504 ARM: exynos: Fix refcount leak in exynos_map_pmu
of_find_matching_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
of_node_put() checks null pointer.

Fixes: fce9e5bb25 ("ARM: EXYNOS: Add support for mapping PMU base address via DT")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220523145513.12341-1-linmq006@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2022-06-06 10:40:57 +02:00
David Virag
f84d83d816 arm64: dts: exynos: Correct UART clocks on Exynos7885
The clocks in the serial UART nodes were swapped by mistake on
Exynos7885. This only worked correctly because of a mistake in the clock
driver which has been fixed. With the fixed clock driver in place, the
baudrate of the UARTs get miscalculated. Fix this by correcting the
clocks in the dtsi.

Fixes: 0687401532 ("arm64: dts: exynos: Add initial device tree support for Exynos7885 SoC")
Signed-off-by: David Virag <virag.david003@gmail.com>
Link: https://lore.kernel.org/r/20220526055840.45209-3-virag.david003@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2022-06-06 10:37:14 +02:00
Larry Finger
96f0a54e8e staging: r8188eu: Fix warning of array overflow in ioctl_linux.c
Building with -Warray-bounds results in the following warning plus others
related to the same problem:

CC [M]  drivers/staging/r8188eu/os_dep/ioctl_linux.o
In function ‘wpa_set_encryption’,
    inlined from ‘rtw_wx_set_enc_ext’ at drivers/staging/r8188eu/os_dep/ioctl_linux.c:1868:9:
drivers/staging/r8188eu/os_dep/ioctl_linux.c:412:41: warning: array subscript ‘struct ndis_802_11_wep[0]’ is partly outside array bounds of ‘void[25]’ [-Warray-bounds]
  412 |                         pwep->KeyLength = wep_key_len;
      |                         ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
In file included from drivers/staging/r8188eu/os_dep/../include/osdep_service.h:19,
                 from drivers/staging/r8188eu/os_dep/ioctl_linux.c:4:
In function ‘kmalloc’,
    inlined from ‘kzalloc’ at ./include/linux/slab.h:733:9,
    inlined from ‘wpa_set_encryption’ at drivers/staging/r8188eu/os_dep/ioctl_linux.c:408:11,
    inlined from ‘rtw_wx_set_enc_ext’ at drivers/staging/r8188eu/os_dep/ioctl_linux.c:1868:9:
./include/linux/slab.h:605:16: note: object of size [17, 25] allocated by ‘__kmalloc’
  605 |         return __kmalloc(size, flags);
      |                ^~~~~~~~~~~~~~~~~~~~~~
./include/linux/slab.h:600:24: note: object of size [17, 25] allocated by ‘kmem_cache_alloc_trace’
  600 |                 return kmem_cache_alloc_trace(
      |                        ^~~~~~~~~~~~~~~~~~~~~~~
  601 |                                 kmalloc_caches[kmalloc_type(flags)][index],
      |                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  602 |                                 flags, size);
      |                                 ~~~~~~~~~~~~

Although it is unlikely that anyone is still using WEP encryption, the
size of the allocation needs to be increased just in case.

Fixes commit 2b42bd58b3 ("staging: r8188eu: introduce new os_dep dir for RTL8188eu driver")

Fixes: 2b42bd58b3 ("staging: r8188eu: introduce new os_dep dir for RTL8188eu driver")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Phillip Potter <phil@philpotter.co.uk>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220531013103.2175-3-Larry.Finger@lwfinger.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-06 08:10:14 +02:00
Phillip Potter
5b7419ae1d staging: r8188eu: fix rtw_alloc_hwxmits error detection for now
In _rtw_init_xmit_priv, we use the res variable to store the error
return from the newly converted rtw_alloc_hwxmits function. Sadly, the
calling function interprets res using _SUCCESS and _FAIL still, meaning
we change the semantics of the variable, even in the success case.

This leads to the following on boot:
r8188eu 1-2:1.0: _rtw_init_xmit_priv failed

In the long term, we should reverse these semantics, but for now, this
fixes the driver. Also, inside rtw_alloc_hwxmits remove the if blocks,
as HWXMIT_ENTRY is always 4.

Fixes: f94b47c6bd ("staging: r8188eu: add check for kzalloc")
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Link: https://lore.kernel.org/r/20220521204741.921-1-phil@philpotter.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-06 08:09:21 +02:00
Rob Clark
036d20726c drm/msm: Ensure mmap offset is initialized
If a GEM object is allocated, and then exported as a dma-buf fd which is
mmap'd before or without the GEM buffer being directly mmap'd, the
vma_node could be unitialized.  This leads to a situation where the CPU
mapping is not correctly torn down in drm_vma_node_unmap().

Fixes: e551655399 ("drm: call drm_gem_object_funcs.mmap with fake offset")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20220531200857.136547-1-robdclark@gmail.com
2022-06-01 17:20:08 -07:00
Rob Clark
af0f2a8cc3 Merge tag 'msm-next-5.19-fixes-06-01' of https://gitlab.freedesktop.org/abhinavk/msm into msm-fixes-staging
5.19 fixes for msm-next

- Fix to add minimum ICC vote in the msm_mdss pm_resume path to address
  bootup splats
- Fix to avoid dereferencing without checking in WB encoder
- Fix to avoid crash during suspend in DP driver by ensuring interrupt
  mask bits are updated
- Remove unused code from dpu_encoder_virt_atomic_check()
- Fix to remove redundant init of dsc variable

Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-01 17:12:50 -07:00
Josh Poimboeuf
1dc6ff02c8 x86/speculation/mmio: Print SMT warning
Similar to MDS and TAA, print a warning if SMT is enabled for the MMIO
Stale Data vulnerability.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2022-06-01 10:54:53 +02:00
Pawan Gupta
027bbb884b KVM: x86/speculation: Disable Fill buffer clear within guests
The enumeration of MD_CLEAR in CPUID(EAX=7,ECX=0).EDX{bit 10} is not an
accurate indicator on all CPUs of whether the VERW instruction will
overwrite fill buffers. FB_CLEAR enumeration in
IA32_ARCH_CAPABILITIES{bit 17} covers the case of CPUs that are not
vulnerable to MDS/TAA, indicating that microcode does overwrite fill
buffers.

Guests running in VMM environments may not be aware of all the
capabilities/vulnerabilities of the host CPU. Specifically, a guest may
apply MDS/TAA mitigations when a virtual CPU is enumerated as vulnerable
to MDS/TAA even when the physical CPU is not. On CPUs that enumerate
FB_CLEAR_CTRL the VMM may set FB_CLEAR_DIS to skip overwriting of fill
buffers by the VERW instruction. This is done by setting FB_CLEAR_DIS
during VMENTER and resetting on VMEXIT. For guests that enumerate
FB_CLEAR (explicitly asking for fill buffer clear capability) the VMM
will not use FB_CLEAR_DIS.

Irrespective of guest state, host overwrites CPU buffers before VMENTER
to protect itself from an MMIO capable guest, as part of mitigation for
MMIO Stale Data vulnerabilities.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:41:35 +02:00
Pawan Gupta
a992b8a468 x86/speculation/mmio: Reuse SRBDS mitigation for SBDS
The Shared Buffers Data Sampling (SBDS) variant of Processor MMIO Stale
Data vulnerabilities may expose RDRAND, RDSEED and SGX EGETKEY data.
Mitigation for this is added by a microcode update.

As some of the implications of SBDS are similar to SRBDS, SRBDS mitigation
infrastructure can be leveraged by SBDS. Set X86_BUG_SRBDS and use SRBDS
mitigation.

Mitigation is enabled by default; use srbds=off to opt-out. Mitigation
status can be checked from below file:

  /sys/devices/system/cpu/vulnerabilities/srbds

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:37:25 +02:00
Pawan Gupta
22cac9c677 x86/speculation/srbds: Update SRBDS mitigation selection
Currently, Linux disables SRBDS mitigation on CPUs not affected by
MDS and have the TSX feature disabled. On such CPUs, secrets cannot
be extracted from CPU fill buffers using MDS or TAA. Without SRBDS
mitigation, Processor MMIO Stale Data vulnerabilities can be used to
extract RDRAND, RDSEED, and EGETKEY data.

Do not disable SRBDS mitigation by default when CPU is also affected by
Processor MMIO Stale Data vulnerabilities.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:36:07 +02:00
Pawan Gupta
8d50cdf8b8 x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data
Add the sysfs reporting file for Processor MMIO Stale Data
vulnerability. It exposes the vulnerability and mitigation state similar
to the existing files for the other hardware vulnerabilities.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:16:04 +02:00
Pawan Gupta
99a83db5a6 x86/speculation/mmio: Enable CPU Fill buffer clearing on idle
When the CPU is affected by Processor MMIO Stale Data vulnerabilities,
Fill Buffer Stale Data Propagator (FBSDP) can propagate stale data out
of Fill buffer to uncore buffer when CPU goes idle. Stale data can then
be exploited with other variants using MMIO operations.

Mitigate it by clearing the Fill buffer before entering idle state.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:14:58 +02:00
Pawan Gupta
e5925fb867 x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations
MDS, TAA and Processor MMIO Stale Data mitigations rely on clearing CPU
buffers. Moreover, status of these mitigations affects each other.
During boot, it is important to maintain the order in which these
mitigations are selected. This is especially true for
md_clear_update_mitigation() that needs to be called after MDS, TAA and
Processor MMIO Stale Data mitigation selection is done.

Introduce md_clear_select_mitigation(), and select all these mitigations
from there. This reflects relationships between these mitigations and
ensures proper ordering.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:14:56 +02:00
Pawan Gupta
8cb861e9e3 x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data
Processor MMIO Stale Data is a class of vulnerabilities that may
expose data after an MMIO operation. For details please refer to
Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst.

These vulnerabilities are broadly categorized as:

Device Register Partial Write (DRPW):
  Some endpoint MMIO registers incorrectly handle writes that are
  smaller than the register size. Instead of aborting the write or only
  copying the correct subset of bytes (for example, 2 bytes for a 2-byte
  write), more bytes than specified by the write transaction may be
  written to the register. On some processors, this may expose stale
  data from the fill buffers of the core that created the write
  transaction.

Shared Buffers Data Sampling (SBDS):
  After propagators may have moved data around the uncore and copied
  stale data into client core fill buffers, processors affected by MFBDS
  can leak data from the fill buffer.

Shared Buffers Data Read (SBDR):
  It is similar to Shared Buffer Data Sampling (SBDS) except that the
  data is directly read into the architectural software-visible state.

An attacker can use these vulnerabilities to extract data from CPU fill
buffers using MDS and TAA methods. Mitigate it by clearing the CPU fill
buffers using the VERW instruction before returning to a user or a
guest.

On CPUs not affected by MDS and TAA, user application cannot sample data
from CPU fill buffers using MDS or TAA. A guest with MMIO access can
still use DRPW or SBDR to extract data architecturally. Mitigate it with
VERW instruction to clear fill buffers before VMENTER for MMIO capable
guests.

Add a kernel parameter mmio_stale_data={off|full|full,nosmt} to control
the mitigation.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:14:52 +02:00
Pawan Gupta
f52ea6c269 x86/speculation: Add a common function for MD_CLEAR mitigation update
Processor MMIO Stale Data mitigation uses similar mitigation as MDS and
TAA. In preparation for adding its mitigation, add a common function to
update all mitigations that depend on MD_CLEAR.

  [ bp: Add a newline in md_clear_update_mitigation() to separate
    statements better. ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:14:50 +02:00
Pawan Gupta
5180218615 x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
Processor MMIO Stale Data is a class of vulnerabilities that may
expose data after an MMIO operation. For more details please refer to
Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst

Add the Processor MMIO Stale Data bug enumeration. A microcode update
adds new bits to the MSR IA32_ARCH_CAPABILITIES, define them.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:14:30 +02:00
Pawan Gupta
4419470191 Documentation: Add documentation for Processor MMIO Stale Data
Add the admin guide for Processor MMIO stale data vulnerabilities.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:14:26 +02:00
Lv Ruyi
f8ef475aa0 iio: adc: xilinx-ams: fix return error variable
Return irq instead of ret which always equals to zero here.

Fixes: d5c70627a7 ("iio: adc: Add Xilinx AMS driver")
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Reviewed-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-05-14 14:48:26 +01:00
Linus Walleij
bb52d3691d iio: magnetometer: yas530: Fix memchr_inv() misuse
The call to check if the calibration is all zeroes is doing
it wrong: memchr_inv() returns NULL if the the calibration
contains all zeroes, but the check is for != NULL.

Fix it up. It's probably not an urgent fix because the inner
check for BIT(7) in data[13] will save us. But fix it.

Fixes: de8860b1ed ("iio: magnetometer: Add driver for Yamaha YAS530")
Reported-by: Jakob Hauser <jahau@rocketmail.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220501195029.151852-1-linus.walleij@linaro.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-05-07 15:34:19 +01:00
Hans de Goede
048058399f iio: adc: axp288: Override TS pin bias current for some models
Since commit 9bcf15f75c ("iio: adc: axp288: Fix TS-pin handling") we
preserve the bias current set by the firmware at boot. This fixes issues
we were seeing on various models.

Some models like the Nuvision Solo 10 Draw tablet actually need the
old hardcoded 80ųA bias current for battery temperature monitoring
to work properly.

Add a quirk entry for the Nuvision Solo 10 Draw to the DMI quirk table
to restore setting the bias current to 80ųA on this model.

Fixes: 9bcf15f75c ("iio: adc: axp288: Fix TS-pin handling")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215882
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220506095040.21008-1-hdegoede@redhat.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-05-07 15:23:39 +01:00
Haibo Chen
fe18894930 iio: mma8452: fix probe fail when device tree compatible is used.
Correct the logic for the probe. First check of_match_table, if
not meet, then check i2c_driver.id_table. If both not meet, then
return fail.

Fixes: a47ac019e7 ("iio: mma8452: Fix probe failing when an i2c_device_id is used")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Link: https://lore.kernel.org/r/1650876060-17577-1-git-send-email-haibo.chen@nxp.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-05-01 17:43:25 +01:00
834 changed files with 10632 additions and 6842 deletions

View File

@@ -10,6 +10,8 @@
# Please keep this list dictionary sorted.
#
Aaron Durbin <adurbin@google.com>
Abel Vesa <abelvesa@kernel.org> <abel.vesa@nxp.com>
Abel Vesa <abelvesa@kernel.org> <abelvesa@gmail.com>
Abhinav Kumar <quic_abhinavk@quicinc.com> <abhinavk@codeaurora.org>
Adam Oldham <oldhamca@gmail.com>
Adam Radford <aradford@gmail.com>
@@ -85,6 +87,7 @@ Christian Borntraeger <borntraeger@linux.ibm.com> <borntrae@de.ibm.com>
Christian Brauner <brauner@kernel.org> <christian@brauner.io>
Christian Brauner <brauner@kernel.org> <christian.brauner@canonical.com>
Christian Brauner <brauner@kernel.org> <christian.brauner@ubuntu.com>
Christian Marangi <ansuelsmth@gmail.com>
Christophe Ricard <christophe.ricard@gmail.com>
Christoph Hellwig <hch@lst.de>
Colin Ian King <colin.king@intel.com> <colin.king@canonical.com>
@@ -165,6 +168,7 @@ Jan Glauber <jan.glauber@gmail.com> <jang@de.ibm.com>
Jan Glauber <jan.glauber@gmail.com> <jang@linux.vnet.ibm.com>
Jan Glauber <jan.glauber@gmail.com> <jglauber@cavium.com>
Jarkko Sakkinen <jarkko@kernel.org> <jarkko.sakkinen@linux.intel.com>
Jarkko Sakkinen <jarkko@kernel.org> <jarkko@profian.com>
Jason Gunthorpe <jgg@ziepe.ca> <jgg@mellanox.com>
Jason Gunthorpe <jgg@ziepe.ca> <jgg@nvidia.com>
Jason Gunthorpe <jgg@ziepe.ca> <jgunthorpe@obsidianresearch.com>

View File

@@ -1,4 +1,4 @@
What: /sys/bus/iio/devices/iio:deviceX/conversion_mode
What: /sys/bus/iio/devices/iio:deviceX/in_conversion_mode
KernelVersion: 4.2
Contact: linux-iio@vger.kernel.org
Description:

View File

@@ -526,6 +526,7 @@ What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/srbds
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Information about CPU vulnerabilities

View File

@@ -17,3 +17,4 @@ are configurable at compile, boot or run time.
special-register-buffer-data-sampling.rst
core-scheduling.rst
l1d_flush.rst
processor_mmio_stale_data.rst

View File

@@ -0,0 +1,246 @@
=========================================
Processor MMIO Stale Data Vulnerabilities
=========================================
Processor MMIO Stale Data Vulnerabilities are a class of memory-mapped I/O
(MMIO) vulnerabilities that can expose data. The sequences of operations for
exposing data range from simple to very complex. Because most of the
vulnerabilities require the attacker to have access to MMIO, many environments
are not affected. System environments using virtualization where MMIO access is
provided to untrusted guests may need mitigation. These vulnerabilities are
not transient execution attacks. However, these vulnerabilities may propagate
stale data into core fill buffers where the data can subsequently be inferred
by an unmitigated transient execution attack. Mitigation for these
vulnerabilities includes a combination of microcode update and software
changes, depending on the platform and usage model. Some of these mitigations
are similar to those used to mitigate Microarchitectural Data Sampling (MDS) or
those used to mitigate Special Register Buffer Data Sampling (SRBDS).
Data Propagators
================
Propagators are operations that result in stale data being copied or moved from
one microarchitectural buffer or register to another. Processor MMIO Stale Data
Vulnerabilities are operations that may result in stale data being directly
read into an architectural, software-visible state or sampled from a buffer or
register.
Fill Buffer Stale Data Propagator (FBSDP)
-----------------------------------------
Stale data may propagate from fill buffers (FB) into the non-coherent portion
of the uncore on some non-coherent writes. Fill buffer propagation by itself
does not make stale data architecturally visible. Stale data must be propagated
to a location where it is subject to reading or sampling.
Sideband Stale Data Propagator (SSDP)
-------------------------------------
The sideband stale data propagator (SSDP) is limited to the client (including
Intel Xeon server E3) uncore implementation. The sideband response buffer is
shared by all client cores. For non-coherent reads that go to sideband
destinations, the uncore logic returns 64 bytes of data to the core, including
both requested data and unrequested stale data, from a transaction buffer and
the sideband response buffer. As a result, stale data from the sideband
response and transaction buffers may now reside in a core fill buffer.
Primary Stale Data Propagator (PSDP)
------------------------------------
The primary stale data propagator (PSDP) is limited to the client (including
Intel Xeon server E3) uncore implementation. Similar to the sideband response
buffer, the primary response buffer is shared by all client cores. For some
processors, MMIO primary reads will return 64 bytes of data to the core fill
buffer including both requested data and unrequested stale data. This is
similar to the sideband stale data propagator.
Vulnerabilities
===============
Device Register Partial Write (DRPW) (CVE-2022-21166)
-----------------------------------------------------
Some endpoint MMIO registers incorrectly handle writes that are smaller than
the register size. Instead of aborting the write or only copying the correct
subset of bytes (for example, 2 bytes for a 2-byte write), more bytes than
specified by the write transaction may be written to the register. On
processors affected by FBSDP, this may expose stale data from the fill buffers
of the core that created the write transaction.
Shared Buffers Data Sampling (SBDS) (CVE-2022-21125)
----------------------------------------------------
After propagators may have moved data around the uncore and copied stale data
into client core fill buffers, processors affected by MFBDS can leak data from
the fill buffer. It is limited to the client (including Intel Xeon server E3)
uncore implementation.
Shared Buffers Data Read (SBDR) (CVE-2022-21123)
------------------------------------------------
It is similar to Shared Buffer Data Sampling (SBDS) except that the data is
directly read into the architectural software-visible state. It is limited to
the client (including Intel Xeon server E3) uncore implementation.
Affected Processors
===================
Not all the CPUs are affected by all the variants. For instance, most
processors for the server market (excluding Intel Xeon E3 processors) are
impacted by only Device Register Partial Write (DRPW).
Below is the list of affected Intel processors [#f1]_:
=================== ============ =========
Common name Family_Model Steppings
=================== ============ =========
HASWELL_X 06_3FH 2,4
SKYLAKE_L 06_4EH 3
BROADWELL_X 06_4FH All
SKYLAKE_X 06_55H 3,4,6,7,11
BROADWELL_D 06_56H 3,4,5
SKYLAKE 06_5EH 3
ICELAKE_X 06_6AH 4,5,6
ICELAKE_D 06_6CH 1
ICELAKE_L 06_7EH 5
ATOM_TREMONT_D 06_86H All
LAKEFIELD 06_8AH 1
KABYLAKE_L 06_8EH 9 to 12
ATOM_TREMONT 06_96H 1
ATOM_TREMONT_L 06_9CH 0
KABYLAKE 06_9EH 9 to 13
COMETLAKE 06_A5H 2,3,5
COMETLAKE_L 06_A6H 0,1
ROCKETLAKE 06_A7H 1
=================== ============ =========
If a CPU is in the affected processor list, but not affected by a variant, it
is indicated by new bits in MSR IA32_ARCH_CAPABILITIES. As described in a later
section, mitigation largely remains the same for all the variants, i.e. to
clear the CPU fill buffers via VERW instruction.
New bits in MSRs
================
Newer processors and microcode update on existing affected processors added new
bits to IA32_ARCH_CAPABILITIES MSR. These bits can be used to enumerate
specific variants of Processor MMIO Stale Data vulnerabilities and mitigation
capability.
MSR IA32_ARCH_CAPABILITIES
--------------------------
Bit 13 - SBDR_SSDP_NO - When set, processor is not affected by either the
Shared Buffers Data Read (SBDR) vulnerability or the sideband stale
data propagator (SSDP).
Bit 14 - FBSDP_NO - When set, processor is not affected by the Fill Buffer
Stale Data Propagator (FBSDP).
Bit 15 - PSDP_NO - When set, processor is not affected by Primary Stale Data
Propagator (PSDP).
Bit 17 - FB_CLEAR - When set, VERW instruction will overwrite CPU fill buffer
values as part of MD_CLEAR operations. Processors that do not
enumerate MDS_NO (meaning they are affected by MDS) but that do
enumerate support for both L1D_FLUSH and MD_CLEAR implicitly enumerate
FB_CLEAR as part of their MD_CLEAR support.
Bit 18 - FB_CLEAR_CTRL - Processor supports read and write to MSR
IA32_MCU_OPT_CTRL[FB_CLEAR_DIS]. On such processors, the FB_CLEAR_DIS
bit can be set to cause the VERW instruction to not perform the
FB_CLEAR action. Not all processors that support FB_CLEAR will support
FB_CLEAR_CTRL.
MSR IA32_MCU_OPT_CTRL
---------------------
Bit 3 - FB_CLEAR_DIS - When set, VERW instruction does not perform the FB_CLEAR
action. This may be useful to reduce the performance impact of FB_CLEAR in
cases where system software deems it warranted (for example, when performance
is more critical, or the untrusted software has no MMIO access). Note that
FB_CLEAR_DIS has no impact on enumeration (for example, it does not change
FB_CLEAR or MD_CLEAR enumeration) and it may not be supported on all processors
that enumerate FB_CLEAR.
Mitigation
==========
Like MDS, all variants of Processor MMIO Stale Data vulnerabilities have the
same mitigation strategy to force the CPU to clear the affected buffers before
an attacker can extract the secrets.
This is achieved by using the otherwise unused and obsolete VERW instruction in
combination with a microcode update. The microcode clears the affected CPU
buffers when the VERW instruction is executed.
Kernel reuses the MDS function to invoke the buffer clearing:
mds_clear_cpu_buffers()
On MDS affected CPUs, the kernel already invokes CPU buffer clear on
kernel/userspace, hypervisor/guest and C-state (idle) transitions. No
additional mitigation is needed on such CPUs.
For CPUs not affected by MDS or TAA, mitigation is needed only for the attacker
with MMIO capability. Therefore, VERW is not required for kernel/userspace. For
virtualization case, VERW is only needed at VMENTER for a guest with MMIO
capability.
Mitigation points
-----------------
Return to user space
^^^^^^^^^^^^^^^^^^^^
Same mitigation as MDS when affected by MDS/TAA, otherwise no mitigation
needed.
C-State transition
^^^^^^^^^^^^^^^^^^
Control register writes by CPU during C-state transition can propagate data
from fill buffer to uncore buffers. Execute VERW before C-state transition to
clear CPU fill buffers.
Guest entry point
^^^^^^^^^^^^^^^^^
Same mitigation as MDS when processor is also affected by MDS/TAA, otherwise
execute VERW at VMENTER only for MMIO capable guests. On CPUs not affected by
MDS/TAA, guest without MMIO access cannot extract secrets using Processor MMIO
Stale Data vulnerabilities, so there is no need to execute VERW for such guests.
Mitigation control on the kernel command line
---------------------------------------------
The kernel command line allows to control the Processor MMIO Stale Data
mitigations at boot time with the option "mmio_stale_data=". The valid
arguments for this option are:
========== =================================================================
full If the CPU is vulnerable, enable mitigation; CPU buffer clearing
on exit to userspace and when entering a VM. Idle transitions are
protected as well. It does not automatically disable SMT.
full,nosmt Same as full, with SMT disabled on vulnerable CPUs. This is the
complete mitigation.
off Disables mitigation completely.
========== =================================================================
If the CPU is affected and mmio_stale_data=off is not supplied on the kernel
command line, then the kernel selects the appropriate mitigation.
Mitigation status information
-----------------------------
The Linux kernel provides a sysfs interface to enumerate the current
vulnerability status of the system: whether the system is vulnerable, and
which mitigations are active. The relevant sysfs file is:
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
The possible values in this file are:
.. list-table::
* - 'Not affected'
- The processor is not vulnerable
* - 'Vulnerable'
- The processor is vulnerable, but no mitigation enabled
* - 'Vulnerable: Clear CPU buffers attempted, no microcode'
- The processor is vulnerable, but microcode is not updated. The
mitigation is enabled on a best effort basis.
* - 'Mitigation: Clear CPU buffers'
- The processor is vulnerable and the CPU buffer clearing mitigation is
enabled.
If the processor is vulnerable then the following information is appended to
the above information:
======================== ===========================================
'SMT vulnerable' SMT is enabled
'SMT disabled' SMT is disabled
'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown
======================== ===========================================
References
----------
.. [#f1] Affected Processors
https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html

View File

@@ -2469,7 +2469,6 @@
protected: nVHE-based mode with support for guests whose
state is kept private from the host.
Not valid if the kernel is running in EL2.
Defaults to VHE/nVHE based on hardware support. Setting
mode to "protected" will disable kexec and hibernation
@@ -3176,6 +3175,7 @@
srbds=off [X86,INTEL]
no_entry_flush [PPC]
no_uaccess_flush [PPC]
mmio_stale_data=off [X86]
Exceptions:
This does not have any effect on
@@ -3197,6 +3197,7 @@
Equivalent to: l1tf=flush,nosmt [X86]
mds=full,nosmt [X86]
tsx_async_abort=full,nosmt [X86]
mmio_stale_data=full,nosmt [X86]
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
@@ -3206,6 +3207,40 @@
log everything. Information is printed at KERN_DEBUG
so loglevel=8 may also need to be specified.
mmio_stale_data=
[X86,INTEL] Control mitigation for the Processor
MMIO Stale Data vulnerabilities.
Processor MMIO Stale Data is a class of
vulnerabilities that may expose data after an MMIO
operation. Exposed data could originate or end in
the same CPU buffers as affected by MDS and TAA.
Therefore, similar to MDS and TAA, the mitigation
is to clear the affected CPU buffers.
This parameter controls the mitigation. The
options are:
full - Enable mitigation on vulnerable CPUs
full,nosmt - Enable mitigation and disable SMT on
vulnerable CPUs.
off - Unconditionally disable mitigation
On MDS or TAA affected machines,
mmio_stale_data=off can be prevented by an active
MDS or TAA mitigation as these vulnerabilities are
mitigated with the same mechanism so in order to
disable this mitigation, you need to specify
mds=off and tsx_async_abort=off too.
Not specifying this option is equivalent to
mmio_stale_data=full.
For details see:
Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
module.sig_enforce
[KNL] When CONFIG_MODULE_SIG is set, this means that
modules without (valid) signatures will fail to load.

View File

@@ -40,9 +40,8 @@ properties:
value to be used for converting remote channel measurements to
temperature.
$ref: /schemas/types.yaml#/definitions/int32
items:
minimum: -128
maximum: 127
minimum: -128
maximum: 127
ti,beta-compensation:
description:

View File

@@ -30,6 +30,7 @@ properties:
- socionext,uniphier-ld11-aidet
- socionext,uniphier-ld20-aidet
- socionext,uniphier-pxs3-aidet
- socionext,uniphier-nx1-aidet
reg:
maxItems: 1

View File

@@ -47,6 +47,5 @@ examples:
clocks = <&clkcfg CLK_SPI0>;
interrupt-parent = <&plic>;
interrupts = <54>;
spi-max-frequency = <25000000>;
};
...

View File

@@ -110,7 +110,6 @@ examples:
pinctrl-names = "default";
pinctrl-0 = <&qup_spi1_default>;
interrupts = <GIC_SPI 602 IRQ_TYPE_LEVEL_HIGH>;
spi-max-frequency = <50000000>;
#address-cells = <1>;
#size-cells = <0>;
};

View File

@@ -136,7 +136,8 @@ properties:
Phandle of a companion.
phys:
maxItems: 1
minItems: 1
maxItems: 3
phy-names:
const: usb

View File

@@ -103,7 +103,8 @@ properties:
Overrides the detected port count
phys:
maxItems: 1
minItems: 1
maxItems: 3
phy-names:
const: usb

View File

@@ -13,6 +13,12 @@ EDD Interfaces
.. kernel-doc:: drivers/firmware/edd.c
:internal:
Generic System Framebuffers Interface
-------------------------------------
.. kernel-doc:: drivers/firmware/sysfb.c
:export:
Intel Stratix10 SoC Service Layer
---------------------------------
Some features of the Intel Stratix10 SoC require a level of privilege

View File

@@ -6,7 +6,7 @@ This document explains how GPIOs can be assigned to given devices and functions.
Note that it only applies to the new descriptor-based interface. For a
description of the deprecated integer-based GPIO interface please refer to
gpio-legacy.txt (actually, there is no real mapping possible with the old
legacy.rst (actually, there is no real mapping possible with the old
interface; you just fetch an integer from somewhere and request the
corresponding GPIO).

View File

@@ -4,7 +4,7 @@ GPIO Descriptor Consumer Interface
This document describes the consumer interface of the GPIO framework. Note that
it describes the new descriptor-based interface. For a description of the
deprecated integer-based GPIO interface please refer to gpio-legacy.txt.
deprecated integer-based GPIO interface please refer to legacy.rst.
Guidelines for GPIOs consumers
@@ -78,7 +78,7 @@ whether the line is configured active high or active low (see
The two last flags are used for use cases where open drain is mandatory, such
as I2C: if the line is not already configured as open drain in the mappings
(see board.txt), then open drain will be enforced anyway and a warning will be
(see board.rst), then open drain will be enforced anyway and a warning will be
printed that the board configuration needs to be updated to match the use case.
Both functions return either a valid GPIO descriptor, or an error code checkable
@@ -270,7 +270,7 @@ driven.
The same is applicable for open drain or open source output lines: those do not
actively drive their output high (open drain) or low (open source), they just
switch their output to a high impedance value. The consumer should not need to
care. (For details read about open drain in driver.txt.)
care. (For details read about open drain in driver.rst.)
With this, all the gpiod_set_(array)_value_xxx() functions interpret the
parameter "value" as "asserted" ("1") or "de-asserted" ("0"). The physical line

View File

@@ -14,12 +14,12 @@ Due to the history of GPIO interfaces in the kernel, there are two different
ways to obtain and use GPIOs:
- The descriptor-based interface is the preferred way to manipulate GPIOs,
and is described by all the files in this directory excepted gpio-legacy.txt.
and is described by all the files in this directory excepted legacy.rst.
- The legacy integer-based interface which is considered deprecated (but still
usable for compatibility reasons) is documented in gpio-legacy.txt.
usable for compatibility reasons) is documented in legacy.rst.
The remainder of this document applies to the new descriptor-based interface.
gpio-legacy.txt contains the same information applied to the legacy
legacy.rst contains the same information applied to the legacy
integer-based interface.

View File

@@ -19,13 +19,23 @@ The main Btrfs features include:
* Subvolumes (separate internal filesystem roots)
* Object level mirroring and striping
* Checksums on data and metadata (multiple algorithms available)
* Compression
* Compression (multiple algorithms available)
* Reflink, deduplication
* Scrub (on-line checksum verification)
* Hierarchical quota groups (subvolume and snapshot support)
* Integrated multiple device support, with several raid algorithms
* Offline filesystem check
* Efficient incremental backup and FS mirroring
* Efficient incremental backup and FS mirroring (send/receive)
* Trim/discard
* Online filesystem defragmentation
* Swapfile support
* Zoned mode
* Read/write metadata verification
* Online resize (shrink, grow)
For more information please refer to the wiki
For more information please refer to the documentation site or wiki
https://btrfs.readthedocs.io
https://btrfs.wiki.kernel.org

View File

@@ -13,8 +13,8 @@ disappeared as of Linux 3.0.
There are two places where extended attributes can be found. The first
place is between the end of each inode entry and the beginning of the
next inode entry. For example, if inode.i\_extra\_isize = 28 and
sb.inode\_size = 256, then there are 256 - (128 + 28) = 100 bytes
next inode entry. For example, if inode.i_extra_isize = 28 and
sb.inode_size = 256, then there are 256 - (128 + 28) = 100 bytes
available for in-inode extended attribute storage. The second place
where extended attributes can be found is in the block pointed to by
``inode.i_file_acl``. As of Linux 3.11, it is not possible for this
@@ -38,8 +38,8 @@ Extended attributes, when stored after the inode, have a header
- Name
- Description
* - 0x0
- \_\_le32
- h\_magic
- __le32
- h_magic
- Magic number for identification, 0xEA020000. This value is set by the
Linux driver, though e2fsprogs doesn't seem to check it(?)
@@ -55,28 +55,28 @@ The beginning of an extended attribute block is in
- Name
- Description
* - 0x0
- \_\_le32
- h\_magic
- __le32
- h_magic
- Magic number for identification, 0xEA020000.
* - 0x4
- \_\_le32
- h\_refcount
- __le32
- h_refcount
- Reference count.
* - 0x8
- \_\_le32
- h\_blocks
- __le32
- h_blocks
- Number of disk blocks used.
* - 0xC
- \_\_le32
- h\_hash
- __le32
- h_hash
- Hash value of all attributes.
* - 0x10
- \_\_le32
- h\_checksum
- __le32
- h_checksum
- Checksum of the extended attribute block.
* - 0x14
- \_\_u32
- h\_reserved[3]
- __u32
- h_reserved[3]
- Zero.
The checksum is calculated against the FS UUID, the 64-bit block number
@@ -100,46 +100,46 @@ Attributes stored inside an inode do not need be stored in sorted order.
- Name
- Description
* - 0x0
- \_\_u8
- e\_name\_len
- __u8
- e_name_len
- Length of name.
* - 0x1
- \_\_u8
- e\_name\_index
- __u8
- e_name_index
- Attribute name index. There is a discussion of this below.
* - 0x2
- \_\_le16
- e\_value\_offs
- __le16
- e_value_offs
- Location of this attribute's value on the disk block where it is stored.
Multiple attributes can share the same value. For an inode attribute
this value is relative to the start of the first entry; for a block this
value is relative to the start of the block (i.e. the header).
* - 0x4
- \_\_le32
- e\_value\_inum
- __le32
- e_value_inum
- The inode where the value is stored. Zero indicates the value is in the
same block as this entry. This field is only used if the
INCOMPAT\_EA\_INODE feature is enabled.
INCOMPAT_EA_INODE feature is enabled.
* - 0x8
- \_\_le32
- e\_value\_size
- __le32
- e_value_size
- Length of attribute value.
* - 0xC
- \_\_le32
- e\_hash
- __le32
- e_hash
- Hash value of attribute name and attribute value. The kernel doesn't
update the hash for in-inode attributes, so for that case this value
must be zero, because e2fsck validates any non-zero hash regardless of
where the xattr lives.
* - 0x10
- char
- e\_name[e\_name\_len]
- e_name[e_name_len]
- Attribute name. Does not include trailing NULL.
Attribute values can follow the end of the entry table. There appears to
be a requirement that they be aligned to 4-byte boundaries. The values
are stored starting at the end of the block and grow towards the
xattr\_header/xattr\_entry table. When the two collide, the overflow is
xattr_header/xattr_entry table. When the two collide, the overflow is
put into a separate disk block. If the disk block fills up, the
filesystem returns -ENOSPC.
@@ -167,15 +167,15 @@ the key name. Here is a map of name index values to key prefixes:
* - 1
- “user.”
* - 2
- “system.posix\_acl\_access”
- “system.posix_acl_access”
* - 3
- “system.posix\_acl\_default”
- “system.posix_acl_default”
* - 4
- “trusted.”
* - 6
- “security.”
* - 7
- “system.” (inline\_data only?)
- “system.” (inline_data only?)
* - 8
- “system.richacl” (SuSE kernels only?)

View File

@@ -23,7 +23,7 @@ means that a block group addresses 32 gigabytes instead of 128 megabytes,
also shrinking the amount of file system overhead for metadata.
The administrator can set a block cluster size at mkfs time (which is
stored in the s\_log\_cluster\_size field in the superblock); from then
stored in the s_log_cluster_size field in the superblock); from then
on, the block bitmaps track clusters, not individual blocks. This means
that block groups can be several gigabytes in size (instead of just
128MiB); however, the minimum allocation unit becomes a cluster, not a

View File

@@ -9,15 +9,15 @@ group.
The inode bitmap records which entries in the inode table are in use.
As with most bitmaps, one bit represents the usage status of one data
block or inode table entry. This implies a block group size of 8 \*
number\_of\_bytes\_in\_a\_logical\_block.
block or inode table entry. This implies a block group size of 8 *
number_of_bytes_in_a_logical_block.
NOTE: If ``BLOCK_UNINIT`` is set for a given block group, various parts
of the kernel and e2fsprogs code pretends that the block bitmap contains
zeros (i.e. all blocks in the group are free). However, it is not
necessarily the case that no blocks are in use -- if ``meta_bg`` is set,
the bitmaps and group descriptor live inside the group. Unfortunately,
ext2fs\_test\_block\_bitmap2() will return '0' for those locations,
ext2fs_test_block_bitmap2() will return '0' for those locations,
which produces confusing debugfs output.
Inode Table

View File

@@ -56,39 +56,39 @@ established that the super block and the group descriptor table, if
present, will be at the beginning of the block group. The bitmaps and
the inode table can be anywhere, and it is quite possible for the
bitmaps to come after the inode table, or for both to be in different
groups (flex\_bg). Leftover space is used for file data blocks, indirect
groups (flex_bg). Leftover space is used for file data blocks, indirect
block maps, extent tree blocks, and extended attributes.
Flexible Block Groups
---------------------
Starting in ext4, there is a new feature called flexible block groups
(flex\_bg). In a flex\_bg, several block groups are tied together as one
(flex_bg). In a flex_bg, several block groups are tied together as one
logical block group; the bitmap spaces and the inode table space in the
first block group of the flex\_bg are expanded to include the bitmaps
and inode tables of all other block groups in the flex\_bg. For example,
if the flex\_bg size is 4, then group 0 will contain (in order) the
first block group of the flex_bg are expanded to include the bitmaps
and inode tables of all other block groups in the flex_bg. For example,
if the flex_bg size is 4, then group 0 will contain (in order) the
superblock, group descriptors, data block bitmaps for groups 0-3, inode
bitmaps for groups 0-3, inode tables for groups 0-3, and the remaining
space in group 0 is for file data. The effect of this is to group the
block group metadata close together for faster loading, and to enable
large files to be continuous on disk. Backup copies of the superblock
and group descriptors are always at the beginning of block groups, even
if flex\_bg is enabled. The number of block groups that make up a
flex\_bg is given by 2 ^ ``sb.s_log_groups_per_flex``.
if flex_bg is enabled. The number of block groups that make up a
flex_bg is given by 2 ^ ``sb.s_log_groups_per_flex``.
Meta Block Groups
-----------------
Without the option META\_BG, for safety concerns, all block group
Without the option META_BG, for safety concerns, all block group
descriptors copies are kept in the first block group. Given the default
128MiB(2^27 bytes) block group size and 64-byte group descriptors, ext4
can have at most 2^27/64 = 2^21 block groups. This limits the entire
filesystem size to 2^21 * 2^27 = 2^48bytes or 256TiB.
The solution to this problem is to use the metablock group feature
(META\_BG), which is already in ext3 for all 2.6 releases. With the
META\_BG feature, ext4 filesystems are partitioned into many metablock
(META_BG), which is already in ext3 for all 2.6 releases. With the
META_BG feature, ext4 filesystems are partitioned into many metablock
groups. Each metablock group is a cluster of block groups whose group
descriptor structures can be stored in a single disk block. For ext4
filesystems with 4 KB block size, a single metablock group partition
@@ -110,7 +110,7 @@ bytes, a meta-block group contains 32 block groups for filesystems with
a 1KB block size, and 128 block groups for filesystems with a 4KB
blocksize. Filesystems can either be created using this new block group
descriptor layout, or existing filesystems can be resized on-line, and
the field s\_first\_meta\_bg in the superblock will indicate the first
the field s_first_meta_bg in the superblock will indicate the first
block group using this new layout.
Please see an important note about ``BLOCK_UNINIT`` in the section about
@@ -121,15 +121,15 @@ Lazy Block Group Initialization
A new feature for ext4 are three block group descriptor flags that
enable mkfs to skip initializing other parts of the block group
metadata. Specifically, the INODE\_UNINIT and BLOCK\_UNINIT flags mean
metadata. Specifically, the INODE_UNINIT and BLOCK_UNINIT flags mean
that the inode and block bitmaps for that group can be calculated and
therefore the on-disk bitmap blocks are not initialized. This is
generally the case for an empty block group or a block group containing
only fixed-location block group metadata. The INODE\_ZEROED flag means
only fixed-location block group metadata. The INODE_ZEROED flag means
that the inode table has been initialized; mkfs will unset this flag and
rely on the kernel to initialize the inode tables in the background.
By not writing zeroes to the bitmaps and inode table, mkfs time is
reduced considerably. Note the feature flag is RO\_COMPAT\_GDT\_CSUM,
but the dumpe2fs output prints this as “uninit\_bg”. They are the same
reduced considerably. Note the feature flag is RO_COMPAT_GDT_CSUM,
but the dumpe2fs output prints this as “uninit_bg”. They are the same
thing.

View File

@@ -1,7 +1,7 @@
.. SPDX-License-Identifier: GPL-2.0
+---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| i.i\_block Offset | Where It Points |
| i.i_block Offset | Where It Points |
+=====================+==============================================================================================================================================================================================================================+
| 0 to 11 | Direct map to file blocks 0 to 11. |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

View File

@@ -4,7 +4,7 @@ Checksums
---------
Starting in early 2012, metadata checksums were added to all major ext4
and jbd2 data structures. The associated feature flag is metadata\_csum.
and jbd2 data structures. The associated feature flag is metadata_csum.
The desired checksum algorithm is indicated in the superblock, though as
of October 2012 the only supported algorithm is crc32c. Some data
structures did not have space to fit a full 32-bit checksum, so only the
@@ -20,7 +20,7 @@ encounters directory blocks that lack sufficient empty space to add a
checksum, it will request that you run ``e2fsck -D`` to have the
directories rebuilt with checksums. This has the added benefit of
removing slack space from the directory files and rebalancing the htree
indexes. If you \_ignore\_ this step, your directories will not be
indexes. If you _ignore_ this step, your directories will not be
protected by a checksum!
The following table describes the data elements that go into each type
@@ -35,39 +35,39 @@ of checksum. The checksum function is whatever the superblock describes
- Length
- Ingredients
* - Superblock
- \_\_le32
- __le32
- The entire superblock up to the checksum field. The UUID lives inside
the superblock.
* - MMP
- \_\_le32
- __le32
- UUID + the entire MMP block up to the checksum field.
* - Extended Attributes
- \_\_le32
- __le32
- UUID + the entire extended attribute block. The checksum field is set to
zero.
* - Directory Entries
- \_\_le32
- __le32
- UUID + inode number + inode generation + the directory block up to the
fake entry enclosing the checksum field.
* - HTREE Nodes
- \_\_le32
- __le32
- UUID + inode number + inode generation + all valid extents + HTREE tail.
The checksum field is set to zero.
* - Extents
- \_\_le32
- __le32
- UUID + inode number + inode generation + the entire extent block up to
the checksum field.
* - Bitmaps
- \_\_le32 or \_\_le16
- __le32 or __le16
- UUID + the entire bitmap. Checksums are stored in the group descriptor,
and truncated if the group descriptor size is 32 bytes (i.e. ^64bit)
* - Inodes
- \_\_le32
- __le32
- UUID + inode number + inode generation + the entire inode. The checksum
field is set to zero. Each inode has its own checksum.
* - Group Descriptors
- \_\_le16
- If metadata\_csum, then UUID + group number + the entire descriptor;
else if gdt\_csum, then crc16(UUID + group number + the entire
- __le16
- If metadata_csum, then UUID + group number + the entire descriptor;
else if gdt_csum, then crc16(UUID + group number + the entire
descriptor). In all cases, only the lower 16 bits are stored.

View File

@@ -42,24 +42,24 @@ is at most 263 bytes long, though on disk you'll need to reference
- Name
- Description
* - 0x0
- \_\_le32
- __le32
- inode
- Number of the inode that this directory entry points to.
* - 0x4
- \_\_le16
- rec\_len
- __le16
- rec_len
- Length of this directory entry. Must be a multiple of 4.
* - 0x6
- \_\_le16
- name\_len
- __le16
- name_len
- Length of the file name.
* - 0x8
- char
- name[EXT4\_NAME\_LEN]
- name[EXT4_NAME_LEN]
- File name.
Since file names cannot be longer than 255 bytes, the new directory
entry format shortens the name\_len field and uses the space for a file
entry format shortens the name_len field and uses the space for a file
type flag, probably to avoid having to load every inode during directory
tree traversal. This format is ``ext4_dir_entry_2``, which is at most
263 bytes long, though on disk you'll need to reference
@@ -74,24 +74,24 @@ tree traversal. This format is ``ext4_dir_entry_2``, which is at most
- Name
- Description
* - 0x0
- \_\_le32
- __le32
- inode
- Number of the inode that this directory entry points to.
* - 0x4
- \_\_le16
- rec\_len
- __le16
- rec_len
- Length of this directory entry.
* - 0x6
- \_\_u8
- name\_len
- __u8
- name_len
- Length of the file name.
* - 0x7
- \_\_u8
- file\_type
- __u8
- file_type
- File type code, see ftype_ table below.
* - 0x8
- char
- name[EXT4\_NAME\_LEN]
- name[EXT4_NAME_LEN]
- File name.
.. _ftype:
@@ -137,19 +137,19 @@ entry uses this extension, it may be up to 271 bytes.
- Name
- Description
* - 0x0
- \_\_le32
- __le32
- hash
- The hash of the directory name
* - 0x4
- \_\_le32
- minor\_hash
- __le32
- minor_hash
- The minor hash of the directory name
In order to add checksums to these classic directory blocks, a phony
``struct ext4_dir_entry`` is placed at the end of each leaf block to
hold the checksum. The directory entry is 12 bytes long. The inode
number and name\_len fields are set to zero to fool old software into
number and name_len fields are set to zero to fool old software into
ignoring an apparently empty directory entry, and the checksum is stored
in the place where the name normally goes. The structure is
``struct ext4_dir_entry_tail``:
@@ -163,24 +163,24 @@ in the place where the name normally goes. The structure is
- Name
- Description
* - 0x0
- \_\_le32
- det\_reserved\_zero1
- __le32
- det_reserved_zero1
- Inode number, which must be zero.
* - 0x4
- \_\_le16
- det\_rec\_len
- __le16
- det_rec_len
- Length of this directory entry, which must be 12.
* - 0x6
- \_\_u8
- det\_reserved\_zero2
- __u8
- det_reserved_zero2
- Length of the file name, which must be zero.
* - 0x7
- \_\_u8
- det\_reserved\_ft
- __u8
- det_reserved_ft
- File type, which must be 0xDE.
* - 0x8
- \_\_le32
- det\_checksum
- __le32
- det_checksum
- Directory leaf block checksum.
The leaf directory block checksum is calculated against the FS UUID, the
@@ -194,7 +194,7 @@ Hash Tree Directories
A linear array of directory entries isn't great for performance, so a
new feature was added to ext3 to provide a faster (but peculiar)
balanced tree keyed off a hash of the directory entry name. If the
EXT4\_INDEX\_FL (0x1000) flag is set in the inode, this directory uses a
EXT4_INDEX_FL (0x1000) flag is set in the inode, this directory uses a
hashed btree (htree) to organize and find directory entries. For
backwards read-only compatibility with ext2, this tree is actually
hidden inside the directory file, masquerading as “empty” directory data
@@ -206,14 +206,14 @@ rest of the directory block is empty so that it moves on.
The root of the tree always lives in the first data block of the
directory. By ext2 custom, the '.' and '..' entries must appear at the
beginning of this first block, so they are put here as two
``struct ext4_dir_entry_2``\ s and not stored in the tree. The rest of
``struct ext4_dir_entry_2`` s and not stored in the tree. The rest of
the root node contains metadata about the tree and finally a hash->block
map to find nodes that are lower in the htree. If
``dx_root.info.indirect_levels`` is non-zero then the htree has two
levels; the data block pointed to by the root node's map is an interior
node, which is indexed by a minor hash. Interior nodes in this tree
contains a zeroed out ``struct ext4_dir_entry_2`` followed by a
minor\_hash->block map to find leafe nodes. Leaf nodes contain a linear
minor_hash->block map to find leafe nodes. Leaf nodes contain a linear
array of all ``struct ext4_dir_entry_2``; all of these entries
(presumably) hash to the same value. If there is an overflow, the
entries simply overflow into the next leaf node, and the
@@ -245,83 +245,83 @@ of a data block:
- Name
- Description
* - 0x0
- \_\_le32
- __le32
- dot.inode
- inode number of this directory.
* - 0x4
- \_\_le16
- dot.rec\_len
- __le16
- dot.rec_len
- Length of this record, 12.
* - 0x6
- u8
- dot.name\_len
- dot.name_len
- Length of the name, 1.
* - 0x7
- u8
- dot.file\_type
- dot.file_type
- File type of this entry, 0x2 (directory) (if the feature flag is set).
* - 0x8
- char
- dot.name[4]
- “.\\0\\0\\0”
- “.\0\0\0”
* - 0xC
- \_\_le32
- __le32
- dotdot.inode
- inode number of parent directory.
* - 0x10
- \_\_le16
- dotdot.rec\_len
- block\_size - 12. The record length is long enough to cover all htree
- __le16
- dotdot.rec_len
- block_size - 12. The record length is long enough to cover all htree
data.
* - 0x12
- u8
- dotdot.name\_len
- dotdot.name_len
- Length of the name, 2.
* - 0x13
- u8
- dotdot.file\_type
- dotdot.file_type
- File type of this entry, 0x2 (directory) (if the feature flag is set).
* - 0x14
- char
- dotdot\_name[4]
- “..\\0\\0”
- dotdot_name[4]
- “..\0\0”
* - 0x18
- \_\_le32
- struct dx\_root\_info.reserved\_zero
- __le32
- struct dx_root_info.reserved_zero
- Zero.
* - 0x1C
- u8
- struct dx\_root\_info.hash\_version
- struct dx_root_info.hash_version
- Hash type, see dirhash_ table below.
* - 0x1D
- u8
- struct dx\_root\_info.info\_length
- struct dx_root_info.info_length
- Length of the tree information, 0x8.
* - 0x1E
- u8
- struct dx\_root\_info.indirect\_levels
- Depth of the htree. Cannot be larger than 3 if the INCOMPAT\_LARGEDIR
- struct dx_root_info.indirect_levels
- Depth of the htree. Cannot be larger than 3 if the INCOMPAT_LARGEDIR
feature is set; cannot be larger than 2 otherwise.
* - 0x1F
- u8
- struct dx\_root\_info.unused\_flags
- struct dx_root_info.unused_flags
-
* - 0x20
- \_\_le16
- __le16
- limit
- Maximum number of dx\_entries that can follow this header, plus 1 for
- Maximum number of dx_entries that can follow this header, plus 1 for
the header itself.
* - 0x22
- \_\_le16
- __le16
- count
- Actual number of dx\_entries that follow this header, plus 1 for the
- Actual number of dx_entries that follow this header, plus 1 for the
header itself.
* - 0x24
- \_\_le32
- __le32
- block
- The block number (within the directory file) that goes with hash=0.
* - 0x28
- struct dx\_entry
- struct dx_entry
- entries[0]
- As many 8-byte ``struct dx_entry`` as fits in the rest of the data block.
@@ -362,38 +362,38 @@ also the full length of a data block:
- Name
- Description
* - 0x0
- \_\_le32
- __le32
- fake.inode
- Zero, to make it look like this entry is not in use.
* - 0x4
- \_\_le16
- fake.rec\_len
- The size of the block, in order to hide all of the dx\_node data.
- __le16
- fake.rec_len
- The size of the block, in order to hide all of the dx_node data.
* - 0x6
- u8
- name\_len
- name_len
- Zero. There is no name for this “unused” directory entry.
* - 0x7
- u8
- file\_type
- file_type
- Zero. There is no file type for this “unused” directory entry.
* - 0x8
- \_\_le16
- __le16
- limit
- Maximum number of dx\_entries that can follow this header, plus 1 for
- Maximum number of dx_entries that can follow this header, plus 1 for
the header itself.
* - 0xA
- \_\_le16
- __le16
- count
- Actual number of dx\_entries that follow this header, plus 1 for the
- Actual number of dx_entries that follow this header, plus 1 for the
header itself.
* - 0xE
- \_\_le32
- __le32
- block
- The block number (within the directory file) that goes with the lowest
hash value of this block. This value is stored in the parent block.
* - 0x12
- struct dx\_entry
- struct dx_entry
- entries[0]
- As many 8-byte ``struct dx_entry`` as fits in the rest of the data block.
@@ -410,11 +410,11 @@ long:
- Name
- Description
* - 0x0
- \_\_le32
- __le32
- hash
- Hash code.
* - 0x4
- \_\_le32
- __le32
- block
- Block number (within the directory file, not filesystem blocks) of the
next node in the htree.
@@ -423,13 +423,13 @@ long:
author.)
If metadata checksums are enabled, the last 8 bytes of the directory
block (precisely the length of one dx\_entry) are used to store a
block (precisely the length of one dx_entry) are used to store a
``struct dx_tail``, which contains the checksum. The ``limit`` and
``count`` entries in the dx\_root/dx\_node structures are adjusted as
necessary to fit the dx\_tail into the block. If there is no space for
the dx\_tail, the user is notified to run e2fsck -D to rebuild the
``count`` entries in the dx_root/dx_node structures are adjusted as
necessary to fit the dx_tail into the block. If there is no space for
the dx_tail, the user is notified to run e2fsck -D to rebuild the
directory index (which will ensure that there's space for the checksum.
The dx\_tail structure is 8 bytes long and looks like this:
The dx_tail structure is 8 bytes long and looks like this:
.. list-table::
:widths: 8 8 24 40
@@ -441,13 +441,13 @@ The dx\_tail structure is 8 bytes long and looks like this:
- Description
* - 0x0
- u32
- dt\_reserved
- dt_reserved
- Zero.
* - 0x4
- \_\_le32
- dt\_checksum
- __le32
- dt_checksum
- Checksum of the htree directory block.
The checksum is calculated against the FS UUID, the htree index header
(dx\_root or dx\_node), all of the htree indices (dx\_entry) that are in
use, and the tail block (dx\_tail).
(dx_root or dx_node), all of the htree indices (dx_entry) that are in
use, and the tail block (dx_tail).

View File

@@ -5,14 +5,14 @@ Large Extended Attribute Values
To enable ext4 to store extended attribute values that do not fit in the
inode or in the single extended attribute block attached to an inode,
the EA\_INODE feature allows us to store the value in the data blocks of
the EA_INODE feature allows us to store the value in the data blocks of
a regular file inode. This “EA inode” is linked only from the extended
attribute name index and must not appear in a directory entry. The
inode's i\_atime field is used to store a checksum of the xattr value;
and i\_ctime/i\_version store a 64-bit reference count, which enables
inode's i_atime field is used to store a checksum of the xattr value;
and i_ctime/i_version store a 64-bit reference count, which enables
sharing of large xattr values between multiple owning inodes. For
backward compatibility with older versions of this feature, the
i\_mtime/i\_generation *may* store a back-reference to the inode number
and i\_generation of the **one** owning inode (in cases where the EA
i_mtime/i_generation *may* store a back-reference to the inode number
and i_generation of the **one** owning inode (in cases where the EA
inode is not referenced by multiple inodes) to verify that the EA inode
is the correct one being accessed.

View File

@@ -7,34 +7,34 @@ Each block group on the filesystem has one of these descriptors
associated with it. As noted in the Layout section above, the group
descriptors (if present) are the second item in the block group. The
standard configuration is for each block group to contain a full copy of
the block group descriptor table unless the sparse\_super feature flag
the block group descriptor table unless the sparse_super feature flag
is set.
Notice how the group descriptor records the location of both bitmaps and
the inode table (i.e. they can float). This means that within a block
group, the only data structures with fixed locations are the superblock
and the group descriptor table. The flex\_bg mechanism uses this
and the group descriptor table. The flex_bg mechanism uses this
property to group several block groups into a flex group and lay out all
of the groups' bitmaps and inode tables into one long run in the first
group of the flex group.
If the meta\_bg feature flag is set, then several block groups are
grouped together into a meta group. Note that in the meta\_bg case,
If the meta_bg feature flag is set, then several block groups are
grouped together into a meta group. Note that in the meta_bg case,
however, the first and last two block groups within the larger meta
group contain only group descriptors for the groups inside the meta
group.
flex\_bg and meta\_bg do not appear to be mutually exclusive features.
flex_bg and meta_bg do not appear to be mutually exclusive features.
In ext2, ext3, and ext4 (when the 64bit feature is not enabled), the
block group descriptor was only 32 bytes long and therefore ends at
bg\_checksum. On an ext4 filesystem with the 64bit feature enabled, the
bg_checksum. On an ext4 filesystem with the 64bit feature enabled, the
block group descriptor expands to at least the 64 bytes described below;
the size is stored in the superblock.
If gdt\_csum is set and metadata\_csum is not set, the block group
If gdt_csum is set and metadata_csum is not set, the block group
checksum is the crc16 of the FS UUID, the group number, and the group
descriptor structure. If metadata\_csum is set, then the block group
descriptor structure. If metadata_csum is set, then the block group
checksum is the lower 16 bits of the checksum of the FS UUID, the group
number, and the group descriptor structure. Both block and inode bitmap
checksums are calculated against the FS UUID, the group number, and the
@@ -51,59 +51,59 @@ The block group descriptor is laid out in ``struct ext4_group_desc``.
- Name
- Description
* - 0x0
- \_\_le32
- bg\_block\_bitmap\_lo
- __le32
- bg_block_bitmap_lo
- Lower 32-bits of location of block bitmap.
* - 0x4
- \_\_le32
- bg\_inode\_bitmap\_lo
- __le32
- bg_inode_bitmap_lo
- Lower 32-bits of location of inode bitmap.
* - 0x8
- \_\_le32
- bg\_inode\_table\_lo
- __le32
- bg_inode_table_lo
- Lower 32-bits of location of inode table.
* - 0xC
- \_\_le16
- bg\_free\_blocks\_count\_lo
- __le16
- bg_free_blocks_count_lo
- Lower 16-bits of free block count.
* - 0xE
- \_\_le16
- bg\_free\_inodes\_count\_lo
- __le16
- bg_free_inodes_count_lo
- Lower 16-bits of free inode count.
* - 0x10
- \_\_le16
- bg\_used\_dirs\_count\_lo
- __le16
- bg_used_dirs_count_lo
- Lower 16-bits of directory count.
* - 0x12
- \_\_le16
- bg\_flags
- __le16
- bg_flags
- Block group flags. See the bgflags_ table below.
* - 0x14
- \_\_le32
- bg\_exclude\_bitmap\_lo
- __le32
- bg_exclude_bitmap_lo
- Lower 32-bits of location of snapshot exclusion bitmap.
* - 0x18
- \_\_le16
- bg\_block\_bitmap\_csum\_lo
- __le16
- bg_block_bitmap_csum_lo
- Lower 16-bits of the block bitmap checksum.
* - 0x1A
- \_\_le16
- bg\_inode\_bitmap\_csum\_lo
- __le16
- bg_inode_bitmap_csum_lo
- Lower 16-bits of the inode bitmap checksum.
* - 0x1C
- \_\_le16
- bg\_itable\_unused\_lo
- __le16
- bg_itable_unused_lo
- Lower 16-bits of unused inode count. If set, we needn't scan past the
``(sb.s_inodes_per_group - gdt.bg_itable_unused)``\ th entry in the
``(sb.s_inodes_per_group - gdt.bg_itable_unused)`` th entry in the
inode table for this group.
* - 0x1E
- \_\_le16
- bg\_checksum
- Group descriptor checksum; crc16(sb\_uuid+group\_num+bg\_desc) if the
RO\_COMPAT\_GDT\_CSUM feature is set, or
crc32c(sb\_uuid+group\_num+bg\_desc) & 0xFFFF if the
RO\_COMPAT\_METADATA\_CSUM feature is set. The bg\_checksum
field in bg\_desc is skipped when calculating crc16 checksum,
- __le16
- bg_checksum
- Group descriptor checksum; crc16(sb_uuid+group_num+bg_desc) if the
RO_COMPAT_GDT_CSUM feature is set, or
crc32c(sb_uuid+group_num+bg_desc) & 0xFFFF if the
RO_COMPAT_METADATA_CSUM feature is set. The bg_checksum
field in bg_desc is skipped when calculating crc16 checksum,
and set to zero if crc32c checksum is used.
* -
-
@@ -111,48 +111,48 @@ The block group descriptor is laid out in ``struct ext4_group_desc``.
- These fields only exist if the 64bit feature is enabled and s_desc_size
> 32.
* - 0x20
- \_\_le32
- bg\_block\_bitmap\_hi
- __le32
- bg_block_bitmap_hi
- Upper 32-bits of location of block bitmap.
* - 0x24
- \_\_le32
- bg\_inode\_bitmap\_hi
- __le32
- bg_inode_bitmap_hi
- Upper 32-bits of location of inodes bitmap.
* - 0x28
- \_\_le32
- bg\_inode\_table\_hi
- __le32
- bg_inode_table_hi
- Upper 32-bits of location of inodes table.
* - 0x2C
- \_\_le16
- bg\_free\_blocks\_count\_hi
- __le16
- bg_free_blocks_count_hi
- Upper 16-bits of free block count.
* - 0x2E
- \_\_le16
- bg\_free\_inodes\_count\_hi
- __le16
- bg_free_inodes_count_hi
- Upper 16-bits of free inode count.
* - 0x30
- \_\_le16
- bg\_used\_dirs\_count\_hi
- __le16
- bg_used_dirs_count_hi
- Upper 16-bits of directory count.
* - 0x32
- \_\_le16
- bg\_itable\_unused\_hi
- __le16
- bg_itable_unused_hi
- Upper 16-bits of unused inode count.
* - 0x34
- \_\_le32
- bg\_exclude\_bitmap\_hi
- __le32
- bg_exclude_bitmap_hi
- Upper 32-bits of location of snapshot exclusion bitmap.
* - 0x38
- \_\_le16
- bg\_block\_bitmap\_csum\_hi
- __le16
- bg_block_bitmap_csum_hi
- Upper 16-bits of the block bitmap checksum.
* - 0x3A
- \_\_le16
- bg\_inode\_bitmap\_csum\_hi
- __le16
- bg_inode_bitmap_csum_hi
- Upper 16-bits of the inode bitmap checksum.
* - 0x3C
- \_\_u32
- bg\_reserved
- __u32
- bg_reserved
- Padding to 64 bytes.
.. _bgflags:
@@ -166,8 +166,8 @@ Block group flags can be any combination of the following:
* - Value
- Description
* - 0x1
- inode table and bitmap are not initialized (EXT4\_BG\_INODE\_UNINIT).
- inode table and bitmap are not initialized (EXT4_BG_INODE_UNINIT).
* - 0x2
- block bitmap is not initialized (EXT4\_BG\_BLOCK\_UNINIT).
- block bitmap is not initialized (EXT4_BG_BLOCK_UNINIT).
* - 0x4
- inode table is zeroed (EXT4\_BG\_INODE\_ZEROED).
- inode table is zeroed (EXT4_BG_INODE_ZEROED).

View File

@@ -1,6 +1,6 @@
.. SPDX-License-Identifier: GPL-2.0
The Contents of inode.i\_block
The Contents of inode.i_block
------------------------------
Depending on the type of file an inode describes, the 60 bytes of
@@ -47,7 +47,7 @@ In ext4, the file to logical block map has been replaced with an extent
tree. Under the old scheme, allocating a contiguous run of 1,000 blocks
requires an indirect block to map all 1,000 entries; with extents, the
mapping is reduced to a single ``struct ext4_extent`` with
``ee_len = 1000``. If flex\_bg is enabled, it is possible to allocate
``ee_len = 1000``. If flex_bg is enabled, it is possible to allocate
very large files with a single extent, at a considerable reduction in
metadata block use, and some improvement in disk efficiency. The inode
must have the extents flag (0x80000) flag set for this feature to be in
@@ -76,28 +76,28 @@ which is 12 bytes long:
- Name
- Description
* - 0x0
- \_\_le16
- eh\_magic
- __le16
- eh_magic
- Magic number, 0xF30A.
* - 0x2
- \_\_le16
- eh\_entries
- __le16
- eh_entries
- Number of valid entries following the header.
* - 0x4
- \_\_le16
- eh\_max
- __le16
- eh_max
- Maximum number of entries that could follow the header.
* - 0x6
- \_\_le16
- eh\_depth
- __le16
- eh_depth
- Depth of this extent node in the extent tree. 0 = this extent node
points to data blocks; otherwise, this extent node points to other
extent nodes. The extent tree can be at most 5 levels deep: a logical
block number can be at most ``2^32``, and the smallest ``n`` that
satisfies ``4*(((blocksize - 12)/12)^n) >= 2^32`` is 5.
* - 0x8
- \_\_le32
- eh\_generation
- __le32
- eh_generation
- Generation of the tree. (Used by Lustre, but not standard ext4).
Internal nodes of the extent tree, also known as index nodes, are
@@ -112,22 +112,22 @@ recorded as ``struct ext4_extent_idx``, and are 12 bytes long:
- Name
- Description
* - 0x0
- \_\_le32
- ei\_block
- __le32
- ei_block
- This index node covers file blocks from 'block' onward.
* - 0x4
- \_\_le32
- ei\_leaf\_lo
- __le32
- ei_leaf_lo
- Lower 32-bits of the block number of the extent node that is the next
level lower in the tree. The tree node pointed to can be either another
internal node or a leaf node, described below.
* - 0x8
- \_\_le16
- ei\_leaf\_hi
- __le16
- ei_leaf_hi
- Upper 16-bits of the previous field.
* - 0xA
- \_\_u16
- ei\_unused
- __u16
- ei_unused
-
Leaf nodes of the extent tree are recorded as ``struct ext4_extent``,
@@ -142,24 +142,24 @@ and are also 12 bytes long:
- Name
- Description
* - 0x0
- \_\_le32
- ee\_block
- __le32
- ee_block
- First file block number that this extent covers.
* - 0x4
- \_\_le16
- ee\_len
- __le16
- ee_len
- Number of blocks covered by extent. If the value of this field is <=
32768, the extent is initialized. If the value of the field is > 32768,
the extent is uninitialized and the actual extent length is ``ee_len`` -
32768. Therefore, the maximum length of a initialized extent is 32768
blocks, and the maximum length of an uninitialized extent is 32767.
* - 0x6
- \_\_le16
- ee\_start\_hi
- __le16
- ee_start_hi
- Upper 16-bits of the block number to which this extent points.
* - 0x8
- \_\_le32
- ee\_start\_lo
- __le32
- ee_start_lo
- Lower 32-bits of the block number to which this extent points.
Prior to the introduction of metadata checksums, the extent header +
@@ -182,8 +182,8 @@ including) the checksum itself.
- Name
- Description
* - 0x0
- \_\_le32
- eb\_checksum
- __le32
- eb_checksum
- Checksum of the extent block, crc32c(uuid+inum+igeneration+extentblock)
Inline Data

View File

@@ -11,12 +11,12 @@ file is smaller than 60 bytes, then the data are stored inline in
attribute space, then it might be found as an extended attribute
“system.data” within the inode body (“ibody EA”). This of course
constrains the amount of extended attributes one can attach to an inode.
If the data size increases beyond i\_block + ibody EA, a regular block
If the data size increases beyond i_block + ibody EA, a regular block
is allocated and the contents moved to that block.
Pending a change to compact the extended attribute key used to store
inline data, one ought to be able to store 160 bytes of data in a
256-byte inode (as of June 2015, when i\_extra\_isize is 28). Prior to
256-byte inode (as of June 2015, when i_extra_isize is 28). Prior to
that, the limit was 156 bytes due to inefficient use of inode space.
The inline data feature requires the presence of an extended attribute
@@ -25,12 +25,12 @@ for “system.data”, even if the attribute value is zero length.
Inline Directories
~~~~~~~~~~~~~~~~~~
The first four bytes of i\_block are the inode number of the parent
The first four bytes of i_block are the inode number of the parent
directory. Following that is a 56-byte space for an array of directory
entries; see ``struct ext4_dir_entry``. If there is a “system.data”
attribute in the inode body, the EA value is an array of
``struct ext4_dir_entry`` as well. Note that for inline directories, the
i\_block and EA space are treated as separate dirent blocks; directory
i_block and EA space are treated as separate dirent blocks; directory
entries cannot span the two.
Inline directory entries are not checksummed, as the inode checksum

View File

@@ -38,138 +38,138 @@ The inode table entry is laid out in ``struct ext4_inode``.
- Name
- Description
* - 0x0
- \_\_le16
- i\_mode
- __le16
- i_mode
- File mode. See the table i_mode_ below.
* - 0x2
- \_\_le16
- i\_uid
- __le16
- i_uid
- Lower 16-bits of Owner UID.
* - 0x4
- \_\_le32
- i\_size\_lo
- __le32
- i_size_lo
- Lower 32-bits of size in bytes.
* - 0x8
- \_\_le32
- i\_atime
- Last access time, in seconds since the epoch. However, if the EA\_INODE
- __le32
- i_atime
- Last access time, in seconds since the epoch. However, if the EA_INODE
inode flag is set, this inode stores an extended attribute value and
this field contains the checksum of the value.
* - 0xC
- \_\_le32
- i\_ctime
- __le32
- i_ctime
- Last inode change time, in seconds since the epoch. However, if the
EA\_INODE inode flag is set, this inode stores an extended attribute
EA_INODE inode flag is set, this inode stores an extended attribute
value and this field contains the lower 32 bits of the attribute value's
reference count.
* - 0x10
- \_\_le32
- i\_mtime
- __le32
- i_mtime
- Last data modification time, in seconds since the epoch. However, if the
EA\_INODE inode flag is set, this inode stores an extended attribute
EA_INODE inode flag is set, this inode stores an extended attribute
value and this field contains the number of the inode that owns the
extended attribute.
* - 0x14
- \_\_le32
- i\_dtime
- __le32
- i_dtime
- Deletion Time, in seconds since the epoch.
* - 0x18
- \_\_le16
- i\_gid
- __le16
- i_gid
- Lower 16-bits of GID.
* - 0x1A
- \_\_le16
- i\_links\_count
- __le16
- i_links_count
- Hard link count. Normally, ext4 does not permit an inode to have more
than 65,000 hard links. This applies to files as well as directories,
which means that there cannot be more than 64,998 subdirectories in a
directory (each subdirectory's '..' entry counts as a hard link, as does
the '.' entry in the directory itself). With the DIR\_NLINK feature
the '.' entry in the directory itself). With the DIR_NLINK feature
enabled, ext4 supports more than 64,998 subdirectories by setting this
field to 1 to indicate that the number of hard links is not known.
* - 0x1C
- \_\_le32
- i\_blocks\_lo
- Lower 32-bits of “block” count. If the huge\_file feature flag is not
- __le32
- i_blocks_lo
- Lower 32-bits of “block” count. If the huge_file feature flag is not
set on the filesystem, the file consumes ``i_blocks_lo`` 512-byte blocks
on disk. If huge\_file is set and EXT4\_HUGE\_FILE\_FL is NOT set in
on disk. If huge_file is set and EXT4_HUGE_FILE_FL is NOT set in
``inode.i_flags``, then the file consumes ``i_blocks_lo + (i_blocks_hi
<< 32)`` 512-byte blocks on disk. If huge\_file is set and
EXT4\_HUGE\_FILE\_FL IS set in ``inode.i_flags``, then this file
<< 32)`` 512-byte blocks on disk. If huge_file is set and
EXT4_HUGE_FILE_FL IS set in ``inode.i_flags``, then this file
consumes (``i_blocks_lo + i_blocks_hi`` << 32) filesystem blocks on
disk.
* - 0x20
- \_\_le32
- i\_flags
- __le32
- i_flags
- Inode flags. See the table i_flags_ below.
* - 0x24
- 4 bytes
- i\_osd1
- i_osd1
- See the table i_osd1_ for more details.
* - 0x28
- 60 bytes
- i\_block[EXT4\_N\_BLOCKS=15]
- Block map or extent tree. See the section “The Contents of inode.i\_block”.
- i_block[EXT4_N_BLOCKS=15]
- Block map or extent tree. See the section “The Contents of inode.i_block”.
* - 0x64
- \_\_le32
- i\_generation
- __le32
- i_generation
- File version (for NFS).
* - 0x68
- \_\_le32
- i\_file\_acl\_lo
- __le32
- i_file_acl_lo
- Lower 32-bits of extended attribute block. ACLs are of course one of
many possible extended attributes; I think the name of this field is a
result of the first use of extended attributes being for ACLs.
* - 0x6C
- \_\_le32
- i\_size\_high / i\_dir\_acl
- __le32
- i_size_high / i_dir_acl
- Upper 32-bits of file/directory size. In ext2/3 this field was named
i\_dir\_acl, though it was usually set to zero and never used.
i_dir_acl, though it was usually set to zero and never used.
* - 0x70
- \_\_le32
- i\_obso\_faddr
- __le32
- i_obso_faddr
- (Obsolete) fragment address.
* - 0x74
- 12 bytes
- i\_osd2
- i_osd2
- See the table i_osd2_ for more details.
* - 0x80
- \_\_le16
- i\_extra\_isize
- __le16
- i_extra_isize
- Size of this inode - 128. Alternately, the size of the extended inode
fields beyond the original ext2 inode, including this field.
* - 0x82
- \_\_le16
- i\_checksum\_hi
- __le16
- i_checksum_hi
- Upper 16-bits of the inode checksum.
* - 0x84
- \_\_le32
- i\_ctime\_extra
- __le32
- i_ctime_extra
- Extra change time bits. This provides sub-second precision. See Inode
Timestamps section.
* - 0x88
- \_\_le32
- i\_mtime\_extra
- __le32
- i_mtime_extra
- Extra modification time bits. This provides sub-second precision.
* - 0x8C
- \_\_le32
- i\_atime\_extra
- __le32
- i_atime_extra
- Extra access time bits. This provides sub-second precision.
* - 0x90
- \_\_le32
- i\_crtime
- __le32
- i_crtime
- File creation time, in seconds since the epoch.
* - 0x94
- \_\_le32
- i\_crtime\_extra
- __le32
- i_crtime_extra
- Extra file creation time bits. This provides sub-second precision.
* - 0x98
- \_\_le32
- i\_version\_hi
- __le32
- i_version_hi
- Upper 32-bits for version number.
* - 0x9C
- \_\_le32
- i\_projid
- __le32
- i_projid
- Project ID.
.. _i_mode:
@@ -183,45 +183,45 @@ The ``i_mode`` value is a combination of the following flags:
* - Value
- Description
* - 0x1
- S\_IXOTH (Others may execute)
- S_IXOTH (Others may execute)
* - 0x2
- S\_IWOTH (Others may write)
- S_IWOTH (Others may write)
* - 0x4
- S\_IROTH (Others may read)
- S_IROTH (Others may read)
* - 0x8
- S\_IXGRP (Group members may execute)
- S_IXGRP (Group members may execute)
* - 0x10
- S\_IWGRP (Group members may write)
- S_IWGRP (Group members may write)
* - 0x20
- S\_IRGRP (Group members may read)
- S_IRGRP (Group members may read)
* - 0x40
- S\_IXUSR (Owner may execute)
- S_IXUSR (Owner may execute)
* - 0x80
- S\_IWUSR (Owner may write)
- S_IWUSR (Owner may write)
* - 0x100
- S\_IRUSR (Owner may read)
- S_IRUSR (Owner may read)
* - 0x200
- S\_ISVTX (Sticky bit)
- S_ISVTX (Sticky bit)
* - 0x400
- S\_ISGID (Set GID)
- S_ISGID (Set GID)
* - 0x800
- S\_ISUID (Set UID)
- S_ISUID (Set UID)
* -
- These are mutually-exclusive file types:
* - 0x1000
- S\_IFIFO (FIFO)
- S_IFIFO (FIFO)
* - 0x2000
- S\_IFCHR (Character device)
- S_IFCHR (Character device)
* - 0x4000
- S\_IFDIR (Directory)
- S_IFDIR (Directory)
* - 0x6000
- S\_IFBLK (Block device)
- S_IFBLK (Block device)
* - 0x8000
- S\_IFREG (Regular file)
- S_IFREG (Regular file)
* - 0xA000
- S\_IFLNK (Symbolic link)
- S_IFLNK (Symbolic link)
* - 0xC000
- S\_IFSOCK (Socket)
- S_IFSOCK (Socket)
.. _i_flags:
@@ -234,56 +234,56 @@ The ``i_flags`` field is a combination of these values:
* - Value
- Description
* - 0x1
- This file requires secure deletion (EXT4\_SECRM\_FL). (not implemented)
- This file requires secure deletion (EXT4_SECRM_FL). (not implemented)
* - 0x2
- This file should be preserved, should undeletion be desired
(EXT4\_UNRM\_FL). (not implemented)
(EXT4_UNRM_FL). (not implemented)
* - 0x4
- File is compressed (EXT4\_COMPR\_FL). (not really implemented)
- File is compressed (EXT4_COMPR_FL). (not really implemented)
* - 0x8
- All writes to the file must be synchronous (EXT4\_SYNC\_FL).
- All writes to the file must be synchronous (EXT4_SYNC_FL).
* - 0x10
- File is immutable (EXT4\_IMMUTABLE\_FL).
- File is immutable (EXT4_IMMUTABLE_FL).
* - 0x20
- File can only be appended (EXT4\_APPEND\_FL).
- File can only be appended (EXT4_APPEND_FL).
* - 0x40
- The dump(1) utility should not dump this file (EXT4\_NODUMP\_FL).
- The dump(1) utility should not dump this file (EXT4_NODUMP_FL).
* - 0x80
- Do not update access time (EXT4\_NOATIME\_FL).
- Do not update access time (EXT4_NOATIME_FL).
* - 0x100
- Dirty compressed file (EXT4\_DIRTY\_FL). (not used)
- Dirty compressed file (EXT4_DIRTY_FL). (not used)
* - 0x200
- File has one or more compressed clusters (EXT4\_COMPRBLK\_FL). (not used)
- File has one or more compressed clusters (EXT4_COMPRBLK_FL). (not used)
* - 0x400
- Do not compress file (EXT4\_NOCOMPR\_FL). (not used)
- Do not compress file (EXT4_NOCOMPR_FL). (not used)
* - 0x800
- Encrypted inode (EXT4\_ENCRYPT\_FL). This bit value previously was
EXT4\_ECOMPR\_FL (compression error), which was never used.
- Encrypted inode (EXT4_ENCRYPT_FL). This bit value previously was
EXT4_ECOMPR_FL (compression error), which was never used.
* - 0x1000
- Directory has hashed indexes (EXT4\_INDEX\_FL).
- Directory has hashed indexes (EXT4_INDEX_FL).
* - 0x2000
- AFS magic directory (EXT4\_IMAGIC\_FL).
- AFS magic directory (EXT4_IMAGIC_FL).
* - 0x4000
- File data must always be written through the journal
(EXT4\_JOURNAL\_DATA\_FL).
(EXT4_JOURNAL_DATA_FL).
* - 0x8000
- File tail should not be merged (EXT4\_NOTAIL\_FL). (not used by ext4)
- File tail should not be merged (EXT4_NOTAIL_FL). (not used by ext4)
* - 0x10000
- All directory entry data should be written synchronously (see
``dirsync``) (EXT4\_DIRSYNC\_FL).
``dirsync``) (EXT4_DIRSYNC_FL).
* - 0x20000
- Top of directory hierarchy (EXT4\_TOPDIR\_FL).
- Top of directory hierarchy (EXT4_TOPDIR_FL).
* - 0x40000
- This is a huge file (EXT4\_HUGE\_FILE\_FL).
- This is a huge file (EXT4_HUGE_FILE_FL).
* - 0x80000
- Inode uses extents (EXT4\_EXTENTS\_FL).
- Inode uses extents (EXT4_EXTENTS_FL).
* - 0x100000
- Verity protected file (EXT4\_VERITY\_FL).
- Verity protected file (EXT4_VERITY_FL).
* - 0x200000
- Inode stores a large extended attribute value in its data blocks
(EXT4\_EA\_INODE\_FL).
(EXT4_EA_INODE_FL).
* - 0x400000
- This file has blocks allocated past EOF (EXT4\_EOFBLOCKS\_FL).
- This file has blocks allocated past EOF (EXT4_EOFBLOCKS_FL).
(deprecated)
* - 0x01000000
- Inode is a snapshot (``EXT4_SNAPFILE_FL``). (not in mainline)
@@ -294,21 +294,21 @@ The ``i_flags`` field is a combination of these values:
- Snapshot shrink has completed (``EXT4_SNAPFILE_SHRUNK_FL``). (not in
mainline)
* - 0x10000000
- Inode has inline data (EXT4\_INLINE\_DATA\_FL).
- Inode has inline data (EXT4_INLINE_DATA_FL).
* - 0x20000000
- Create children with the same project ID (EXT4\_PROJINHERIT\_FL).
- Create children with the same project ID (EXT4_PROJINHERIT_FL).
* - 0x80000000
- Reserved for ext4 library (EXT4\_RESERVED\_FL).
- Reserved for ext4 library (EXT4_RESERVED_FL).
* -
- Aggregate flags:
* - 0x705BDFFF
- User-visible flags.
* - 0x604BC0FF
- User-modifiable flags. Note that while EXT4\_JOURNAL\_DATA\_FL and
EXT4\_EXTENTS\_FL can be set with setattr, they are not in the kernel's
EXT4\_FL\_USER\_MODIFIABLE mask, since it needs to handle the setting of
- User-modifiable flags. Note that while EXT4_JOURNAL_DATA_FL and
EXT4_EXTENTS_FL can be set with setattr, they are not in the kernel's
EXT4_FL_USER_MODIFIABLE mask, since it needs to handle the setting of
these flags in a special manner and they are masked out of the set of
flags that are saved directly to i\_flags.
flags that are saved directly to i_flags.
.. _i_osd1:
@@ -325,9 +325,9 @@ Linux:
- Name
- Description
* - 0x0
- \_\_le32
- l\_i\_version
- Inode version. However, if the EA\_INODE inode flag is set, this inode
- __le32
- l_i_version
- Inode version. However, if the EA_INODE inode flag is set, this inode
stores an extended attribute value and this field contains the upper 32
bits of the attribute value's reference count.
@@ -342,8 +342,8 @@ Hurd:
- Name
- Description
* - 0x0
- \_\_le32
- h\_i\_translator
- __le32
- h_i_translator
- ??
Masix:
@@ -357,8 +357,8 @@ Masix:
- Name
- Description
* - 0x0
- \_\_le32
- m\_i\_reserved
- __le32
- m_i_reserved
- ??
.. _i_osd2:
@@ -376,30 +376,30 @@ Linux:
- Name
- Description
* - 0x0
- \_\_le16
- l\_i\_blocks\_high
- __le16
- l_i_blocks_high
- Upper 16-bits of the block count. Please see the note attached to
i\_blocks\_lo.
i_blocks_lo.
* - 0x2
- \_\_le16
- l\_i\_file\_acl\_high
- __le16
- l_i_file_acl_high
- Upper 16-bits of the extended attribute block (historically, the file
ACL location). See the Extended Attributes section below.
* - 0x4
- \_\_le16
- l\_i\_uid\_high
- __le16
- l_i_uid_high
- Upper 16-bits of the Owner UID.
* - 0x6
- \_\_le16
- l\_i\_gid\_high
- __le16
- l_i_gid_high
- Upper 16-bits of the GID.
* - 0x8
- \_\_le16
- l\_i\_checksum\_lo
- __le16
- l_i_checksum_lo
- Lower 16-bits of the inode checksum.
* - 0xA
- \_\_le16
- l\_i\_reserved
- __le16
- l_i_reserved
- Unused.
Hurd:
@@ -413,24 +413,24 @@ Hurd:
- Name
- Description
* - 0x0
- \_\_le16
- h\_i\_reserved1
- __le16
- h_i_reserved1
- ??
* - 0x2
- \_\_u16
- h\_i\_mode\_high
- __u16
- h_i_mode_high
- Upper 16-bits of the file mode.
* - 0x4
- \_\_le16
- h\_i\_uid\_high
- __le16
- h_i_uid_high
- Upper 16-bits of the Owner UID.
* - 0x6
- \_\_le16
- h\_i\_gid\_high
- __le16
- h_i_gid_high
- Upper 16-bits of the GID.
* - 0x8
- \_\_u32
- h\_i\_author
- __u32
- h_i_author
- Author code?
Masix:
@@ -444,17 +444,17 @@ Masix:
- Name
- Description
* - 0x0
- \_\_le16
- h\_i\_reserved1
- __le16
- h_i_reserved1
- ??
* - 0x2
- \_\_u16
- m\_i\_file\_acl\_high
- __u16
- m_i_file_acl_high
- Upper 16-bits of the extended attribute block (historically, the file
ACL location).
* - 0x4
- \_\_u32
- m\_i\_reserved2[2]
- __u32
- m_i_reserved2[2]
- ??
Inode Size
@@ -466,11 +466,11 @@ In ext2 and ext3, the inode structure size was fixed at 128 bytes
on-disk inode at format time for all inodes in the filesystem to provide
space beyond the end of the original ext2 inode. The on-disk inode
record size is recorded in the superblock as ``s_inode_size``. The
number of bytes actually used by struct ext4\_inode beyond the original
number of bytes actually used by struct ext4_inode beyond the original
128-byte ext2 inode is recorded in the ``i_extra_isize`` field for each
inode, which allows struct ext4\_inode to grow for a new kernel without
inode, which allows struct ext4_inode to grow for a new kernel without
having to upgrade all of the on-disk inodes. Access to fields beyond
EXT2\_GOOD\_OLD\_INODE\_SIZE should be verified to be within
EXT2_GOOD_OLD_INODE_SIZE should be verified to be within
``i_extra_isize``. By default, ext4 inode records are 256 bytes, and (as
of August 2019) the inode structure is 160 bytes
(``i_extra_isize = 32``). The extra space between the end of the inode
@@ -516,7 +516,7 @@ creation time (crtime); this field is 64-bits wide and decoded in the
same manner as 64-bit [cma]time. Neither crtime nor dtime are accessible
through the regular stat() interface, though debugfs will report them.
We use the 32-bit signed time value plus (2^32 \* (extra epoch bits)).
We use the 32-bit signed time value plus (2^32 * (extra epoch bits)).
In other words:
.. list-table::
@@ -525,8 +525,8 @@ In other words:
* - Extra epoch bits
- MSB of 32-bit time
- Adjustment for signed 32-bit to 64-bit tv\_sec
- Decoded 64-bit tv\_sec
- Adjustment for signed 32-bit to 64-bit tv_sec
- Decoded 64-bit tv_sec
- valid time range
* - 0 0
- 1

View File

@@ -63,8 +63,8 @@ Generally speaking, the journal has this format:
:header-rows: 1
* - Superblock
- descriptor\_block (data\_blocks or revocation\_block) [more data or
revocations] commmit\_block
- descriptor_block (data_blocks or revocation_block) [more data or
revocations] commmit_block
- [more transactions...]
* -
- One transaction
@@ -93,8 +93,8 @@ superblock.
* - 1024 bytes of padding
- ext4 Superblock
- Journal Superblock
- descriptor\_block (data\_blocks or revocation\_block) [more data or
revocations] commmit\_block
- descriptor_block (data_blocks or revocation_block) [more data or
revocations] commmit_block
- [more transactions...]
* -
-
@@ -117,17 +117,17 @@ Every block in the journal starts with a common 12-byte header
- Name
- Description
* - 0x0
- \_\_be32
- h\_magic
- __be32
- h_magic
- jbd2 magic number, 0xC03B3998.
* - 0x4
- \_\_be32
- h\_blocktype
- __be32
- h_blocktype
- Description of what this block contains. See the jbd2_blocktype_ table
below.
* - 0x8
- \_\_be32
- h\_sequence
- __be32
- h_sequence
- The transaction ID that goes with this block.
.. _jbd2_blocktype:
@@ -177,99 +177,99 @@ which is 1024 bytes long:
-
- Static information describing the journal.
* - 0x0
- journal\_header\_t (12 bytes)
- s\_header
- journal_header_t (12 bytes)
- s_header
- Common header identifying this as a superblock.
* - 0xC
- \_\_be32
- s\_blocksize
- __be32
- s_blocksize
- Journal device block size.
* - 0x10
- \_\_be32
- s\_maxlen
- __be32
- s_maxlen
- Total number of blocks in this journal.
* - 0x14
- \_\_be32
- s\_first
- __be32
- s_first
- First block of log information.
* -
-
-
- Dynamic information describing the current state of the log.
* - 0x18
- \_\_be32
- s\_sequence
- __be32
- s_sequence
- First commit ID expected in log.
* - 0x1C
- \_\_be32
- s\_start
- __be32
- s_start
- Block number of the start of log. Contrary to the comments, this field
being zero does not imply that the journal is clean!
* - 0x20
- \_\_be32
- s\_errno
- Error value, as set by jbd2\_journal\_abort().
- __be32
- s_errno
- Error value, as set by jbd2_journal_abort().
* -
-
-
- The remaining fields are only valid in a v2 superblock.
* - 0x24
- \_\_be32
- s\_feature\_compat;
- __be32
- s_feature_compat;
- Compatible feature set. See the table jbd2_compat_ below.
* - 0x28
- \_\_be32
- s\_feature\_incompat
- __be32
- s_feature_incompat
- Incompatible feature set. See the table jbd2_incompat_ below.
* - 0x2C
- \_\_be32
- s\_feature\_ro\_compat
- __be32
- s_feature_ro_compat
- Read-only compatible feature set. There aren't any of these currently.
* - 0x30
- \_\_u8
- s\_uuid[16]
- __u8
- s_uuid[16]
- 128-bit uuid for journal. This is compared against the copy in the ext4
super block at mount time.
* - 0x40
- \_\_be32
- s\_nr\_users
- __be32
- s_nr_users
- Number of file systems sharing this journal.
* - 0x44
- \_\_be32
- s\_dynsuper
- __be32
- s_dynsuper
- Location of dynamic super block copy. (Not used?)
* - 0x48
- \_\_be32
- s\_max\_transaction
- __be32
- s_max_transaction
- Limit of journal blocks per transaction. (Not used?)
* - 0x4C
- \_\_be32
- s\_max\_trans\_data
- __be32
- s_max_trans_data
- Limit of data blocks per transaction. (Not used?)
* - 0x50
- \_\_u8
- s\_checksum\_type
- __u8
- s_checksum_type
- Checksum algorithm used for the journal. See jbd2_checksum_type_ for
more info.
* - 0x51
- \_\_u8[3]
- s\_padding2
- __u8[3]
- s_padding2
-
* - 0x54
- \_\_be32
- s\_num\_fc\_blocks
- __be32
- s_num_fc_blocks
- Number of fast commit blocks in the journal.
* - 0x58
- \_\_u32
- s\_padding[42]
- __u32
- s_padding[42]
-
* - 0xFC
- \_\_be32
- s\_checksum
- __be32
- s_checksum
- Checksum of the entire superblock, with this field set to zero.
* - 0x100
- \_\_u8
- s\_users[16\*48]
- __u8
- s_users[16*48]
- ids of all file systems sharing the log. e2fsprogs/Linux don't allow
shared external journals, but I imagine Lustre (or ocfs2?), which use
the jbd2 code, might.
@@ -286,7 +286,7 @@ The journal compat features are any combination of the following:
- Description
* - 0x1
- Journal maintains checksums on the data blocks.
(JBD2\_FEATURE\_COMPAT\_CHECKSUM)
(JBD2_FEATURE_COMPAT_CHECKSUM)
.. _jbd2_incompat:
@@ -299,23 +299,23 @@ The journal incompat features are any combination of the following:
* - Value
- Description
* - 0x1
- Journal has block revocation records. (JBD2\_FEATURE\_INCOMPAT\_REVOKE)
- Journal has block revocation records. (JBD2_FEATURE_INCOMPAT_REVOKE)
* - 0x2
- Journal can deal with 64-bit block numbers.
(JBD2\_FEATURE\_INCOMPAT\_64BIT)
(JBD2_FEATURE_INCOMPAT_64BIT)
* - 0x4
- Journal commits asynchronously. (JBD2\_FEATURE\_INCOMPAT\_ASYNC\_COMMIT)
- Journal commits asynchronously. (JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT)
* - 0x8
- This journal uses v2 of the checksum on-disk format. Each journal
metadata block gets its own checksum, and the block tags in the
descriptor table contain checksums for each of the data blocks in the
journal. (JBD2\_FEATURE\_INCOMPAT\_CSUM\_V2)
journal. (JBD2_FEATURE_INCOMPAT_CSUM_V2)
* - 0x10
- This journal uses v3 of the checksum on-disk format. This is the same as
v2, but the journal block tag size is fixed regardless of the size of
block numbers. (JBD2\_FEATURE\_INCOMPAT\_CSUM\_V3)
block numbers. (JBD2_FEATURE_INCOMPAT_CSUM_V3)
* - 0x20
- Journal has fast commit blocks. (JBD2\_FEATURE\_INCOMPAT\_FAST\_COMMIT)
- Journal has fast commit blocks. (JBD2_FEATURE_INCOMPAT_FAST_COMMIT)
.. _jbd2_checksum_type:
@@ -355,11 +355,11 @@ Descriptor blocks consume at least 36 bytes, but use a full block:
- Name
- Descriptor
* - 0x0
- journal\_header\_t
- journal_header_t
- (open coded)
- Common block header.
* - 0xC
- struct journal\_block\_tag\_s
- struct journal_block_tag_s
- open coded array[]
- Enough tags either to fill up the block or to describe all the data
blocks that follow this descriptor block.
@@ -367,7 +367,7 @@ Descriptor blocks consume at least 36 bytes, but use a full block:
Journal block tags have any of the following formats, depending on which
journal feature and block tag flags are set.
If JBD2\_FEATURE\_INCOMPAT\_CSUM\_V3 is set, the journal block tag is
If JBD2_FEATURE_INCOMPAT_CSUM_V3 is set, the journal block tag is
defined as ``struct journal_block_tag3_s``, which looks like the
following. The size is 16 or 32 bytes.
@@ -380,24 +380,24 @@ following. The size is 16 or 32 bytes.
- Name
- Descriptor
* - 0x0
- \_\_be32
- t\_blocknr
- __be32
- t_blocknr
- Lower 32-bits of the location of where the corresponding data block
should end up on disk.
* - 0x4
- \_\_be32
- t\_flags
- __be32
- t_flags
- Flags that go with the descriptor. See the table jbd2_tag_flags_ for
more info.
* - 0x8
- \_\_be32
- t\_blocknr\_high
- __be32
- t_blocknr_high
- Upper 32-bits of the location of where the corresponding data block
should end up on disk. This is zero if JBD2\_FEATURE\_INCOMPAT\_64BIT is
should end up on disk. This is zero if JBD2_FEATURE_INCOMPAT_64BIT is
not enabled.
* - 0xC
- \_\_be32
- t\_checksum
- __be32
- t_checksum
- Checksum of the journal UUID, the sequence number, and the data block.
* -
-
@@ -433,7 +433,7 @@ The journal tag flags are any combination of the following:
* - 0x8
- This is the last tag in this descriptor block.
If JBD2\_FEATURE\_INCOMPAT\_CSUM\_V3 is NOT set, the journal block tag
If JBD2_FEATURE_INCOMPAT_CSUM_V3 is NOT set, the journal block tag
is defined as ``struct journal_block_tag_s``, which looks like the
following. The size is 8, 12, 24, or 28 bytes:
@@ -446,18 +446,18 @@ following. The size is 8, 12, 24, or 28 bytes:
- Name
- Descriptor
* - 0x0
- \_\_be32
- t\_blocknr
- __be32
- t_blocknr
- Lower 32-bits of the location of where the corresponding data block
should end up on disk.
* - 0x4
- \_\_be16
- t\_checksum
- __be16
- t_checksum
- Checksum of the journal UUID, the sequence number, and the data block.
Note that only the lower 16 bits are stored.
* - 0x6
- \_\_be16
- t\_flags
- __be16
- t_flags
- Flags that go with the descriptor. See the table jbd2_tag_flags_ for
more info.
* -
@@ -466,8 +466,8 @@ following. The size is 8, 12, 24, or 28 bytes:
- This next field is only present if the super block indicates support for
64-bit block numbers.
* - 0x8
- \_\_be32
- t\_blocknr\_high
- __be32
- t_blocknr_high
- Upper 32-bits of the location of where the corresponding data block
should end up on disk.
* -
@@ -483,8 +483,8 @@ following. The size is 8, 12, 24, or 28 bytes:
``j_uuid`` field in ``struct journal_s``, but only tune2fs touches that
field.
If JBD2\_FEATURE\_INCOMPAT\_CSUM\_V2 or
JBD2\_FEATURE\_INCOMPAT\_CSUM\_V3 are set, the end of the block is a
If JBD2_FEATURE_INCOMPAT_CSUM_V2 or
JBD2_FEATURE_INCOMPAT_CSUM_V3 are set, the end of the block is a
``struct jbd2_journal_block_tail``, which looks like this:
.. list-table::
@@ -496,8 +496,8 @@ JBD2\_FEATURE\_INCOMPAT\_CSUM\_V3 are set, the end of the block is a
- Name
- Descriptor
* - 0x0
- \_\_be32
- t\_checksum
- __be32
- t_checksum
- Checksum of the journal UUID + the descriptor block, with this field set
to zero.
@@ -538,25 +538,25 @@ length, but use a full block:
- Name
- Description
* - 0x0
- journal\_header\_t
- r\_header
- journal_header_t
- r_header
- Common block header.
* - 0xC
- \_\_be32
- r\_count
- __be32
- r_count
- Number of bytes used in this block.
* - 0x10
- \_\_be32 or \_\_be64
- __be32 or __be64
- blocks[0]
- Blocks to revoke.
After r\_count is a linear array of block numbers that are effectively
After r_count is a linear array of block numbers that are effectively
revoked by this transaction. The size of each block number is 8 bytes if
the superblock advertises 64-bit block number support, or 4 bytes
otherwise.
If JBD2\_FEATURE\_INCOMPAT\_CSUM\_V2 or
JBD2\_FEATURE\_INCOMPAT\_CSUM\_V3 are set, the end of the revocation
If JBD2_FEATURE_INCOMPAT_CSUM_V2 or
JBD2_FEATURE_INCOMPAT_CSUM_V3 are set, the end of the revocation
block is a ``struct jbd2_journal_revoke_tail``, which has this format:
.. list-table::
@@ -568,8 +568,8 @@ block is a ``struct jbd2_journal_revoke_tail``, which has this format:
- Name
- Description
* - 0x0
- \_\_be32
- r\_checksum
- __be32
- r_checksum
- Checksum of the journal UUID + revocation block
Commit Block
@@ -592,38 +592,38 @@ bytes long (but uses a full block):
- Name
- Descriptor
* - 0x0
- journal\_header\_s
- journal_header_s
- (open coded)
- Common block header.
* - 0xC
- unsigned char
- h\_chksum\_type
- h_chksum_type
- The type of checksum to use to verify the integrity of the data blocks
in the transaction. See jbd2_checksum_type_ for more info.
* - 0xD
- unsigned char
- h\_chksum\_size
- h_chksum_size
- The number of bytes used by the checksum. Most likely 4.
* - 0xE
- unsigned char
- h\_padding[2]
- h_padding[2]
-
* - 0x10
- \_\_be32
- h\_chksum[JBD2\_CHECKSUM\_BYTES]
- __be32
- h_chksum[JBD2_CHECKSUM_BYTES]
- 32 bytes of space to store checksums. If
JBD2\_FEATURE\_INCOMPAT\_CSUM\_V2 or JBD2\_FEATURE\_INCOMPAT\_CSUM\_V3
JBD2_FEATURE_INCOMPAT_CSUM_V2 or JBD2_FEATURE_INCOMPAT_CSUM_V3
are set, the first ``__be32`` is the checksum of the journal UUID and
the entire commit block, with this field zeroed. If
JBD2\_FEATURE\_COMPAT\_CHECKSUM is set, the first ``__be32`` is the
JBD2_FEATURE_COMPAT_CHECKSUM is set, the first ``__be32`` is the
crc32 of all the blocks already written to the transaction.
* - 0x30
- \_\_be64
- h\_commit\_sec
- __be64
- h_commit_sec
- The time that the transaction was committed, in seconds since the epoch.
* - 0x38
- \_\_be32
- h\_commit\_nsec
- __be32
- h_commit_nsec
- Nanoseconds component of the above timestamp.
Fast commits

View File

@@ -7,8 +7,8 @@ Multiple mount protection (MMP) is a feature that protects the
filesystem against multiple hosts trying to use the filesystem
simultaneously. When a filesystem is opened (for mounting, or fsck,
etc.), the MMP code running on the node (call it node A) checks a
sequence number. If the sequence number is EXT4\_MMP\_SEQ\_CLEAN, the
open continues. If the sequence number is EXT4\_MMP\_SEQ\_FSCK, then
sequence number. If the sequence number is EXT4_MMP_SEQ_CLEAN, the
open continues. If the sequence number is EXT4_MMP_SEQ_FSCK, then
fsck is (hopefully) running, and open fails immediately. Otherwise, the
open code will wait for twice the specified MMP check interval and check
the sequence number again. If the sequence number has changed, then the
@@ -40,38 +40,38 @@ The MMP structure (``struct mmp_struct``) is as follows:
- Name
- Description
* - 0x0
- \_\_le32
- mmp\_magic
- __le32
- mmp_magic
- Magic number for MMP, 0x004D4D50 (“MMP”).
* - 0x4
- \_\_le32
- mmp\_seq
- __le32
- mmp_seq
- Sequence number, updated periodically.
* - 0x8
- \_\_le64
- mmp\_time
- __le64
- mmp_time
- Time that the MMP block was last updated.
* - 0x10
- char[64]
- mmp\_nodename
- mmp_nodename
- Hostname of the node that opened the filesystem.
* - 0x50
- char[32]
- mmp\_bdevname
- mmp_bdevname
- Block device name of the filesystem.
* - 0x70
- \_\_le16
- mmp\_check\_interval
- __le16
- mmp_check_interval
- The MMP re-check interval, in seconds.
* - 0x72
- \_\_le16
- mmp\_pad1
- __le16
- mmp_pad1
- Zero.
* - 0x74
- \_\_le32[226]
- mmp\_pad2
- __le32[226]
- mmp_pad2
- Zero.
* - 0x3FC
- \_\_le32
- mmp\_checksum
- __le32
- mmp_checksum
- Checksum of the MMP block.

View File

@@ -7,7 +7,7 @@ An ext4 file system is split into a series of block groups. To reduce
performance difficulties due to fragmentation, the block allocator tries
very hard to keep each file's blocks within the same group, thereby
reducing seek times. The size of a block group is specified in
``sb.s_blocks_per_group`` blocks, though it can also calculated as 8 \*
``sb.s_blocks_per_group`` blocks, though it can also calculated as 8 *
``block_size_in_bytes``. With the default block size of 4KiB, each group
will contain 32,768 blocks, for a length of 128MiB. The number of block
groups is the size of the device divided by the size of a block group.

View File

@@ -34,7 +34,7 @@ ext4 reserves some inode for special features, as follows:
* - 10
- Replica inode, used for some non-upstream feature?
* - 11
- Traditional first non-reserved inode. Usually this is the lost+found directory. See s\_first\_ino in the superblock.
- Traditional first non-reserved inode. Usually this is the lost+found directory. See s_first_ino in the superblock.
Note that there are also some inodes allocated from non-reserved inode numbers
for other filesystem features which are not referenced from standard directory
@@ -47,9 +47,9 @@ hierarchy. These are generally reference from the superblock. They are:
* - Superblock field
- Description
* - s\_lpf\_ino
* - s_lpf_ino
- Inode number of lost+found directory.
* - s\_prj\_quota\_inum
* - s_prj_quota_inum
- Inode number of quota file tracking project quotas
* - s\_orphan\_file\_inum
* - s_orphan_file_inum
- Inode number of file tracking orphan inodes.

View File

@@ -7,7 +7,7 @@ The superblock records various information about the enclosing
filesystem, such as block counts, inode counts, supported features,
maintenance information, and more.
If the sparse\_super feature flag is set, redundant copies of the
If the sparse_super feature flag is set, redundant copies of the
superblock and group descriptors are kept only in the groups whose group
number is either 0 or a power of 3, 5, or 7. If the flag is not set,
redundant copies are kept in all groups.
@@ -27,107 +27,107 @@ The ext4 superblock is laid out as follows in
- Name
- Description
* - 0x0
- \_\_le32
- s\_inodes\_count
- __le32
- s_inodes_count
- Total inode count.
* - 0x4
- \_\_le32
- s\_blocks\_count\_lo
- __le32
- s_blocks_count_lo
- Total block count.
* - 0x8
- \_\_le32
- s\_r\_blocks\_count\_lo
- __le32
- s_r_blocks_count_lo
- This number of blocks can only be allocated by the super-user.
* - 0xC
- \_\_le32
- s\_free\_blocks\_count\_lo
- __le32
- s_free_blocks_count_lo
- Free block count.
* - 0x10
- \_\_le32
- s\_free\_inodes\_count
- __le32
- s_free_inodes_count
- Free inode count.
* - 0x14
- \_\_le32
- s\_first\_data\_block
- __le32
- s_first_data_block
- First data block. This must be at least 1 for 1k-block filesystems and
is typically 0 for all other block sizes.
* - 0x18
- \_\_le32
- s\_log\_block\_size
- Block size is 2 ^ (10 + s\_log\_block\_size).
- __le32
- s_log_block_size
- Block size is 2 ^ (10 + s_log_block_size).
* - 0x1C
- \_\_le32
- s\_log\_cluster\_size
- Cluster size is 2 ^ (10 + s\_log\_cluster\_size) blocks if bigalloc is
enabled. Otherwise s\_log\_cluster\_size must equal s\_log\_block\_size.
- __le32
- s_log_cluster_size
- Cluster size is 2 ^ (10 + s_log_cluster_size) blocks if bigalloc is
enabled. Otherwise s_log_cluster_size must equal s_log_block_size.
* - 0x20
- \_\_le32
- s\_blocks\_per\_group
- __le32
- s_blocks_per_group
- Blocks per group.
* - 0x24
- \_\_le32
- s\_clusters\_per\_group
- __le32
- s_clusters_per_group
- Clusters per group, if bigalloc is enabled. Otherwise
s\_clusters\_per\_group must equal s\_blocks\_per\_group.
s_clusters_per_group must equal s_blocks_per_group.
* - 0x28
- \_\_le32
- s\_inodes\_per\_group
- __le32
- s_inodes_per_group
- Inodes per group.
* - 0x2C
- \_\_le32
- s\_mtime
- __le32
- s_mtime
- Mount time, in seconds since the epoch.
* - 0x30
- \_\_le32
- s\_wtime
- __le32
- s_wtime
- Write time, in seconds since the epoch.
* - 0x34
- \_\_le16
- s\_mnt\_count
- __le16
- s_mnt_count
- Number of mounts since the last fsck.
* - 0x36
- \_\_le16
- s\_max\_mnt\_count
- __le16
- s_max_mnt_count
- Number of mounts beyond which a fsck is needed.
* - 0x38
- \_\_le16
- s\_magic
- __le16
- s_magic
- Magic signature, 0xEF53
* - 0x3A
- \_\_le16
- s\_state
- __le16
- s_state
- File system state. See super_state_ for more info.
* - 0x3C
- \_\_le16
- s\_errors
- __le16
- s_errors
- Behaviour when detecting errors. See super_errors_ for more info.
* - 0x3E
- \_\_le16
- s\_minor\_rev\_level
- __le16
- s_minor_rev_level
- Minor revision level.
* - 0x40
- \_\_le32
- s\_lastcheck
- __le32
- s_lastcheck
- Time of last check, in seconds since the epoch.
* - 0x44
- \_\_le32
- s\_checkinterval
- __le32
- s_checkinterval
- Maximum time between checks, in seconds.
* - 0x48
- \_\_le32
- s\_creator\_os
- __le32
- s_creator_os
- Creator OS. See the table super_creator_ for more info.
* - 0x4C
- \_\_le32
- s\_rev\_level
- __le32
- s_rev_level
- Revision level. See the table super_revision_ for more info.
* - 0x50
- \_\_le16
- s\_def\_resuid
- __le16
- s_def_resuid
- Default uid for reserved blocks.
* - 0x52
- \_\_le16
- s\_def\_resgid
- __le16
- s_def_resgid
- Default gid for reserved blocks.
* -
-
@@ -143,50 +143,50 @@ The ext4 superblock is laid out as follows in
about a feature in either the compatible or incompatible feature set, it
must abort and not try to meddle with things it doesn't understand...
* - 0x54
- \_\_le32
- s\_first\_ino
- __le32
- s_first_ino
- First non-reserved inode.
* - 0x58
- \_\_le16
- s\_inode\_size
- __le16
- s_inode_size
- Size of inode structure, in bytes.
* - 0x5A
- \_\_le16
- s\_block\_group\_nr
- __le16
- s_block_group_nr
- Block group # of this superblock.
* - 0x5C
- \_\_le32
- s\_feature\_compat
- __le32
- s_feature_compat
- Compatible feature set flags. Kernel can still read/write this fs even
if it doesn't understand a flag; fsck should not do that. See the
super_compat_ table for more info.
* - 0x60
- \_\_le32
- s\_feature\_incompat
- __le32
- s_feature_incompat
- Incompatible feature set. If the kernel or fsck doesn't understand one
of these bits, it should stop. See the super_incompat_ table for more
info.
* - 0x64
- \_\_le32
- s\_feature\_ro\_compat
- __le32
- s_feature_ro_compat
- Readonly-compatible feature set. If the kernel doesn't understand one of
these bits, it can still mount read-only. See the super_rocompat_ table
for more info.
* - 0x68
- \_\_u8
- s\_uuid[16]
- __u8
- s_uuid[16]
- 128-bit UUID for volume.
* - 0x78
- char
- s\_volume\_name[16]
- s_volume_name[16]
- Volume label.
* - 0x88
- char
- s\_last\_mounted[64]
- s_last_mounted[64]
- Directory where filesystem was last mounted.
* - 0xC8
- \_\_le32
- s\_algorithm\_usage\_bitmap
- __le32
- s_algorithm_usage_bitmap
- For compression (Not used in e2fsprogs/Linux)
* -
-
@@ -194,18 +194,18 @@ The ext4 superblock is laid out as follows in
- Performance hints. Directory preallocation should only happen if the
EXT4_FEATURE_COMPAT_DIR_PREALLOC flag is on.
* - 0xCC
- \_\_u8
- s\_prealloc\_blocks
- __u8
- s_prealloc_blocks
- #. of blocks to try to preallocate for ... files? (Not used in
e2fsprogs/Linux)
* - 0xCD
- \_\_u8
- s\_prealloc\_dir\_blocks
- __u8
- s_prealloc_dir_blocks
- #. of blocks to preallocate for directories. (Not used in
e2fsprogs/Linux)
* - 0xCE
- \_\_le16
- s\_reserved\_gdt\_blocks
- __le16
- s_reserved_gdt_blocks
- Number of reserved GDT entries for future filesystem expansion.
* -
-
@@ -213,281 +213,281 @@ The ext4 superblock is laid out as follows in
- Journalling support is valid only if EXT4_FEATURE_COMPAT_HAS_JOURNAL is
set.
* - 0xD0
- \_\_u8
- s\_journal\_uuid[16]
- __u8
- s_journal_uuid[16]
- UUID of journal superblock
* - 0xE0
- \_\_le32
- s\_journal\_inum
- __le32
- s_journal_inum
- inode number of journal file.
* - 0xE4
- \_\_le32
- s\_journal\_dev
- __le32
- s_journal_dev
- Device number of journal file, if the external journal feature flag is
set.
* - 0xE8
- \_\_le32
- s\_last\_orphan
- __le32
- s_last_orphan
- Start of list of orphaned inodes to delete.
* - 0xEC
- \_\_le32
- s\_hash\_seed[4]
- __le32
- s_hash_seed[4]
- HTREE hash seed.
* - 0xFC
- \_\_u8
- s\_def\_hash\_version
- __u8
- s_def_hash_version
- Default hash algorithm to use for directory hashes. See super_def_hash_
for more info.
* - 0xFD
- \_\_u8
- s\_jnl\_backup\_type
- If this value is 0 or EXT3\_JNL\_BACKUP\_BLOCKS (1), then the
- __u8
- s_jnl_backup_type
- If this value is 0 or EXT3_JNL_BACKUP_BLOCKS (1), then the
``s_jnl_blocks`` field contains a duplicate copy of the inode's
``i_block[]`` array and ``i_size``.
* - 0xFE
- \_\_le16
- s\_desc\_size
- __le16
- s_desc_size
- Size of group descriptors, in bytes, if the 64bit incompat feature flag
is set.
* - 0x100
- \_\_le32
- s\_default\_mount\_opts
- __le32
- s_default_mount_opts
- Default mount options. See the super_mountopts_ table for more info.
* - 0x104
- \_\_le32
- s\_first\_meta\_bg
- First metablock block group, if the meta\_bg feature is enabled.
- __le32
- s_first_meta_bg
- First metablock block group, if the meta_bg feature is enabled.
* - 0x108
- \_\_le32
- s\_mkfs\_time
- __le32
- s_mkfs_time
- When the filesystem was created, in seconds since the epoch.
* - 0x10C
- \_\_le32
- s\_jnl\_blocks[17]
- __le32
- s_jnl_blocks[17]
- Backup copy of the journal inode's ``i_block[]`` array in the first 15
elements and i\_size\_high and i\_size in the 16th and 17th elements,
elements and i_size_high and i_size in the 16th and 17th elements,
respectively.
* -
-
-
- 64bit support is valid only if EXT4_FEATURE_COMPAT_64BIT is set.
* - 0x150
- \_\_le32
- s\_blocks\_count\_hi
- __le32
- s_blocks_count_hi
- High 32-bits of the block count.
* - 0x154
- \_\_le32
- s\_r\_blocks\_count\_hi
- __le32
- s_r_blocks_count_hi
- High 32-bits of the reserved block count.
* - 0x158
- \_\_le32
- s\_free\_blocks\_count\_hi
- __le32
- s_free_blocks_count_hi
- High 32-bits of the free block count.
* - 0x15C
- \_\_le16
- s\_min\_extra\_isize
- __le16
- s_min_extra_isize
- All inodes have at least # bytes.
* - 0x15E
- \_\_le16
- s\_want\_extra\_isize
- __le16
- s_want_extra_isize
- New inodes should reserve # bytes.
* - 0x160
- \_\_le32
- s\_flags
- __le32
- s_flags
- Miscellaneous flags. See the super_flags_ table for more info.
* - 0x164
- \_\_le16
- s\_raid\_stride
- __le16
- s_raid_stride
- RAID stride. This is the number of logical blocks read from or written
to the disk before moving to the next disk. This affects the placement
of filesystem metadata, which will hopefully make RAID storage faster.
* - 0x166
- \_\_le16
- s\_mmp\_interval
- __le16
- s_mmp_interval
- #. seconds to wait in multi-mount prevention (MMP) checking. In theory,
MMP is a mechanism to record in the superblock which host and device
have mounted the filesystem, in order to prevent multiple mounts. This
feature does not seem to be implemented...
* - 0x168
- \_\_le64
- s\_mmp\_block
- __le64
- s_mmp_block
- Block # for multi-mount protection data.
* - 0x170
- \_\_le32
- s\_raid\_stripe\_width
- __le32
- s_raid_stripe_width
- RAID stripe width. This is the number of logical blocks read from or
written to the disk before coming back to the current disk. This is used
by the block allocator to try to reduce the number of read-modify-write
operations in a RAID5/6.
* - 0x174
- \_\_u8
- s\_log\_groups\_per\_flex
- __u8
- s_log_groups_per_flex
- Size of a flexible block group is 2 ^ ``s_log_groups_per_flex``.
* - 0x175
- \_\_u8
- s\_checksum\_type
- __u8
- s_checksum_type
- Metadata checksum algorithm type. The only valid value is 1 (crc32c).
* - 0x176
- \_\_le16
- s\_reserved\_pad
- __le16
- s_reserved_pad
-
* - 0x178
- \_\_le64
- s\_kbytes\_written
- __le64
- s_kbytes_written
- Number of KiB written to this filesystem over its lifetime.
* - 0x180
- \_\_le32
- s\_snapshot\_inum
- __le32
- s_snapshot_inum
- inode number of active snapshot. (Not used in e2fsprogs/Linux.)
* - 0x184
- \_\_le32
- s\_snapshot\_id
- __le32
- s_snapshot_id
- Sequential ID of active snapshot. (Not used in e2fsprogs/Linux.)
* - 0x188
- \_\_le64
- s\_snapshot\_r\_blocks\_count
- __le64
- s_snapshot_r_blocks_count
- Number of blocks reserved for active snapshot's future use. (Not used in
e2fsprogs/Linux.)
* - 0x190
- \_\_le32
- s\_snapshot\_list
- __le32
- s_snapshot_list
- inode number of the head of the on-disk snapshot list. (Not used in
e2fsprogs/Linux.)
* - 0x194
- \_\_le32
- s\_error\_count
- __le32
- s_error_count
- Number of errors seen.
* - 0x198
- \_\_le32
- s\_first\_error\_time
- __le32
- s_first_error_time
- First time an error happened, in seconds since the epoch.
* - 0x19C
- \_\_le32
- s\_first\_error\_ino
- __le32
- s_first_error_ino
- inode involved in first error.
* - 0x1A0
- \_\_le64
- s\_first\_error\_block
- __le64
- s_first_error_block
- Number of block involved of first error.
* - 0x1A8
- \_\_u8
- s\_first\_error\_func[32]
- __u8
- s_first_error_func[32]
- Name of function where the error happened.
* - 0x1C8
- \_\_le32
- s\_first\_error\_line
- __le32
- s_first_error_line
- Line number where error happened.
* - 0x1CC
- \_\_le32
- s\_last\_error\_time
- __le32
- s_last_error_time
- Time of most recent error, in seconds since the epoch.
* - 0x1D0
- \_\_le32
- s\_last\_error\_ino
- __le32
- s_last_error_ino
- inode involved in most recent error.
* - 0x1D4
- \_\_le32
- s\_last\_error\_line
- __le32
- s_last_error_line
- Line number where most recent error happened.
* - 0x1D8
- \_\_le64
- s\_last\_error\_block
- __le64
- s_last_error_block
- Number of block involved in most recent error.
* - 0x1E0
- \_\_u8
- s\_last\_error\_func[32]
- __u8
- s_last_error_func[32]
- Name of function where the most recent error happened.
* - 0x200
- \_\_u8
- s\_mount\_opts[64]
- __u8
- s_mount_opts[64]
- ASCIIZ string of mount options.
* - 0x240
- \_\_le32
- s\_usr\_quota\_inum
- __le32
- s_usr_quota_inum
- Inode number of user `quota <quota>`__ file.
* - 0x244
- \_\_le32
- s\_grp\_quota\_inum
- __le32
- s_grp_quota_inum
- Inode number of group `quota <quota>`__ file.
* - 0x248
- \_\_le32
- s\_overhead\_blocks
- __le32
- s_overhead_blocks
- Overhead blocks/clusters in fs. (Huh? This field is always zero, which
means that the kernel calculates it dynamically.)
* - 0x24C
- \_\_le32
- s\_backup\_bgs[2]
- Block groups containing superblock backups (if sparse\_super2)
- __le32
- s_backup_bgs[2]
- Block groups containing superblock backups (if sparse_super2)
* - 0x254
- \_\_u8
- s\_encrypt\_algos[4]
- __u8
- s_encrypt_algos[4]
- Encryption algorithms in use. There can be up to four algorithms in use
at any time; valid algorithm codes are given in the super_encrypt_ table
below.
* - 0x258
- \_\_u8
- s\_encrypt\_pw\_salt[16]
- __u8
- s_encrypt_pw_salt[16]
- Salt for the string2key algorithm for encryption.
* - 0x268
- \_\_le32
- s\_lpf\_ino
- __le32
- s_lpf_ino
- Inode number of lost+found
* - 0x26C
- \_\_le32
- s\_prj\_quota\_inum
- __le32
- s_prj_quota_inum
- Inode that tracks project quotas.
* - 0x270
- \_\_le32
- s\_checksum\_seed
- Checksum seed used for metadata\_csum calculations. This value is
crc32c(~0, $orig\_fs\_uuid).
- __le32
- s_checksum_seed
- Checksum seed used for metadata_csum calculations. This value is
crc32c(~0, $orig_fs_uuid).
* - 0x274
- \_\_u8
- s\_wtime_hi
- __u8
- s_wtime_hi
- Upper 8 bits of the s_wtime field.
* - 0x275
- \_\_u8
- s\_mtime_hi
- __u8
- s_mtime_hi
- Upper 8 bits of the s_mtime field.
* - 0x276
- \_\_u8
- s\_mkfs_time_hi
- __u8
- s_mkfs_time_hi
- Upper 8 bits of the s_mkfs_time field.
* - 0x277
- \_\_u8
- s\_lastcheck_hi
- __u8
- s_lastcheck_hi
- Upper 8 bits of the s_lastcheck_hi field.
* - 0x278
- \_\_u8
- s\_first_error_time_hi
- __u8
- s_first_error_time_hi
- Upper 8 bits of the s_first_error_time_hi field.
* - 0x279
- \_\_u8
- s\_last_error_time_hi
- __u8
- s_last_error_time_hi
- Upper 8 bits of the s_last_error_time_hi field.
* - 0x27A
- \_\_u8
- s\_pad[2]
- __u8
- s_pad[2]
- Zero padding.
* - 0x27C
- \_\_le16
- s\_encoding
- __le16
- s_encoding
- Filename charset encoding.
* - 0x27E
- \_\_le16
- s\_encoding_flags
- __le16
- s_encoding_flags
- Filename charset encoding flags.
* - 0x280
- \_\_le32
- s\_orphan\_file\_inum
- __le32
- s_orphan_file_inum
- Orphan file inode number.
* - 0x284
- \_\_le32
- s\_reserved[94]
- __le32
- s_reserved[94]
- Padding to the end of the block.
* - 0x3FC
- \_\_le32
- s\_checksum
- __le32
- s_checksum
- Superblock checksum.
.. _super_state:
@@ -574,44 +574,44 @@ following:
* - Value
- Description
* - 0x1
- Directory preallocation (COMPAT\_DIR\_PREALLOC).
- Directory preallocation (COMPAT_DIR_PREALLOC).
* - 0x2
- “imagic inodes”. Not clear from the code what this does
(COMPAT\_IMAGIC\_INODES).
(COMPAT_IMAGIC_INODES).
* - 0x4
- Has a journal (COMPAT\_HAS\_JOURNAL).
- Has a journal (COMPAT_HAS_JOURNAL).
* - 0x8
- Supports extended attributes (COMPAT\_EXT\_ATTR).
- Supports extended attributes (COMPAT_EXT_ATTR).
* - 0x10
- Has reserved GDT blocks for filesystem expansion
(COMPAT\_RESIZE\_INODE). Requires RO\_COMPAT\_SPARSE\_SUPER.
(COMPAT_RESIZE_INODE). Requires RO_COMPAT_SPARSE_SUPER.
* - 0x20
- Has directory indices (COMPAT\_DIR\_INDEX).
- Has directory indices (COMPAT_DIR_INDEX).
* - 0x40
- “Lazy BG”. Not in Linux kernel, seems to have been for uninitialized
block groups? (COMPAT\_LAZY\_BG)
block groups? (COMPAT_LAZY_BG)
* - 0x80
- “Exclude inode”. Not used. (COMPAT\_EXCLUDE\_INODE).
- “Exclude inode”. Not used. (COMPAT_EXCLUDE_INODE).
* - 0x100
- “Exclude bitmap”. Seems to be used to indicate the presence of
snapshot-related exclude bitmaps? Not defined in kernel or used in
e2fsprogs (COMPAT\_EXCLUDE\_BITMAP).
e2fsprogs (COMPAT_EXCLUDE_BITMAP).
* - 0x200
- Sparse Super Block, v2. If this flag is set, the SB field s\_backup\_bgs
- Sparse Super Block, v2. If this flag is set, the SB field s_backup_bgs
points to the two block groups that contain backup superblocks
(COMPAT\_SPARSE\_SUPER2).
(COMPAT_SPARSE_SUPER2).
* - 0x400
- Fast commits supported. Although fast commits blocks are
backward incompatible, fast commit blocks are not always
present in the journal. If fast commit blocks are present in
the journal, JBD2 incompat feature
(JBD2\_FEATURE\_INCOMPAT\_FAST\_COMMIT) gets
set (COMPAT\_FAST\_COMMIT).
(JBD2_FEATURE_INCOMPAT_FAST_COMMIT) gets
set (COMPAT_FAST_COMMIT).
* - 0x1000
- Orphan file allocated. This is the special file for more efficient
tracking of unlinked but still open inodes. When there may be any
entries in the file, we additionally set proper rocompat feature
(RO\_COMPAT\_ORPHAN\_PRESENT).
(RO_COMPAT_ORPHAN_PRESENT).
.. _super_incompat:
@@ -625,45 +625,45 @@ following:
* - Value
- Description
* - 0x1
- Compression (INCOMPAT\_COMPRESSION).
- Compression (INCOMPAT_COMPRESSION).
* - 0x2
- Directory entries record the file type. See ext4\_dir\_entry\_2 below
(INCOMPAT\_FILETYPE).
- Directory entries record the file type. See ext4_dir_entry_2 below
(INCOMPAT_FILETYPE).
* - 0x4
- Filesystem needs recovery (INCOMPAT\_RECOVER).
- Filesystem needs recovery (INCOMPAT_RECOVER).
* - 0x8
- Filesystem has a separate journal device (INCOMPAT\_JOURNAL\_DEV).
- Filesystem has a separate journal device (INCOMPAT_JOURNAL_DEV).
* - 0x10
- Meta block groups. See the earlier discussion of this feature
(INCOMPAT\_META\_BG).
(INCOMPAT_META_BG).
* - 0x40
- Files in this filesystem use extents (INCOMPAT\_EXTENTS).
- Files in this filesystem use extents (INCOMPAT_EXTENTS).
* - 0x80
- Enable a filesystem size of 2^64 blocks (INCOMPAT\_64BIT).
- Enable a filesystem size of 2^64 blocks (INCOMPAT_64BIT).
* - 0x100
- Multiple mount protection (INCOMPAT\_MMP).
- Multiple mount protection (INCOMPAT_MMP).
* - 0x200
- Flexible block groups. See the earlier discussion of this feature
(INCOMPAT\_FLEX\_BG).
(INCOMPAT_FLEX_BG).
* - 0x400
- Inodes can be used to store large extended attribute values
(INCOMPAT\_EA\_INODE).
(INCOMPAT_EA_INODE).
* - 0x1000
- Data in directory entry (INCOMPAT\_DIRDATA). (Not implemented?)
- Data in directory entry (INCOMPAT_DIRDATA). (Not implemented?)
* - 0x2000
- Metadata checksum seed is stored in the superblock. This feature enables
the administrator to change the UUID of a metadata\_csum filesystem
the administrator to change the UUID of a metadata_csum filesystem
while the filesystem is mounted; without it, the checksum definition
requires all metadata blocks to be rewritten (INCOMPAT\_CSUM\_SEED).
requires all metadata blocks to be rewritten (INCOMPAT_CSUM_SEED).
* - 0x4000
- Large directory >2GB or 3-level htree (INCOMPAT\_LARGEDIR). Prior to
- Large directory >2GB or 3-level htree (INCOMPAT_LARGEDIR). Prior to
this feature, directories could not be larger than 4GiB and could not
have an htree more than 2 levels deep. If this feature is enabled,
directories can be larger than 4GiB and have a maximum htree depth of 3.
* - 0x8000
- Data in inode (INCOMPAT\_INLINE\_DATA).
- Data in inode (INCOMPAT_INLINE_DATA).
* - 0x10000
- Encrypted inodes are present on the filesystem. (INCOMPAT\_ENCRYPT).
- Encrypted inodes are present on the filesystem. (INCOMPAT_ENCRYPT).
.. _super_rocompat:
@@ -678,54 +678,54 @@ the following:
- Description
* - 0x1
- Sparse superblocks. See the earlier discussion of this feature
(RO\_COMPAT\_SPARSE\_SUPER).
(RO_COMPAT_SPARSE_SUPER).
* - 0x2
- This filesystem has been used to store a file greater than 2GiB
(RO\_COMPAT\_LARGE\_FILE).
(RO_COMPAT_LARGE_FILE).
* - 0x4
- Not used in kernel or e2fsprogs (RO\_COMPAT\_BTREE\_DIR).
- Not used in kernel or e2fsprogs (RO_COMPAT_BTREE_DIR).
* - 0x8
- This filesystem has files whose sizes are represented in units of
logical blocks, not 512-byte sectors. This implies a very large file
indeed! (RO\_COMPAT\_HUGE\_FILE)
indeed! (RO_COMPAT_HUGE_FILE)
* - 0x10
- Group descriptors have checksums. In addition to detecting corruption,
this is useful for lazy formatting with uninitialized groups
(RO\_COMPAT\_GDT\_CSUM).
(RO_COMPAT_GDT_CSUM).
* - 0x20
- Indicates that the old ext3 32,000 subdirectory limit no longer applies
(RO\_COMPAT\_DIR\_NLINK). A directory's i\_links\_count will be set to 1
(RO_COMPAT_DIR_NLINK). A directory's i_links_count will be set to 1
if it is incremented past 64,999.
* - 0x40
- Indicates that large inodes exist on this filesystem
(RO\_COMPAT\_EXTRA\_ISIZE).
(RO_COMPAT_EXTRA_ISIZE).
* - 0x80
- This filesystem has a snapshot (RO\_COMPAT\_HAS\_SNAPSHOT).
- This filesystem has a snapshot (RO_COMPAT_HAS_SNAPSHOT).
* - 0x100
- `Quota <Quota>`__ (RO\_COMPAT\_QUOTA).
- `Quota <Quota>`__ (RO_COMPAT_QUOTA).
* - 0x200
- This filesystem supports “bigalloc”, which means that file extents are
tracked in units of clusters (of blocks) instead of blocks
(RO\_COMPAT\_BIGALLOC).
(RO_COMPAT_BIGALLOC).
* - 0x400
- This filesystem supports metadata checksumming.
(RO\_COMPAT\_METADATA\_CSUM; implies RO\_COMPAT\_GDT\_CSUM, though
GDT\_CSUM must not be set)
(RO_COMPAT_METADATA_CSUM; implies RO_COMPAT_GDT_CSUM, though
GDT_CSUM must not be set)
* - 0x800
- Filesystem supports replicas. This feature is neither in the kernel nor
e2fsprogs. (RO\_COMPAT\_REPLICA)
e2fsprogs. (RO_COMPAT_REPLICA)
* - 0x1000
- Read-only filesystem image; the kernel will not mount this image
read-write and most tools will refuse to write to the image.
(RO\_COMPAT\_READONLY)
(RO_COMPAT_READONLY)
* - 0x2000
- Filesystem tracks project quotas. (RO\_COMPAT\_PROJECT)
- Filesystem tracks project quotas. (RO_COMPAT_PROJECT)
* - 0x8000
- Verity inodes may be present on the filesystem. (RO\_COMPAT\_VERITY)
- Verity inodes may be present on the filesystem. (RO_COMPAT_VERITY)
* - 0x10000
- Indicates orphan file may have valid orphan entries and thus we need
to clean them up when mounting the filesystem
(RO\_COMPAT\_ORPHAN\_PRESENT).
(RO_COMPAT_ORPHAN_PRESENT).
.. _super_def_hash:
@@ -761,36 +761,36 @@ The ``s_default_mount_opts`` field is any combination of the following:
* - Value
- Description
* - 0x0001
- Print debugging info upon (re)mount. (EXT4\_DEFM\_DEBUG)
- Print debugging info upon (re)mount. (EXT4_DEFM_DEBUG)
* - 0x0002
- New files take the gid of the containing directory (instead of the fsgid
of the current process). (EXT4\_DEFM\_BSDGROUPS)
of the current process). (EXT4_DEFM_BSDGROUPS)
* - 0x0004
- Support userspace-provided extended attributes. (EXT4\_DEFM\_XATTR\_USER)
- Support userspace-provided extended attributes. (EXT4_DEFM_XATTR_USER)
* - 0x0008
- Support POSIX access control lists (ACLs). (EXT4\_DEFM\_ACL)
- Support POSIX access control lists (ACLs). (EXT4_DEFM_ACL)
* - 0x0010
- Do not support 32-bit UIDs. (EXT4\_DEFM\_UID16)
- Do not support 32-bit UIDs. (EXT4_DEFM_UID16)
* - 0x0020
- All data and metadata are commited to the journal.
(EXT4\_DEFM\_JMODE\_DATA)
(EXT4_DEFM_JMODE_DATA)
* - 0x0040
- All data are flushed to the disk before metadata are committed to the
journal. (EXT4\_DEFM\_JMODE\_ORDERED)
journal. (EXT4_DEFM_JMODE_ORDERED)
* - 0x0060
- Data ordering is not preserved; data may be written after the metadata
has been written. (EXT4\_DEFM\_JMODE\_WBACK)
has been written. (EXT4_DEFM_JMODE_WBACK)
* - 0x0100
- Disable write flushes. (EXT4\_DEFM\_NOBARRIER)
- Disable write flushes. (EXT4_DEFM_NOBARRIER)
* - 0x0200
- Track which blocks in a filesystem are metadata and therefore should not
be used as data blocks. This option will be enabled by default on 3.18,
hopefully. (EXT4\_DEFM\_BLOCK\_VALIDITY)
hopefully. (EXT4_DEFM_BLOCK_VALIDITY)
* - 0x0400
- Enable DISCARD support, where the storage device is told about blocks
becoming unused. (EXT4\_DEFM\_DISCARD)
becoming unused. (EXT4_DEFM_DISCARD)
* - 0x0800
- Disable delayed allocation. (EXT4\_DEFM\_NODELALLOC)
- Disable delayed allocation. (EXT4_DEFM_NODELALLOC)
.. _super_flags:
@@ -820,12 +820,12 @@ The ``s_encrypt_algos`` list can contain any of the following:
* - Value
- Description
* - 0
- Invalid algorithm (ENCRYPTION\_MODE\_INVALID).
- Invalid algorithm (ENCRYPTION_MODE_INVALID).
* - 1
- 256-bit AES in XTS mode (ENCRYPTION\_MODE\_AES\_256\_XTS).
- 256-bit AES in XTS mode (ENCRYPTION_MODE_AES_256_XTS).
* - 2
- 256-bit AES in GCM mode (ENCRYPTION\_MODE\_AES\_256\_GCM).
- 256-bit AES in GCM mode (ENCRYPTION_MODE_AES_256_GCM).
* - 3
- 256-bit AES in CBC mode (ENCRYPTION\_MODE\_AES\_256\_CBC).
- 256-bit AES in CBC mode (ENCRYPTION_MODE_AES_256_CBC).
Total size of the superblock is 1024 bytes.

View File

@@ -129,18 +129,24 @@ yet. Bug reports are always welcome at the issue tracker below!
* - arm64
- Supported
- ``LLVM=1``
* - hexagon
- Maintained
- ``LLVM=1``
* - mips
- Maintained
- ``CC=clang``
- ``LLVM=1``
* - powerpc
- Maintained
- ``CC=clang``
* - riscv
- Maintained
- ``CC=clang``
- ``LLVM=1``
* - s390
- Maintained
- ``CC=clang``
* - um (User Mode)
- Maintained
- ``LLVM=1``
* - x86
- Supported
- ``LLVM=1``

View File

@@ -45,10 +45,12 @@ Name Alias Usage Preserved
``$r23``-``$r31`` ``$s0``-``$s8`` Static registers Yes
================= =============== =================== ============
Note: The register ``$r21`` is reserved in the ELF psABI, but used by the Linux
kernel for storing the percpu base address. It normally has no ABI name, but is
called ``$u0`` in the kernel. You may also see ``$v0`` or ``$v1`` in some old code,
however they are deprecated aliases of ``$a0`` and ``$a1`` respectively.
.. Note::
The register ``$r21`` is reserved in the ELF psABI, but used by the Linux
kernel for storing the percpu base address. It normally has no ABI name,
but is called ``$u0`` in the kernel. You may also see ``$v0`` or ``$v1``
in some old code,however they are deprecated aliases of ``$a0`` and ``$a1``
respectively.
FPRs
----
@@ -69,8 +71,9 @@ Name Alias Usage Preserved
``$f24``-``$f31`` ``$fs0``-``$fs7`` Static registers Yes
================= ================== =================== ============
Note: You may see ``$fv0`` or ``$fv1`` in some old code, however they are deprecated
aliases of ``$fa0`` and ``$fa1`` respectively.
.. Note::
You may see ``$fv0`` or ``$fv1`` in some old code, however they are
deprecated aliases of ``$fa0`` and ``$fa1`` respectively.
VRs
----

View File

@@ -145,12 +145,16 @@ Documentation of Loongson's LS7A chipset:
https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-EN.pdf (in English)
Note: CPUINTC is CSR.ECFG/CSR.ESTAT and its interrupt controller described
in Section 7.4 of "LoongArch Reference Manual, Vol 1"; LIOINTC is "Legacy I/O
Interrupts" described in Section 11.1 of "Loongson 3A5000 Processor Reference
Manual"; EIOINTC is "Extended I/O Interrupts" described in Section 11.2 of
"Loongson 3A5000 Processor Reference Manual"; HTVECINTC is "HyperTransport
Interrupts" described in Section 14.3 of "Loongson 3A5000 Processor Reference
Manual"; PCH-PIC/PCH-MSI is "Interrupt Controller" described in Section 5 of
"Loongson 7A1000 Bridge User Manual"; PCH-LPC is "LPC Interrupts" described in
Section 24.3 of "Loongson 7A1000 Bridge User Manual".
.. Note::
- CPUINTC is CSR.ECFG/CSR.ESTAT and its interrupt controller described
in Section 7.4 of "LoongArch Reference Manual, Vol 1";
- LIOINTC is "Legacy I/OInterrupts" described in Section 11.1 of
"Loongson 3A5000 Processor Reference Manual";
- EIOINTC is "Extended I/O Interrupts" described in Section 11.2 of
"Loongson 3A5000 Processor Reference Manual";
- HTVECINTC is "HyperTransport Interrupts" described in Section 14.3 of
"Loongson 3A5000 Processor Reference Manual";
- PCH-PIC/PCH-MSI is "Interrupt Controller" described in Section 5 of
"Loongson 7A1000 Bridge User Manual";
- PCH-LPC is "LPC Interrupts" described in Section 24.3 of
"Loongson 7A1000 Bridge User Manual".

View File

@@ -2925,6 +2925,43 @@ plpmtud_probe_interval - INTEGER
Default: 0
reconf_enable - BOOLEAN
Enable or disable extension of Stream Reconfiguration functionality
specified in RFC6525. This extension provides the ability to "reset"
a stream, and it includes the Parameters of "Outgoing/Incoming SSN
Reset", "SSN/TSN Reset" and "Add Outgoing/Incoming Streams".
- 1: Enable extension.
- 0: Disable extension.
Default: 0
intl_enable - BOOLEAN
Enable or disable extension of User Message Interleaving functionality
specified in RFC8260. This extension allows the interleaving of user
messages sent on different streams. With this feature enabled, I-DATA
chunk will replace DATA chunk to carry user messages if also supported
by the peer. Note that to use this feature, one needs to set this option
to 1 and also needs to set socket options SCTP_FRAGMENT_INTERLEAVE to 2
and SCTP_INTERLEAVING_SUPPORTED to 1.
- 1: Enable extension.
- 0: Disable extension.
Default: 0
ecn_enable - BOOLEAN
Control use of Explicit Congestion Notification (ECN) by SCTP.
Like in TCP, ECN is used only when both ends of the SCTP connection
indicate support for it. This feature is useful in avoiding losses
due to congestion by allowing supporting routers to signal congestion
before having to drop packets.
1: Enable ecn.
0: Disable ecn.
Default: 1
``/proc/sys/net/core/*``
========================

View File

@@ -104,7 +104,7 @@ Whenever possible, use the PHY side RGMII delay for these reasons:
* PHY device drivers in PHYLIB being reusable by nature, being able to
configure correctly a specified delay enables more designs with similar delay
requirements to be operate correctly
requirements to be operated correctly
For cases where the PHY is not capable of providing this delay, but the
Ethernet MAC driver is capable of doing so, the correct phy_interface_t value

View File

@@ -46,10 +46,11 @@ LA64中每个寄存器为64位宽。 ``$r0`` 的内容总是固定为0而其
``$r23``-``$r31`` ``$s0``-``$s8`` 静态寄存器 是
================= =============== =================== ==========
注意:``$r21``寄存器在ELF psABI中保留未使用但是在Linux内核用于保存每CPU
变量基地址。该寄存器没有ABI命名不过在内核中称为``$u0``。在一些遗留代码
中有时可能见到``$v0````$v1``,它们是``$a0````$a1``的别名,属于已经废弃
的用法。
.. note::
注意: ``$r21`` 寄存器在ELF psABI中保留未使用但是在Linux内核用于保
存每CPU变量基地址。该寄存器没有ABI命名不过在内核中称为 ``$u0`` 。在
一些遗留代码中有时可能见到 ``$v0````$v1`` ,它们是 ``$a0``
``$a1`` 的别名,属于已经废弃的用法。
浮点寄存器
----------
@@ -68,8 +69,9 @@ LA64中每个寄存器为64位宽。 ``$r0`` 的内容总是固定为0而其
``$f24``-``$f31`` ``$fs0``-``$fs7`` 静态寄存器 是
================= ================== =================== ==========
注意:在一些遗留代码中有时可能见到 ``$v0````$v1`` ,它们是 ``$a0``
``$a1`` 的别名,属于已经废弃的用法。
.. note::
注意:在一些遗留代码中有时可能见到 ``$v0`` ``$v1`` ,它们是
``$a0````$a1`` 的别名,属于已经废弃的用法。
向量寄存器

View File

@@ -147,9 +147,11 @@ PCH-LPC::
https://github.com/loongson/LoongArch-Documentation/releases/latest/download/Loongson-7A1000-usermanual-2.00-EN.pdf (英文版)
CPUINTC即《龙芯架构参考手册卷一》第7.4节所描述的CSR.ECFG/CSR.ESTAT寄存器及其中断
控制逻辑LIOINTC即《龙芯3A5000处理器使用手册》第11.1节所描述的“传统I/O中断”EIOINTC
即《龙芯3A5000处理器使用手册》第11.2节所描述的“扩展I/O中断”HTVECINTC即《龙芯3A5000
处理器使用手册》第14.3节所描述的“HyperTransport中断”PCH-PIC/PCH-MSI即《龙芯7A1000桥
片用户手册》第5章所描述的“中断控制器”PCH-LPC即《龙芯7A1000桥片用户手册》第24.3节所
描述的“LPC中断”
.. note::
- CPUINTC即《龙芯架构参考手册卷一》第7.4节所描述的CSR.ECFG/CSR.ESTAT寄存器及其
中断控制逻辑;
- LIOINTC即《龙芯3A5000处理器使用手册》第11.1节所描述的“传统I/O中断”
- EIOINTC即《龙芯3A5000处理器使用手册》第11.2节所描述的“扩展I/O中断”
- HTVECINTC即《龙芯3A5000处理器使用手册》第14.3节所描述的“HyperTransport中断”
- PCH-PIC/PCH-MSI即《龙芯7A1000桥片用户手册》第5章所描述的“中断控制器”
- PCH-LPC即《龙芯7A1000桥片用户手册》第24.3节所描述的“LPC中断”。

View File

@@ -120,7 +120,8 @@ Testing
unpoison-pfn
Software-unpoison page at PFN echoed into this file. This way
a page can be reused again. This only works for Linux
injected failures, not for real memory failures.
injected failures, not for real memory failures. Once any hardware
memory failure happens, this feature is disabled.
Note these injection interfaces are not stable and might change between
kernel versions

View File

@@ -427,6 +427,7 @@ ACPI VIOT DRIVER
M: Jean-Philippe Brucker <jean-philippe@linaro.org>
L: linux-acpi@vger.kernel.org
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
S: Maintained
F: drivers/acpi/viot.c
F: include/linux/acpi_viot.h
@@ -960,6 +961,7 @@ AMD IOMMU (AMD-VI)
M: Joerg Roedel <joro@8bytes.org>
R: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
F: drivers/iommu/amd/
@@ -2467,6 +2469,7 @@ ARM/NXP S32G ARCHITECTURE
M: Chester Lin <clin@suse.com>
R: Andreas Färber <afaerber@suse.de>
R: Matthias Brugger <mbrugger@suse.com>
R: NXP S32 Linux Team <s32@nxp.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
F: arch/arm64/boot/dts/freescale/s32g*.dts*
@@ -3662,7 +3665,7 @@ BPF JIT for ARM
M: Shubham Bansal <illusionist.neo@gmail.com>
L: netdev@vger.kernel.org
L: bpf@vger.kernel.org
S: Maintained
S: Odd Fixes
F: arch/arm/net/
BPF JIT for ARM64
@@ -3686,14 +3689,15 @@ BPF JIT for NFP NICs
M: Jakub Kicinski <kuba@kernel.org>
L: netdev@vger.kernel.org
L: bpf@vger.kernel.org
S: Supported
S: Odd Fixes
F: drivers/net/ethernet/netronome/nfp/bpf/
BPF JIT for POWERPC (32-BIT AND 64-BIT)
M: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
M: Michael Ellerman <mpe@ellerman.id.au>
L: netdev@vger.kernel.org
L: bpf@vger.kernel.org
S: Maintained
S: Supported
F: arch/powerpc/net/
BPF JIT for RISC-V (32-bit)
@@ -3719,7 +3723,7 @@ M: Heiko Carstens <hca@linux.ibm.com>
M: Vasily Gorbik <gor@linux.ibm.com>
L: netdev@vger.kernel.org
L: bpf@vger.kernel.org
S: Maintained
S: Supported
F: arch/s390/net/
X: arch/s390/net/pnet.c
@@ -3727,14 +3731,14 @@ BPF JIT for SPARC (32-BIT AND 64-BIT)
M: David S. Miller <davem@davemloft.net>
L: netdev@vger.kernel.org
L: bpf@vger.kernel.org
S: Maintained
S: Odd Fixes
F: arch/sparc/net/
BPF JIT for X86 32-BIT
M: Wang YanQing <udknight@gmail.com>
L: netdev@vger.kernel.org
L: bpf@vger.kernel.org
S: Maintained
S: Odd Fixes
F: arch/x86/net/bpf_jit_comp32.c
BPF JIT for X86 64-BIT
@@ -3757,6 +3761,19 @@ F: include/linux/bpf_lsm.h
F: kernel/bpf/bpf_lsm.c
F: security/bpf/
BPF L7 FRAMEWORK
M: John Fastabend <john.fastabend@gmail.com>
M: Jakub Sitnicki <jakub@cloudflare.com>
L: netdev@vger.kernel.org
L: bpf@vger.kernel.org
S: Maintained
F: include/linux/skmsg.h
F: net/core/skmsg.c
F: net/core/sock_map.c
F: net/ipv4/tcp_bpf.c
F: net/ipv4/udp_bpf.c
F: net/unix/unix_bpf.c
BPFTOOL
M: Quentin Monnet <quentin@isovalent.com>
L: bpf@vger.kernel.org
@@ -3796,12 +3813,12 @@ N: bcmbca
N: bcm[9]?47622
BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE
M: Nicolas Saenz Julienne <nsaenz@kernel.org>
M: Florian Fainelli <f.fainelli@gmail.com>
R: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
L: linux-rpi-kernel@lists.infradead.org (moderated for non-subscribers)
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/nsaenz/linux-rpi.git
T: git git://github.com/broadcom/stblinux.git
F: Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
F: drivers/pci/controller/pcie-brcmstb.c
F: drivers/staging/vc04_services
@@ -4959,6 +4976,7 @@ Q: http://patchwork.kernel.org/project/linux-clk/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git
F: Documentation/devicetree/bindings/clock/
F: drivers/clk/
F: include/dt-bindings/clock/
F: include/linux/clk-pr*
F: include/linux/clk/
F: include/linux/of_clk.h
@@ -5962,6 +5980,7 @@ M: Christoph Hellwig <hch@lst.de>
M: Marek Szyprowski <m.szyprowski@samsung.com>
R: Robin Murphy <robin.murphy@arm.com>
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
S: Supported
W: http://git.infradead.org/users/hch/dma-mapping.git
T: git git://git.infradead.org/users/hch/dma-mapping.git
@@ -5974,6 +5993,7 @@ F: kernel/dma/
DMA MAPPING BENCHMARK
M: Xiang Chen <chenxiang66@hisilicon.com>
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
F: kernel/dma/map_benchmark.c
F: tools/testing/selftests/dma/
@@ -7558,6 +7578,7 @@ F: drivers/gpu/drm/exynos/exynos_dp*
EXYNOS SYSMMU (IOMMU) driver
M: Marek Szyprowski <m.szyprowski@samsung.com>
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
S: Maintained
F: drivers/iommu/exynos-iommu.c
@@ -8479,6 +8500,7 @@ F: Documentation/devicetree/bindings/gpio/
F: Documentation/driver-api/gpio/
F: drivers/gpio/
F: include/asm-generic/gpio.h
F: include/dt-bindings/gpio/
F: include/linux/gpio.h
F: include/linux/gpio/
F: include/linux/of_gpio.h
@@ -9132,6 +9154,7 @@ F: drivers/media/platform/st/sti/hva
HWPOISON MEMORY FAILURE HANDLING
M: Naoya Horiguchi <naoya.horiguchi@nec.com>
R: Miaohe Lin <linmiaohe@huawei.com>
L: linux-mm@kvack.org
S: Maintained
F: mm/hwpoison-inject.c
@@ -9276,6 +9299,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git
F: Documentation/devicetree/bindings/i2c/i2c.txt
F: Documentation/i2c/
F: drivers/i2c/*
F: include/dt-bindings/i2c/i2c.h
F: include/linux/i2c-dev.h
F: include/linux/i2c-smbus.h
F: include/linux/i2c.h
@@ -9291,6 +9315,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git
F: Documentation/devicetree/bindings/i2c/
F: drivers/i2c/algos/
F: drivers/i2c/busses/
F: include/dt-bindings/i2c/
I2C-TAOS-EVM DRIVER
M: Jean Delvare <jdelvare@suse.com>
@@ -9975,6 +10000,7 @@ INTEL IOMMU (VT-d)
M: David Woodhouse <dwmw2@infradead.org>
M: Lu Baolu <baolu.lu@linux.intel.com>
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
F: drivers/iommu/intel/
@@ -10354,6 +10380,7 @@ IOMMU DRIVERS
M: Joerg Roedel <joro@8bytes.org>
M: Will Deacon <will@kernel.org>
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
F: Documentation/devicetree/bindings/iommu/
@@ -10830,6 +10857,7 @@ M: Marc Zyngier <maz@kernel.org>
R: James Morse <james.morse@arm.com>
R: Alexandru Elisei <alexandru.elisei@arm.com>
R: Suzuki K Poulose <suzuki.poulose@arm.com>
R: Oliver Upton <oliver.upton@linux.dev>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L: kvmarm@lists.cs.columbia.edu (moderated for non-subscribers)
S: Maintained
@@ -10872,7 +10900,6 @@ F: arch/riscv/include/asm/kvm*
F: arch/riscv/include/uapi/asm/kvm*
F: arch/riscv/kvm/
F: tools/testing/selftests/kvm/*/riscv/
F: tools/testing/selftests/kvm/riscv/
KERNEL VIRTUAL MACHINE for s390 (KVM/s390)
M: Christian Borntraeger <borntraeger@linux.ibm.com>
@@ -10897,28 +10924,51 @@ F: tools/testing/selftests/kvm/*/s390x/
F: tools/testing/selftests/kvm/s390x/
KERNEL VIRTUAL MACHINE FOR X86 (KVM/x86)
M: Sean Christopherson <seanjc@google.com>
M: Paolo Bonzini <pbonzini@redhat.com>
R: Sean Christopherson <seanjc@google.com>
R: Vitaly Kuznetsov <vkuznets@redhat.com>
R: Wanpeng Li <wanpengli@tencent.com>
R: Jim Mattson <jmattson@google.com>
R: Joerg Roedel <joro@8bytes.org>
L: kvm@vger.kernel.org
S: Supported
W: http://www.linux-kvm.org
T: git git://git.kernel.org/pub/scm/virt/kvm/kvm.git
F: arch/x86/include/asm/kvm*
F: arch/x86/include/asm/pvclock-abi.h
F: arch/x86/include/asm/svm.h
F: arch/x86/include/asm/vmx*.h
F: arch/x86/include/uapi/asm/kvm*
F: arch/x86/include/uapi/asm/svm.h
F: arch/x86/include/uapi/asm/vmx.h
F: arch/x86/kernel/kvm.c
F: arch/x86/kernel/kvmclock.c
F: arch/x86/kvm/
F: arch/x86/kvm/*/
KVM PARAVIRT (KVM/paravirt)
M: Paolo Bonzini <pbonzini@redhat.com>
R: Wanpeng Li <wanpengli@tencent.com>
R: Vitaly Kuznetsov <vkuznets@redhat.com>
L: kvm@vger.kernel.org
S: Supported
T: git git://git.kernel.org/pub/scm/virt/kvm/kvm.git
F: arch/x86/kernel/kvm.c
F: arch/x86/kernel/kvmclock.c
F: arch/x86/include/asm/pvclock-abi.h
F: include/linux/kvm_para.h
F: include/uapi/linux/kvm_para.h
F: include/uapi/asm-generic/kvm_para.h
F: include/asm-generic/kvm_para.h
F: arch/um/include/asm/kvm_para.h
F: arch/x86/include/asm/kvm_para.h
F: arch/x86/include/uapi/asm/kvm_para.h
KVM X86 HYPER-V (KVM/hyper-v)
M: Vitaly Kuznetsov <vkuznets@redhat.com>
M: Sean Christopherson <seanjc@google.com>
M: Paolo Bonzini <pbonzini@redhat.com>
L: kvm@vger.kernel.org
S: Supported
T: git git://git.kernel.org/pub/scm/virt/kvm/kvm.git
F: arch/x86/kvm/hyperv.*
F: arch/x86/kvm/kvm_onhyperv.*
F: arch/x86/kvm/svm/hyperv.*
F: arch/x86/kvm/svm/svm_onhyperv.*
F: arch/x86/kvm/vmx/evmcs.*
KERNFS
M: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
M: Tejun Heo <tj@kernel.org>
@@ -11097,20 +11147,6 @@ S: Maintained
F: include/net/l3mdev.h
F: net/l3mdev
L7 BPF FRAMEWORK
M: John Fastabend <john.fastabend@gmail.com>
M: Daniel Borkmann <daniel@iogearbox.net>
M: Jakub Sitnicki <jakub@cloudflare.com>
L: netdev@vger.kernel.org
L: bpf@vger.kernel.org
S: Maintained
F: include/linux/skmsg.h
F: net/core/skmsg.c
F: net/core/sock_map.c
F: net/ipv4/tcp_bpf.c
F: net/ipv4/udp_bpf.c
F: net/unix/unix_bpf.c
LANDLOCK SECURITY MODULE
M: Mickaël Salaün <mic@digikod.net>
L: linux-security-module@vger.kernel.org
@@ -11590,6 +11626,7 @@ F: drivers/gpu/drm/bridge/lontium-lt8912b.c
LOONGARCH
M: Huacai Chen <chenhuacai@kernel.org>
R: WANG Xuerui <kernel@xen0n.name>
L: loongarch@lists.linux.dev
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git
F: arch/loongarch/
@@ -12503,6 +12540,7 @@ F: drivers/i2c/busses/i2c-mt65xx.c
MEDIATEK IOMMU DRIVER
M: Yong Wu <yong.wu@mediatek.com>
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
L: linux-mediatek@lists.infradead.org (moderated for non-subscribers)
S: Supported
F: Documentation/devicetree/bindings/iommu/mediatek*
@@ -12845,9 +12883,8 @@ M: Andrew Morton <akpm@linux-foundation.org>
L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
T: quilt https://ozlabs.org/~akpm/mmotm/
T: quilt https://ozlabs.org/~akpm/mmots/
T: git git://github.com/hnaz/linux-mm.git
T: git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
T: quilt git://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new
F: include/linux/gfp.h
F: include/linux/memory_hotplug.h
F: include/linux/mm.h
@@ -12857,6 +12894,18 @@ F: include/linux/vmalloc.h
F: mm/
F: tools/testing/selftests/vm/
MEMORY HOT(UN)PLUG
M: David Hildenbrand <david@redhat.com>
M: Oscar Salvador <osalvador@suse.de>
L: linux-mm@kvack.org
S: Maintained
F: Documentation/admin-guide/mm/memory-hotplug.rst
F: Documentation/core-api/memory-hotplug.rst
F: drivers/base/memory.c
F: include/linux/memory_hotplug.h
F: mm/memory_hotplug.c
F: tools/testing/selftests/memory-hotplug/
MEMORY TECHNOLOGY DEVICES (MTD)
M: Miquel Raynal <miquel.raynal@bootlin.com>
M: Richard Weinberger <richard@nod.at>
@@ -13801,6 +13850,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
F: Documentation/devicetree/bindings/net/
F: drivers/connector/
F: drivers/net/
F: include/dt-bindings/net/
F: include/linux/etherdevice.h
F: include/linux/fcdevice.h
F: include/linux/fddidevice.h
@@ -13952,7 +14002,6 @@ F: net/ipv6/tcp*.c
NETWORKING [TLS]
M: Boris Pismenny <borisp@nvidia.com>
M: John Fastabend <john.fastabend@gmail.com>
M: Daniel Borkmann <daniel@iogearbox.net>
M: Jakub Kicinski <kuba@kernel.org>
L: netdev@vger.kernel.org
S: Maintained
@@ -14261,7 +14310,7 @@ F: drivers/iio/gyro/fxas21002c_i2c.c
F: drivers/iio/gyro/fxas21002c_spi.c
NXP i.MX CLOCK DRIVERS
M: Abel Vesa <abel.vesa@nxp.com>
M: Abel Vesa <abelvesa@kernel.org>
L: linux-clk@vger.kernel.org
L: linux-imx@nxp.com
S: Maintained
@@ -14349,9 +14398,8 @@ F: Documentation/devicetree/bindings/sound/nxp,tfa989x.yaml
F: sound/soc/codecs/tfa989x.c
NXP-NCI NFC DRIVER
R: Charles Gorand <charles.gorand@effinnov.com>
L: linux-nfc@lists.01.org (subscribers-only)
S: Supported
S: Orphan
F: Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml
F: drivers/nfc/nxp-nci
@@ -14869,6 +14917,7 @@ F: include/dt-bindings/
OPENCOMPUTE PTP CLOCK DRIVER
M: Jonathan Lemon <jonathan.lemon@gmail.com>
M: Vadim Fedorenko <vadfed@fb.com>
L: netdev@vger.kernel.org
S: Maintained
F: drivers/ptp/ptp_ocp.c
@@ -16488,7 +16537,7 @@ F: Documentation/devicetree/bindings/opp/opp-v2-kryo-cpu.yaml
F: drivers/cpufreq/qcom-cpufreq-nvmem.c
QUALCOMM CRYPTO DRIVERS
M: Thara Gopinath <thara.gopinath@linaro.org>
M: Thara Gopinath <thara.gopinath@gmail.com>
L: linux-crypto@vger.kernel.org
L: linux-arm-msm@vger.kernel.org
S: Maintained
@@ -16543,6 +16592,7 @@ F: drivers/i2c/busses/i2c-qcom-cci.c
QUALCOMM IOMMU
M: Rob Clark <robdclark@gmail.com>
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
L: linux-arm-msm@vger.kernel.org
S: Maintained
F: drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -16598,7 +16648,7 @@ F: include/linux/if_rmnet.h
QUALCOMM TSENS THERMAL DRIVER
M: Amit Kucheria <amitk@kernel.org>
M: Thara Gopinath <thara.gopinath@linaro.org>
M: Thara Gopinath <thara.gopinath@gmail.com>
L: linux-pm@vger.kernel.org
L: linux-arm-msm@vger.kernel.org
S: Maintained
@@ -19168,6 +19218,7 @@ F: arch/x86/boot/video*
SWIOTLB SUBSYSTEM
M: Christoph Hellwig <hch@infradead.org>
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
S: Supported
W: http://git.infradead.org/users/hch/dma-mapping.git
T: git git://git.infradead.org/users/hch/dma-mapping.git
@@ -19305,7 +19356,7 @@ R: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
R: Mika Westerberg <mika.westerberg@linux.intel.com>
R: Jan Dabros <jsd@semihalf.com>
L: linux-i2c@vger.kernel.org
S: Maintained
S: Supported
F: drivers/i2c/busses/i2c-designware-*
SYNOPSYS DESIGNWARE MMC/SD/SDIO DRIVER
@@ -20712,6 +20763,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
F: Documentation/devicetree/bindings/usb/
F: Documentation/usb/
F: drivers/usb/
F: include/dt-bindings/usb/
F: include/linux/usb.h
F: include/linux/usb/
@@ -21842,6 +21894,7 @@ M: Juergen Gross <jgross@suse.com>
M: Stefano Stabellini <sstabellini@kernel.org>
L: xen-devel@lists.xenproject.org (moderated for non-subscribers)
L: iommu@lists.linux-foundation.org
L: iommu@lists.linux.dev
S: Supported
F: arch/x86/xen/*swiotlb*
F: drivers/xen/*swiotlb*

View File

@@ -2,7 +2,7 @@
VERSION = 5
PATCHLEVEL = 19
SUBLEVEL = 0
EXTRAVERSION = -rc2
EXTRAVERSION = -rc5
NAME = Superb Owl
# *DOCUMENTATION*
@@ -1141,7 +1141,7 @@ KBUILD_MODULES := 1
autoksyms_recursive: descend modules.order
$(Q)$(CONFIG_SHELL) $(srctree)/scripts/adjust_autoksyms.sh \
"$(MAKE) -f $(srctree)/Makefile vmlinux"
"$(MAKE) -f $(srctree)/Makefile autoksyms_recursive"
endif
autoksyms_h := $(if $(CONFIG_TRIM_UNUSED_KSYMS), include/generated/autoksyms.h)

View File

@@ -1586,7 +1586,6 @@ dtb-$(CONFIG_ARCH_ASPEED) += \
aspeed-bmc-lenovo-hr630.dtb \
aspeed-bmc-lenovo-hr855xg2.dtb \
aspeed-bmc-microsoft-olympus.dtb \
aspeed-bmc-nuvia-dc-scm.dtb \
aspeed-bmc-opp-lanyang.dtb \
aspeed-bmc-opp-mihawk.dtb \
aspeed-bmc-opp-mowgli.dtb \
@@ -1599,6 +1598,7 @@ dtb-$(CONFIG_ARCH_ASPEED) += \
aspeed-bmc-opp-witherspoon.dtb \
aspeed-bmc-opp-zaius.dtb \
aspeed-bmc-portwell-neptune.dtb \
aspeed-bmc-qcom-dc-scm-v1.dtb \
aspeed-bmc-quanta-q71l.dtb \
aspeed-bmc-quanta-s6q.dtb \
aspeed-bmc-supermicro-x11spi.dtb \

View File

@@ -6,8 +6,8 @@
#include "aspeed-g6.dtsi"
/ {
model = "Nuvia DC-SCM BMC";
compatible = "nuvia,dc-scm-bmc", "aspeed,ast2600";
model = "Qualcomm DC-SCM V1 BMC";
compatible = "qcom,dc-scm-v1-bmc", "aspeed,ast2600";
aliases {
serial4 = &uart5;

View File

@@ -120,26 +120,31 @@
port@0 {
reg = <0>;
label = "lan1";
phy-mode = "internal";
};
port@1 {
reg = <1>;
label = "lan2";
phy-mode = "internal";
};
port@2 {
reg = <2>;
label = "lan3";
phy-mode = "internal";
};
port@3 {
reg = <3>;
label = "lan4";
phy-mode = "internal";
};
port@4 {
reg = <4>;
label = "lan5";
phy-mode = "internal";
};
port@5 {

View File

@@ -28,12 +28,12 @@
&expgpio {
gpio-line-names = "BT_ON",
"WL_ON",
"",
"PWR_LED_OFF",
"GLOBAL_RESET",
"VDD_SD_IO_SEL",
"CAM_GPIO",
"GLOBAL_SHUTDOWN",
"SD_PWR_ON",
"SD_OC_N";
"SHUTDOWN_REQUEST";
};
&genet_mdio {

View File

@@ -593,7 +593,7 @@
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_atmel_conn>;
reg = <0x4a>;
reset-gpios = <&gpio1 14 GPIO_ACTIVE_HIGH>; /* SODIMM 106 */
reset-gpios = <&gpio1 14 GPIO_ACTIVE_LOW>; /* SODIMM 106 */
status = "disabled";
};
};

View File

@@ -762,7 +762,7 @@
regulator-name = "vddpu";
regulator-min-microvolt = <725000>;
regulator-max-microvolt = <1450000>;
regulator-enable-ramp-delay = <150>;
regulator-enable-ramp-delay = <380>;
anatop-reg-offset = <0x140>;
anatop-vol-bit-shift = <9>;
anatop-vol-bit-width = <5>;

View File

@@ -120,6 +120,7 @@
compatible = "usb-nop-xceiv";
clocks = <&clks IMX7D_USB_HSIC_ROOT_CLK>;
clock-names = "main_clk";
power-domains = <&pgc_hsic_phy>;
#phy-cells = <0>;
};
@@ -1153,7 +1154,6 @@
compatible = "fsl,imx7d-usb", "fsl,imx27-usb";
reg = <0x30b30000 0x200>;
interrupts = <GIC_SPI 40 IRQ_TYPE_LEVEL_HIGH>;
power-domains = <&pgc_hsic_phy>;
clocks = <&clks IMX7D_USB_CTRL_CLK>;
fsl,usbphy = <&usbphynop3>;
fsl,usbmisc = <&usbmisc3 0>;

View File

@@ -0,0 +1,47 @@
// SPDX-License-Identifier: (GPL-2.0+ OR BSD-3-Clause)
/*
* Copyright (C) STMicroelectronics 2022 - All Rights Reserved
* Author: Alexandre Torgue <alexandre.torgue@foss.st.com> for STMicroelectronics.
*/
/ {
firmware {
optee: optee {
compatible = "linaro,optee-tz";
method = "smc";
};
scmi: scmi {
compatible = "linaro,scmi-optee";
#address-cells = <1>;
#size-cells = <0>;
linaro,optee-channel-id = <0>;
shmem = <&scmi_shm>;
scmi_clk: protocol@14 {
reg = <0x14>;
#clock-cells = <1>;
};
scmi_reset: protocol@16 {
reg = <0x16>;
#reset-cells = <1>;
};
};
};
soc {
scmi_sram: sram@2ffff000 {
compatible = "mmio-sram";
reg = <0x2ffff000 0x1000>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0 0x2ffff000 0x1000>;
scmi_shm: scmi-sram@0 {
compatible = "arm,scmi-shmem";
reg = <0 0x80>;
};
};
};
};

View File

@@ -115,33 +115,6 @@
status = "disabled";
};
firmware {
optee: optee {
compatible = "linaro,optee-tz";
method = "smc";
status = "disabled";
};
scmi: scmi {
compatible = "linaro,scmi-optee";
#address-cells = <1>;
#size-cells = <0>;
linaro,optee-channel-id = <0>;
shmem = <&scmi_shm>;
status = "disabled";
scmi_clk: protocol@14 {
reg = <0x14>;
#clock-cells = <1>;
};
scmi_reset: protocol@16 {
reg = <0x16>;
#reset-cells = <1>;
};
};
};
soc {
compatible = "simple-bus";
#address-cells = <1>;
@@ -149,20 +122,6 @@
interrupt-parent = <&intc>;
ranges;
scmi_sram: sram@2ffff000 {
compatible = "mmio-sram";
reg = <0x2ffff000 0x1000>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0 0x2ffff000 0x1000>;
scmi_shm: scmi-sram@0 {
compatible = "arm,scmi-shmem";
reg = <0 0x80>;
status = "disabled";
};
};
timers2: timer@40000000 {
#address-cells = <1>;
#size-cells = <0>;

View File

@@ -7,6 +7,7 @@
/dts-v1/;
#include "stm32mp157a-dk1.dts"
#include "stm32mp15-scmi.dtsi"
/ {
model = "STMicroelectronics STM32MP157A-DK1 SCMI Discovery Board";
@@ -54,10 +55,6 @@
resets = <&scmi_reset RST_SCMI_MCU>;
};
&optee {
status = "okay";
};
&rcc {
compatible = "st,stm32mp1-rcc-secure", "syscon";
clock-names = "hse", "hsi", "csi", "lse", "lsi";
@@ -76,11 +73,3 @@
&rtc {
clocks = <&scmi_clk CK_SCMI_RTCAPB>, <&scmi_clk CK_SCMI_RTC>;
};
&scmi {
status = "okay";
};
&scmi_shm {
status = "okay";
};

View File

@@ -7,6 +7,7 @@
/dts-v1/;
#include "stm32mp157c-dk2.dts"
#include "stm32mp15-scmi.dtsi"
/ {
model = "STMicroelectronics STM32MP157C-DK2 SCMI Discovery Board";
@@ -63,10 +64,6 @@
resets = <&scmi_reset RST_SCMI_MCU>;
};
&optee {
status = "okay";
};
&rcc {
compatible = "st,stm32mp1-rcc-secure", "syscon";
clock-names = "hse", "hsi", "csi", "lse", "lsi";
@@ -85,11 +82,3 @@
&rtc {
clocks = <&scmi_clk CK_SCMI_RTCAPB>, <&scmi_clk CK_SCMI_RTC>;
};
&scmi {
status = "okay";
};
&scmi_shm {
status = "okay";
};

View File

@@ -7,6 +7,7 @@
/dts-v1/;
#include "stm32mp157c-ed1.dts"
#include "stm32mp15-scmi.dtsi"
/ {
model = "STMicroelectronics STM32MP157C-ED1 SCMI eval daughter";
@@ -59,10 +60,6 @@
resets = <&scmi_reset RST_SCMI_MCU>;
};
&optee {
status = "okay";
};
&rcc {
compatible = "st,stm32mp1-rcc-secure", "syscon";
clock-names = "hse", "hsi", "csi", "lse", "lsi";
@@ -81,11 +78,3 @@
&rtc {
clocks = <&scmi_clk CK_SCMI_RTCAPB>, <&scmi_clk CK_SCMI_RTC>;
};
&scmi {
status = "okay";
};
&scmi_shm {
status = "okay";
};

View File

@@ -7,6 +7,7 @@
/dts-v1/;
#include "stm32mp157c-ev1.dts"
#include "stm32mp15-scmi.dtsi"
/ {
model = "STMicroelectronics STM32MP157C-EV1 SCMI eval daughter on eval mother";
@@ -68,10 +69,6 @@
resets = <&scmi_reset RST_SCMI_MCU>;
};
&optee {
status = "okay";
};
&rcc {
compatible = "st,stm32mp1-rcc-secure", "syscon";
clock-names = "hse", "hsi", "csi", "lse", "lsi";
@@ -90,11 +87,3 @@
&rtc {
clocks = <&scmi_clk CK_SCMI_RTCAPB>, <&scmi_clk CK_SCMI_RTC>;
};
&scmi {
status = "okay";
};
&scmi_shm {
status = "okay";
};

View File

@@ -39,6 +39,7 @@ static int axxia_boot_secondary(unsigned int cpu, struct task_struct *idle)
return -ENOENT;
syscon = of_iomap(syscon_np, 0);
of_node_put(syscon_np);
if (!syscon)
return -ENOMEM;

View File

@@ -372,6 +372,7 @@ static void __init cns3xxx_init(void)
/* De-Asscer SATA Reset */
cns3xxx_pwr_soft_rst(CNS3XXX_PWR_SOFTWARE_RST(SATA));
}
of_node_put(dn);
dn = of_find_compatible_node(NULL, NULL, "cavium,cns3420-sdhci");
if (of_device_is_available(dn)) {
@@ -385,6 +386,7 @@ static void __init cns3xxx_init(void)
cns3xxx_pwr_clk_en(CNS3XXX_PWR_CLK_EN(SDIO));
cns3xxx_pwr_soft_rst(CNS3XXX_PWR_SOFTWARE_RST(SDIO));
}
of_node_put(dn);
pm_power_off = cns3xxx_power_off;

View File

@@ -149,6 +149,7 @@ static void exynos_map_pmu(void)
np = of_find_matching_node(NULL, exynos_dt_pmu_match);
if (np)
pmu_base_addr = of_iomap(np, 0);
of_node_put(np);
}
static void __init exynos_init_irq(void)

View File

@@ -218,13 +218,13 @@ void __init spear_setup_of_timer(void)
irq = irq_of_parse_and_map(np, 0);
if (!irq) {
pr_err("%s: No irq passed for timer via DT\n", __func__);
return;
goto err_put_np;
}
gpt_base = of_iomap(np, 0);
if (!gpt_base) {
pr_err("%s: of iomap failed\n", __func__);
return;
goto err_put_np;
}
gpt_clk = clk_get_sys("gpt0", NULL);
@@ -239,6 +239,8 @@ void __init spear_setup_of_timer(void)
goto err_prepare_enable_clk;
}
of_node_put(np);
spear_clockevent_init(irq);
spear_clocksource_init();
@@ -248,4 +250,6 @@ err_prepare_enable_clk:
clk_put(gpt_clk);
err_iomap:
iounmap(gpt_base);
err_put_np:
of_node_put(np);
}

View File

@@ -280,8 +280,8 @@
interrupts = <GIC_SPI 246 IRQ_TYPE_LEVEL_HIGH>;
pinctrl-names = "default";
pinctrl-0 = <&uart0_bus>;
clocks = <&cmu_peri CLK_GOUT_UART0_EXT_UCLK>,
<&cmu_peri CLK_GOUT_UART0_PCLK>;
clocks = <&cmu_peri CLK_GOUT_UART0_PCLK>,
<&cmu_peri CLK_GOUT_UART0_EXT_UCLK>;
clock-names = "uart", "clk_uart_baud0";
samsung,uart-fifosize = <64>;
status = "disabled";
@@ -293,8 +293,8 @@
interrupts = <GIC_SPI 247 IRQ_TYPE_LEVEL_HIGH>;
pinctrl-names = "default";
pinctrl-0 = <&uart1_bus>;
clocks = <&cmu_peri CLK_GOUT_UART1_EXT_UCLK>,
<&cmu_peri CLK_GOUT_UART1_PCLK>;
clocks = <&cmu_peri CLK_GOUT_UART1_PCLK>,
<&cmu_peri CLK_GOUT_UART1_EXT_UCLK>;
clock-names = "uart", "clk_uart_baud0";
samsung,uart-fifosize = <256>;
status = "disabled";
@@ -306,8 +306,8 @@
interrupts = <GIC_SPI 279 IRQ_TYPE_LEVEL_HIGH>;
pinctrl-names = "default";
pinctrl-0 = <&uart2_bus>;
clocks = <&cmu_peri CLK_GOUT_UART2_EXT_UCLK>,
<&cmu_peri CLK_GOUT_UART2_PCLK>;
clocks = <&cmu_peri CLK_GOUT_UART2_PCLK>,
<&cmu_peri CLK_GOUT_UART2_EXT_UCLK>;
clock-names = "uart", "clk_uart_baud0";
samsung,uart-fifosize = <256>;
status = "disabled";

View File

@@ -79,7 +79,7 @@
};
};
soc {
soc@0 {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;

View File

@@ -456,13 +456,11 @@
clock-names = "clk_ahb", "clk_xin";
mmc-ddr-1_8v;
mmc-hs200-1_8v;
mmc-hs400-1_8v;
ti,trm-icp = <0x2>;
ti,otap-del-sel-legacy = <0x0>;
ti,otap-del-sel-mmc-hs = <0x0>;
ti,otap-del-sel-ddr52 = <0x6>;
ti,otap-del-sel-hs200 = <0x7>;
ti,otap-del-sel-hs400 = <0x4>;
};
sdhci1: mmc@fa00000 {

View File

@@ -33,7 +33,7 @@
ranges;
#interrupt-cells = <3>;
interrupt-controller;
reg = <0x00 0x01800000 0x00 0x200000>, /* GICD */
reg = <0x00 0x01800000 0x00 0x100000>, /* GICD */
<0x00 0x01900000 0x00 0x100000>, /* GICR */
<0x00 0x6f000000 0x00 0x2000>, /* GICC */
<0x00 0x6f010000 0x00 0x1000>, /* GICH */

View File

@@ -362,11 +362,6 @@ struct kvm_vcpu_arch {
struct arch_timer_cpu timer_cpu;
struct kvm_pmu pmu;
/*
* Anything that is not used directly from assembly code goes
* here.
*/
/*
* Guest registers we preserve during guest debugging.
*

View File

@@ -113,6 +113,9 @@ static __always_inline bool has_vhe(void)
/*
* Code only run in VHE/NVHE hyp context can assume VHE is present or
* absent. Otherwise fall back to caps.
* This allows the compiler to discard VHE-specific code from the
* nVHE object, reducing the number of external symbol references
* needed to link.
*/
if (is_vhe_hyp_code())
return true;

View File

@@ -1974,15 +1974,7 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
#ifdef CONFIG_KVM
static bool is_kvm_protected_mode(const struct arm64_cpu_capabilities *entry, int __unused)
{
if (kvm_get_mode() != KVM_MODE_PROTECTED)
return false;
if (is_kernel_in_hyp_mode()) {
pr_warn("Protected KVM not available with VHE\n");
return false;
}
return true;
return kvm_get_mode() == KVM_MODE_PROTECTED;
}
#endif /* CONFIG_KVM */
@@ -3109,7 +3101,6 @@ void cpu_set_feature(unsigned int num)
WARN_ON(num >= MAX_CPU_FEATURES);
elf_hwcap |= BIT(num);
}
EXPORT_SYMBOL_GPL(cpu_set_feature);
bool cpu_have_feature(unsigned int num)
{

View File

@@ -102,7 +102,6 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
* x19-x29 per the AAPCS, and we created frame records upon entry, so we need
* to restore x0-x8, x29, and x30.
*/
ftrace_common_return:
/* Restore function arguments */
ldp x0, x1, [sp]
ldp x2, x3, [sp, #S_X2]

View File

@@ -77,6 +77,66 @@ static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
return NULL;
}
/*
* Find the address the callsite must branch to in order to reach '*addr'.
*
* Due to the limited range of 'BL' instructions, modules may be placed too far
* away to branch directly and must use a PLT.
*
* Returns true when '*addr' contains a reachable target address, or has been
* modified to contain a PLT address. Returns false otherwise.
*/
static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
struct module *mod,
unsigned long *addr)
{
unsigned long pc = rec->ip;
long offset = (long)*addr - (long)pc;
struct plt_entry *plt;
/*
* When the target is within range of the 'BL' instruction, use 'addr'
* as-is and branch to that directly.
*/
if (offset >= -SZ_128M && offset < SZ_128M)
return true;
/*
* When the target is outside of the range of a 'BL' instruction, we
* must use a PLT to reach it. We can only place PLTs for modules, and
* only when module PLT support is built-in.
*/
if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
return false;
/*
* 'mod' is only set at module load time, but if we end up
* dealing with an out-of-range condition, we can assume it
* is due to a module being loaded far away from the kernel.
*
* NOTE: __module_text_address() must be called with preemption
* disabled, but we can rely on ftrace_lock to ensure that 'mod'
* retains its validity throughout the remainder of this code.
*/
if (!mod) {
preempt_disable();
mod = __module_text_address(pc);
preempt_enable();
}
if (WARN_ON(!mod))
return false;
plt = get_ftrace_plt(mod, *addr);
if (!plt) {
pr_err("ftrace: no module PLT for %ps\n", (void *)*addr);
return false;
}
*addr = (unsigned long)plt;
return true;
}
/*
* Turn on the call to ftrace_caller() in instrumented function
*/
@@ -84,40 +144,9 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
{
unsigned long pc = rec->ip;
u32 old, new;
long offset = (long)pc - (long)addr;
if (offset < -SZ_128M || offset >= SZ_128M) {
struct module *mod;
struct plt_entry *plt;
if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
return -EINVAL;
/*
* On kernels that support module PLTs, the offset between the
* branch instruction and its target may legally exceed the
* range of an ordinary relative 'bl' opcode. In this case, we
* need to branch via a trampoline in the module.
*
* NOTE: __module_text_address() must be called with preemption
* disabled, but we can rely on ftrace_lock to ensure that 'mod'
* retains its validity throughout the remainder of this code.
*/
preempt_disable();
mod = __module_text_address(pc);
preempt_enable();
if (WARN_ON(!mod))
return -EINVAL;
plt = get_ftrace_plt(mod, addr);
if (!plt) {
pr_err("ftrace: no module PLT for %ps\n", (void *)addr);
return -EINVAL;
}
addr = (unsigned long)plt;
}
if (!ftrace_find_callable_addr(rec, NULL, &addr))
return -EINVAL;
old = aarch64_insn_gen_nop();
new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
@@ -132,6 +161,11 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
unsigned long pc = rec->ip;
u32 old, new;
if (!ftrace_find_callable_addr(rec, NULL, &old_addr))
return -EINVAL;
if (!ftrace_find_callable_addr(rec, NULL, &addr))
return -EINVAL;
old = aarch64_insn_gen_branch_imm(pc, old_addr,
AARCH64_INSN_BRANCH_LINK);
new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
@@ -181,54 +215,15 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
unsigned long addr)
{
unsigned long pc = rec->ip;
bool validate = true;
u32 old = 0, new;
long offset = (long)pc - (long)addr;
if (offset < -SZ_128M || offset >= SZ_128M) {
u32 replaced;
if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
return -EINVAL;
/*
* 'mod' is only set at module load time, but if we end up
* dealing with an out-of-range condition, we can assume it
* is due to a module being loaded far away from the kernel.
*/
if (!mod) {
preempt_disable();
mod = __module_text_address(pc);
preempt_enable();
if (WARN_ON(!mod))
return -EINVAL;
}
/*
* The instruction we are about to patch may be a branch and
* link instruction that was redirected via a PLT entry. In
* this case, the normal validation will fail, but we can at
* least check that we are dealing with a branch and link
* instruction that points into the right module.
*/
if (aarch64_insn_read((void *)pc, &replaced))
return -EFAULT;
if (!aarch64_insn_is_bl(replaced) ||
!within_module(pc + aarch64_get_branch_offset(replaced),
mod))
return -EINVAL;
validate = false;
} else {
old = aarch64_insn_gen_branch_imm(pc, addr,
AARCH64_INSN_BRANCH_LINK);
}
if (!ftrace_find_callable_addr(rec, mod, &addr))
return -EINVAL;
old = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
new = aarch64_insn_gen_nop();
return ftrace_modify_code(pc, old, new, validate);
return ftrace_modify_code(pc, old, new, true);
}
void arch_ftrace_update_code(int command)

View File

@@ -303,14 +303,13 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
early_fixmap_init();
early_ioremap_init();
/*
* Initialise the static keys early as they may be enabled by the
* cpufeature code, early parameters, and DT setup.
*/
jump_label_init();
setup_machine_fdt(__fdt_pointer);
/*
* Initialise the static keys early as they may be enabled by the
* cpufeature code and early parameters.
*/
jump_label_init();
parse_early_param();
/*

View File

@@ -1230,6 +1230,9 @@ bool kvm_arch_timer_get_input_level(int vintid)
struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
struct arch_timer_context *timer;
if (WARN(!vcpu, "No vcpu context!\n"))
return false;
if (vintid == vcpu_vtimer(vcpu)->irq.irq)
timer = vcpu_vtimer(vcpu);
else if (vintid == vcpu_ptimer(vcpu)->irq.irq)

View File

@@ -150,8 +150,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
if (ret)
goto out_free_stage2_pgd;
if (!zalloc_cpumask_var(&kvm->arch.supported_cpus, GFP_KERNEL))
if (!zalloc_cpumask_var(&kvm->arch.supported_cpus, GFP_KERNEL)) {
ret = -ENOMEM;
goto out_free_stage2_pgd;
}
cpumask_copy(kvm->arch.supported_cpus, cpu_possible_mask);
kvm_vgic_early_init(kvm);
@@ -2110,11 +2112,11 @@ static int finalize_hyp_mode(void)
return 0;
/*
* Exclude HYP BSS from kmemleak so that it doesn't get peeked
* at, which would end badly once the section is inaccessible.
* None of other sections should ever be introspected.
* Exclude HYP sections from kmemleak so that they don't get peeked
* at, which would end badly once inaccessible.
*/
kmemleak_free_part(__hyp_bss_start, __hyp_bss_end - __hyp_bss_start);
kmemleak_free_part(__va(hyp_mem_base), hyp_mem_size);
return pkvm_drop_host_privileges();
}
@@ -2271,7 +2273,11 @@ static int __init early_kvm_mode_cfg(char *arg)
return -EINVAL;
if (strcmp(arg, "protected") == 0) {
kvm_mode = KVM_MODE_PROTECTED;
if (!is_kernel_in_hyp_mode())
kvm_mode = KVM_MODE_PROTECTED;
else
pr_warn_once("Protected KVM not available with VHE\n");
return 0;
}

View File

@@ -80,6 +80,7 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
vcpu->arch.flags |= KVM_ARM64_FP_HOST;
vcpu->arch.flags &= ~KVM_ARM64_HOST_SVE_ENABLED;
if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN)
vcpu->arch.flags |= KVM_ARM64_HOST_SVE_ENABLED;
@@ -93,6 +94,7 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
* operations. Do this for ZA as well for now for simplicity.
*/
if (system_supports_sme()) {
vcpu->arch.flags &= ~KVM_ARM64_HOST_SME_ENABLED;
if (read_sysreg(cpacr_el1) & CPACR_EL1_SMEN_EL0EN)
vcpu->arch.flags |= KVM_ARM64_HOST_SME_ENABLED;

View File

@@ -314,15 +314,11 @@ static int host_stage2_adjust_range(u64 addr, struct kvm_mem_range *range)
int host_stage2_idmap_locked(phys_addr_t addr, u64 size,
enum kvm_pgtable_prot prot)
{
hyp_assert_lock_held(&host_kvm.lock);
return host_stage2_try(__host_stage2_idmap, addr, addr + size, prot);
}
int host_stage2_set_owner_locked(phys_addr_t addr, u64 size, u8 owner_id)
{
hyp_assert_lock_held(&host_kvm.lock);
return host_stage2_try(kvm_pgtable_stage2_set_owner, &host_kvm.pgt,
addr, size, &host_s2_pool, owner_id);
}

View File

@@ -243,15 +243,9 @@ u64 pvm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
case SYS_ID_AA64MMFR2_EL1:
return get_pvm_id_aa64mmfr2(vcpu);
default:
/*
* Should never happen because all cases are covered in
* pvm_sys_reg_descs[].
*/
WARN_ON(1);
break;
/* Unhandled ID register, RAZ */
return 0;
}
return 0;
}
static u64 read_id_reg(const struct kvm_vcpu *vcpu,
@@ -332,6 +326,16 @@ static bool pvm_gic_read_sre(struct kvm_vcpu *vcpu,
/* Mark the specified system register as an AArch64 feature id register. */
#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
/*
* sys_reg_desc initialiser for architecturally unallocated cpufeature ID
* register with encoding Op0=3, Op1=0, CRn=0, CRm=crm, Op2=op2
* (1 <= crm < 8, 0 <= Op2 < 8).
*/
#define ID_UNALLOCATED(crm, op2) { \
Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2), \
.access = pvm_access_id_aarch64, \
}
/* Mark the specified system register as Read-As-Zero/Write-Ignored */
#define RAZ_WI(REG) { SYS_DESC(REG), .access = pvm_access_raz_wi }
@@ -375,24 +379,46 @@ static const struct sys_reg_desc pvm_sys_reg_descs[] = {
AARCH32(SYS_MVFR0_EL1),
AARCH32(SYS_MVFR1_EL1),
AARCH32(SYS_MVFR2_EL1),
ID_UNALLOCATED(3,3),
AARCH32(SYS_ID_PFR2_EL1),
AARCH32(SYS_ID_DFR1_EL1),
AARCH32(SYS_ID_MMFR5_EL1),
ID_UNALLOCATED(3,7),
/* AArch64 ID registers */
/* CRm=4 */
AARCH64(SYS_ID_AA64PFR0_EL1),
AARCH64(SYS_ID_AA64PFR1_EL1),
ID_UNALLOCATED(4,2),
ID_UNALLOCATED(4,3),
AARCH64(SYS_ID_AA64ZFR0_EL1),
ID_UNALLOCATED(4,5),
ID_UNALLOCATED(4,6),
ID_UNALLOCATED(4,7),
AARCH64(SYS_ID_AA64DFR0_EL1),
AARCH64(SYS_ID_AA64DFR1_EL1),
ID_UNALLOCATED(5,2),
ID_UNALLOCATED(5,3),
AARCH64(SYS_ID_AA64AFR0_EL1),
AARCH64(SYS_ID_AA64AFR1_EL1),
ID_UNALLOCATED(5,6),
ID_UNALLOCATED(5,7),
AARCH64(SYS_ID_AA64ISAR0_EL1),
AARCH64(SYS_ID_AA64ISAR1_EL1),
AARCH64(SYS_ID_AA64ISAR2_EL1),
ID_UNALLOCATED(6,3),
ID_UNALLOCATED(6,4),
ID_UNALLOCATED(6,5),
ID_UNALLOCATED(6,6),
ID_UNALLOCATED(6,7),
AARCH64(SYS_ID_AA64MMFR0_EL1),
AARCH64(SYS_ID_AA64MMFR1_EL1),
AARCH64(SYS_ID_AA64MMFR2_EL1),
ID_UNALLOCATED(7,3),
ID_UNALLOCATED(7,4),
ID_UNALLOCATED(7,5),
ID_UNALLOCATED(7,6),
ID_UNALLOCATED(7,7),
/* Scalable Vector Registers are restricted. */

View File

@@ -429,11 +429,11 @@ static const struct vgic_register_region vgic_v2_dist_registers[] = {
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PENDING_SET,
vgic_mmio_read_pending, vgic_mmio_write_spending,
NULL, vgic_uaccess_write_spending, 1,
vgic_uaccess_read_pending, vgic_uaccess_write_spending, 1,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PENDING_CLEAR,
vgic_mmio_read_pending, vgic_mmio_write_cpending,
NULL, vgic_uaccess_write_cpending, 1,
vgic_uaccess_read_pending, vgic_uaccess_write_cpending, 1,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ACTIVE_SET,
vgic_mmio_read_active, vgic_mmio_write_sactive,

View File

@@ -353,42 +353,6 @@ static unsigned long vgic_mmio_read_v3_idregs(struct kvm_vcpu *vcpu,
return 0;
}
static unsigned long vgic_v3_uaccess_read_pending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
{
u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
u32 value = 0;
int i;
/*
* pending state of interrupt is latched in pending_latch variable.
* Userspace will save and restore pending state and line_level
* separately.
* Refer to Documentation/virt/kvm/devices/arm-vgic-v3.rst
* for handling of ISPENDR and ICPENDR.
*/
for (i = 0; i < len * 8; i++) {
struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
bool state = irq->pending_latch;
if (irq->hw && vgic_irq_is_sgi(irq->intid)) {
int err;
err = irq_get_irqchip_state(irq->host_irq,
IRQCHIP_STATE_PENDING,
&state);
WARN_ON(err);
}
if (state)
value |= (1U << i);
vgic_put_irq(vcpu->kvm, irq);
}
return value;
}
static int vgic_v3_uaccess_write_pending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len,
unsigned long val)
@@ -666,7 +630,7 @@ static const struct vgic_register_region vgic_v3_dist_registers[] = {
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ISPENDR,
vgic_mmio_read_pending, vgic_mmio_write_spending,
vgic_v3_uaccess_read_pending, vgic_v3_uaccess_write_pending, 1,
vgic_uaccess_read_pending, vgic_v3_uaccess_write_pending, 1,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ICPENDR,
vgic_mmio_read_pending, vgic_mmio_write_cpending,
@@ -750,7 +714,7 @@ static const struct vgic_register_region vgic_v3_rd_registers[] = {
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ISPENDR0,
vgic_mmio_read_pending, vgic_mmio_write_spending,
vgic_v3_uaccess_read_pending, vgic_v3_uaccess_write_pending, 4,
vgic_uaccess_read_pending, vgic_v3_uaccess_write_pending, 4,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ICPENDR0,
vgic_mmio_read_pending, vgic_mmio_write_cpending,

View File

@@ -226,8 +226,9 @@ int vgic_uaccess_write_cenable(struct kvm_vcpu *vcpu,
return 0;
}
unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
static unsigned long __read_pending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len,
bool is_user)
{
u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
u32 value = 0;
@@ -239,6 +240,15 @@ unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
unsigned long flags;
bool val;
/*
* When used from userspace with a GICv3 model:
*
* Pending state of interrupt is latched in pending_latch
* variable. Userspace will save and restore pending state
* and line_level separately.
* Refer to Documentation/virt/kvm/devices/arm-vgic-v3.rst
* for handling of ISPENDR and ICPENDR.
*/
raw_spin_lock_irqsave(&irq->irq_lock, flags);
if (irq->hw && vgic_irq_is_sgi(irq->intid)) {
int err;
@@ -248,10 +258,20 @@ unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
IRQCHIP_STATE_PENDING,
&val);
WARN_RATELIMIT(err, "IRQ %d", irq->host_irq);
} else if (vgic_irq_is_mapped_level(irq)) {
} else if (!is_user && vgic_irq_is_mapped_level(irq)) {
val = vgic_get_phys_line_level(irq);
} else {
val = irq_is_pending(irq);
switch (vcpu->kvm->arch.vgic.vgic_model) {
case KVM_DEV_TYPE_ARM_VGIC_V3:
if (is_user) {
val = irq->pending_latch;
break;
}
fallthrough;
default:
val = irq_is_pending(irq);
break;
}
}
value |= ((u32)val << i);
@@ -263,6 +283,18 @@ unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
return value;
}
unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
{
return __read_pending(vcpu, addr, len, false);
}
unsigned long vgic_uaccess_read_pending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
{
return __read_pending(vcpu, addr, len, true);
}
static bool is_vgic_v2_sgi(struct kvm_vcpu *vcpu, struct vgic_irq *irq)
{
return (vgic_irq_is_sgi(irq->intid) &&

View File

@@ -149,6 +149,9 @@ int vgic_uaccess_write_cenable(struct kvm_vcpu *vcpu,
unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len);
unsigned long vgic_uaccess_read_pending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len);
void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len,
unsigned long val);

View File

@@ -66,7 +66,7 @@ static void flush_context(void)
* the next context-switch, we broadcast TLB flush + I-cache
* invalidation over the inner shareable domain on rollover.
*/
kvm_call_hyp(__kvm_flush_vm_context);
kvm_call_hyp(__kvm_flush_vm_context);
}
static bool check_update_reserved_vmid(u64 vmid, u64 newvmid)

View File

@@ -218,8 +218,6 @@ SYM_FUNC_ALIAS(__dma_flush_area, __pi___dma_flush_area)
*/
SYM_FUNC_START(__pi___dma_map_area)
add x1, x0, x1
cmp w2, #DMA_FROM_DEVICE
b.eq __pi_dcache_inval_poc
b __pi_dcache_clean_poc
SYM_FUNC_END(__pi___dma_map_area)
SYM_FUNC_ALIAS(__dma_map_area, __pi___dma_map_area)

View File

@@ -214,6 +214,19 @@ static pte_t get_clear_contig(struct mm_struct *mm,
return orig_pte;
}
static pte_t get_clear_contig_flush(struct mm_struct *mm,
unsigned long addr,
pte_t *ptep,
unsigned long pgsize,
unsigned long ncontig)
{
pte_t orig_pte = get_clear_contig(mm, addr, ptep, pgsize, ncontig);
struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
flush_tlb_range(&vma, addr, addr + (pgsize * ncontig));
return orig_pte;
}
/*
* Changing some bits of contiguous entries requires us to follow a
* Break-Before-Make approach, breaking the whole contiguous set
@@ -447,19 +460,20 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
int ncontig, i;
size_t pgsize = 0;
unsigned long pfn = pte_pfn(pte), dpfn;
struct mm_struct *mm = vma->vm_mm;
pgprot_t hugeprot;
pte_t orig_pte;
if (!pte_cont(pte))
return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
ncontig = find_num_contig(vma->vm_mm, addr, ptep, &pgsize);
ncontig = find_num_contig(mm, addr, ptep, &pgsize);
dpfn = pgsize >> PAGE_SHIFT;
if (!__cont_access_flags_changed(ptep, pte, ncontig))
return 0;
orig_pte = get_clear_contig(vma->vm_mm, addr, ptep, pgsize, ncontig);
orig_pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
/* Make sure we don't lose the dirty or young state */
if (pte_dirty(orig_pte))
@@ -470,7 +484,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
hugeprot = pte_pgprot(pte);
for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
set_pte_at(vma->vm_mm, addr, ptep, pfn_pte(pfn, hugeprot));
set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot));
return 1;
}
@@ -492,7 +506,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
ncontig = find_num_contig(mm, addr, ptep, &pgsize);
dpfn = pgsize >> PAGE_SHIFT;
pte = get_clear_contig(mm, addr, ptep, pgsize, ncontig);
pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
pte = pte_wrprotect(pte);
hugeprot = pte_pgprot(pte);
@@ -505,17 +519,15 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep)
{
struct mm_struct *mm = vma->vm_mm;
size_t pgsize;
int ncontig;
pte_t orig_pte;
if (!pte_cont(READ_ONCE(*ptep)))
return ptep_clear_flush(vma, addr, ptep);
ncontig = find_num_contig(vma->vm_mm, addr, ptep, &pgsize);
orig_pte = get_clear_contig(vma->vm_mm, addr, ptep, pgsize, ncontig);
flush_tlb_range(vma, addr, addr + pgsize * ncontig);
return orig_pte;
ncontig = find_num_contig(mm, addr, ptep, &pgsize);
return get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
}
static int __init hugetlbpage_init(void)

View File

@@ -12,10 +12,9 @@ static inline unsigned long exception_era(struct pt_regs *regs)
return regs->csr_era;
}
static inline int compute_return_era(struct pt_regs *regs)
static inline void compute_return_era(struct pt_regs *regs)
{
regs->csr_era += 4;
return 0;
}
#endif /* _ASM_BRANCH_H */

View File

@@ -426,6 +426,11 @@ static inline void update_mmu_cache_pmd(struct vm_area_struct *vma,
#define kern_addr_valid(addr) (1)
static inline unsigned long pmd_pfn(pmd_t pmd)
{
return (pmd_val(pmd) & _PFN_MASK) >> _PFN_SHIFT;
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
/* We don't have hardware dirty/accessed bits, generic_pmdp_establish is fine.*/
@@ -497,11 +502,6 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd)
return pmd;
}
static inline unsigned long pmd_pfn(pmd_t pmd)
{
return (pmd_val(pmd) & _PFN_MASK) >> _PFN_SHIFT;
}
static inline struct page *pmd_page(pmd_t pmd)
{
if (pmd_trans_huge(pmd))

View File

@@ -263,7 +263,7 @@ void cpu_probe(void)
c->cputype = CPU_UNKNOWN;
c->processor_id = read_cpucfg(LOONGARCH_CPUCFG0);
c->fpu_vers = (read_cpucfg(LOONGARCH_CPUCFG2) >> 3) & 0x3;
c->fpu_vers = (read_cpucfg(LOONGARCH_CPUCFG2) & CPUCFG2_FPVERS) >> 3;
c->fpu_csr0 = FPU_CSR_RN;
c->fpu_mask = FPU_CSR_RSVD;

View File

@@ -14,8 +14,6 @@
__REF
SYM_ENTRY(_stext, SYM_L_GLOBAL, SYM_A_NONE)
SYM_CODE_START(kernel_entry) # kernel entry point
/* Config direct window and set PG */

View File

@@ -475,8 +475,7 @@ asmlinkage void noinstr do_ri(struct pt_regs *regs)
die_if_kernel("Reserved instruction in kernel code", regs);
if (unlikely(compute_return_era(regs) < 0))
goto out;
compute_return_era(regs);
if (unlikely(get_user(opcode, era) < 0)) {
status = SIGSEGV;

View File

@@ -37,6 +37,7 @@ SECTIONS
HEAD_TEXT_SECTION
. = ALIGN(PECOFF_SEGMENT_ALIGN);
_stext = .;
.text : {
TEXT_TEXT
SCHED_TEXT
@@ -101,6 +102,7 @@ SECTIONS
STABS_DEBUG
DWARF_DEBUG
ELF_DETAILS
.gptab.sdata : {
*(.gptab.data)

View File

@@ -281,15 +281,16 @@ void setup_tlb_handler(int cpu)
if (pcpu_handlers[cpu])
return;
page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL, get_order(vec_sz));
page = alloc_pages_node(cpu_to_node(cpu), GFP_ATOMIC, get_order(vec_sz));
if (!page)
return;
addr = page_address(page);
pcpu_handlers[cpu] = virt_to_phys(addr);
pcpu_handlers[cpu] = (unsigned long)addr;
memcpy((void *)addr, (void *)eentry, vec_sz);
local_flush_icache_range((unsigned long)addr, (unsigned long)addr + vec_sz);
csr_write64(pcpu_handlers[cpu], LOONGARCH_CSR_TLBRENTRY);
csr_write64(pcpu_handlers[cpu], LOONGARCH_CSR_EENTRY);
csr_write64(pcpu_handlers[cpu], LOONGARCH_CSR_MERRENTRY);
csr_write64(pcpu_handlers[cpu] + 80*VECSIZE, LOONGARCH_CSR_TLBRENTRY);
}
#endif

View File

@@ -111,8 +111,9 @@
clocks = <&cgu X1000_CLK_RTCLK>,
<&cgu X1000_CLK_EXCLK>,
<&cgu X1000_CLK_PCLK>;
clock-names = "rtc", "ext", "pclk";
<&cgu X1000_CLK_PCLK>,
<&cgu X1000_CLK_TCU>;
clock-names = "rtc", "ext", "pclk", "tcu";
interrupt-controller;
#interrupt-cells = <1>;

View File

@@ -104,8 +104,9 @@
clocks = <&cgu X1830_CLK_RTCLK>,
<&cgu X1830_CLK_EXCLK>,
<&cgu X1830_CLK_PCLK>;
clock-names = "rtc", "ext", "pclk";
<&cgu X1830_CLK_PCLK>,
<&cgu X1830_CLK_TCU>;
clock-names = "rtc", "ext", "pclk", "tcu";
interrupt-controller;
#interrupt-cells = <1>;

View File

@@ -44,6 +44,7 @@ static __init unsigned int ranchu_measure_hpt_freq(void)
__func__);
rtc_base = of_iomap(np, 0);
of_node_put(np);
if (!rtc_base)
panic("%s(): Failed to ioremap Goldfish RTC base!", __func__);

View File

@@ -208,6 +208,12 @@ void __init ltq_soc_init(void)
of_address_to_resource(np_sysgpe, 0, &res_sys[2]))
panic("Failed to get core resources");
of_node_put(np_status);
of_node_put(np_ebu);
of_node_put(np_sys1);
of_node_put(np_syseth);
of_node_put(np_sysgpe);
if ((request_mem_region(res_status.start, resource_size(&res_status),
res_status.name) < 0) ||
(request_mem_region(res_ebu.start, resource_size(&res_ebu),

View File

@@ -408,6 +408,7 @@ int __init icu_of_init(struct device_node *node, struct device_node *parent)
if (!ltq_eiu_membase)
panic("Failed to remap eiu memory");
}
of_node_put(eiu_node);
return 0;
}

View File

@@ -441,6 +441,10 @@ void __init ltq_soc_init(void)
of_address_to_resource(np_ebu, 0, &res_ebu))
panic("Failed to get core resources");
of_node_put(np_pmu);
of_node_put(np_cgu);
of_node_put(np_ebu);
if (!request_mem_region(res_pmu.start, resource_size(&res_pmu),
res_pmu.name) ||
!request_mem_region(res_cgu.start, resource_size(&res_cgu),

View File

@@ -214,6 +214,8 @@ static void update_gic_frequency_dt(void)
if (of_update_property(node, &gic_frequency_prop) < 0)
pr_err("error updating gic frequency property\n");
of_node_put(node);
}
#endif

View File

@@ -98,13 +98,18 @@ static int __init pic32_of_prepare_platform_data(struct of_dev_auxdata *lookup)
np = of_find_compatible_node(NULL, NULL, lookup->compatible);
if (np) {
lookup->name = (char *)np->name;
if (lookup->phys_addr)
if (lookup->phys_addr) {
of_node_put(np);
continue;
}
if (!of_address_to_resource(np, 0, &res))
lookup->phys_addr = res.start;
of_node_put(np);
}
}
of_node_put(root);
return 0;
}

View File

@@ -32,6 +32,9 @@ static unsigned int pic32_xlate_core_timer_irq(void)
goto default_map;
irq = irq_of_parse_and_map(node, 0);
of_node_put(node);
if (!irq)
goto default_map;

View File

@@ -40,6 +40,8 @@ __iomem void *plat_of_remap_node(const char *node)
if (of_address_to_resource(np, 0, &res))
panic("Failed to get resource for %s", node);
of_node_put(np);
if (!request_mem_region(res.start,
resource_size(&res),
res.name))

Some files were not shown because too many files have changed in this diff Show More