Compare commits

...

247 Commits

Author SHA1 Message Date
Linus Torvalds
bbf5c97901 Linux 5.9 2020-10-11 14:15:50 -07:00
Linus Torvalds
3dd0130f24 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "Five fixes.

  Subsystems affected by this patch series: MAINTAINERS, mm/pagemap,
  mm/swap, and mm/hugetlb"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged
  mm: validate inode in mapping_set_error()
  mm: mmap: Fix general protection fault in unlink_file_vma()
  MAINTAINERS: Antoine Tenart's email address
  MAINTAINERS: change hardening mailing list
2020-10-11 11:18:04 -07:00
Linus Torvalds
5b697f86f9 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fix from Al Viro:
 "Fixes an obvious bug (memory leak introduced in 5.8)"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  pipe: Fix memory leaks in create_pipe_files()
2020-10-11 11:11:35 -07:00
Linus Torvalds
c120ec12e2 Merge tag 'x86-urgent-2020-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Two fixes:

   - Fix a (hopefully final) IRQ state tracking bug vs MCE handling

   - Fix a documentation link"

* tag 'x86-urgent-2020-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/x86: Fix incorrect references to zero-page.txt
  x86/mce: Use idtentry_nmi_enter/exit()
2020-10-11 10:53:37 -07:00
Linus Torvalds
aa5c3a2911 Merge tag 'perf-urgent-2020-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Ingo Molnar:
 "Fix an error handling bug that can cause a lockup if a CPU is offline
  (doh ...)"

* tag 'perf-urgent-2020-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix task_function_call() error handling
2020-10-11 10:43:37 -07:00
Vijay Balakrishna
4aab2be098 mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged
When memory is hotplug added or removed, min_free_kbytes should be
recalculated based on what khugepaged expects.  Currently, after hotplug,
min_free_kbytes is reset to a lower default, and the higher default that
was set when THP was enabled is lost.

This change restores min_free_kbytes as expected for THP consumers.
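
A minimal sketch of the shape of the fix, assuming a khugepaged helper that
re-applies the recommended value once the watermarks have been recalculated
(names approximate, not the exact upstream diff):

  /* Sketch: after hotplug recomputes the default watermarks, let
   * khugepaged re-apply its higher recommended min_free_kbytes. */
  void khugepaged_min_free_kbytes_update(void)
  {
          mutex_lock(&khugepaged_mutex);
          if (khugepaged_enabled() && khugepaged_thread)
                  set_recommended_min_free_kbytes();
          mutex_unlock(&khugepaged_mutex);
  }

  /* ...called from init_per_zone_wmark_min() once min_free_kbytes has
   * been recalculated for the new memory layout. */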

[vijayb@linux.microsoft.com: v5]
  Link: https://lkml.kernel.org/r/1601398153-5517-1-git-send-email-vijayb@linux.microsoft.com

Fixes: f000565adb ("thp: set recommended min free kbytes")
Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Allen Pais <apais@microsoft.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/1600305709-2319-2-git-send-email-vijayb@linux.microsoft.com
Link: https://lkml.kernel.org/r/1600204258-13683-1-git-send-email-vijayb@linux.microsoft.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-11 10:31:11 -07:00
Minchan Kim
8b7b2eb131 mm: validate inode in mapping_set_error()
The swap address_space doesn't have a host, so the kernel crashes once a
swap write hits an error. Fix it.
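
A minimal sketch of the kind of guard this implies, assuming it is the
per-superblock error reporting that dereferences mapping->host (not the
exact upstream diff):

  static inline void mapping_set_error(struct address_space *mapping, int error)
  {
          if (likely(!error))
                  return;

          /* Record the error for errseq_t based (fsync/syncfs) reporting. */
          __filemap_set_wb_err(mapping, error);

          /* Only report to the superblock if there is a host inode to
           * reach it through -- the swap address_space has none. */
          if (mapping->host)
                  errseq_set(&mapping->host->i_sb->s_wb_err, error);

          if (error == -ENOSPC)
                  set_bit(AS_ENOSPC, &mapping->flags);
          else
                  set_bit(AS_EIO, &mapping->flags);
  }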

Fixes: 735e4ae5ba ("vfs: track per-sb writeback errors and report them to syncfs")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Jeff Layton <jlayton@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Andres Freund <andres@anarazel.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201010000650.750063-1-minchan@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-11 10:31:10 -07:00
Miaohe Lin
bc4fe4cdd6 mm: mmap: Fix general protection fault in unlink_file_vma()
The syzbot reported the below general protection fault:

  general protection fault, probably for non-canonical address
  0xe00eeaee0000003b: 0000 [#1] PREEMPT SMP KASAN
  KASAN: maybe wild-memory-access in range [0x00777770000001d8-0x00777770000001df]
  CPU: 1 PID: 10488 Comm: syz-executor721 Not tainted 5.9.0-rc3-syzkaller #0
  RIP: 0010:unlink_file_vma+0x57/0xb0 mm/mmap.c:164
  Call Trace:
     free_pgtables+0x1b3/0x2f0 mm/memory.c:415
     exit_mmap+0x2c0/0x530 mm/mmap.c:3184
     __mmput+0x122/0x470 kernel/fork.c:1076
     mmput+0x53/0x60 kernel/fork.c:1097
     exit_mm kernel/exit.c:483 [inline]
     do_exit+0xa8b/0x29f0 kernel/exit.c:793
     do_group_exit+0x125/0x310 kernel/exit.c:903
     get_signal+0x428/0x1f00 kernel/signal.c:2757
     arch_do_signal+0x82/0x2520 arch/x86/kernel/signal.c:811
     exit_to_user_mode_loop kernel/entry/common.c:136 [inline]
     exit_to_user_mode_prepare+0x1ae/0x200 kernel/entry/common.c:167
     syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:242
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

It happens because the ->mmap() callback can change vma->vm_file and fput()
the original file.  But commit d70cec8983 ("mm: mmap: merge vma after
call_mmap() if possible") failed to catch this case and always fput()s the
original file, thereby adding an extra fput() on it.

[ Thanks Hillf for pointing this extra fput() out. ]
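
A sketch of the pattern being described in mmap_region(), assuming the
vma-merge path is where the stale reference gets dropped (simplified, not
the exact diff):

  vma->vm_file = get_file(file);
  error = call_mmap(file, vma);      /* may replace vma->vm_file */

  /* ... later, if the vma can be merged with a neighbour ... */
  if (merged) {
          /* was: fput(file) -- wrong when ->mmap() swapped vm_file,
           * since ->mmap() already released the original file, so this
           * drops one reference too many on it */
          fput(vma->vm_file);
          vm_area_free(vma);
          vma = merged_vma;
  }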

Fixes: d70cec8983 ("mm: mmap: merge vma after call_mmap() if possible")
Reported-by: syzbot+c5d5a51dcbb558ca0cb5@syzkaller.appspotmail.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christian König <ckoenig.leichtzumerken@gmail.com>
Cc: Hongxiang Lou <louhongxiang@huawei.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Link: https://lkml.kernel.org/r/20200916090733.31427-1-linmiaohe@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-11 10:31:10 -07:00
Antoine Tenart
512b557ac8 MAINTAINERS: Antoine Tenart's email address
Use my kernel.org address instead of my bootlin.com one.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20201005164533.16811-1-atenart@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-11 10:31:10 -07:00
Kees Cook
ae4a380109 MAINTAINERS: change hardening mailing list
As more email from git history gets aimed at the OpenWall
kernel-hardening@ list, there has been a desire to separate "new topics"
from "on-going" work.

To handle this, the superset of hardening email topics are now to be
directed to linux-hardening@vger.kernel.org.

Update the MAINTAINERS file and the .mailmap to accomplish this, so that
linux-hardening@ can be treated like any other regular upstream kernel
development list.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: "Tobin C. Harding" <me@tobin.cc>
Cc: Tycho Andersen <tycho@tycho.pizza>
Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/linux-hardening/202010051443.279CC265D@keescook/
Link: https://lkml.kernel.org/r/20201006000012.2768958-1-keescook@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-11 10:31:10 -07:00
Linus Torvalds
da690031a5 Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "Some more driver bugfixes for I2C. Including a revert - the updated
  series for it will come during the next merge window"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: owl: Clear NACK and BUS error bits
  Revert "i2c: imx: Fix reset of I2SR_IAL flag"
  i2c: meson: fixup rate calculation with filter delay
  i2c: meson: keep peripheral clock enabled
  i2c: meson: fix clock setting overwrite
  i2c: imx: Fix reset of I2SR_IAL flag
2020-10-10 16:09:12 -07:00
Vladimir Zapolskiy
64b7f674c2 cifs: Fix incomplete memory allocation on setxattr path
On the setxattr() syscall path, due to an apparent typo, the size of a
dynamically allocated memory chunk for storing a struct
smb2_file_full_ea_info object is computed incorrectly: the first addend is
the size of a pointer instead of the wanted object size.  Coincidentally it
makes no difference on 64-bit platforms; however, on 32-bit targets the
following memcpy() writes 4 bytes of data outside of the dynamically
allocated memory.
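
In other words, the classic C mistake sketched below (illustrative variable
names, not the actual smb2_set_ea() code):

  struct smb2_file_full_ea_info *ea;

  /* buggy: sizeof(ea) is the size of a pointer (4 bytes on 32-bit),
   * so the buffer ends up too small for the structure header */
  ea = kmalloc(sizeof(ea) + name_len + value_len, GFP_KERNEL);

  /* intended: reserve room for the structure itself */
  ea = kmalloc(sizeof(*ea) + name_len + value_len, GFP_KERNEL);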

  =============================================================================
  BUG kmalloc-16 (Not tainted): Redzone overwritten
  -----------------------------------------------------------------------------

  Disabling lock debugging due to kernel taint
  INFO: 0x79e69a6f-0x9e5cdecf @offset=368. First byte 0x73 instead of 0xcc
  INFO: Slab 0xd36d2454 objects=85 used=51 fp=0xf7d0fc7a flags=0x35000201
  INFO: Object 0x6f171df3 @offset=352 fp=0x00000000

  Redzone 5d4ff02d: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ................
  Object 6f171df3: 00 00 00 00 00 05 06 00 73 6e 72 75 62 00 66 69  ........snrub.fi
  Redzone 79e69a6f: 73 68 32 0a                                      sh2.
  Padding 56254d82: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
  CPU: 0 PID: 8196 Comm: attr Tainted: G    B             5.9.0-rc8+ #3
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
  Call Trace:
   dump_stack+0x54/0x6e
   print_trailer+0x12c/0x134
   check_bytes_and_report.cold+0x3e/0x69
   check_object+0x18c/0x250
   free_debug_processing+0xfe/0x230
   __slab_free+0x1c0/0x300
   kfree+0x1d3/0x220
   smb2_set_ea+0x27d/0x540
   cifs_xattr_set+0x57f/0x620
   __vfs_setxattr+0x4e/0x60
   __vfs_setxattr_noperm+0x4e/0x100
   __vfs_setxattr_locked+0xae/0xd0
   vfs_setxattr+0x4e/0xe0
   setxattr+0x12c/0x1a0
   path_setxattr+0xa4/0xc0
   __ia32_sys_lsetxattr+0x1d/0x20
   __do_fast_syscall_32+0x40/0x70
   do_fast_syscall_32+0x29/0x60
   do_SYSENTER_32+0x15/0x20
   entry_SYSENTER_32+0x9f/0xf2

Fixes: 5517554e43 ("cifs: Add support for writing attributes on SMB2+")
Signed-off-by: Vladimir Zapolskiy <vladimir@tuxera.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-10 15:52:54 -07:00
Hugh Dickins
033b5d7755 mm/khugepaged: fix filemap page_to_pgoff(page) != offset
There have been elusive reports of filemap_fault() hitting its
VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page) on kernels built
with CONFIG_READ_ONLY_THP_FOR_FS=y.

Suren has hit it on a kernel with CONFIG_READ_ONLY_THP_FOR_FS=y and
CONFIG_NUMA is not set: and he has analyzed it down to how khugepaged
without NUMA reuses the same huge page after collapse_file() failed
(whereas NUMA targets its allocation to the respective node each time).
And most of us were usually testing with CONFIG_NUMA=y kernels.

collapse_file(old start)
  new_page = khugepaged_alloc_page(hpage)
  __SetPageLocked(new_page)
  new_page->index = start // hpage->index=old offset
  new_page->mapping = mapping
  xas_store(&xas, new_page)

                          filemap_fault
                            page = find_get_page(mapping, offset)
                            // if offset falls inside hpage then
                            // compound_head(page) == hpage
                            lock_page_maybe_drop_mmap()
                              __lock_page(page)

  // collapse fails
  xas_store(&xas, old page)
  new_page->mapping = NULL
  unlock_page(new_page)

collapse_file(new start)
  new_page = khugepaged_alloc_page(hpage)
  __SetPageLocked(new_page)
  new_page->index = start // hpage->index=new offset
  new_page->mapping = mapping // mapping becomes valid again

                            // since compound_head(page) == hpage
                            // page_to_pgoff(page) got changed
                            VM_BUG_ON_PAGE(page_to_pgoff(page) != offset)

An initial patch replaced __SetPageLocked() by lock_page(), which did
fix the race which Suren illustrates above.  But testing showed that it's
not good enough: if the racing task's __lock_page() gets delayed long
after its find_get_page(), then it may follow collapse_file(new start)'s
successful final unlock_page(), and crash on the same VM_BUG_ON_PAGE.

It could be fixed by relaxing filemap_fault()'s VM_BUG_ON_PAGE to a
check and retry (as is done for mapping), with similar relaxations in
find_lock_entry() and pagecache_get_page(): but it's not obvious what
else might get caught out; and khugepaged non-NUMA appears to be unique
in exposing a page to page cache, then revoking, without going through
a full cycle of freeing before reuse.

Instead, make non-NUMA khugepaged_prealloc_page() release the old page
if anyone else has a reference to it (1% of cases when I tested).

Although never reported on huge tmpfs, I believe its find_lock_entry()
has been at similar risk; but huge tmpfs does not rely on khugepaged
for its normal working nearly so much as READ_ONLY_THP_FOR_FS does.

Reported-by: Denis Lisov <dennis.lissov@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206569
Link: https://lore.kernel.org/linux-mm/?q=20200219144635.3b7417145de19b65f258c943%40linux-foundation.org
Reported-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/linux-xfs/?q=20200616013309.GB815%40lca.pw
Reported-and-analyzed-by: Suren Baghdasaryan <surenb@google.com>
Fixes: 87c460a0bd ("mm/khugepaged: collapse_shmem() without freezing new_page")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # v4.9+
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-10 15:52:54 -07:00
Cristian Ciocaltea
f5b3f43364 i2c: owl: Clear NACK and BUS error bits
When the NACK and BUS error bits are set by the hardware, the driver is
responsible for clearing them by writing "1" into the corresponding
status registers.

Hence perform the necessary operations in owl_i2c_interrupt().

Fixes: d211e62af4 ("i2c: Add Actions Semiconductor Owl family S900 I2C driver")
Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@gmail.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-10-10 13:15:46 +02:00
Wolfram Sang
5a02e7c429 Revert "i2c: imx: Fix reset of I2SR_IAL flag"
This reverts commit fa4d305568. An updated
version was sent. So, revert this version and give the new version more
time for testing.

Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-10-10 13:03:54 +02:00
Linus Torvalds
6f2f486d57 Merge tag 'spi-fix-v5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fix from Mark Brown:
 "One last minute fix for v5.9 which has been causing crashes in test
  systems with the fsl-dspi driver when they hit deferred probe (and
  which I probably let cook in next a bit longer than is ideal).

  And an update to MAINTAINERS reflecting Serge's extensive and
  detailed recent work on the DesignWare driver"

* tag 'spi-fix-v5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  MAINTAINERS: Add maintainer of DW APB SSI driver
  spi: fsl-dspi: fix NULL pointer dereference
2020-10-09 18:05:12 -07:00
Linus Torvalds
8a5f78d98c Merge tag 'riscv-for-linus-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
 "Two fixes this week:

   - A fix to actually reserve the device tree's memory. Without this
     the device tree can be overwritten on systems that don't otherwise
     reserve it. This issue should only manifest on !MMU systems.

   - A workaround for a BUG() that triggers when the memory that
     originally contained initdata is freed and later repurposed. This
     triggers a BUG() on builds that had HARDENED_USERCOPY enabled"

* tag 'riscv-for-linus-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fixup bootup failure with HARDENED_USERCOPY
  RISC-V: Make sure memblock reserves the memory containing DT
2020-10-09 11:49:22 -07:00
Linus Torvalds
277e570ae1 Merge tag 'for-v5.9-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply fix from Sebastian Reichel:
 "Just a single change to revert enablement of packet error checking for
  battery data on Chromebooks, since some of their embedded controllers
  do not handle it correctly"

* tag 'for-v5.9-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: sbs-battery: chromebook workaround for PEC
2020-10-09 11:38:07 -07:00
Linus Torvalds
d813a8cb8d Merge tag 'gpio-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
 "Some late fixes: one IRQ issue and one compilation issue for UML.

   - Fix a compilation issue with User Mode Linux

   - Handle spurious interrupts properly in the PCA953x driver"

* tag 'gpio-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: pca953x: Survive spurious interrupts
  gpiolib: Disable compat ->read() code in UML case
2020-10-09 11:33:48 -07:00
Linus Torvalds
f318052ef2 Merge tag 'mmc-v5.9-rc4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fix from Ulf Hansson:
 "Assign a proper discard granularity rather than incorrectly set it to
  zero"

* tag 'mmc-v5.9-rc4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: core: don't set limits.discard_granularity as 0
2020-10-09 10:10:52 -07:00
Linus Torvalds
fd330b1bc2 Merge tag 'drm-fixes-2020-10-09' of git://anongit.freedesktop.org/drm/drm
Pull amdgpu drm fixes from Dave Airlie:
 "Fixes trickling in this week.

  Alex had a final fix for the newest GPU they introduced in rc1, along
  with one build regression and one crasher fix.

  Cross my fingers that's it for 5.9:

   - Fix a crash on renoir if you override the IP discovery parameter

   - Fix the build on ARC platforms

   - Display fix for Sienna Cichlid"

* tag 'drm-fixes-2020-10-09' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: Change ABM config init interface
  drm/amdgpu/swsmu: fix ARC build errors
  drm/amdgpu: fix NULL pointer dereference for Renoir
2020-10-09 09:59:36 -07:00
Coly Li
4243219141 mmc: core: don't set limits.discard_granularity as 0
In mmc_queue_setup_discard() the mmc driver queue's discard_granularity
might be set to 0 (when card->pref_erase > max_discard) while the mmc
device still declares support for the discard operation. This is buggy and
triggers the following kernel warning:

WARNING: CPU: 0 PID: 135 at __blkdev_issue_discard+0x200/0x294
CPU: 0 PID: 135 Comm: f2fs_discard-17 Not tainted 5.9.0-rc6 #1
Hardware name: Google Kevin (DT)
pstate: 00000005 (nzcv daif -PAN -UAO BTYPE=--)
pc : __blkdev_issue_discard+0x200/0x294
lr : __blkdev_issue_discard+0x54/0x294
sp : ffff800011dd3b10
x29: ffff800011dd3b10 x28: 0000000000000000
x27: ffff800011dd3cc4 x26: ffff800011dd3e18
x25: 000000000004e69b x24: 0000000000000c40
x23: ffff0000f1deaaf0 x22: ffff0000f2849200
x21: 00000000002734d8 x20: 0000000000000008
x19: 0000000000000000 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000394
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 00000000000008b0
x9 : ffff800011dd3cb0 x8 : 000000000004e69b
x7 : 0000000000000000 x6 : ffff0000f1926400
x5 : ffff0000f1940800 x4 : 0000000000000000
x3 : 0000000000000c40 x2 : 0000000000000008
x1 : 00000000002734d8 x0 : 0000000000000000
Call trace:
__blkdev_issue_discard+0x200/0x294
__submit_discard_cmd+0x128/0x374
__issue_discard_cmd_orderly+0x188/0x244
__issue_discard_cmd+0x2e8/0x33c
issue_discard_thread+0xe8/0x2f0
kthread+0x11c/0x120
ret_from_fork+0x10/0x1c
---[ end trace e4c8023d33dfe77a ]---

This patch fixes the issue by setting discard_granularity to SECTOR_SIZE
instead of 0 when (card->pref_erase > max_discard) is true. Now
__blkdev_issue_discard() no longer complains about an improper discard
granularity value.

This issue is exposed by commit b35fd7422c ("block: check queue's
limits.discard_granularity in __blkdev_issue_discard()"), so a "Fixes:" tag
is also added for that commit to make sure people won't miss this patch
after applying the change to __blkdev_issue_discard().
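
A sketch of what that amounts to in mmc_queue_setup_discard() (simplified):

  q->limits.discard_granularity = card->pref_erase << 9;
  /* granularity must not be left at 0 while discard is advertised */
  if (card->pref_erase > max_discard)
          q->limits.discard_granularity = SECTOR_SIZE;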

Fixes: e056a1b5b6 ("mmc: queue: let host controllers specify maximum discard timeout")
Fixes: b35fd7422c ("block: check queue's limits.discard_granularity in __blkdev_issue_discard()").
Reported-and-tested-by: Vicente Bergas <vicencb@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20201002013852.51968-1-colyli@suse.de
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-10-09 08:26:09 +02:00
Kajol Jain
6d6b8b9f4f perf: Fix task_function_call() error handling
The error handling introduced by commit:

  2ed6edd33a ("perf: Add cond_resched() to task_function_call()")

loses any return value from smp_call_function_single() that is not
{0, -EINVAL}. This is a problem because it will return -ENXIO when the
target CPU is offline. Worse, in that case it'll turn into an infinite
loop.
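
A sketch of the corrected retry loop (simplified; "data" is the bookkeeping
struct task_function_call() already uses to collect the callee's result):

  for (;;) {
          ret = smp_call_function_single(task_cpu(p), remote_function,
                                         &data, 1);
          if (!ret)
                  ret = data.ret;   /* IPI delivered: use the callee's result */

          if (ret != -EAGAIN)
                  break;            /* success, or a real error such as
                                     * -ENXIO for an offline CPU */
          cond_resched();
  }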

Fixes: 2ed6edd33a ("perf: Add cond_resched() to task_function_call()")
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Barret Rhoden <brho@google.com>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: https://lkml.kernel.org/r/20200827064732.20860-1-kjain@linux.ibm.com
2020-10-09 08:18:33 +02:00
Dave Airlie
dded93ffbb Merge tag 'amd-drm-fixes-5.9-2020-10-08' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.9-2020-10-08:

amdgpu:
- Fix a crash on renoir if you override the IP discovery parameter
- Fix the build on ARC platforms
- Display fix for Sienna Cichlid

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201009024917.3984-1-alexander.deucher@amd.com
2020-10-09 13:02:49 +10:00
Linus Torvalds
583090b1b8 Merge tag 'block5.9-2020-10-08' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A few fixes that should go into this release:

   - NVMe controller error path reference fix (Chaitanya)

   - Fix regression with IBM partitions on non-dasd devices (Christoph)

   - Fix a missing clear in the compat CDROM packet structure (Peilin)"

* tag 'block5.9-2020-10-08' of git://git.kernel.dk/linux-block:
  partitions/ibm: fix non-DASD devices
  nvme-core: put ctrl ref when module ref get fail
  block/scsi-ioctl: Fix kernel-infoleak in scsi_put_cdrom_generic_arg()
2020-10-08 18:48:34 -07:00
Sebastian Reichel
e3f2396b75 power: supply: sbs-battery: chromebook workaround for PEC
Looks like the I2C tunnel implementation from Chromebook's
embedded controller does not handle PEC correctly. Fix this
by disabling PEC for batteries behind those I2C tunnels as
a workaround.

Note, that some Chromebooks actually have been reported to
have working PEC support (with I2C tunnel). Since the problem
has not yet been fully understood this simply reverts all
Chromebooks to not use PEC for now.

Reported-by: "Milan P. Stanić" <mps@arvanta.net>
Reported-by: Vicente Bergas <vicencb@gmail.com>
CC: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Fixes: 7222bd603d ("power: supply: sbs-battery: add PEC support")
Tested-by: Vicente Bergas <vicencb@gmail.com>
Tested-by: "Milan P. Stanić" <mps@arvanta.net>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2020-10-09 01:09:37 +02:00
Linus Torvalds
3fdd47c3b4 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull vhost fixes from Michael Tsirkin:
 "Some last minute vhost,vdpa fixes.

  The last two of them haven't been in next but they do seem kind of
  obvious, very small and safe, fix bugs reported in the field, and they
  are both in a new mlx5 vdpa driver, so it's not like we can introduce
  regressions"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa/mlx5: Fix dependency on MLX5_CORE
  vdpa/mlx5: should keep avail_index despite device status
  vhost-vdpa: fix page pinning leakage in error path
  vhost-vdpa: fix vhost_vdpa_map() on error condition
  vhost: Don't call log_access_ok() when using IOTLB
  vhost: Use vhost_get_used_size() in vhost_vring_set_addr()
  vhost: Don't call access_ok() when using IOTLB
  vhost vdpa: fix vhost_vdpa_open error handling
2020-10-08 14:25:46 -07:00
Yongqiang Sun
33c8256b3b drm/amd/display: Change ABM config init interface
[Why & How]
Change the ABM config init interface to support multiple ABMs.

Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-08 17:15:52 -04:00
Linus Torvalds
6288c1d802 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "One more set of fixes from the networking tree:

   - add missing input validation in nl80211_del_key(), preventing
     out-of-bounds access

   - last minute fix / improvement of a MRP netlink (uAPI) interface
     introduced in 5.9 (current) release

   - fix "unresolved symbol" build error under CONFIG_NET w/o
     CONFIG_INET due to missing tcp_timewait_sock and inet_timewait_sock
     BTF.

   - fix 32 bit sub-register bounds tracking in the bpf verifier for OR
     case

   - tcp: fix receive window update in tcp_add_backlog()

   - openvswitch: handle DNAT tuple collision in conntrack-related code

   - r8169: wait for potential PHY reset to finish after applying a FW
     file, avoiding unexpected PHY behaviour and failures later on

   - mscc: fix tail dropping watermarks for Ocelot switches

   - avoid use-after-free in macsec code after a call to the GRO layer

   - avoid use-after-free in sctp error paths

   - add a device id for Cellient MPL200 WWAN card

   - rxrpc fixes:
      - fix the xdr encoding of the contents read from an rxrpc key
      - fix a BUG() for an unsupported encoding type.
      - fix missing _bh lock annotations.
      - fix acceptance handling for an incoming call where the incoming
        call is encrypted.
      - the server token keyring isn't network namespaced - it belongs
        to the server, so there's no need. Namespacing it means that
        request_key() fails to find it.
      - fix a leak of the server keyring"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (21 commits)
  net: usb: qmi_wwan: add Cellient MPL200 card
  macsec: avoid use-after-free in macsec_handle_frame()
  r8169: consider that PHY reset may still be in progress after applying firmware
  openvswitch: handle DNAT tuple collision
  sctp: fix sctp_auth_init_hmacs() error path
  bridge: Netlink interface fix.
  net: wireless: nl80211: fix out-of-bounds access in nl80211_del_key()
  bpf: Fix scalar32_min_max_or bounds tracking
  tcp: fix receive window update in tcp_add_backlog()
  net: usb: rtl8150: set random MAC address when set_ethernet_addr() fails
  mptcp: more DATA FIN fixes
  net: mscc: ocelot: warn when encoding an out-of-bounds watermark value
  net: mscc: ocelot: divide watermark value by 60 when writing to SYS_ATOP
  net: qrtr: ns: Fix the incorrect usage of rcu_read_lock()
  rxrpc: Fix server keyring leak
  rxrpc: The server keyring isn't network-namespaced
  rxrpc: Fix accept on a connection that need securing
  rxrpc: Fix some missing _bh annotations on locking conn->state_lock
  rxrpc: Downgrade the BUG() for unsupported token type in rxrpc_read()
  rxrpc: Fix rxkad token xdr encoding
  ...
2020-10-08 14:11:21 -07:00
Eli Cohen
aff90770e5 vdpa/mlx5: Fix dependency on MLX5_CORE
Remove the prompt for selecting MLX5_VDPA by the user and modify
MLX5_VDPA_NET to select MLX5_VDPA. Also modify MLX5_VDPA_NET to depend
on mlx5_core.

This fixes an issue where configuration sets 'y' for MLX5_VDPA_NET while
MLX5_CORE is compiled as a module causing link errors.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1a86b377aa ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20201007064011.GA50074@mtl-vdi-166.wap.labs.mlnx
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-10-08 16:02:00 -04:00
Si-Wei Liu
3176e974a7 vdpa/mlx5: should keep avail_index despite device status
A VM with mlx5 vDPA shows the warnings below while being reset:

vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)
vhost VQ 1 ring restore failed: -1: Resource temporarily unavailable (11)

We should allow the userspace emulating the virtio device to
get the vq's avail_index regardless of the vDPA device status.
Save the index that was last seen when the virtq was stopped,
so that userspace doesn't complain.

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Link: https://lore.kernel.org/r/1601583511-15138-1-git-send-email-si-wei.liu@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
2020-10-08 16:02:00 -04:00
Wilken Gottwalt
28802e7c0c net: usb: qmi_wwan: add Cellient MPL200 card
Add usb ids of the Cellient MPL200 card.

Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 12:26:31 -07:00
Eric Dumazet
c7cc9200e9 macsec: avoid use-after-free in macsec_handle_frame()
De-referencing skb after call to gro_cells_receive() is not allowed.
We need to fetch skb->len earlier.
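
A sketch of the ordering fix (the stats helper name is approximate):

  unsigned int len = skb->len;    /* grab it before GRO may free the skb */

  ret = gro_cells_receive(&macsec->gro_cells, skb);
  if (ret == NET_RX_SUCCESS)
          count_rx(dev, len);     /* instead of touching skb->len here */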

Fixes: 5491e7c6b1 ("macsec: enable GRO and RPS on macsec devices")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 12:21:08 -07:00
Heiner Kallweit
47dda78671 r8169: consider that PHY reset may still be in progress after applying firmware
Some firmware files trigger a PHY soft reset and don't wait for it to
finish. Therefore, PHY register writes made directly after applying the
firmware may fail or provide unexpected results. Fix this by waiting
for bit BMCR_RESET to be cleared after applying the firmware.

There's nothing wrong with the referenced change, it's just that the
fix will apply cleanly only after this change.
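
A sketch of the wait described above (hypothetical helper name, plain
polling instead of the driver's poll-timeout helpers):

  static void rtl_wait_phy_reset_done(struct phy_device *phydev)
  {
          int i, bmcr;

          for (i = 0; i < 25; i++) {
                  bmcr = phy_read(phydev, MII_BMCR);
                  if (bmcr >= 0 && !(bmcr & BMCR_RESET))
                          return;         /* self-clearing reset bit is gone */
                  msleep(20);
          }
          phydev_warn(phydev, "PHY reset not done after applying firmware\n");
  }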

Fixes: 89fbd26cca ("r8169: fix firmware not resetting tp->ocp_base")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 12:20:51 -07:00
Dumitru Ceara
8aa7b526dc openvswitch: handle DNAT tuple collision
With multiple DNAT rules it's possible that after destination
translation the resulting tuples collide.

For example, two openvswitch flows:
nw_dst=10.0.0.10,tp_dst=10, actions=ct(commit,table=2,nat(dst=20.0.0.1:20))
nw_dst=10.0.0.20,tp_dst=10, actions=ct(commit,table=2,nat(dst=20.0.0.1:20))

Assuming two TCP clients initiating the following connections:
10.0.0.10:5000->10.0.0.10:10
10.0.0.10:5000->10.0.0.20:10

Both tuples would translate to 10.0.0.10:5000->20.0.0.1:20 causing
nf_conntrack_confirm() to fail because of tuple collision.

Netfilter handles this case by allocating a null binding for SNAT at
egress by default.  Perform the same operation in openvswitch for DNAT
if no explicit SNAT is requested by the user and allocate a null binding
for SNAT for packets in the "original" direction.

Reported-at: https://bugzilla.redhat.com/1877128
Suggested-by: Florian Westphal <fw@strlen.de>
Fixes: 05752523e5 ("openvswitch: Interface with NAT.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 12:20:35 -07:00
Eric Dumazet
d42ee76ecb sctp: fix sctp_auth_init_hmacs() error path
After freeing ep->auth_hmacs we have to clear the pointer
or risk use-after-free as reported by syzbot:

BUG: KASAN: use-after-free in sctp_auth_destroy_hmacs net/sctp/auth.c:509 [inline]
BUG: KASAN: use-after-free in sctp_auth_destroy_hmacs net/sctp/auth.c:501 [inline]
BUG: KASAN: use-after-free in sctp_auth_free+0x17e/0x1d0 net/sctp/auth.c:1070
Read of size 8 at addr ffff8880a8ff52c0 by task syz-executor941/6874

CPU: 0 PID: 6874 Comm: syz-executor941 Not tainted 5.9.0-rc8-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x198/0x1fd lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 sctp_auth_destroy_hmacs net/sctp/auth.c:509 [inline]
 sctp_auth_destroy_hmacs net/sctp/auth.c:501 [inline]
 sctp_auth_free+0x17e/0x1d0 net/sctp/auth.c:1070
 sctp_endpoint_destroy+0x95/0x240 net/sctp/endpointola.c:203
 sctp_endpoint_put net/sctp/endpointola.c:236 [inline]
 sctp_endpoint_free+0xd6/0x110 net/sctp/endpointola.c:183
 sctp_destroy_sock+0x9c/0x3c0 net/sctp/socket.c:4981
 sctp_v6_destroy_sock+0x11/0x20 net/sctp/socket.c:9415
 sk_common_release+0x64/0x390 net/core/sock.c:3254
 sctp_close+0x4ce/0x8b0 net/sctp/socket.c:1533
 inet_release+0x12e/0x280 net/ipv4/af_inet.c:431
 inet6_release+0x4c/0x70 net/ipv6/af_inet6.c:475
 __sock_release+0xcd/0x280 net/socket.c:596
 sock_close+0x18/0x20 net/socket.c:1277
 __fput+0x285/0x920 fs/file_table.c:281
 task_work_run+0xdd/0x190 kernel/task_work.c:141
 exit_task_work include/linux/task_work.h:25 [inline]
 do_exit+0xb7d/0x29f0 kernel/exit.c:806
 do_group_exit+0x125/0x310 kernel/exit.c:903
 __do_sys_exit_group kernel/exit.c:914 [inline]
 __se_sys_exit_group kernel/exit.c:912 [inline]
 __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:912
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x43f278
Code: Bad RIP value.
RSP: 002b:00007fffe0995c38 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043f278
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00000000004bf068 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 0000000020000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000

Allocated by task 6874:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
 kasan_set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:461
 kmem_cache_alloc_trace+0x174/0x300 mm/slab.c:3554
 kmalloc include/linux/slab.h:554 [inline]
 kmalloc_array include/linux/slab.h:593 [inline]
 kcalloc include/linux/slab.h:605 [inline]
 sctp_auth_init_hmacs+0xdb/0x3b0 net/sctp/auth.c:464
 sctp_auth_init+0x8a/0x4a0 net/sctp/auth.c:1049
 sctp_setsockopt_auth_supported net/sctp/socket.c:4354 [inline]
 sctp_setsockopt+0x477e/0x97f0 net/sctp/socket.c:4631
 __sys_setsockopt+0x2db/0x610 net/socket.c:2132
 __do_sys_setsockopt net/socket.c:2143 [inline]
 __se_sys_setsockopt net/socket.c:2140 [inline]
 __x64_sys_setsockopt+0xba/0x150 net/socket.c:2140
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 6874:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
 kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:355
 __kasan_slab_free+0xd8/0x120 mm/kasan/common.c:422
 __cache_free mm/slab.c:3422 [inline]
 kfree+0x10e/0x2b0 mm/slab.c:3760
 sctp_auth_destroy_hmacs net/sctp/auth.c:511 [inline]
 sctp_auth_destroy_hmacs net/sctp/auth.c:501 [inline]
 sctp_auth_init_hmacs net/sctp/auth.c:496 [inline]
 sctp_auth_init_hmacs+0x2b7/0x3b0 net/sctp/auth.c:454
 sctp_auth_init+0x8a/0x4a0 net/sctp/auth.c:1049
 sctp_setsockopt_auth_supported net/sctp/socket.c:4354 [inline]
 sctp_setsockopt+0x477e/0x97f0 net/sctp/socket.c:4631
 __sys_setsockopt+0x2db/0x610 net/socket.c:2132
 __do_sys_setsockopt net/socket.c:2143 [inline]
 __se_sys_setsockopt net/socket.c:2140 [inline]
 __x64_sys_setsockopt+0xba/0x150 net/socket.c:2140
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
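
The change itself boils down to the classic free-then-clear pattern on the
error path; a minimal sketch (error-label layout assumed, not the exact
diff):

  out_err:
          /* Clean up any partially allocated hmacs and, crucially, clear
           * the pointer so a later sctp_auth_free() doesn't walk an
           * already-freed array. */
          sctp_auth_destroy_hmacs(ep->auth_hmacs);
          ep->auth_hmacs = NULL;
          return -ENOMEM;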

Fixes: 1f485649f5 ("[SCTP]: Implement SCTP-AUTH internals")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 12:19:51 -07:00
Jakub Kicinski
a9e54cb3d5 Merge tag 'mac80211-for-net-2020-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
pull-request: mac80211 2020-10-08

A single fix for missing input validation in nl80211.
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 12:18:34 -07:00
Jakub Kicinski
cfe90f4980 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2020-10-08

The main changes are:

1) Fix "unresolved symbol" build error under CONFIG_NET w/o CONFIG_INET due
   to missing tcp_timewait_sock and inet_timewait_sock BTF, from Yonghong Song.

2) Fix 32 bit sub-register bounds tracking for OR case, from Daniel Borkmann.
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 12:05:37 -07:00
Henrik Bjoernlund
b6c02ef549 bridge: Netlink interface fix.
This commit corrects the netlink br_fill_ifinfo() function so it can
handle a 'filter_mask' with multiple flags asserted.

Fixes: 36a8e8e265 ("bridge: Extend br_fill_ifinfo to return MPR status")

Signed-off-by: Henrik Bjoernlund <henrik.bjoernlund@microchip.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Suggested-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-08 12:05:07 -07:00
Linus Torvalds
3d006ee42d Merge tag 'drm-fixes-2020-10-08' of git://anongit.freedesktop.org/drm/drm
Pull drm nouveau fixes from Dave Airlie:
 "Karol found two last minute nouveau fixes, they both fix crashes, the
  TTM one follows what other drivers do already, and the other is for
  bailing on load on unrecognised chipsets.

   - fix crash in TTM alloc fail path

   - return error earlier for unknown chipsets"

* tag 'drm-fixes-2020-10-08' of git://anongit.freedesktop.org/drm/drm:
  drm/nouveau/mem: guard against NULL pointer access in mem_del
  drm/nouveau/device: return error for unknown chipsets
2020-10-08 11:14:17 -07:00
Linus Torvalds
b9e3aa2a9b Merge tag 'exfat-for-5.9-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat fixes from Namjae Jeon:

 - Fix use of uninitialized spinlock on error path

 - Fix missing err assignment in exfat_build_inode()

* tag 'exfat-for-5.9-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: fix use of uninitialized spinlock on error path
  exfat: fix pointer error checking
2020-10-08 11:10:13 -07:00
Linus Torvalds
86f0a5fb1b Merge tag 'for-linus-5.9b-rc9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
 "One fix for a regression when booting as a Xen guest on ARM64
  introduced probably during the 5.9 cycle. It is very low risk as it is
  modifying Xen specific code only.

  The exact commit introducing the bug hasn't been identified yet, but
  everything was fine in 5.8 and only in 5.9 some configurations started
  to fail"

* tag 'for-linus-5.9b-rc9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  arm/arm64: xen: Fix to convert percpu address to gfn correctly
2020-10-08 11:01:53 -07:00
David Howells
ec0fa0b659 afs: Fix deadlock between writeback and truncate
The afs filesystem has a lock[*] that it uses to serialise I/O operations
going to the server (vnode->io_lock), as the server will only perform one
modification operation at a time on any given file or directory.  This
prevents the filesystem from filling up all the call slots to a server
with calls that aren't going to be executed in parallel anyway, thereby
allowing operations on other files to obtain slots.

  [*] Note that this is probably redundant for directories at least since
      i_rwsem is used to serialise directory modifications and
      lookup/reading vs modification.  The server does allow parallel
      non-modification ops, however.

When a file truncation op completes, we truncate the in-memory copy of the
file to match - but we do it whilst still holding the io_lock, the idea
being to prevent races with other operations.

However, if writeback starts in a worker thread simultaneously with
truncation (whilst notify_change() is called with i_rwsem locked, writeback
pays it no heed), it may manage to set PG_writeback bits on the pages that
will get truncated before afs_setattr_success() manages to call
truncate_pagecache().  Truncate will then wait for those pages - whilst
still inside io_lock:

    # cat /proc/8837/stack
    [<0>] wait_on_page_bit_common+0x184/0x1e7
    [<0>] truncate_inode_pages_range+0x37f/0x3eb
    [<0>] truncate_pagecache+0x3c/0x53
    [<0>] afs_setattr_success+0x4d/0x6e
    [<0>] afs_wait_for_operation+0xd8/0x169
    [<0>] afs_do_sync_operation+0x16/0x1f
    [<0>] afs_setattr+0x1fb/0x25d
    [<0>] notify_change+0x2cf/0x3c4
    [<0>] do_truncate+0x7f/0xb2
    [<0>] do_sys_ftruncate+0xd1/0x104
    [<0>] do_syscall_64+0x2d/0x3a
    [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

The writeback operation, however, stalls indefinitely because it needs to
get the io_lock to proceed:

    # cat /proc/5940/stack
    [<0>] afs_get_io_locks+0x58/0x1ae
    [<0>] afs_begin_vnode_operation+0xc7/0xd1
    [<0>] afs_store_data+0x1b2/0x2a3
    [<0>] afs_write_back_from_locked_page+0x418/0x57c
    [<0>] afs_writepages_region+0x196/0x224
    [<0>] afs_writepages+0x74/0x156
    [<0>] do_writepages+0x2d/0x56
    [<0>] __writeback_single_inode+0x84/0x207
    [<0>] writeback_sb_inodes+0x238/0x3cf
    [<0>] __writeback_inodes_wb+0x68/0x9f
    [<0>] wb_writeback+0x145/0x26c
    [<0>] wb_do_writeback+0x16a/0x194
    [<0>] wb_workfn+0x74/0x177
    [<0>] process_one_work+0x174/0x264
    [<0>] worker_thread+0x117/0x1b9
    [<0>] kthread+0xec/0xf1
    [<0>] ret_from_fork+0x1f/0x30

and thus deadlock has occurred.

Note that whilst afs_setattr() calls filemap_write_and_wait(), the fact
that the caller is holding i_rwsem doesn't preclude more pages being
dirtied through an mmap'd region.

Fix this by:

 (1) Use the vnode validate_lock to mediate access between afs_setattr()
     and afs_writepages():

     (a) Exclusively lock validate_lock in afs_setattr() around the whole
     	 RPC operation.

     (b) If WB_SYNC_ALL isn't set on entry to afs_writepages(), try to
         shared-lock validate_lock and return immediately if we can't
         get it.

     (c) If WB_SYNC_ALL is set, wait for the lock.

     The validate_lock is also used to validate a file and to zap its cache
     if the file was altered by a third party, so it's probably a good fit
     for this.

 (2) Move the truncation outside of the io_lock in setattr, using the same
     hook as is used for local directory editing.

     This requires the old i_size to be retained in the operation record as
     we commit the revised status to the inode members inside the io_lock
     still, but we still need to know if we reduced the file size.
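
A sketch of the writeback side of (1), assuming validate_lock is the
vnode's rw_semaphore (simplified; the actual writeback body is elided):

  static int afs_writepages(struct address_space *mapping,
                            struct writeback_control *wbc)
  {
          struct afs_vnode *vnode = AFS_FS_I(mapping->host);
          int ret = 0;

          /* Only block on an in-flight setattr when we must write everything. */
          if (wbc->sync_mode == WB_SYNC_ALL)
                  down_read(&vnode->validate_lock);
          else if (!down_read_trylock(&vnode->validate_lock))
                  return 0;

          /* ... do the region-by-region writeback as before, setting ret ... */

          up_read(&vnode->validate_lock);
          return ret;
  }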

Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-08 10:50:55 -07:00
Linus Torvalds
f3c64eda3e mm: avoid early COW write protect games during fork()
In commit 70e806e4e6 ("mm: Do early cow for pinned pages during fork()
for ptes") we write-protected the PTE before doing the page pinning
check, in order to avoid a race with concurrent fast-GUP pinning (which
doesn't take the mm semaphore or the page table lock).

That trick doesn't actually work - it doesn't handle memory ordering
properly, and doing so would be prohibitively expensive.

It also isn't really needed.  While we're moving in the direction of
allowing and supporting page pinning without marking the pinned area
with MADV_DONTFORK, the fact is that we've never really supported this
kind of odd "concurrent fork() and page pinning", and doing the
serialization on a pte level is just wrong.

We can add serialization with a per-mm sequence counter, so we know how
to solve that race properly, but we'll do that at a more appropriate
time.  Right now this just removes the write protect games.

It also turns out that the write protect games actually break on Power,
as reported by Aneesh Kumar:

 "Architecture like ppc64 expects set_pte_at to be not used for updating
  a valid pte. This is further explained in commit 56eecdb912 ("mm:
  Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bit")"

and the code triggered a warning there:

  WARNING: CPU: 0 PID: 30613 at arch/powerpc/mm/pgtable.c:185 set_pte_at+0x2a8/0x3a0 arch/powerpc/mm/pgtable.c:185
  Call Trace:
    copy_present_page mm/memory.c:857 [inline]
    copy_present_pte mm/memory.c:899 [inline]
    copy_pte_range mm/memory.c:1014 [inline]
    copy_pmd_range mm/memory.c:1092 [inline]
    copy_pud_range mm/memory.c:1127 [inline]
    copy_p4d_range mm/memory.c:1150 [inline]
    copy_page_range+0x1f6c/0x2cc0 mm/memory.c:1212
    dup_mmap kernel/fork.c:592 [inline]
    dup_mm+0x77c/0xab0 kernel/fork.c:1355
    copy_mm kernel/fork.c:1411 [inline]
    copy_process+0x1f00/0x2740 kernel/fork.c:2070
    _do_fork+0xc4/0x10b0 kernel/fork.c:2429

Link: https://lore.kernel.org/lkml/CAHk-=wiWr+gO0Ro4LvnJBMs90OiePNyrE3E+pJvc9PzdBShdmw@mail.gmail.com/
Link: https://lore.kernel.org/linuxppc-dev/20201008092541.398079-1-aneesh.kumar@linux.ibm.com/
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Tested-by: Leon Romanovsky <leonro@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill Shutemov <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-08 10:11:32 -07:00
Anant Thazhemadam
3dc289f8f1 net: wireless: nl80211: fix out-of-bounds access in nl80211_del_key()
In nl80211_parse_key(), key.idx is first initialized as -1.
If this value of key.idx remains unmodified and gets returned, and
nl80211_key_allowed() also returns 0, then rdev_del_key() gets called
with key.idx = -1.
This causes an out-of-bounds array access.

Handle this issue by checking the value of key.idx after
nl80211_parse_key() is called and returning -EINVAL if key.idx < 0.
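
A minimal sketch of the check in nl80211_del_key() (simplified):

  err = nl80211_parse_key(info, &key);
  if (err)
          return err;

  if (key.idx < 0)        /* the parser's default of -1 was never replaced */
          return -EINVAL;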

Cc: stable@vger.kernel.org
Reported-by: syzbot+b1bb342d1d097516cbda@syzkaller.appspotmail.com
Tested-by: syzbot+b1bb342d1d097516cbda@syzkaller.appspotmail.com
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Link: https://lore.kernel.org/r/20201007035401.9522-1-anant.thazhemadam@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-10-08 12:37:25 +02:00
Nicolas Belin
1334d3b4e4 i2c: meson: fixup rate calculation with filter delay
Apparently, 15 cycles of the peripheral clock are used by the controller
for sampling and filtering. Because this was not known before, the rate
calculation is slightly off.

Clean up and fix the calculation taking this filtering delay into account.
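
A sketch of the adjusted divider computation (constants and field names as
in the driver, but treat this as an approximation):

  #define FILTER_DELAY 15         /* cycles eaten by sampling/filtering */

  unsigned long clk_rate = clk_get_rate(i2c->clk);
  unsigned int div;

  div = DIV_ROUND_UP(clk_rate, freq);
  div -= FILTER_DELAY;                            /* compensate the filter */
  div = DIV_ROUND_UP(div, i2c->data->div_factor);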

Fixes: 30021e3707 ("i2c: add support for Amlogic Meson I2C controller")
Signed-off-by: Nicolas Belin <nbelin@baylibre.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-10-08 11:57:23 +02:00
Jerome Brunet
79e137b154 i2c: meson: keep peripheral clock enabled
SCL rate appears to be different than what is expected. For example,
We get 164kHz on i2c3 of the vim3 when 400kHz is expected. This is
partially due to the peripheral clock being disabled when the clock is
set.

Let's keep the peripheral clock on after probe to fix the problem. This
does not affect the SCL output which is still gated when i2c is idle.

Fixes: 09af1c2fa4 ("i2c: meson: set clock divider in probe instead of setting it for each transfer")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-10-08 11:57:14 +02:00
Jerome Brunet
28683e847e i2c: meson: fix clock setting overwrite
When the slave address is written in do_start(), SLAVE_ADDR is written
completely. This may overwrite some setting related to the clock rate
or signal filtering.

Fix this by writing only the bits related to the slave address. To avoid
causing unexpected changes, explicitly disable filtering or high/low
clock mode which may have been left over by the bootloader.

Fixes: 30021e3707 ("i2c: add support for Amlogic Meson I2C controller")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-10-08 11:57:06 +02:00
Christian Eggers
fa4d305568 i2c: imx: Fix reset of I2SR_IAL flag
According to the "VFxxx Controller Reference Manual" (and the comment
block starting at line 97), Vybrid requires writing a one for clearing
an interrupt flag. Syncing the method for clearing I2SR_IIF in
i2c_imx_isr().

Signed-off-by: Christian Eggers <ceggers@arri.de>
Fixes: 4b775022f6 ("i2c: imx: add struct to hold more configurable quirks")
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-10-08 11:54:54 +02:00
Daniel Borkmann
5b9fbeb75b bpf: Fix scalar32_min_max_or bounds tracking
Simon reported an issue with the current scalar32_min_max_or() implementation.
That is, compared to the other 32 bit subreg tracking functions, the code in
scalar32_min_max_or() stands out in that it uses the 64 bit registers instead
of the 32 bit ones. This leads to bounds tracking issues, for example:

  [...]
  8: R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R10=fp0 fp-8=mmmmmmmm
  8: (79) r1 = *(u64 *)(r0 +0)
   R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R10=fp0 fp-8=mmmmmmmm
  9: R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R1_w=inv(id=0) R10=fp0 fp-8=mmmmmmmm
  9: (b7) r0 = 1
  10: R0_w=inv1 R1_w=inv(id=0) R10=fp0 fp-8=mmmmmmmm
  10: (18) r2 = 0x600000002
  12: R0_w=inv1 R1_w=inv(id=0) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  12: (ad) if r1 < r2 goto pc+1
   R0_w=inv1 R1_w=inv(id=0,umin_value=25769803778) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  13: R0_w=inv1 R1_w=inv(id=0,umin_value=25769803778) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  13: (95) exit
  14: R0_w=inv1 R1_w=inv(id=0,umax_value=25769803777,var_off=(0x0; 0x7ffffffff)) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  14: (25) if r1 > 0x0 goto pc+1
   R0_w=inv1 R1_w=inv(id=0,umax_value=0,var_off=(0x0; 0x7fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  15: R0_w=inv1 R1_w=inv(id=0,umax_value=0,var_off=(0x0; 0x7fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  15: (95) exit
  16: R0_w=inv1 R1_w=inv(id=0,umin_value=1,umax_value=25769803777,var_off=(0x0; 0x77fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  16: (47) r1 |= 0
  17: R0_w=inv1 R1_w=inv(id=0,umin_value=1,umax_value=32212254719,var_off=(0x1; 0x700000000),s32_max_value=1,u32_max_value=1) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  [...]

The bound tests on the map value force the upper unsigned bound to be 25769803777
in 64 bit (0b11000000000000000000000000000000001) and then lower one to be 1. By
using OR they are truncated and thus result in the range [1,1] for the 32 bit reg
tracker. This is incorrect given the only thing we know is that the value must be
positive and thus 2147483647 (0b1111111111111111111111111111111) at max for the
subregs. Fix it by using the {u,s}32_{min,max}_value vars instead. This also makes
sense, for example, for the case where we update dst_reg->s32_{min,max}_value in
the else branch we need to use the newly computed dst_reg->u32_{min,max}_value as
we know that these are positive. Previously, in the else branch the 64 bit values
of umin_value=1 and umax_value=32212254719 were used and latter got truncated to
be 1 as upper bound there. After the fix the subreg range is now correct:

  [...]
  8: R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R10=fp0 fp-8=mmmmmmmm
  8: (79) r1 = *(u64 *)(r0 +0)
   R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R10=fp0 fp-8=mmmmmmmm
  9: R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R1_w=inv(id=0) R10=fp0 fp-8=mmmmmmmm
  9: (b7) r0 = 1
  10: R0_w=inv1 R1_w=inv(id=0) R10=fp0 fp-8=mmmmmmmm
  10: (18) r2 = 0x600000002
  12: R0_w=inv1 R1_w=inv(id=0) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  12: (ad) if r1 < r2 goto pc+1
   R0_w=inv1 R1_w=inv(id=0,umin_value=25769803778) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  13: R0_w=inv1 R1_w=inv(id=0,umin_value=25769803778) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  13: (95) exit
  14: R0_w=inv1 R1_w=inv(id=0,umax_value=25769803777,var_off=(0x0; 0x7ffffffff)) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  14: (25) if r1 > 0x0 goto pc+1
   R0_w=inv1 R1_w=inv(id=0,umax_value=0,var_off=(0x0; 0x7fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  15: R0_w=inv1 R1_w=inv(id=0,umax_value=0,var_off=(0x0; 0x7fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  15: (95) exit
  16: R0_w=inv1 R1_w=inv(id=0,umin_value=1,umax_value=25769803777,var_off=(0x0; 0x77fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  16: (47) r1 |= 0
  17: R0_w=inv1 R1_w=inv(id=0,umin_value=1,umax_value=32212254719,var_off=(0x0; 0x77fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
  [...]
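
A condensed sketch of the corrected subreg handling (field and helper names
follow the verifier's 32-bit register state as described above; treat it as
an approximation of the fix, not the exact diff):

  static void scalar32_min_max_or(struct bpf_reg_state *dst_reg,
                                  struct bpf_reg_state *src_reg)
  {
          bool src_known = tnum_subreg_is_const(src_reg->var_off);
          bool dst_known = tnum_subreg_is_const(dst_reg->var_off);
          struct tnum var32_off = tnum_subreg(dst_reg->var_off);
          u32 umin_val = src_reg->u32_min_value;
          s32 smin_val = src_reg->s32_min_value;

          if (src_known && dst_known)
                  return;     /* the 64-bit helper handles the const case */

          /* Maximum comes from the known bits, minimum from the operands'
           * minima -- all on the 32-bit subregister values this time. */
          dst_reg->u32_min_value = max(dst_reg->u32_min_value, umin_val);
          dst_reg->u32_max_value = var32_off.value | var32_off.mask;

          if (dst_reg->s32_min_value < 0 || smin_val < 0) {
                  /* ORing negative numbers: give up on signed bounds. */
                  dst_reg->s32_min_value = S32_MIN;
                  dst_reg->s32_max_value = S32_MAX;
          } else {
                  /* ORing two positives stays positive, so the unsigned
                   * bounds carry over to the signed ones. */
                  dst_reg->s32_min_value = dst_reg->u32_min_value;
                  dst_reg->s32_max_value = dst_reg->u32_max_value;
          }
  }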

Fixes: 3f50f132d8 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Reported-by: Simon Scannell <scannell.smn@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2020-10-08 11:02:53 +02:00
Alex Deucher
dcba603f82 drm/amdgpu/swsmu: fix ARC build errors
We want to use the dev_* functions here rather than the pr_* variants.
Switch to using dev_warn() which mirrors what we do on other asics.

Fixes the following build errors on ARC:

../drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c: In function 'navi10_fill_i2c_req':
../arch/arc/include/asm/bug.h:24:2: error: implicit declaration of function 'pr_warn'; did you mean 'drm_warn'? [-Werror=implicit-function-declaration]

../drivers/gpu/drm/amd/amdgpu/../powerplay/sienna_cichlid_ppt.c: In function 'sienna_cichlid_fill_i2c_req':
../arch/arc/include/asm/bug.h:24:2: error: implicit declaration of function 'pr_warn'; did you mean 'drm_warn'? [-Werror=implicit-function-declaration]

Reported-by: kernel test robot <lkp@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-07 17:04:27 -04:00
Dirk Gouders
33eade2cd2 drm/amdgpu: fix NULL pointer dereference for Renoir
Commit c1cf79ca5c ("drm/amdgpu: use IP discovery table for renoir")
introduced a NULL pointer dereference when booting with
amdgpu.discovery=0, because it removed the call of vega10_reg_base_init()
for that case.

Fix this by calling that function if amdgpu_discovery == 0, in addition to
the case where amdgpu_discovery_reg_base_init() fails.
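
A sketch of the resulting control flow for Renoir (simplified):

  case CHIP_RENOIR:
          /* use the legacy reg base setup when IP discovery is disabled
           * on the command line, or when the discovery table init fails */
          if (amdgpu_discovery == 0 ||
              amdgpu_discovery_reg_base_init(adev) < 0)
                  vega10_reg_base_init(adev);
          break;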

Fixes: c1cf79ca5c ("drm/amdgpu: use IP discovery table for renoir")
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-07 17:03:08 -04:00
Jens Axboe
e0894cd618 Merge tag 'nvme-5.9-2020-10-07' of git://git.infradead.org/nvme into block-5.9
Pull NVMe fix from Christoph:

"nvme fix for 5.9:

 - fix a recently introduced controller leak (Logan Gunthorpe)"

* tag 'nvme-5.9-2020-10-07' of git://git.infradead.org/nvme:
  nvme-core: put ctrl ref when module ref get fail
2020-10-07 08:24:09 -06:00
Christoph Hellwig
7370997d48 partitions/ibm: fix non-DASD devices
Don't error out if the dasd_biodasdinfo symbol is not available.

Cc: stable@vger.kernel.org
Fixes: 26d7e28e38 ("s390/dasd: remove ioctl_by_bdev calls")
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-07 07:55:35 -06:00
Marc Zyngier
8b81edd80b gpio: pca953x: Survive spurious interrupts
The pca953x driver never checks the result of irq_find_mapping(),
which returns 0 when no mapping is found. When a spurious interrupt
is delivered (which can happen under obscure circumstances), the
kernel explodes as it still tries to handle the error code as
a real interrupt.

Handle this particular case and warn on spurious interrupts.
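
A rough sketch of the idea in the interrupt handler; the field names
(gpio_chip.irq.domain, client) and the loop variable are assumed from the
driver's context:

	unsigned int irq = irq_find_mapping(chip->gpio_chip.irq.domain, level);

	if (!irq) {
		/* no mapping for this line: a spurious interrupt, so don't
		 * treat the 0 return value as a valid Linux IRQ number */
		dev_warn_ratelimited(&chip->client->dev,
				     "spurious interrupt, line %u\n", level);
		continue;
	}
	handle_nested_irq(irq);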

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201005140217.1390851-1-maz@kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2020-10-07 11:47:41 +02:00
Andy Shevchenko
47e538d86d gpiolib: Disable compat ->read() code in UML case
It appears that UML (arch/um) has no compat.h header defined and hence
can't compile a recently added piece of code in the GPIO library.

Disable compat ->read() code in UML case to avoid compilation errors.

While at it, use a pattern which is already being used elsewhere in the kernel.

Fixes: 5ad284ab3a ("gpiolib: Fix line event handling in syscall compatible mode")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20201005131044.87276-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2020-10-07 11:42:03 +02:00
Chaitanya Kulkarni
4bab690930 nvme-core: put ctrl ref when module ref get fail
When try_module_get() fails in nvme_dev_open(), it returns without
releasing the ctrl reference that was taken earlier.

Put the ctrl reference, which is taken before calling try_module_get(),
in the error return code path.
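
A sketch of the error path described above, assuming the surrounding
nvme_dev_open() code; the exact error value is illustrative:

	nvme_get_ctrl(ctrl);
	if (!try_module_get(ctrl->ops->module)) {
		nvme_put_ctrl(ctrl);	/* drop the reference taken just above */
		return -EINVAL;		/* illustrative error code */
	}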

Fixes: 52a3974feb ("nvme-core: get/put ctrl and transport module in nvme_dev_open/release()")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-07 07:55:40 +02:00
Karol Herbst
d10285a25e drm/nouveau/mem: guard against NULL pointer access in mem_del
Other drivers seem to do something similar.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Cc: dri-devel <dri-devel@lists.freedesktop.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006220528.13925-2-kherbst@redhat.com
2020-10-07 15:33:09 +10:00
Karol Herbst
c3e0276c31 drm/nouveau/device: return error for unknown chipsets
Previously the code relied on device->pri to be NULL and to fail probing
later. We really should just return an error inside nvkm_device_ctor for
unsupported GPUs.

Fixes: 24d5ff40a7 ("drm/nouveau/device: rework mmio mapping code to get rid of second map")

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Cc: dann frazier <dann.frazier@canonical.com>
Cc: dri-devel <dri-devel@lists.freedesktop.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006220528.13925-1-kherbst@redhat.com
2020-10-07 15:33:00 +10:00
Namjae Jeon
8ff006e57a exfat: fix use of uninitialized spinlock on error path
syzbot reported the following warning message:

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1d6/0x29e lib/dump_stack.c:118
 register_lock_class+0xf06/0x1520 kernel/locking/lockdep.c:893
 __lock_acquire+0xfd/0x2ae0 kernel/locking/lockdep.c:4320
 lock_acquire+0x148/0x720 kernel/locking/lockdep.c:5029
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 exfat_cache_inval_inode+0x30/0x280 fs/exfat/cache.c:226
 exfat_evict_inode+0x124/0x270 fs/exfat/inode.c:660
 evict+0x2bb/0x6d0 fs/inode.c:576
 exfat_fill_super+0x1e07/0x27d0 fs/exfat/super.c:681
 get_tree_bdev+0x3e9/0x5f0 fs/super.c:1342
 vfs_get_tree+0x88/0x270 fs/super.c:1547
 do_new_mount fs/namespace.c:2875 [inline]
 path_mount+0x179d/0x29e0 fs/namespace.c:3192
 do_mount fs/namespace.c:3205 [inline]
 __do_sys_mount fs/namespace.c:3413 [inline]
 __se_sys_mount+0x126/0x180 fs/namespace.c:3390
 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

If exfat_read_root() returns an error, the spinlock is used in
exfat_evict_inode() without having been initialized. This patch combines
exfat_cache_init_inode() with exfat_inode_init_once() so that the
spinlock is initialized by the slab constructor.
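
A sketch of what the combined constructor could look like; the
exfat_inode_info field names are assumed for illustration:

	static void exfat_inode_init_once(void *foo)
	{
		struct exfat_inode_info *ei = (struct exfat_inode_info *)foo;

		/* the slab constructor runs once per object, so the spinlock
		 * is valid even if exfat_read_root() fails later on */
		spin_lock_init(&ei->cache_lru_lock);
		INIT_LIST_HEAD(&ei->cache_lru);
		ei->nr_caches = 0;
		inode_init_once(&ei->vfs_inode);
	}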

Fixes: c35b6810c4 ("exfat: add exfat cache")
Cc: stable@vger.kernel.org # v5.7+
Reported-by: syzbot <syzbot+b91107320911a26c9a95@syzkaller.appspotmail.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-10-07 14:27:13 +09:00
Tetsuhiro Kohada
d6c9efd924 exfat: fix pointer error checking
Fix the missing result check of exfat_build_inode(),
and use PTR_ERR_OR_ZERO() instead of PTR_ERR().

Signed-off-by: Tetsuhiro Kohada <kohada.t2@gmail.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-10-07 14:26:55 +09:00
Masami Hiramatsu
5a0677110b arm/arm64: xen: Fix to convert percpu address to gfn correctly
Use per_cpu_ptr_to_phys() instead of virt_to_phys() for per-cpu
address conversion.

In xen_starting_cpu(), the per-cpu xen_vcpu_info address is converted
to a gfn by the virt_to_gfn() macro. However, since virt_to_gfn(v)
assumes the given virtual address is in the linearly mapped kernel
memory area, it cannot convert per-cpu memory that is allocated
from the vmalloc area.

This depends on CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK.
If it is enabled, the first chunk of percpu memory is linearly mapped;
otherwise it is allocated from the vmalloc area. Moreover,
if the first chunk of percpu memory has run out by the time
xen_vcpu_info is allocated, it will be allocated from the 2nd chunk,
which is backed by kernel memory or vmalloc memory (depending on
CONFIG_NEED_PER_CPU_KM).
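
A minimal sketch of the conversion built on per_cpu_ptr_to_phys(); the
variable names are illustrative, not the exact patch:

	/* virt_to_gfn() is only valid for the linear mapping; percpu memory
	 * may live in the vmalloc area, so use the percpu API instead */
	info.mfn = per_cpu_ptr_to_phys(vcpup) >> XEN_PAGE_SHIFT;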

Without this fix, with the kernel configured to use the vmalloc area
for percpu memory, the Dom0 kernel will fail to boot with the following
errors.

[    0.466172] Xen: initializing cpu0
[    0.469601] ------------[ cut here ]------------
[    0.474295] WARNING: CPU: 0 PID: 1 at arch/arm64/xen/../../arm/xen/enlighten.c:153 xen_starting_cpu+0x160/0x180
[    0.484435] Modules linked in:
[    0.487565] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4+ #4
[    0.493895] Hardware name: Socionext Developer Box (DT)
[    0.499194] pstate: 00000005 (nzcv daif -PAN -UAO BTYPE=--)
[    0.504836] pc : xen_starting_cpu+0x160/0x180
[    0.509263] lr : xen_starting_cpu+0xb0/0x180
[    0.513599] sp : ffff8000116cbb60
[    0.516984] x29: ffff8000116cbb60 x28: ffff80000abec000
[    0.522366] x27: 0000000000000000 x26: 0000000000000000
[    0.527754] x25: ffff80001156c000 x24: fffffdffbfcdb600
[    0.533129] x23: 0000000000000000 x22: 0000000000000000
[    0.538511] x21: ffff8000113a99c8 x20: ffff800010fe4f68
[    0.543892] x19: ffff8000113a9988 x18: 0000000000000010
[    0.549274] x17: 0000000094fe0f81 x16: 00000000deadbeef
[    0.554655] x15: ffffffffffffffff x14: 0720072007200720
[    0.560037] x13: 0720072007200720 x12: 0720072007200720
[    0.565418] x11: 0720072007200720 x10: 0720072007200720
[    0.570801] x9 : ffff8000100fbdc0 x8 : ffff800010715208
[    0.576182] x7 : 0000000000000054 x6 : ffff00001b790f00
[    0.581564] x5 : ffff800010bbf880 x4 : 0000000000000000
[    0.586945] x3 : 0000000000000000 x2 : ffff80000abec000
[    0.592327] x1 : 000000000000002f x0 : 0000800000000000
[    0.597716] Call trace:
[    0.600232]  xen_starting_cpu+0x160/0x180
[    0.604309]  cpuhp_invoke_callback+0xac/0x640
[    0.608736]  cpuhp_issue_call+0xf4/0x150
[    0.612728]  __cpuhp_setup_state_cpuslocked+0x128/0x2c8
[    0.618030]  __cpuhp_setup_state+0x84/0xf8
[    0.622192]  xen_guest_init+0x324/0x364
[    0.626097]  do_one_initcall+0x54/0x250
[    0.630003]  kernel_init_freeable+0x12c/0x2c8
[    0.634428]  kernel_init+0x1c/0x128
[    0.637988]  ret_from_fork+0x10/0x18
[    0.641635] ---[ end trace d95b5309a33f8b27 ]---
[    0.646337] ------------[ cut here ]------------
[    0.651005] kernel BUG at arch/arm64/xen/../../arm/xen/enlighten.c:158!
[    0.657697] Internal error: Oops - BUG: 0 [#1] SMP
[    0.662548] Modules linked in:
[    0.665676] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.9.0-rc4+ #4
[    0.673398] Hardware name: Socionext Developer Box (DT)
[    0.678695] pstate: 00000005 (nzcv daif -PAN -UAO BTYPE=--)
[    0.684338] pc : xen_starting_cpu+0x178/0x180
[    0.688765] lr : xen_starting_cpu+0x144/0x180
[    0.693188] sp : ffff8000116cbb60
[    0.696573] x29: ffff8000116cbb60 x28: ffff80000abec000
[    0.701955] x27: 0000000000000000 x26: 0000000000000000
[    0.707344] x25: ffff80001156c000 x24: fffffdffbfcdb600
[    0.712718] x23: 0000000000000000 x22: 0000000000000000
[    0.718107] x21: ffff8000113a99c8 x20: ffff800010fe4f68
[    0.723481] x19: ffff8000113a9988 x18: 0000000000000010
[    0.728863] x17: 0000000094fe0f81 x16: 00000000deadbeef
[    0.734245] x15: ffffffffffffffff x14: 0720072007200720
[    0.739626] x13: 0720072007200720 x12: 0720072007200720
[    0.745008] x11: 0720072007200720 x10: 0720072007200720
[    0.750390] x9 : ffff8000100fbdc0 x8 : ffff800010715208
[    0.755771] x7 : 0000000000000054 x6 : ffff00001b790f00
[    0.761153] x5 : ffff800010bbf880 x4 : 0000000000000000
[    0.766534] x3 : 0000000000000000 x2 : 00000000deadbeef
[    0.771916] x1 : 00000000deadbeef x0 : ffffffffffffffea
[    0.777304] Call trace:
[    0.779819]  xen_starting_cpu+0x178/0x180
[    0.783898]  cpuhp_invoke_callback+0xac/0x640
[    0.788325]  cpuhp_issue_call+0xf4/0x150
[    0.792317]  __cpuhp_setup_state_cpuslocked+0x128/0x2c8
[    0.797619]  __cpuhp_setup_state+0x84/0xf8
[    0.801779]  xen_guest_init+0x324/0x364
[    0.805683]  do_one_initcall+0x54/0x250
[    0.809590]  kernel_init_freeable+0x12c/0x2c8
[    0.814016]  kernel_init+0x1c/0x128
[    0.817583]  ret_from_fork+0x10/0x18
[    0.821226] Code: d0006980 f9427c00 cb000300 17ffffea (d4210000)
[    0.827415] ---[ end trace d95b5309a33f8b28 ]---
[    0.832076] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    0.839815] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/160196697165.60224.17470743378683334995.stgit@devnote2
Signed-off-by: Juergen Gross <jgross@suse.com>
2020-10-07 07:08:43 +02:00
Guo Ren
84814460ee riscv: Fixup bootup failure with HARDENED_USERCOPY
6184358da0 ("riscv: Fixup static_obj() fail") attempted to elide a lockdep
failure by rearranging our kernel image to place all initdata within [_stext,
_end], thus triggering lockdep to treat these as static objects.  These objects
are released and eventually reallocated, causing check_kernel_text_object() to
trigger a BUG().

This backs out the change to make [_stext, _end] all-encompassing, instead just
moving initdata.  This results in initdata being outside of [__init_begin,
__init_end], which means initdata can't be freed.

Link: https://lore.kernel.org/linux-riscv/1593266228-61125-1-git-send-email-guoren@kernel.org/T/#t
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Reported-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
[Palmer: Clean up commit text]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-10-06 18:34:00 -07:00
Linus Torvalds
c85fb28b6f Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
 "Fix a kernel panic in the AES crypto code caused by a BR tail call not
  matching the target BTI instruction (when branch target identification
  is enabled)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  crypto: arm64: Use x16 with indirect branch to bti_c
2020-10-06 12:09:29 -07:00
Linus Torvalds
6ec37e6bb1 Merge tag 'platform-drivers-x86-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull another x86 platform driver fix from Hans de Goede:
 "One final pdx86 fix for Tablet Mode reporting regressions (which make
  the keyboard and touchpad unusable) on various Asus notebooks.

  These regressions were caused by the asus-nb-wmi and the intel-vbtn
  drivers both receiving recent patches to start reporting Tablet Mode /
  to report it on more models.

  Due to a miscommunication between Andy and me, Andy's earlier pull-req
  only contained the fix for the intel-vbtn driver and not the fix for
  the asus-nb-wmi code.

  This fix has been tested as a downstream patch in Fedora kernels for
  approx two weeks with no problems being reported"

* tag 'platform-drivers-x86-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: asus-wmi: Fix SW_TABLET_MODE always reporting 1 on many different models
2020-10-06 12:00:52 -07:00
Linus Torvalds
f1e141e9db Merge tag 'drm-fixes-2020-10-06-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Daniel queued these up last week and I took a long weekend so didn't
  get them out, but fixing the OOB access on get font seems like
  something we should land and it's cc'ed stable as well.

  The other big change is a partial revert for a regression on android
  on the clcd fbdev driver, and one other docs fix.

  fbdev:
   - Re-add FB_ARMCLCD for android
   - Fix global-out-of-bounds read in fbcon_get_font()

  core:
   - Small doc fix"

* tag 'drm-fixes-2020-10-06-1' of git://anongit.freedesktop.org/drm/drm:
  drm: drm_dsc.h: fix a kernel-doc markup
  Partially revert "video: fbdev: amba-clcd: Retire elder CLCD driver"
  fbcon: Fix global-out-of-bounds read in fbcon_get_font()
  Fonts: Support FONT_EXTRA_WORDS macros for built-in fonts
  fbdev, newport_con: Move FONT_EXTRA_WORDS macros into linux/font.h
2020-10-06 11:05:44 -07:00
Linus Torvalds
4013c1496c usermodehelper: reset umask to default before executing user process
Kernel threads intentionally do CLONE_FS in order to follow any changes
that 'init' does to set up the root directory (or cwd).

It is admittedly a bit odd, but it avoids the situation where 'init'
does some extensive setup to initialize the system environment, and then
we execute a usermode helper program, and it uses the original FS setup
from boot time that may be very limited and incomplete.

[ Both Al Viro and Eric Biederman point out that 'pivot_root()' will
  follow the root regardless, since it fixes up other users of root (see
  chroot_fs_refs() for details), but overmounting root and doing a
  chroot() would not. ]

However, Vegard Nossum noticed that the CLONE_FS not only means that we
follow the root and current working directories, it also means we share
umask with whatever init changed it to. That wasn't intentional.

Just reset umask to the original default (0022) before actually starting
the usermode helper program.
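
A sketch of the reset, assuming it lands in the usermode-helper exec path in
kernel/umh.c right before the helper is executed:

	/* CLONE_FS means we share fs_struct (and therefore umask) with init;
	 * restore the historical default before running the helper */
	current->fs->umask = 0022;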

Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-06 10:31:52 -07:00
Linus Torvalds
d1a819a2ec splice: teach splice pipe reading about empty pipe buffers
Tetsuo Handa reports that splice() can return 0 before the real EOF, if
the data in the splice source pipe is an empty pipe buffer.  That empty
pipe buffer case doesn't happen in any normal situation, but you can
trigger it by doing a write to a pipe that fails due to a page fault.

Tetsuo has a test-case to show the behavior:

  #define _GNU_SOURCE
  #include <sys/types.h>
  #include <sys/stat.h>
  #include <fcntl.h>
  #include <unistd.h>

  int main(int argc, char *argv[])
  {
	const int fd = open("/tmp/testfile", O_WRONLY | O_CREAT, 0600);
	int pipe_fd[2] = { -1, -1 };
	pipe(pipe_fd);
	write(pipe_fd[1], NULL, 4096);
	/* This splice() should wait unless interrupted. */
	return !splice(pipe_fd[0], NULL, fd, NULL, 65536, 0);
  }

which results in

    write(5, NULL, 4096)                    = -1 EFAULT (Bad address)
    splice(4, NULL, 3, NULL, 65536, 0)      = 0

and this can confuse splice() users into believing they have hit EOF
prematurely.

The issue was introduced when the pipe write code started pre-allocating
the pipe buffers before copying data from user space.

This is a modified version of Tetsuo's original patch.

Fixes: a194dfe6e6 ("pipe: Rearrange sequence in pipe_write() to preallocate slot")
Link: https://lore.kernel.org/linux-fsdevel/20201005121339.4063-1-penguin-kernel@I-love.SAKURA.ne.jp/
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Acked-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-06 10:27:22 -07:00
Jeremy Linton
39e4716caa crypto: arm64: Use x16 with indirect branch to bti_c
The AES code uses a 'br x7' as part of a function called by
a macro. That branch needs a bti_j as a target. This results
in a panic as seen below. Using x16 (or x17) with an indirect
branch keeps the target bti_c.

  Bad mode in Synchronous Abort handler detected on CPU1, code 0x34000003 -- BTI
  CPU: 1 PID: 265 Comm: cryptomgr_test Not tainted 5.8.11-300.fc33.aarch64 #1
  pstate: 20400c05 (nzCv daif +PAN -UAO BTYPE=j-)
  pc : aesbs_encrypt8+0x0/0x5f0 [aes_neon_bs]
  lr : aesbs_xts_encrypt+0x48/0xe0 [aes_neon_bs]
  sp : ffff80001052b730

  aesbs_encrypt8+0x0/0x5f0 [aes_neon_bs]
   __xts_crypt+0xb0/0x2dc [aes_neon_bs]
   xts_encrypt+0x28/0x3c [aes_neon_bs]
  crypto_skcipher_encrypt+0x50/0x84
  simd_skcipher_encrypt+0xc8/0xe0
  crypto_skcipher_encrypt+0x50/0x84
  test_skcipher_vec_cfg+0x224/0x5f0
  test_skcipher+0xbc/0x120
  alg_test_skcipher+0xa0/0x1b0
  alg_test+0x3dc/0x47c
  cryptomgr_test+0x38/0x60

Fixes: 0e89640b64 ("crypto: arm64 - Use modern annotations for assembly functions")
Cc: <stable@vger.kernel.org> # 5.6.x-
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Suggested-by: Dave P Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20201006163326.2780619-1-jeremy.linton@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-10-06 18:14:47 +01:00
David S. Miller
d91dc434f2 Merge tag 'rxrpc-fixes-20201005' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:

====================
rxrpc: Miscellaneous fixes

Here are some miscellaneous rxrpc fixes:

 (1) Fix the xdr encoding of the contents read from an rxrpc key.

 (2) Fix a BUG() for a unsupported encoding type.

 (3) Fix missing _bh lock annotations.

 (4) Fix acceptance handling for an incoming call where the incoming call
     is encrypted.

 (5) The server token keyring isn't network namespaced - it belongs to the
     server, so there's no need.  Namespacing it means that request_key()
     fails to find it.

 (6) Fix a leak of the server keyring.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-06 06:18:20 -07:00
Eric Dumazet
86bccd0367 tcp: fix receive window update in tcp_add_backlog()
We got reports from GKE customers of flows being reset by netfilter
conntrack unless nf_conntrack_tcp_be_liberal is set to 1.

Traces seemed to suggest an ACK packet being dropped by the
packet capture, or more likely that ACKs were received in the
wrong order.

 wscale=7, SYN and SYNACK not shown here.

 This ACK allows the sender to send 1871*128 bytes from seq 51359321 :
 New right edge of the window -> 51359321+1871*128=51598809

 09:17:23.389210 IP A > B: Flags [.], ack 51359321, win 1871, options [nop,nop,TS val 10 ecr 999], length 0

 09:17:23.389212 IP B > A: Flags [.], seq 51422681:51424089, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 1408
 09:17:23.389214 IP A > B: Flags [.], ack 51422681, win 1376, options [nop,nop,TS val 10 ecr 999], length 0
 09:17:23.389253 IP B > A: Flags [.], seq 51424089:51488857, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 64768
 09:17:23.389272 IP A > B: Flags [.], ack 51488857, win 859, options [nop,nop,TS val 10 ecr 999], length 0
 09:17:23.389275 IP B > A: Flags [.], seq 51488857:51521241, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 32384

 Receiver now allows the sender to send 606*128=77568 bytes from seq 51521241 :
 New right edge of the window -> 51521241+606*128=51598809

 09:17:23.389296 IP A > B: Flags [.], ack 51521241, win 606, options [nop,nop,TS val 10 ecr 999], length 0

 09:17:23.389308 IP B > A: Flags [.], seq 51521241:51553625, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 32384

 It seems the sender exceeds RWIN allowance, since 51611353 > 51598809

 09:17:23.389346 IP B > A: Flags [.], seq 51553625:51611353, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 57728
 09:17:23.389356 IP B > A: Flags [.], seq 51611353:51618393, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 7040

 09:17:23.389367 IP A > B: Flags [.], ack 51611353, win 0, options [nop,nop,TS val 10 ecr 999], length 0

 netfilter conntrack is not happy and sends RST

 09:17:23.389389 IP A > B: Flags [R], seq 92176528, win 0, length 0
 09:17:23.389488 IP B > A: Flags [R], seq 174478967, win 0, length 0

 Now imagine ACK were delivered out of order and tcp_add_backlog() sets window based on wrong packet.
 New right edge of the window -> 51521241+859*128=51631193

Normally the TCP stack handles OOO packets just fine, but it
turns out tcp_add_backlog() does not. It can update the window
field of the aggregated packet even if the ACK sequence
of the last received packet is too old.

Many thanks to Alexandre Ferrieux for independently reporting the issue
and suggesting a fix.

Fixes: 4f693b55c3 ("tcp: implement coalescing on backlog queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-06 06:11:58 -07:00
Anant Thazhemadam
f45a4248ea net: usb: rtl8150: set random MAC address when set_ethernet_addr() fails
When get_registers() fails in set_ethernet_addr(), the uninitialized
value of node_id gets copied over as the address.
So, check the return value of get_registers().

If get_registers() executed successfully (i.e., it returned
sizeof(node_id)), copy over the MAC address using ether_addr_copy()
(instead of using memcpy()).

Otherwise, if get_registers() failed, a randomly generated MAC
address is set instead.
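
A sketch of the resulting logic; get_registers(), IDR and the rtl8150_t
layout are the driver's own names, assumed from context:

	static void set_ethernet_addr(rtl8150_t *dev)
	{
		u8 node_id[ETH_ALEN];
		int ret;

		ret = get_registers(dev, IDR, sizeof(node_id), node_id);
		if (ret == sizeof(node_id)) {
			ether_addr_copy(dev->netdev->dev_addr, node_id);
		} else {
			/* the registers could not be read: fall back to a
			 * randomly generated address */
			eth_hw_addr_random(dev->netdev);
			netdev_notice(dev->netdev,
				      "using random ethernet address\n");
		}
	}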

Reported-by: syzbot+abbc768b560c84d92fd3@syzkaller.appspotmail.com
Tested-by: syzbot+abbc768b560c84d92fd3@syzkaller.appspotmail.com
Acked-by: Petko Manolov <petkan@nucleusys.com>
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-06 06:10:21 -07:00
Paolo Abeni
017512a07e mptcp: more DATA FIN fixes
Currently a DATA_FIN on a data packet is not handled properly:
the 'rcv_data_fin_seq' field is interpreted as the last
sequence number carrying valid data, but for a DATA_FIN
packet with a valid map we currently store map_seq + map_len,
that is, the next value.

The 'write_seq' field instead carries the value subsequent
to the last valid byte, so in mptcp_write_data_fin() we
never correctly detect the last DSS map.

Fixes: 7279da6145 ("mptcp: Use MPTCP-level flag for sending DATA_FIN")
Fixes: 1a49b2c2a5 ("mptcp: Handle incoming 32-bit DATA_FIN values")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-06 06:06:59 -07:00
David S. Miller
c88c5ed75f Merge branch 'Fix-tail-dropping-watermarks-for-Ocelot-switches'
Vladimir Oltean says:

====================
Fix tail dropping watermarks for Ocelot switches

This series adds a missing division by 60, and a warning to prevent that
in the future.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-06 06:05:47 -07:00
Vladimir Oltean
0132649366 net: mscc: ocelot: warn when encoding an out-of-bounds watermark value
There is an upper bound to the value that a watermark may hold. That
upper bound is not immediately obvious during configuration, and it
might be possible to have accidental truncation.

This has actually happened already, so add a warning to prevent it from
happening again.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-06 06:05:47 -07:00
Vladimir Oltean
601e984f23 net: mscc: ocelot: divide watermark value by 60 when writing to SYS_ATOP
Tail dropping is enabled for a port when:

1. A source port consumes more packet buffers than the watermark encoded
   in SYS:PORT:ATOP_CFG.ATOP.

AND

2. Total memory use exceeds the consumption watermark encoded in
   SYS:PAUSE_CFG:ATOP_TOT_CFG.

The unit of these watermarks is a 60-byte memory cell. That unit is
programmed properly into ATOP_TOT_CFG, but not into ATOP: when written
into ATOP, the value gets truncated and wraps around.
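
A tiny illustrative helper for the conversion; the constant name is made up
here, the driver has its own definition of the 60-byte cell:

	#define BUF_CELL_SZ	60	/* one packet buffer memory cell */

	static u32 wm_bytes_to_cells(u32 bytes)
	{
		/* both ATOP and ATOP_TOT_CFG expect watermarks in cells */
		return bytes / BUF_CELL_SZ;
	}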

Fixes: a556c76adc ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-06 06:05:47 -07:00
Manivannan Sadhasivam
082bb94fe1 net: qrtr: ns: Fix the incorrect usage of rcu_read_lock()
kernel_sendmsg() must not be called under rcu_read_lock(), since
qrtr_sendmsg() takes lock_sock(), which can sleep. Hence, fix it by
moving the kernel_sendmsg() call outside of the RCU read-side section.

While at it, let's also use radix_tree_deref_retry() to confirm the
validity of the pointer returned by radix_tree_deref_slot(), and use
radix_tree_iter_resume() to resume iterating the tree properly before
releasing the lock, as suggested by Doug.
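
A rough sketch of the resulting iteration; the tree, slot and server names
are assumed from the qrtr nameservice context, and the send helper is a
placeholder:

	rcu_read_lock();
	radix_tree_for_each_slot(slot, &node->servers, &iter, 0) {
		srv = radix_tree_deref_slot(slot);
		if (!srv)
			continue;
		if (radix_tree_deref_retry(srv)) {
			slot = radix_tree_iter_retry(&iter);
			continue;
		}
		slot = radix_tree_iter_resume(slot, &iter);
		rcu_read_unlock();

		/* the actual send ends up in kernel_sendmsg(), which takes
		 * lock_sock() and may sleep, so it runs outside the RCU
		 * read-side section */
		send_notification(srv);	/* placeholder for the real helper */

		rcu_read_lock();
	}
	rcu_read_unlock();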

Fixes: a7809ff90c ("net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks")
Reported-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Alex Elder <elder@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-06 06:01:35 -07:00
Hans de Goede
1797d588af platform/x86: asus-wmi: Fix SW_TABLET_MODE always reporting 1 on many different models
Commit b0dbd97de1 ("platform/x86: asus-wmi: Add support for
SW_TABLET_MODE") added support for reporting SW_TABLET_MODE using the
Asus 0x00120063 WMI-device-id to see if various transformer models were
docked into their keyboard-dock (SW_TABLET_MODE=0) or if they were
being used as a tablet.

The new SW_TABLET_MODE support (naively?) assumed that non-Transformer
devices would either not support the 0x00120063 WMI-device-id at all,
or would NOT set ASUS_WMI_DSTS_PRESENCE_BIT in their reply when querying
the device-id.

Unfortunately this is not true and we have received many bug reports about
this change causing the asus-wmi driver to always report SW_TABLET_MODE=1
on non-Transformer devices. This causes libinput to think that these are
360 degree hinge style 2-in-1s folded into tablet-mode, making libinput
suppress keyboard and touchpad events from the built-in keyboard and
touchpad. So effectively this causes the keyboard and touchpad to not work
on many non-Transformer Asus models.

This commit fixes this by using the existing DMI based quirk mechanism in
asus-nb-wmi.c to allow using the 0x00120063 device-id for reporting
SW_TABLET_MODE on Transformer models and ignoring it on all other models.

Fixes: b0dbd97de1 ("platform/x86: asus-wmi: Add support for SW_TABLET_MODE")
Link: https://patchwork.kernel.org/patch/11780901/
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209011
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1876997
Reported-by: Samuel Čavoj <samuel@cavoj.net>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2020-10-06 09:48:05 +02:00
Dave Airlie
86fdf61e71 Merge tag 'drm-misc-fixes-2020-10-01' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.9:
- Small doc fix.
- Re-add FB_ARMCLCD for android.
- Fix global-out-of-bounds read in fbcon_get_font().

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8585daa2-fcbc-3924-ac4f-e7b5668808e0@linux.intel.com
2020-10-06 12:38:28 +10:00
Linus Torvalds
7575fdda56 Merge tag 'platform-drivers-x86-v5.9-2' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver fixes from Andy Shevchenko:
 "We have some fixes for Tablet Mode reporting in particular, that users
  are complaining a lot about.

  Summary:

   - Attempt #3 of enabling Tablet Mode reporting w/o regressions

   - Improve battery recognition code in ASUS WMI driver

   - Fix Kconfig dependency warning for Fujitsu and LG laptop drivers

   - Add fixes in Thinkpad ACPI driver for _BCL method and NVRAM polling

   - Fix power supply extended topology in Mellanox driver

   - Fix memory leak in OLPC EC driver

   - Avoid static struct device in Intel PMC core driver

   - Add support for the touchscreen found in MPMAN Converter9 2-in-1

   - Update MAINTAINERS to reflect the real state of affairs"

* tag 'platform-drivers-x86-v5.9-2' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: thinkpad_acpi: re-initialize ACPI buffer size when reuse
  MAINTAINERS: Add Mark Gross and Hans de Goede as x86 platform drivers maintainers
  platform/x86: intel-vbtn: Switch to an allow-list for SW_TABLET_MODE reporting
  platform/x86: intel-vbtn: Revert "Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360"
  platform/x86: intel_pmc_core: do not create a static struct device
  platform/x86: mlx-platform: Fix extended topology configuration for power supply units
  platform/x86: pcengines-apuv2: Fix typo on define of AMD_FCH_GPIO_REG_GPIO55_DEVSLP0
  platform/x86: fix kconfig dependency warning for FUJITSU_LAPTOP
  platform/x86: fix kconfig dependency warning for LG_LAPTOP
  platform/x86: thinkpad_acpi: initialize tp_nvram_state variable
  platform/x86: intel-vbtn: Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360
  platform/x86: asus-wmi: Add BATC battery name to the list of supported
  platform/x86: asus-nb-wmi: Revert "Do not load on Asus T100TA and T200TA"
  platform/x86: touchscreen_dmi: Add info for the MPMAN Converter9 2-in-1
  Documentation: laptops: thinkpad-acpi: fix underline length build warning
  Platform: OLPC: Fix memleak in olpc_ec_probe
2020-10-05 11:54:20 -07:00
Linus Torvalds
165563c050 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Make sure SKB control block is in the proper state during IPSEC
    ESP-in-TCP encapsulation. From Sabrina Dubroca.

 2) Various kinds of attributes were not being cloned properly when we
    build new xfrm_state objects from existing ones. Fix from Antony
    Antony.

 3) Make sure to keep BTF sections, from Tony Ambardar.

 4) TX DMA channels need proper locking in lantiq driver, from Hauke
    Mehrtens.

 5) Honour route MTU during forwarding, always. From Maciej
    Żenczykowski.

 6) Fix races in kTLS which can result in crashes, from Rohit
    Maheshwari.

 7) Skip TCP DSACKs with ridiculous sequence ranges, from Priyaranjan
    Jha.

 8) Use correct address family in xfrm state lookups, from Herbert Xu.

 9) A bridge FDB flush should not clear out user managed fdb entries
    with the ext_learn flag set, from Nikolay Aleksandrov.

10) Fix nested locking of netdev address lists, from Taehee Yoo.

11) Fix handling of 32-bit DATA_FIN values in mptcp, from Mat Martineau.

12) Fix r8169 data corruptions on RTL8402 chips, from Heiner Kallweit.

13) Don't free command entries in mlx5 while comp handler could still be
    running, from Eran Ben Elisha.

14) Error flow of request_irq() in mlx5 is busted, due to an off-by-one
    we try to free an IRQ never allocated. From Maor Gottlieb.

15) Fix leak when dumping netlink policies, from Johannes Berg.

16) Sendpage cannot be performed when a page is a slab page, or the page
    count is < 1. Some subsystems such as nvme were doing so. Create a
    "sendpage_ok()" helper and use it as needed, from Coly Li.

17) Don't leak request socket when using syncookes with mptcp, from
    Paolo Abeni.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
  net/core: check length before updating Ethertype in skb_mpls_{push,pop}
  net: mvneta: fix double free of txq->buf
  net_sched: check error pointer in tcf_dump_walker()
  net: team: fix memory leak in __team_options_register
  net: typhoon: Fix a typo Typoon --> Typhoon
  net: hinic: fix DEVLINK build errors
  net: stmmac: Modify configuration method of EEE timers
  tcp: fix syn cookied MPTCP request socket leak
  libceph: use sendpage_ok() in ceph_tcp_sendpage()
  scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map()
  drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage()
  tcp: use sendpage_ok() to detect misused .sendpage
  nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage()
  net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send
  net: introduce helper sendpage_ok() in include/linux/net.h
  net: usb: pegasus: Proper error handing when setting pegasus' MAC address
  net: core: document two new elements of struct net_device
  netlink: fix policy dump leak
  net/mlx5e: Fix race condition on nhe->n pointer in neigh update
  net/mlx5e: Fix VLAN create flow
  ...
2020-10-05 11:27:14 -07:00
David Howells
38b1dc47a3 rxrpc: Fix server keyring leak
If someone calls setsockopt() twice to set a server key keyring, the first
keyring is leaked.

Fix it to return an error instead if the server key keyring is already set.

Fixes: 17926a7932 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-05 17:09:22 +01:00
David Howells
fea9911124 rxrpc: The server keyring isn't network-namespaced
The keyring containing the server's tokens isn't network-namespaced, so it
shouldn't be looked up with a network namespace.  It is expected to be
owned specifically by the server, so namespacing is unnecessary.

Fixes: a58946c158 ("keys: Pass the network namespace into request_key mechanism")
Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-05 16:36:06 +01:00
David Howells
2d914c1bf0 rxrpc: Fix accept on a connection that need securing
When a new incoming call arrives at a userspace rxrpc socket on a new
connection that has a security class set, the code currently pushes it onto
the accept queue to hold a ref on it for the socket.  This doesn't work,
however, as recvmsg() pops it off, notices that it's in the SERVER_SECURING
state and discards the ref.  This means that the call runs out of refs too
early and the kernel oopses.

By contrast, a kernel rxrpc socket manually pre-charges the incoming call
pool with calls that already have user call IDs assigned, so they are ref'd
by the call tree on the socket.

Change the mode of operation for userspace rxrpc server sockets to work
like this too.  Although this is a UAPI change, server sockets aren't
currently functional.

Fixes: 248f219cb8 ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-05 16:35:57 +01:00
David Howells
fa1d113a0f rxrpc: Fix some missing _bh annotations on locking conn->state_lock
conn->state_lock may be taken in softirq mode, but a previous patch
replaced an outer lock in the response-packet event handling code, and lost
the _bh from that when doing so.

Fix this by applying the _bh annotation to the state_lock locking.

Fixes: a1399f8bb0 ("rxrpc: Call channels should have separate call number spaces")
Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-05 16:34:32 +01:00
David Howells
9a059cd5ca rxrpc: Downgrade the BUG() for unsupported token type in rxrpc_read()
If rxrpc_read() (which allows KEYCTL_READ to read a key) sees a token of a
type it doesn't recognise, it can BUG in a couple of places, which is
unnecessary as it can easily get back to userspace.

Fix this to print an error message instead.

Fixes: 99455153d0 ("RxRPC: Parse security index 5 keys (Kerberos 5)")
Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-05 16:33:37 +01:00
Marc Dionne
56305118e0 rxrpc: Fix rxkad token xdr encoding
The session key should be encoded with just the 8 data bytes and
no length; ENCODE_DATA precedes it with a 4 byte length, which
confuses some existing tools that try to parse this format.

Add an ENCODE_BYTES macro that does not include a length, and use
it for the key.  Also adjust the expected length.

Note that commit 774521f353 ("rxrpc: Fix an assertion in
rxrpc_read()") had fixed a BUG by changing the length rather than
fixing the encoding.  The original length was correct.

Fixes: 99455153d0 ("RxRPC: Parse security index 5 keys (Kerberos 5)")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-05 16:33:28 +01:00
Serge Semin
1c33524f79 MAINTAINERS: Add maintainer of DW APB SSI driver
Add myself as a maintainer of the Synopsis DesignWare APB SSI driver.

Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Link: https://lore.kernel.org/r/20201002211648.24320-1-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-10-05 13:22:59 +01:00
Aaron Ma
720ef73d1a platform/x86: thinkpad_acpi: re-initialize ACPI buffer size when reuse
Evaluating ACPI _BCL can fail, in which case the ACPI buffer size is set
to 0. When this ACPI buffer is reused, AE_BUFFER_OVERFLOW is triggered.

Re-initializing the buffer size makes the ACPI evaluation succeed.
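
A generic illustration of the pattern (not the driver's exact code): an
acpi_buffer that asks ACPICA to allocate the result has to be re-armed before
every evaluation, because a failed call leaves its length clobbered:

	struct acpi_buffer result = { ACPI_ALLOCATE_BUFFER, NULL };
	acpi_status status;

	/* reset before each (re)use; a previously failed _BCL evaluation may
	 * have left result.length at 0, triggering AE_BUFFER_OVERFLOW next time */
	result.length = ACPI_ALLOCATE_BUFFER;
	result.pointer = NULL;
	status = acpi_evaluate_object(handle, "_BCL", NULL, &result);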

Fixes: 46445b6b89 ("thinkpad-acpi: fix handle locate for video and query of _BCL")
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-10-05 12:20:42 +03:00
Atish Patra
a78c6f5956 RISC-V: Make sure memblock reserves the memory containing DT
Currently, the memory containing the DT is not reserved. Thus, that region
of memory can be reallocated or reused for other purposes. This may result
in a corrupted DT for the nommu virt board in QEMU. We may not face any
issue on kendryte, as the DT is embedded in the kernel image there.
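
A minimal sketch of the fix, assuming the early DT physical/virtual addresses
the port already tracks (dtb_early_pa/dtb_early_va):

	/* keep the flattened DT out of the memblock allocator so it cannot
	 * be handed out again and corrupted before it is unflattened */
	memblock_reserve(dtb_early_pa, fdt_totalsize(dtb_early_va));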

Fixes: 6bd33e1ece ("riscv: add nommu support")
Cc: stable@vger.kernel.org
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-10-04 16:19:28 -07:00
Guillaume Nault
4296adc3e3 net/core: check length before updating Ethertype in skb_mpls_{push,pop}
Openvswitch allows dropping a packet's Ethernet header, therefore
skb_mpls_push() and skb_mpls_pop() might be called with ethernet=true
and mac_len=0. In that case the pointer passed to skb_mod_eth_type()
doesn't point to an Ethernet header and the new Ethertype is written at
unexpected locations.

Fix this by verifying that mac_len is big enough to contain an Ethernet
header.
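
A sketch of the guard, assuming the existing skb_mod_eth_type() helper and
the local variables of skb_mpls_push()/skb_mpls_pop():

	/* only rewrite the Ethertype when an Ethernet header is really there */
	if (ethernet && mac_len >= ETH_HLEN)
		skb_mod_eth_type(skb, eth_hdr(skb), mpls_proto);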

Fixes: fa4e0f8855 ("net/sched: fix corrupted L2 header with MPLS 'push' and 'pop' actions")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 15:09:26 -07:00
Tom Rix
f4544e5361 net: mvneta: fix double free of txq->buf
clang static analysis reports this problem:

drivers/net/ethernet/marvell/mvneta.c:3465:2: warning:
  Attempt to free released memory
        kfree(txq->buf);
        ^~~~~~~~~~~~~~~

When mvneta_txq_sw_init() fails to alloc txq->tso_hdrs,
it frees txq->buf without poisoning the pointer. The error is
caught in the mvneta_setup_txqs() caller, which handles it by
cleaning up all of the txqs with a call to mvneta_txq_sw_deinit(),
which also frees txq->buf.

Since mvneta_txq_sw_deinit() is a general cleaner, all of the
partial cleanup in mvneta_txq_sw_init()'s error handling
is not needed.

Fixes: 2adb719d74 ("net: mvneta: Implement software TSO")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 15:07:19 -07:00
Cong Wang
580e4273d7 net_sched: check error pointer in tcf_dump_walker()
Although we take RTNL on the dump path, it is possible to
skip RTNL on the insertion path. So the following race condition
is possible:

rtnl_lock()		// no rtnl lock
			mutex_lock(&idrinfo->lock);
			// insert ERR_PTR(-EBUSY)
			mutex_unlock(&idrinfo->lock);
tc_dump_action()
rtnl_unlock()

So we have to skip those temporary -EBUSY entries on the dump
path too.
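
A sketch of the dump-side skip; the loop shape is assumed from the idr
iteration in tcf_dump_walker():

	idr_for_each_entry_ul(idr, p, tmp, id) {
		/* skip the temporary ERR_PTR(-EBUSY) placeholders inserted
		 * without RTNL; they are not real actions yet */
		if (IS_ERR(p))
			continue;

		/* ... dump the action as before ... */
	}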

Reported-and-tested-by: syzbot+b47bc4f247856fb4d9e1@syzkaller.appspotmail.com
Fixes: 0fedc63fad ("net_sched: commit action insertions together")
Cc: Vlad Buslov <vladbu@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:53:06 -07:00
Anant Thazhemadam
9a9e774959 net: team: fix memory leak in __team_options_register
The variable "i" isn't initialized back correctly after the first loop
under the label inst_rollback gets executed.

The value of "i" is assigned to be option_count - 1, and the ensuing
loop (under alloc_rollback) begins by initializing i--.
Thus, the value of i when the loop begins execution will now become
i = option_count - 2.

Thus, when kfree(dst_opts[i]) is called in the second loop in this
order, (i.e., inst_rollback followed by alloc_rollback),
dst_optsp[option_count - 2] is the first element freed, and
dst_opts[option_count - 1] does not get freed, and thus, a memory
leak is caused.

This memory leak can be fixed, by assigning i = option_count (instead of
option_count - 1).
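
Roughly, the rollback code and the one-line fix look like this (reconstructed
from the description above, not quoted from the driver):

	inst_rollback:
		for (i--; i >= 0; i--)
			__team_option_inst_del_option(team, dst_opts[i]);

		i = option_count;	/* was option_count - 1, which made the
					 * loop below skip dst_opts[option_count - 1] */
	alloc_rollback:
		for (i--; i >= 0; i--)
			kfree(dst_opts[i]);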

Fixes: 80f7c6683f ("team: add support for per-port options")
Reported-by: syzbot+69b804437cfec30deac3@syzkaller.appspotmail.com
Tested-by: syzbot+69b804437cfec30deac3@syzkaller.appspotmail.com
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:47:22 -07:00
Si-Wei Liu
7ed9e3d97c vhost-vdpa: fix page pinning leakage in error path
Pinned pages are not properly accounted, particularly when a
mapping error occurs on an IOTLB update. Clean up dangling
pinned pages on the error path. The inflight pinned pages,
specifically for a memory region that strides across
multiple chunks, would need more than one free page for
bookkeeping and accounting. For simplicity, pin the pages
for all memory in the IOVA range in one go rather than
making multiple pin_user_pages() calls to cover the entire
region. This way it's easier to track and account for the
pages already mapped, particularly for clean-up in the
error path.

Fixes: 4c8cf31885 ("vhost: introduce vDPA-based backend")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Link: https://lore.kernel.org/r/1601701330-16837-3-git-send-email-si-wei.liu@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-10-04 03:47:02 -04:00
Si-Wei Liu
1477c8aebb vhost-vdpa: fix vhost_vdpa_map() on error condition
vhost_vdpa_map() should remove the iotlb entry just added
if the corresponding mapping fails to set up properly.

Fixes: 4c8cf31885 ("vhost: introduce vDPA-based backend")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Link: https://lore.kernel.org/r/1601701330-16837-2-git-send-email-si-wei.liu@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-10-04 03:47:02 -04:00
Greg Kurz
ab5122510b vhost: Don't call log_access_ok() when using IOTLB
When the IOTLB device is enabled, the log_guest_addr that is passed by
userspace to the VHOST_SET_VRING_ADDR ioctl, and which is then written
to vq->log_addr, is a GIOVA. All writes to this address are translated
by log_user() to writes to an HVA, and then ultimately logged through
the corresponding GPAs in log_write_hva(). No logging will ever occur
with vq->log_addr in this case. It is thus wrong to pass vq->log_addr
and log_guest_addr to log_access_vq() which assumes they are actual
GPAs.

Introduce a new vq_log_used_access_ok() helper that only checks accesses
to the log for the used structure when there isn't an IOTLB device around.

Signed-off-by: Greg Kurz <groug@kaod.org>
Link: https://lore.kernel.org/r/160171933385.284610.10189082586063280867.stgit@bahia.lan
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-10-04 03:45:20 -04:00
Greg Kurz
71878fa46c vhost: Use vhost_get_used_size() in vhost_vring_set_addr()
The open-coded computation of the used size doesn't take the event
into account when the VIRTIO_RING_F_EVENT_IDX feature is present.
Fix that by using vhost_get_used_size().
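
For reference, vhost_get_used_size() accounts for the optional event index
roughly like this (a sketch, not the verbatim helper):

	/* used ring: flags + ring[num] + used_event (when EVENT_IDX is set) */
	size = sizeof(*vq->used) + num * sizeof(*vq->used->ring);
	if (vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX))
		size += sizeof(__virtio16);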

Fixes: 8ea8cf89e1 ("vhost: support event index")
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Link: https://lore.kernel.org/r/160171932300.284610.11846106312938909461.stgit@bahia.lan
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-10-04 03:44:25 -04:00
Greg Kurz
0210a8db2a vhost: Don't call access_ok() when using IOTLB
When the IOTLB device is enabled, the vring addresses we get
from userspace are GIOVAs. It is thus wrong to pass them down
to access_ok() which only takes HVAs.

Access validation is done at prefetch time with IOTLB. Teach
vq_access_ok() about that by moving the (vq->iotlb) check
from vhost_vq_access_ok() to vq_access_ok(). This prevents
vhost_vring_set_addr() from failing when verifying the accesses.
No behavior change for vhost_vq_access_ok().

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1883084
Fixes: 6b1e6cc785 ("vhost: new device IOTLB API")
Cc: jasowang@redhat.com
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/160171931213.284610.2052489816407219136.stgit@bahia.lan
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-10-04 03:43:03 -04:00
Christophe JAILLET
790ca79d3e net: typhoon: Fix a typo Typoon --> Typhoon
s/Typoon/Typhoon/

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 17:23:02 -07:00
Randy Dunlap
1f7e877c20 net: hinic: fix DEVLINK build errors
Fix many (lots deleted here) build errors in hinic by selecting NET_DEVLINK.

ld: drivers/net/ethernet/huawei/hinic/hinic_hw_dev.o: in function `mgmt_watchdog_timeout_event_handler':
hinic_hw_dev.c:(.text+0x30a): undefined reference to `devlink_health_report'
ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_fw_reporter_dump':
hinic_devlink.c:(.text+0x1c): undefined reference to `devlink_fmsg_u32_pair_put'
ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_fw_reporter_dump':
hinic_devlink.c:(.text+0x126): undefined reference to `devlink_fmsg_binary_pair_put'
ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_hw_reporter_dump':
hinic_devlink.c:(.text+0x1ba): undefined reference to `devlink_fmsg_string_pair_put'
ld: hinic_devlink.c:(.text+0x227): undefined reference to `devlink_fmsg_u8_pair_put'
ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_devlink_alloc':
hinic_devlink.c:(.text+0xaee): undefined reference to `devlink_alloc'
ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_devlink_free':
hinic_devlink.c:(.text+0xb04): undefined reference to `devlink_free'
ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_devlink_register':
hinic_devlink.c:(.text+0xb26): undefined reference to `devlink_register'
ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_devlink_unregister':
hinic_devlink.c:(.text+0xb46): undefined reference to `devlink_unregister'
ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_health_reporters_create':
hinic_devlink.c:(.text+0xb75): undefined reference to `devlink_health_reporter_create'
ld: hinic_devlink.c:(.text+0xb95): undefined reference to `devlink_health_reporter_create'
ld: hinic_devlink.c:(.text+0xbac): undefined reference to `devlink_health_reporter_destroy'
ld: drivers/net/ethernet/huawei/hinic/hinic_devlink.o: in function `hinic_health_reporters_destroy':

Fixes: 51ba902a16 ("net-next/hinic: Initialize hw interface")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Bin Luo <luobin9@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Cc: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 16:52:19 -07:00
Vineetha G. Jaya Kumaran
388e201d41 net: stmmac: Modify configuration method of EEE timers
The ethtool manual states that the tx-timer is "the amount of time the
device should stay in idle mode prior to asserting its Tx LPI". The
previous implementation for "ethtool --set-eee tx-timer" sets the LPI TW
timer duration which is not correct. Hence, this patch fixes the
"ethtool --set-eee tx-timer" to configure the EEE LPI timer.

The LPI TW Timer will be using the defined default value instead of
"ethtool --set-eee tx-timer" which follows the EEE LS timer implementation.

Changelog V2:
* Not removing/modifying the eee_timer.
* EEE LPI timer can be configured through ethtool and also the eee_timer
  module param.
* EEE TW Timer will be configured with default value only, not able to be
  configured through ethtool or module param. This follows the implementation
  of the EEE LS Timer.

Fixes: d765955d2a ("stmmac: add the Energy Efficient Ethernet support")
Signed-off-by: Vineetha G. Jaya Kumaran <vineetha.g.jaya.kumaran@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 16:40:25 -07:00
David S. Miller
ab0faf5f04 Merge tag 'mlx5-fixes-2020-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
From: Saeed Mahameed <saeedm@nvidia.com>

====================
This series introduces some fixes to mlx5 driver.

v1->v2:
 - Patch #1 Don't return while mutex is held. (Dave)

v2->v3:
 - Drop patch #1, will consider a better approach (Jakub)
 - use cpu_relax() instead of cond_resched() (Jakub)
 - while(i--) to reverse a loop (Jakub)
 - Drop old mellanox email sign-off and change the committer email
   (Jakub)

Please pull and let me know if there is any problem.

For -stable v4.15
 ('net/mlx5e: Fix VLAN cleanup flow')
 ('net/mlx5e: Fix VLAN create flow')

For -stable v4.16
 ('net/mlx5: Fix request_irqs error flow')

For -stable v5.4
 ('net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU')
 ('net/mlx5: Avoid possible free of command entry while timeout comp handler')

For -stable v5.7
 ('net/mlx5e: Fix return status when setting unsupported FEC mode')

For -stable v5.8
 ('net/mlx5e: Fix race condition on nhe->n pointer in neigh update')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 16:14:21 -07:00
Paolo Abeni
9d8c05ad56 tcp: fix syn cookied MPTCP request socket leak
If a syn-cookies request socket doesn't pass the MPTCP-level
validation done in syn_recv_sock(), we need to release
it immediately, or it will be leaked.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/89
Fixes: 9466a1cceb ("mptcp: enable JOIN requests even if cookies are in use")
Reported-and-tested-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:34:38 -07:00
David S. Miller
e7d4005d48 Merge branch 'Introduce-sendpage_ok-to-detect-misused-sendpage-in-network-related-drivers'
Coly Li says:

====================
Introduce sendpage_ok() to detect misused sendpage in network related drivers

As Sagi Grimberg suggested, the original fix is refined into a more common
inline routine:
    static inline bool sendpage_ok(struct page *page)
    {
        return  (!PageSlab(page) && page_count(page) >= 1);
    }
If sendpage_ok() returns true, the checking page can be handled by the
concrete zero-copy sendpage method in network layer.

The v10 series has 7 patches and fixes a WARN_ONCE() usage from the v9 series:
- The 1st patch in this series introduces sendpage_ok() in header file
  include/linux/net.h.
- The 2nd patch adds WARN_ONCE() for improper zero-copy send in
  kernel_sendpage().
- The 3rd patch fixes the page checking issue in nvme-over-tcp driver.
- The 4th patch adds page_count check by using sendpage_ok() in
  do_tcp_sendpages() as Eric Dumazet suggested.
- The 5th and 6th patches just replace existing open coded checks with
  the inline sendpage_ok() routine.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:27:08 -07:00
Coly Li
40efc4dc73 libceph: use sendpage_ok() in ceph_tcp_sendpage()
In libceph, ceph_tcp_sendpage() does the following check before handing
the page to the network layer's zero-copy sendpage method:
	if (page_count(page) >= 1 && !PageSlab(page))

This check is exactly what sendpage_ok() does. This patch replaces the
open-coded check with sendpage_ok() as a code cleanup.

Signed-off-by: Coly Li <colyli@suse.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:27:08 -07:00
Coly Li
6aa25c7377 scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map()
In the iscsi driver, iscsi_tcp_segment_map() uses the following code to
check whether or not the page should be handled by sendpage:
    if (!recv && page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)))

The "page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg))" part is to
make sure the page can be sent to the network layer's zero-copy path. This
part is exactly what sendpage_ok() does.

This patch uses sendpage_ok() in iscsi_tcp_segment_map() to replace
the original open-coded checks.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Cong Wang <amwang@redhat.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Chris Leech <cleech@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:27:08 -07:00
Coly Li
fb25ebe1b2 drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage()
In _drbd_send_page() a page is checked by the following code before being
sent by kernel_sendpage():
        (page_count(page) < 1) || PageSlab(page)
If the check is true, this page won't be sent by kernel_sendpage() but
will be handled by sock_no_sendpage() instead.

This kind of check is exactly what the sendpage_ok() helper does, which is
introduced into include/linux/net.h to solve a similar sendpage issue
in the nvme-tcp code.

This patch uses sendpage_ok() to replace the open-coded checks of the
page type and refcount in _drbd_send_page(), as a code cleanup.

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:27:08 -07:00
Coly Li
cf83a17ede tcp: use sendpage_ok() to detect misused .sendpage
commit a10674bf24 ("tcp: detecting the misuse of .sendpage for Slab
objects") adds the checks for Slab pages, but pages that don't have a
page_count are still missing from the check.

The network layer's sendpage method is not designed to send page_count 0
pages either, therefore both PageSlab() and page_count() should be
checked for the page being sent. This is exactly what sendpage_ok()
does.

This patch uses sendpage_ok() in do_tcp_sendpages() to detect misused
.sendpage, to make the code more robust.

Fixes: a10674bf24 ("tcp: detecting the misuse of .sendpage for Slab objects")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:27:08 -07:00
Coly Li
7d4194abfc nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage()
Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to
send slab pages. But pages allocated by __get_free_pages() without
__GFP_COMP, which also have a refcount of 0, are still sent to the
remote end by kernel_sendpage(), which is problematic.

The newly introduced helper sendpage_ok() checks both the PageSlab tag
and the page_count counter, and returns true if the page is OK to be
sent by kernel_sendpage().

This patch fixes the page checking issue of nvme_tcp_try_send_data()
with sendpage_ok(). If sendpage_ok() returns true, the page is sent by
kernel_sendpage(), otherwise sock_no_sendpage() handles it.
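
A minimal sketch of the resulting selection logic (the wrapper name is
hypothetical, not the actual nvme_tcp_try_send_data() code):

	static int nvme_tcp_send_one_page(struct socket *sock, struct page *page,
					  int offset, size_t len, int flags)
	{
		/* zero-copy only for pages the network stack may safely hold */
		if (sendpage_ok(page))
			return kernel_sendpage(sock, page, offset, len, flags);
		/* slab page or page_count == 0: copy instead of zero-copy */
		return sock_no_sendpage(sock, page, offset, len, flags);
	}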

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vlastimil Babka <vbabka@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:27:08 -07:00
Coly Li
7b62d31d3f net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send
If a page sent into kernel_sendpage() is a slab page or has no
refcount, it is improper to send by the zero-copy sendpage() method.
Such a page might be unexpectedly released in the network code path and
cause an unpredictable panic due to corruption of kernel memory
management data structures.

This patch adds a WARN_ONCE() on the page before it is handed to the
concrete zero-copy sendpage() method; if the page is improper for the
zero-copy sendpage() method, a warning message can be observed before
the consequent unpredictable kernel panic.

This patch does not change the existing kernel_sendpage() behavior for
improper pages, it just provides a hint warning message for the
potential panic that may follow due to kernel heap corruption.
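
A sketch of the resulting check in kernel_sendpage(), assuming the
sendpage_ok() helper from this series; details may differ from the
actual net/socket.c code:

	int kernel_sendpage(struct socket *sock, struct page *page, int offset,
			    size_t size, int flags)
	{
		if (sock->ops->sendpage) {
			/* warn, but keep the existing behaviour unchanged */
			WARN_ONCE(!sendpage_ok(page),
				  "improper page for zero-copy send\n");
			return sock->ops->sendpage(sock, page, offset, size, flags);
		}
		return sock_no_sendpage(sock, page, offset, size, flags);
	}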

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Cong Wang <amwang@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:27:08 -07:00
Coly Li
c381b07941 net: introduce helper sendpage_ok() in include/linux/net.h
The original problem came from the nvme-over-tcp code, which mistakenly
used kernel_sendpage() to send pages allocated by __get_free_pages()
without the __GFP_COMP flag. Such pages don't have a refcount
(page_count is 0) on the tail pages; sending them by kernel_sendpage()
may trigger a kernel panic from a corrupted kernel heap, because these
pages are incorrectly freed in the network stack as page_count 0 pages.

This patch introduces a helper sendpage_ok() which returns true if the
page to be checked
- is not a slab page: PageSlab(page) is false, and
- has a page refcount: page_count(page) is not zero.

All drivers that want to send a page to the remote end by
kernel_sendpage() may use this helper to check whether the page is OK.
If the helper does not return true, the driver should fall back to a
non-sendpage method (e.g. sock_no_sendpage()) to handle the page.
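
A minimal sketch of the helper, directly reflecting the two checks
above (placed in include/linux/net.h):

	static inline bool sendpage_ok(struct page *page)
	{
		return !PageSlab(page) && page_count(page) >= 1;
	}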

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vlastimil Babka <vbabka@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:27:08 -07:00
Petko Manolov
f30e25a9d1 net: usb: pegasus: Proper error handing when setting pegasus' MAC address
v2:

If reading the MAC address from eeprom fails, don't throw an error; use a
randomly generated MAC instead.  Either way the adapter will soldier on and
the return type of set_ethernet_addr() can be reverted to void.

v1:

Fix a bug in set_ethernet_addr() which does not take into account possible
errors (or partial reads) returned by its helpers.  This can potentially lead
to writing random data into the device's MAC address registers.
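
A hedged sketch of the v2 fallback described above; read_mac_from_eeprom()
is a hypothetical stand-in for the driver's EEPROM read helpers:

	static void set_ethernet_addr(pegasus_t *pegasus)
	{
		u8 node_id[ETH_ALEN];

		if (read_mac_from_eeprom(pegasus, node_id) < 0) {
			/* EEPROM read failed: soldier on with a random MAC */
			eth_hw_addr_random(pegasus->net);
			return;
		}
		ether_addr_copy(pegasus->net->dev_addr, node_id);
	}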

Signed-off-by: Petko Manolov <petko.manolov@konsulko.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:18:42 -07:00
Mauro Carvalho Chehab
a93bdcb94a net: core: document two new elements of struct net_device
As warned by "make htmldocs", there are two new struct elements
that aren't documented:

	../include/linux/netdevice.h:2159: warning: Function parameter or member 'unlink_list' not described in 'net_device'
	../include/linux/netdevice.h:2159: warning: Function parameter or member 'nested_level' not described in 'net_device'

Fixes: 1fc70edb7d ("net: core: add nested_level variable in net_device")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 15:15:56 -07:00
Heinrich Schuchardt
0c7689830e Documentation/x86: Fix incorrect references to zero-page.txt
The file zero-page.txt does not exist. Add links to zero-page.rst
instead.

 [ bp: Massage a bit. ]

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201002190623.7489-1-xypron.glpk@gmx.de
2020-10-02 22:49:29 +02:00
Johannes Berg
a95bc734e6 netlink: fix policy dump leak
If userspace doesn't complete the policy dump, we leak the
allocated state. Fix this.

Fixes: d07dcf9aad ("netlink: add infrastructure to expose policies to userspace")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02 13:00:38 -07:00
Peilin Ye
6d53a9fe5a block/scsi-ioctl: Fix kernel-infoleak in scsi_put_cdrom_generic_arg()
scsi_put_cdrom_generic_arg() is copying uninitialized stack memory to
userspace, since the compiler may leave a 3-byte hole in the middle of
`cgc32`. Fix it by adding a padding field to `struct
compat_cdrom_generic_command`.
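
The pattern of the fix is to make the compiler's alignment hole an
explicit, always-initialized member; an illustrative fragment (structure
name and neighbouring fields are indicative only, not the full layout):

	struct compat_cgc_fragment {
		unsigned char	data_direction;
		unsigned char	pad[3];	/* explicit padding: no uninitialized
					 * hole is copied to userspace */
		compat_int_t	quiet;
	};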

Cc: stable@vger.kernel.org
Fixes: f3ee6e63a9 ("compat_ioctl: move CDROM_SEND_PACKET handling into scsi")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: syzbot+85433a479a646a064ab3@syzkaller.appspotmail.com
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-02 12:01:47 -06:00
Vlad Buslov
1253935ad8 net/mlx5e: Fix race condition on nhe->n pointer in neigh update
The current neigh update event handler implementation takes a reference to
the neighbour structure, assigns it to nhe->n, tries to schedule a workqueue
task and releases the reference if the task was already enqueued. This can
result in overwriting the existing nhe->n pointer with another neighbour
instance, which causes a double release of the instance (once in the neigh
update handler that failed to enqueue to the workqueue and another one in
the neigh update workqueue task that processes the updated nhe->n pointer
instead of the original one):

[ 3376.512806] ------------[ cut here ]------------
[ 3376.513534] refcount_t: underflow; use-after-free.
[ 3376.521213] Modules linked in: act_skbedit act_mirred act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 mlx5_ib mlx5_core mlxfw pci_hyperv_intf ptp pps_core nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd
 grace fscache ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp rpcrdma rdma_ucm ib_umad ib_ipoib ib_iser rdma_cm ib_cm iw_cm rfkill ib_uverbs ib_core sunrpc kvm_intel kvm iTCO_wdt iTCO_vendor_support virtio_net irqbypass net_failover crc32_pclmul lpc_ich i2c_i801 failover pcspkr i2c_smbus mfd_core ghash_clmulni_intel sch_fq_codel drm i2c
_core ip_tables crc32c_intel serio_raw [last unloaded: mlxfw]
[ 3376.529468] CPU: 8 PID: 22756 Comm: kworker/u20:5 Not tainted 5.9.0-rc5+ #6
[ 3376.530399] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 3376.531975] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core]
[ 3376.532820] RIP: 0010:refcount_warn_saturate+0xd8/0xe0
[ 3376.533589] Code: ff 48 c7 c7 e0 b8 27 82 c6 05 0b b6 09 01 01 e8 94 93 c1 ff 0f 0b c3 48 c7 c7 88 b8 27 82 c6 05 f7 b5 09 01 01 e8 7e 93 c1 ff <0f> 0b c3 0f 1f 44 00 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 13
[ 3376.536017] RSP: 0018:ffffc90002a97e30 EFLAGS: 00010286
[ 3376.536793] RAX: 0000000000000000 RBX: ffff8882de30d648 RCX: 0000000000000000
[ 3376.537718] RDX: ffff8882f5c28f20 RSI: ffff8882f5c18e40 RDI: ffff8882f5c18e40
[ 3376.538654] RBP: ffff8882cdf56c00 R08: 000000000000c580 R09: 0000000000001a4d
[ 3376.539582] R10: 0000000000000731 R11: ffffc90002a97ccd R12: 0000000000000000
[ 3376.540519] R13: ffff8882de30d600 R14: ffff8882de30d640 R15: ffff88821e000900
[ 3376.541444] FS:  0000000000000000(0000) GS:ffff8882f5c00000(0000) knlGS:0000000000000000
[ 3376.542732] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3376.543545] CR2: 0000556e5504b248 CR3: 00000002c6f10005 CR4: 0000000000770ee0
[ 3376.544483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3376.545419] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3376.546344] PKRU: 55555554
[ 3376.546911] Call Trace:
[ 3376.547479]  mlx5e_rep_neigh_update.cold+0x33/0xe2 [mlx5_core]
[ 3376.548299]  process_one_work+0x1d8/0x390
[ 3376.548977]  worker_thread+0x4d/0x3e0
[ 3376.549631]  ? rescuer_thread+0x3e0/0x3e0
[ 3376.550295]  kthread+0x118/0x130
[ 3376.550914]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 3376.551675]  ret_from_fork+0x1f/0x30
[ 3376.552312] ---[ end trace d84e8f46d2a77eec ]---

Fix the bug by moving the work_struct to a dedicated dynamically allocated
structure. This enables every event handler to work on its own private
neighbour pointer and removes the need for handling the case when the task
is already enqueued.
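
A hedged sketch of that pattern; structure and function names are
illustrative, not the exact mlx5e code:

	struct neigh_update_work {
		struct work_struct work;
		struct neighbour  *n;	/* private reference, dropped by the handler */
	};

	static void neigh_update_work_handler(struct work_struct *work)
	{
		struct neigh_update_work *w =
			container_of(work, struct neigh_update_work, work);

		/* ... process the update for w->n ... */
		neigh_release(w->n);
		kfree(w);
	}

	static int neigh_event(struct workqueue_struct *wq, struct neighbour *n)
	{
		struct neigh_update_work *w = kzalloc(sizeof(*w), GFP_ATOMIC);

		if (!w)
			return NOTIFY_DONE;
		INIT_WORK(&w->work, neigh_update_work_handler);
		neigh_hold(n);		/* released in the work handler */
		w->n = n;
		queue_work(wq, &w->work);
		return NOTIFY_DONE;
	}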

Fixes: 232c001398 ("net/mlx5e: Add support to neighbour update flow")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:58 -07:00
Aya Levin
d4a16052bc net/mlx5e: Fix VLAN create flow
When an interface is attached while in promiscuous mode and with VLAN
filtering turned off, both configurations are not respected and VLAN
filtering is performed.
There are 2 flows which add the any-vid rules during interface attach:
VLAN creation table and set rx mode. Each relies on the other to add the
any-vid rules, so eventually neither of them does.

Fix this by adding any-vid rules on VLAN creation regardless of
promiscuous mode.

Fixes: 9df30601c8 ("net/mlx5e: Restore vlan filter after seamless reset")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:58 -07:00
Aya Levin
8c7353b6f7 net/mlx5e: Fix VLAN cleanup flow
Prior to this patch, unloading an interface in promiscuous mode with the
RX VLAN filtering feature turned off resulted in a warning. This is due
to a wrong condition in the VLAN rules cleanup flow, which left the
any-vid rules in the VLAN steering table. These rules prevented
destroying the flow group and the flow table.

The any-vid rules are removed in 2 flows, but neither of them removes
them when both promiscuous mode is set and VLAN filtering is off. Fix
the issue by changing the condition of the VLAN table cleanup flow to
clean up also in case of promiscuous mode.

mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 20 wasn't destroyed, refcount > 1
mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 19 wasn't destroyed, refcount > 1
mlx5_core 0000:00:08.0: mlx5_destroy_flow_table:2112:(pid 28729): Flow table 262149 wasn't destroyed, refcount > 1
...
...
------------[ cut here ]------------
FW pages counter is 11560 after reclaiming all pages
WARNING: CPU: 1 PID: 28729 at
drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:660
mlx5_reclaim_startup_pages+0x178/0x230 [mlx5_core]
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
  mlx5_function_teardown+0x2f/0x90 [mlx5_core]
  mlx5_unload_one+0x71/0x110 [mlx5_core]
  remove_one+0x44/0x80 [mlx5_core]
  pci_device_remove+0x3e/0xc0
  device_release_driver_internal+0xfb/0x1c0
  device_release_driver+0x12/0x20
  pci_stop_bus_device+0x68/0x90
  pci_stop_and_remove_bus_device+0x12/0x20
  hv_eject_device_work+0x6f/0x170 [pci_hyperv]
  ? __schedule+0x349/0x790
  process_one_work+0x206/0x400
  worker_thread+0x34/0x3f0
  ? process_one_work+0x400/0x400
  kthread+0x126/0x140
  ? kthread_park+0x90/0x90
  ret_from_fork+0x22/0x30
   ---[ end trace 6283bde8d26170dc ]---

Fixes: 9df30601c8 ("net/mlx5e: Restore vlan filter after seamless reset")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:58 -07:00
Aya Levin
2608a2f831 net/mlx5e: Fix return status when setting unsupported FEC mode
Verify the configured FEC mode is supported by at least a single link
mode before applying the command. Otherwise fail the command and return
"Operation not supported".
Prior to this patch, the command was successful, yet it falsely set all
link modes to FEC auto mode - like configuring FEC mode to auto. Auto
mode is the default configuration if a link mode doesn't support the
configured FEC mode.

Fixes: b5ede32d33 ("net/mlx5e: Add support for FEC modes based on 50G per lane links")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:57 -07:00
Aya Levin
3d093bc236 net/mlx5e: Fix driver's declaration to support GRE offload
Declare GRE offload support with respect to the inner protocol. Add a
list of supported inner protocols on which the driver can offload
checksum and GSO. For other protocols, inform the stack to do the needed
operations. There is no noticeable impact on GRE performance.

Fixes: 2729984149 ("net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:57 -07:00
Maor Dickman
2b0219898b net/mlx5e: CT, Fix coverity issue
The cited commit introduced the following coverity issue at function
mlx5_tc_ct_rule_to_tuple_nat:
- Memory - corruptions (OVERRUN)
  Overrunning array "tuple->ip.src_v6.in6_u.u6_addr32" of 4 4-byte
  elements at element index 7 (byte offset 31) using index
  "ip6_offset" (which evaluates to 7).

In case of an IPv6 destination address rewrite, ip6_offset values are
between 4 and 7, which causes a memory overrun from array
"tuple->ip.src_v6.in6_u.u6_addr32" into array
"tuple->ip.dst_v6.in6_u.u6_addr32".

Fixed by writing the value directly to array
"tuple->ip.dst_v6.in6_u.u6_addr32" in case the ip6_offset value is
between 4 and 7.
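
A hedged sketch of the corrected write as a fragment, following the
variable names in the text above:

	if (ip6_offset < 4)
		tuple->ip.src_v6.s6_addr32[ip6_offset] = cpu_to_be32(val);
	else if (ip6_offset < 8)
		tuple->ip.dst_v6.s6_addr32[ip6_offset - 4] = cpu_to_be32(val);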

Fixes: bc562be967 ("net/mlx5e: CT: Save ct entries tuples in hashtables")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:57 -07:00
Aya Levin
c3c9402373 net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU
Prior to this fix, in Striding RQ mode the driver was vulnerable when
receiving packets in the range (stride size - headroom, stride size].
Where stride size is calculated by mtu+headroom+tailroom aligned to the
closest power of 2.
Usually, this filtering is performed by the HW, except for a few cases:
- Between 2 VFs over the same PF with different MTUs
- On bluefield, when the host physical function sets a larger MTU than
  the ARM has configured on its representor and uplink representor.

When the HW filtering is not present, packets that are larger than the
MTU might harm the RQ's integrity, with the following impacts:
1) Overflow from one WQE to the next, causing a memory corruption that
in most cases is harmless: the write happens to the headroom of the next
packet, which will be overwritten by build_skb(). In very rare cases
(high stress/load) it is harmful, when the next WQE is not yet reposted
and still points to an existing SKB head.
2) Each oversize packet overflows into the headroom of the next WQE. On
the last WQE of the WQ, where addresses wrap around, the remainder
headroom does not belong to the next WQE but lies outside the memory
region range. This results in a HW CQE error that moves the RQ into an
error state.

Solution:
Add a page buffer at the end of each WQE to absorb the leak. Actually
the maximal overflow size is headroom but since all memory units must be
of the same size, we use page size to comply with UMR WQEs. The increase
in memory consumption is of a single page per RQ. Initialize the mkey
with all MTTs pointing to a default page. When the channels are
activated, UMR WQEs will redirect the RX WQEs to the actual memory from
the RQ's pool, while the overflow MTTs remain mapped to the default page.

Fixes: 73281b78a3 ("net/mlx5e: Derive Striding RQ size from MTU")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:56 -07:00
Aya Levin
08a762cecc net/mlx5e: Fix error path for RQ alloc
Increase granularity of the error path to avoid unneeded free/release.
Fix the cleanup to be symmetric to the order of creation.

Fixes: 0ddf543226 ("xdp/mlx5: setup xdp_rxq_info")
Fixes: 422d4c401e ("net/mlx5e: RX, Split WQ objects for different RQ types")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:56 -07:00
Maor Gottlieb
732ebfab7f net/mlx5: Fix request_irqs error flow
Fix the error flow handling in request_irqs which tried to free an IRQ
that we failed to request.
It fixes the trace below.

WARNING: CPU: 1 PID: 7587 at kernel/irq/manage.c:1684 free_irq+0x4d/0x60
CPU: 1 PID: 7587 Comm: bash Tainted: G        W  OE    4.15.15-1.el7MELLANOXsmp-x86_64 #1
Hardware name: Advantech SKY-6200/SKY-6200, BIOS F2.00 08/06/2020
RIP: 0010:free_irq+0x4d/0x60
RSP: 0018:ffffc9000ef47af0 EFLAGS: 00010282
RAX: ffff88001476ae00 RBX: 0000000000000655 RCX: 0000000000000000
RDX: ffff88001476ae00 RSI: ffffc9000ef47ab8 RDI: ffff8800398bb478
RBP: ffff88001476a838 R08: ffff88001476ae00 R09: 000000000000156d
R10: 0000000000000000 R11: 0000000000000004 R12: ffff88001476a838
R13: 0000000000000006 R14: ffff88001476a888 R15: 00000000ffffffe4
FS:  00007efeadd32740(0000) GS:ffff88047fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc9cc010008 CR3: 00000001a2380004 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 mlx5_irq_table_create+0x38d/0x400 [mlx5_core]
 ? atomic_notifier_chain_register+0x50/0x60
 mlx5_load_one+0x7ee/0x1130 [mlx5_core]
 init_one+0x4c9/0x650 [mlx5_core]
 pci_device_probe+0xb8/0x120
 driver_probe_device+0x2a1/0x470
 ? driver_allows_async_probing+0x30/0x30
 bus_for_each_drv+0x54/0x80
 __device_attach+0xa3/0x100
 pci_bus_add_device+0x4a/0x90
 pci_iov_add_virtfn+0x2dc/0x2f0
 pci_enable_sriov+0x32e/0x420
 mlx5_core_sriov_configure+0x61/0x1b0 [mlx5_core]
 ? kstrtoll+0x22/0x70
 num_vf_store+0x4b/0x70 [mlx5_core]
 kernfs_fop_write+0x102/0x180
 __vfs_write+0x26/0x140
 ? rcu_all_qs+0x5/0x80
 ? _cond_resched+0x15/0x30
 ? __sb_start_write+0x41/0x80
 vfs_write+0xad/0x1a0
 SyS_write+0x42/0x90
 do_syscall_64+0x60/0x110
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
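
The rule the fix enforces is the usual partial-unwind pattern: on
failure, free only the IRQs that were actually requested, never the one
that just failed. A hedged, generic sketch (helper names hypothetical):

	static int request_irqs(struct mlx5_irq_table *table, int nvec)
	{
		int err, i;

		for (i = 0; i < nvec; i++) {
			err = request_irq(irq_number(table, i), irq_handler, 0,
					  irq_name(table, i), irq_context(table, i));
			if (err)
				goto err_request_irq;
		}
		return 0;

	err_request_irq:
		while (i--)	/* entry i itself failed, so it is not freed */
			free_irq(irq_number(table, i), irq_context(table, i));
		return err;
	}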

Fixes: 24163189da ("net/mlx5: Separate IRQ request/free from EQ life cycle")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:56 -07:00
Saeed Mahameed
b898ce7bcc net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessible
In case the PCI device is offline, reclaim_pages_cmd() will still try to
call the FW to release FW pages; cmd_exec() in this case will return a
silent success without actually calling the FW.

This is wrong and will cause page leaks. What we should do is detect PCI
offline or command interface unavailability before trying to access the
FW, and manually release the FW pages in the driver.

In this patch we share the code that checks for FW command interface
availability and call it in sensitive places, e.g. reclaim_pages_cmd().

Alternative fix:
 1. Remove MLX5_CMD_OP_MANAGE_PAGES from the mlx5_internal_err_ret_value
    command success simulation list.
 2. Always release FW pages even if cmd_exec fails in reclaim_pages_cmd().

Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:55 -07:00
Eran Ben Elisha
410bd754cd net/mlx5: Add retry mechanism to the command entry index allocation
It is possible that a new command entry index allocation will temporarily
fail. The new command holds the semaphore, so it means that a free entry
should be ready soon. Add a one second retry mechanism before returning an
error.

Patch "net/mlx5: Avoid possible free of command entry while timeout comp
handler" increases the possibility of bumping into this temporary failure
as it delays the entry index release for non-callback commands.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:55 -07:00
Eran Ben Elisha
1d5558b1f0 net/mlx5: poll cmd EQ in case of command timeout
Once the driver detects a command interface command timeout, it warns the
user and returns a timeout error to the caller. In such a case, the entry
of the command is not evacuated (because only a real event interrupt is
allowed to clear a command interface entry). If the HW event interrupt of
this entry never arrives, the entry will be left unused forever. Command
interface entries are limited, and eventually we can end up without the
ability to post a new command.

In addition, if the driver does not consume the EQE of the lost interrupt
and re-arm the EQ, no new interrupts will arrive for other commands.

Add a resiliency mechanism for manually polling the command EQ in case of
a command timeout. If the resiliency mechanism finds a non-handled EQE,
it will consume it, and the command interface will be fully functional
again. Once the resiliency flow has finished, wait another 5 seconds for
the command interface to complete for this command entry.

Define mlx5_cmd_eq_recover() to manage the cmd EQ polling resiliency flow.
Add an async EQ spinlock to avoid races between resiliency flows and real
interrupts that might run simultaneously.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:55 -07:00
Eran Ben Elisha
50b2412b7e net/mlx5: Avoid possible free of command entry while timeout comp handler
Upon a command completion timeout, the driver simulates a forced command
completion. In a rare case where the real interrupt for that command
arrives simultaneously, it might release the command entry while the
forced handler still accesses it.

Fix that by adding an entry refcount to track the current number of
allowed handlers. The command entry is released only when this refcount
is decremented to zero.

The command refcount is always initialized to one. For callback commands,
the command completion handler is the symmetric flow that decrements it.
For non-callback commands, it is wait_func().

Before ringing the doorbell, increment the refcount for the real
completion handler. Once the real completion handler is called, it will
decrement it.

For callback commands, once the delayed work is scheduled, increment the
refcount. In the callback command completion handler, we try to cancel
the timeout callback. In case of success, we need to decrement the
callback refcount as it will never run.

In addition, gather the entry index free and the entry free into one flow
for the release of all command types.
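
A hedged sketch of that lifetime rule; function and field names are
illustrative rather than the exact mlx5 source:

	static void cmd_ent_get(struct mlx5_cmd_work_ent *ent)
	{
		refcount_inc(&ent->refcnt);
	}

	static void cmd_ent_put(struct mlx5_cmd_work_ent *ent)
	{
		if (!refcount_dec_and_test(&ent->refcnt))
			return;
		/* last reference: free the entry index and the entry together */
		cmd_free_index(ent);
		cmd_free_ent(ent);
	}

	/* before ringing the doorbell: */
	cmd_ent_get(ent);	/* reference owned by the real completion handler */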

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:54 -07:00
Eran Ben Elisha
432161ea26 net/mlx5: Fix a race when moving command interface to polling mode
As part of driver unload, the driver destroys the commands EQ (via a FW
command). Once the commands EQ is destroyed, FW will not generate EQEs for
any command that the driver sends afterwards; the driver should poll for
the status of later commands.

The driver's command mode metadata is updated before the commands EQ is
actually destroyed. This can lead to a double completion handling by the
driver (polling and interrupt), if a command is executed and completed by
FW after the mode was changed, but before the EQ was destroyed.

Fix that by using the mlx5_cmd_allowed_opcode mechanism to guarantee
that only the DESTROY_EQ command can be executed during this time period.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-10-02 10:59:54 -07:00
Hans de Goede
9fb7779955 MAINTAINERS: Add Mark Gross and Hans de Goede as x86 platform drivers maintainers
Darren Hart and Andy Shevchenko lately have not had enough time to
maintain the x86 platform drivers, dropping their status to:
"Odd Fixes".

Mark Gross and Hans de Goede will take over maintainership of
the x86 platform drivers. Replace Darren and Andy's entries with
theirs and change the status to "Maintained".

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-10-02 17:55:01 +03:00
Hans de Goede
8169bd3e6e platform/x86: intel-vbtn: Switch to an allow-list for SW_TABLET_MODE reporting
2 recent commits:
cfae58ed68 ("platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE
on the 9 / "Laptop" chasis-type")
1fac39fd03 ("platform/x86: intel-vbtn: Also handle tablet-mode switch on
"Detachable" and "Portable" chassis-types")

enabled reporting of SW_TABLET_MODE on more devices, since the vbtn ACPI
interface is used by the firmware on some of those devices to report this.

Testing has shown that unconditionally enabling SW_TABLET_MODE reporting
on all devices with a chassis type of 8 ("Portable") or 10 ("Notebook")
which support the VGBS method is a very bad idea.

Many of these devices are normal laptop (non 2-in-1) models with a VGBS
which always returns 0, which we translate to SW_TABLET_MODE=1. This in
turn causes userspace (libinput) to suppress events from the builtin
keyboard and touchpad, making the laptop essentially unusable.

Since wrongly reporting SW_TABLET_MODE=1 in combination with libinput
leads to an unusable system, whereas OTOH many people will not even
notice when SW_TABLET_MODE is not being reported, this commit changes
intel_vbtn_has_switches() to use a DMI based allow-list.

The new DMI based allow-list matches on the 31 ("Convertible") and
32 ("Detachable") chassis-types, as these clearly are 2-in-1s and
so far if they support the intel-vbtn ACPI interface they all have
properly working SW_TABLET_MODE reporting.

Besides these 2 generic matches, it also contains model specific matches
for 2-in-1 models which use a different chassis-type and which are known
to have properly working SW_TABLET_MODE reporting.
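
A hedged sketch of the allow-list approach; the table contents and the
per-model entries are illustrative:

	#include <linux/acpi.h>
	#include <linux/dmi.h>

	static const struct dmi_system_id dmi_switches_allow_list[] = {
		{
			.matches = {
				DMI_EXACT_MATCH(DMI_CHASSIS_TYPE, "31" /* Convertible */),
			},
		},
		{
			.matches = {
				DMI_EXACT_MATCH(DMI_CHASSIS_TYPE, "32" /* Detachable */),
			},
		},
		/* model-specific 2-in-1 entries go here */
		{} /* terminator */
	};

	static bool intel_vbtn_has_switches(acpi_handle handle)
	{
		/* VGBS presence check omitted from this sketch */
		return dmi_check_system(dmi_switches_allow_list);
	}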

This has been tested on the following 2-in-1 devices:

Dell Venue 11 Pro 7130 vPro
HP Pavilion X2 10-p002nd
HP Stream x360 Convertible PC 11
Medion E1239T

Fixes: cfae58ed68 ("platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type")
BugLink: https://forum.manjaro.org/t/keyboard-and-touchpad-only-work-on-kernel-5-6/22668
BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1175599
Cc: Barnabás Pőcze <pobrn@protonmail.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-10-02 17:54:44 +03:00
Andy Shevchenko
21d64817c7 platform/x86: intel-vbtn: Revert "Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360"
After discussion, see the Link tag, it appears that this is not good enough.
So, revert it now and apply a better fix.

This reverts commit d823346876.

Link: https://lore.kernel.org/platform-driver-x86/s5hft71klxl.wl-tiwai@suse.de/
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-10-02 17:30:02 +03:00
Heiner Kallweit
ef9da46dde r8169: fix data corruption issue on RTL8402
Petr reported that after resume from suspend RTL8402 partially
truncates incoming packets, and re-initializing register RxConfig
before the actual chip re-initialization sequence is needed to avoid
the issue.

Reported-by: Petr Tesarik <ptesarik@suse.cz>
Proposed-by: Petr Tesarik <ptesarik@suse.cz>
Tested-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-01 12:37:21 -07:00
Heiner Kallweit
bb13a80062 r8169: fix handling ether_clk
Petr reported that the system freezes on r8169 driver load on a system
using ether_clk. The original change was done under the assumption
that the clock isn't needed for basic operations like chip register
access. But obviously that was wrong.
Therefore effectively revert the original change, and in addition
leave the clock active when suspending and WoL is enabled. Chip may
not be able to process incoming packets otherwise.

Fixes: 9f0b54cd16 ("r8169: move switching optional clock on/off to pll power functions")
Reported-by: Petr Tesarik <ptesarik@suse.cz>
Tested-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-01 12:35:21 -07:00
Yonghong Song
d82a532a61 bpf: Fix "unresolved symbol" build error with resolve_btfids
Michal reported a build failure like the one below:

   BTFIDS  vmlinux
   FAILED unresolved symbol tcp_timewait_sock
   make[1]: *** [/.../linux-5.9-rc7/Makefile:1176: vmlinux] Error 255

This error can be triggered when the config has CONFIG_NET enabled
but CONFIG_INET disabled. In this case, there is no user of the
structs inet_timewait_sock and tcp_timewait_sock, and hence
vmlinux BTF types are not generated for these two structures.

To fix the problem, let us force BTF generation for these two
structures with BTF_TYPE_EMIT.
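
A hedged sketch of the idea: BTF_TYPE_EMIT() merely references a type so
that vmlinux BTF contains it. The wrapper function and its placement are
hypothetical; the macro itself comes from include/linux/btf_ids.h:

	#include <linux/btf_ids.h>

	static void __maybe_unused emit_tw_sock_btf(void)
	{
		/* reference the types so vmlinux BTF is generated for them,
		 * even when CONFIG_INET=n leaves them otherwise unused */
		BTF_TYPE_EMIT(struct inet_timewait_sock);
		BTF_TYPE_EMIT(struct tcp_timewait_sock);
	}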

Fixes: fce557bcef ("bpf: Make btf_sock_ids global")
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201001051339.2549085-1-yhs@fb.com
2020-10-01 18:38:50 +02:00
Qian Cai
8a018eb55e pipe: Fix memory leaks in create_pipe_files()
Calling pipe2() with O_NOTIFICATION_PIPE could result in memory
leaks unless watch_queue_init() is successful.

        In case of watch_queue_init() failure in pipe2() we are left
with inode and pipe_inode_info instances that need to be freed.  That
failure exit has been introduced in commit c73be61ced ("pipe: Add
general notification queue support") and its handling should've been
identical to nearby treatment of alloc_file_pseudo() failures - it
is dealing with the same situation.  As it is, the mainline kernel
leaks in that case.

        Another problem is that CONFIG_WATCH_QUEUE and !CONFIG_WATCH_QUEUE
cases are treated differently (and the former leaks just pipe_inode_info,
the latter - both pipe_inode_info and inode).

        Fixed by providing a dummy watch_queue_init() in the !CONFIG_WATCH_QUEUE
case and by having failures of watch_queue_init() handled the same way
we handle alloc_file_pseudo() ones.
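
A hedged sketch of the shape of the fix: a dummy stub for the
!CONFIG_WATCH_QUEUE build, and one common failure path in
create_pipe_files(); the exact error value and cleanup calls are
illustrative:

	#ifndef CONFIG_WATCH_QUEUE
	static inline int watch_queue_init(struct pipe_inode_info *pipe)
	{
		return -ENOPKG;
	}
	#endif

	/* in create_pipe_files(), mirroring the alloc_file_pseudo() handling */
	if (flags & O_NOTIFICATION_PIPE) {
		err = watch_queue_init(pipe);
		if (err) {
			free_pipe_info(pipe);
			iput(inode);
			return err;
		}
	}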

Fixes: c73be61ced ("pipe: Add general notification queue support")
Signed-off-by: Qian Cai <cai@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-01 09:40:35 -04:00
David S. Miller
a59cf61978 Merge branch 'Fix-bugs-in-Octeontx2-netdev-driver'
Geetha sowjanya says:

====================
Fix bugs in Octeontx2 netdev driver

The existing Octeontx2 network driver code has issues
like stale entries in the broadcast replication list, missing
L3TYPE for IPv6 frames, running tx queues on error and a
race condition in the mbox reset.
This patch set fixes the above issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 15:07:19 -07:00
Hariprasad Kelam
66a5209b53 octeontx2-pf: Fix synchnorization issue in mbox
The mbox implementation in the octeontx2 driver has three states
in the mbox response: alloc, send and reset. The VF allocates and
sends a message to the PF for processing, the PF ACKs it back and
resets the mbox memory. In some cases we see a synchronization
issue where, after msgs_acked is incremented and before the
mbox_reset API is called, the current execution is scheduled
out and a different thread is scheduled in which checks
msgs_acked. Since the new thread sees msgs_acked == msgs_sent
it will try to allocate a new message and to send a new mbox
message to the PF. Now if mbox_reset is scheduled in, the PF
will see '0' in msgs_send.
This patch fixes the issue by calling mbox_reset before
incrementing the msgs_acked flag for the last processed message
and by checking for a valid message size.

Fixes: d424b6c02 ("octeontx2-pf: Enable SRIOV and added VF mbox handling")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 15:07:19 -07:00
Hariprasad Kelam
1ea0166da0 octeontx2-pf: Fix the device state on error
Currently in otx2_open, on failure of nix_lf_start the
transmit queues are not stopped, which were already
started in link_event. Since the tx queues are not
stopped, the network stack still tries to send packets,
leading to a driver crash while accessing device resources.

Fixes: 50fe6c02e ("octeontx2-pf: Register and handle link notifications")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 15:07:19 -07:00
Geetha sowjanya
89eae5e87b octeontx2-pf: Fix TCP/UDP checksum offload for IPv6 frames
The TCP/UDP checksum offload feature in Octeontx2
expects L3TYPE to be set irrespective of whether the IP
header checksum is being offloaded or not. Currently for
IPv6 frames L3TYPE is not being set, resulting in
packet drops with a checksum error. This patch fixes
this issue.

Fixes: 3ca6c4c88 ("octeontx2-pf: Add packet transmission support")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 15:07:19 -07:00
Subbaraya Sundeep
e154b5b703 octeontx2-af: Fix enable/disable of default NPC entries
The packet replication feature present in Octeontx2
is a hardware linked list of a PF and its VF
interfaces, so that broadcast packets are sent
to all interfaces present in the list. It is the
driver's job to add and delete a PF/VF interface
to/from the list when the interface is brought
up or down. This patch fixes the
npc_enadis_default_entries function to handle
broadcast replication properly if the packet
replication feature is present.

Fixes: 40df309e41 ("octeontx2-af: Support to enable/disable default MCAM entries")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 15:07:19 -07:00
David S. Miller
03e7e72ced Merge branch '100GbE' of https://github.com/anguy11/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2020-09-30

This series contains updates to ice driver only.

Jake increases the wait time for firmware response as it can take longer
than the current wait time. Preserves the NVM capabilities of the device in
safe mode so the device reports its NVM update capabilities properly
when in this state.

v2: Added cover letter
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 15:01:09 -07:00
Jacob Keller
be49b1ad29 ice: preserve NVM capabilities in safe mode
If the driver initializes in safe mode, it will call
ice_set_safe_mode_caps. This results in clearing the capabilities
structures, in order to set them up for operating in safe mode, ensuring
many features are disabled.

This has a side effect of also clearing the capability bits that relate
to NVM update. The result is that the device driver will not indicate
support for unified update, even if the firmware is capable.

Fix this by adding the relevant capability fields to the list of values
we preserve. To simplify the code, use a common_cap structure instead of
a handful of local variables. To reduce some duplication of the
capability name, introduce a couple of macros used to restore the
capabilities values from the cached copy.

Fixes: de9b277ee0 ("ice: Add support for unified NVM update flow capability")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Brijesh Behera <brijeshx.behera@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-09-30 08:32:35 -07:00
Jacob Keller
0ec86e8e82 ice: increase maximum wait time for flash write commands
The ice driver needs to wait for a firmware response to each command to
write a block of data to the scratch area used to update the device
firmware. The driver currently waits for up to 1 second for this to be
returned.

It turns out that firmware might take longer than 1 second to return
a completion in some cases. If this happens, the flash update will fail
to complete.

Fix this by increasing the maximum time that the driver will wait for
both writing a block of data, and for activating the new NVM bank. The
timeout for an erase command is already several minutes, as the firmware
had to erase the entire bank which was already expected to take a minute
or more in the worst case.

In the case where firmware really won't respond, we will now take longer
to fail. However, this ensures that if the firmware is simply slow to
respond, the flash update can still complete. This new maximum timeout
should not adversely increase the update time, as the implementation uses
wait_event_interruptible_timeout() and should wake very soon after we get
a completion event. It is better for a flash update to be slow but still
succeed than to fail because we gave up too quickly.

Fixes: d69ea414c9 ("ice: implement device flash update via devlink")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Brijesh Behera <brijeshx.behera@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-09-30 08:32:35 -07:00
Mike Christie
37787e9f81 vhost vdpa: fix vhost_vdpa_open error handling
We must free the vqs array in the open failure path, because
vhost_vdpa_release will not be called.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/1600712588-9514-2-git-send-email-michael.christie@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2020-09-30 11:25:06 -04:00
Mauro Carvalho Chehab
27204b99b0 drm: drm_dsc.h: fix a kernel-doc markup
As warned by Sphinx:

	./Documentation/gpu/drm-kms-helpers:305: ./include/drm/drm_dsc.h:587: WARNING: Unparseable C cross-reference: 'struct'
	Invalid C declaration: Expected identifier in nested name, got keyword: struct [error at 6]
	  struct
	  ------^

The markup for one struct is wrong, as struct is used twice.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/3d467022325e15bba8dcb13da8fb730099303266.1601467849.git.mchehab+huawei@kernel.org
2020-09-30 16:40:44 +02:00
Peter Collingbourne
112c35237c Partially revert "video: fbdev: amba-clcd: Retire elder CLCD driver"
Also partially revert the follow-up change "drm: pl111: Absorb the
external register header".

This reverts the parts of commits
7e4e589db7 and
0fb8125635 that touch paths outside
of drivers/gpu/drm/pl111.

The fbdev driver is used by Android's FVP configuration. Using the
DRM driver together with DRM's fbdev emulation results in a failure
to boot Android. The root cause is that Android's generic fbdev
userspace driver relies on the ability to set the pixel format via
FBIOPUT_VSCREENINFO, which is not supported by fbdev emulation.

There have been other less critical behavioral differences identified
between the fbdev driver and the DRM driver with fbdev emulation. The
DRM driver exposes different values for the panel's width, height and
refresh rate, and the DRM driver fails a FBIOPUT_VSCREENINFO syscall
with yres_virtual greater than the maximum supported value instead
of letting the syscall succeed and setting yres_virtual based on yres.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200929195344.2219796-1-pcc@google.com
2020-09-30 16:37:39 +02:00
David S. Miller
1f25c9bbfd Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2020-09-29

The following pull-request contains BPF updates for your *net* tree.

We've added 7 non-merge commits during the last 14 day(s) which contain
a total of 7 files changed, 28 insertions(+), 8 deletions(-).

The main changes are:

1) fix xdp loading regression in libbpf for old kernels, from Andrii.

2) Do not discard packet when NETDEV_TX_BUSY, from Magnus.

3) Fix corner cases in libbpf related to endianness and kconfig, from Tony.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 01:49:20 -07:00
Thomas Gleixner
bc21a291fc x86/mce: Use idtentry_nmi_enter/exit()
The recent fix for NMI vs. IRQ state tracking missed to apply the cure
to the MCE handler.

Fixes: ba1f2b2eaa ("x86/entry: Fix NMI vs IRQ state tracking")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/87mu17ism2.fsf@nanos.tec.linutronix.de
2020-09-30 10:41:56 +02:00
David S. Miller
2b3e981a94 Merge branch 'mptcp-Fix-for-32-bit-DATA_FIN'
Mat Martineau says:

====================
mptcp: Fix for 32-bit DATA_FIN

The main fix is contained in patch 2, and that commit message explains
the issue with not properly converting truncated DATA_FIN sequence
numbers sent by the peer.

With patch 2 adding an unlocked read of msk->ack_seq, patch 1 cleans up
access to that data with READ_ONCE/WRITE_ONCE.

This does introduce two merge conflicts with net-next, but both have
straightforward resolution. Patch 1 modifies a line that got removed in
net-next so the modification can be dropped when merging. Patch 2 will
require a trivial conflict resolution for a modified function
declaration.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 18:16:09 -07:00
Mat Martineau
1a49b2c2a5 mptcp: Handle incoming 32-bit DATA_FIN values
The peer may send a DATA_FIN mapping with either a 32-bit or 64-bit
sequence number. When a 32-bit sequence number is received for the
DATA_FIN, it must be expanded to 64 bits before comparing it to the
last acked sequence number. This expansion was missing.
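
A generic, hedged sketch of such an expansion (not the exact mptcp
helper): it returns the smallest 64-bit sequence number not below the
last acked one whose lower 32 bits equal the received value:

	#include <linux/bits.h>
	#include <linux/types.h>

	static u64 expand_seq32(u64 last_ack, u32 seq32)
	{
		u64 expanded = (last_ack & GENMASK_ULL(63, 32)) | seq32;

		if (expanded < last_ack)	/* low 32 bits already wrapped */
			expanded += BIT_ULL(32);
		return expanded;
	}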

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/93
Fixes: 3721b9b646 ("mptcp: Track received DATA_FIN sequence number and add related helpers")
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 18:15:46 -07:00
Mat Martineau
917944da3b mptcp: Consistently use READ_ONCE/WRITE_ONCE with msk->ack_seq
The msk->ack_seq value is sometimes read without the msk lock held, so
make proper use of READ_ONCE and WRITE_ONCE.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 18:15:40 -07:00
David S. Miller
4972c6ccf9 Merge branch 'via-rhine-Resume-fix-and-other-maintenance-work'
Kevin Brace says:

====================
via-rhine: Resume fix and other maintenance work

I use via-rhine based Ethernet regularly, and the Ethernet dying
after resume was really annoying me.  I decided to take the
matter into my own hands, and came up with a fix for the Ethernet
disappearing after resume.  I also want to take over the code
maintenance work for via-rhine.  The patches apply to the latest
code, but they should be backported to older kernels as well.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 14:23:45 -07:00
Kevin Brace
2b6b78e082 via-rhine: New device driver maintainer
Signed-off-by: Kevin Brace <kevinbrace@bracecomputerlab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 14:23:45 -07:00
Kevin Brace
9f5159e89d via-rhine: Eliminate version information
Signed-off-by: Kevin Brace <kevinbrace@bracecomputerlab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 14:23:45 -07:00
Kevin Brace
aa15190cf2 via-rhine: VTunknown1 device is really VT8251 South Bridge
The VIA Technologies VT8251 South Bridge's integrated Rhine-II
Ethernet MAC has a PCI revision value of 0x7c.  This was
verified on an ASUS P5V800-VM mainboard.

Signed-off-by: Kevin Brace <kevinbrace@bracecomputerlab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 14:23:45 -07:00
Kevin Brace
d120c9a81e via-rhine: Fix for the hardware having a reset failure after resume
In rhine_resume() and rhine_suspend(), the code calls netif_running()
to see if the network interface is down or not.  If it is down (i.e.,
netif_running() returning false), they will skip any housekeeping work
within the function relating to the hardware.  This becomes a problem
when the hardware resumes from a standby, since it is counting on
rhine_resume() to map its MMIO and power up the rest of the hardware.
Not getting its MMIO remapped and the rest of the hardware powered
up leads to a soft reset failure and hardware disappearance.  The
solution is to map its MMIO and power up the rest of the hardware inside
rhine_open() before the soft reset is performed.  This solution was
verified on an ASUS P5V800-VM mainboard's integrated Rhine-II Ethernet
MAC inside the VIA Technologies VT8251 South Bridge.

Signed-off-by: Kevin Brace <kevinbrace@bracecomputerlab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 14:23:45 -07:00
Tony Nguyen
6667df916f MAINTAINERS: Update MAINTAINERS for Intel ethernet drivers
Add Jesse Brandeburg and myself; remove Jeff Kirsher.

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 14:21:34 -07:00
David S. Miller
8ba00e2434 Merge branch 'More-incorrect-VCAP-offsets-for-mscc_ocelot-switch'
Vladimir Oltean says:

====================
More incorrect VCAP offsets for mscc_ocelot switch

This small series fixes some wrong tc-flower action fields in the
Seville and Felix DSA drivers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 13:24:17 -07:00
Vladimir Oltean
eaa0355c66 net: dsa: seville: fix VCAP IS2 action width
Since the actions are packed together in the action RAM, an incorrect
action width means that no action except the first one would behave
correctly.

The tc-flower offload has probably not been tested on this hardware
since its introduction.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 13:24:17 -07:00
Vladimir Oltean
460e985ea0 net: dsa: felix: fix incorrect action offsets for VCAP IS2
The port mask width was larger than the actual number of ports, and
therefore, all fields following this one were also shifted by the number
of excess bits. But the driver doesn't use the REW_OP, SMAC_REPLACE_ENA
or ACL_ID bits from the action vector, so the bug was inconsequential.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 13:24:17 -07:00
Willy Liu
bbc4d71d63 net: phy: realtek: fix rtl8211e rx/tx delay config
There are two chip pins named TXDLY and RXDLY which add the 2ns
delays to TXC and RXC for TXD/RXD latching. These two pins can be
configured via a 4.7k-ohm resistor to 3.3V as a hardware setting, but also
via a software setting (extension page 0xa4, register 0x1c, bits 13, 12
and 11).

The configuration register definitions from table 13 of the official PHY datasheet:
PHYAD[2:0] = PHY Address
AN[1:0] = Auto-Negotiation
Mode = Interface Mode Select
RX Delay = RX Delay
TX Delay = TX Delay
SELRGV = RGMII/GMII Selection

This table describes how to configure these hw pins via external
pull-high or pull-low resistors.

It was a misunderstanding to map them to the register bits below:
8:6 = PHY Address
5:4 = Auto-Negotiation
3 = Interface Mode Select
2 = RX Delay
1 = TX Delay
0 = SELRGV
So I removed the descriptions above and added the related settings as below:
14 = reserved
13 = force Tx RX Delay controlled by bit12 bit11
12 = Tx Delay
11 = Rx Delay
10:0 = Test && debug settings reserved by realtek

Test && debug settings are not recommended to be modified by default.

Fixes: f81dadbcf7 ("net: phy: realtek: Add rtl8211e rx/tx delays config")
Signed-off-by: Willy Liu <willy.liu@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 12:55:33 -07:00
Tonghao Zhang
1a03b8a35a virtio-net: don't disable guest csum when disable LRO
Open vSwitch and the Linux bridge will disable LRO on an interface
when it is added to them. Currently, when LRO is disabled, the
virtio-net guest csum is disabled too. That drops the forwarding
performance.

Fixes: a02e8964ea ("virtio-net: ethtool configurable LRO")
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29 12:53:19 -07:00
He Zhe
9cf51446e6 bpf, powerpc: Fix misuse of fallthrough in bpf_jit_comp()
The user-defined label following "fallthrough" is not considered by GCC
and causes a build failure.

kernel-source/include/linux/compiler_attributes.h:208:41: error: attribute
'fallthrough' not preceding a case label or default label [-Werror]
 208 | # define fallthrough __attribute__((__fallthrough__))
     |                      ^~~~~~~~~~~~~
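
An illustrative reproduction of the failure mode (case values are
hypothetical): GCC only accepts the attribute when the statement that
follows is a case or default label, not a user-defined goto target:

	switch (code) {
	case OP_ADD:
		/* ... */
		fallthrough;	/* -Werror: next label is not a case label */
	cond_branch:		/* user-defined label used as a goto target */
	case OP_SUB:
		/* ... */
		break;
	}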

Fixes: df561f6688 ("treewide: Use fallthrough pseudo-keyword")
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/bpf/20200928090023.38117-1-zhe.he@windriver.com
2020-09-29 16:39:11 +02:00
Jakub Kicinski
78b70155dc ethtool: mark netlink family as __ro_after_init
Like all genl families, ethtool_genl_family needs to not
be a straight-up constant, because it is modified/initialized
by genl_register_family(). After init, however, it is only
passed to genlmsg_put() & co., therefore we can mark it
as __ro_after_init.

Since the genl_family structure contains function pointers,
mark this as a fix.
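
A hedged sketch of the annotation; the ops and policy fields are omitted:

	static struct genl_family ethtool_genl_family __ro_after_init = {
		.name		= ETHTOOL_GENL_NAME,
		.version	= ETHTOOL_GENL_VERSION,
		/* ... ops, mcgrps, module, ... */
	};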

Fixes: 2b4a8990b7 ("ethtool: introduce ethtool netlink interface")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:52:50 -07:00
Jakub Kicinski
3ddf9b431b genetlink: add missing kdoc for validation flags
Validation flags are missing kdoc, add it.

Fixes: ef6243acb4 ("genetlink: optionally validate strictly/dumps")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:51:50 -07:00
Wilken Gottwalt
c92a79829c net: usb: ax88179_178a: add MCT usb 3.0 adapter
Adds the driver_info and usb ids of the AX88179 based MCT U3-A9003 USB
3.0 ethernet adapter.

Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:39:05 -07:00
Wilken Gottwalt
9666ea66a7 net: usb: ax88179_178a: fix missing stop entry in driver_info
Adds the missing .stop entry in the Belkin driver_info structure.

Fixes: e20bd60bf6 ("net: usb: asix88179_178a: Add support for the Belkin B2B128")
Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:37:28 -07:00
Manivannan Sadhasivam
a7809ff90c net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks
The rcu read locks are needed to avoid a potential race condition while
dereferencing the radix tree from multiple threads. The issue was
identified by syzbot. Below is the crash report:

=============================
WARNING: suspicious RCU usage
5.7.0-syzkaller #0 Not tainted
-----------------------------
include/linux/radix-tree.h:176 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by kworker/u4:1/21:
 #0: ffff88821b097938 ((wq_completion)qrtr_ns_handler){+.+.}-{0:0}, at: spin_unlock_irq include/linux/spinlock.h:403 [inline]
 #0: ffff88821b097938 ((wq_completion)qrtr_ns_handler){+.+.}-{0:0}, at: process_one_work+0x6df/0xfd0 kernel/workqueue.c:2241
 #1: ffffc90000dd7d80 ((work_completion)(&qrtr_ns.work)){+.+.}-{0:0}, at: process_one_work+0x71e/0xfd0 kernel/workqueue.c:2243

stack backtrace:
CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 5.7.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: qrtr_ns_handler qrtr_ns_worker
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1e9/0x30e lib/dump_stack.c:118
 radix_tree_deref_slot include/linux/radix-tree.h:176 [inline]
 ctrl_cmd_new_lookup net/qrtr/ns.c:558 [inline]
 qrtr_ns_worker+0x2aff/0x4500 net/qrtr/ns.c:674
 process_one_work+0x76e/0xfd0 kernel/workqueue.c:2268
 worker_thread+0xa7f/0x1450 kernel/workqueue.c:2414
 kthread+0x353/0x380 kernel/kthread.c:268
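
A hedged sketch of the locking rule the fix applies around the lookup;
the tree root name and the loop body are illustrative:

	#include <linux/radix-tree.h>

	struct radix_tree_iter iter;
	void __rcu **slot;

	rcu_read_lock();
	radix_tree_for_each_slot(slot, &servers, &iter, 0) {
		struct qrtr_server *srv = radix_tree_deref_slot(slot);

		if (!srv)
			continue;
		/* ... queue a lookup notification for srv ... */
	}
	rcu_read_unlock();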

Fixes: 0c2204a4ad ("net: qrtr: Migrate nameservice to kernel from userspace")
Reported-and-tested-by: syzbot+0f84f6eed90503da72fc@syzkaller.appspotmail.com
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:54:57 -07:00
David S. Miller
0ba56b89fa Merge branch 'net-core-fix-a-lockdep-splat-in-the-dev_addr_list'
Taehee Yoo says:

====================
net: core: fix a lockdep splat in the dev_addr_list.

This patchset is to avoid lockdep splat.

When a stacked interface graph is changed, netif_addr_lock() is called
recursively and it internally calls spin_lock_nested().
The parameter of spin_lock_nested() is 'dev->lower_level',
this is called subclass.
The problem of 'dev->lower_level' is that while 'dev->lower_level' is
being used as a subclass of spin_lock_nested(), its value can be changed.
So, spin_lock_nested() would be called recursively with the same
subclass value, the lockdep understands a deadlock.
In order to avoid this, a new variable is needed and it is going to be
used as a parameter of spin_lock_nested().
The first and second patch is a preparation patch for the third patch.
In the third patch, the problem will be fixed.

The first patch adds __netdev_upper_dev_unlink().
The existing netdev_upper_dev_unlink() is renamed to
__netdev_upper_dev_unlink(), and netdev_upper_dev_unlink()
is added as a wrapper of this function.

The second patch adds the netdev_nested_priv structure.
netdev_walk_all_{ upper | lower }_dev() pass both a private function
and a "data" pointer so callers can handle their own state.
At this point, the data pointer type is void *.
In order to make it easier to add common variables and functions,
the new netdev_nested_priv structure is added.

The third patch adds a new variable 'nested_level'
to the net_device structure.
This variable will be used as the parameter of spin_lock_nested() for
dev->addr_list_lock.
With this variable, the lockdep splat can be avoided.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:00:15 -07:00
Taehee Yoo
1fc70edb7d net: core: add nested_level variable in net_device
This patch adds a new variable 'nested_level' to the net_device
structure.
This variable will be used as the parameter of spin_lock_nested() for
dev->addr_list_lock.

netif_addr_lock() can be called recursively, so spin_lock_nested() is
used instead of spin_lock(), and dev->lower_level is used as the
parameter of spin_lock_nested().
But the dev->lower_level value can be updated while it is being used,
so lockdep would warn about a possible deadlock scenario.

When a stacked interface is deleted, netif_{uc | mc}_sync() is
called recursively.
So spin_lock_nested() is called recursively too.
At this moment, the dev->lower_level variable is used as its parameter.
The dev->lower_level value is updated immediately when interfaces are
unlinked/linked.
Thus, after unlinking, dev->lower_level shouldn't be used as the
parameter of spin_lock_nested().

    A (macvlan)
    |
    B (vlan)
    |
    C (bridge)
    |
    D (macvlan)
    |
    E (vlan)
    |
    F (bridge)

    A->lower_level : 6
    B->lower_level : 5
    C->lower_level : 4
    D->lower_level : 3
    E->lower_level : 2
    F->lower_level : 1

When an interface 'A' is removed, it releases resources.
At this moment, netif_addr_lock() would be called.
Then, netdev_upper_dev_unlink() is called recursively.
Then dev->lower_level is updated.
There is no problem.

But when the bridge module is removed, the 'C' and 'F' interfaces
are removed at once.
If 'F' is removed first, the lower_level values look like below.
    A->lower_level : 5
    B->lower_level : 4
    C->lower_level : 3
    D->lower_level : 2
    E->lower_level : 1
    F->lower_level : 1

Then, 'C' is removed. At this moment, netif_addr_lock() is called
recursively.
The ordering is like this:
C(3)->D(2)->E(1)->F(1)
At this moment, the lower_level values of 'E' and 'F' are the same.
So, lockdep warns about a possible deadlock scenario.

In order to avoid this problem, a new variable 'nested_level' is added.
This value is the same as dev->lower_level - 1.
But this value is updated in rtnl_unlock().
So, this variable can be used as a parameter of spin_lock_nested() safely
in the rtnl context.
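
A simplified sketch of how the new field is meant to be used as the lockdep
subclass (the helper name and exact call site are illustrative):

    static void my_netif_addr_lock(struct net_device *dev)
    {
            /* dev->nested_level is only updated in rtnl_unlock(), so in
             * the rtnl context it is stable and gives each level of the
             * stack its own lockdep subclass. */
            spin_lock_nested(&dev->addr_list_lock, dev->nested_level);
    }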

Test commands:
   ip link add br0 type bridge vlan_filtering 1
   ip link add vlan1 link br0 type vlan id 10
   ip link add macvlan2 link vlan1 type macvlan
   ip link add br3 type bridge vlan_filtering 1
   ip link set macvlan2 master br3
   ip link add vlan4 link br3 type vlan id 10
   ip link add macvlan5 link vlan4 type macvlan
   ip link add br6 type bridge vlan_filtering 1
   ip link set macvlan5 master br6
   ip link add vlan7 link br6 type vlan id 10
   ip link add macvlan8 link vlan7 type macvlan

   ip link set br0 up
   ip link set vlan1 up
   ip link set macvlan2 up
   ip link set br3 up
   ip link set vlan4 up
   ip link set macvlan5 up
   ip link set br6 up
   ip link set vlan7 up
   ip link set macvlan8 up
   modprobe -rv bridge

Splat looks like:
[   36.057436][  T744] WARNING: possible recursive locking detected
[   36.058848][  T744] 5.9.0-rc6+ #728 Not tainted
[   36.059959][  T744] --------------------------------------------
[   36.061391][  T744] ip/744 is trying to acquire lock:
[   36.062590][  T744] ffff8c4767509280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_set_rx_mode+0x19/0x30
[   36.064922][  T744]
[   36.064922][  T744] but task is already holding lock:
[   36.066626][  T744] ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60
[   36.068851][  T744]
[   36.068851][  T744] other info that might help us debug this:
[   36.070731][  T744]  Possible unsafe locking scenario:
[   36.070731][  T744]
[   36.072497][  T744]        CPU0
[   36.073238][  T744]        ----
[   36.074007][  T744]   lock(&vlan_netdev_addr_lock_key);
[   36.075290][  T744]   lock(&vlan_netdev_addr_lock_key);
[   36.076590][  T744]
[   36.076590][  T744]  *** DEADLOCK ***
[   36.076590][  T744]
[   36.078515][  T744]  May be due to missing lock nesting notation
[   36.078515][  T744]
[   36.080491][  T744] 3 locks held by ip/744:
[   36.081471][  T744]  #0: ffffffff98571df0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x236/0x490
[   36.083614][  T744]  #1: ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60
[   36.085942][  T744]  #2: ffff8c476c8da280 (&bridge_netdev_addr_lock_key/4){+...}-{2:2}, at: dev_uc_sync+0x39/0x80
[   36.088400][  T744]
[   36.088400][  T744] stack backtrace:
[   36.089772][  T744] CPU: 6 PID: 744 Comm: ip Not tainted 5.9.0-rc6+ #728
[   36.091364][  T744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[   36.093630][  T744] Call Trace:
[   36.094416][  T744]  dump_stack+0x77/0x9b
[   36.095385][  T744]  __lock_acquire+0xbc3/0x1f40
[   36.096522][  T744]  lock_acquire+0xb4/0x3b0
[   36.097540][  T744]  ? dev_set_rx_mode+0x19/0x30
[   36.098657][  T744]  ? rtmsg_ifinfo+0x1f/0x30
[   36.099711][  T744]  ? __dev_notify_flags+0xa5/0xf0
[   36.100874][  T744]  ? rtnl_is_locked+0x11/0x20
[   36.101967][  T744]  ? __dev_set_promiscuity+0x7b/0x1a0
[   36.103230][  T744]  _raw_spin_lock_bh+0x38/0x70
[   36.104348][  T744]  ? dev_set_rx_mode+0x19/0x30
[   36.105461][  T744]  dev_set_rx_mode+0x19/0x30
[   36.106532][  T744]  dev_set_promiscuity+0x36/0x50
[   36.107692][  T744]  __dev_set_promiscuity+0x123/0x1a0
[   36.108929][  T744]  dev_set_promiscuity+0x1e/0x50
[   36.110093][  T744]  br_port_set_promisc+0x1f/0x40 [bridge]
[   36.111415][  T744]  br_manage_promisc+0x8b/0xe0 [bridge]
[   36.112728][  T744]  __dev_set_promiscuity+0x123/0x1a0
[   36.113967][  T744]  ? __hw_addr_sync_one+0x23/0x50
[   36.115135][  T744]  __dev_set_rx_mode+0x68/0x90
[   36.116249][  T744]  dev_uc_sync+0x70/0x80
[   36.117244][  T744]  dev_uc_add+0x50/0x60
[   36.118223][  T744]  macvlan_open+0x18e/0x1f0 [macvlan]
[   36.119470][  T744]  __dev_open+0xd6/0x170
[   36.120470][  T744]  __dev_change_flags+0x181/0x1d0
[   36.121644][  T744]  dev_change_flags+0x23/0x60
[   36.122741][  T744]  do_setlink+0x30a/0x11e0
[   36.123778][  T744]  ? __lock_acquire+0x92c/0x1f40
[   36.124929][  T744]  ? __nla_validate_parse.part.6+0x45/0x8e0
[   36.126309][  T744]  ? __lock_acquire+0x92c/0x1f40
[   36.127457][  T744]  __rtnl_newlink+0x546/0x8e0
[   36.128560][  T744]  ? lock_acquire+0xb4/0x3b0
[   36.129623][  T744]  ? deactivate_slab.isra.85+0x6a1/0x850
[   36.130946][  T744]  ? __lock_acquire+0x92c/0x1f40
[   36.132102][  T744]  ? lock_acquire+0xb4/0x3b0
[   36.133176][  T744]  ? is_bpf_text_address+0x5/0xe0
[   36.134364][  T744]  ? rtnl_newlink+0x2e/0x70
[   36.135445][  T744]  ? rcu_read_lock_sched_held+0x32/0x60
[   36.136771][  T744]  ? kmem_cache_alloc_trace+0x2d8/0x380
[   36.138070][  T744]  ? rtnl_newlink+0x2e/0x70
[   36.139164][  T744]  rtnl_newlink+0x47/0x70
[ ... ]

Fixes: 845e0ebb44 ("net: change addr_list_lock back to static key")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:00:15 -07:00
Taehee Yoo
eff7423365 net: core: introduce struct netdev_nested_priv for nested interface infrastructure
Functions related to the nested interface infrastructure, such as
netdev_walk_all_{ upper | lower }_dev(), pass both a private function
and a "data" pointer so callers can handle their own state.
At this point, the data pointer type is void *.
In order to make it easier to add common variables and functions,
the new netdev_nested_priv structure is added.

In the following patch, a new member variable will be added into this
struct to fix the lockdep issue.
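
A sketch of the shape of the new structure and how a walker callback receives
it (the callback and its body are illustrative):

    struct netdev_nested_priv {
            /* later patches can add common members (e.g. flags) here */
            void *data;
    };

    static int my_walk_cb(struct net_device *lower,
                          struct netdev_nested_priv *priv)
    {
            int *visited = priv->data;      /* caller-owned data */

            (*visited)++;                   /* per-lower-device work */
            return 0;
    }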

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:00:15 -07:00
Taehee Yoo
fe8300fd8d net: core: add __netdev_upper_dev_unlink()
netdev_upper_dev_unlink() has to work differently depending on flags.
The idea is the same as for __netdev_upper_dev_link().

In the following patches, new flags will be added.
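
A sketch of the wrapper split described above (the parameter carrying the
flags/private data is illustrative; the real signature may differ):

    static void __netdev_upper_dev_unlink(struct net_device *dev,
                                          struct net_device *upper_dev,
                                          struct netdev_nested_priv *priv)
    {
            /* the actual unlink work, behaving according to priv/flags */
    }

    void netdev_upper_dev_unlink(struct net_device *dev,
                                 struct net_device *upper_dev)
    {
            struct netdev_nested_priv priv = { .data = NULL };

            __netdev_upper_dev_unlink(dev, upper_dev, &priv);
    }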

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:00:14 -07:00
Cong Wang
1aad804990 net_sched: remove a redundant goto chain check
All TC actions call tcf_action_check_ctrlact() to validate the
goto chain, so this check in tcf_action_init_1() is actually
redundant. Remove it to avoid the trouble of leaking memory.

Fixes: e49d8c22f1 ("net_sched: defer tcf_idr_insert() in tcf_action_init_1()")
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Suggested-by: Davide Caratti <dcaratti@redhat.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 12:58:45 -07:00
Nikolay Aleksandrov
f2f3729fb6 net: bridge: fdb: don't flush ext_learn entries
When a user-space software manages fdb entries externally it should
set the ext_learn flag which marks the fdb entry as externally managed
and avoids expiring it (they're treated as static fdbs). Unfortunately
on events where fdb entries are flushed (STP down, netlink fdb flush
etc) these fdbs are also deleted automatically by the bridge. That in turn
causes trouble for the managing user-space software (e.g. in MLAG setups
we lose remote fdb entries on port flaps).
These entries are completely externally managed, so we should avoid
automatically deleting them; the only exception is offloaded entries
(i.e. BR_FDB_ADDED_BY_EXT_LEARN + BR_FDB_OFFLOADED). They are flushed as
before.
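
A minimal sketch of the intended flush rule (the helper name is illustrative;
the flag bits are the ones named above):

    static bool fdb_should_flush(const struct net_bridge_fdb_entry *f)
    {
            /* Keep purely externally managed entries; only flush
             * ext_learn entries that are also offloaded, as before. */
            if (test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &f->flags) &&
                !test_bit(BR_FDB_OFFLOADED, &f->flags))
                    return false;
            return true;
    }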

Fixes: eb100e0e24 ("net: bridge: allow to add externally learned entries from user-space")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 12:47:43 -07:00
David S. Miller
a4be47afb0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2020-09-28

1) Fix a build warning in ip_vti if CONFIG_IPV6 is not set.
   From YueHaibing.

2) Restore IPCB on espintcp before handing the packet to xfrm
   as the information there is still needed.
   From Sabrina Dubroca.

3) Fix pmtu updating for xfrm interfaces.
   From Sabrina Dubroca.

4) Some xfrm state information was not cloned with xfrm_do_migrate.
   Fixes to clone the full xfrm state, from Antony Antony.

5) Use the correct address family in xfrm_state_find. The struct
   flowi must always be interpreted along with the original
   address family. This got lost over the years.
   Fix from Herbert Xu.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 12:25:42 -07:00
Michael Walle
6e3837668e spi: fsl-dspi: fix NULL pointer dereference
Since commit 530b5affc6 ("spi: fsl-dspi: fix use-after-free in remove
path") this driver causes a kernel oops:

[    1.891065] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080
[..]
[    2.056973] Call trace:
[    2.059425]  dspi_setup+0xc8/0x2e0
[    2.062837]  spi_setup+0xcc/0x248
[    2.066160]  spi_add_device+0xb4/0x198
[    2.069918]  of_register_spi_device+0x250/0x370
[    2.074462]  spi_register_controller+0x4f4/0x770
[    2.079094]  dspi_probe+0x5bc/0x7b0
[    2.082594]  platform_drv_probe+0x5c/0xb0
[    2.086615]  really_probe+0xec/0x3c0
[    2.090200]  driver_probe_device+0x60/0xc0
[    2.094308]  device_driver_attach+0x7c/0x88
[    2.098503]  __driver_attach+0x60/0xe8
[    2.102263]  bus_for_each_dev+0x7c/0xd0
[    2.106109]  driver_attach+0x2c/0x38
[    2.109692]  bus_add_driver+0x194/0x1f8
[    2.113538]  driver_register+0x6c/0x128
[    2.117385]  __platform_driver_register+0x50/0x60
[    2.122105]  fsl_dspi_driver_init+0x24/0x30
[    2.126302]  do_one_initcall+0x54/0x2d0
[    2.130149]  kernel_init_freeable+0x1ec/0x258
[    2.134520]  kernel_init+0x1c/0x120
[    2.138018]  ret_from_fork+0x10/0x34
[    2.141606] Code: 97e0b11d aa0003f3 b4000680 f94006e0 (f9404000)
[    2.147723] ---[ end trace 26cf63e6cbba33a8 ]---

This is because, since this commit, the allocation of the driver's private
data is done explicitly, and in this case spi_alloc_master() won't set the
correct pointer.

Also move the platform_set_drvdata() to have both next to each other.
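
A hedged sketch of the pattern: with separately allocated private data, the
controller drvdata must be set explicitly (the my_priv name and the trimmed
error handling are illustrative):

    struct my_priv *priv;
    struct spi_controller *ctlr;

    priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
    ctlr = spi_alloc_master(&pdev->dev, 0); /* no private data appended */

    spi_controller_set_devdata(ctlr, priv); /* would otherwise stay unset */
    platform_set_drvdata(pdev, priv);       /* kept next to the line above */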

Fixes: 530b5affc6 ("spi: fsl-dspi: fix use-after-free in remove path")
Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200928085500.28254-1-michael@walle.cc
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-28 20:17:42 +01:00
Heiner Kallweit
709a16be05 r8169: fix RTL8168f/RTL8411 EPHY config
Bit 2 was mistakenly set instead of bit 3, which is what the vendor driver sets.

Fixes: a7a92cf815 ("r8169: sync PCIe PHY init with vendor driver 8.047.01")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-27 13:37:30 -07:00
Marian-Cristian Rotariu
307eea32b2 dt-bindings: net: renesas,ravb: Add support for r8a774e1 SoC
Document RZ/G2H (R8A774E1) SoC bindings.

Signed-off-by: Marian-Cristian Rotariu <marian-cristian.rotariu.rb@bp.renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-27 13:32:06 -07:00
Ido Schimmel
7286502858 mlxsw: spectrum_acl: Fix mlxsw_sp_acl_tcam_group_add()'s error path
If mlxsw_sp_acl_tcam_group_id_get() fails, the mutex initialized earlier
is not destroyed.

Fix this by initializing the mutex after calling the function. This is
symmetric to mlxsw_sp_acl_tcam_group_del().

Fixes: 5ec2ee28d2 ("mlxsw: spectrum_acl: Introduce a mutex to guard region list updates")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-27 13:22:31 -07:00
Randy Dunlap
7dbbcf496f mdio: fix mdio-thunder.c dependency & build error
Fix build error by selecting MDIO_DEVRES for MDIO_THUNDER.
Fixes this build error:

ld: drivers/net/phy/mdio-thunder.o: in function `thunder_mdiobus_pci_probe':
drivers/net/phy/mdio-thunder.c:78: undefined reference to `devm_mdiobus_alloc_size'

Fixes: 379d7ac7ca ("phy: mdio-thunder: Add driver for Cavium Thunder SoC MDIO buses.")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: netdev@vger.kernel.org
Cc: David Daney <david.daney@cavium.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-27 13:21:28 -07:00
Igor Russkikh
059432495e net: atlantic: fix build when object tree is separate
Driver subfolder files refer to parent folder includes in an
absolute manner.

The Makefile contains a -I for this, but apparently that does not
work if the object tree is separate.

Add srctree to fix that.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 17:19:14 -07:00
Florian Fainelli
5e46e43c2a MAINTAINERS: Add Vladimir as a maintainer for DSA
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 17:14:20 -07:00
Ioana Ciornei
72e27c38ab dpaa2-eth: fix command version for Tx shaping
When adding support for TBF offload, the wrong command version
was used even though the command format is that of V2 of
dpni_set_tx_shaping(). This does not affect the functionality of TBF,
since the only change between these two versions is the addition of the
exceeded parameters, which are not used in TBF. Still, fix the bug so
that we keep things in sync.

Fixes: 39344a8962 ("dpaa2-eth: add API for Tx shaping")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 17:08:48 -07:00
Wilken Gottwalt
e42d72fea9 net: usb: ax88179_178a: add Toshiba usb 3.0 adapter
Adds the driver_info and USB IDs of the AX88179-based Toshiba USB 3.0
Ethernet adapter.

Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 17:07:25 -07:00
David S. Miller
dc171dcf8a Merge branch 'bonding-team-basic-dev-needed_headroom-support'
Eric Dumazet says:

====================
bonding/team: basic dev->needed_headroom support

Both the bonding and team drivers support non-Ethernet devices,
but missed proper dev->needed_headroom initialization.

syzbot found a crash caused by bonding; I mirrored the fix in team as well.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 17:04:56 -07:00
Eric Dumazet
89d01748b2 team: set dev->needed_headroom in team_setup_by_port()
Some devices set needed_headroom. If we ignore it, we might
end up crashing in various skb_push() callers, for example in
ipgre_header(), since some layers assume enough headroom has been
reserved.
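
A simplified sketch of the idea, in the spirit of team_setup_by_port()
(only the relevant field copies are shown):

    static void my_setup_by_port(struct net_device *dev,
                                 struct net_device *port_dev)
    {
            dev->header_ops      = port_dev->header_ops;
            dev->type            = port_dev->type;
            dev->hard_header_len = port_dev->hard_header_len;
            dev->needed_headroom = port_dev->needed_headroom; /* the fix */
            dev->addr_len        = port_dev->addr_len;
            dev->mtu             = port_dev->mtu;
    }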

Fixes: 1d76efe157 ("team: add support for non-ethernet devices")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 17:04:55 -07:00
Eric Dumazet
f32f193395 bonding: set dev->needed_headroom in bond_setup_by_slave()
syzbot managed to crash a host by creating a bond
with a GRE device.

For non-Ethernet devices, bonding calls bond_setup_by_slave()
instead of ether_setup(), and unfortunately dev->needed_headroom
was not copied from the newly added member.

[  171.243095] skbuff: skb_under_panic: text:ffffffffa184b9ea len:116 put:20 head:ffff883f84012dc0 data:ffff883f84012dbc tail:0x70 end:0xd00 dev:bond0
[  171.243111] ------------[ cut here ]------------
[  171.243112] kernel BUG at net/core/skbuff.c:112!
[  171.243117] invalid opcode: 0000 [#1] SMP KASAN PTI
[  171.243469] gsmi: Log Shutdown Reason 0x03
[  171.243505] Call Trace:
[  171.243506]  <IRQ>
[  171.243512]  [<ffffffffa171be59>] skb_push+0x49/0x50
[  171.243516]  [<ffffffffa184b9ea>] ipgre_header+0x2a/0xf0
[  171.243520]  [<ffffffffa17452d7>] neigh_connected_output+0xb7/0x100
[  171.243524]  [<ffffffffa186f1d3>] ip6_finish_output2+0x383/0x490
[  171.243528]  [<ffffffffa186ede2>] __ip6_finish_output+0xa2/0x110
[  171.243531]  [<ffffffffa186acbc>] ip6_finish_output+0x2c/0xa0
[  171.243534]  [<ffffffffa186abe9>] ip6_output+0x69/0x110
[  171.243537]  [<ffffffffa186ac90>] ? ip6_output+0x110/0x110
[  171.243541]  [<ffffffffa189d952>] mld_sendpack+0x1b2/0x2d0
[  171.243544]  [<ffffffffa189d290>] ? mld_send_report+0xf0/0xf0
[  171.243548]  [<ffffffffa189c797>] mld_ifc_timer_expire+0x2d7/0x3b0
[  171.243551]  [<ffffffffa189c4c0>] ? mld_gq_timer_expire+0x50/0x50
[  171.243556]  [<ffffffffa0fea270>] call_timer_fn+0x30/0x130
[  171.243559]  [<ffffffffa0fea17c>] expire_timers+0x4c/0x110
[  171.243563]  [<ffffffffa0fea0e3>] __run_timers+0x213/0x260
[  171.243566]  [<ffffffffa0fecb7d>] ? ktime_get+0x3d/0xa0
[  171.243570]  [<ffffffffa0ff9c4e>] ? clockevents_program_event+0x7e/0xe0
[  171.243574]  [<ffffffffa0f7e5d5>] ? sched_clock_cpu+0x15/0x190
[  171.243577]  [<ffffffffa0fe973d>] run_timer_softirq+0x1d/0x40
[  171.243581]  [<ffffffffa1c00152>] __do_softirq+0x152/0x2f0
[  171.243585]  [<ffffffffa0f44e1f>] irq_exit+0x9f/0xb0
[  171.243588]  [<ffffffffa1a02e1d>] smp_apic_timer_interrupt+0xfd/0x1a0
[  171.243591]  [<ffffffffa1a01ea6>] apic_timer_interrupt+0x86/0x90

Fixes: f5184d267c ("net: Allow netdevices to specify needed head/tailroom")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 17:04:55 -07:00
Ivan Khoronzhuk
4663ff6025 net: ethernet: cavium: octeon_mgmt: use phy_start and phy_stop
To also start the "phy state machine", with UP state as it should be,
phy_start() has to be used; otherwise the state machine is not even
triggered. After this change, negotiation is supposed to be triggered
by the state machine workqueue.

The previous usage was not correct, but the issue only appears after the
following patch, so add it as a fix.

Fixes: 74a992b359 ("net: phy: add phy_check_link_status")
Signed-off-by: Ivan Khoronzhuk <ikhoronz@cisco.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 16:51:31 -07:00
Wong Vee Khee
ac322f86b5 net: stmmac: Fix clock handling on remove path
While unloading the dwmac-intel driver, clk_disable_unprepare() is
being called twice, in stmmac_dvr_remove() and
intel_eth_pci_remove(). This causes a kernel panic on the second call.

Remove the second call of clk_disable_unprepare() in
intel_eth_pci_remove().

Fixes: 09f012e64e ("stmmac: intel: Fix clock handling on error and remove paths")
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 16:49:15 -07:00
Ronak Doshi
1dac3b1bc6 vmxnet3: fix cksum offload issues for non-udp tunnels
Commit dacce2be33 ("vmxnet3: add geneve and vxlan tunnel offload
support") added support for encapsulation offload. However, the inner
offload capability has to be restricted to UDP tunnels.

This patch fixes the issue for non-UDP tunnels by adding a features
check capability and filtering the appropriate features for non-UDP tunnels.

Fixes: dacce2be33 ("vmxnet3: add geneve and vxlan tunnel offload support")
Signed-off-by: Ronak Doshi <doshir@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 16:41:40 -07:00
David S. Miller
abe2f12d94 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2020-09-25

This series contains updates to the iavf and ice driver.

Sylwester fixes a crash with iavf resume due to getting the wrong pointers.

Ani fixes a call trace in ice resume by calling pci_save_state().

Jake fixes memory leaks in case of register_netdev() failure or
ice_cfg_vsi_lan() failure for the ice driver.

v2: Rebased; no other changes
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 16:24:20 -07:00
David S. Miller
4e1b469ab0 Merge tag 'wireless-drivers-2020-09-25' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:

====================
wireless-drivers fixes for v5.9

Second, and last, set of fixes for v5.9. Only one important regression
fix for mt76.

mt76

* fix a regression in aggregation which appeared after mac80211 changes
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25 13:14:15 -07:00
Jacob Keller
f6a07271bb ice: fix memory leak in ice_vsi_setup
During ice_vsi_setup, if ice_cfg_vsi_lan fails, it does not properly
release memory associated with the VSI rings. If we had used devres
allocations for the rings, this would be ok. However, we use kzalloc and
kfree_rcu for these ring structures.

Using the correct label to cleanup the rings during ice_vsi_setup
highlights an issue in the ice_vsi_clear_rings function: it can leave
behind stale ring pointers in the q_vectors structure.

When releasing rings, we must also ensure that no q_vector associated
with the VSI will point to this ring again. To resolve this, loop over
all q_vectors and release their ring mapping. Because we are about to
free all rings, no q_vector should remain pointing to any of the rings
in this VSI.

Fixes: 5513b920a4 ("ice: Update Tx scheduler tree for VSI multi-Tx queue support")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-09-25 07:39:24 -07:00
Jacob Keller
135f4b9e93 ice: fix memory leak if register_netdev_fails
The ice_setup_pf_sw function can cause a memory leak if register_netdev
fails, due to accidentally failing to free the VSI rings. Fix the memory
leak by using ice_vsi_release, ensuring we actually go through the full
teardown process.

This should be safe even if the netdevice is not registered because we
will have set the netdev pointer to NULL, ensuring ice_vsi_release won't
call unregister_netdev.

An alternative fix would be moving management of the PF VSI netdev into
the main VSI setup code. This is complicated and likely requires a
significant refactor of how we manage VSIs.

Fixes: 3a858ba392 ("ice: Add support for VSI allocation and deallocation")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-09-25 07:39:24 -07:00
Anirudh Venkataramanan
466e439292 ice: Fix call trace on suspend
It appears that the ice_suspend flow is missing a call to pci_save_state
and this is triggering the message "State of device not saved by
ice_suspend" and a call trace. Fix it.

Fixes: 769c500dcc ("ice: Add advanced power mgmt for WoL")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-09-25 07:39:24 -07:00
Sylwester Dziedziuch
75598a8fc0 iavf: Fix incorrect adapter get in iavf_resume
When calling iavf_resume there was a crash because the wrong
function was used to get the iavf_adapter and net_device pointers.
Change how iavf_resume gets the iavf_adapter and net_device
pointers from the pci_dev.

Fixes: 5eae00c57f ("i40evf: main driver core")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-09-25 07:39:24 -07:00
Peilin Ye
5af0864079 fbcon: Fix global-out-of-bounds read in fbcon_get_font()
fbcon_get_font() is reading out-of-bounds. A malicious user may resize
`vc->vc_font.height` to a large value, causing fbcon_get_font() to
read out of `fontdata`.

fbcon_get_font() handles both built-in and user-provided fonts.
Fortunately, recently we have added FONT_EXTRA_WORDS support for built-in
fonts, so fix it by adding range checks using FNTSIZE().
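
A minimal sketch of the added bounds check, assuming the FONT_EXTRA_WORDS
helpers from the companion patches (the exact size computation depends on the
font width and is only illustrative here):

    /* refuse to read more glyph bytes than the built-in buffer holds */
    if (font->charcount * vc->vc_font.height > FNTSIZE(fontdata))
            return -EINVAL;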

This patch depends on patch "fbdev, newport_con: Move FONT_EXTRA_WORDS
macros into linux/font.h", and patch "Fonts: Support FONT_EXTRA_WORDS
macros for built-in fonts".

Cc: stable@vger.kernel.org
Reported-and-tested-by: syzbot+29d4ed7f3bdedf2aa2fd@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=08b8be45afea11888776f897895aef9ad1c3ecfd
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/b34544687a1a09d6de630659eb7a773f4953238b.1600953813.git.yepeilin.cs@gmail.com
2020-09-25 10:29:22 +02:00
Peilin Ye
6735b4632d Fonts: Support FONT_EXTRA_WORDS macros for built-in fonts
syzbot has reported an issue in the framebuffer layer, where a malicious
user may overflow our built-in font data buffers.

In order to perform a reliable range check, subsystems need to know
`FONTDATAMAX` for each built-in font. Unfortunately, our font descriptor,
`struct console_font` does not contain `FONTDATAMAX`, and is part of the
UAPI, making it infeasible to modify it.

For user-provided fonts, the framebuffer layer resolves this issue by
reserving four extra words at the beginning of data buffers. Later,
whenever a function needs to access them, it simply uses a set of
helper macros.

Recently we have gathered all of these macros into <linux/font.h>. Let us
do the same thing for built-in fonts, prepend four extra words (including
`FONTDATAMAX`) to their data buffers, so that subsystems can use these
macros for all fonts, no matter built-in or user-provided.

This patch depends on patch "fbdev, newport_con: Move FONT_EXTRA_WORDS
macros into linux/font.h".

Cc: stable@vger.kernel.org
Link: https://syzkaller.appspot.com/bug?id=08b8be45afea11888776f897895aef9ad1c3ecfd
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/ef18af00c35fb3cc826048a5f70924ed6ddce95b.1600953813.git.yepeilin.cs@gmail.com
2020-09-25 10:28:51 +02:00
Peilin Ye
bb0890b4cd fbdev, newport_con: Move FONT_EXTRA_WORDS macros into linux/font.h
drivers/video/console/newport_con.c is borrowing FONT_EXTRA_WORDS macros
from drivers/video/fbdev/core/fbcon.h. To keep things simple, move all
definitions into <linux/font.h>.

Since newport_con now uses four extra words, initialize the fourth word in
newport_set_font() properly.

Cc: stable@vger.kernel.org
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/7fb8bc9b0abc676ada6b7ac0e0bd443499357267.1600953813.git.yepeilin.cs@gmail.com
2020-09-25 10:28:18 +02:00
Herbert Xu
e94ee17134 xfrm: Use correct address family in xfrm_state_find
The struct flowi must never be interpreted by itself as its size
depends on the address family.  Therefore it must always be grouped
with its original family value.

In this particular instance, the original family value is lost in
the function xfrm_state_find.  Therefore we get a bogus read when
it's coupled with the wrong family which would occur with inter-
family xfrm states.

This patch fixes it by keeping the original family value.

Note that the same bug could potentially occur in LSM through
the xfrm_state_pol_flow_match hook.  I checked the current code
there and it seems to be safe for now as only secid is used which
is part of struct flowi_common.  But that API should be changed
so that we don't get new bugs in the future.  We could
do that by replacing fl with just secid or adding a family field.

Reported-by: syzbot+577fbac3145a6eb2e7a5@syzkaller.appspotmail.com
Fixes: 48b8d78315 ("[XFRM]: State selection update to use inner...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-09-25 09:59:51 +02:00
Priyaranjan Jha
ad2b9b0f8d tcp: skip DSACKs with dubious sequence ranges
Currently, we use length of DSACKed range to compute number of
delivered packets. And if sequence range in DSACK is corrupted,
we can get bogus dsacked/acked count, and bogus cwnd.

This patch puts bounds on the DSACKed range to skip updating data
delivery and spurious retransmission information if the DSACK
is unlikely to have been caused by the sender's actions:
- The DSACKed range shouldn't be greater than the maximum advertised rwnd.
- The total no. of DSACKed segments shouldn't be greater than the total
  no. of retransmitted segs. Unlike spurious retransmits, network
  duplicates or corrupted DSACKs shouldn't be counted as delivery.
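
A hedged sketch of the two checks (the struct tcp_sock fields are real; the
helper and exact accounting are illustrative):

    static bool dsack_range_plausible(const struct tcp_sock *tp,
                                      u32 start_seq, u32 end_seq)
    {
            u32 dup_segs;

            /* 1) a DSACK block larger than the maximum advertised
             *    window cannot describe something this sender sent */
            if (end_seq - start_seq > tp->max_window)
                    return false;

            /* 2) in total, we cannot have DSACKed more segments than
             *    we ever retransmitted */
            dup_segs = max_t(u32, 1,
                             (end_seq - start_seq) / tp->mss_cache);
            if (tp->dsack_dups + dup_segs > tp->total_retrans)
                    return false;

            return true;
    }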

Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 20:15:45 -07:00
Jamie Iles
1ec8e74855 net/fsl: quieten expected MDIO access failures
MDIO reads can happen during PHY probing, and printing an error with
dev_err can result in a large number of error messages during device
probe.  On a platform with a serial console this can result in
excessively long boot times in a way that looks like an infinite loop
when multiple busses are present.  Since 0f183fd151 (net/fsl: enable
extended scanning in xgmac_mdio) we perform more scanning so there are
potentially more failures.

Reduce the logging level to dev_dbg which is consistent with the
Freescale enetc driver.

Cc: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 20:13:26 -07:00
Helmut Grohne
912aae27c6 net: dsa: microchip: really look for phy-mode in port nodes
The previous implementation failed to account for the "ports" node. The
actual port nodes are not child nodes of the switch node, but a "ports"
node sits in between.

Fixes: edecfa98f6 ("net: dsa: microchip: look for phy-mode in port nodes")
Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 20:08:48 -07:00
Rohit Maheshwari
38f7e1c0c4 net/tls: race causes kernel panic
BUG: kernel NULL pointer dereference, address: 00000000000000b8
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 80000008b6fef067 P4D 80000008b6fef067 PUD 8b6fe6067 PMD 0
 Oops: 0000 [#1] SMP PTI
 CPU: 12 PID: 23871 Comm: kworker/12:80 Kdump: loaded Tainted: G S
 5.9.0-rc3+ #1
 Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.1 03/29/2018
 Workqueue: events tx_work_handler [tls]
 RIP: 0010:tx_work_handler+0x1b/0x70 [tls]
 Code: dc fe ff ff e8 16 d4 a3 f6 66 0f 1f 44 00 00 0f 1f 44 00 00 55 53 48 8b
 6f 58 48 8b bd a0 04 00 00 48 85 ff 74 1c 48 8b 47 28 <48> 8b 90 b8 00 00 00 83
 e2 02 75 0c f0 48 0f ba b0 b8 00 00 00 00
 RSP: 0018:ffffa44ace61fe88 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff91da9e45cc30 RCX: dead000000000122
 RDX: 0000000000000001 RSI: ffff91da9e45cc38 RDI: ffff91d95efac200
 RBP: ffff91da133fd780 R08: 0000000000000000 R09: 000073746e657665
 R10: 8080808080808080 R11: 0000000000000000 R12: ffff91dad7d30700
 R13: ffff91dab6561080 R14: 0ffff91dad7d3070 R15: ffff91da9e45cc38
 FS:  0000000000000000(0000) GS:ffff91dad7d00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00000000000000b8 CR3: 0000000906478003 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  process_one_work+0x1a7/0x370
  worker_thread+0x30/0x370
  ? process_one_work+0x370/0x370
  kthread+0x114/0x130
  ? kthread_park+0x80/0x80
  ret_from_fork+0x22/0x30

tls_sw_release_resources_tx() waits for encrypt_pending, which
can have a race, so we need similar changes as in commit
0cada33241 here as well.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 20:06:29 -07:00
Wang Qing
0eb11dfe22 net/ethernet/broadcom: fix spelling typo
Fix the comment typo: "compliment" -> "complement".

Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 20:05:48 -07:00
Xiaoliang Yang
4ab810a4e0 net: mscc: ocelot: fix fields offset in SG_CONFIG_REG_3
The INIT_IPS and GATE_ENABLE fields have the wrong offset in SG_CONFIG_REG_3.
This register is used by the stream gate control of PSFP, and it has not
been used before, because PSFP is not implemented in the ocelot driver.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 20:00:40 -07:00
Xiaoliang Yang
dba1e4660a net: dsa: felix: convert TAS link speed based on phylink speed
state->speed holds a value of 10, 100, 1000 or 2500, but
QSYS_TAG_CONFIG_LINK_SPEED expects a value of 0, 1, 2, 3. So convert the
speed to a proper value.
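
One plausible shape of the conversion (the helper name and exact mapping are
illustrative; only the value range 0-3 is given above):

    static u32 my_tas_link_speed(int speed)
    {
            switch (speed) {
            case SPEED_10:
                    return 0;
            case SPEED_100:
                    return 1;
            case SPEED_1000:
                    return 2;
            case SPEED_2500:
                    return 3;
            default:
                    return 0;
            }
    }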

Fixes: de143c0e27 ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload")
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 20:00:40 -07:00
Luo bin
f68910a805 hinic: fix wrong return value of mac-set cmd
It should also be regarded as an error when the hardware returns status=4
for the PF's set-mac command. Only when the PF returns status=4 to the VF
should this command be given special treatment.

Fixes: 7dd29ee128 ("hinic: add sriov feature support")
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 20:00:40 -07:00
Xie He
ed46cd1d4c drivers/net/wan/x25_asy: Correct the ndo_open and ndo_stop functions
1.
Move the lapb_register/lapb_unregister calls into the ndo_open/ndo_stop
functions.
This makes the LAPB protocol start/stop when the network interface
starts/stops. When the network interface is down, the LAPB protocol
shouldn't be running and the LAPB module shoudn't be generating control
frames.

2.
Move netif_start_queue/netif_stop_queue into the ndo_open/ndo_stop
functions.
This makes the TX queue start/stop when the network interface
starts/stops.
(netif_stop_queue was originally in the ndo_stop function. But to make
the code look better, I created a new function to use as ndo_stop, and
made it call the original ndo_stop function. I moved netif_stop_queue
from the original ndo_stop function to the new ndo_stop function.)
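
A rough sketch of the resulting ndo_open/ndo_stop shape (error handling
trimmed; the callbacks structure name is illustrative):

    static int x25_asy_open_dev(struct net_device *dev)
    {
            if (lapb_register(dev, &x25_asy_callbacks) != LAPB_OK)
                    return -ENODEV;         /* start LAPB with the iface */
            netif_start_queue(dev);         /* start TX with the iface */
            return 0;
    }

    static int x25_asy_close_dev(struct net_device *dev)
    {
            netif_stop_queue(dev);          /* stop TX first */
            lapb_unregister(dev);           /* then stop LAPB */
            return 0;
    }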

Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 19:52:58 -07:00
Maciej Żenczykowski
02a1b175b0 net/ipv4: always honour route mtu during forwarding
Documentation/networking/ip-sysctl.txt:46 says:
  ip_forward_use_pmtu - BOOLEAN
    By default we don't trust protocol path MTUs while forwarding
    because they could be easily forged and can lead to unwanted
    fragmentation by the router.
    You only need to enable this if you have user-space software
    which tries to discover path mtus by itself and depends on the
    kernel honoring this information. This is normally not the case.
    Default: 0 (disabled)
    Possible values:
    0 - disabled
    1 - enabled

Which makes it pretty clear that setting it to 1 is a potential
security/safety/DoS issue, and yet it is entirely reasonable to want
forwarded traffic to honour explicitly administrator configured
route mtus (instead of defaulting to device mtu).

Indeed, I can't think of a single reason why you wouldn't want to.
Since you configured a route mtu you probably know better...

It is pretty common to have a higher device mtu to allow receiving
large (jumbo) frames, while having some routes via that interface
(potentially including the default route to the internet) specify
a lower mtu.

Note that ipv6 forwarding uses device mtu unless the route is locked
(in which case it will use the route mtu).

This approach is not usable for IPv4 where an 'mtu lock' on a route
also has the side effect of disabling TCP path mtu discovery via
disabling the IPv4 DF (don't frag) bit on all outgoing frames.

I'm not aware of a way to lock a route from an IPv6 RA, so that also
potentially seems wrong.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Sunmeet Gill (Sunny) <sgill@quicinc.com>
Cc: Vinay Paradkar <vparadka@qti.qualcomm.com>
Cc: Tyler Wear <twear@quicinc.com>
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 19:51:16 -07:00
David S. Miller
6d8899962a Merge branch 'net_sched-fix-a-UAF-in-tcf_action_init'
Cong Wang says:

====================
net_sched: fix a UAF in tcf_action_init()

This patchset fixes a use-after-free triggered by syzbot. Please
find more details in each patch description.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 19:46:21 -07:00
Cong Wang
0fedc63fad net_sched: commit action insertions together
syzbot is able to trigger a failure case inside the loop in
tcf_action_init(), and when this happens we clean up with
tcf_action_destroy(). But, as these actions are already inserted
into the global IDR, other parallel process could free them
before tcf_action_destroy(), then we will trigger a use-after-free.

Fix this by deferring the insertions even later, after the loop,
and committing all the insertions in a separate loop, so we will
never fail in the middle of the insertions any more.

One side effect is that the window between allocation and final
insertion becomes larger; now it is more likely that the loop in
tcf_del_walker() sees the placeholder -EBUSY pointer. So we have
to check for error pointer in tcf_del_walker().

Reported-and-tested-by: syzbot+2287853d392e4b42374a@syzkaller.appspotmail.com
Fixes: 0190c1d452 ("net: sched: atomically check-allocate action")
Cc: Vlad Buslov <vladbu@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 19:46:21 -07:00
Cong Wang
e49d8c22f1 net_sched: defer tcf_idr_insert() in tcf_action_init_1()
All TC actions call tcf_idr_insert() for new action at the end
of their ->init(), so we can actually move it to a central place
in tcf_action_init_1().

And once the action is inserted into the global IDR, another parallel
process could free it immediately as its refcnt is still 1, so we cannot
fail after this; we need to move it after the goto action
validation to avoid handling the failure case after insertion.

This is found during code review, is not directly triggered by syzbot.
And this prepares for the next patch.

Cc: Vlad Buslov <vladbu@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-24 19:46:21 -07:00
Andrii Nakryiko
87f92ac4c1 libbpf: Fix XDP program load regression for old kernels
Fix a regression in libbpf, introduced by the XDP link change, which causes XDP
programs to fail to be loaded into the kernel due to the specified BPF_XDP
expected_attach_type. While the kernel doesn't enforce expected_attach_type for
BPF_PROG_TYPE_XDP, some old kernels already support XDP programs, but they
don't yet recognize the expected_attach_type field in bpf_attr, so setting it to
a non-zero value causes the program load to fail.

Luckily, libbpf already has a mechanism to deal with such cases, so just make
expected_attach_type optional for XDP programs.

Fixes: dc8698cac7 ("libbpf: Add support for BPF XDP link")
Reported-by: Nikita Shirokov <tehnerd@tehnerd.com>
Reported-by: Udip Pant <udippant@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200924171705.3803628-1-andriin@fb.com
2020-09-24 10:33:02 -07:00
Felix Fietkau
efb1676306 mt76: mt7615: reduce maximum VHT MPDU length to 7991
After fixing mac80211 to allow larger A-MSDUs in some cases, there have been
reports of performance regressions and packet loss with some clients.
It appears that the issue occurs when the hardware is transmitting A-MSDUs
bigger than 8k. Limit the local VHT MPDU size capability to 7991, matching
the value used for MT7915 as well.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200923052442.24141-1-nbd@nbd.name
2020-09-24 16:12:22 +03:00
Greg Kroah-Hartman
938835aa90 platform/x86: intel_pmc_core: do not create a static struct device
A struct device is a dynamic structure, with reference counting.
"Tricking" the kernel to make a dynamic structure static, by working
around the driver core release detection logic, is not nice.

Because of this, this code has been used as an example for others on
"how to do things", which is just about the worst thing possible to have
happen.

Fix this all up by making the platform device dynamic and providing a
real release function.

Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
Cc: Vishwanath Somayaji <vishwanath.somayaji@intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Andy Shevchenko <andy@infradead.org>
Cc: Rajat Jain <rajatja@google.com>
Cc: platform-driver-x86@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
Fixes: b02f6a2ef0 ("platform/x86: intel_pmc_core: Attach using APCI HID "INT33A1"")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Rajat Jain <rajatja@google.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 14:05:21 +03:00
Vadim Pasternak
2b06a1c889 platform/x86: mlx-platform: Fix extended topology configuration for power supply units
Fix topology configuration for power supply units in structure
'mlxplat_mlxcpld_ext_pwr_items_data', due to hardware change.

Note: no need to backport the fix, since there is no such hardware yet
(equipped with four power) at the filed.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 14:05:20 +03:00
Ed Wildgoose
fce55cc8b7 platform/x86: pcengines-apuv2: Fix typo on define of AMD_FCH_GPIO_REG_GPIO55_DEVSLP0
Schematics show that the GPIO number is 55 (not 59). Trivial typo.

Signed-off-by: Ed Wildgoose <lists@wildgooses.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 14:04:53 +03:00
Necip Fazil Yildiran
afdd1ebb72 platform/x86: fix kconfig dependency warning for FUJITSU_LAPTOP
When FUJITSU_LAPTOP is enabled and NEW_LEDS is disabled, it results in the
following Kbuild warning:

WARNING: unmet direct dependencies detected for LEDS_CLASS
  Depends on [n]: NEW_LEDS [=n]
  Selected by [y]:
  - FUJITSU_LAPTOP [=y] && X86 [=y] && X86_PLATFORM_DEVICES [=y] && ACPI [=y] && INPUT [=y] && BACKLIGHT_CLASS_DEVICE [=y] && (ACPI_VIDEO [=n] || ACPI_VIDEO [=n]=n)

The reason is that FUJITSU_LAPTOP selects LEDS_CLASS without depending on
or selecting NEW_LEDS while LEDS_CLASS is subordinate to NEW_LEDS.

Honor the kconfig menu hierarchy to remove kconfig dependency warnings.

Reported-by: Hans de Goede <hdegoede@redhat.com>
Fixes: d89bcc83e7 ("platform/x86: fujitsu-laptop: select LEDS_CLASS")
Signed-off-by: Necip Fazil Yildiran <fazilyildiran@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 14:04:52 +03:00
Necip Fazil Yildiran
8f0c01e666 platform/x86: fix kconfig dependency warning for LG_LAPTOP
When LG_LAPTOP is enabled and NEW_LEDS is disabled, it results in the
following Kbuild warning:

WARNING: unmet direct dependencies detected for LEDS_CLASS
  Depends on [n]: NEW_LEDS [=n]
  Selected by [y]:
  - LG_LAPTOP [=y] && X86 [=y] && X86_PLATFORM_DEVICES [=y] && ACPI [=y] && ACPI_WMI [=y] && INPUT [=y]

The reason is that LG_LAPTOP selects LEDS_CLASS without depending on or
selecting NEW_LEDS while LEDS_CLASS is subordinate to NEW_LEDS.

Honor the kconfig menu hierarchy to remove kconfig dependency warnings.

Fixes: dbf0c5a6b1 ("platform/x86: Add LG Gram laptop special features driver")
Signed-off-by: Necip Fazil Yildiran <fazilyildiran@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: mark gross <mgross@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 14:04:52 +03:00
Tom Rix
5f38b06db8 platform/x86: thinkpad_acpi: initialize tp_nvram_state variable
clang static analysis flags this representative problem
thinkpad_acpi.c:2523:7: warning: Branch condition evaluates
  to a garbage value
                if (!oldn->mute ||
                    ^~~~~~~~~~~

In hotkey_kthread() mute is conditionally set by hotkey_read_nvram()
but unconditionally checked by hotkey_compare_and_issue_event().
So the tp_nvram_state variable s[2] needs to be initialized.

Fixes: 01e88f2598 ("ACPI: thinkpad-acpi: add CMOS NVRAM polling for hot keys (v9)")
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: mark gross <mgross@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 14:04:52 +03:00
Hans de Goede
d823346876 platform/x86: intel-vbtn: Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360
Commit cfae58ed68 ("platform/x86: intel-vbtn: Only blacklist
SW_TABLET_MODE on the 9 / "Laptop" chasis-type") restored SW_TABLET_MODE
reporting on the HP stream x360 11 series on which it was previously broken
by commit de9647efea ("platform/x86: intel-vbtn: Only activate tablet
mode switch on 2-in-1's").

It turns out that enabling SW_TABLET_MODE reporting on devices with a
chassis-type of 10 ("Notebook") causes SW_TABLET_MODE to always report 1
at boot on the HP Pavilion 11 x360, which causes libinput to disable the
kbd and touchpad.

The HP Pavilion 11 x360's ACPI VGBS method sets bit 4 instead of bit 6 when
NOT in tablet mode at boot. Inspecting all the DSDTs in my DSDT collection
shows only one other model, the Medion E1239T ever setting bit 4 and it
always sets this together with bit 6.

So let's treat bit 4 as a second bit which, when set, indicates the device
is not in tablet mode, as we already do for bit 6.

While at it also prefix all VGBS constant defines with "VGBS_".

Fixes: cfae58ed68 ("platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 14:04:52 +03:00
Marius Iacob
1d2dd379bd platform/x86: asus-wmi: Add BATC battery name to the list of supported
The Intel Atom Cherry Trail platform reports a new battery
name (BATC). Tested on ASUS Transformer Mini T103HAF.

Signed-off-by: Marius Iacob <themariusus@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 13:04:20 +03:00
Hans de Goede
8a333dab28 platform/x86: asus-nb-wmi: Revert "Do not load on Asus T100TA and T200TA"
The WMI INIT method for some reason turns on the camera LED on these
2-in-1s, without the WMI interface allowing further control over the LED.

To fix this commit b5f7311d3a ("platform/x86: asus-nb-wmi: Do not load
on Asus T100TA and T200TA") added a blacklist with these 2 models on it
since the WMI driver did not add any extra functionality to these models.

Recently I've been working on making more 2-in-1 models report their
tablet-mode (SW_TABLET_MODE) to userspace; and I've found that these 2
Asus models report this through WMI. This commit reverts the adding
of the blacklist, so that the Asus WMI driver can be used on these
models to report their tablet-mode. I have another patch fixing the LED
issue in a different manner.

Note this is the second time we revert the adding of the
asus_nb_wmi_blacklist. It was reverted before in commit:

aab9e7896e ("platform/x86: asus-nb-wmi: Revert "Do not load on Asus
T100TA and T200TA")"

But somehow (accidental re-applying of the patch?) it got re-added
again in commit 3bd12da7f5 ("platform/x86: asus-nb-wmi: Do not load
on Asus T100TA and T200TA"), so now we need to revert it again.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 13:04:20 +03:00
Hans de Goede
efe813d0b0 platform/x86: touchscreen_dmi: Add info for the MPMAN Converter9 2-in-1
Add touchscreen info for the MPMAN Converter9 2-in-1. This device uses the
same case as the ITworks TW891, but it uses a different digitizer, so it
needs its own firmware.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 13:04:20 +03:00
Randy Dunlap
d41ec792ed Documentation: laptops: thinkpad-acpi: fix underline length build warning
Fix underline length build warning in thinkpad-acpi.rst documentation:

Documentation/admin-guide/laptops/thinkpad-acpi.rst:1437: WARNING: Title underline too short.
DYTC Lapmode sensor
------------------

Fixes: acf7f4a591 ("platform/x86: thinkpad_acpi: lap or desk mode interface")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Nitin Joshi <njoshi1@lenovo.com>
Cc: Sugumaran <slacshiminar@lenovo.com>
Cc: Bastien Nocera <bnocera@redhat.com>
Cc: Mark Pearson <markpearson@lenovo.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Henrique de Moraes Holschuh <ibm-acpi@hmh.eng.br>
Cc: ibm-acpi-devel@lists.sourceforge.net
Cc: platform-driver-x86@vger.kernel.org
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 12:41:03 +03:00
Dinghao Liu
4fd9ac6bd3 Platform: OLPC: Fix memleak in olpc_ec_probe
When devm_regulator_register() fails, ec should be
freed just like when olpc_ec_cmd() fails.

Fixes: 231c0c2161 ("Platform: OLPC: Add a regulator for the DCON")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-09-24 12:40:30 +03:00
Voon Weifeng
7241c5a697 net: stmmac: removed enabling eee in EEE set callback
EEE should only be enabled during stmmac_mac_link_up(), when the
link is up and being set up properly. set_eee should only do settings
configuration and disable EEE.

Without this fix, turning on EEE using ethtool will return
"Operation not supported". This is because the driver is in a dead loop
waiting for EEE to be advertised in order for EEE to be activated, but the
driver will only configure the EEE advertisement after EEE is
activated.

Ethtool should only return "Operation not supported" if there is no EEE
capability in the MAC controller.

Fixes: 8a7493e58a ("net: stmmac: Fix a race in EEE enable callback")
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Acked-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23 18:08:06 -07:00
Hauke Mehrtens
f9317ae552 net: lantiq: Add locking for TX DMA channel
The TX DMA channel data is accessed by the xrx200_start_xmit() and the
xrx200_tx_housekeeping() function from different threads. Make sure the
accesses are synchronized by acquiring the netif_tx_lock() in the
xrx200_tx_housekeeping() function too. This lock is acquired by the
kernel before calling xrx200_start_xmit().
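
A minimal sketch of the synchronization described above (function body
trimmed):

    static void my_tx_housekeeping(struct net_device *dev)
    {
            netif_tx_lock(dev);     /* serialize with xrx200_start_xmit() */
            /* reclaim completed TX descriptors here */
            netif_tx_unlock(dev);
    }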

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23 18:01:03 -07:00
Tian Tao
ea6754aef2 net: switchdev: Fixed kerneldoc warning
Update kernel-doc line comments to fix warnings reported by make W=1.
net/switchdev/switchdev.c:413: warning: Function parameter or
member 'extack' not described in 'call_switchdev_notifiers'

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Acked-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23 17:46:31 -07:00
Geert Uytterhoeven
77972b55fb Revert "ravb: Fixed to be able to unload modules"
This reverts commit 1838d6c62f.

This commit moved the ravb_mdio_init() call (and thus the
of_mdiobus_register() call) from the ravb_probe() to the ravb_open()
call.  This causes a regression during system resume (s2idle/s2ram), as
new PHY devices cannot be bound while suspended.

During boot, the Micrel PHY is detected like this:

    Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=228)
    ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off

During system suspend, (A) defer_all_probes is set to true, and (B)
usermodehelper_disabled is set to UMH_DISABLED, to avoid drivers being
probed while suspended.

  A. If CONFIG_MODULES=n, phy_device_register() calling device_add()
     merely adds the device, but does not probe it yet, as
     really_probe() returns early due to defer_all_probes being set:

       dpm_resume+0x128/0x4f8
	 device_resume+0xcc/0x1b0
	   dpm_run_callback+0x74/0x340
	     ravb_resume+0x190/0x1b8
	       ravb_open+0x84/0x770
		 of_mdiobus_register+0x1e0/0x468
		   of_mdiobus_register_phy+0x1b8/0x250
		     of_mdiobus_phy_device_register+0x178/0x1e8
		       phy_device_register+0x114/0x1b8
			 device_add+0x3d4/0x798
			   bus_probe_device+0x98/0xa0
			     device_initial_probe+0x10/0x18
			       __device_attach+0xe4/0x140
				 bus_for_each_drv+0x64/0xc8
				   __device_attach_driver+0xb8/0xe0
				     driver_probe_device.part.11+0xc4/0xd8
				       really_probe+0x32c/0x3b8

     Later, phy_attach_direct() notices no PHY driver has been bound,
     and falls back to the Generic PHY, leading to degraded operation:

       Generic PHY e6800000.ethernet-ffffffff:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=POLL)
       ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off

  B. If CONFIG_MODULES=y, request_module() returns early with -EBUSY due
     to UMH_DISABLED, and MDIO initialization fails completely:

       mdio_bus e6800000.ethernet-ffffffff:00: error -16 loading PHY driver module for ID 0x00221622
       ravb e6800000.ethernet eth0: failed to initialize MDIO
       PM: dpm_run_callback(): ravb_resume+0x0/0x1b8 returns -16
       PM: Device e6800000.ethernet failed to resume: error -16

     Ignoring -EBUSY in phy_request_driver_module(), like was done for
     -ENOENT in commit 21e194425a ("net: phy: fix issue with loading
     PHY driver w/o initramfs"), would make it fall back to the Generic
     PHY, like in the CONFIG_MODULES=n case.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23 17:38:36 -07:00
Mat Martineau
ef59b1953c mptcp: Wake up MPTCP worker when DATA_FIN found on a TCP FIN packet
When receiving a DATA_FIN MPTCP option on a TCP FIN packet, the DATA_FIN
information would be stored but the MPTCP worker did not get
scheduled. In turn, the MPTCP socket state would remain in
TCP_ESTABLISHED and no blocked operations would be awakened.

TCP FIN packets are seen by the MPTCP socket when moving skbs out of the
subflow receive queues, so schedule the MPTCP worker when a skb with
DATA_FIN but no data payload is moved from a subflow queue. Other cases
(DATA_FIN on a bare TCP ACK or on a packet with data payload) are
already handled.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/84
Fixes: 43b54c6ee3 ("mptcp: Use full MPTCP-level disconnect state machine")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23 17:30:52 -07:00
Tony Ambardar
1245008122 libbpf: Fix native endian assumption when parsing BTF
Code in btf__parse_raw() fails to detect raw BTF of non-native endianness
and assumes it must be ELF data, which then fails to parse as ELF and
yields a misleading error message:

  root:/# bpftool btf dump file /sys/kernel/btf/vmlinux
  libbpf: failed to get EHDR from /sys/kernel/btf/vmlinux

For example, this could occur after cross-compiling a BTF-enabled kernel
for a target with non-native endianness, which is currently unsupported.

Check for correct endianness and emit a clearer error message:

  root:/# bpftool btf dump file /sys/kernel/btf/vmlinux
  libbpf: non-native BTF endianness is not supported
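
A minimal sketch of the kind of check involved (BTF_MAGIC is the 16-bit
raw-BTF magic 0xeB9F; the function name is illustrative, not libbpf's):

  #include <stdint.h>
  #include <string.h>
  #include <errno.h>
  #include <byteswap.h>

  #define BTF_MAGIC 0xeB9F

  static int btf_raw_endianness_check(const void *data)
  {
          uint16_t magic;

          memcpy(&magic, data, sizeof(magic));
          if (magic == BTF_MAGIC)
                  return 0;               /* native-endian raw BTF */
          if (bswap_16(magic) == BTF_MAGIC)
                  return -ENOTSUP;        /* raw BTF, non-native endianness */
          return -EINVAL;                 /* not raw BTF: fall back to ELF */
  }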

Fixes: 94a1fedd63 ("libbpf: Add btf__parse_raw() and generic btf__parse() APIs")
Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/90f81508ecc57bc0da318e0fe0f45cfe49b17ea7.1600417359.git.Tony.Ambardar@gmail.com
2020-09-21 21:50:49 +02:00
Tony Ambardar
65c2043989 bpf: Prevent .BTF section elimination
Systems with memory or disk constraints often reduce the kernel footprint
by configuring LD_DEAD_CODE_DATA_ELIMINATION. However, this can result in
removal of any BTF information.

Use the KEEP() macro to preserve the BTF data as done with other important
sections, while still allowing for smaller kernels.
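
In the linker script this amounts to wrapping the section contents in
KEEP(), roughly (simplified; follows the vmlinux linker script
conventions rather than quoting them):

  .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) {
          __start_BTF = .;
          KEEP(*(.BTF))
          __stop_BTF = .;
  }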

Fixes: 90ceddcb49 ("bpf: Support llvm-objcopy for vmlinux BTF")
Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/a635b5d3e2da044e7b51ec1315e8910fbce0083f.1600417359.git.Tony.Ambardar@gmail.com
2020-09-21 21:50:44 +02:00
Tony Ambardar
e23bb04b0c bpf: Fix sysfs export of empty BTF section
If BTF data is missing or removed from the ELF section it is still exported
via sysfs as a zero-length file:

  root@OpenWrt:/# ls -l /sys/kernel/btf/vmlinux
  -r--r--r--    1 root    root    0 Jul 18 02:59 /sys/kernel/btf/vmlinux

Moreover, reads from this file succeed and leak kernel data:

  root@OpenWrt:/# hexdump -C /sys/kernel/btf/vmlinux|head -10
  000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
  *
  000cc0 00 00 00 00 00 00 00 00 00 00 00 00 80 83 b0 80 |................|
  000cd0 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
  000ce0 00 00 00 00 00 00 00 00 00 00 00 00 57 ac 6e 9d |............W.n.|
  000cf0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
  *
  002650 00 00 00 00 00 00 00 10 00 00 00 01 00 00 00 01 |................|
  002660 80 82 9a c4 80 85 97 80 81 a9 51 68 00 00 00 02 |..........Qh....|
  002670 80 25 44 dc 80 85 97 80 81 a9 50 24 81 ab c4 60 |.%D.......P$...`|

This situation was first observed with kernel 5.4.x, cross-compiled for a
MIPS target system. Fix by adding a sanity-check for export of zero-length
data sections.
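
A hedged sketch of such a check at initialization time (the symbol and
attribute names are assumptions for illustration):

  static int __init btf_vmlinux_init(void)
  {
          /* __start_BTF/__stop_BTF delimit the .BTF section contents */
          if (__start_BTF == __stop_BTF)
                  return 0;       /* nothing to export: skip the sysfs file */

          bin_attr_btf_vmlinux.size = __stop_BTF - __start_BTF;
          return sysfs_create_bin_file(btf_kobj, &bin_attr_btf_vmlinux);
  }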

Fixes: 341dfcf8d7 ("btf: expose BTF info through sysfs")
Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/b38db205a66238f70823039a8c531535864eaac5.1600417359.git.Tony.Ambardar@gmail.com
2020-09-21 21:50:24 +02:00
Tony Ambardar
ba2fd563b7 tools/bpftool: Support passing BPFTOOL_VERSION to make
This change facilitates out-of-tree builds, packaging, and versioning for
test and debug purposes. Defining BPFTOOL_VERSION allows self-contained
builds within the tools tree, since it avoids use of the 'kernelversion'
target in the top-level makefile, which would otherwise pull in several
other includes from outside the tools tree.
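
For example, an out-of-tree or packaged build can now pin the reported
version explicitly (the version string below is arbitrary):

  make -C tools/bpf/bpftool BPFTOOL_VERSION=5.9.0-custom
  ./tools/bpf/bpftool/bpftool version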

Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200917115833.1235518-1-Tony.Ambardar@gmail.com
2020-09-19 01:06:05 +02:00
Magnus Karlsson
642e450b6b xsk: Do not discard packet when NETDEV_TX_BUSY
In the skb Tx path, transmission of a packet is performed with
dev_direct_xmit(). When a driver returns NETDEV_TX_BUSY, it signifies
that it was not possible to send the packet right now; try again
later. Unfortunately, the xsk transmit code discarded the packet and
returned EBUSY to the application. Fix this unnecessary packet loss by
not discarding the packet in the Tx ring and returning EAGAIN instead.
As EAGAIN is returned to the application, it can retry the send
operation later, by which point the driver will likely have the
space/resources to send the packet.

In summary, EAGAIN tells the application that the packet was not
discarded from the Tx ring and that it needs to call send()
again. EBUSY, on the other hand, signifies that the packet was not
sent and was discarded from the Tx ring. The application needs to put
the packet on the Tx ring again if it wants it to be sent.
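
A hedged user-space sketch of how the two errors differ for an AF_XDP
application (xsk_fd and requeue_tx_descriptor() are assumed names, not
part of any library API):

  #include <sys/socket.h>
  #include <errno.h>

  void requeue_tx_descriptor(int xsk_fd);   /* hypothetical helper */

  static void kick_tx(int xsk_fd)
  {
          for (;;) {
                  if (sendto(xsk_fd, NULL, 0, MSG_DONTWAIT, NULL, 0) == 0)
                          return;           /* kernel accepted the Tx ring */
                  if (errno == EAGAIN)
                          continue;         /* packet still on the ring: retry */
                  if (errno == EBUSY) {
                          /* old behaviour: packet dropped from the ring,
                           * so the descriptor must be re-queued first */
                          requeue_tx_descriptor(xsk_fd);
                          continue;
                  }
                  return;                   /* some other error */
          }
  }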

Fixes: 35fcde7f8d ("xsk: support for Tx")
Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/bpf/1600257625-2353-1-git-send-email-magnus.karlsson@gmail.com
2020-09-16 23:36:58 +02:00
Antony Antony
8366685b28 xfrm: clone whole liftime_cur structure in xfrm_do_migrate
When we clone a state, only add_time was cloned; it missed values like
bytes and packets. Now clone all members of the structure.

v1->v3:
 - use memcpy to copy the entire structure
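
In xfrm_state_clone() the copy then looks roughly like this (a sketch;
x is the new state, orig the one being cloned):

  /* copy the whole lifetime accounting, not only add_time */
  memcpy(&x->curlft, &orig->curlft, sizeof(x->curlft));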

Fixes: 80c9abaabf ("[XFRM]: Extension for dynamic update of endpoint address(es)")
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-09-07 12:45:22 +02:00
Antony Antony
7aa05d3047 xfrm: clone XFRMA_SEC_CTX in xfrm_do_migrate
XFRMA_SEC_CTX was not cloned from the old state to the new one.
Migrate this attribute during XFRM_MSG_MIGRATE.

v1->v2:
 - return -ENOMEM on error
v2->v3:
 - fix return type to int

Fixes: 80c9abaabf ("[XFRM]: Extension for dynamic update of endpoint address(es)")
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-09-07 12:45:22 +02:00
Antony Antony
91a46c6d1b xfrm: clone XFRMA_REPLAY_ESN_VAL in xfrm_do_migrate
XFRMA_REPLAY_ESN_VAL was not cloned completely from the old state to
the new one. Migrate this attribute during XFRM_MSG_MIGRATE.

v1->v2:
 - move curleft cloning to a separate patch

Fixes: af2f464e32 ("xfrm: Assign esn pointers when cloning a state")
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-09-07 12:45:22 +02:00
Antony Antony
545e5c5716 xfrm: clone XFRMA_SET_MARK in xfrm_do_migrate
XFRMA_SET_MARK and XFRMA_SET_MARK_MASK were not cloned from the old
state to the new one. Migrate these two attributes during XFRM_MSG_MIGRATE.

Fixes: 9b42c1f179 ("xfrm: Extend the output_mark to support input direction and masking.")
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-09-07 12:45:22 +02:00
Sabrina Dubroca
45a36a18d0 xfrmi: drop ignore_df check before updating pmtu
xfrm interfaces currently test for !skb->ignore_df when deciding
whether to update the pmtu on the skb's dst. Because of this, no pmtu
exception is created when we do something like:

    ping -s 1438 <dest>

By dropping this check, the pmtu exception will be created and the
next ping attempt will work.

Fixes: f203b76d78 ("xfrm: Add virtual xfrm interfaces")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-08-27 08:37:50 +02:00
Sabrina Dubroca
4eb2e13415 espintcp: restore IP CB before handing the packet to xfrm
Xiumei reported a bug with espintcp over IPv6 in transport mode,
because xfrm6_transport_finish expects to find IP6CB data (struct
inet6_skb_cb). Currently, espintcp zeroes the CB, but the relevant
part is actually preserved by previous layers (first set up by tcp,
then strparser only zeroes a small part of tcp_skb_tb), so we can just
relocate it to the start of skb->cb.

Fixes: e27cca96cd ("xfrm: add espintcp (RFC 8229)")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-08-17 15:58:04 +02:00
YueHaibing
61ee4137b5 ip_vti: Fix unused variable warning
If CONFIG_INET_XFRM_TUNNEL is set but CONFIG_IPV6 is n,

net/ipv4/ip_vti.c:493:27: warning: 'vti_ipip6_handler' defined but not used [-Wunused-variable]

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-08-08 08:17:22 +02:00
247 changed files with 3620 additions and 1366 deletions

View File

@@ -41,7 +41,8 @@ Andrew Murray <amurray@thegoodpenguin.co.uk> <andrew.murray@arm.com>
Andrew Vasquez <andrew.vasquez@qlogic.com>
Andrey Ryabinin <ryabinin.a.a@gmail.com> <a.ryabinin@samsung.com>
Andy Adamson <andros@citi.umich.edu>
Antoine Tenart <antoine.tenart@free-electrons.com>
Antoine Tenart <atenart@kernel.org> <antoine.tenart@bootlin.com>
Antoine Tenart <atenart@kernel.org> <antoine.tenart@free-electrons.com>
Antonio Ospite <ao2@ao2.it> <ao2@amarulasolutions.com>
Archit Taneja <archit@ti.com>
Ard Biesheuvel <ardb@kernel.org> <ard.biesheuvel@linaro.org>
@@ -188,6 +189,7 @@ Leon Romanovsky <leon@kernel.org> <leonro@nvidia.com>
Linas Vepstas <linas@austin.ibm.com>
Linus Lüssing <linus.luessing@c0d3.blue> <linus.luessing@ascom.ch>
Linus Lüssing <linus.luessing@c0d3.blue> <linus.luessing@web.de>
<linux-hardening@vger.kernel.org> <kernel-hardening@lists.openwall.com>
Li Yang <leoyang.li@nxp.com> <leoli@freescale.com>
Li Yang <leoyang.li@nxp.com> <leo@zh-kernel.org>
Lukasz Luba <lukasz.luba@arm.com> <l.luba@partner.samsung.com>

View File

@@ -21,6 +21,7 @@ Required properties:
- "renesas,etheravb-r8a774a1" for the R8A774A1 SoC.
- "renesas,etheravb-r8a774b1" for the R8A774B1 SoC.
- "renesas,etheravb-r8a774c0" for the R8A774C0 SoC.
- "renesas,etheravb-r8a774e1" for the R8A774E1 SoC.
- "renesas,etheravb-r8a7795" for the R8A7795 SoC.
- "renesas,etheravb-r8a7796" for the R8A77960 SoC.
- "renesas,etheravb-r8a77961" for the R8A77961 SoC.

View File

@@ -1342,8 +1342,8 @@ follow::
In addition to read/modify/write the setup header of the struct
boot_params as that of 16-bit boot protocol, the boot loader should
also fill the additional fields of the struct boot_params as that
described in zero-page.txt.
also fill the additional fields of the struct boot_params as
described in chapter :doc:`zero-page`.
After setting up the struct boot_params, the boot loader can load the
32/64-bit kernel in the same way as that of 16-bit boot protocol.
@@ -1379,7 +1379,7 @@ can be calculated as follows::
In addition to read/modify/write the setup header of the struct
boot_params as that of 16-bit boot protocol, the boot loader should
also fill the additional fields of the struct boot_params as described
in zero-page.txt.
in chapter :doc:`zero-page`.
After setting up the struct boot_params, the boot loader can load
64-bit kernel in the same way as that of 16-bit boot protocol, but

View File

@@ -1460,6 +1460,11 @@ S: Odd Fixes
F: drivers/amba/
F: include/linux/amba/bus.h
ARM PRIMECELL CLCD PL110 DRIVER
M: Russell King <linux@armlinux.org.uk>
S: Odd Fixes
F: drivers/video/fbdev/amba-clcd.*
ARM PRIMECELL KMI PL050 DRIVER
M: Russell King <linux@armlinux.org.uk>
S: Odd Fixes
@@ -1623,7 +1628,7 @@ N: meson
ARM/Annapurna Labs ALPINE ARCHITECTURE
M: Tsahee Zidenberg <tsahee@annapurnalabs.com>
M: Antoine Tenart <antoine.tenart@bootlin.com>
M: Antoine Tenart <atenart@kernel.org>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
F: arch/arm/boot/dts/alpine*
@@ -7235,7 +7240,7 @@ F: drivers/staging/gasket/
GCC PLUGINS
M: Kees Cook <keescook@chromium.org>
R: Emese Revfy <re.emese@gmail.com>
L: kernel-hardening@lists.openwall.com
L: linux-hardening@vger.kernel.org
S: Maintained
F: Documentation/kbuild/gcc-plugins.rst
F: scripts/Makefile.gcc-plugins
@@ -8673,7 +8678,7 @@ F: drivers/input/input-mt.c
K: \b(ABS|SYN)_MT_
INSIDE SECURE CRYPTO DRIVER
M: Antoine Tenart <antoine.tenart@bootlin.com>
M: Antoine Tenart <atenart@kernel.org>
L: linux-crypto@vger.kernel.org
S: Maintained
F: drivers/crypto/inside-secure/
@@ -8752,7 +8757,8 @@ F: include/drm/i915*
F: include/uapi/drm/i915_drm.h
INTEL ETHERNET DRIVERS
M: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
M: Jesse Brandeburg <jesse.brandeburg@intel.com>
M: Tony Nguyen <anthony.l.nguyen@intel.com>
L: intel-wired-lan@lists.osuosl.org (moderated for non-subscribers)
S: Supported
W: http://www.intel.com/support/feedback.htm
@@ -9796,7 +9802,7 @@ F: drivers/scsi/53c700*
LEAKING_ADDRESSES
M: Tobin C. Harding <me@tobin.cc>
M: Tycho Andersen <tycho@tycho.pizza>
L: kernel-hardening@lists.openwall.com
L: linux-hardening@vger.kernel.org
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tobin/leaks.git
F: scripts/leaking_addresses.pl
@@ -12077,6 +12083,7 @@ NETWORKING [DSA]
M: Andrew Lunn <andrew@lunn.ch>
M: Vivien Didelot <vivien.didelot@gmail.com>
M: Florian Fainelli <f.fainelli@gmail.com>
M: Vladimir Oltean <olteanv@gmail.com>
S: Maintained
F: Documentation/devicetree/bindings/net/dsa/
F: drivers/net/dsa/
@@ -16725,6 +16732,13 @@ S: Maintained
F: Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml
F: drivers/gpio/gpio-dwapb.c
SYNOPSYS DESIGNWARE APB SSI DRIVER
M: Serge Semin <fancer.lancer@gmail.com>
L: linux-spi@vger.kernel.org
S: Supported
F: Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml
F: drivers/spi/spi-dw*
SYNOPSYS DESIGNWARE AXI DMAC DRIVER
M: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
S: Maintained
@@ -18282,7 +18296,8 @@ F: drivers/gpu/vga/vga_switcheroo.c
F: include/linux/vga_switcheroo.h
VIA RHINE NETWORK DRIVER
S: Orphan
S: Maintained
M: Kevin Brace <kevinbrace@bracecomputerlab.com>
F: drivers/net/ethernet/via/via-rhine.c
VIA SD/MMC CARD CONTROLLER DRIVER
@@ -18887,10 +18902,10 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/mm
F: arch/x86/mm/
X86 PLATFORM DRIVERS
M: Darren Hart <dvhart@infradead.org>
M: Andy Shevchenko <andy@infradead.org>
M: Hans de Goede <hdegoede@redhat.com>
M: Mark Gross <mgross@linux.intel.com>
L: platform-driver-x86@vger.kernel.org
S: Odd Fixes
S: Maintained
T: git git://git.infradead.org/linux-platform-drivers-x86.git
F: drivers/platform/olpc/
F: drivers/platform/x86/

View File

@@ -2,7 +2,7 @@
VERSION = 5
PATCHLEVEL = 9
SUBLEVEL = 0
EXTRAVERSION = -rc8
EXTRAVERSION =
NAME = Kleptomaniac Octopus
# *DOCUMENTATION*

View File

@@ -150,7 +150,7 @@ static int xen_starting_cpu(unsigned int cpu)
pr_info("Xen: initializing cpu%d\n", cpu);
vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
info.mfn = virt_to_gfn(vcpup);
info.mfn = percpu_to_gfn(vcpup);
info.offset = xen_offset_in_page(vcpup);
err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, xen_vcpu_nr(cpu),

View File

@@ -788,7 +788,7 @@ SYM_FUNC_START_LOCAL(__xts_crypt8)
0: mov bskey, x21
mov rounds, x22
br x7
br x16
SYM_FUNC_END(__xts_crypt8)
.macro __xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7
@@ -806,7 +806,7 @@ SYM_FUNC_END(__xts_crypt8)
uzp1 v30.4s, v30.4s, v25.4s
ld1 {v25.16b}, [x24]
99: adr x7, \do8
99: adr x16, \do8
bl __xts_crypt8
ldp q16, q17, [sp, #.Lframe_local_offset]

View File

@@ -475,7 +475,6 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
case BPF_JMP | BPF_JSET | BPF_K:
case BPF_JMP | BPF_JSET | BPF_X:
true_cond = COND_NE;
fallthrough;
cond_branch:
/* same targets, can avoid doing the test :) */
if (filter[i].jt == filter[i].jf) {

View File

@@ -22,13 +22,11 @@ SECTIONS
/* Beginning of code and text segment */
. = LOAD_OFFSET;
_start = .;
_stext = .;
HEAD_TEXT_SECTION
. = ALIGN(PAGE_SIZE);
__init_begin = .;
INIT_TEXT_SECTION(PAGE_SIZE)
INIT_DATA_SECTION(16)
. = ALIGN(8);
__soc_early_init_table : {
__soc_early_init_table_start = .;
@@ -55,6 +53,7 @@ SECTIONS
. = ALIGN(SECTION_ALIGN);
.text : {
_text = .;
_stext = .;
TEXT_TEXT
SCHED_TEXT
CPUIDLE_TEXT
@@ -67,6 +66,8 @@ SECTIONS
_etext = .;
}
INIT_DATA_SECTION(16)
/* Start of data section */
_sdata = .;
RO_DATA(SECTION_ALIGN)

View File

@@ -515,6 +515,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
#else
dtb_early_va = (void *)dtb_pa;
#endif
dtb_early_pa = dtb_pa;
}
static inline void setup_vm_final(void)

View File

@@ -1904,6 +1904,8 @@ void (*machine_check_vector)(struct pt_regs *) = unexpected_machine_check;
static __always_inline void exc_machine_check_kernel(struct pt_regs *regs)
{
bool irq_state;
WARN_ON_ONCE(user_mode(regs));
/*
@@ -1914,7 +1916,7 @@ static __always_inline void exc_machine_check_kernel(struct pt_regs *regs)
mce_check_crashing_cpu())
return;
nmi_enter();
irq_state = idtentry_enter_nmi(regs);
/*
* The call targets are marked noinstr, but objtool can't figure
* that out because it's an indirect call. Annotate it.
@@ -1925,7 +1927,7 @@ static __always_inline void exc_machine_check_kernel(struct pt_regs *regs)
if (regs->flags & X86_EFLAGS_IF)
trace_hardirqs_on_prepare();
instrumentation_end();
nmi_exit();
idtentry_exit_nmi(regs, irq_state);
}
static __always_inline void exc_machine_check_user(struct pt_regs *regs)

View File

@@ -305,8 +305,6 @@ int ibm_partition(struct parsed_partitions *state)
if (!disk->fops->getgeo)
goto out_exit;
fn = symbol_get(dasd_biodasdinfo);
if (!fn)
goto out_exit;
blocksize = bdev_logical_block_size(bdev);
if (blocksize <= 0)
goto out_symbol;
@@ -326,7 +324,7 @@ int ibm_partition(struct parsed_partitions *state)
geo->start = get_start_sect(bdev);
if (disk->fops->getgeo(bdev, geo))
goto out_freeall;
if (fn(disk, info)) {
if (!fn || fn(disk, info)) {
kfree(info);
info = NULL;
}
@@ -370,7 +368,8 @@ out_nolab:
out_nogeo:
kfree(info);
out_symbol:
symbol_put(dasd_biodasdinfo);
if (fn)
symbol_put(dasd_biodasdinfo);
out_exit:
return res;
}

View File

@@ -651,6 +651,7 @@ struct compat_cdrom_generic_command {
compat_int_t stat;
compat_caddr_t sense;
unsigned char data_direction;
unsigned char pad[3];
compat_int_t quiet;
compat_int_t timeout;
compat_caddr_t reserved[1];

View File

@@ -1553,7 +1553,7 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa
* put_page(); and would cause either a VM_BUG directly, or
* __page_cache_release a page that would actually still be referenced
* by someone, leading to some obscure delayed Oops somewhere else. */
if (drbd_disable_sendpage || (page_count(page) < 1) || PageSlab(page))
if (drbd_disable_sendpage || !sendpage_ok(page))
return _drbd_no_send_page(peer_device, page, offset, size, msg_flags);
msg_flags |= MSG_NOSIGNAL;

View File

@@ -824,8 +824,21 @@ static irqreturn_t pca953x_irq_handler(int irq, void *devid)
ret = pca953x_irq_pending(chip, pending);
mutex_unlock(&chip->i2c_lock);
for_each_set_bit(level, pending, gc->ngpio)
handle_nested_irq(irq_find_mapping(gc->irq.domain, level));
if (ret) {
ret = 0;
for_each_set_bit(level, pending, gc->ngpio) {
int nested_irq = irq_find_mapping(gc->irq.domain, level);
if (unlikely(nested_irq <= 0)) {
dev_warn_ratelimited(gc->parent, "unmapped interrupt %d\n", level);
continue;
}
handle_nested_irq(nested_irq);
ret = 1;
}
}
return IRQ_RETVAL(ret);
}

View File

@@ -425,7 +425,7 @@ static __poll_t lineevent_poll(struct file *file,
static ssize_t lineevent_get_size(void)
{
#ifdef __x86_64__
#if defined(CONFIG_X86_64) && !defined(CONFIG_UML)
/* i386 has no padding after 'id' */
if (in_ia32_syscall()) {
struct compat_gpioeevent_data {

View File

@@ -694,12 +694,12 @@ static void soc15_reg_base_init(struct amdgpu_device *adev)
* it doesn't support SRIOV. */
if (amdgpu_discovery) {
r = amdgpu_discovery_reg_base_init(adev);
if (r) {
DRM_WARN("failed to init reg base from ip discovery table, "
"fallback to legacy init method\n");
vega10_reg_base_init(adev);
}
if (r == 0)
break;
DRM_WARN("failed to init reg base from ip discovery table, "
"fallback to legacy init method\n");
}
vega10_reg_base_init(adev);
break;
case CHIP_VEGA20:
vega20_reg_base_init(adev);

View File

@@ -1409,7 +1409,7 @@ static int dm_late_init(void *handle)
if (dmcu)
ret = dmcu_load_iram(dmcu, params);
else if (adev->dm.dc->ctx->dmub_srv)
ret = dmub_init_abm_config(adev->dm.dc->res_pool->abm, params);
ret = dmub_init_abm_config(adev->dm.dc->res_pool, params);
if (!ret)
return -EINVAL;

View File

@@ -657,7 +657,7 @@ void fill_iram_v_2_3(struct iram_table_v_2_2 *ram_table, struct dmcu_iram_parame
params, ram_table, big_endian);
}
bool dmub_init_abm_config(struct abm *abm,
bool dmub_init_abm_config(struct resource_pool *res_pool,
struct dmcu_iram_parameters params)
{
struct iram_table_v_2_2 ram_table;
@@ -665,8 +665,13 @@ bool dmub_init_abm_config(struct abm *abm,
bool result = false;
uint32_t i, j = 0;
if (abm == NULL)
#if defined(CONFIG_DRM_AMD_DC_DCN3_0)
if (res_pool->abm == NULL && res_pool->multiple_abms[0] == NULL)
return false;
#else
if (res_pool->abm == NULL)
return false;
#endif
memset(&ram_table, 0, sizeof(ram_table));
memset(&config, 0, sizeof(config));
@@ -707,8 +712,14 @@ bool dmub_init_abm_config(struct abm *abm,
config.min_abm_backlight = ram_table.min_abm_backlight;
result = abm->funcs->init_abm_config(
abm, (char *)(&config), sizeof(struct abm_config_table));
#if defined(CONFIG_DRM_AMD_DC_DCN3_0)
if (res_pool->multiple_abms[0]) {
result = res_pool->multiple_abms[0]->funcs->init_abm_config(
res_pool->multiple_abms[0], (char *)(&config), sizeof(struct abm_config_table));
} else
#endif
result = res_pool->abm->funcs->init_abm_config(
res_pool->abm, (char *)(&config), sizeof(struct abm_config_table));
return result;
}

View File

@@ -28,6 +28,8 @@
#include "dc/inc/hw/dmcu.h"
#include "dc/inc/hw/abm.h"
struct resource_pool;
enum abm_defines {
abm_defines_max_level = 4,
@@ -45,7 +47,7 @@ struct dmcu_iram_parameters {
bool dmcu_load_iram(struct dmcu *dmcu,
struct dmcu_iram_parameters params);
bool dmub_init_abm_config(struct abm *abm,
bool dmub_init_abm_config(struct resource_pool *res_pool,
struct dmcu_iram_parameters params);
#endif /* MODULES_POWER_POWER_HELPERS_H_ */

View File

@@ -2265,8 +2265,6 @@ static void navi10_fill_i2c_req(SwI2cRequest_t *req, bool write,
{
int i;
BUG_ON(numbytes > MAX_SW_I2C_COMMANDS);
req->I2CcontrollerPort = 0;
req->I2CSpeed = 2;
req->SlaveAddress = address;
@@ -2304,6 +2302,12 @@ static int navi10_i2c_read_data(struct i2c_adapter *control,
struct smu_table_context *smu_table = &adev->smu.smu_table;
struct smu_table *table = &smu_table->driver_table;
if (numbytes > MAX_SW_I2C_COMMANDS) {
dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n",
numbytes, MAX_SW_I2C_COMMANDS);
return -EINVAL;
}
memset(&req, 0, sizeof(req));
navi10_fill_i2c_req(&req, false, address, numbytes, data);
@@ -2340,6 +2344,12 @@ static int navi10_i2c_write_data(struct i2c_adapter *control,
SwI2cRequest_t req;
struct amdgpu_device *adev = to_amdgpu_device(control);
if (numbytes > MAX_SW_I2C_COMMANDS) {
dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n",
numbytes, MAX_SW_I2C_COMMANDS);
return -EINVAL;
}
memset(&req, 0, sizeof(req));
navi10_fill_i2c_req(&req, true, address, numbytes, data);

View File

@@ -2445,8 +2445,6 @@ static void sienna_cichlid_fill_i2c_req(SwI2cRequest_t *req, bool write,
{
int i;
BUG_ON(numbytes > MAX_SW_I2C_COMMANDS);
req->I2CcontrollerPort = 0;
req->I2CSpeed = 2;
req->SlaveAddress = address;
@@ -2484,6 +2482,12 @@ static int sienna_cichlid_i2c_read_data(struct i2c_adapter *control,
struct smu_table_context *smu_table = &adev->smu.smu_table;
struct smu_table *table = &smu_table->driver_table;
if (numbytes > MAX_SW_I2C_COMMANDS) {
dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n",
numbytes, MAX_SW_I2C_COMMANDS);
return -EINVAL;
}
memset(&req, 0, sizeof(req));
sienna_cichlid_fill_i2c_req(&req, false, address, numbytes, data);
@@ -2520,6 +2524,12 @@ static int sienna_cichlid_i2c_write_data(struct i2c_adapter *control,
SwI2cRequest_t req;
struct amdgpu_device *adev = to_amdgpu_device(control);
if (numbytes > MAX_SW_I2C_COMMANDS) {
dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n",
numbytes, MAX_SW_I2C_COMMANDS);
return -EINVAL;
}
memset(&req, 0, sizeof(req));
sienna_cichlid_fill_i2c_req(&req, true, address, numbytes, data);

View File

@@ -176,6 +176,8 @@ void
nouveau_mem_del(struct ttm_mem_reg *reg)
{
struct nouveau_mem *mem = nouveau_mem(reg);
if (!mem)
return;
nouveau_mem_fini(mem);
kfree(reg->mm_node);
reg->mm_node = NULL;

View File

@@ -3149,6 +3149,7 @@ nvkm_device_ctor(const struct nvkm_device_func *func,
case 0x168: device->chip = &nv168_chipset; break;
default:
nvdev_error(device, "unknown chipset (%08x)\n", boot0);
ret = -ENODEV;
goto done;
}

View File

@@ -5,6 +5,7 @@
* Copyright (C) 2014 Beniamino Galvani <b.galvani@gmail.com>
*/
#include <linux/bitfield.h>
#include <linux/clk.h>
#include <linux/completion.h>
#include <linux/i2c.h>
@@ -33,12 +34,17 @@
#define REG_CTRL_ACK_IGNORE BIT(1)
#define REG_CTRL_STATUS BIT(2)
#define REG_CTRL_ERROR BIT(3)
#define REG_CTRL_CLKDIV_SHIFT 12
#define REG_CTRL_CLKDIV_MASK GENMASK(21, 12)
#define REG_CTRL_CLKDIVEXT_SHIFT 28
#define REG_CTRL_CLKDIVEXT_MASK GENMASK(29, 28)
#define REG_CTRL_CLKDIV GENMASK(21, 12)
#define REG_CTRL_CLKDIVEXT GENMASK(29, 28)
#define REG_SLV_ADDR GENMASK(7, 0)
#define REG_SLV_SDA_FILTER GENMASK(10, 8)
#define REG_SLV_SCL_FILTER GENMASK(13, 11)
#define REG_SLV_SCL_LOW GENMASK(27, 16)
#define REG_SLV_SCL_LOW_EN BIT(28)
#define I2C_TIMEOUT_MS 500
#define FILTER_DELAY 15
enum {
TOKEN_END = 0,
@@ -133,19 +139,24 @@ static void meson_i2c_set_clk_div(struct meson_i2c *i2c, unsigned int freq)
unsigned long clk_rate = clk_get_rate(i2c->clk);
unsigned int div;
div = DIV_ROUND_UP(clk_rate, freq * i2c->data->div_factor);
div = DIV_ROUND_UP(clk_rate, freq);
div -= FILTER_DELAY;
div = DIV_ROUND_UP(div, i2c->data->div_factor);
/* clock divider has 12 bits */
if (div >= (1 << 12)) {
if (div > GENMASK(11, 0)) {
dev_err(i2c->dev, "requested bus frequency too low\n");
div = (1 << 12) - 1;
div = GENMASK(11, 0);
}
meson_i2c_set_mask(i2c, REG_CTRL, REG_CTRL_CLKDIV_MASK,
(div & GENMASK(9, 0)) << REG_CTRL_CLKDIV_SHIFT);
meson_i2c_set_mask(i2c, REG_CTRL, REG_CTRL_CLKDIV,
FIELD_PREP(REG_CTRL_CLKDIV, div & GENMASK(9, 0)));
meson_i2c_set_mask(i2c, REG_CTRL, REG_CTRL_CLKDIVEXT_MASK,
(div >> 10) << REG_CTRL_CLKDIVEXT_SHIFT);
meson_i2c_set_mask(i2c, REG_CTRL, REG_CTRL_CLKDIVEXT,
FIELD_PREP(REG_CTRL_CLKDIVEXT, div >> 10));
/* Disable HIGH/LOW mode */
meson_i2c_set_mask(i2c, REG_SLAVE_ADDR, REG_SLV_SCL_LOW_EN, 0);
dev_dbg(i2c->dev, "%s: clk %lu, freq %u, div %u\n", __func__,
clk_rate, freq, div);
@@ -280,7 +291,10 @@ static void meson_i2c_do_start(struct meson_i2c *i2c, struct i2c_msg *msg)
token = (msg->flags & I2C_M_RD) ? TOKEN_SLAVE_ADDR_READ :
TOKEN_SLAVE_ADDR_WRITE;
writel(msg->addr << 1, i2c->regs + REG_SLAVE_ADDR);
meson_i2c_set_mask(i2c, REG_SLAVE_ADDR, REG_SLV_ADDR,
FIELD_PREP(REG_SLV_ADDR, msg->addr << 1));
meson_i2c_add_token(i2c, TOKEN_START);
meson_i2c_add_token(i2c, token);
}
@@ -357,16 +371,12 @@ static int meson_i2c_xfer_messages(struct i2c_adapter *adap,
struct meson_i2c *i2c = adap->algo_data;
int i, ret = 0;
clk_enable(i2c->clk);
for (i = 0; i < num; i++) {
ret = meson_i2c_xfer_msg(i2c, msgs + i, i == num - 1, atomic);
if (ret)
break;
}
clk_disable(i2c->clk);
return ret ?: i;
}
@@ -435,7 +445,7 @@ static int meson_i2c_probe(struct platform_device *pdev)
return ret;
}
ret = clk_prepare(i2c->clk);
ret = clk_prepare_enable(i2c->clk);
if (ret < 0) {
dev_err(&pdev->dev, "can't prepare clock\n");
return ret;
@@ -457,10 +467,14 @@ static int meson_i2c_probe(struct platform_device *pdev)
ret = i2c_add_adapter(&i2c->adap);
if (ret < 0) {
clk_unprepare(i2c->clk);
clk_disable_unprepare(i2c->clk);
return ret;
}
/* Disable filtering */
meson_i2c_set_mask(i2c, REG_SLAVE_ADDR,
REG_SLV_SDA_FILTER | REG_SLV_SCL_FILTER, 0);
meson_i2c_set_clk_div(i2c, timings.bus_freq_hz);
return 0;
@@ -471,7 +485,7 @@ static int meson_i2c_remove(struct platform_device *pdev)
struct meson_i2c *i2c = platform_get_drvdata(pdev);
i2c_del_adapter(&i2c->adap);
clk_unprepare(i2c->clk);
clk_disable_unprepare(i2c->clk);
return 0;
}

View File

@@ -176,6 +176,9 @@ static irqreturn_t owl_i2c_interrupt(int irq, void *_dev)
fifostat = readl(i2c_dev->base + OWL_I2C_REG_FIFOSTAT);
if (fifostat & OWL_I2C_FIFOSTAT_RNB) {
i2c_dev->err = -ENXIO;
/* Clear NACK error bit by writing "1" */
owl_i2c_update_reg(i2c_dev->base + OWL_I2C_REG_FIFOSTAT,
OWL_I2C_FIFOSTAT_RNB, true);
goto stop;
}
@@ -183,6 +186,9 @@ static irqreturn_t owl_i2c_interrupt(int irq, void *_dev)
stat = readl(i2c_dev->base + OWL_I2C_REG_STAT);
if (stat & OWL_I2C_STAT_BEB) {
i2c_dev->err = -EIO;
/* Clear BUS error bit by writing "1" */
owl_i2c_update_reg(i2c_dev->base + OWL_I2C_REG_STAT,
OWL_I2C_STAT_BEB, true);
goto stop;
}

View File

@@ -1320,9 +1320,10 @@ struct net_device *rdma_read_gid_attr_ndev_rcu(const struct ib_gid_attr *attr)
}
EXPORT_SYMBOL(rdma_read_gid_attr_ndev_rcu);
static int get_lower_dev_vlan(struct net_device *lower_dev, void *data)
static int get_lower_dev_vlan(struct net_device *lower_dev,
struct netdev_nested_priv *priv)
{
u16 *vlan_id = data;
u16 *vlan_id = (u16 *)priv->data;
if (is_vlan_dev(lower_dev))
*vlan_id = vlan_dev_vlan_id(lower_dev);
@@ -1348,6 +1349,9 @@ static int get_lower_dev_vlan(struct net_device *lower_dev, void *data)
int rdma_read_gid_l2_fields(const struct ib_gid_attr *attr,
u16 *vlan_id, u8 *smac)
{
struct netdev_nested_priv priv = {
.data = (void *)vlan_id,
};
struct net_device *ndev;
rcu_read_lock();
@@ -1368,7 +1372,7 @@ int rdma_read_gid_l2_fields(const struct ib_gid_attr *attr,
* the lower vlan device for this gid entry.
*/
netdev_walk_all_lower_dev_rcu(attr->ndev,
get_lower_dev_vlan, vlan_id);
get_lower_dev_vlan, &priv);
}
}
rcu_read_unlock();

View File

@@ -2865,9 +2865,10 @@ struct iboe_prio_tc_map {
bool found;
};
static int get_lower_vlan_dev_tc(struct net_device *dev, void *data)
static int get_lower_vlan_dev_tc(struct net_device *dev,
struct netdev_nested_priv *priv)
{
struct iboe_prio_tc_map *map = data;
struct iboe_prio_tc_map *map = (struct iboe_prio_tc_map *)priv->data;
if (is_vlan_dev(dev))
map->output_tc = get_vlan_ndev_tc(dev, map->input_prio);
@@ -2886,16 +2887,18 @@ static int iboe_tos_to_sl(struct net_device *ndev, int tos)
{
struct iboe_prio_tc_map prio_tc_map = {};
int prio = rt_tos2priority(tos);
struct netdev_nested_priv priv;
/* If VLAN device, get it directly from the VLAN netdev */
if (is_vlan_dev(ndev))
return get_vlan_ndev_tc(ndev, prio);
prio_tc_map.input_prio = prio;
priv.data = (void *)&prio_tc_map;
rcu_read_lock();
netdev_walk_all_lower_dev_rcu(ndev,
get_lower_vlan_dev_tc,
&prio_tc_map);
&priv);
rcu_read_unlock();
/* If map is found from lower device, use it; Otherwise
* continue with the current netdevice to get priority to tc map.

View File

@@ -531,10 +531,11 @@ struct upper_list {
struct net_device *upper;
};
static int netdev_upper_walk(struct net_device *upper, void *data)
static int netdev_upper_walk(struct net_device *upper,
struct netdev_nested_priv *priv)
{
struct upper_list *entry = kmalloc(sizeof(*entry), GFP_ATOMIC);
struct list_head *upper_list = data;
struct list_head *upper_list = (struct list_head *)priv->data;
if (!entry)
return 0;
@@ -553,12 +554,14 @@ static void handle_netdev_upper(struct ib_device *ib_dev, u8 port,
struct net_device *ndev))
{
struct net_device *ndev = cookie;
struct netdev_nested_priv priv;
struct upper_list *upper_iter;
struct upper_list *upper_temp;
LIST_HEAD(upper_list);
priv.data = &upper_list;
rcu_read_lock();
netdev_walk_all_upper_dev_rcu(ndev, netdev_upper_walk, &upper_list);
netdev_walk_all_upper_dev_rcu(ndev, netdev_upper_walk, &priv);
rcu_read_unlock();
handle_netdev(ib_dev, port, ndev);

View File

@@ -342,9 +342,10 @@ struct ipoib_walk_data {
struct net_device *result;
};
static int ipoib_upper_walk(struct net_device *upper, void *_data)
static int ipoib_upper_walk(struct net_device *upper,
struct netdev_nested_priv *priv)
{
struct ipoib_walk_data *data = _data;
struct ipoib_walk_data *data = (struct ipoib_walk_data *)priv->data;
int ret = 0;
if (ipoib_is_dev_match_addr_rcu(data->addr, upper)) {
@@ -368,10 +369,12 @@ static int ipoib_upper_walk(struct net_device *upper, void *_data)
static struct net_device *ipoib_get_net_dev_match_addr(
const struct sockaddr *addr, struct net_device *dev)
{
struct netdev_nested_priv priv;
struct ipoib_walk_data data = {
.addr = addr,
};
priv.data = (void *)&data;
rcu_read_lock();
if (ipoib_is_dev_match_addr_rcu(addr, dev)) {
dev_hold(dev);
@@ -379,7 +382,7 @@ static struct net_device *ipoib_get_net_dev_match_addr(
goto out;
}
netdev_walk_all_upper_dev_rcu(dev, ipoib_upper_walk, &data);
netdev_walk_all_upper_dev_rcu(dev, ipoib_upper_walk, &priv);
out:
rcu_read_unlock();
return data.result;

View File

@@ -190,7 +190,7 @@ static void mmc_queue_setup_discard(struct request_queue *q,
q->limits.discard_granularity = card->pref_erase << 9;
/* granularity must not be greater than max. discard */
if (card->pref_erase > max_discard)
q->limits.discard_granularity = 0;
q->limits.discard_granularity = SECTOR_SIZE;
if (mmc_can_secure_erase_trim(card))
blk_queue_flag_set(QUEUE_FLAG_SECERASE, q);
}

View File

@@ -942,9 +942,10 @@ struct alb_walk_data {
bool strict_match;
};
static int alb_upper_dev_walk(struct net_device *upper, void *_data)
static int alb_upper_dev_walk(struct net_device *upper,
struct netdev_nested_priv *priv)
{
struct alb_walk_data *data = _data;
struct alb_walk_data *data = (struct alb_walk_data *)priv->data;
bool strict_match = data->strict_match;
struct bonding *bond = data->bond;
struct slave *slave = data->slave;
@@ -983,6 +984,7 @@ static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[],
bool strict_match)
{
struct bonding *bond = bond_get_bond_by_slave(slave);
struct netdev_nested_priv priv;
struct alb_walk_data data = {
.strict_match = strict_match,
.mac_addr = mac_addr,
@@ -990,6 +992,7 @@ static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[],
.bond = bond,
};
priv.data = (void *)&data;
/* send untagged */
alb_send_lp_vid(slave, mac_addr, 0, 0);
@@ -997,7 +1000,7 @@ static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[],
* for that device.
*/
rcu_read_lock();
netdev_walk_all_upper_dev_rcu(bond->dev, alb_upper_dev_walk, &data);
netdev_walk_all_upper_dev_rcu(bond->dev, alb_upper_dev_walk, &priv);
rcu_read_unlock();
}

View File

@@ -1315,6 +1315,7 @@ static void bond_setup_by_slave(struct net_device *bond_dev,
bond_dev->type = slave_dev->type;
bond_dev->hard_header_len = slave_dev->hard_header_len;
bond_dev->needed_headroom = slave_dev->needed_headroom;
bond_dev->addr_len = slave_dev->addr_len;
memcpy(bond_dev->broadcast, slave_dev->broadcast,
@@ -2510,22 +2511,26 @@ re_arm:
}
}
static int bond_upper_dev_walk(struct net_device *upper, void *data)
static int bond_upper_dev_walk(struct net_device *upper,
struct netdev_nested_priv *priv)
{
__be32 ip = *((__be32 *)data);
__be32 ip = *(__be32 *)priv->data;
return ip == bond_confirm_addr(upper, 0, ip);
}
static bool bond_has_this_ip(struct bonding *bond, __be32 ip)
{
struct netdev_nested_priv priv = {
.data = (void *)&ip,
};
bool ret = false;
if (ip == bond_confirm_addr(bond->dev, 0, ip))
return true;
rcu_read_lock();
if (netdev_walk_all_upper_dev_rcu(bond->dev, bond_upper_dev_walk, &ip))
if (netdev_walk_all_upper_dev_rcu(bond->dev, bond_upper_dev_walk, &priv))
ret = true;
rcu_read_unlock();

View File

@@ -387,8 +387,8 @@ EXPORT_SYMBOL(ksz_switch_alloc);
int ksz_switch_register(struct ksz_device *dev,
const struct ksz_dev_ops *ops)
{
struct device_node *port, *ports;
phy_interface_t interface;
struct device_node *port;
unsigned int port_num;
int ret;
@@ -429,13 +429,17 @@ int ksz_switch_register(struct ksz_device *dev,
ret = of_get_phy_mode(dev->dev->of_node, &interface);
if (ret == 0)
dev->compat_interface = interface;
for_each_available_child_of_node(dev->dev->of_node, port) {
if (of_property_read_u32(port, "reg", &port_num))
continue;
if (port_num >= dev->port_cnt)
return -EINVAL;
of_get_phy_mode(port, &dev->ports[port_num].interface);
}
ports = of_get_child_by_name(dev->dev->of_node, "ports");
if (ports)
for_each_available_child_of_node(ports, port) {
if (of_property_read_u32(port, "reg",
&port_num))
continue;
if (port_num >= dev->port_cnt)
return -EINVAL;
of_get_phy_mode(port,
&dev->ports[port_num].interface);
}
dev->synclko_125 = of_property_read_bool(dev->dev->of_node,
"microchip,synclko-125");
}

View File

@@ -685,12 +685,12 @@ static struct vcap_field vsc9959_vcap_is2_actions[] = {
[VCAP_IS2_ACT_POLICE_ENA] = { 9, 1},
[VCAP_IS2_ACT_POLICE_IDX] = { 10, 9},
[VCAP_IS2_ACT_POLICE_VCAP_ONLY] = { 19, 1},
[VCAP_IS2_ACT_PORT_MASK] = { 20, 11},
[VCAP_IS2_ACT_REW_OP] = { 31, 9},
[VCAP_IS2_ACT_SMAC_REPLACE_ENA] = { 40, 1},
[VCAP_IS2_ACT_RSV] = { 41, 2},
[VCAP_IS2_ACT_ACL_ID] = { 43, 6},
[VCAP_IS2_ACT_HIT_CNT] = { 49, 32},
[VCAP_IS2_ACT_PORT_MASK] = { 20, 6},
[VCAP_IS2_ACT_REW_OP] = { 26, 9},
[VCAP_IS2_ACT_SMAC_REPLACE_ENA] = { 35, 1},
[VCAP_IS2_ACT_RSV] = { 36, 2},
[VCAP_IS2_ACT_ACL_ID] = { 38, 6},
[VCAP_IS2_ACT_HIT_CNT] = { 44, 32},
};
static const struct vcap_props vsc9959_vcap_props[] = {
@@ -1171,6 +1171,8 @@ static int vsc9959_prevalidate_phy_mode(struct ocelot *ocelot, int port,
*/
static u16 vsc9959_wm_enc(u16 value)
{
WARN_ON(value >= 16 * BIT(8));
if (value >= BIT(8))
return BIT(8) | (value / 16);
@@ -1284,8 +1286,28 @@ void vsc9959_mdio_bus_free(struct ocelot *ocelot)
static void vsc9959_sched_speed_set(struct ocelot *ocelot, int port,
u32 speed)
{
u8 tas_speed;
switch (speed) {
case SPEED_10:
tas_speed = OCELOT_SPEED_10;
break;
case SPEED_100:
tas_speed = OCELOT_SPEED_100;
break;
case SPEED_1000:
tas_speed = OCELOT_SPEED_1000;
break;
case SPEED_2500:
tas_speed = OCELOT_SPEED_2500;
break;
default:
tas_speed = OCELOT_SPEED_1000;
break;
}
ocelot_rmw_rix(ocelot,
QSYS_TAG_CONFIG_LINK_SPEED(speed),
QSYS_TAG_CONFIG_LINK_SPEED(tas_speed),
QSYS_TAG_CONFIG_LINK_SPEED_M,
QSYS_TAG_CONFIG, port);
}

View File

@@ -706,7 +706,7 @@ static const struct vcap_props vsc9953_vcap_props[] = {
.action_type_width = 1,
.action_table = {
[IS2_ACTION_TYPE_NORMAL] = {
.width = 44,
.width = 50, /* HIT_CNT not included */
.count = 2
},
[IS2_ACTION_TYPE_SMAC_SIP] = {
@@ -911,6 +911,8 @@ static int vsc9953_prevalidate_phy_mode(struct ocelot *ocelot, int port,
*/
static u16 vsc9953_wm_enc(u16 value)
{
WARN_ON(value >= 16 * BIT(9));
if (value >= BIT(9))
return BIT(9) | (value / 16);

View File

@@ -33,7 +33,7 @@ struct basic_ring {
u32 lastWrite;
};
/* The Typoon transmit ring -- same as a basic ring, plus:
/* The Typhoon transmit ring -- same as a basic ring, plus:
* lastRead: where we're at in regard to cleaning up the ring
* writeRegister: register to use for writing (different for Hi & Lo rings)
*/

View File

@@ -8,7 +8,7 @@
obj-$(CONFIG_AQTION) += atlantic.o
ccflags-y += -I$(src)
ccflags-y += -I$(srctree)/$(src)
atlantic-objs := aq_main.o \
aq_nic.o \
@@ -33,4 +33,4 @@ atlantic-objs := aq_main.o \
atlantic-$(CONFIG_MACSEC) += aq_macsec.o
atlantic-$(CONFIG_PTP_1588_CLOCK) += aq_ptp.o
atlantic-$(CONFIG_PTP_1588_CLOCK) += aq_ptp.o

View File

@@ -284,12 +284,12 @@
#define CCM_REG_GR_ARB_TYPE 0xd015c
/* [RW 2] Load (FIC0) channel group priority. The lowest priority is 0; the
highest priority is 3. It is supposed; that the Store channel priority is
the compliment to 4 of the rest priorities - Aggregation channel; Load
the complement to 4 of the rest priorities - Aggregation channel; Load
(FIC0) channel and Load (FIC1). */
#define CCM_REG_GR_LD0_PR 0xd0164
/* [RW 2] Load (FIC1) channel group priority. The lowest priority is 0; the
highest priority is 3. It is supposed; that the Store channel priority is
the compliment to 4 of the rest priorities - Aggregation channel; Load
the complement to 4 of the rest priorities - Aggregation channel; Load
(FIC0) channel and Load (FIC1). */
#define CCM_REG_GR_LD1_PR 0xd0168
/* [RW 2] General flags index. */
@@ -4489,11 +4489,11 @@
#define TCM_REG_GR_ARB_TYPE 0x50114
/* [RW 2] Load (FIC0) channel group priority. The lowest priority is 0; the
highest priority is 3. It is supposed that the Store channel is the
compliment of the other 3 groups. */
complement of the other 3 groups. */
#define TCM_REG_GR_LD0_PR 0x5011c
/* [RW 2] Load (FIC1) channel group priority. The lowest priority is 0; the
highest priority is 3. It is supposed that the Store channel is the
compliment of the other 3 groups. */
complement of the other 3 groups. */
#define TCM_REG_GR_LD1_PR 0x50120
/* [RW 4] The number of double REG-pairs; loaded from the STORM context and
sent to STORM; for a specific connection type. The double REG-pairs are
@@ -5020,11 +5020,11 @@
#define UCM_REG_GR_ARB_TYPE 0xe0144
/* [RW 2] Load (FIC0) channel group priority. The lowest priority is 0; the
highest priority is 3. It is supposed that the Store channel group is
compliment to the others. */
complement to the others. */
#define UCM_REG_GR_LD0_PR 0xe014c
/* [RW 2] Load (FIC1) channel group priority. The lowest priority is 0; the
highest priority is 3. It is supposed that the Store channel group is
compliment to the others. */
complement to the others. */
#define UCM_REG_GR_LD1_PR 0xe0150
/* [RW 2] The queue index for invalidate counter flag decision. */
#define UCM_REG_INV_CFLG_Q 0xe00e4
@@ -5523,11 +5523,11 @@
#define XCM_REG_GR_ARB_TYPE 0x2020c
/* [RW 2] Load (FIC0) channel group priority. The lowest priority is 0; the
highest priority is 3. It is supposed that the Channel group is the
compliment of the other 3 groups. */
complement of the other 3 groups. */
#define XCM_REG_GR_LD0_PR 0x20214
/* [RW 2] Load (FIC1) channel group priority. The lowest priority is 0; the
highest priority is 3. It is supposed that the Channel group is the
compliment of the other 3 groups. */
complement of the other 3 groups. */
#define XCM_REG_GR_LD1_PR 0x20218
/* [RW 1] Input nig0 Interface enable. If 0 - the valid input is
disregarded; acknowledge output is deasserted; all other signals are

View File

@@ -1219,7 +1219,7 @@ static int octeon_mgmt_open(struct net_device *netdev)
*/
if (netdev->phydev) {
netif_carrier_off(netdev);
phy_start_aneg(netdev->phydev);
phy_start(netdev->phydev);
}
netif_wake_queue(netdev);
@@ -1247,8 +1247,10 @@ static int octeon_mgmt_stop(struct net_device *netdev)
napi_disable(&p->napi);
netif_stop_queue(netdev);
if (netdev->phydev)
if (netdev->phydev) {
phy_stop(netdev->phydev);
phy_disconnect(netdev->phydev);
}
netif_carrier_off(netdev);

View File

@@ -11,9 +11,11 @@
#define DPNI_VER_MAJOR 7
#define DPNI_VER_MINOR 0
#define DPNI_CMD_BASE_VERSION 1
#define DPNI_CMD_2ND_VERSION 2
#define DPNI_CMD_ID_OFFSET 4
#define DPNI_CMD(id) (((id) << DPNI_CMD_ID_OFFSET) | DPNI_CMD_BASE_VERSION)
#define DPNI_CMD_V2(id) (((id) << DPNI_CMD_ID_OFFSET) | DPNI_CMD_2ND_VERSION)
#define DPNI_CMDID_OPEN DPNI_CMD(0x801)
#define DPNI_CMDID_CLOSE DPNI_CMD(0x800)
@@ -45,7 +47,7 @@
#define DPNI_CMDID_SET_MAX_FRAME_LENGTH DPNI_CMD(0x216)
#define DPNI_CMDID_GET_MAX_FRAME_LENGTH DPNI_CMD(0x217)
#define DPNI_CMDID_SET_LINK_CFG DPNI_CMD(0x21A)
#define DPNI_CMDID_SET_TX_SHAPING DPNI_CMD(0x21B)
#define DPNI_CMDID_SET_TX_SHAPING DPNI_CMD_V2(0x21B)
#define DPNI_CMDID_SET_MCAST_PROMISC DPNI_CMD(0x220)
#define DPNI_CMDID_GET_MCAST_PROMISC DPNI_CMD(0x221)

View File

@@ -229,7 +229,7 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
/* Return all Fs if nothing was there */
if ((xgmac_read32(&regs->mdio_stat, endian) & MDIO_STAT_RD_ER) &&
!priv->has_a011043) {
dev_err(&bus->dev,
dev_dbg(&bus->dev,
"Error while reading PHY%d reg at %d.%hhu\n",
phy_id, dev_addr, regnum);
return 0xffff;

View File

@@ -6,6 +6,7 @@
config HINIC
tristate "Huawei Intelligent PCIE Network Interface Card"
depends on (PCI_MSI && (X86 || ARM64))
select NET_DEVLINK
help
This driver supports HiNIC PCIE Ethernet cards.
To compile this driver as part of the kernel, choose Y here.

View File

@@ -58,9 +58,9 @@ static int change_mac(struct hinic_dev *nic_dev, const u8 *addr,
sizeof(port_mac_cmd),
&port_mac_cmd, &out_size);
if (err || out_size != sizeof(port_mac_cmd) ||
(port_mac_cmd.status &&
port_mac_cmd.status != HINIC_PF_SET_VF_ALREADY &&
port_mac_cmd.status != HINIC_MGMT_STATUS_EXIST)) {
(port_mac_cmd.status &&
(port_mac_cmd.status != HINIC_PF_SET_VF_ALREADY || !HINIC_IS_VF(hwif)) &&
port_mac_cmd.status != HINIC_MGMT_STATUS_EXIST)) {
dev_err(&pdev->dev, "Failed to change MAC, err: %d, status: 0x%x, out size: 0x%x\n",
err, port_mac_cmd.status, out_size);
return -EFAULT;

View File

@@ -38,8 +38,7 @@ static int hinic_set_mac(struct hinic_hwdev *hwdev, const u8 *mac_addr,
err = hinic_port_msg_cmd(hwdev, HINIC_PORT_CMD_SET_MAC, &mac_info,
sizeof(mac_info), &mac_info, &out_size);
if (err || out_size != sizeof(mac_info) ||
(mac_info.status && mac_info.status != HINIC_PF_SET_VF_ALREADY &&
mac_info.status != HINIC_MGMT_STATUS_EXIST)) {
(mac_info.status && mac_info.status != HINIC_MGMT_STATUS_EXIST)) {
dev_err(&hwdev->func_to_io.hwif->pdev->dev, "Failed to set MAC, err: %d, status: 0x%x, out size: 0x%x\n",
err, mac_info.status, out_size);
return -EIO;
@@ -503,8 +502,7 @@ struct hinic_sriov_info *hinic_get_sriov_info_by_pcidev(struct pci_dev *pdev)
static int hinic_check_mac_info(u8 status, u16 vlan_id)
{
if ((status && status != HINIC_MGMT_STATUS_EXIST &&
status != HINIC_PF_SET_VF_ALREADY) ||
if ((status && status != HINIC_MGMT_STATUS_EXIST) ||
(vlan_id & CHECK_IPSU_15BIT &&
status == HINIC_MGMT_STATUS_EXIST))
return -EINVAL;
@@ -546,12 +544,6 @@ static int hinic_update_mac(struct hinic_hwdev *hwdev, u8 *old_mac,
return -EINVAL;
}
if (mac_info.status == HINIC_PF_SET_VF_ALREADY) {
dev_warn(&hwdev->hwif->pdev->dev,
"PF has already set VF MAC. Ignore update operation\n");
return HINIC_PF_SET_VF_ALREADY;
}
if (mac_info.status == HINIC_MGMT_STATUS_EXIST)
dev_warn(&hwdev->hwif->pdev->dev, "MAC is repeated. Ignore update operation\n");

View File

@@ -3806,8 +3806,8 @@ static int __maybe_unused iavf_suspend(struct device *dev_d)
static int __maybe_unused iavf_resume(struct device *dev_d)
{
struct pci_dev *pdev = to_pci_dev(dev_d);
struct iavf_adapter *adapter = pci_get_drvdata(pdev);
struct net_device *netdev = adapter->netdev;
struct net_device *netdev = pci_get_drvdata(pdev);
struct iavf_adapter *adapter = netdev_priv(netdev);
u32 err;
pci_set_master(pdev);

View File

@@ -2288,26 +2288,28 @@ void ice_set_safe_mode_caps(struct ice_hw *hw)
{
struct ice_hw_func_caps *func_caps = &hw->func_caps;
struct ice_hw_dev_caps *dev_caps = &hw->dev_caps;
u32 valid_func, rxq_first_id, txq_first_id;
u32 msix_vector_first_id, max_mtu;
struct ice_hw_common_caps cached_caps;
u32 num_funcs;
/* cache some func_caps values that should be restored after memset */
valid_func = func_caps->common_cap.valid_functions;
txq_first_id = func_caps->common_cap.txq_first_id;
rxq_first_id = func_caps->common_cap.rxq_first_id;
msix_vector_first_id = func_caps->common_cap.msix_vector_first_id;
max_mtu = func_caps->common_cap.max_mtu;
cached_caps = func_caps->common_cap;
/* unset func capabilities */
memset(func_caps, 0, sizeof(*func_caps));
#define ICE_RESTORE_FUNC_CAP(name) \
func_caps->common_cap.name = cached_caps.name
/* restore cached values */
func_caps->common_cap.valid_functions = valid_func;
func_caps->common_cap.txq_first_id = txq_first_id;
func_caps->common_cap.rxq_first_id = rxq_first_id;
func_caps->common_cap.msix_vector_first_id = msix_vector_first_id;
func_caps->common_cap.max_mtu = max_mtu;
ICE_RESTORE_FUNC_CAP(valid_functions);
ICE_RESTORE_FUNC_CAP(txq_first_id);
ICE_RESTORE_FUNC_CAP(rxq_first_id);
ICE_RESTORE_FUNC_CAP(msix_vector_first_id);
ICE_RESTORE_FUNC_CAP(max_mtu);
ICE_RESTORE_FUNC_CAP(nvm_unified_update);
ICE_RESTORE_FUNC_CAP(nvm_update_pending_nvm);
ICE_RESTORE_FUNC_CAP(nvm_update_pending_orom);
ICE_RESTORE_FUNC_CAP(nvm_update_pending_netlist);
/* one Tx and one Rx queue in safe mode */
func_caps->common_cap.num_rxq = 1;
@@ -2318,22 +2320,25 @@ void ice_set_safe_mode_caps(struct ice_hw *hw)
func_caps->guar_num_vsi = 1;
/* cache some dev_caps values that should be restored after memset */
valid_func = dev_caps->common_cap.valid_functions;
txq_first_id = dev_caps->common_cap.txq_first_id;
rxq_first_id = dev_caps->common_cap.rxq_first_id;
msix_vector_first_id = dev_caps->common_cap.msix_vector_first_id;
max_mtu = dev_caps->common_cap.max_mtu;
cached_caps = dev_caps->common_cap;
num_funcs = dev_caps->num_funcs;
/* unset dev capabilities */
memset(dev_caps, 0, sizeof(*dev_caps));
#define ICE_RESTORE_DEV_CAP(name) \
dev_caps->common_cap.name = cached_caps.name
/* restore cached values */
dev_caps->common_cap.valid_functions = valid_func;
dev_caps->common_cap.txq_first_id = txq_first_id;
dev_caps->common_cap.rxq_first_id = rxq_first_id;
dev_caps->common_cap.msix_vector_first_id = msix_vector_first_id;
dev_caps->common_cap.max_mtu = max_mtu;
ICE_RESTORE_DEV_CAP(valid_functions);
ICE_RESTORE_DEV_CAP(txq_first_id);
ICE_RESTORE_DEV_CAP(rxq_first_id);
ICE_RESTORE_DEV_CAP(msix_vector_first_id);
ICE_RESTORE_DEV_CAP(max_mtu);
ICE_RESTORE_DEV_CAP(nvm_unified_update);
ICE_RESTORE_DEV_CAP(nvm_update_pending_nvm);
ICE_RESTORE_DEV_CAP(nvm_update_pending_orom);
ICE_RESTORE_DEV_CAP(nvm_update_pending_netlist);
dev_caps->num_funcs = num_funcs;
/* one Tx and one Rx queue per function in safe mode */

View File

@@ -289,7 +289,13 @@ ice_write_one_nvm_block(struct ice_pf *pf, u16 module, u32 offset,
return -EIO;
}
err = ice_aq_wait_for_event(pf, ice_aqc_opc_nvm_write, HZ, &event);
/* In most cases, firmware reports a write completion within a few
* milliseconds. However, it has been observed that a completion might
* take more than a second to complete in some cases. The timeout here
* is conservative and is intended to prevent failure to update when
* firmware is slow to respond.
*/
err = ice_aq_wait_for_event(pf, ice_aqc_opc_nvm_write, 15 * HZ, &event);
if (err) {
dev_err(dev, "Timed out waiting for firmware write completion for module 0x%02x, err %d\n",
module, err);
@@ -513,7 +519,7 @@ static int ice_switch_flash_banks(struct ice_pf *pf, u8 activate_flags,
return -EIO;
}
err = ice_aq_wait_for_event(pf, ice_aqc_opc_nvm_write_activate, HZ,
err = ice_aq_wait_for_event(pf, ice_aqc_opc_nvm_write_activate, 30 * HZ,
&event);
if (err) {
dev_err(dev, "Timed out waiting for firmware to switch active flash banks, err %d\n",

View File

@@ -246,7 +246,7 @@ static int ice_get_free_slot(void *array, int size, int curr)
* ice_vsi_delete - delete a VSI from the switch
* @vsi: pointer to VSI being removed
*/
void ice_vsi_delete(struct ice_vsi *vsi)
static void ice_vsi_delete(struct ice_vsi *vsi)
{
struct ice_pf *pf = vsi->back;
struct ice_vsi_ctx *ctxt;
@@ -313,7 +313,7 @@ static void ice_vsi_free_arrays(struct ice_vsi *vsi)
*
* Returns 0 on success, negative on failure
*/
int ice_vsi_clear(struct ice_vsi *vsi)
static int ice_vsi_clear(struct ice_vsi *vsi)
{
struct ice_pf *pf = NULL;
struct device *dev;
@@ -563,7 +563,7 @@ static int ice_vsi_get_qs(struct ice_vsi *vsi)
* ice_vsi_put_qs - Release queues from VSI to PF
* @vsi: the VSI that is going to release queues
*/
void ice_vsi_put_qs(struct ice_vsi *vsi)
static void ice_vsi_put_qs(struct ice_vsi *vsi)
{
struct ice_pf *pf = vsi->back;
int i;
@@ -1196,6 +1196,18 @@ static void ice_vsi_clear_rings(struct ice_vsi *vsi)
{
int i;
/* Avoid stale references by clearing map from vector to ring */
if (vsi->q_vectors) {
ice_for_each_q_vector(vsi, i) {
struct ice_q_vector *q_vector = vsi->q_vectors[i];
if (q_vector) {
q_vector->tx.ring = NULL;
q_vector->rx.ring = NULL;
}
}
}
if (vsi->tx_rings) {
for (i = 0; i < vsi->alloc_txq; i++) {
if (vsi->tx_rings[i]) {
@@ -2291,7 +2303,7 @@ ice_vsi_setup(struct ice_pf *pf, struct ice_port_info *pi,
if (status) {
dev_err(dev, "VSI %d failed lan queue config, error %s\n",
vsi->vsi_num, ice_stat_str(status));
goto unroll_vector_base;
goto unroll_clear_rings;
}
/* Add switch rule to drop all Tx Flow Control Frames, of look up

View File

@@ -45,10 +45,6 @@ int ice_cfg_vlan_pruning(struct ice_vsi *vsi, bool ena, bool vlan_promisc);
void ice_cfg_sw_lldp(struct ice_vsi *vsi, bool tx, bool create);
void ice_vsi_delete(struct ice_vsi *vsi);
int ice_vsi_clear(struct ice_vsi *vsi);
#ifdef CONFIG_DCB
int ice_vsi_cfg_tc(struct ice_vsi *vsi, u8 ena_tc);
#endif /* CONFIG_DCB */
@@ -79,8 +75,6 @@ bool ice_is_reset_in_progress(unsigned long *state);
void
ice_write_qrxflxp_cntxt(struct ice_hw *hw, u16 pf_q, u32 rxdid, u32 prio);
void ice_vsi_put_qs(struct ice_vsi *vsi);
void ice_vsi_dis_irq(struct ice_vsi *vsi);
void ice_vsi_free_irq(struct ice_vsi *vsi);

View File

@@ -3169,10 +3169,8 @@ static int ice_setup_pf_sw(struct ice_pf *pf)
return -EBUSY;
vsi = ice_pf_vsi_setup(pf, pf->hw.port_info);
if (!vsi) {
status = -ENOMEM;
goto unroll_vsi_setup;
}
if (!vsi)
return -ENOMEM;
status = ice_cfg_netdev(vsi);
if (status) {
@@ -3219,12 +3217,7 @@ unroll_napi_add:
}
unroll_vsi_setup:
if (vsi) {
ice_vsi_free_q_vectors(vsi);
ice_vsi_delete(vsi);
ice_vsi_put_qs(vsi);
ice_vsi_clear(vsi);
}
ice_vsi_release(vsi);
return status;
}
@@ -4522,6 +4515,7 @@ static int __maybe_unused ice_suspend(struct device *dev)
}
ice_clear_interrupt_scheme(pf);
pci_save_state(pdev);
pci_wake_from_d3(pdev, pf->wol_ena);
pci_set_power_state(pdev, PCI_D3hot);
return 0;

View File

@@ -5396,9 +5396,10 @@ static int ixgbe_fwd_ring_up(struct ixgbe_adapter *adapter,
return err;
}
static int ixgbe_macvlan_up(struct net_device *vdev, void *data)
static int ixgbe_macvlan_up(struct net_device *vdev,
struct netdev_nested_priv *priv)
{
struct ixgbe_adapter *adapter = data;
struct ixgbe_adapter *adapter = (struct ixgbe_adapter *)priv->data;
struct ixgbe_fwd_adapter *accel;
if (!netif_is_macvlan(vdev))
@@ -5415,8 +5416,12 @@ static int ixgbe_macvlan_up(struct net_device *vdev, void *data)
static void ixgbe_configure_dfwd(struct ixgbe_adapter *adapter)
{
struct netdev_nested_priv priv = {
.data = (void *)adapter,
};
netdev_walk_all_upper_dev_rcu(adapter->netdev,
ixgbe_macvlan_up, adapter);
ixgbe_macvlan_up, &priv);
}
static void ixgbe_configure(struct ixgbe_adapter *adapter)
@@ -9023,9 +9028,10 @@ static void ixgbe_set_prio_tc_map(struct ixgbe_adapter *adapter)
}
#endif /* CONFIG_IXGBE_DCB */
static int ixgbe_reassign_macvlan_pool(struct net_device *vdev, void *data)
static int ixgbe_reassign_macvlan_pool(struct net_device *vdev,
struct netdev_nested_priv *priv)
{
struct ixgbe_adapter *adapter = data;
struct ixgbe_adapter *adapter = (struct ixgbe_adapter *)priv->data;
struct ixgbe_fwd_adapter *accel;
int pool;
@@ -9062,13 +9068,16 @@ static int ixgbe_reassign_macvlan_pool(struct net_device *vdev, void *data)
static void ixgbe_defrag_macvlan_pools(struct net_device *dev)
{
struct ixgbe_adapter *adapter = netdev_priv(dev);
struct netdev_nested_priv priv = {
.data = (void *)adapter,
};
/* flush any stale bits out of the fwd bitmask */
bitmap_clear(adapter->fwd_bitmask, 1, 63);
/* walk through upper devices reassigning pools */
netdev_walk_all_upper_dev_rcu(dev, ixgbe_reassign_macvlan_pool,
adapter);
&priv);
}
/**
@@ -9242,14 +9251,18 @@ struct upper_walk_data {
u8 queue;
};
static int get_macvlan_queue(struct net_device *upper, void *_data)
static int get_macvlan_queue(struct net_device *upper,
struct netdev_nested_priv *priv)
{
if (netif_is_macvlan(upper)) {
struct ixgbe_fwd_adapter *vadapter = macvlan_accel_priv(upper);
struct upper_walk_data *data = _data;
struct ixgbe_adapter *adapter = data->adapter;
int ifindex = data->ifindex;
struct ixgbe_adapter *adapter;
struct upper_walk_data *data;
int ifindex;
data = (struct upper_walk_data *)priv->data;
ifindex = data->ifindex;
adapter = data->adapter;
if (vadapter && upper->ifindex == ifindex) {
data->queue = adapter->rx_ring[vadapter->rx_base_queue]->reg_idx;
data->action = data->queue;
@@ -9265,6 +9278,7 @@ static int handle_redirect_action(struct ixgbe_adapter *adapter, int ifindex,
{
struct ixgbe_ring_feature *vmdq = &adapter->ring_feature[RING_F_VMDQ];
unsigned int num_vfs = adapter->num_vfs, vf;
struct netdev_nested_priv priv;
struct upper_walk_data data;
struct net_device *upper;
@@ -9284,8 +9298,9 @@ static int handle_redirect_action(struct ixgbe_adapter *adapter, int ifindex,
data.ifindex = ifindex;
data.action = 0;
data.queue = 0;
priv.data = (void *)&data;
if (netdev_walk_all_upper_dev_rcu(adapter->netdev,
get_macvlan_queue, &data)) {
get_macvlan_queue, &priv)) {
*action = data.action;
*queue = data.queue;
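
Several drivers in this range convert netdev walker callbacks from a bare void *data argument to struct netdev_nested_priv, whose ->data field carries the caller's context. A small userspace analogue of that calling convention; the walker and the names below are illustrative, not the kernel API itself.

/* illustrative sketch of the priv-wrapper callback convention */
#include <stdio.h>

struct nested_priv {
	void *data;	/* caller-owned context */
};

struct adapter { int pools_used; };

/* walker: visits each "upper device" and invokes the callback */
static int walk_uppers(int n_uppers,
		       int (*fn)(int upper, struct nested_priv *priv),
		       struct nested_priv *priv)
{
	for (int i = 0; i < n_uppers; i++) {
		int ret = fn(i, priv);

		if (ret)
			return ret;	/* non-zero stops the walk */
	}
	return 0;
}

static int count_pool(int upper, struct nested_priv *priv)
{
	struct adapter *adapter = priv->data;

	adapter->pools_used++;
	return 0;
}

int main(void)
{
	struct adapter adapter = { 0 };
	struct nested_priv priv = { .data = &adapter };

	walk_uppers(3, count_pool, &priv);
	printf("pools used: %d\n", adapter.pools_used);
	return 0;
}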

View File

@@ -245,6 +245,7 @@ static int xrx200_tx_housekeeping(struct napi_struct *napi, int budget)
int pkts = 0;
int bytes = 0;
netif_tx_lock(net_dev);
while (pkts < budget) {
struct ltq_dma_desc *desc = &ch->dma.desc_base[ch->tx_free];
@@ -268,6 +269,7 @@ static int xrx200_tx_housekeeping(struct napi_struct *napi, int budget)
net_dev->stats.tx_bytes += bytes;
netdev_completed_queue(ch->priv->net_dev, pkts, bytes);
netif_tx_unlock(net_dev);
if (netif_queue_stopped(net_dev))
netif_wake_queue(net_dev);

View File

@@ -3400,24 +3400,15 @@ static int mvneta_txq_sw_init(struct mvneta_port *pp,
txq->last_desc = txq->size - 1;
txq->buf = kmalloc_array(txq->size, sizeof(*txq->buf), GFP_KERNEL);
if (!txq->buf) {
dma_free_coherent(pp->dev->dev.parent,
txq->size * MVNETA_DESC_ALIGNED_SIZE,
txq->descs, txq->descs_phys);
if (!txq->buf)
return -ENOMEM;
}
/* Allocate DMA buffers for TSO MAC/IP/TCP headers */
txq->tso_hdrs = dma_alloc_coherent(pp->dev->dev.parent,
txq->size * TSO_HEADER_SIZE,
&txq->tso_hdrs_phys, GFP_KERNEL);
if (!txq->tso_hdrs) {
kfree(txq->buf);
dma_free_coherent(pp->dev->dev.parent,
txq->size * MVNETA_DESC_ALIGNED_SIZE,
txq->descs, txq->descs_phys);
if (!txq->tso_hdrs)
return -ENOMEM;
}
/* Setup XPS mapping */
if (txq_number > 1)

View File

@@ -17,7 +17,7 @@
static const u16 msgs_offset = ALIGN(sizeof(struct mbox_hdr), MBOX_MSG_ALIGN);
void otx2_mbox_reset(struct otx2_mbox *mbox, int devid)
void __otx2_mbox_reset(struct otx2_mbox *mbox, int devid)
{
void *hw_mbase = mbox->hwbase + (devid * MBOX_SIZE);
struct otx2_mbox_dev *mdev = &mbox->dev[devid];
@@ -26,13 +26,21 @@ void otx2_mbox_reset(struct otx2_mbox *mbox, int devid)
tx_hdr = hw_mbase + mbox->tx_start;
rx_hdr = hw_mbase + mbox->rx_start;
spin_lock(&mdev->mbox_lock);
mdev->msg_size = 0;
mdev->rsp_size = 0;
tx_hdr->num_msgs = 0;
tx_hdr->msg_size = 0;
rx_hdr->num_msgs = 0;
rx_hdr->msg_size = 0;
}
EXPORT_SYMBOL(__otx2_mbox_reset);
void otx2_mbox_reset(struct otx2_mbox *mbox, int devid)
{
struct otx2_mbox_dev *mdev = &mbox->dev[devid];
spin_lock(&mdev->mbox_lock);
__otx2_mbox_reset(mbox, devid);
spin_unlock(&mdev->mbox_lock);
}
EXPORT_SYMBOL(otx2_mbox_reset);
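
The mailbox change above splits the reset into a bare __otx2_mbox_reset() and an otx2_mbox_reset() wrapper that takes mbox_lock around it, so callers that already serialize access can use the bare helper. A minimal pthread sketch of the same idiom, with hypothetical names:

/* illustrative sketch of the locked-wrapper / __unlocked-helper idiom */
#include <pthread.h>
#include <stdio.h>

struct mbox {
	pthread_mutex_t lock;
	int num_msgs;
	int msg_size;
};

/* caller must already hold mbox->lock (or otherwise own the mailbox) */
static void __mbox_reset(struct mbox *m)
{
	m->num_msgs = 0;
	m->msg_size = 0;
}

static void mbox_reset(struct mbox *m)
{
	pthread_mutex_lock(&m->lock);
	__mbox_reset(m);
	pthread_mutex_unlock(&m->lock);
}

int main(void)
{
	struct mbox m = { .lock = PTHREAD_MUTEX_INITIALIZER,
			  .num_msgs = 4, .msg_size = 128 };

	mbox_reset(&m);
	printf("msgs=%d size=%d\n", m.num_msgs, m.msg_size);
	return 0;
}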

View File

@@ -93,6 +93,7 @@ struct mbox_msghdr {
};
void otx2_mbox_reset(struct otx2_mbox *mbox, int devid);
void __otx2_mbox_reset(struct otx2_mbox *mbox, int devid);
void otx2_mbox_destroy(struct otx2_mbox *mbox);
int otx2_mbox_init(struct otx2_mbox *mbox, void __force *hwbase,
struct pci_dev *pdev, void __force *reg_base,

View File

@@ -463,6 +463,7 @@ void rvu_nix_freemem(struct rvu *rvu);
int rvu_get_nixlf_count(struct rvu *rvu);
void rvu_nix_lf_teardown(struct rvu *rvu, u16 pcifunc, int blkaddr, int npalf);
int nix_get_nixlf(struct rvu *rvu, u16 pcifunc, int *nixlf, int *nix_blkaddr);
int nix_update_bcast_mce_list(struct rvu *rvu, u16 pcifunc, bool add);
/* NPC APIs */
int rvu_npc_init(struct rvu *rvu);
@@ -477,7 +478,7 @@ void rvu_npc_disable_promisc_entry(struct rvu *rvu, u16 pcifunc, int nixlf);
void rvu_npc_enable_promisc_entry(struct rvu *rvu, u16 pcifunc, int nixlf);
void rvu_npc_install_bcast_match_entry(struct rvu *rvu, u16 pcifunc,
int nixlf, u64 chan);
void rvu_npc_disable_bcast_entry(struct rvu *rvu, u16 pcifunc);
void rvu_npc_enable_bcast_entry(struct rvu *rvu, u16 pcifunc, bool enable);
int rvu_npc_update_rxvlan(struct rvu *rvu, u16 pcifunc, int nixlf);
void rvu_npc_disable_mcam_entries(struct rvu *rvu, u16 pcifunc, int nixlf);
void rvu_npc_disable_default_entries(struct rvu *rvu, u16 pcifunc, int nixlf);

View File

@@ -17,7 +17,6 @@
#include "npc.h"
#include "cgx.h"
static int nix_update_bcast_mce_list(struct rvu *rvu, u16 pcifunc, bool add);
static int rvu_nix_get_bpid(struct rvu *rvu, struct nix_bp_cfg_req *req,
int type, int chan_id);
@@ -2020,7 +2019,7 @@ static int nix_update_mce_list(struct nix_mce_list *mce_list,
return 0;
}
static int nix_update_bcast_mce_list(struct rvu *rvu, u16 pcifunc, bool add)
int nix_update_bcast_mce_list(struct rvu *rvu, u16 pcifunc, bool add)
{
int err = 0, idx, next_idx, last_idx;
struct nix_mce_list *mce_list;
@@ -2065,7 +2064,7 @@ static int nix_update_bcast_mce_list(struct rvu *rvu, u16 pcifunc, bool add)
/* Disable MCAM entry in NPC */
if (!mce_list->count) {
rvu_npc_disable_bcast_entry(rvu, pcifunc);
rvu_npc_enable_bcast_entry(rvu, pcifunc, false);
goto end;
}

View File

@@ -530,7 +530,7 @@ void rvu_npc_install_bcast_match_entry(struct rvu *rvu, u16 pcifunc,
NIX_INTF_RX, &entry, true);
}
void rvu_npc_disable_bcast_entry(struct rvu *rvu, u16 pcifunc)
void rvu_npc_enable_bcast_entry(struct rvu *rvu, u16 pcifunc, bool enable)
{
struct npc_mcam *mcam = &rvu->hw->mcam;
int blkaddr, index;
@@ -543,7 +543,7 @@ void rvu_npc_disable_bcast_entry(struct rvu *rvu, u16 pcifunc)
pcifunc = pcifunc & ~RVU_PFVF_FUNC_MASK;
index = npc_get_nixlf_mcam_index(mcam, pcifunc, 0, NIXLF_BCAST_ENTRY);
npc_enable_mcam_entry(rvu, mcam, blkaddr, index, false);
npc_enable_mcam_entry(rvu, mcam, blkaddr, index, enable);
}
void rvu_npc_update_flowkey_alg_idx(struct rvu *rvu, u16 pcifunc, int nixlf,
@@ -622,23 +622,35 @@ static void npc_enadis_default_entries(struct rvu *rvu, u16 pcifunc,
nixlf, NIXLF_UCAST_ENTRY);
npc_enable_mcam_entry(rvu, mcam, blkaddr, index, enable);
/* For PF, ena/dis promisc and bcast MCAM match entries */
if (pcifunc & RVU_PFVF_FUNC_MASK)
/* For PF, ena/dis promisc and bcast MCAM match entries.
* For VFs add/delete from bcast list when RX multicast
* feature is present.
*/
if (pcifunc & RVU_PFVF_FUNC_MASK && !rvu->hw->cap.nix_rx_multicast)
return;
/* For bcast, enable/disable only if its action is not
* packet replication; in case the action is replication,
* then this PF's nixlf is removed from bcast replication
* then this PF/VF's nixlf is removed from bcast replication
* list.
*/
index = npc_get_nixlf_mcam_index(mcam, pcifunc,
index = npc_get_nixlf_mcam_index(mcam, pcifunc & ~RVU_PFVF_FUNC_MASK,
nixlf, NIXLF_BCAST_ENTRY);
bank = npc_get_bank(mcam, index);
*(u64 *)&action = rvu_read64(rvu, blkaddr,
NPC_AF_MCAMEX_BANKX_ACTION(index & (mcam->banksize - 1), bank));
if (action.op != NIX_RX_ACTIONOP_MCAST)
/* VFs will not have BCAST entry */
if (action.op != NIX_RX_ACTIONOP_MCAST &&
!(pcifunc & RVU_PFVF_FUNC_MASK)) {
npc_enable_mcam_entry(rvu, mcam,
blkaddr, index, enable);
} else {
nix_update_bcast_mce_list(rvu, pcifunc, enable);
/* Enable PF's BCAST entry for packet replication */
rvu_npc_enable_bcast_entry(rvu, pcifunc, enable);
}
if (enable)
rvu_npc_enable_promisc_entry(rvu, pcifunc, nixlf);
else

View File

@@ -370,8 +370,8 @@ static int otx2_forward_vf_mbox_msgs(struct otx2_nic *pf,
dst_mbox = &pf->mbox;
dst_size = dst_mbox->mbox.tx_size -
ALIGN(sizeof(*mbox_hdr), MBOX_MSG_ALIGN);
/* Check if msgs fit into destination area */
if (mbox_hdr->msg_size > dst_size)
/* Check if msgs fit into the destination area and have a valid size */
if (mbox_hdr->msg_size > dst_size || !mbox_hdr->msg_size)
return -EINVAL;
dst_mdev = &dst_mbox->mbox.dev[0];
@@ -526,10 +526,10 @@ static void otx2_pfvf_mbox_up_handler(struct work_struct *work)
end:
offset = mbox->rx_start + msg->next_msgoff;
if (mdev->msgs_acked == (vf_mbox->up_num_msgs - 1))
__otx2_mbox_reset(mbox, 0);
mdev->msgs_acked++;
}
otx2_mbox_reset(mbox, vf_idx);
}
static irqreturn_t otx2_pfvf_mbox_intr_handler(int irq, void *pf_irq)
@@ -803,10 +803,11 @@ static void otx2_pfaf_mbox_handler(struct work_struct *work)
msg = (struct mbox_msghdr *)(mdev->mbase + offset);
otx2_process_pfaf_mbox_msg(pf, msg);
offset = mbox->rx_start + msg->next_msgoff;
if (mdev->msgs_acked == (af_mbox->num_msgs - 1))
__otx2_mbox_reset(mbox, 0);
mdev->msgs_acked++;
}
otx2_mbox_reset(mbox, 0);
}
static void otx2_handle_link_event(struct otx2_nic *pf)
@@ -1560,10 +1561,13 @@ int otx2_open(struct net_device *netdev)
err = otx2_rxtx_enable(pf, true);
if (err)
goto err_free_cints;
goto err_tx_stop_queues;
return 0;
err_tx_stop_queues:
netif_tx_stop_all_queues(netdev);
netif_carrier_off(netdev);
err_free_cints:
otx2_free_cints(pf, qidx);
vec = pci_irq_vector(pf->pdev,
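
Earlier in this file's diff, the PF-to-VF forwarding path gains a check that the mailbox header describes a payload that is non-empty and fits the destination region before anything is copied. A small sketch of that kind of bounds check, with made-up names:

/* illustrative sketch of validating a message header before copying */
#include <errno.h>
#include <stdio.h>
#include <string.h>

struct mbox_hdr { unsigned int msg_size; };

static int forward_msgs(const struct mbox_hdr *hdr, const void *src,
			void *dst, size_t dst_size)
{
	if (!hdr->msg_size || hdr->msg_size > dst_size)
		return -EINVAL;	/* reject empty or oversized payloads */

	memcpy(dst, src, hdr->msg_size);
	return 0;
}

int main(void)
{
	char src[64] = "payload", dst[32];
	struct mbox_hdr ok = { .msg_size = 8 };
	struct mbox_hdr bad = { .msg_size = 0 };

	printf("ok:  %d\n", forward_msgs(&ok, src, dst, sizeof(dst)));
	printf("bad: %d\n", forward_msgs(&bad, src, dst, sizeof(dst)));
	return 0;
}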

View File

@@ -524,6 +524,7 @@ static void otx2_sqe_add_hdr(struct otx2_nic *pfvf, struct otx2_snd_queue *sq,
sqe_hdr->ol3type = NIX_SENDL3TYPE_IP4_CKSUM;
} else if (skb->protocol == htons(ETH_P_IPV6)) {
proto = ipv6_hdr(skb)->nexthdr;
sqe_hdr->ol3type = NIX_SENDL3TYPE_IP6;
}
if (proto == IPPROTO_TCP)

View File

@@ -99,10 +99,10 @@ static void otx2vf_vfaf_mbox_handler(struct work_struct *work)
msg = (struct mbox_msghdr *)(mdev->mbase + offset);
otx2vf_process_vfaf_mbox_msg(af_mbox->pfvf, msg);
offset = mbox->rx_start + msg->next_msgoff;
if (mdev->msgs_acked == (af_mbox->num_msgs - 1))
__otx2_mbox_reset(mbox, 0);
mdev->msgs_acked++;
}
otx2_mbox_reset(mbox, 0);
}
static int otx2vf_process_mbox_msg_up(struct otx2_nic *vf,

View File

@@ -69,12 +69,10 @@ enum {
MLX5_CMD_DELIVERY_STAT_CMD_DESCR_ERR = 0x10,
};
static struct mlx5_cmd_work_ent *alloc_cmd(struct mlx5_cmd *cmd,
struct mlx5_cmd_msg *in,
struct mlx5_cmd_msg *out,
void *uout, int uout_size,
mlx5_cmd_cbk_t cbk,
void *context, int page_queue)
static struct mlx5_cmd_work_ent *
cmd_alloc_ent(struct mlx5_cmd *cmd, struct mlx5_cmd_msg *in,
struct mlx5_cmd_msg *out, void *uout, int uout_size,
mlx5_cmd_cbk_t cbk, void *context, int page_queue)
{
gfp_t alloc_flags = cbk ? GFP_ATOMIC : GFP_KERNEL;
struct mlx5_cmd_work_ent *ent;
@@ -83,6 +81,7 @@ static struct mlx5_cmd_work_ent *alloc_cmd(struct mlx5_cmd *cmd,
if (!ent)
return ERR_PTR(-ENOMEM);
ent->idx = -EINVAL;
ent->in = in;
ent->out = out;
ent->uout = uout;
@@ -91,10 +90,16 @@ static struct mlx5_cmd_work_ent *alloc_cmd(struct mlx5_cmd *cmd,
ent->context = context;
ent->cmd = cmd;
ent->page_queue = page_queue;
refcount_set(&ent->refcnt, 1);
return ent;
}
static void cmd_free_ent(struct mlx5_cmd_work_ent *ent)
{
kfree(ent);
}
static u8 alloc_token(struct mlx5_cmd *cmd)
{
u8 token;
@@ -109,7 +114,7 @@ static u8 alloc_token(struct mlx5_cmd *cmd)
return token;
}
static int alloc_ent(struct mlx5_cmd *cmd)
static int cmd_alloc_index(struct mlx5_cmd *cmd)
{
unsigned long flags;
int ret;
@@ -123,7 +128,7 @@ static int alloc_ent(struct mlx5_cmd *cmd)
return ret < cmd->max_reg_cmds ? ret : -ENOMEM;
}
static void free_ent(struct mlx5_cmd *cmd, int idx)
static void cmd_free_index(struct mlx5_cmd *cmd, int idx)
{
unsigned long flags;
@@ -132,6 +137,22 @@ static void free_ent(struct mlx5_cmd *cmd, int idx)
spin_unlock_irqrestore(&cmd->alloc_lock, flags);
}
static void cmd_ent_get(struct mlx5_cmd_work_ent *ent)
{
refcount_inc(&ent->refcnt);
}
static void cmd_ent_put(struct mlx5_cmd_work_ent *ent)
{
if (!refcount_dec_and_test(&ent->refcnt))
return;
if (ent->idx >= 0)
cmd_free_index(ent->cmd, ent->idx);
cmd_free_ent(ent);
}
static struct mlx5_cmd_layout *get_inst(struct mlx5_cmd *cmd, int idx)
{
return cmd->cmd_buf + (idx << cmd->log_stride);
@@ -219,11 +240,6 @@ static void poll_timeout(struct mlx5_cmd_work_ent *ent)
ent->ret = -ETIMEDOUT;
}
static void free_cmd(struct mlx5_cmd_work_ent *ent)
{
kfree(ent);
}
static int verify_signature(struct mlx5_cmd_work_ent *ent)
{
struct mlx5_cmd_mailbox *next = ent->out->next;
@@ -837,11 +853,22 @@ static void cb_timeout_handler(struct work_struct *work)
struct mlx5_core_dev *dev = container_of(ent->cmd, struct mlx5_core_dev,
cmd);
mlx5_cmd_eq_recover(dev);
/* Maybe it was already handled by EQ recovery? */
if (!test_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state)) {
mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) Async, recovered after timeout\n", ent->idx,
mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
goto out; /* phew, already handled */
}
ent->ret = -ETIMEDOUT;
mlx5_core_warn(dev, "%s(0x%x) timeout. Will cause a leak of a command resource\n",
mlx5_command_str(msg_to_opcode(ent->in)),
msg_to_opcode(ent->in));
mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) Async, timeout. Will cause a leak of a command resource\n",
ent->idx, mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
out:
cmd_ent_put(ent); /* for the cmd_ent_get() took on schedule delayed work */
}
static void free_msg(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *msg);
@@ -856,6 +883,32 @@ static bool opcode_allowed(struct mlx5_cmd *cmd, u16 opcode)
return cmd->allowed_opcode == opcode;
}
static int cmd_alloc_index_retry(struct mlx5_cmd *cmd)
{
unsigned long alloc_end = jiffies + msecs_to_jiffies(1000);
int idx;
retry:
idx = cmd_alloc_index(cmd);
if (idx < 0 && time_before(jiffies, alloc_end)) {
/* Index allocation can fail on heavy load of commands. This is a temporary
* situation as the current command already holds the semaphore, meaning that
* another command completion is being handled and it is expected to release
* the entry index soon.
*/
cpu_relax();
goto retry;
}
return idx;
}
bool mlx5_cmd_is_down(struct mlx5_core_dev *dev)
{
return pci_channel_offline(dev->pdev) ||
dev->cmd.state != MLX5_CMDIF_STATE_UP ||
dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR;
}
static void cmd_work_handler(struct work_struct *work)
{
struct mlx5_cmd_work_ent *ent = container_of(work, struct mlx5_cmd_work_ent, work);
@@ -873,14 +926,14 @@ static void cmd_work_handler(struct work_struct *work)
sem = ent->page_queue ? &cmd->pages_sem : &cmd->sem;
down(sem);
if (!ent->page_queue) {
alloc_ret = alloc_ent(cmd);
alloc_ret = cmd_alloc_index_retry(cmd);
if (alloc_ret < 0) {
mlx5_core_err_rl(dev, "failed to allocate command entry\n");
if (ent->callback) {
ent->callback(-EAGAIN, ent->context);
mlx5_free_cmd_msg(dev, ent->out);
free_msg(dev, ent->in);
free_cmd(ent);
cmd_ent_put(ent);
} else {
ent->ret = -EAGAIN;
complete(&ent->done);
@@ -916,15 +969,12 @@ static void cmd_work_handler(struct work_struct *work)
ent->ts1 = ktime_get_ns();
cmd_mode = cmd->mode;
if (ent->callback)
schedule_delayed_work(&ent->cb_timeout_work, cb_timeout);
if (ent->callback && schedule_delayed_work(&ent->cb_timeout_work, cb_timeout))
cmd_ent_get(ent);
set_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state);
/* Skip sending command to fw if internal error */
if (pci_channel_offline(dev->pdev) ||
dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR ||
cmd->state != MLX5_CMDIF_STATE_UP ||
!opcode_allowed(&dev->cmd, ent->op)) {
if (mlx5_cmd_is_down(dev) || !opcode_allowed(&dev->cmd, ent->op)) {
u8 status = 0;
u32 drv_synd;
@@ -933,13 +983,10 @@ static void cmd_work_handler(struct work_struct *work)
MLX5_SET(mbox_out, ent->out, syndrome, drv_synd);
mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
/* no doorbell, no need to keep the entry */
free_ent(cmd, ent->idx);
if (ent->callback)
free_cmd(ent);
return;
}
cmd_ent_get(ent); /* for the _real_ FW event on completion */
/* ring doorbell after the descriptor is valid */
mlx5_core_dbg(dev, "writing 0x%x to command doorbell\n", 1 << ent->idx);
wmb();
@@ -983,6 +1030,35 @@ static const char *deliv_status_to_str(u8 status)
}
}
enum {
MLX5_CMD_TIMEOUT_RECOVER_MSEC = 5 * 1000,
};
static void wait_func_handle_exec_timeout(struct mlx5_core_dev *dev,
struct mlx5_cmd_work_ent *ent)
{
unsigned long timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_RECOVER_MSEC);
mlx5_cmd_eq_recover(dev);
/* Re-wait on the ent->done after executing the recovery flow. If the
* recovery flow (or any other recovery flow running simultaneously)
* has recovered an EQE, it should cause the entry to be completed by
* the command interface.
*/
if (wait_for_completion_timeout(&ent->done, timeout)) {
mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) recovered after timeout\n", ent->idx,
mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
return;
}
mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) No done completion\n", ent->idx,
mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
ent->ret = -ETIMEDOUT;
mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
}
static int wait_func(struct mlx5_core_dev *dev, struct mlx5_cmd_work_ent *ent)
{
unsigned long timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_MSEC);
@@ -994,12 +1070,10 @@ static int wait_func(struct mlx5_core_dev *dev, struct mlx5_cmd_work_ent *ent)
ent->ret = -ECANCELED;
goto out_err;
}
if (cmd->mode == CMD_MODE_POLLING || ent->polling) {
if (cmd->mode == CMD_MODE_POLLING || ent->polling)
wait_for_completion(&ent->done);
} else if (!wait_for_completion_timeout(&ent->done, timeout)) {
ent->ret = -ETIMEDOUT;
mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
}
else if (!wait_for_completion_timeout(&ent->done, timeout))
wait_func_handle_exec_timeout(dev, ent);
out_err:
err = ent->ret;
@@ -1039,11 +1113,16 @@ static int mlx5_cmd_invoke(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *in,
if (callback && page_queue)
return -EINVAL;
ent = alloc_cmd(cmd, in, out, uout, uout_size, callback, context,
page_queue);
ent = cmd_alloc_ent(cmd, in, out, uout, uout_size,
callback, context, page_queue);
if (IS_ERR(ent))
return PTR_ERR(ent);
/* put for this ent is when consumed, depending on the use case
* 1) (!callback) blocking flow: by caller after wait_func completes
* 2) (callback) flow: by mlx5_cmd_comp_handler() when ent is handled
*/
ent->token = token;
ent->polling = force_polling;
@@ -1062,12 +1141,10 @@ static int mlx5_cmd_invoke(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *in,
}
if (callback)
goto out;
goto out; /* mlx5_cmd_comp_handler() will put(ent) */
err = wait_func(dev, ent);
if (err == -ETIMEDOUT)
goto out;
if (err == -ECANCELED)
if (err == -ETIMEDOUT || err == -ECANCELED)
goto out_free;
ds = ent->ts2 - ent->ts1;
@@ -1085,7 +1162,7 @@ static int mlx5_cmd_invoke(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *in,
*status = ent->status;
out_free:
free_cmd(ent);
cmd_ent_put(ent);
out:
return err;
}
@@ -1516,14 +1593,19 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force
if (!forced) {
mlx5_core_err(dev, "Command completion arrived after timeout (entry idx = %d).\n",
ent->idx);
free_ent(cmd, ent->idx);
free_cmd(ent);
cmd_ent_put(ent);
}
continue;
}
if (ent->callback)
cancel_delayed_work(&ent->cb_timeout_work);
if (ent->callback && cancel_delayed_work(&ent->cb_timeout_work))
cmd_ent_put(ent); /* timeout work was canceled */
if (!forced || /* Real FW completion */
pci_channel_offline(dev->pdev) || /* FW is inaccessible */
dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
cmd_ent_put(ent);
if (ent->page_queue)
sem = &cmd->pages_sem;
else
@@ -1545,10 +1627,6 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force
ent->ret, deliv_status_to_str(ent->status), ent->status);
}
/* only real completion will free the entry slot */
if (!forced)
free_ent(cmd, ent->idx);
if (ent->callback) {
ds = ent->ts2 - ent->ts1;
if (ent->op < MLX5_CMD_OP_MAX) {
@@ -1576,10 +1654,13 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force
free_msg(dev, ent->in);
err = err ? err : ent->status;
if (!forced)
free_cmd(ent);
/* final consumer is done, release ent */
cmd_ent_put(ent);
callback(err, context);
} else {
/* release wait_func() so mlx5_cmd_invoke()
* can make the final ent_put()
*/
complete(&ent->done);
}
up(sem);
@@ -1589,8 +1670,11 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force
void mlx5_cmd_trigger_completions(struct mlx5_core_dev *dev)
{
struct mlx5_cmd *cmd = &dev->cmd;
unsigned long bitmask;
unsigned long flags;
u64 vector;
int i;
/* wait for pending handlers to complete */
mlx5_eq_synchronize_cmd_irq(dev);
@@ -1599,11 +1683,20 @@ void mlx5_cmd_trigger_completions(struct mlx5_core_dev *dev)
if (!vector)
goto no_trig;
bitmask = vector;
/* we must increment the allocated entries' refcount before triggering the completions
* to guarantee pending commands will not get freed in the meantime.
* For that reason, it also has to be done inside the alloc_lock.
*/
for_each_set_bit(i, &bitmask, (1 << cmd->log_sz))
cmd_ent_get(cmd->ent_arr[i]);
vector |= MLX5_TRIGGERED_CMD_COMP;
spin_unlock_irqrestore(&dev->cmd.alloc_lock, flags);
mlx5_core_dbg(dev, "vector 0x%llx\n", vector);
mlx5_cmd_comp_handler(dev, vector, true);
for_each_set_bit(i, &bitmask, (1 << cmd->log_sz))
cmd_ent_put(cmd->ent_arr[i]);
return;
no_trig:
@@ -1711,10 +1804,7 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
u8 token;
opcode = MLX5_GET(mbox_in, in, opcode);
if (pci_channel_offline(dev->pdev) ||
dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR ||
dev->cmd.state != MLX5_CMDIF_STATE_UP ||
!opcode_allowed(&dev->cmd, opcode)) {
if (mlx5_cmd_is_down(dev) || !opcode_allowed(&dev->cmd, opcode)) {
err = mlx5_internal_err_ret_value(dev, opcode, &drv_synd, &status);
MLX5_SET(mbox_out, out, status, status);
MLX5_SET(mbox_out, out, syndrome, drv_synd);
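
The mlx5 command-interface rework above hangs every command entry off a refcount: the submitter, the timeout work and the firmware completion each hold a reference, and the entry plus its index are only released on the final put. A compact userspace sketch of the get/put discipline using C11 atomics; the names are illustrative, not the driver's.

/* illustrative sketch of refcounted command entries */
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct cmd_ent {
	atomic_int refcnt;
	int idx;
};

static struct cmd_ent *ent_alloc(int idx)
{
	struct cmd_ent *ent = malloc(sizeof(*ent));

	if (!ent)
		return NULL;
	atomic_init(&ent->refcnt, 1);	/* reference owned by the submitter */
	ent->idx = idx;
	return ent;
}

static void ent_get(struct cmd_ent *ent)
{
	atomic_fetch_add(&ent->refcnt, 1);
}

static void ent_put(struct cmd_ent *ent)
{
	if (atomic_fetch_sub(&ent->refcnt, 1) == 1) {
		printf("freeing entry %d\n", ent->idx);
		free(ent);	/* last user: release index + memory */
	}
}

int main(void)
{
	struct cmd_ent *ent = ent_alloc(3);

	if (!ent)
		return 1;
	ent_get(ent);	/* e.g. the timeout work takes its own reference */
	ent_put(ent);	/* timeout work canceled */
	ent_put(ent);	/* submitter done: entry freed here */
	return 0;
}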

View File

@@ -91,7 +91,12 @@ struct page_pool;
#define MLX5_MPWRQ_PAGES_PER_WQE BIT(MLX5_MPWRQ_WQE_PAGE_ORDER)
#define MLX5_MTT_OCTW(npages) (ALIGN(npages, 8) / 2)
#define MLX5E_REQUIRED_WQE_MTTS (ALIGN(MLX5_MPWRQ_PAGES_PER_WQE, 8))
/* Add another page to MLX5E_REQUIRED_WQE_MTTS as a buffer between
* WQEs. This page will absorb write overflow by the hardware when
* receiving packets larger than the MTU. These oversize packets are
* dropped by the driver at a later stage.
*/
#define MLX5E_REQUIRED_WQE_MTTS (ALIGN(MLX5_MPWRQ_PAGES_PER_WQE + 1, 8))
#define MLX5E_LOG_ALIGNED_MPWQE_PPW (ilog2(MLX5E_REQUIRED_WQE_MTTS))
#define MLX5E_REQUIRED_MTTS(wqes) (wqes * MLX5E_REQUIRED_WQE_MTTS)
#define MLX5E_MAX_RQ_NUM_MTTS \
@@ -617,6 +622,7 @@ struct mlx5e_rq {
u32 rqn;
struct mlx5_core_dev *mdev;
struct mlx5_core_mkey umr_mkey;
struct mlx5e_dma_info wqe_overflow;
/* XDP read-mostly */
struct xdp_rxq_info xdp_rxq;

View File

@@ -569,6 +569,9 @@ int mlx5e_set_fec_mode(struct mlx5_core_dev *dev, u16 fec_policy)
if (fec_policy >= (1 << MLX5E_FEC_LLRS_272_257_1) && !fec_50g_per_lane)
return -EOPNOTSUPP;
if (fec_policy && !mlx5e_fec_in_caps(dev, fec_policy))
return -EOPNOTSUPP;
MLX5_SET(pplm_reg, in, local_port, 1);
err = mlx5_core_access_reg(dev, in, sz, out, sz, MLX5_REG_PPLM, 0, 0);
if (err)

View File

@@ -110,11 +110,25 @@ static void mlx5e_rep_neigh_stats_work(struct work_struct *work)
rtnl_unlock();
}
struct neigh_update_work {
struct work_struct work;
struct neighbour *n;
struct mlx5e_neigh_hash_entry *nhe;
};
static void mlx5e_release_neigh_update_work(struct neigh_update_work *update_work)
{
neigh_release(update_work->n);
mlx5e_rep_neigh_entry_release(update_work->nhe);
kfree(update_work);
}
static void mlx5e_rep_neigh_update(struct work_struct *work)
{
struct mlx5e_neigh_hash_entry *nhe =
container_of(work, struct mlx5e_neigh_hash_entry, neigh_update_work);
struct neighbour *n = nhe->n;
struct neigh_update_work *update_work = container_of(work, struct neigh_update_work,
work);
struct mlx5e_neigh_hash_entry *nhe = update_work->nhe;
struct neighbour *n = update_work->n;
struct mlx5e_encap_entry *e;
unsigned char ha[ETH_ALEN];
struct mlx5e_priv *priv;
@@ -146,30 +160,42 @@ static void mlx5e_rep_neigh_update(struct work_struct *work)
mlx5e_rep_update_flows(priv, e, neigh_connected, ha);
mlx5e_encap_put(priv, e);
}
mlx5e_rep_neigh_entry_release(nhe);
rtnl_unlock();
neigh_release(n);
mlx5e_release_neigh_update_work(update_work);
}
static void mlx5e_rep_queue_neigh_update_work(struct mlx5e_priv *priv,
struct mlx5e_neigh_hash_entry *nhe,
struct neighbour *n)
static struct neigh_update_work *mlx5e_alloc_neigh_update_work(struct mlx5e_priv *priv,
struct neighbour *n)
{
/* Take a reference to ensure the neighbour and mlx5 encap
* entry won't be destructed until we drop the reference in
* delayed work.
*/
neigh_hold(n);
struct neigh_update_work *update_work;
struct mlx5e_neigh_hash_entry *nhe;
struct mlx5e_neigh m_neigh = {};
/* This assignment is valid as long as the the neigh reference
* is taken
*/
nhe->n = n;
update_work = kzalloc(sizeof(*update_work), GFP_ATOMIC);
if (WARN_ON(!update_work))
return NULL;
if (!queue_work(priv->wq, &nhe->neigh_update_work)) {
mlx5e_rep_neigh_entry_release(nhe);
neigh_release(n);
m_neigh.dev = n->dev;
m_neigh.family = n->ops->family;
memcpy(&m_neigh.dst_ip, n->primary_key, n->tbl->key_len);
/* Obtain reference to nhe as last step in order not to release it in
* atomic context.
*/
rcu_read_lock();
nhe = mlx5e_rep_neigh_entry_lookup(priv, &m_neigh);
rcu_read_unlock();
if (!nhe) {
kfree(update_work);
return NULL;
}
INIT_WORK(&update_work->work, mlx5e_rep_neigh_update);
neigh_hold(n);
update_work->n = n;
update_work->nhe = nhe;
return update_work;
}
static int mlx5e_rep_netevent_event(struct notifier_block *nb,
@@ -181,7 +207,7 @@ static int mlx5e_rep_netevent_event(struct notifier_block *nb,
struct net_device *netdev = rpriv->netdev;
struct mlx5e_priv *priv = netdev_priv(netdev);
struct mlx5e_neigh_hash_entry *nhe = NULL;
struct mlx5e_neigh m_neigh = {};
struct neigh_update_work *update_work;
struct neigh_parms *p;
struct neighbour *n;
bool found = false;
@@ -196,17 +222,11 @@ static int mlx5e_rep_netevent_event(struct notifier_block *nb,
#endif
return NOTIFY_DONE;
m_neigh.dev = n->dev;
m_neigh.family = n->ops->family;
memcpy(&m_neigh.dst_ip, n->primary_key, n->tbl->key_len);
rcu_read_lock();
nhe = mlx5e_rep_neigh_entry_lookup(priv, &m_neigh);
rcu_read_unlock();
if (!nhe)
update_work = mlx5e_alloc_neigh_update_work(priv, n);
if (!update_work)
return NOTIFY_DONE;
mlx5e_rep_queue_neigh_update_work(priv, nhe, n);
queue_work(priv->wq, &update_work->work);
break;
case NETEVENT_DELAY_PROBE_TIME_UPDATE:
@@ -352,7 +372,6 @@ int mlx5e_rep_neigh_entry_create(struct mlx5e_priv *priv,
(*nhe)->priv = priv;
memcpy(&(*nhe)->m_neigh, &e->m_neigh, sizeof(e->m_neigh));
INIT_WORK(&(*nhe)->neigh_update_work, mlx5e_rep_neigh_update);
spin_lock_init(&(*nhe)->encap_list_lock);
INIT_LIST_HEAD(&(*nhe)->encap_list);
refcount_set(&(*nhe)->refcnt, 1);

View File

@@ -246,8 +246,10 @@ mlx5_tc_ct_rule_to_tuple_nat(struct mlx5_ct_tuple *tuple,
case FLOW_ACT_MANGLE_HDR_TYPE_IP6:
ip6_offset = (offset - offsetof(struct ipv6hdr, saddr));
ip6_offset /= 4;
if (ip6_offset < 8)
if (ip6_offset < 4)
tuple->ip.src_v6.s6_addr32[ip6_offset] = cpu_to_be32(val);
else if (ip6_offset < 8)
tuple->ip.dst_v6.s6_addr32[ip6_offset - 4] = cpu_to_be32(val);
else
return -EOPNOTSUPP;
break;

View File

@@ -217,6 +217,9 @@ static int __mlx5e_add_vlan_rule(struct mlx5e_priv *priv,
break;
}
if (WARN_ONCE(*rule_p, "VLAN rule already exists type %d", rule_type))
return 0;
*rule_p = mlx5_add_flow_rules(ft, spec, &flow_act, &dest, 1);
if (IS_ERR(*rule_p)) {
@@ -397,8 +400,7 @@ static void mlx5e_add_vlan_rules(struct mlx5e_priv *priv)
for_each_set_bit(i, priv->fs.vlan.active_svlans, VLAN_N_VID)
mlx5e_add_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_MATCH_STAG_VID, i);
if (priv->fs.vlan.cvlan_filter_disabled &&
!(priv->netdev->flags & IFF_PROMISC))
if (priv->fs.vlan.cvlan_filter_disabled)
mlx5e_add_any_vid_rules(priv);
}
@@ -415,8 +417,12 @@ static void mlx5e_del_vlan_rules(struct mlx5e_priv *priv)
for_each_set_bit(i, priv->fs.vlan.active_svlans, VLAN_N_VID)
mlx5e_del_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_MATCH_STAG_VID, i);
if (priv->fs.vlan.cvlan_filter_disabled &&
!(priv->netdev->flags & IFF_PROMISC))
WARN_ON_ONCE(!(test_bit(MLX5E_STATE_DESTROYING, &priv->state)));
/* must be called after DESTROY bit is set and
* set_rx_mode is called and flushed
*/
if (priv->fs.vlan.cvlan_filter_disabled)
mlx5e_del_any_vid_rules(priv);
}

View File

@@ -246,12 +246,17 @@ static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq,
static int mlx5e_create_umr_mkey(struct mlx5_core_dev *mdev,
u64 npages, u8 page_shift,
struct mlx5_core_mkey *umr_mkey)
struct mlx5_core_mkey *umr_mkey,
dma_addr_t filler_addr)
{
int inlen = MLX5_ST_SZ_BYTES(create_mkey_in);
struct mlx5_mtt *mtt;
int inlen;
void *mkc;
u32 *in;
int err;
int i;
inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + sizeof(*mtt) * npages;
in = kvzalloc(inlen, GFP_KERNEL);
if (!in)
@@ -271,6 +276,18 @@ static int mlx5e_create_umr_mkey(struct mlx5_core_dev *mdev,
MLX5_SET(mkc, mkc, translations_octword_size,
MLX5_MTT_OCTW(npages));
MLX5_SET(mkc, mkc, log_page_size, page_shift);
MLX5_SET(create_mkey_in, in, translations_octword_actual_size,
MLX5_MTT_OCTW(npages));
/* Initialize the mkey with all MTTs pointing to a default
* page (filler_addr). When the channels are activated, UMR
* WQEs will redirect the RX WQEs to the actual memory from
* the RQ's pool, while the gaps (wqe_overflow) remain mapped
* to the default page.
*/
mtt = MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt);
for (i = 0 ; i < npages ; i++)
mtt[i].ptag = cpu_to_be64(filler_addr);
err = mlx5_core_create_mkey(mdev, umr_mkey, in, inlen);
@@ -282,7 +299,8 @@ static int mlx5e_create_rq_umr_mkey(struct mlx5_core_dev *mdev, struct mlx5e_rq
{
u64 num_mtts = MLX5E_REQUIRED_MTTS(mlx5_wq_ll_get_size(&rq->mpwqe.wq));
return mlx5e_create_umr_mkey(mdev, num_mtts, PAGE_SHIFT, &rq->umr_mkey);
return mlx5e_create_umr_mkey(mdev, num_mtts, PAGE_SHIFT, &rq->umr_mkey,
rq->wqe_overflow.addr);
}
static inline u64 mlx5e_get_mpwqe_offset(struct mlx5e_rq *rq, u16 wqe_ix)
@@ -350,6 +368,28 @@ static void mlx5e_rq_err_cqe_work(struct work_struct *recover_work)
mlx5e_reporter_rq_cqe_err(rq);
}
static int mlx5e_alloc_mpwqe_rq_drop_page(struct mlx5e_rq *rq)
{
rq->wqe_overflow.page = alloc_page(GFP_KERNEL);
if (!rq->wqe_overflow.page)
return -ENOMEM;
rq->wqe_overflow.addr = dma_map_page(rq->pdev, rq->wqe_overflow.page, 0,
PAGE_SIZE, rq->buff.map_dir);
if (dma_mapping_error(rq->pdev, rq->wqe_overflow.addr)) {
__free_page(rq->wqe_overflow.page);
return -ENOMEM;
}
return 0;
}
static void mlx5e_free_mpwqe_rq_drop_page(struct mlx5e_rq *rq)
{
dma_unmap_page(rq->pdev, rq->wqe_overflow.addr, PAGE_SIZE,
rq->buff.map_dir);
__free_page(rq->wqe_overflow.page);
}
static int mlx5e_alloc_rq(struct mlx5e_channel *c,
struct mlx5e_params *params,
struct mlx5e_xsk_param *xsk,
@@ -396,7 +436,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
rq_xdp_ix += params->num_channels * MLX5E_RQ_GROUP_XSK;
err = xdp_rxq_info_reg(&rq->xdp_rxq, rq->netdev, rq_xdp_ix);
if (err < 0)
goto err_rq_wq_destroy;
goto err_rq_xdp_prog;
rq->buff.map_dir = params->xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE;
rq->buff.headroom = mlx5e_get_rq_headroom(mdev, params, xsk);
@@ -406,6 +446,10 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
err = mlx5_wq_ll_create(mdev, &rqp->wq, rqc_wq, &rq->mpwqe.wq,
&rq->wq_ctrl);
if (err)
goto err_rq_xdp;
err = mlx5e_alloc_mpwqe_rq_drop_page(rq);
if (err)
goto err_rq_wq_destroy;
@@ -424,18 +468,18 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
err = mlx5e_create_rq_umr_mkey(mdev, rq);
if (err)
goto err_rq_wq_destroy;
goto err_rq_drop_page;
rq->mkey_be = cpu_to_be32(rq->umr_mkey.key);
err = mlx5e_rq_alloc_mpwqe_info(rq, c);
if (err)
goto err_free;
goto err_rq_mkey;
break;
default: /* MLX5_WQ_TYPE_CYCLIC */
err = mlx5_wq_cyc_create(mdev, &rqp->wq, rqc_wq, &rq->wqe.wq,
&rq->wq_ctrl);
if (err)
goto err_rq_wq_destroy;
goto err_rq_xdp;
rq->wqe.wq.db = &rq->wqe.wq.db[MLX5_RCV_DBR];
@@ -450,19 +494,19 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
GFP_KERNEL, cpu_to_node(c->cpu));
if (!rq->wqe.frags) {
err = -ENOMEM;
goto err_free;
goto err_rq_wq_destroy;
}
err = mlx5e_init_di_list(rq, wq_sz, c->cpu);
if (err)
goto err_free;
goto err_rq_frags;
rq->mkey_be = c->mkey_be;
}
err = mlx5e_rq_set_handlers(rq, params, xsk);
if (err)
goto err_free;
goto err_free_by_rq_type;
if (xsk) {
err = xdp_rxq_info_reg_mem_model(&rq->xdp_rxq,
@@ -486,13 +530,13 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
if (IS_ERR(rq->page_pool)) {
err = PTR_ERR(rq->page_pool);
rq->page_pool = NULL;
goto err_free;
goto err_free_by_rq_type;
}
err = xdp_rxq_info_reg_mem_model(&rq->xdp_rxq,
MEM_TYPE_PAGE_POOL, rq->page_pool);
}
if (err)
goto err_free;
goto err_free_by_rq_type;
for (i = 0; i < wq_sz; i++) {
if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
@@ -542,23 +586,27 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
return 0;
err_free:
err_free_by_rq_type:
switch (rq->wq_type) {
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
kvfree(rq->mpwqe.info);
err_rq_mkey:
mlx5_core_destroy_mkey(mdev, &rq->umr_mkey);
err_rq_drop_page:
mlx5e_free_mpwqe_rq_drop_page(rq);
break;
default: /* MLX5_WQ_TYPE_CYCLIC */
kvfree(rq->wqe.frags);
mlx5e_free_di_list(rq);
err_rq_frags:
kvfree(rq->wqe.frags);
}
err_rq_wq_destroy:
mlx5_wq_destroy(&rq->wq_ctrl);
err_rq_xdp:
xdp_rxq_info_unreg(&rq->xdp_rxq);
err_rq_xdp_prog:
if (params->xdp_prog)
bpf_prog_put(params->xdp_prog);
xdp_rxq_info_unreg(&rq->xdp_rxq);
page_pool_destroy(rq->page_pool);
mlx5_wq_destroy(&rq->wq_ctrl);
return err;
}
@@ -580,6 +628,7 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
kvfree(rq->mpwqe.info);
mlx5_core_destroy_mkey(rq->mdev, &rq->umr_mkey);
mlx5e_free_mpwqe_rq_drop_page(rq);
break;
default: /* MLX5_WQ_TYPE_CYCLIC */
kvfree(rq->wqe.frags);
@@ -4177,6 +4226,21 @@ int mlx5e_get_vf_stats(struct net_device *dev,
}
#endif
static bool mlx5e_gre_tunnel_inner_proto_offload_supported(struct mlx5_core_dev *mdev,
struct sk_buff *skb)
{
switch (skb->inner_protocol) {
case htons(ETH_P_IP):
case htons(ETH_P_IPV6):
case htons(ETH_P_TEB):
return true;
case htons(ETH_P_MPLS_UC):
case htons(ETH_P_MPLS_MC):
return MLX5_CAP_ETH(mdev, tunnel_stateless_mpls_over_gre);
}
return false;
}
static netdev_features_t mlx5e_tunnel_features_check(struct mlx5e_priv *priv,
struct sk_buff *skb,
netdev_features_t features)
@@ -4199,7 +4263,9 @@ static netdev_features_t mlx5e_tunnel_features_check(struct mlx5e_priv *priv,
switch (proto) {
case IPPROTO_GRE:
return features;
if (mlx5e_gre_tunnel_inner_proto_offload_supported(priv->mdev, skb))
return features;
break;
case IPPROTO_IPIP:
case IPPROTO_IPV6:
if (mlx5e_tunnel_proto_supported(priv->mdev, IPPROTO_IPIP))
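
The en_main.c hunks above allocate a dedicated wqe_overflow page and initialize every MTT to point at it, so oversized receives spill into that page instead of unrelated memory until real buffers are plugged in by UMR. A loose userspace analogue of the filler-entry idea; all names here are hypothetical.

/* illustrative sketch: default every slot to a shared scratch buffer */
#include <stdio.h>
#include <string.h>

#define SLOTS 8
#define BUF_SZ 64

static char scratch[BUF_SZ];		/* the "filler" page */
static char real_buf[BUF_SZ];

int main(void)
{
	char *table[SLOTS];
	int i;

	for (i = 0; i < SLOTS; i++)
		table[i] = scratch;	/* default: absorb stray writes */

	table[0] = real_buf;		/* slot actually backed by data */

	memcpy(table[0], "frame", 6);	/* normal receive path */
	memcpy(table[5], "junk", 5);	/* overflow write: hits scratch only */

	printf("real: %s, scratch: %s\n", real_buf, scratch);
	return 0;
}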

View File

@@ -135,12 +135,6 @@ struct mlx5e_neigh_hash_entry {
/* encap list sharing the same neigh */
struct list_head encap_list;
/* valid only when the neigh reference is taken during
* neigh_update_work workqueue callback.
*/
struct neighbour *n;
struct work_struct neigh_update_work;
/* neigh hash entry can be deleted only when the refcount is zero.
* refcount is needed to avoid neigh hash entry removal by TC, while
* it's used by the neigh notification call.

View File

@@ -189,6 +189,29 @@ u32 mlx5_eq_poll_irq_disabled(struct mlx5_eq_comp *eq)
return count_eqe;
}
static void mlx5_eq_async_int_lock(struct mlx5_eq_async *eq, unsigned long *flags)
__acquires(&eq->lock)
{
if (in_irq())
spin_lock(&eq->lock);
else
spin_lock_irqsave(&eq->lock, *flags);
}
static void mlx5_eq_async_int_unlock(struct mlx5_eq_async *eq, unsigned long *flags)
__releases(&eq->lock)
{
if (in_irq())
spin_unlock(&eq->lock);
else
spin_unlock_irqrestore(&eq->lock, *flags);
}
enum async_eq_nb_action {
ASYNC_EQ_IRQ_HANDLER = 0,
ASYNC_EQ_RECOVER = 1,
};
static int mlx5_eq_async_int(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -198,11 +221,14 @@ static int mlx5_eq_async_int(struct notifier_block *nb,
struct mlx5_eq_table *eqt;
struct mlx5_core_dev *dev;
struct mlx5_eqe *eqe;
unsigned long flags;
int num_eqes = 0;
dev = eq->dev;
eqt = dev->priv.eq_table;
mlx5_eq_async_int_lock(eq_async, &flags);
eqe = next_eqe_sw(eq);
if (!eqe)
goto out;
@@ -223,8 +249,19 @@ static int mlx5_eq_async_int(struct notifier_block *nb,
out:
eq_update_ci(eq, 1);
mlx5_eq_async_int_unlock(eq_async, &flags);
return 0;
return unlikely(action == ASYNC_EQ_RECOVER) ? num_eqes : 0;
}
void mlx5_cmd_eq_recover(struct mlx5_core_dev *dev)
{
struct mlx5_eq_async *eq = &dev->priv.eq_table->cmd_eq;
int eqes;
eqes = mlx5_eq_async_int(&eq->irq_nb, ASYNC_EQ_RECOVER, NULL);
if (eqes)
mlx5_core_warn(dev, "Recovered %d EQEs on cmd_eq\n", eqes);
}
static void init_eq_buf(struct mlx5_eq *eq)
@@ -569,6 +606,7 @@ setup_async_eq(struct mlx5_core_dev *dev, struct mlx5_eq_async *eq,
int err;
eq->irq_nb.notifier_call = mlx5_eq_async_int;
spin_lock_init(&eq->lock);
err = create_async_eq(dev, &eq->core, param);
if (err) {
@@ -656,8 +694,10 @@ static void destroy_async_eqs(struct mlx5_core_dev *dev)
cleanup_async_eq(dev, &table->pages_eq, "pages");
cleanup_async_eq(dev, &table->async_eq, "async");
mlx5_cmd_allowed_opcode(dev, MLX5_CMD_OP_DESTROY_EQ);
mlx5_cmd_use_polling(dev);
cleanup_async_eq(dev, &table->cmd_eq, "cmd");
mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
}

View File

@@ -37,6 +37,7 @@ struct mlx5_eq {
struct mlx5_eq_async {
struct mlx5_eq core;
struct notifier_block irq_nb;
spinlock_t lock; /* To avoid irq EQ handle races with resiliency flows */
};
struct mlx5_eq_comp {
@@ -81,6 +82,7 @@ void mlx5_cq_tasklet_cb(unsigned long data);
struct cpumask *mlx5_eq_comp_cpumask(struct mlx5_core_dev *dev, int ix);
u32 mlx5_eq_poll_irq_disabled(struct mlx5_eq_comp *eq);
void mlx5_cmd_eq_recover(struct mlx5_core_dev *dev);
void mlx5_eq_synchronize_async_irq(struct mlx5_core_dev *dev);
void mlx5_eq_synchronize_cmd_irq(struct mlx5_core_dev *dev);

View File

@@ -432,7 +432,7 @@ static int reclaim_pages_cmd(struct mlx5_core_dev *dev,
u32 npages;
u32 i = 0;
if (dev->state != MLX5_DEVICE_STATE_INTERNAL_ERROR)
if (!mlx5_cmd_is_down(dev))
return mlx5_cmd_exec(dev, in, in_size, out, out_size);
/* No hard feelings, we want our pages back! */

View File

@@ -115,7 +115,7 @@ static int request_irqs(struct mlx5_core_dev *dev, int nvec)
return 0;
err_request_irq:
for (; i >= 0; i--) {
while (i--) {
struct mlx5_irq *irq = mlx5_irq_get(dev, i);
int irqn = pci_irq_vector(dev->pdev, i);
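
The one-line IRQ cleanup fix above is a classic unwind off-by-one: after request i fails, only vectors 0..i-1 were set up, which is exactly what while (i--) walks; the old for (; i >= 0; i--) would also have touched index i. A tiny sketch with invented names:

/* illustrative sketch of unwinding only what was acquired */
#include <stdbool.h>
#include <stdio.h>

#define NVEC 4

static bool acquired[NVEC];

static int acquire(int i)
{
	if (i == 2)
		return -1;	/* pretend vector 2 fails */
	acquired[i] = true;
	return 0;
}

int main(void)
{
	int i;

	for (i = 0; i < NVEC; i++) {
		if (acquire(i))
			break;
	}

	if (i < NVEC) {
		/* correct unwind: releases i-1 .. 0, never index i itself */
		while (i--) {
			printf("release %d (was acquired: %d)\n", i, acquired[i]);
			acquired[i] = false;
		}
	}
	return 0;
}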

View File

@@ -3690,13 +3690,13 @@ bool mlxsw_sp_port_dev_check(const struct net_device *dev)
return dev->netdev_ops == &mlxsw_sp_port_netdev_ops;
}
static int mlxsw_sp_lower_dev_walk(struct net_device *lower_dev, void *data)
static int mlxsw_sp_lower_dev_walk(struct net_device *lower_dev,
struct netdev_nested_priv *priv)
{
struct mlxsw_sp_port **p_mlxsw_sp_port = data;
int ret = 0;
if (mlxsw_sp_port_dev_check(lower_dev)) {
*p_mlxsw_sp_port = netdev_priv(lower_dev);
priv->data = (void *)netdev_priv(lower_dev);
ret = 1;
}
@@ -3705,15 +3705,16 @@ static int mlxsw_sp_lower_dev_walk(struct net_device *lower_dev, void *data)
struct mlxsw_sp_port *mlxsw_sp_port_dev_lower_find(struct net_device *dev)
{
struct mlxsw_sp_port *mlxsw_sp_port;
struct netdev_nested_priv priv = {
.data = NULL,
};
if (mlxsw_sp_port_dev_check(dev))
return netdev_priv(dev);
mlxsw_sp_port = NULL;
netdev_walk_all_lower_dev(dev, mlxsw_sp_lower_dev_walk, &mlxsw_sp_port);
netdev_walk_all_lower_dev(dev, mlxsw_sp_lower_dev_walk, &priv);
return mlxsw_sp_port;
return (struct mlxsw_sp_port *)priv.data;
}
struct mlxsw_sp *mlxsw_sp_lower_get(struct net_device *dev)
@@ -3726,16 +3727,17 @@ struct mlxsw_sp *mlxsw_sp_lower_get(struct net_device *dev)
struct mlxsw_sp_port *mlxsw_sp_port_dev_lower_find_rcu(struct net_device *dev)
{
struct mlxsw_sp_port *mlxsw_sp_port;
struct netdev_nested_priv priv = {
.data = NULL,
};
if (mlxsw_sp_port_dev_check(dev))
return netdev_priv(dev);
mlxsw_sp_port = NULL;
netdev_walk_all_lower_dev_rcu(dev, mlxsw_sp_lower_dev_walk,
&mlxsw_sp_port);
&priv);
return mlxsw_sp_port;
return (struct mlxsw_sp_port *)priv.data;
}
struct mlxsw_sp_port *mlxsw_sp_port_lower_dev_hold(struct net_device *dev)

View File

@@ -292,13 +292,14 @@ mlxsw_sp_acl_tcam_group_add(struct mlxsw_sp_acl_tcam *tcam,
int err;
group->tcam = tcam;
mutex_init(&group->lock);
INIT_LIST_HEAD(&group->region_list);
err = mlxsw_sp_acl_tcam_group_id_get(tcam, &group->id);
if (err)
return err;
mutex_init(&group->lock);
return 0;
}

View File

@@ -7351,9 +7351,10 @@ int mlxsw_sp_netdevice_vrf_event(struct net_device *l3_dev, unsigned long event,
return err;
}
static int __mlxsw_sp_rif_macvlan_flush(struct net_device *dev, void *data)
static int __mlxsw_sp_rif_macvlan_flush(struct net_device *dev,
struct netdev_nested_priv *priv)
{
struct mlxsw_sp_rif *rif = data;
struct mlxsw_sp_rif *rif = (struct mlxsw_sp_rif *)priv->data;
if (!netif_is_macvlan(dev))
return 0;
@@ -7364,12 +7365,16 @@ static int __mlxsw_sp_rif_macvlan_flush(struct net_device *dev, void *data)
static int mlxsw_sp_rif_macvlan_flush(struct mlxsw_sp_rif *rif)
{
struct netdev_nested_priv priv = {
.data = (void *)rif,
};
if (!netif_is_macvlan_port(rif->dev))
return 0;
netdev_warn(rif->dev, "Router interface is deleted. Upper macvlans will not work\n");
return netdev_walk_all_upper_dev_rcu(rif->dev,
__mlxsw_sp_rif_macvlan_flush, rif);
__mlxsw_sp_rif_macvlan_flush, &priv);
}
static void mlxsw_sp_rif_subport_setup(struct mlxsw_sp_rif *rif,

View File

@@ -136,9 +136,9 @@ bool mlxsw_sp_bridge_device_is_offloaded(const struct mlxsw_sp *mlxsw_sp,
}
static int mlxsw_sp_bridge_device_upper_rif_destroy(struct net_device *dev,
void *data)
struct netdev_nested_priv *priv)
{
struct mlxsw_sp *mlxsw_sp = data;
struct mlxsw_sp *mlxsw_sp = priv->data;
mlxsw_sp_rif_destroy_by_dev(mlxsw_sp, dev);
return 0;
@@ -147,10 +147,14 @@ static int mlxsw_sp_bridge_device_upper_rif_destroy(struct net_device *dev,
static void mlxsw_sp_bridge_device_rifs_destroy(struct mlxsw_sp *mlxsw_sp,
struct net_device *dev)
{
struct netdev_nested_priv priv = {
.data = (void *)mlxsw_sp,
};
mlxsw_sp_rif_destroy_by_dev(mlxsw_sp, dev);
netdev_walk_all_upper_dev_rcu(dev,
mlxsw_sp_bridge_device_upper_rif_destroy,
mlxsw_sp);
&priv);
}
static int mlxsw_sp_bridge_device_vxlan_init(struct mlxsw_sp_bridge *bridge,

View File

@@ -1253,7 +1253,7 @@ void ocelot_port_set_maxlen(struct ocelot *ocelot, int port, size_t sdu)
struct ocelot_port *ocelot_port = ocelot->ports[port];
int maxlen = sdu + ETH_HLEN + ETH_FCS_LEN;
int pause_start, pause_stop;
int atop_wm;
int atop, atop_tot;
if (port == ocelot->npi) {
maxlen += OCELOT_TAG_LEN;
@@ -1274,12 +1274,12 @@ void ocelot_port_set_maxlen(struct ocelot *ocelot, int port, size_t sdu)
ocelot_fields_write(ocelot, port, SYS_PAUSE_CFG_PAUSE_STOP,
pause_stop);
/* Tail dropping watermark */
atop_wm = (ocelot->shared_queue_sz - 9 * maxlen) /
/* Tail dropping watermarks */
atop_tot = (ocelot->shared_queue_sz - 9 * maxlen) /
OCELOT_BUFFER_CELL_SZ;
ocelot_write_rix(ocelot, ocelot->ops->wm_enc(9 * maxlen),
SYS_ATOP, port);
ocelot_write(ocelot, ocelot->ops->wm_enc(atop_wm), SYS_ATOP_TOT_CFG);
atop = (9 * maxlen) / OCELOT_BUFFER_CELL_SZ;
ocelot_write_rix(ocelot, ocelot->ops->wm_enc(atop), SYS_ATOP, port);
ocelot_write(ocelot, ocelot->ops->wm_enc(atop_tot), SYS_ATOP_TOT_CFG);
}
EXPORT_SYMBOL(ocelot_port_set_maxlen);

View File

@@ -745,6 +745,8 @@ static int ocelot_reset(struct ocelot *ocelot)
*/
static u16 ocelot_wm_enc(u16 value)
{
WARN_ON(value >= 16 * BIT(8));
if (value >= BIT(8))
return BIT(8) | (value / 16);
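
For context on the added WARN_ON: the ocelot watermark encoder stores values below 256 verbatim and larger values with bit 8 set in 16-unit steps, so anything at or above 16 * 256 cannot be encoded. A small illustrative sketch of that encoding, not the driver code itself:

/* illustrative sketch of the watermark encoding */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

static uint16_t wm_enc(uint16_t value)
{
	assert(value < 16 * 256);	/* driver warns on unencodable input */

	if (value >= 256)
		return 0x100 | (value / 16);	/* coarse, 16-unit steps */
	return value;
}

int main(void)
{
	printf("wm_enc(100) = 0x%x\n", wm_enc(100));	/* 0x064: stored as-is */
	printf("wm_enc(300) = 0x%x\n", wm_enc(300));	/* 0x112: bit 8 + 18, i.e. 288 */
	return 0;
}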

View File

@@ -2058,11 +2058,18 @@ static void rtl_release_firmware(struct rtl8169_private *tp)
void r8169_apply_firmware(struct rtl8169_private *tp)
{
int val;
/* TODO: release firmware if rtl_fw_write_firmware signals failure. */
if (tp->rtl_fw) {
rtl_fw_write_firmware(tp, tp->rtl_fw);
/* At least one firmware doesn't reset tp->ocp_base. */
tp->ocp_base = OCP_STD_PHY_BASE;
/* PHY soft reset may still be in progress */
phy_read_poll_timeout(tp->phydev, MII_BMCR, val,
!(val & BMCR_RESET),
50000, 600000, true);
}
}
@@ -2239,14 +2246,10 @@ static void rtl_pll_power_down(struct rtl8169_private *tp)
default:
break;
}
clk_disable_unprepare(tp->clk);
}
static void rtl_pll_power_up(struct rtl8169_private *tp)
{
clk_prepare_enable(tp->clk);
switch (tp->mac_version) {
case RTL_GIGA_MAC_VER_25 ... RTL_GIGA_MAC_VER_33:
case RTL_GIGA_MAC_VER_37:
@@ -2904,7 +2907,7 @@ static void rtl_hw_start_8168f_1(struct rtl8169_private *tp)
{ 0x08, 0x0001, 0x0002 },
{ 0x09, 0x0000, 0x0080 },
{ 0x19, 0x0000, 0x0224 },
{ 0x00, 0x0000, 0x0004 },
{ 0x00, 0x0000, 0x0008 },
{ 0x0c, 0x3df0, 0x0200 },
};
@@ -2921,7 +2924,7 @@ static void rtl_hw_start_8411(struct rtl8169_private *tp)
{ 0x06, 0x00c0, 0x0020 },
{ 0x0f, 0xffff, 0x5200 },
{ 0x19, 0x0000, 0x0224 },
{ 0x00, 0x0000, 0x0004 },
{ 0x00, 0x0000, 0x0008 },
{ 0x0c, 0x3df0, 0x0200 },
};
@@ -4826,21 +4829,8 @@ static void rtl8169_net_suspend(struct rtl8169_private *tp)
#ifdef CONFIG_PM
static int __maybe_unused rtl8169_suspend(struct device *device)
static int rtl8169_net_resume(struct rtl8169_private *tp)
{
struct rtl8169_private *tp = dev_get_drvdata(device);
rtnl_lock();
rtl8169_net_suspend(tp);
rtnl_unlock();
return 0;
}
static int rtl8169_resume(struct device *device)
{
struct rtl8169_private *tp = dev_get_drvdata(device);
rtl_rar_set(tp, tp->dev->dev_addr);
if (tp->TxDescArray)
@@ -4851,6 +4841,33 @@ static int rtl8169_resume(struct device *device)
return 0;
}
static int __maybe_unused rtl8169_suspend(struct device *device)
{
struct rtl8169_private *tp = dev_get_drvdata(device);
rtnl_lock();
rtl8169_net_suspend(tp);
if (!device_may_wakeup(tp_to_dev(tp)))
clk_disable_unprepare(tp->clk);
rtnl_unlock();
return 0;
}
static int __maybe_unused rtl8169_resume(struct device *device)
{
struct rtl8169_private *tp = dev_get_drvdata(device);
if (!device_may_wakeup(tp_to_dev(tp)))
clk_prepare_enable(tp->clk);
/* Reportedly at least Asus X453MA truncates packets otherwise */
if (tp->mac_version == RTL_GIGA_MAC_VER_37)
rtl_init_rxcfg(tp);
return rtl8169_net_resume(tp);
}
static int rtl8169_runtime_suspend(struct device *device)
{
struct rtl8169_private *tp = dev_get_drvdata(device);
@@ -4874,7 +4891,7 @@ static int rtl8169_runtime_resume(struct device *device)
__rtl8169_set_wol(tp, tp->saved_wolopts);
return rtl8169_resume(device);
return rtl8169_net_resume(tp);
}
static int rtl8169_runtime_idle(struct device *device)

View File

@@ -1342,51 +1342,6 @@ static inline int ravb_hook_irq(unsigned int irq, irq_handler_t handler,
return error;
}
/* MDIO bus init function */
static int ravb_mdio_init(struct ravb_private *priv)
{
struct platform_device *pdev = priv->pdev;
struct device *dev = &pdev->dev;
int error;
/* Bitbang init */
priv->mdiobb.ops = &bb_ops;
/* MII controller setting */
priv->mii_bus = alloc_mdio_bitbang(&priv->mdiobb);
if (!priv->mii_bus)
return -ENOMEM;
/* Hook up MII support for ethtool */
priv->mii_bus->name = "ravb_mii";
priv->mii_bus->parent = dev;
snprintf(priv->mii_bus->id, MII_BUS_ID_SIZE, "%s-%x",
pdev->name, pdev->id);
/* Register MDIO bus */
error = of_mdiobus_register(priv->mii_bus, dev->of_node);
if (error)
goto out_free_bus;
return 0;
out_free_bus:
free_mdio_bitbang(priv->mii_bus);
return error;
}
/* MDIO bus release function */
static int ravb_mdio_release(struct ravb_private *priv)
{
/* Unregister mdio bus */
mdiobus_unregister(priv->mii_bus);
/* Free bitbang info */
free_mdio_bitbang(priv->mii_bus);
return 0;
}
/* Network device open function for Ethernet AVB */
static int ravb_open(struct net_device *ndev)
{
@@ -1395,13 +1350,6 @@ static int ravb_open(struct net_device *ndev)
struct device *dev = &pdev->dev;
int error;
/* MDIO bus init */
error = ravb_mdio_init(priv);
if (error) {
netdev_err(ndev, "failed to initialize MDIO\n");
return error;
}
napi_enable(&priv->napi[RAVB_BE]);
napi_enable(&priv->napi[RAVB_NC]);
@@ -1479,7 +1427,6 @@ out_free_irq:
out_napi_off:
napi_disable(&priv->napi[RAVB_NC]);
napi_disable(&priv->napi[RAVB_BE]);
ravb_mdio_release(priv);
return error;
}
@@ -1789,8 +1736,6 @@ static int ravb_close(struct net_device *ndev)
ravb_ring_free(ndev, RAVB_BE);
ravb_ring_free(ndev, RAVB_NC);
ravb_mdio_release(priv);
return 0;
}
@@ -1942,6 +1887,51 @@ static const struct net_device_ops ravb_netdev_ops = {
.ndo_set_features = ravb_set_features,
};
/* MDIO bus init function */
static int ravb_mdio_init(struct ravb_private *priv)
{
struct platform_device *pdev = priv->pdev;
struct device *dev = &pdev->dev;
int error;
/* Bitbang init */
priv->mdiobb.ops = &bb_ops;
/* MII controller setting */
priv->mii_bus = alloc_mdio_bitbang(&priv->mdiobb);
if (!priv->mii_bus)
return -ENOMEM;
/* Hook up MII support for ethtool */
priv->mii_bus->name = "ravb_mii";
priv->mii_bus->parent = dev;
snprintf(priv->mii_bus->id, MII_BUS_ID_SIZE, "%s-%x",
pdev->name, pdev->id);
/* Register MDIO bus */
error = of_mdiobus_register(priv->mii_bus, dev->of_node);
if (error)
goto out_free_bus;
return 0;
out_free_bus:
free_mdio_bitbang(priv->mii_bus);
return error;
}
/* MDIO bus release function */
static int ravb_mdio_release(struct ravb_private *priv)
{
/* Unregister mdio bus */
mdiobus_unregister(priv->mii_bus);
/* Free bitbang info */
free_mdio_bitbang(priv->mii_bus);
return 0;
}
static const struct of_device_id ravb_match_table[] = {
{ .compatible = "renesas,etheravb-r8a7790", .data = (void *)RCAR_GEN2 },
{ .compatible = "renesas,etheravb-r8a7794", .data = (void *)RCAR_GEN2 },
@@ -2184,6 +2174,13 @@ static int ravb_probe(struct platform_device *pdev)
eth_hw_addr_random(ndev);
}
/* MDIO bus init */
error = ravb_mdio_init(priv);
if (error) {
dev_err(&pdev->dev, "failed to initialize MDIO\n");
goto out_dma_free;
}
netif_napi_add(ndev, &priv->napi[RAVB_BE], ravb_poll, 64);
netif_napi_add(ndev, &priv->napi[RAVB_NC], ravb_poll, 64);
@@ -2205,6 +2202,8 @@ static int ravb_probe(struct platform_device *pdev)
out_napi_del:
netif_napi_del(&priv->napi[RAVB_NC]);
netif_napi_del(&priv->napi[RAVB_BE]);
ravb_mdio_release(priv);
out_dma_free:
dma_free_coherent(ndev->dev.parent, priv->desc_bat_size, priv->desc_bat,
priv->desc_bat_dma);
@@ -2236,6 +2235,7 @@ static int ravb_remove(struct platform_device *pdev)
unregister_netdev(ndev);
netif_napi_del(&priv->napi[RAVB_NC]);
netif_napi_del(&priv->napi[RAVB_BE]);
ravb_mdio_release(priv);
pm_runtime_disable(&pdev->dev);
free_netdev(ndev);
platform_set_drvdata(pdev, NULL);

View File

@@ -3099,9 +3099,10 @@ struct rocker_walk_data {
struct rocker_port *port;
};
static int rocker_lower_dev_walk(struct net_device *lower_dev, void *_data)
static int rocker_lower_dev_walk(struct net_device *lower_dev,
struct netdev_nested_priv *priv)
{
struct rocker_walk_data *data = _data;
struct rocker_walk_data *data = (struct rocker_walk_data *)priv->data;
int ret = 0;
if (rocker_port_dev_check_under(lower_dev, data->rocker)) {
@@ -3115,6 +3116,7 @@ static int rocker_lower_dev_walk(struct net_device *lower_dev, void *_data)
struct rocker_port *rocker_port_dev_lower_find(struct net_device *dev,
struct rocker *rocker)
{
struct netdev_nested_priv priv;
struct rocker_walk_data data;
if (rocker_port_dev_check_under(dev, rocker))
@@ -3122,7 +3124,8 @@ struct rocker_port *rocker_port_dev_lower_find(struct net_device *dev,
data.rocker = rocker;
data.port = NULL;
netdev_walk_all_lower_dev(dev, rocker_lower_dev_walk, &data);
priv.data = (void *)&data;
netdev_walk_all_lower_dev(dev, rocker_lower_dev_walk, &priv);
return data.port;
}

View File

@@ -653,7 +653,6 @@ static void intel_eth_pci_remove(struct pci_dev *pdev)
pci_free_irq_vectors(pdev);
clk_disable_unprepare(priv->plat->stmmac_clk);
clk_unregister_fixed_rate(priv->plat->stmmac_clk);
pcim_iounmap_regions(pdev, BIT(0));

View File

@@ -203,6 +203,8 @@ struct stmmac_priv {
int eee_enabled;
int eee_active;
int tx_lpi_timer;
int tx_lpi_enabled;
int eee_tw_timer;
unsigned int mode;
unsigned int chain_mode;
int extend_desc;

View File

@@ -665,6 +665,7 @@ static int stmmac_ethtool_op_get_eee(struct net_device *dev,
edata->eee_enabled = priv->eee_enabled;
edata->eee_active = priv->eee_active;
edata->tx_lpi_timer = priv->tx_lpi_timer;
edata->tx_lpi_enabled = priv->tx_lpi_enabled;
return phylink_ethtool_get_eee(priv->phylink, edata);
}
@@ -675,24 +676,26 @@ static int stmmac_ethtool_op_set_eee(struct net_device *dev,
struct stmmac_priv *priv = netdev_priv(dev);
int ret;
if (!edata->eee_enabled) {
if (!priv->dma_cap.eee)
return -EOPNOTSUPP;
if (priv->tx_lpi_enabled != edata->tx_lpi_enabled)
netdev_warn(priv->dev,
"Setting EEE tx-lpi is not supported\n");
if (!edata->eee_enabled)
stmmac_disable_eee_mode(priv);
} else {
/* We are asking for enabling the EEE but it is safe
* to verify all by invoking the eee_init function.
* In case of failure it will return an error.
*/
edata->eee_enabled = stmmac_eee_init(priv);
if (!edata->eee_enabled)
return -EOPNOTSUPP;
}
ret = phylink_ethtool_set_eee(priv->phylink, edata);
if (ret)
return ret;
priv->eee_enabled = edata->eee_enabled;
priv->tx_lpi_timer = edata->tx_lpi_timer;
if (edata->eee_enabled &&
priv->tx_lpi_timer != edata->tx_lpi_timer) {
priv->tx_lpi_timer = edata->tx_lpi_timer;
stmmac_eee_init(priv);
}
return 0;
}

View File

@@ -94,7 +94,7 @@ static const u32 default_msg_level = (NETIF_MSG_DRV | NETIF_MSG_PROBE |
static int eee_timer = STMMAC_DEFAULT_LPI_TIMER;
module_param(eee_timer, int, 0644);
MODULE_PARM_DESC(eee_timer, "LPI tx expiration time in msec");
#define STMMAC_LPI_T(x) (jiffies + msecs_to_jiffies(x))
#define STMMAC_LPI_T(x) (jiffies + usecs_to_jiffies(x))
/* By default the driver will use the ring mode to manage tx and rx descriptors,
* but allow user to force to use the chain instead of the ring
@@ -370,7 +370,7 @@ static void stmmac_eee_ctrl_timer(struct timer_list *t)
struct stmmac_priv *priv = from_timer(priv, t, eee_ctrl_timer);
stmmac_enable_eee_mode(priv);
mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(eee_timer));
mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(priv->tx_lpi_timer));
}
/**
@@ -383,7 +383,7 @@ static void stmmac_eee_ctrl_timer(struct timer_list *t)
*/
bool stmmac_eee_init(struct stmmac_priv *priv)
{
int tx_lpi_timer = priv->tx_lpi_timer;
int eee_tw_timer = priv->eee_tw_timer;
/* Using PCS we cannot dial with the phy registers at this stage
* so we do not support extra feature like EEE.
@@ -403,7 +403,7 @@ bool stmmac_eee_init(struct stmmac_priv *priv)
if (priv->eee_enabled) {
netdev_dbg(priv->dev, "disable EEE\n");
del_timer_sync(&priv->eee_ctrl_timer);
stmmac_set_eee_timer(priv, priv->hw, 0, tx_lpi_timer);
stmmac_set_eee_timer(priv, priv->hw, 0, eee_tw_timer);
}
mutex_unlock(&priv->lock);
return false;
@@ -411,11 +411,12 @@ bool stmmac_eee_init(struct stmmac_priv *priv)
if (priv->eee_active && !priv->eee_enabled) {
timer_setup(&priv->eee_ctrl_timer, stmmac_eee_ctrl_timer, 0);
mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(eee_timer));
stmmac_set_eee_timer(priv, priv->hw, STMMAC_DEFAULT_LIT_LS,
tx_lpi_timer);
eee_tw_timer);
}
mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(priv->tx_lpi_timer));
mutex_unlock(&priv->lock);
netdev_dbg(priv->dev, "Energy-Efficient Ethernet initialized\n");
return true;
@@ -930,6 +931,7 @@ static void stmmac_mac_link_down(struct phylink_config *config,
stmmac_mac_set(priv, priv->ioaddr, false);
priv->eee_active = false;
priv->tx_lpi_enabled = false;
stmmac_eee_init(priv);
stmmac_set_eee_pls(priv, priv->hw, false);
}
@@ -1027,6 +1029,7 @@ static void stmmac_mac_link_up(struct phylink_config *config,
if (phy && priv->dma_cap.eee) {
priv->eee_active = phy_init_eee(phy, 1) >= 0;
priv->eee_enabled = stmmac_eee_init(priv);
priv->tx_lpi_enabled = priv->eee_enabled;
stmmac_set_eee_pls(priv, priv->hw, true);
}
}
@@ -2061,7 +2064,7 @@ static int stmmac_tx_clean(struct stmmac_priv *priv, int budget, u32 queue)
if ((priv->eee_enabled) && (!priv->tx_path_in_lpi_mode)) {
stmmac_enable_eee_mode(priv);
mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(eee_timer));
mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_T(priv->tx_lpi_timer));
}
/* We still have pending packets, let's call for a new scheduling */
@@ -2694,7 +2697,11 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
netdev_warn(priv->dev, "PTP init failed\n");
}
priv->tx_lpi_timer = STMMAC_DEFAULT_TWT_LS;
priv->eee_tw_timer = STMMAC_DEFAULT_TWT_LS;
/* Convert the timer from msec to usec */
if (!priv->tx_lpi_timer)
priv->tx_lpi_timer = eee_timer * 1000;
if (priv->use_riwt) {
if (!priv->rx_riwt)
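The stmmac hunks above change the unit of the LPI timer: STMMAC_LPI_T() now goes through usecs_to_jiffies(), the timer is armed from priv->tx_lpi_timer rather than the raw eee_timer module parameter, and stmmac_hw_setup() converts the millisecond parameter with eee_timer * 1000 when no value has been configured. A rough userspace check that the old and new forms arm the timer for the same number of jiffies (HZ and the two helpers are simplified stand-ins for the kernel versions):

    #include <stdio.h>

    #define HZ 250

    /* Simplified stand-ins for the kernel's jiffies conversions. */
    static unsigned long msecs_to_jiffies(unsigned int m)
    {
        return ((unsigned long)m * HZ + 999) / 1000;
    }

    static unsigned long usecs_to_jiffies(unsigned int u)
    {
        return ((unsigned long)u * HZ + 999999) / 1000000;
    }

    int main(void)
    {
        unsigned int eee_timer = 1000;                /* module parameter, msec */
        unsigned int tx_lpi_timer = eee_timer * 1000; /* driver now stores usec */

        printf("old: %lu jiffies\n", msecs_to_jiffies(eee_timer));
        printf("new: %lu jiffies\n", usecs_to_jiffies(tx_lpi_timer));
        return 0;
    }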


@@ -2,7 +2,7 @@
/*
Written 1998-2001 by Donald Becker.
Current Maintainer: Roger Luethi <rl@hellgate.ch>
Current Maintainer: Kevin Brace <kevinbrace@bracecomputerlab.com>
This software may be used and distributed according to the terms of
the GNU General Public License (GPL), incorporated herein by reference.
@@ -32,8 +32,6 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#define DRV_NAME "via-rhine"
#define DRV_VERSION "1.5.1"
#define DRV_RELDATE "2010-10-09"
#include <linux/types.h>
@@ -117,10 +115,6 @@ static const int multicast_filter_limit = 32;
#include <linux/uaccess.h>
#include <linux/dmi.h>
/* These identify the driver base version and may not be removed. */
static const char version[] =
"v1.10-LK" DRV_VERSION " " DRV_RELDATE " Written by Donald Becker";
MODULE_AUTHOR("Donald Becker <becker@scyld.com>");
MODULE_DESCRIPTION("VIA Rhine PCI Fast Ethernet driver");
MODULE_LICENSE("GPL");
@@ -243,7 +237,7 @@ enum rhine_revs {
VT8233 = 0x60, /* Integrated MAC */
VT8235 = 0x74, /* Integrated MAC */
VT8237 = 0x78, /* Integrated MAC */
VTunknown1 = 0x7C,
VT8251 = 0x7C, /* Integrated MAC */
VT6105 = 0x80,
VT6105_B0 = 0x83,
VT6105L = 0x8A,
@@ -1051,11 +1045,6 @@ static int rhine_init_one_pci(struct pci_dev *pdev,
u32 quirks = 0;
#endif
/* when built into the kernel, we only print version if device is found */
#ifndef MODULE
pr_info_once("%s\n", version);
#endif
rc = pci_enable_device(pdev);
if (rc)
goto err_out;
@@ -1706,6 +1695,8 @@ static int rhine_open(struct net_device *dev)
goto out_free_ring;
alloc_tbufs(dev);
enable_mmio(rp->pioaddr, rp->quirks);
rhine_power_init(dev);
rhine_chip_reset(dev);
rhine_task_enable(rp);
init_registers(dev);
@@ -2294,7 +2285,6 @@ static void netdev_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *i
struct device *hwdev = dev->dev.parent;
strlcpy(info->driver, DRV_NAME, sizeof(info->driver));
strlcpy(info->version, DRV_VERSION, sizeof(info->version));
strlcpy(info->bus_info, dev_name(hwdev), sizeof(info->bus_info));
}
@@ -2616,9 +2606,6 @@ static int __init rhine_init(void)
int ret_pci, ret_platform;
/* when a module, this is printed whether or not devices are found in probe */
#ifdef MODULE
pr_info("%s\n", version);
#endif
if (dmi_check_system(rhine_dmi_table)) {
/* these BIOSes fail at PXE boot if chip is in D3 */
avoid_D3 = true;


@@ -1077,6 +1077,7 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb)
struct macsec_rx_sa *rx_sa;
struct macsec_rxh_data *rxd;
struct macsec_dev *macsec;
unsigned int len;
sci_t sci;
u32 hdr_pn;
bool cbit;
@@ -1232,9 +1233,10 @@ deliver:
macsec_rxsc_put(rx_sc);
skb_orphan(skb);
len = skb->len;
ret = gro_cells_receive(&macsec->gro_cells, skb);
if (ret == NET_RX_SUCCESS)
count_rx(dev, skb->len);
count_rx(dev, len);
else
macsec->secy.netdev->stats.rx_dropped++;
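The macsec hunk is a use-after-free fix: gro_cells_receive() consumes the skb, so the length is sampled into a local before the call and count_rx() uses that copy. The same ownership rule in a minimal userspace form (struct pkt and consume() are stand-ins, not the kernel API):

    #include <stdio.h>
    #include <stdlib.h>

    struct pkt { size_t len; };

    static size_t rx_bytes;

    /* Stand-in for gro_cells_receive(): takes ownership and frees the packet. */
    static int consume(struct pkt *p)
    {
        free(p);
        return 0;
    }

    static void deliver(struct pkt *p)
    {
        size_t len = p->len;     /* sample before handing the packet off */

        if (consume(p) == 0)     /* p must not be touched after this call */
            rx_bytes += len;     /* p->len here would be a use-after-free */
    }

    int main(void)
    {
        struct pkt *p = malloc(sizeof(*p));

        if (!p)
            return 1;
        p->len = 64;
        deliver(p);
        printf("rx_bytes=%zu\n", rx_bytes);
        return 0;
    }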


@@ -222,6 +222,7 @@ config MDIO_THUNDER
depends on 64BIT
depends on PCI
select MDIO_CAVIUM
select MDIO_DEVRES
help
This driver supports the MDIO interfaces found on Cavium
ThunderX SoCs when the MDIO bus device appears as a PCI


@@ -1,6 +1,5 @@
// SPDX-License-Identifier: GPL-2.0+
/*
* drivers/net/phy/realtek.c
/* drivers/net/phy/realtek.c
*
* Driver for Realtek PHYs
*
@@ -32,9 +31,9 @@
#define RTL8211F_TX_DELAY BIT(8)
#define RTL8211F_RX_DELAY BIT(3)
#define RTL8211E_TX_DELAY BIT(1)
#define RTL8211E_RX_DELAY BIT(2)
#define RTL8211E_MODE_MII_GMII BIT(3)
#define RTL8211E_CTRL_DELAY BIT(13)
#define RTL8211E_TX_DELAY BIT(12)
#define RTL8211E_RX_DELAY BIT(11)
#define RTL8201F_ISR 0x1e
#define RTL8201F_IER 0x13
@@ -246,16 +245,16 @@ static int rtl8211e_config_init(struct phy_device *phydev)
/* enable TX/RX delay for rgmii-* modes, and disable them for rgmii. */
switch (phydev->interface) {
case PHY_INTERFACE_MODE_RGMII:
val = 0;
val = RTL8211E_CTRL_DELAY | 0;
break;
case PHY_INTERFACE_MODE_RGMII_ID:
val = RTL8211E_TX_DELAY | RTL8211E_RX_DELAY;
val = RTL8211E_CTRL_DELAY | RTL8211E_TX_DELAY | RTL8211E_RX_DELAY;
break;
case PHY_INTERFACE_MODE_RGMII_RXID:
val = RTL8211E_RX_DELAY;
val = RTL8211E_CTRL_DELAY | RTL8211E_RX_DELAY;
break;
case PHY_INTERFACE_MODE_RGMII_TXID:
val = RTL8211E_TX_DELAY;
val = RTL8211E_CTRL_DELAY | RTL8211E_TX_DELAY;
break;
default: /* the rest of the modes imply leaving delays as is. */
return 0;
@@ -263,11 +262,12 @@ static int rtl8211e_config_init(struct phy_device *phydev)
/* According to a sample driver there is a 0x1c config register on the
* 0xa4 extension page (0x7) layout. It can be used to disable/enable
* the RX/TX delays otherwise controlled by RXDLY/TXDLY pins. It can
* also be used to customize the whole configuration register:
* 8:6 = PHY Address, 5:4 = Auto-Negotiation, 3 = Interface Mode Select,
* 2 = RX Delay, 1 = TX Delay, 0 = SELRGV (see original PHY datasheet
* for details).
* the RX/TX delays otherwise controlled by RXDLY/TXDLY pins.
* The configuration register definition:
* 14 = reserved
* 13 = Force Tx RX Delay controlled by bit12 bit11,
* 12 = RX Delay, 11 = TX Delay
* 10:0 = Test && debug settings reserved by realtek
*/
oldpage = phy_select_page(phydev, 0x7);
if (oldpage < 0)
@@ -277,7 +277,8 @@ static int rtl8211e_config_init(struct phy_device *phydev)
if (ret)
goto err_restore_page;
ret = __phy_modify(phydev, 0x1c, RTL8211E_TX_DELAY | RTL8211E_RX_DELAY,
ret = __phy_modify(phydev, 0x1c, RTL8211E_CTRL_DELAY
| RTL8211E_TX_DELAY | RTL8211E_RX_DELAY,
val);
err_restore_page:
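The RTL8211E hunk documents config register 0x1c on extension page 0x7 and adds RTL8211E_CTRL_DELAY (bit 13), which forces the RGMII delays to follow bits 12/11 (the driver's TX_DELAY/RX_DELAY macros) instead of the RXDLY/TXDLY strap pins; including that bit in both the mask and the value of __phy_modify() is what makes the software setting authoritative. A small read-modify-write model of that call (reg_modify() is a stand-in for __phy_modify(), and the starting register value is made up):

    #include <stdint.h>
    #include <stdio.h>

    #define RTL8211E_CTRL_DELAY (1u << 13) /* delays follow bits 12/11 */
    #define RTL8211E_TX_DELAY   (1u << 12)
    #define RTL8211E_RX_DELAY   (1u << 11)

    /* Stand-in for __phy_modify(): clear the mask bits, then set val. */
    static uint16_t reg_modify(uint16_t reg, uint16_t mask, uint16_t val)
    {
        return (uint16_t)((reg & ~mask) | val);
    }

    int main(void)
    {
        uint16_t reg  = 0x1fff; /* pretend the pins enabled both delays */
        uint16_t mask = RTL8211E_CTRL_DELAY | RTL8211E_TX_DELAY | RTL8211E_RX_DELAY;
        uint16_t val  = RTL8211E_CTRL_DELAY | RTL8211E_RX_DELAY; /* rgmii-rxid */

        printf("0x%04x -> 0x%04x\n", (unsigned)reg,
               (unsigned)reg_modify(reg, mask, val));
        return 0;
    }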


@@ -287,7 +287,7 @@ inst_rollback:
for (i--; i >= 0; i--)
__team_option_inst_del_option(team, dst_opts[i]);
i = option_count - 1;
i = option_count;
alloc_rollback:
for (i--; i >= 0; i--)
kfree(dst_opts[i]);
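The rollback hunk above fixes an off-by-one in the error path: with i = option_count - 1, the decrement-first loop starts at option_count - 2 and the last dst_opts allocation is never freed. A standalone check of the corrected loop bounds (the array and counts are illustrative only):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        enum { OPTION_COUNT = 4 };
        void *dst_opts[OPTION_COUNT];
        int freed = 0, i;

        for (i = 0; i < OPTION_COUNT; i++)
            dst_opts[i] = malloc(8);

        i = OPTION_COUNT;        /* was OPTION_COUNT - 1, leaking the last entry */
        for (i--; i >= 0; i--) { /* visits OPTION_COUNT - 1 .. 0 */
            free(dst_opts[i]);
            freed++;
        }

        printf("freed %d of %d\n", freed, OPTION_COUNT);
        return 0;
    }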
@@ -2112,6 +2112,7 @@ static void team_setup_by_port(struct net_device *dev,
dev->header_ops = port_dev->header_ops;
dev->type = port_dev->type;
dev->hard_header_len = port_dev->hard_header_len;
dev->needed_headroom = port_dev->needed_headroom;
dev->addr_len = port_dev->addr_len;
dev->mtu = port_dev->mtu;
memcpy(dev->broadcast, port_dev->broadcast, port_dev->addr_len);


@@ -1823,6 +1823,33 @@ static const struct driver_info belkin_info = {
.status = ax88179_status,
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
static const struct driver_info toshiba_info = {
.description = "Toshiba USB Ethernet Adapter",
.bind = ax88179_bind,
.unbind = ax88179_unbind,
.status = ax88179_status,
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
static const struct driver_info mct_info = {
.description = "MCT USB 3.0 Gigabit Ethernet Adapter",
.bind = ax88179_bind,
.unbind = ax88179_unbind,
.status = ax88179_status,
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
@@ -1861,6 +1888,14 @@ static const struct usb_device_id products[] = {
/* Belkin B2B128 USB 3.0 Hub + Gigabit Ethernet Adapter */
USB_DEVICE(0x050d, 0x0128),
.driver_info = (unsigned long)&belkin_info,
}, {
/* Toshiba USB 3.0 GBit Ethernet Adapter */
USB_DEVICE(0x0930, 0x0a13),
.driver_info = (unsigned long)&toshiba_info,
}, {
/* Magic Control Technology U3-A9003 USB 3.0 Gigabit Ethernet Adapter */
USB_DEVICE(0x0711, 0x0179),
.driver_info = (unsigned long)&mct_info,
},
{ },
};


@@ -360,28 +360,47 @@ fail:
}
#endif /* PEGASUS_WRITE_EEPROM */
static inline void get_node_id(pegasus_t *pegasus, __u8 *id)
static inline int get_node_id(pegasus_t *pegasus, u8 *id)
{
int i;
__u16 w16;
int i, ret;
u16 w16;
for (i = 0; i < 3; i++) {
read_eprom_word(pegasus, i, &w16);
ret = read_eprom_word(pegasus, i, &w16);
if (ret < 0)
return ret;
((__le16 *) id)[i] = cpu_to_le16(w16);
}
return 0;
}
static void set_ethernet_addr(pegasus_t *pegasus)
{
__u8 node_id[6];
int ret;
u8 node_id[6];
if (pegasus->features & PEGASUS_II) {
get_registers(pegasus, 0x10, sizeof(node_id), node_id);
ret = get_registers(pegasus, 0x10, sizeof(node_id), node_id);
if (ret < 0)
goto err;
} else {
get_node_id(pegasus, node_id);
set_registers(pegasus, EthID, sizeof(node_id), node_id);
ret = get_node_id(pegasus, node_id);
if (ret < 0)
goto err;
ret = set_registers(pegasus, EthID, sizeof(node_id), node_id);
if (ret < 0)
goto err;
}
memcpy(pegasus->net->dev_addr, node_id, sizeof(node_id));
return;
err:
eth_hw_addr_random(pegasus->net);
dev_info(&pegasus->intf->dev, "software assigned MAC address.\n");
return;
}
static inline int reset_mac(pegasus_t *pegasus)
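The pegasus hunk makes the EEPROM and register reads report failures so set_ethernet_addr() can fall back to a random, locally administered MAC instead of copying uninitialized stack bytes; the rtl8150 hunk further down applies the same fallback. The shape of that error path in plain C (read_id() and random_addr() are stand-ins for the USB helpers and eth_hw_addr_random()):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define ETH_ALEN 6

    /* Stand-in for the EEPROM/register read; returns 0 or a negative error. */
    static int read_id(unsigned char *id)
    {
        (void)id;
        return -5; /* pretend the USB transfer failed */
    }

    static void random_addr(unsigned char *addr)
    {
        for (int i = 0; i < ETH_ALEN; i++)
            addr[i] = rand() & 0xff;
        addr[0] &= 0xfe; /* not multicast */
        addr[0] |= 0x02; /* locally administered */
    }

    int main(void)
    {
        unsigned char dev_addr[ETH_ALEN];

        srand((unsigned)time(NULL));
        if (read_id(dev_addr) < 0) {
            random_addr(dev_addr);
            puts("software assigned MAC address.");
        }
        printf("%02x:%02x:%02x:%02x:%02x:%02x\n", dev_addr[0], dev_addr[1],
               dev_addr[2], dev_addr[3], dev_addr[4], dev_addr[5]);
        return 0;
    }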


@@ -1375,6 +1375,7 @@ static const struct usb_device_id products[] = {
{QMI_QUIRK_SET_DTR(0x2cb7, 0x0104, 4)}, /* Fibocom NL678 series */
{QMI_FIXED_INTF(0x0489, 0xe0b4, 0)}, /* Foxconn T77W968 LTE */
{QMI_FIXED_INTF(0x0489, 0xe0b5, 0)}, /* Foxconn T77W968 LTE with eSIM support*/
{QMI_FIXED_INTF(0x2692, 0x9025, 4)}, /* Cellient MPL200 (rebranded Qualcomm 05c6:9025) */
/* 4. Gobi 1000 devices */
{QMI_GOBI1K_DEVICE(0x05c6, 0x9212)}, /* Acer Gobi Modem Device */


@@ -274,12 +274,20 @@ static int write_mii_word(rtl8150_t * dev, u8 phy, __u8 indx, u16 reg)
return 1;
}
static inline void set_ethernet_addr(rtl8150_t * dev)
static void set_ethernet_addr(rtl8150_t *dev)
{
u8 node_id[6];
u8 node_id[ETH_ALEN];
int ret;
get_registers(dev, IDR, sizeof(node_id), node_id);
memcpy(dev->netdev->dev_addr, node_id, sizeof(node_id));
ret = get_registers(dev, IDR, sizeof(node_id), node_id);
if (ret == sizeof(node_id)) {
ether_addr_copy(dev->netdev->dev_addr, node_id);
} else {
eth_hw_addr_random(dev->netdev);
netdev_notice(dev->netdev, "Assigned a random MAC address: %pM\n",
dev->netdev->dev_addr);
}
}
static int rtl8150_set_mac_address(struct net_device *netdev, void *p)


@@ -63,6 +63,11 @@ static const unsigned long guest_offloads[] = {
VIRTIO_NET_F_GUEST_CSUM
};
#define GUEST_OFFLOAD_LRO_MASK ((1ULL << VIRTIO_NET_F_GUEST_TSO4) | \
(1ULL << VIRTIO_NET_F_GUEST_TSO6) | \
(1ULL << VIRTIO_NET_F_GUEST_ECN) | \
(1ULL << VIRTIO_NET_F_GUEST_UFO))
struct virtnet_stat_desc {
char desc[ETH_GSTRING_LEN];
size_t offset;
@@ -2531,7 +2536,8 @@ static int virtnet_set_features(struct net_device *dev,
if (features & NETIF_F_LRO)
offloads = vi->guest_offloads_capable;
else
offloads = 0;
offloads = vi->guest_offloads_capable &
~GUEST_OFFLOAD_LRO_MASK;
err = virtnet_set_guest_offloads(vi, offloads);
if (err)
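In the virtio_net hunk, turning NETIF_F_LRO off used to write 0 to the guest offloads, losing VIRTIO_NET_F_GUEST_CSUM along the way; the new GUEST_OFFLOAD_LRO_MASK clears only the TSO4/TSO6/ECN/UFO bits. A quick bit-arithmetic check (feature bit numbers as defined by the virtio-net spec; the capable value is made up):

    #include <inttypes.h>
    #include <stdio.h>

    /* Guest feature bit numbers from the virtio-net spec. */
    #define VIRTIO_NET_F_GUEST_CSUM  1
    #define VIRTIO_NET_F_GUEST_TSO4  7
    #define VIRTIO_NET_F_GUEST_TSO6  8
    #define VIRTIO_NET_F_GUEST_ECN   9
    #define VIRTIO_NET_F_GUEST_UFO  10

    #define GUEST_OFFLOAD_LRO_MASK ((1ULL << VIRTIO_NET_F_GUEST_TSO4) | \
                                    (1ULL << VIRTIO_NET_F_GUEST_TSO6) | \
                                    (1ULL << VIRTIO_NET_F_GUEST_ECN)  | \
                                    (1ULL << VIRTIO_NET_F_GUEST_UFO))

    int main(void)
    {
        uint64_t capable  = (1ULL << VIRTIO_NET_F_GUEST_CSUM) | GUEST_OFFLOAD_LRO_MASK;
        uint64_t offloads = capable & ~GUEST_OFFLOAD_LRO_MASK; /* was simply 0 */

        printf("capable 0x%" PRIx64 ", after LRO off 0x%" PRIx64 "\n",
               capable, offloads);
        return 0;
    }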


@@ -1032,7 +1032,6 @@ vmxnet3_tq_xmit(struct sk_buff *skb, struct vmxnet3_tx_queue *tq,
/* Use temporary descriptor to avoid touching bits multiple times */
union Vmxnet3_GenericDesc tempTxDesc;
#endif
struct udphdr *udph;
count = txd_estimate(skb);
@@ -1135,8 +1134,7 @@ vmxnet3_tq_xmit(struct sk_buff *skb, struct vmxnet3_tx_queue *tq,
gdesc->txd.om = VMXNET3_OM_ENCAP;
gdesc->txd.msscof = ctx.mss;
udph = udp_hdr(skb);
if (udph->check)
if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)
gdesc->txd.oco = 1;
} else {
gdesc->txd.hlen = ctx.l4_offset + ctx.l4_hdr_size;
@@ -3371,6 +3369,7 @@ vmxnet3_probe_device(struct pci_dev *pdev,
.ndo_change_mtu = vmxnet3_change_mtu,
.ndo_fix_features = vmxnet3_fix_features,
.ndo_set_features = vmxnet3_set_features,
.ndo_features_check = vmxnet3_features_check,
.ndo_get_stats64 = vmxnet3_get_stats64,
.ndo_tx_timeout = vmxnet3_tx_timeout,
.ndo_set_rx_mode = vmxnet3_set_mc,


@@ -267,6 +267,34 @@ netdev_features_t vmxnet3_fix_features(struct net_device *netdev,
return features;
}
netdev_features_t vmxnet3_features_check(struct sk_buff *skb,
struct net_device *netdev,
netdev_features_t features)
{
struct vmxnet3_adapter *adapter = netdev_priv(netdev);
/* Validate if the tunneled packet is being offloaded by the device */
if (VMXNET3_VERSION_GE_4(adapter) &&
skb->encapsulation && skb->ip_summed == CHECKSUM_PARTIAL) {
u8 l4_proto = 0;
switch (vlan_get_protocol(skb)) {
case htons(ETH_P_IP):
l4_proto = ip_hdr(skb)->protocol;
break;
case htons(ETH_P_IPV6):
l4_proto = ipv6_hdr(skb)->nexthdr;
break;
default:
return features & ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK);
}
if (l4_proto != IPPROTO_UDP)
return features & ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK);
}
return features;
}
static void vmxnet3_enable_encap_offloads(struct net_device *netdev)
{
struct vmxnet3_adapter *adapter = netdev_priv(netdev);
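The new vmxnet3_features_check() hook keeps checksum/GSO offload for encapsulated CHECKSUM_PARTIAL packets only when the outer protocol is IPv4 or IPv6 and the inner transport is UDP, since the version-4 device offloads UDP tunnels only. The decision tree with the skb/netdev plumbing stripped away (fake_skb and the feature flags are simplified stand-ins, not the kernel types):

    #include <stdbool.h>
    #include <stdio.h>

    #define PROTO_UDP 17
    #define PROTO_GRE 47

    enum { FEAT_CSUM = 1u << 0, FEAT_GSO = 1u << 1, FEAT_OTHER = 1u << 2 };

    struct fake_skb {
        bool encapsulation;
        bool csum_partial;
        unsigned int outer_proto;   /* 0x0800 = IPv4, 0x86dd = IPv6 */
        unsigned int inner_l4_proto;
    };

    static unsigned int features_check(const struct fake_skb *skb, unsigned int features)
    {
        if (!skb->encapsulation || !skb->csum_partial)
            return features;
        if (skb->outer_proto != 0x0800 && skb->outer_proto != 0x86dd)
            return features & ~(FEAT_CSUM | FEAT_GSO);
        if (skb->inner_l4_proto != PROTO_UDP)
            return features & ~(FEAT_CSUM | FEAT_GSO);
        return features;
    }

    int main(void)
    {
        struct fake_skb vxlan = { true, true, 0x0800, PROTO_UDP };
        struct fake_skb gre   = { true, true, 0x0800, PROTO_GRE };
        unsigned int all = FEAT_CSUM | FEAT_GSO | FEAT_OTHER;

        printf("udp tunnel keeps 0x%x, other tunnel keeps 0x%x\n",
               features_check(&vxlan, all), features_check(&gre, all));
        return 0;
    }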


@@ -470,6 +470,10 @@ vmxnet3_rq_destroy_all(struct vmxnet3_adapter *adapter);
netdev_features_t
vmxnet3_fix_features(struct net_device *netdev, netdev_features_t features);
netdev_features_t
vmxnet3_features_check(struct sk_buff *skb,
struct net_device *netdev, netdev_features_t features);
int
vmxnet3_set_features(struct net_device *netdev, netdev_features_t features);

Some files were not shown because too many files have changed in this diff.