Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.
Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ARM SoC fixes from Olof Johansson:
"Stream of fixes has slowed down, only a few this week:
- Some DT fixes for Allwinner platforms, and addition of a clock to
the R_CCU clock controller that had been missed.
- A couple of small DT fixes for am335x-sl50"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
ARM: dts: am335x-sl50: Fix card detect pin for mmc1
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
Allwinner fixes for 4.12
A few fixes around the PRCM support that got in 4.12 with a wrong
compatible, and a missing clock in the binding.
* tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
Signed-off-by: Olof Johansson <olof@lixom.net>
Two fixes for am335x-sl50 to fix a boot time error
for claiming SPI pins, and to fix a SDIO card detect
pin for production version of the device.
* tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
ARM: dts: am335x-sl50: Fix card detect pin for mmc1
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull virtio bugfix from Michael Tsirkin:
"It turns out balloon does not handle IOMMUs correctly. We should fix
that at some point, for now let's just disable this configuration"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_balloon: disable VIOMMU support
Pull i2c fixes from Wolfram Sang:
"Two driver bugfixes"
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: ismt: fix wrong device address when unmap the data buffer
i2c: rcar: use correct length when unmapping DMA
Pull MIPS fixes from Ralf Baechle:
- Three highmem fixes:
+ Fixed mapping initialization
+ Adjust the pkmap location
+ Ensure we use at most one page for PTEs
- Fix makefile dependencies for .its targets to depend on vmlinux
- Fix reversed condition in BNEZC and JIALC software branch emulation
- Only flush initialized flush_insn_slot to avoid NULL pointer
dereference
- perf: Remove incorrect odd/even counter handling for I6400
- ftrace: Fix init functions tracing
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: .its targets depend on vmlinux
MIPS: Fix bnezc/jialc return address calculation
MIPS: kprobes: flush_insn_slot should flush only if probe initialised
MIPS: ftrace: fix init functions tracing
MIPS: mm: adjust PKMAP location
MIPS: highmem: ensure that we don't use more than one page for PTEs
MIPS: mm: fixed mappings: correct initialisation
MIPS: perf: Remove incorrect odd/even counter handling for I6400
virtio balloon bypasses the DMA API entirely so does not support the
VIOMMU right now. It's not clear we need that support, for now let's
just make sure we don't pretend to support it.
Cc: stable@vger.kernel.org
Cc: Wei Wang <wei.w.wang@intel.com>
Fixes: 1a93769399 ("virtio: new feature to detect IOMMU device quirk")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Pull x86 fixes from Thomas Gleixner:
"Two fixlets for x86:
- Handle WARN_ONs proper with the new UD based WARN implementation
- Disable 1G mappings when 2M mappings are disabled by kmemleak or
debug_pagealloc. Otherwise 1G mappings might still be used,
confusing the debug mechanisms"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Disable 1GB direct mappings when disabling 2MB mappings
x86/debug: Handle early WARN_ONs proper
Pull timer fixes from Thomas Gleixner:
"Three fixlets for timers:
- Two hot-fixes for the alarmtimer based posix timers, which prevent
a nasty DOS by self rescheduling timers. The proper cleanup of that
mess is queued for 4.13
- Make a function static"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/broadcast: Make tick_broadcast_setup_oneshot() static
alarmtimer: Rate limit periodic intervals
alarmtimer: Prevent overflow of relative timers
Pull scheduler fixes from Thomas Gleixner:
"Two small fixes for the schedulre core:
- Use the proper switch_mm() variant in idle_task_exit() because that
code is not called with interrupts disabled.
- Fix a confusing typo in a printk"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Idle_task_exit() shouldn't use switch_mm_irqs_off()
sched/fair: Fix typo in printk message
Pull perf fixes from Thomas Gleixner:
"Three fixes for the perf user space side:
- Fix the probing of precise_ip level, which got broken recently for
x86.
- Unbreak the ARCH=x86_64 build
- Report module before trying to unwind into the module code, which
avoids broken stack frames displayed"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf unwind: Report module before querying isactivation in dwfl unwind
perf tools: Fix build with ARCH=x86_64
perf evsel: Fix probing of precise_ip level for default cycles event
Pull irq fix from Thomas Gleixner:
"Add a missing resource release to an error path"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Release resources in __setup_irq() error path
Pull objtool fix from Thomas Gleixner:
"A single fix which adds fortify_panic to the list of no return
functions"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Add fortify_panic as __noreturn function
Pull LED fixes from Jacek Anaszewski:
"Two LED fixes:
- fix signal source assignment for leds-bcm6328
- revert patch that intended to fix LED behavior on suspend but it
had a side effect preventing suspend at all due to uevent being
sent on trigger removal"
* tag 'led_fixes_for_4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
Revert "leds: handle suspend/resume in heartbeat trigger"
leds: bcm6328: fix signal source assignment for leds 4 to 7
Pull USB fixes from Greg KH:
"Here are some small gadget and xhci USB fixes for 4.12-rc6.
Nothing major, but one of the gadget patches does fix a reported oops,
and the xhci ones resolve reported problems. All have been in
linux-next with no reported issues"
* tag 'usb-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks
usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
usb: xhci: Fix USB 3.1 supported protocol parsing
USB: gadget: fix GPF in gadgetfs
usb: gadget: composite: make sure to reactivate function on unbind
Pull staging and IIO fixes from Greg KH:
"Here are some small staging and IIO driver fixes for 4.12-rc6.
Nothing huge, just a few small driver fixes for reported issues. All
have been in linux-next with no reported issues"
* tag 'staging-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Staging: rtl8723bs: fix an error code in isFileReadable()
iio: buffer-dmaengine: Add missing header buffer_impl.h
iio: buffer-dma: Add missing header buffer_impl.h
iio: adc: meson-saradc: fix potential crash in meson_sar_adc_clear_fifo
iio: adc: mxs-lradc: Fix return value check in mxs_lradc_adc_probe()
iio: imu: inv_mpu6050: add accel lpf setting for chip >= MPU6500
staging: iio: ad7152: Fix deadlock in ad7152_write_raw_samp_freq()
Pull ceph fixes from Ilya Dryomov:
"A fix for an old ceph ->fh_to_* bug from Luis and two timestamp fixups
from Zheng, prompted by the ongoing y2038 work"
* tag 'ceph-for-4.12-rc6' of git://github.com/ceph/ceph-client:
ceph: unify inode i_ctime update
ceph: use current_kernel_time() to get request time stamp
ceph: check i_nlink while converting a file handle to dentry
Pull xfs fix from Darrick Wong:
"One more bugfix for you for 4.12-rc6 to fix something that came up in
an earlier rc:
- Fix some bogus ASSERT failures on CONFIG_SMP=n and CONFIG_XFS_DEBUG=y"
* tag 'xfs-4.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix spurious spin_is_locked() assert failures on non-smp kernels
Pull ufs fixes from Al Viro:
"Fix assorted ufs bugs: a couple of deadlocks, fs corruption in
truncate(), oopsen on tail unpacking and truncate when racing with
vmscan, mild fs corruption (free blocks stats summary buggered, *BSD
fsck would complain and fix), several instances of broken logics
around reserved blocks (starting with "check almost never triggers
when it should" and then there are issues with sufficiently large
UFS2)"
[ Note: ufs hasn't gotten any loving in a long time, because nobody
really seems to use it. These ufs fixes are triggered by people
actually caring now, not some sudden influx of new bugs. - Linus ]
* 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs_truncate_blocks(): fix the case when size is in the last direct block
ufs: more deadlock prevention on tail unpacking
ufs: avoid grabbing ->truncate_mutex if possible
ufs_get_locked_page(): make sure we have buffer_heads
ufs: fix s_size/s_dsize users
ufs: fix reserved blocks check
ufs: make ufs_freespace() return signed
ufs: fix logics in "ufs: make fsck -f happy"
Pull vfs fixes from Al Viro:
"A couple of fixes; a leak in mntns_install() caught by Andrei (this
cycle regression) + d_invalidate() softlockup fix - that had been
reported by a bunch of people lately, but the problem is pretty old"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: don't forget to put old mntns in mntns_install
Hang/soft lockup in d_invalidate with simultaneous calls
Pull PCI fixes from Bjorn Helgaas:
- fix another PCI_ENDPOINT build error (merged for v4.12)
- fix error codes added to config accessors for v4.12
* tag 'pci-v4.12-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: endpoint: Select CRC32 to fix test build error
PCI: Make error code types consistent in pci_{read,write}_config_*
Merge misc fixes from Andrew Morton:
"5 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: correct the comment when reclaimed pages exceed the scanned pages
userfaultfd: shmem: handle coredumping in handle_userfault()
mm: numa: avoid waiting on freed migrated pages
swap: cond_resched in swap_cgroup_prepare()
mm/memory-failure.c: use compound_head() flags for huge pages
Anon and hugetlbfs handle FOLL_DUMP set by get_dump_page() internally to
__get_user_pages().
shmem as opposed has no special FOLL_DUMP handling there so
handle_mm_fault() is invoked without mmap_sem and ends up calling
handle_userfault() that isn't expecting to be invoked without mmap_sem
held.
This makes handle_userfault() fail immediately if invoked through
shmem_vm_ops->fault during coredumping and solves the problem.
The side effect is a BUG_ON with no lock held triggered by the
coredumping process which exits. Only 4.11 is affected, pre-4.11 anon
memory holes are skipped in __get_user_pages by checking FOLL_DUMP
explicitly against empty pagetables (mm/gup.c:no_page_table()).
It's zero cost as we already had a check for current->flags to prevent
futex to trigger userfaults during exit (PF_EXITING).
Link: http://lkml.kernel.org/r/20170615214838.27429-1-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: <stable@vger.kernel.org> [4.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
waiting until the pmd is unlocked before we return and retry. However,
we can race with migrate_misplaced_transhuge_page():
// do_huge_pmd_numa_page // migrate_misplaced_transhuge_page()
// Holds 0 refs on page // Holds 2 refs on page
vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
/* ... */
if (pmd_trans_migrating(*vmf->pmd)) {
page = pmd_page(*vmf->pmd);
spin_unlock(vmf->ptl);
ptl = pmd_lock(mm, pmd);
if (page_count(page) != 2)) {
/* roll back */
}
/* ... */
mlock_migrate_page(new_page, page);
/* ... */
spin_unlock(ptl);
put_page(page);
put_page(page); // page freed here
wait_on_page_locked(page);
goto out;
}
This can result in the freed page having its waiters flag set
unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
page alloc/free functions. This has been observed on arm64 KVM guests.
We can avoid this by having do_huge_pmd_numa_page() take a reference on
the page before dropping the pmd lock, mirroring what we do in
__migration_entry_wait().
When we hit the race, migrate_misplaced_transhuge_page() will see the
reference and abort the migration, as it may do today in other cases.
Fixes: b8916634b7 ("mm: Prevent parallel splits during THP migration")
Link: http://lkml.kernel.org/r/1497349722-6731-2-git-send-email-will.deacon@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memory_failure() chooses a recovery action function based on the page
flags. For huge pages it uses the tail page flags which don't have
anything interesting set, resulting in:
> Memory failure: 0x9be3b4: Unknown page state
> Memory failure: 0x9be3b4: recovery action for unknown page: Failed
Instead, save a copy of the head page's flags if this is a huge page,
this means if there are no relevant flags for this tail page, we use the
head pages flags instead. This results in the me_huge_page() recovery
action being called:
> Memory failure: 0x9b7969: recovery action for huge page: Delayed
For hugepages that have not yet been allocated, this allows the hugepage
to be dequeued.
Fixes: 524fca1e73 ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages")
Link: http://lkml.kernel.org/r/20170524130204.21845-1-james.morse@arm.com
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull powerpc fixes from Michael Ellerman:
"Three small fixes for recently merged code:
- remove a spurious WARN_ON when a PCI device has no of_node, it's
allowed in some circumstances for there to be no of_node.
- fix the offset for store EOI MMIOs in the XIVE interrupt
controller.
- fix non-const WARN_ONs which were becoming BUGs due to them losing
BUGFLAG_WARNING in a recent cleanup patch.
Thanks to: Alexey Kardashevskiy, Alistair Popple, Benjamin
Herrenschmidt"
* tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/debug: Add missing warn flag to WARN_ON's non-builtin path
powerpc/xive: Fix offset for store EOI MMIOs
powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_node
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix probing of precise_ip level for default cycles event, that
got broken recently on x86_64 when its arch code started
considering invalid requesting precise samples when not sampling
(i.e. when attr.sample_period == 0).
This also fixes another problem in s/390 where the precision
probing with sample_period == 0 returned precise_ip > 0, that
then, when setting up the real cycles event (not probing) would
return EOPNOTSUPP for precise_ip > 0 (as determined previously
by probing) and sample_period > 0.
These problems resulted in attr_precise not being set to the
highest precision available on x86.64 when no event was specified,
i.e. the canonical:
perf record ./workload
would end up using attr.precise_ip = 0. As a workaround this would
need to be done:
perf record -e cycles:P ./workload
And on s/390 it would plain not work, requiring using:
perf record -e cycles ./workload
as a workaround. (Arnaldo Carvalho de Melo)
- Fix perf build with ARCH=x86_64, when ARCH should be transformed
into ARCH=x86, just like with the main kernel Makefile and
tools/objtool's, i.e. use SRCARCH. (Jiada Wang)
- Avoid accessing uninitialized data structures when unwinding with
elfutils's libdw, making it more closely mimic libunwind's unwinder.
(Milian Wolff)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull configfs updates from Christoph Hellwig:
"A fix from Nic for a race seen in production (including a stable tag).
And while I'm sending you this I'm also sneaking in a trivial new
helper from Bart so that we don't need inter-tree dependencies for the
next merge window"
* tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs:
configfs: Introduce config_item_get_unless_zero()
configfs: Fix race between create_link and configfs_rmdir
Pull MMC fix from Ulf Hansson:
"MMC meson-gx host: work around broken SDIO with certain WiFi chips"
* tag 'mmc-v4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: meson-gx: work around broken SDIO with certain WiFi chips
Pull drm fixes from Dave Airlie:
"This is the main fixes pull for 4.12-rc6, all pretty normal for this
stage, nothing really stands out. The mxsfb one is probably the
largest and it's for a black screen boot problem.
AMD, i915, mgag200, msxfb, tegra fixes"
* tag 'drm-fixes-for-v4.12-rc6' of git://people.freedesktop.org/~airlied/linux:
drm: mxsfb_crtc: Reset the eLCDIF controller
drm/mgag200: Fix to always set HiPri for G200e4 V2
drm/tegra: Correct idr_alloc() minimum id
drm/tegra: Fix lockup on a use of staging API
gpu: host1x: Fix error handling
drm/radeon: Fix overflow of watermark calcs at > 4k resolutions.
drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions.
drm/radeon: fix "force the UVD DPB into VRAM as well"
drm/i915: Fix GVT-g PVINFO version compatibility check
drm/i915: Fix SKL+ watermarks for 90/270 rotation
drm/i915: Fix scaling check for 90/270 degree plane rotation
drm: dw-hdmi: Fix compilation breakage by selecting REGMAP_MMIO
Pull rdma fixes from Doug Ledford:
"I had thought at the time of the last pull request that there wouldn't
be much more to go, but several things just kept trickling in over the
last week.
Instead of just the six patches to bnxt_re that I had anticipated,
there are another five IPoIB patches, two qedr patches, and a few
other miscellaneous patches.
The bnxt_re patches are more lines of diff than I like to submit this
late in the game. That's mostly because of the first two patches in
the series of six. I almost dropped them just because of the lines of
churn, but on a close review, a lot of the churn came from removing
duplicated code sections and consolidating them into callable
routines. I felt like this made the number of lines of change more
acceptable, and they address problems, so I left them. The remainder
of the patches are all small, well contained, and well understood.
These have passed 0day testing, but have not been submitted to
linux-next (but a local merge test with your current master was
without any conflicts).
Summary:
- A fix for fix eea40b8f62 ("infiniband: call ipv6 route lookup via
the stub interface")
- Six patches against bnxt_re...the first two are considerably larger
than I would like, but as they address real issues I went ahead and
submitted them (it also helped that a good deal of the churn was
removing code repeated in multiple places and consolidating it to
one common function)
- Two fixes against qedr that just came in
- One fix against rxe that took a few revisions to get right plus
time to get the proper reviews
- Five late breaking IPoIB fixes
- One late cxgb4 fix"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
rdma/cxgb4: Fix memory leaks during module exit
IB/ipoib: Fix memory leak in create child syscall
IB/ipoib: Fix access to un-initialized napi struct
IB/ipoib: Delete napi in device uninit default
IB/ipoib: Limit call to free rdma_netdev for capable devices
IB/ipoib: Fix memory leaks for child interfaces priv
rxe: Fix a sleep-in-atomic bug in post_one_send
RDMA/qedr: Add 64KB PAGE_SIZE support to user-space queues
RDMA/qedr: Initialize byte_len in WC of READ and SEND commands
RDMA/bnxt_re: Remove FMR support
RDMA/bnxt_re: Fix RQE posting logic
RDMA/bnxt_re: Add HW workaround for avoiding stall for UD QPs
RDMA/bnxt_re: Dereg MR in FW before freeing the fast_reg_page_list
RDMA/bnxt_re: HW workarounds for handling specific conditions
RDMA/bnxt_re: Fixing the Control path command and response handling
IB/addr: Fix setting source address in addr6_resolve()
Pull x86 platform driver fix from Darren Hart:
"Just a single patch to fix an oops in the intel_telemetry_debugfs
module load/unload"
* tag 'platform-drivers-x86-v4.12-2' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: intel_telemetry_debugfs: fix oops when load/unload module
Pull block layer fix from Jens Axboe:
"Just a single fix this week, fixing a regression introduced in this
release.
When we put the final reference to the queue, we may need to block.
Ensure that we can safely do so. From Bart"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: Fix a blk_exit_rl() regression
Pull dmi fixes from Jean Delvare.
* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
firmware: dmi_scan: Check DMI structure length
firmware: dmi: Fix permissions of product_family
firmware: dmi_scan: Make dmi_walk and dmi_walk_early return real error codes
firmware: dmi_scan: Look for SMBIOS 3 entry point first
Pull selinux fix from James Morris:
"Fix for a double free bug in SELinux"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
selinux: fix double free in selinux_parse_opts_str()
When trapped on WARN_ON(), report_bug() is expected to return
BUG_TRAP_TYPE_WARN so the caller will increment NIP by 4 and continue.
The __builtin_constant_p() path of the PPC's WARN_ON()
calls (indirectly) __WARN_FLAGS() which has BUGFLAG_WARNING set,
however the other branch does not which makes report_bug() report a
bug rather than a warning.
Fixes: f26dee1510 ("debug: Avoid setting BUGFLAG_WARNING twice")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Driver Changes:
- dw-hdmi: Fix compilation error if REGMAP_MMIO not selected (Laurent)
- host1x: Fix incorrect return value (Christophe)
- tegra: Shore up idr API usage in tegra staging code (Dmitry)
- mgag200: Always use HiPri mode for G200e4v2 and limit max bandwidth (Mathieu)
- mxsfb: Ensure display can be lit up without bootloader initialization (Fabio)
Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Mathieu Larouche <mathieu.larouche@matrox.com>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
* tag 'drm-misc-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc:
drm: mxsfb_crtc: Reset the eLCDIF controller
drm/mgag200: Fix to always set HiPri for G200e4 V2
drm/tegra: Correct idr_alloc() minimum id
drm/tegra: Fix lockup on a use of staging API
gpu: host1x: Fix error handling
drm: dw-hdmi: Fix compilation breakage by selecting REGMAP_MMIO
A few fixes for 4.12:
- fix a UVD regression on SI
- fix overflow in watermark calcs on large modes
* 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: Fix overflow of watermark calcs at > 4k resolutions.
drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions.
drm/radeon: fix "force the UVD DPB into VRAM as well"
Using the syzkaller kernel fuzzer, Andrey Konovalov generated the
following error in gadgetfs:
> BUG: KASAN: use-after-free in __lock_acquire+0x3069/0x3690
> kernel/locking/lockdep.c:3246
> Read of size 8 at addr ffff88003a2bdaf8 by task kworker/3:1/903
>
> CPU: 3 PID: 903 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #35
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: usb_hub_wq hub_event
> Call Trace:
> __dump_stack lib/dump_stack.c:16 [inline]
> dump_stack+0x292/0x395 lib/dump_stack.c:52
> print_address_description+0x78/0x280 mm/kasan/report.c:252
> kasan_report_error mm/kasan/report.c:351 [inline]
> kasan_report+0x230/0x340 mm/kasan/report.c:408
> __asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:429
> __lock_acquire+0x3069/0x3690 kernel/locking/lockdep.c:3246
> lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> spin_lock include/linux/spinlock.h:299 [inline]
> gadgetfs_suspend+0x89/0x130 drivers/usb/gadget/legacy/inode.c:1682
> set_link_state+0x88e/0xae0 drivers/usb/gadget/udc/dummy_hcd.c:455
> dummy_hub_control+0xd7e/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2074
> rh_call_control drivers/usb/core/hcd.c:689 [inline]
> rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
> usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
> usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
> usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
> usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
> usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
> usb_clear_port_feature+0x74/0xa0 drivers/usb/core/hub.c:412
> hub_port_disable+0x123/0x510 drivers/usb/core/hub.c:4177
> hub_port_init+0x1ed/0x2940 drivers/usb/core/hub.c:4648
> hub_port_connect drivers/usb/core/hub.c:4826 [inline]
> hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
> port_event drivers/usb/core/hub.c:5105 [inline]
> hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
> process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
> process_scheduled_works kernel/workqueue.c:2157 [inline]
> worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
> kthread+0x363/0x440 kernel/kthread.c:231
> ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
>
> Allocated by task 9958:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
> kmem_cache_alloc_trace+0x87/0x280 mm/slub.c:2745
> kmalloc include/linux/slab.h:492 [inline]
> kzalloc include/linux/slab.h:665 [inline]
> dev_new drivers/usb/gadget/legacy/inode.c:170 [inline]
> gadgetfs_fill_super+0x24f/0x540 drivers/usb/gadget/legacy/inode.c:1993
> mount_single+0xf6/0x160 fs/super.c:1192
> gadgetfs_mount+0x31/0x40 drivers/usb/gadget/legacy/inode.c:2019
> mount_fs+0x9c/0x2d0 fs/super.c:1223
> vfs_kern_mount.part.25+0xcb/0x490 fs/namespace.c:976
> vfs_kern_mount fs/namespace.c:2509 [inline]
> do_new_mount fs/namespace.c:2512 [inline]
> do_mount+0x41b/0x2d90 fs/namespace.c:2834
> SYSC_mount fs/namespace.c:3050 [inline]
> SyS_mount+0xb0/0x120 fs/namespace.c:3027
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> Freed by task 9960:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
> slab_free_hook mm/slub.c:1357 [inline]
> slab_free_freelist_hook mm/slub.c:1379 [inline]
> slab_free mm/slub.c:2961 [inline]
> kfree+0xed/0x2b0 mm/slub.c:3882
> put_dev+0x124/0x160 drivers/usb/gadget/legacy/inode.c:163
> gadgetfs_kill_sb+0x33/0x60 drivers/usb/gadget/legacy/inode.c:2027
> deactivate_locked_super+0x8d/0xd0 fs/super.c:309
> deactivate_super+0x21e/0x310 fs/super.c:340
> cleanup_mnt+0xb7/0x150 fs/namespace.c:1112
> __cleanup_mnt+0x1b/0x20 fs/namespace.c:1119
> task_work_run+0x1a0/0x280 kernel/task_work.c:116
> exit_task_work include/linux/task_work.h:21 [inline]
> do_exit+0x18a8/0x2820 kernel/exit.c:878
> do_group_exit+0x14e/0x420 kernel/exit.c:982
> get_signal+0x784/0x1780 kernel/signal.c:2318
> do_signal+0xd7/0x2130 arch/x86/kernel/signal.c:808
> exit_to_usermode_loop+0x1ac/0x240 arch/x86/entry/common.c:157
> prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
> entry_SYSCALL_64_fastpath+0xbc/0xbe
>
> The buggy address belongs to the object at ffff88003a2bdae0
> which belongs to the cache kmalloc-1024 of size 1024
> The buggy address is located 24 bytes inside of
> 1024-byte region [ffff88003a2bdae0, ffff88003a2bdee0)
> The buggy address belongs to the page:
> page:ffffea0000e8ae00 count:1 mapcount:0 mapping: (null)
> index:0x0 compound_mapcount: 0
> flags: 0x100000000008100(slab|head)
> raw: 0100000000008100 0000000000000000 0000000000000000 0000000100170017
> raw: ffffea0000ed3020 ffffea0000f5f820 ffff88003e80efc0 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff88003a2bd980: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88003a2bda00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff88003a2bda80: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb
> ^
> ffff88003a2bdb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff88003a2bdb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
What this means is that the gadgetfs_suspend() routine was trying to
access dev->lock after it had been deallocated. The root cause is a
race in the dummy_hcd driver; the dummy_udc_stop() routine can race
with the rest of the driver because it contains no locking. And even
when proper locking is added, it can still race with the
set_link_state() function because that function incorrectly drops the
private spinlock before invoking any gadget driver callbacks.
The result of this race, as seen above, is that set_link_state() can
invoke a callback in gadgetfs even after gadgetfs has been unbound
from dummy_hcd's UDC and its private data structures have been
deallocated.
include/linux/usb/gadget.h documents that the ->reset, ->disconnect,
->suspend, and ->resume callbacks may be invoked in interrupt context.
In general this is necessary, to prevent races with gadget driver
removal. This patch fixes dummy_hcd to retain the spinlock across
these calls, and it adds a spinlock acquisition to dummy_udc_stop() to
prevent the race.
The net2280 driver makes the same mistake of dropping the private
spinlock for its ->disconnect and ->reset callback invocations. The
patch fixes it too.
Lastly, since gadgetfs_suspend() may be invoked in interrupt context,
it cannot assume that interrupts are enabled when it runs. It must
use spin_lock_irqsave() instead of spin_lock_irq(). The patch fixes
that bug as well.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com>
CC: <stable@vger.kernel.org>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the eLCDIF initialization steps listed in the MX6SX
Reference Manual the eLCDIF block reset is mandatory.
Without performing the eLCDIF reset the display shows garbage content
when the kernel boots.
In earlier tests this issue has not been observed because the bootloader
was previously showing a splash screen and the bootloader display driver
does properly implement the eLCDIF reset.
Add the eLCDIF reset to the driver, so that it can operate correctly
independently of the bootloader.
Tested on a imx6sx-sdb board.
Cc: <stable@vger.kernel.org>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1494007301-14535-1-git-send-email-fabio.estevam@nxp.com
Architecturally we should apply a 0x400 offset for these. Not doing
it will break future HW implementations.
The offset of 0 is supposed to remain for "triggers" though not all
sources support both trigger and store EOI, and in P9 specifically,
some sources will treat 0 as a store EOI. But future chips will not.
So this makes us use the properly architected offset which should work
always.
Fixes: 243e25112d ("powerpc/xive: Native exploitation of the XIVE interrupt controller")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently they return -1 on error, which will confuse callers if
they try to interpret it as a normal negative error code.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Since version 3.0.0 of the SMBIOS specification, there can be
multiple entry points in memory, pointing to one or two DMI tables.
If both a 32-bit ("_SM_") entry point and a 64-bit ("_SM3_") entry
point are present, the specification requires that the latter points
to a table which is a super-set of the table pointed to by the
former. Therefore we should give preference to the 64-bit ("_SM3_")
entry point.
However, currently the code is picking the first valid entry point
it finds. Per specification, we should look for a 64-bit ("_SM3_")
entry point first, and if we can't find any, look for a 32-bit
("_SM_" or "_DMI_") entry point. Modify the code to do that.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
It's not hard to trigger a bunch of d_invalidate() on the same
dentry in parallel. They end up fighting each other - any
dentry picked for removal by one will be skipped by the rest
and we'll go for the next iteration through the entire
subtree, even if everything is being skipped. Morevoer, we
immediately go back to scanning the subtree. The only thing
we really need is to dissolve all mounts in the subtree and
as soon as we've nothing left to do, we can just unhash the
dentry and bugger off.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The .its targets require information about the kernel binary, such as
its entry point, which is extracted from the vmlinux ELF. We therefore
require that the ELF is built before the .its files are generated.
Declare this requirement in the Makefile such that make will ensure this
is always the case, otherwise in corner cases we can hit issues as the
.its is generated with an incorrect (either invalid or stale) entry
point.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: cf2a5e0bb4 ("MIPS: Support generating Flattened Image Trees (.itb)")
Cc: linux-mips@linux-mips.org
Cc: stable <stable@vger.kernel.org> # v4.9+
Patchwork: https://patchwork.linux-mips.org/patch/16179/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The code handling the pop76 opcode (ie. bnezc & jialc instructions) in
__compute_return_epc_for_insn() needs to set the value of $31 in the
jialc case, which is encoded with rs = 0. However its check to
differentiate bnezc (rs != 0) from jialc (rs = 0) was unfortunately
backwards, meaning that if we emulate a bnezc instruction we clobber $31
& if we emulate a jialc instruction it actually behaves like a jic
instruction.
Fix this by inverting the check of rs to match the way the instructions
are actually encoded.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 28d6f93d20 ("MIPS: Emulate the new MIPS R6 BNEZC and JIALC instructions")
Cc: stable <stable@vger.kernel.org> # v4.0+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16178/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull networking fixes from David Miller:
1) The netlink attribute passed in to dev_set_alias() is not
necessarily NULL terminated, don't use strlcpy() on it. From
Alexander Potapenko.
2) Fix implementation of atomics in arm64 bpf JIT, from Daniel
Borkmann.
3) Correct the release of netdevs and driver private data in certain
circumstances.
4) Sanitize netlink message length properly in decnet, from Mateusz
Jurczyk.
5) Don't leak kernel data in rtnl_fill_vfinfo() netlink blobs. From
Yuval Mintz.
6) Hash secret is never initialized in ipv6 ILA translation code, from
Arnd Bergmann. I guess those clang warnings about unused inline
functions are useful for something!
7) Fix endian selection in bpf_endian.h, from Daniel Borkmann.
8) Sanitize sockaddr length before dereferncing any fields in AF_UNIX
and CAIF. From Mateusz Jurczyk.
9) Fix timestamping for GMAC3 chips in stmmac driver, from Mario
Molitor.
10) Do not leak netdev on dev_alloc_name() errors in mac80211, from
Johannes Berg.
11) Fix locking in sctp_for_each_endpoint(), from Xin Long.
12) Fix wrong memset size on 32-bit in snmp6, from Christian Perle.
13) Fix use after free in ip_mc_clear_src(), from WANG Cong.
14) Fix regressions caused by ICMP rate limiting changes in 4.11, from
Jesper Dangaard Brouer.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits)
i40e: Fix a sleep-in-atomic bug
net: don't global ICMP rate limit packets originating from loopback
net/act_pedit: fix an error code
net: update undefined ->ndo_change_mtu() comment
net_sched: move tcf_lock down after gen_replace_estimator()
caif: Add sockaddr length check before accessing sa_family in connect handler
qed: fix dump of context data
qmi_wwan: new Telewell and Sierra device IDs
net: phy: Fix MDIO_THUNDER dependencies
netconsole: Remove duplicate "netconsole: " logging prefix
igmp: acquire pmc lock for ip_mc_clear_src()
r8152: give the device version
net: rps: fix uninitialized symbol warning
mac80211: don't send SMPS action frame in AP mode when not needed
mac80211/wpa: use constant time memory comparison for MACs
mac80211: set bss_info data before configuring the channel
mac80211: remove 5/10 MHz rate code from station MLME
mac80211: Fix incorrect condition when checking rx timestamp
mac80211: don't look at the PM bit of BAR frames
i40e: fix handling of HW ATR eviction
...
Pull crypto fix from Herbert Xu:
"This fixes a bug on sparc where we may dereference freed stack memory"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: Work around deallocated stack frame reference gcc bug on sparc.
Pull ACPI fixes from Rafael Wysocki:
"These revert an ACPICA commit from the 4.11 cycle that causes problems
to happen on some systems and add a protection against possible kernel
crashes due to table reference counter imbalance.
Specifics:
- Revert a 4.11 ACPICA change that made assumptions which are not
satisfied on some systems and caused the enumeration of resources
to fail on them (Rafael Wysocki).
- Add a mechanism to prevent tables from being unmapped prematurely
due to reference counter overflows (Lv Zheng)"
* tag 'acpi-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Tables: Mechanism to handle late stage acpi_get_table() imbalance
Revert "ACPICA: Disassembler: Enhance resource descriptor detection"
Pull power management fixes from Rafael Wysocki:
"These revert a recent cpufreq schedutil governor change that turned
out to be problematic and fix a few minor issues in cpufreq, cpuidle
and the Exynos devfreq drivers.
Specifics:
- Revert a recent cpufreq schedutil governor change that caused some
systems to behave undesirably (Rafael Wysocki).
- Fix a cpufreq conservative governor issue introduced during the
3.10 cycle that prevents it from working as expected in some
situations (Tomasz Wilczyński).
- Fix an error code path in the generic cpuidle driver for DT-based
systems (Christophe Jaillet).
- Fix three minor issues in devfreq drivers for Exynos (Arvind Yadav,
Krzysztof Kozlowski)"
* tag 'pm-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: dt: Add missing 'of_node_put()'
cpufreq: conservative: Allow down_threshold to take values from 1 to 10
Revert "cpufreq: schedutil: Reduce frequencies slower"
PM / devfreq: exynos-ppmu: Staticize event list
PM / devfreq: exynos-ppmu: Handle return value of clk_prepare_enable
PM / devfreq: exynos-nocp: Handle return value of clk_prepare_enable
Pull HID fix from Jiri Kosina:
- ifdef-based bandaid for a long-standing issue with HID driver
matching, avoiding regressions in cases where specific driver is not
enabled in kernel .config, from Jiri Kosina
* 'for-4.12/driver-matching-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: let generic driver yield control iff specific driver has been enabled
Pull media fixes from Mauro Carvalho Chehab:
- some build dependency issues at CEC core with randconfigs
- fix an off by one error at vb2
- a race fix at cec core
- driver fixes at tc358743, sir_ir and rainshadow-cec
* tag 'media/v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] media/cec.h: use IS_REACHABLE instead of IS_ENABLED
[media] cec: race fix: don't return -ENONET in cec_receive()
[media] sir_ir: infinite loop in interrupt handler
[media] cec-notifier.h: handle unreachable CONFIG_CEC_CORE
[media] cec: improve MEDIA_CEC_RC dependencies
[media] vb2: Fix an off by one error in 'vb2_plane_vaddr'
[media] rainshadow-cec: Fix missing spin_lock_init()
[media] tc358743: fix register i2c_rd/wr function fix
The logics when deciding whether we need to do anything with direct blocks
is broken when new size is within the last direct block. It's better to
find the path to the last byte _not_ to be removed and use that instead
of the path to the beginning of the first block to be freed...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
tail unpacking is done in a wrong place; the deadlocks galore
is best dealt with by doing that in ->write_iter() (and switching
to iomap, while we are at it), but that's rather painful to
backport. The trouble comes from grabbing pages that cover
the beginning of tail from inside of ufs_new_fragments(); ongoing
pageout of any of those is going to deadlock on ->truncate_mutex
with process that got around to extending the tail holding that
and waiting for page to get unlocked, while ->writepage() on
that page is waiting on ->truncate_mutex.
The thing is, we don't need ->truncate_mutex when the fragment
we are trying to map is within the tail - the damn thing is
allocated (tail can't contain holes).
Let's do a plain lookup and if the fragment is present, we can
just pretend that we'd won the race in almost all cases. The
only exception is a fragment between the end of tail and the
end of block containing tail.
Protect ->i_lastfrag with ->meta_lock - read_seqlock_excl() is
sufficient.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The driver may sleep under a spin lock, and the function call path is:
i40e_ndo_set_vf_port_vlan (acquire the lock by spin_lock_bh)
i40e_vsi_remove_pvid
i40e_vlan_stripping_disable
i40e_aq_update_vsi_params
i40e_asq_send_command
mutex_lock --> may sleep
To fixed it, the spin lock is released before "i40e_vsi_remove_pvid", and
the lock is acquired again after this function.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
callers rely upon that, but find_lock_page() racing with attempt of
page eviction by memory pressure might have left us with
* try_to_free_buffers() successfully done
* __remove_mapping() failed, leaving the page in our mapping
* find_lock_page() returning an uptodate page with no
buffer_heads attached.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
For UFS2 we need 64bit variants; we even store them in uspi, but
use 32bit ones instead. One wrinkle is in handling of reserved
space - recalculating it every time had been stupid all along, but
now it would become really ugly. Just calculate it once...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a) honour ->s_minfree; don't just go with default (5)
b) don't bother with capability checks until we know we'll need them
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Florian Weimer seems to have a glibc test-case which requires that
loopback interfaces does not get ICMP ratelimited. This was broken by
commit c0303efeab ("net: reduce cycles spend on ICMP replies that
gets rate limited").
An ICMP response will usually be routed back-out the same incoming
interface. Thus, take advantage of this and skip global ICMP
ratelimit when the incoming device is loopback. In the unlikely event
that the outgoing it not loopback, due to strange routing policy
rules, ICMP rate limiting still works via peer ratelimiting via
icmpv4_xrlim_allow(). Thus, we should still comply with RFC1812
(section 4.3.2.8 "Rate Limiting").
This seems to fix the reproducer given by Florian. While still
avoiding to perform expensive and unneeded outgoing route lookup for
rate limited packets (in the non-loopback case).
Fixes: c0303efeab ("net: reduce cycles spend on ICMP replies that gets rate limited")
Reported-by: Florian Weimer <fweimer@redhat.com>
Reported-by: "H.J. Lu" <hjl.tools@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid that the following complaint is reported:
BUG: sleeping function called from invalid context at kernel/workqueue.c:2790
in_atomic(): 1, irqs_disabled(): 0, pid: 41, name: rcuop/3
1 lock held by rcuop/3/41:
#0: (rcu_callback){......}, at: [<ffffffff8111f9a2>] rcu_nocb_kthread+0x282/0x500
Call Trace:
dump_stack+0x86/0xcf
___might_sleep+0x174/0x260
__might_sleep+0x4a/0x80
flush_work+0x7e/0x2e0
__cancel_work_timer+0x143/0x1c0
cancel_work_sync+0x10/0x20
blk_throtl_exit+0x25/0x60
blkcg_exit_queue+0x35/0x40
blk_release_queue+0x42/0x130
kobject_put+0xa9/0x190
This happens since we invoke callbacks that need to block from the
queue release handler. Fix this by pushing the final release to
a workqueue.
Reported-by: Ross Zwisler <zwisler@gmail.com>
Fixes: commit b425e50492 ("block: Avoid that blk_exit_rl() triggers a use-after-free")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Updated changelog
Signed-off-by: Jens Axboe <axboe@fb.com>
I'm reviewing static checker warnings where we do ERR_PTR(0), which is
the same as NULL. I'm pretty sure we intended to return ERR_PTR(-EINVAL)
here. Sometimes these bugs lead to a NULL dereference but I don't
immediately see that problem here.
Fixes: 71d0ed7079 ("net/act_pedit: Support using offset relative to the conventional network headers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Storing stats _only_ at new locations is wrong for UFS1; old
locations should always be kept updated. The check for "has
been converted to use of new locations" is also wrong - it
should be "->fs_maxbsize is equal to ->fs_bsize".
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The flow of creating a new child goes through ipoib_vlan_add
which allocates a new interface and checks the rtnl_lock.
If the lock is taken, restart_syscall will be called to restart
the system call again. In this case we are not releasing the
already allocated interface, causing a leak.
Fixes: 9baa0b0364 ("IB/ipoib: Add rtnl_link_ops support")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch mekas init_default and uninit_default symmetric
with a call to delete napi. Additionally, the uninit_default
gained delete napi call in case of init_default fails.
Fixes: 515ed4f3aa ('IB/IPoIB: Separate control and data related initializations')
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
There is a need to free priv explicitly and not just to release
the device, child priv is freed explicitly on remove flow and this
patch also includes priv free on error flow in P_key creation
and also in add_port.
Fixes: cd565b4b51 ('IB/IPoIB: Support acceleration options callbacks')
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Update ->ndo_change_mtu() callback comment to remove text
about returning error in case of undefined callback. This
change makes the comment match the existing code behavior.
Signed-off-by: Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 18e7a45af9 ("perf/x86: Reject non sampling events with
precise_ip") returns -EINVAL for sys_perf_event_open() with an attribute
with (attr.precise_ip > 0 && attr.sample_period == 0), just like is done
in the routine used to probe the max precise level when no events were
passed to 'perf record' or 'perf top', i.e.:
perf_evsel__new_cycles()
perf_event_attr__set_max_precise_ip()
The x86 code, in x86_pmu_hw_config(), which is called all the way from
sys_perf_event_open() did, starting with the aforementioned commit:
/* There's no sense in having PEBS for non sampling events: */
if (!is_sampling_event(event))
return -EINVAL;
Which makes it fail for cycles:ppp, cycles:pp and cycles:p, always using
just the non precise cycles variant.
To make sure that this is the case, I tested it, before this patch,
with:
# perf probe -L x86_pmu_hw_config
<x86_pmu_hw_config@/home/acme/git/linux/arch/x86/events/core.c:0>
0 int x86_pmu_hw_config(struct perf_event *event)
1 {
2 if (event->attr.precise_ip) {
<SNIP>
17 if (event->attr.precise_ip > precise)
18 return -EOPNOTSUPP;
/* There's no sense in having PEBS for non sampling events: */
21 if (!is_sampling_event(event))
22 return -EINVAL;
}
<SNIP>
# perf probe x86_pmu_hw_config:22
Added new events:
probe:x86_pmu_hw_config (on x86_pmu_hw_config:22)
probe:x86_pmu_hw_config_1 (on x86_pmu_hw_config:22)
You can now use it in all perf tools, such as:
perf record -e probe:x86_pmu_hw_config_1 -aR sleep 1
# perf trace -e perf_event_open,probe:x86_pmu_hwconfig*/max-stack=16/ perf record usleep 1
0.000 ( 0.015 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.015 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.000 ( 0.021 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
0.023 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.025 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.023 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
0.028 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
0.030 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
x86_pmu_hw_config ([kernel.kallsyms])
hsw_hw_config ([kernel.kallsyms])
x86_pmu_event_init ([kernel.kallsyms])
perf_try_init_event ([kernel.kallsyms])
perf_event_alloc ([kernel.kallsyms])
SYSC_perf_event_open ([kernel.kallsyms])
sys_perf_event_open ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
return_from_SYSCALL_64 ([kernel.kallsyms])
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
perf_evsel__new_cycles (/home/acme/bin/perf)
perf_evlist__add_default (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
0.028 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
41.018 ( 0.012 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8b5dd0, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.065 ( 0.011 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.080 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
41.103 ( 0.010 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
41.115 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
41.122 ( 0.004 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
41.128 ( 0.008 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
#
I.e. that return -EINVAL in x86_pmu_hw_config() is hit three times.
So fix it by just setting attr.sample_period
Now, after this patch:
# perf trace --max-stack=2 -e perf_event_open,probe:x86_pmu_hw_config* perf record usleep 1
[ perf record: Woken up 1 times to write data ]
0.000 ( 0.017 ms): perf/8469 perf_event_open(attr_uptr: 0x7ffe36c27d10, pid: -1, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_event_open_cloexec_flag (/home/acme/bin/perf)
0.050 ( 0.031 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evlist__config (/home/acme/bin/perf)
0.092 ( 0.040 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evlist__config (/home/acme/bin/perf)
0.143 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, cpu: -1, group_fd: -1 ) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
0.161 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.171 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.180 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
0.190 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
syscall (/usr/lib64/libc-2.24.so)
perf_evsel__open (/home/acme/bin/perf)
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
#
The probe one called from perf_event_attr__set_max_precise_ip() works
the first time, with attr.precise_ip = 3, wit hthe next ones being the
per cpu ones for the cycles:ppp event.
And here is the text from a report and alternative proposed patch by
Thomas-Mich Richter:
---
On s390 the counter and sampling facility do not support a precise IP
skid level and sometimes returns EOPNOTSUPP when structure member
precise_ip in struct perf_event_attr is not set to zero.
On s390 commnd 'perf record -- true' fails with error EOPNOTSUPP. This
happens only when no events are specified on command line.
The functions called are
...
--> perf_evlist__add_default
--> perf_evsel__new_cycles
--> perf_event_attr__set_max_precise_ip
The last function determines the value of structure member precise_ip by
invoking the perf_event_open() system call and checking the return code.
The first successful open is the value for precise_ip.
However the value is determined without setting member sample_period and
indicates no sampling.
On s390 the counter facility and sampling facility are different. The
above procedure determines a precise_ip value of 3 using the counter
facility. Later it uses the sampling facility with a value of 3 and
fails with EOPNOTSUPP.
---
v2: Older compilers (e.g. gcc 4.4.7) don't support referencing members
of unnamed union members in the container struct initialization, so
move from:
struct perf_event_attr attr = {
...
.sample_period = 1,
};
to right after it as:
struct perf_event_attr attr = {
...
};
attr.sample_period = 1;
v3: We need to reset .sample_period to 0 to let the users of
perf_evsel__new_cycles() to properly setup attr.sample_period or
attr.sample_freq. Reported by Ingo Molnar.
Reported-and-Acked-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 18e7a45af9 ("perf/x86: Reject non sampling events with precise_ip")
Link: http://lkml.kernel.org/n/tip-yv6nnkl7tzqocrm0hl3x7vf1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Laura reported a sleep-in-atomic kernel warning inside
tcf_act_police_init() which calls gen_replace_estimator() with
spinlock protection.
It is not necessary in this case, we already have RTNL lock here
so it is enough to protect concurrent writers. For the reader,
i.e. tcf_act_police(), it needs to make decision based on this
rate estimator, in the worst case we drop more/less packets than
necessary while changing the rate in parallel, it is still acceptable.
Reported-by: Laura Abbott <labbott@redhat.com>
Reported-by: Nick Huber <nicholashuber@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current __ceph_setattr() can set inode's i_ctime to current_time(),
req->r_stamp or attr->ia_ctime. These time stamps may have minor
differences. It may cause potential problem.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
ceph uses ktime_get_real_ts() to get request time stamp. In most
other cases, current_kernel_time() is used to get time stamp for
filesystem operations (called by current_time()).
There is granularity difference between ktime_get_real_ts() and
current_kernel_time(). The later one can be up to one jiffy behind
the former one. This can causes inode's ctime to go back.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Converting a file handle to a dentry can be done call after the inode
unlink. This means that __fh_to_dentry() requires an extra check to
verify the number of links is not 0.
The issue can be easily reproduced using xfstest generic/426, which does
something like:
name_to_handle_at(&fh)
echo 3 > /proc/sys/vm/drop_caches
unlink()
open_by_handle_at(&fh)
The call to open_by_handle_at() should fail, as the file doesn't exist
anymore.
Link: http://tracker.ceph.com/issues/19958
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
init_send_wqe
copy_from_user --> may sleep
There is no flow that makes "qp->is_user" true, and copy_from_user may
cause bug when a non-user pointer is used. So the lines of copy_from_user
and check of "qp->is_user" are removed.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add 64KB PAGE_SIZE support to user-space CQ, SQ and RQ queues.
De-facto it means that code was added to translate 64KB
pages to smaller 4KB pages that the FW can handle. Otherwise,
the FW would wrap (or jump to the next page) when reaching 4KB
while the user space library will continue on the same large page.
Note that MR code remains as is since the FW supports larger pages
for MRs.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Initialize byte_len in work completion of RDMA_READ and RDMA_SEND.
Exposed by uDAPL application.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Some issues observed with FMR implementation
while running stress traffic. So removing the
FMR verbs support for now.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch adds code to ring RQ Doorbell aggressively
so that the adapter can DMA RQ buffers sooner, instead
of DMA all WQEs in the post_recv WR list together at the
end of the post_recv verb.
Also use spinlock to serialize RQ posting
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
HW stalls out after 0x800000 WQEs are posted for UD QPs.
To workaround this problem, driver will send a modify_qp cmd
to the HW at around the halfway mark(0x400000) so that FW
can accordingly modify the QP context in the HW to prevent this
stall.
This workaround needs to be done for UD, QP1 and Raw Ethertype
packets. Added a counter to keep track of WQEs posted during post_send.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
If the host buffers are freed before destroying MR in HW,
HW could try accessing these buffers. This could cause a host
crash. Fixing the code to avoid this condition.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch implements the following HW workarounds
1. The SQ depth needs to be augmented by 128 + 1 to avoid running
into an Out of order CQE issue
2. Workaround to handle the problem where the HW fast path engine continues
to access DMA memory in retranmission mode even after the WQE has
already been completed. If the HW reports this condition, driver detects
it and posts a Fence WQE. The driver stops reporting the completions
to stack until it receives completion for Fence WQE.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The caller only cares about zero vs non-zero so this code actually works
fine but we should be returning a negative error code instead of a valid
pointer casted to int.
Fixes: 554c0a3abf ("staging: Add rtl8723bs sdio wifi driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Drop log level for blanking from info to debug. Xorg likes to habitually
unblank when already unblanked and this can fill up logs over a long period
of time.
Signed-off-by: Mike Gerow <gerow@google.com>
Cc: bernie@plugable.com
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
When CONFIG_PROC_FS is disabled, we get warnings about unused variables
as remove_proc_entry() evaluates to an empty macro.
drivers/video/fbdev/via/viafbdev.c: In function 'viafb_remove_proc':
drivers/video/fbdev/via/viafbdev.c:1635:4: error: unused variable 'iga2_entry' [-Werror=unused-variable]
drivers/video/fbdev/via/viafbdev.c:1634:4: error: unused variable 'iga1_entry' [-Werror=unused-variable]
These are easy to avoid by using the pointer from the structure.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
gcc-7 suspects this code might be wrong because we use the
result of a multiplication as a bool:
drivers/video/fbdev/core/fbmon.c: In function 'fb_edid_add_monspecs':
drivers/video/fbdev/core/fbmon.c:1051:84: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context]
It's actually fine, so let's add a comparison to zero to make
that clear to the compiler too.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Jonathan writes:
Second set of IIO fixes for the 4.12 cycle.
* buffer-dma / buffer-dmaengine
- Fix missing include of buffer_impl.h after the split of buffer.h.
No driver in mainline is currently using these buffers so it wasn't
picked up by automated build tests.
* ad7152
- Fix a deadlock in ad7152_write_raw_samp_freq as the chip_state lock
was already held.
* inv_mpu6050
- Add low pass filter setting for chips newer than the MPU6500. None of
use previously picked up no the fact it was different on these newer
chips. It is separately set for the acceleration on these parts. There
is no normal reason to set it differently so the userspace interface
remains the same as for early parts.
* meson-saradc:
- Fix a potential crash by NULL pointer dereference in
meson_sar_adc_clear_fifo.
* mxs-lradc
- Fix a return value check where IS_ERR is used on a function that returns
NULL on error
Commit 4c3b89effc ("powerpc/powernv: Add sanity checks to
pnv_pci_get_{gpu|npu}_dev") introduced explicit warnings in
pnv_pci_get_npu_dev() when a PCIe device has no associated device-tree
node. However not all PCIe devices have an of_node and
pnv_pci_get_npu_dev() gets indirectly called at least once for every
PCIe device in the system. This results in spurious WARN_ON()'s so
remove it.
The same situation should not exist for pnv_pci_get_gpu_dev() as any
NPU based PCIe device requires a device-tree node.
Fixes: 4c3b89effc ("powerpc/powernv: Add sanity checks to pnv_pci_get_{gpu|npu}_dev")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in the connect()
handler of the AF_CAIF socket. Since the syscall doesn't enforce a minimum
size of the corresponding memory region, very short sockaddrs (zero or one
byte long) result in operating on uninitialized memory while referencing
sa_family.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixing a concurrency issue with creq handling. Each caller
was given a globally managed crsq element, which was
accessed outside a lock. This could result in corruption,
if lot of applications are simultaneously issuing Control Path
commands. Now, each caller will provide its own response buffer
and the responses will be copied under a lock.
Also, Fixing the queue full condition check for the CMDQ.
As a part of these changes, the control path code is refactored
to remove the code replication in the response status checking.
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add buffer_impl.h as buffer.h was split into interface for using and
for internals. Without this industrialio-buffer-dmaengine.c fails
to compile.
Fixes:
commit 33dd94cb97 ("iio:buffer.h - split
into buffer.h and buffer_impl.h")
Signed-off-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Add buffer_impl.h as buffer.h was split into interface for using and
for internals. Without this industrialio-buffer-dma.c fails
to compile.
Fixes:
commit 33dd94cb97 ("iio:buffer.h - split
into buffer.h and buffer_impl.h")
Signed-off-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
This reverts commit 5ab92a7cb8.
System cannot enter suspend mode because of heartbeat led trigger.
In autosleep_wq, try_to_suspend function will try to enter suspend
mode in specific period. it will get wakeup_count then call pm_notifier
chain callback function and freeze processes.
Heartbeat_pm_notifier is called and it call led_trigger_unregister to
change the trigger of led device to none. It will send uevent message
and the wakeup source count changed. As wakeup_count changed, suspend
will abort.
Fixes: 5ab92a7cb8 ("leds: handle suspend/resume in heartbeat trigger")
Signed-off-by: Zhang Bo <bo.zhang@nxp.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Each nibble represents 4 LEDs, and in case of the higher register, bit 0
represents LED 4, so we need to use modulus for the LED number as well.
Fixes: fd7b025a23 ("leds: add BCM6328 LED driver")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Acked-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Simon Wunderlich says:
====================
Here are two batman-adv bugfixes:
- fix rx packet counters for local ARP replies, by Sven Eckelmann
- fix memory leaks for unicast packetes received from another gateway
in bridge loop avoidance, by Andreas Pape
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Some fixes:
* Avi fixes some fallout from my mac80211 RX flags changes
* Emmanuel fixes an issue with adhering to the spec, and
an oversight in the SMPS management code
* Jason's patch makes mac80211 use constant-time memory
comparisons for message authentication, to avoid having
potentially observable timing differences
* my fix makes mac80211 set the basic rates bitmap before
the channel so the next update to the driver has more
consistent data - this required another rework patch to
remove some useless 5/10 MHz code that can never be hit
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently when dumping a context data only word number '1' is read for the
entire context.
Fixes: c965db4446 ("qed: Add support for debug data collection")
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A new Sierra Wireless EM7305 device ID used in a Toshiba laptop,
and two Longcheer device IDs entries used by Telewell TW-3G HSPA+
branded modems.
Reported-by: Petr Kloc <petr_kloc@yahoo.com>
Reported-by: Teemu Likonen <tlikonen@iki.fi>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 90eff9096c ("net: phy: Allow splitting MDIO
bus/device support from PHYs") we could create a configuration where
MDIO_DEVICE=y and PHYLIB=m which leads to the following undefined
references:
drivers/built-in.o: In function `thunder_mdiobus_pci_remove':
>> mdio-thunder.c:(.text+0x2a212f): undefined reference to
>> `mdiobus_unregister'
>> mdio-thunder.c:(.text+0x2a2138): undefined reference to
>> `mdiobus_free'
drivers/built-in.o: In function `thunder_mdiobus_pci_probe':
mdio-thunder.c:(.text+0x2a22e7): undefined reference to
`devm_mdiobus_alloc_size'
mdio-thunder.c:(.text+0x2a236f): undefined reference to
`of_mdiobus_register'
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 90eff9096c ("net: phy: Allow splitting MDIO bus/device support from PHYs")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's already added by pr_fmt so remove the explicit use.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrey reported a use-after-free in add_grec():
for (psf = *psf_list; psf; psf = psf_next) {
...
psf_next = psf->sf_next;
where the struct ip_sf_list's were already freed by:
kfree+0xe8/0x2b0 mm/slub.c:3882
ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
sock_release+0x8d/0x1e0 net/socket.c:597
sock_close+0x16/0x20 net/socket.c:1072
This happens because we don't hold pmc->lock in ip_mc_clear_src()
and a parallel mr_ifc_timer timer could jump in and access them.
The RCU lock is there but it is merely for pmc itself, this
spinlock could actually ensure we don't access them in parallel.
Thanks to Eric and Long for discussion on this bug.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Getting the device version out of the driver really aids debugging.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes uninitialized symbol warning that
got introduced by the following commit
773fc8f6e8 ("net: rps: send out pending IPI's on CPU hotplug")
Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are many situations where generic HID driver provides some basic level
of support for certain device, but later this support (usually by implementing
vendor-specific extensions of HID protocol) is extended and the support moved
over to a separate (usually per-vendor) specific driver.
This might bring a rather unpleasant suprise for users, as all of a sudden
there is a new config option they have to enable in order to get any support
for their device whatsoever, although previous kernel versions provided basic
support through the generic driver. Which is rightfully seen as a regression.
Fix this by including the entry for a particular device in
hid_have_special_driver[] iff the specific config option has been specified,
and let generic driver handle the device otherwise.
Also make the behavior of hid_scan_report() (where the same decision is being
taken on a per-report level) consistent.
While at it, reshuffle the hid_have_special_driver[] a bit to restore the
alphabetical ordering (first order by config option, and within those
sections order by VID).
This is considered a short-term solution, before generic way of giving
precedence to special drivers and falling back to generic driver is
figured out.
While at it, fixup a missing entry for GFRM driver; thanks to Hans de Geode for
spotting this (and for discovering a few issues in the conversion).
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
mac80211 allows to modify the SMPS state of an AP both,
when it is started, and after it has been started. Such a
change will trigger an action frame to all the peers that
are currently connected, and will be remembered so that
new peers will get notified as soon as they connect (since
the SMPS setting in the beacon may not be the right one).
This means that we need to remember the SMPS state
currently requested as well as the SMPS state that was
configured initially (and advertised in the beacon).
The former is bss->req_smps and the latter is
sdata->smps_mode.
Initially, the AP interface could only be started with
SMPS_OFF, which means that sdata->smps_mode was SMPS_OFF
always. Later, a nl80211 API was added to be able to start
an AP with a different AP mode. That code forgot to update
bss->req_smps and because of that, if the AP interface was
started with SMPS_DYNAMIC, we had:
sdata->smps_mode = SMPS_DYNAMIC
bss->req_smps = SMPS_OFF
That configuration made mac80211 think it needs to fire off
an action frame to any new station connecting to the AP in
order to let it know that the actual SMPS configuration is
SMPS_OFF.
Fix that by properly setting bss->req_smps in
ieee80211_start_ap.
Fixes: f699317487 ("mac80211: set smps_mode according to ap params")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When mac80211 changes the channel, it also calls into the driver's
bss_info_changed() callback, e.g. with BSS_CHANGED_IDLE. The driver
may, like iwlwifi does, access more data from bss_info in that case
and iwlwifi accesses the basic_rates bitmap, but if changing from a
band with more (basic) rates to one with fewer, an out-of-bounds
access of the rate array may result.
While we can't avoid having invalid data at some point in time, we
can avoid having it while we call the driver - so set up all the
data before configuring the channel, and then apply it afterwards.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=195677
Reported-by: Johannes Hirte <johannes.hirte@datenkhaos.de>
Tested-by: Johannes Hirte <johannes.hirte@datenkhaos.de>
Debugged-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There's no need for the station MLME code to handle bitrates for 5
or 10 MHz channels when it can't ever create such a configuration.
Remove the unnecessary code.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If the driver reports the rx timestamp at PLCP start, mac80211 can
only handle legacy encoding, but the code checks that the encoding
is not legacy. Fix this.
Fixes: da6a4352e7 ("mac80211: separate encoding/bandwidth from flags")
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch is based on a discussion generated by an earlier patch
from Tetsuo Handa:
* https://marc.info/?t=149035659300001&r=1&w=2
The double free problem involves the mnt_opts field of the
security_mnt_opts struct, selinux_parse_opts_str() frees the memory
on error, but doesn't set the field to NULL so if the caller later
attempts to call security_free_mnt_opts() we trigger the problem.
In order to play it safe we change selinux_parse_opts_str() to call
security_free_mnt_opts() on error instead of free'ing the memory
directly. This should ensure that everything is handled correctly,
regardless of what the caller may do.
Fixes: e000752989 ("LSM/SELinux: Interfaces to allow FS to control mount options")
Cc: stable@vger.kernel.org
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Pull Xtensa fixes from Max Filippov:
- don't use linux IRQ #0 in legacy irq domains: fixes timer interrupt
assignment when it's hardware IRQ # is 0 and the kernel is built w/o
device tree support
- reduce reservation size for double exception vector literals from 48
to 20 bytes: fixes build on cores with small user exception vector
- cleanups: use kmalloc_array instead of kmalloc in simdisk_init and
seq_puts instead of seq_printf in c_show.
* tag 'xtensa-20170612' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: don't use linux IRQ #0
xtensa: reduce double exception literal reservation
xtensa: ISS: Use kmalloc_array() in simdisk_init()
xtensa: Use seq_puts() in c_show()
Pull s390 fixes from Martin Schwidefsky:
- A fix for KVM to avoid kernel oopses in case of host protection
faults due to runtime instrumentation
- A fix for the AP bus to avoid dead devices after unbind / bind
- A fix for a compile warning merged from the vfio_ccw tree
- Updated default configurations
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: update defconfig
s390/zcrypt: Fix blocking queue device after unbind/bind.
s390/vfio_ccw: make some symbols static
s390/kvm: do not rely on the ILC on kvm host protection fauls
A recent commit to refactor the driver and remove the hw_disabled_flags
field accidentally introduced two regressions. First, we overwrote
pf->flags which removed various key flags including the MSI-X settings.
Additionally, it was intended that we have now two flags,
HW_ATR_EVICT_CAPABLE and HW_ATR_EVICT_ENABLED, but this was not done,
and we accidentally were mis-using HW_ATR_EVICT_CAPABLE everywhere.
This patch adds the missing piece, HW_ATR_EVICT_ENABLED, and safely
updates pf->flags instead of overwriting it.
Without this patch we will have many problems including disabling MSI-X
support, and we'll attempt to use HW ATR eviction on devices which do
not support it.
Fixes: 47994c119a ("i40e: remove hw_disabled_flags in favor of using separate flag bits", 2017-04-19)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PCI endpoint test driver uses crc32_le() so it should select
CRC32. Fixes this build error (when CRC32=m):
drivers/built-in.o: In function `pci_epf_test_cmd_handler':
pci-epf-test.c:(.text+0x2d98d): undefined reference to `crc32_le'
Fixes: 349e7a85b2 ("PCI: endpoint: functions: Add an EP function to test PCI")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
When HSR interface is setup using ip link command, an annoying warning
appears with the trace as below:-
[ 203.019828] hsr_get_node: Non-HSR frame
[ 203.019833] Modules linked in:
[ 203.019848] CPU: 0 PID: 158 Comm: sd-resolve Tainted: G W 4.12.0-rc3-00052-g9fa6bf70 #2
[ 203.019853] Hardware name: Generic DRA74X (Flattened Device Tree)
[ 203.019869] [<c0110280>] (unwind_backtrace) from [<c010c2f4>] (show_stack+0x10/0x14)
[ 203.019880] [<c010c2f4>] (show_stack) from [<c04b9f64>] (dump_stack+0xac/0xe0)
[ 203.019894] [<c04b9f64>] (dump_stack) from [<c01374e8>] (__warn+0xd8/0x104)
[ 203.019907] [<c01374e8>] (__warn) from [<c0137548>] (warn_slowpath_fmt+0x34/0x44)
root@am57xx-evm:~# [ 203.019921] [<c0137548>] (warn_slowpath_fmt) from [<c081126c>] (hsr_get_node+0x148/0x170)
[ 203.019932] [<c081126c>] (hsr_get_node) from [<c0814240>] (hsr_forward_skb+0x110/0x7c0)
[ 203.019942] [<c0814240>] (hsr_forward_skb) from [<c0811d64>] (hsr_dev_xmit+0x2c/0x34)
[ 203.019954] [<c0811d64>] (hsr_dev_xmit) from [<c06c0828>] (dev_hard_start_xmit+0xc4/0x3bc)
[ 203.019963] [<c06c0828>] (dev_hard_start_xmit) from [<c06c13d8>] (__dev_queue_xmit+0x7c4/0x98c)
[ 203.019974] [<c06c13d8>] (__dev_queue_xmit) from [<c0782f54>] (ip6_finish_output2+0x330/0xc1c)
[ 203.019983] [<c0782f54>] (ip6_finish_output2) from [<c0788f0c>] (ip6_output+0x58/0x454)
[ 203.019994] [<c0788f0c>] (ip6_output) from [<c07b16cc>] (mld_sendpack+0x420/0x744)
As this is an expected path to hsr_get_node() with frame coming from
the master interface, add a check to ensure packet is not from the
master port and then warn.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When plugging an USB webcam I see the following message:
[106385.615559] xhci_hcd 0000:04:00.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
[106390.583860] handle_tx_event: 913 callbacks suppressed
With this patch applied, I get no more printing of this message.
Cc: <stable@vger.kernel.org>
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xHCI host controllers can have both USB 3.1 and 3.0 extended speed
protocol lists. If the USB3.1 speed is parsed first and 3.0 second then
the minor revision supported will be overwritten by the 3.0 speeds and
the USB3 roothub will only show support for USB 3.0 speeds.
This was the case with a xhci controller with the supported protocol
capability listed below.
In xhci-mem.c, the USB 3.1 speed is parsed first, the min_rev of usb3_rhub
is set as 0x10. And then USB 3.0 is parsed. However, the min_rev of
usb3_rhub will be changed to 0x00. If USB 3.1 device is connected behind
this host controller, the speed of USB 3.1 device just reports 5G speed
using lsusb.
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00 01 08 00 00 00 00 00 40 00 00 00 00 00 00 00 00
10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20 02 08 10 03 55 53 42 20 01 02 00 00 00 00 00 00 //USB 3.1
30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
40 02 08 00 03 55 53 42 20 03 06 00 00 00 00 00 00 //USB 3.0
50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60 02 08 00 02 55 53 42 20 09 0E 19 00 00 00 00 00 //USB 2.0
70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
This patch fixes the issue by only owerwriting the minor revision if
it is higher than the existing one.
[reword commit message -Mathias]
Cc: <stable@vger.kernel.org>
Signed-off-by: YD Tseng <yd_tseng@asmedia.com.tw>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Felipe writes:
usb: fixes for v4.12-rc5
Alan Stern fixed a GPF in gadgetfs found by the kernel fuzzying project
composite.c learned that if it deactivates a function during bind, it
must reactivate it during unbind.
Reading /proc/net/snmp6 yields bogus values on 32 bit kernels.
Use "u64" instead of "unsigned long" in sizeof().
Fixes: 4a4857b1c8 ("proc: Reduce cache miss in snmp6_seq_show")
Signed-off-by: Christian Perle <christian.perle@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull devfreq fixes from MyungJoo Ham.
* 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq:
PM / devfreq: exynos-ppmu: Staticize event list
PM / devfreq: exynos-ppmu: Handle return value of clk_prepare_enable
PM / devfreq: exynos-nocp: Handle return value of clk_prepare_enable
'of_node_put()' should be called on pointer returned by
'of_parse_phandle()' when done. In this function this is done in all path
except this 'continue', so add it.
Fixes: 97735da074 (drivers: cpuidle: Add status property to ARM idle states)
Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 27ed3cd2eb (cpufreq: conservative: Fix the logic in frequency
decrease checking) removed the 10 point substraction when comparing the
load against down_threshold but did not remove the related limit for the
down_threshold value. As a result, down_threshold lower than 11 is not
allowed even though values from 1 to 10 do work correctly too. The
comment ("cannot be lower than 11 otherwise freq will not fall") also
does not apply after removing the substraction.
For this reason, allow down_threshold to take any value from 1 to 99
and fix the related comment.
Fixes: 27ed3cd2eb (cpufreq: conservative: Fix the logic in frequency decrease checking)
Signed-off-by: Tomasz Wilczyński <twilczynski@naver.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Revert commit 39b64aa1c0 (cpufreq: schedutil: Reduce frequencies
slower) that introduced unintentional changes in behavior leading
to adverse effects on some systems.
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Considering this case:
1. A program opens a sysfs table file 65535 times, it can increase
validation_count and first increment cause the table to be mapped:
validation_count = 65535
2. AML execution causes "Load" to be executed on the same
table, this time it cannot increase validation_count, so
validation_count remains:
validation_count = 65535
3. The program closes sysfs table file 65535 times, it can decrease
validation_count and the last decrement cause the table to be
unmapped:
validation_count = 0
4. AML code still accessing the loaded table, kernel crash can be
observed.
To prevent that from happening, add a validation_count threashold.
When it is reached, the validation_count can no longer be
incremented/decremented to invalidate the table descriptor (means
preventing table unmappings)
Note that code added in acpi_tb_put_table() is actually a no-op but
changes the warning message into a "warn once" one. Lv Zheng.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog, comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There have been reports about SDIO failing with certain WiFi chips in
descriptor chain mode. SD / eMMC are working fine.
So let's fall back to bounce buffer mode for command SD_IO_RW_EXTENDED.
This was reported to fix the error.
Fixes: 79ed05e329 "mmc: meson-gx: add support for descriptor chain mode"
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The ppmu_events array is accessed only in this compilation unit so it
can be made static.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Pull key subsystem fixes from James Morris:
"Here are a bunch of fixes for Linux keyrings, including:
- Fix up the refcount handling now that key structs use the
refcount_t type and the refcount_t ops don't allow a 0->1
transition.
- Fix a potential NULL deref after error in x509_cert_parse().
- Don't put data for the crypto algorithms to use on the stack.
- Fix the handling of a null payload being passed to add_key().
- Fix incorrect cleanup an uninitialised key_preparsed_payload in
key_update().
- Explicit sanitisation of potentially secure data before freeing.
- Fixes for the Diffie-Helman code"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits)
KEYS: fix refcount_inc() on zero
KEYS: Convert KEYCTL_DH_COMPUTE to use the crypto KPP API
crypto : asymmetric_keys : verify_pefile:zero memory content before freeing
KEYS: DH: add __user annotations to keyctl_kdf_params
KEYS: DH: ensure the KDF counter is properly aligned
KEYS: DH: don't feed uninitialized "otherinfo" into KDF
KEYS: DH: forbid using digest_null as the KDF hash
KEYS: sanitize key structs before freeing
KEYS: trusted: sanitize all key material
KEYS: encrypted: sanitize all key material
KEYS: user_defined: sanitize key payloads
KEYS: sanitize add_key() and keyctl() key payloads
KEYS: fix freeing uninitialized memory in key_update()
KEYS: fix dereferencing NULL payload with nonzero length
KEYS: encrypted: use constant-time HMAC comparison
KEYS: encrypted: fix race causing incorrect HMAC calculations
KEYS: encrypted: fix buffer overread in valid_master_desc()
KEYS: encrypted: avoid encrypting/decrypting stack buffers
KEYS: put keyring if install_session_keyring_to_cred() fails
KEYS: Delete an error message for a failed memory allocation in get_derived_key()
...
Commit abb2ea7dfd ("compiler, clang: suppress warning for unused
static inline functions") just caused more warnings due to re-defining
the 'inline' macro.
So undef it before re-defining it, and also add the 'notrace' attribute
like the gcc version that this is overriding does.
Maybe this makes clang happier.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes two issues:
1) When forwarding on *,G mroutes that are in a vrf, the
kernel was dropping information about the actual incoming
interface when calling ip_mr_forward from ip_mr_input.
This caused ip_mr_forward to send the multicast packet
back out the incoming interface. Fix this by
modifying ip_mr_forward to be handed the correctly
resolved dev.
2) When a unresolved cache entry is created we store
the incoming skb on the unresolved cache entry and
upon mroute resolution from the user space daemon,
we attempt to forward the packet. Again we were
not resolving to the correct incoming device for
a vrf scenario, before calling ip_mr_forward.
Fix this by resolving to the correct interface
and calling ip_mr_forward with the result.
Fixes: e58e415968 ("net: Enable support for VRF with ipv4 multicast")
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox mlx5 fixes 2017-06-11
This series contains some fixes for the mlx5 core and netdev driver.
Please pull and let me know if there's any problem.
For -stable:
("net/mlx5e: Added BW check for DIM decision mechanism") kernels >= 4.9
("net/mlx5e: Fix wrong indications in DIM due to counter wraparound") kernels >= 4.9
("net/mlx5: Remove several module events out of ethtool stats") kernels >= 4.10
("net/mlx5: Enable 4K UAR only when page size is bigger than 4K") kernels >= 4.11
*all patches apply with no issue on their -stable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal says:
====================
Bugs fixes in ena ethernet driver
This patchset contains fixes for the bugs that were discovered so far.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
check_for_missing_tx_completions() is called from a timer
task and looking for lost tx packets.
The old implementation accumulate all the lost tx packets
and did not check if those packets were retrieved on a later stage.
This cause to a situation where the driver reset
the device for no reason.
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the rare case where the device runs out of free rx buffer
descriptors (in case of pressure on kernel memory),
and the napi handler continuously fail to refill new Rx descriptors
until device rx queue totally runs out of all free rx buffers
to post incoming packet, leading to a deadlock:
* The device won't send interrupts since all the new
Rx packets will be dropped.
* The napi handler won't try to allocate new Rx descriptors
since allocation is part of NAPI that's not being invoked any more
The fix involves detecting this scenario and rescheduling NAPI
(to refill buffers) by the keepalive/watchdog task.
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch also change the mapping functions to devm_ functions
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug:
"Completion context is occupied" error printout will be noticed in
dmesg.
This error will cause the admin command to fail, which will lead to
an ena_probe() failure or a watchdog reset (depends on which admin
command failed).
Root cause:
__ena_com_submit_admin_cmd() is the function that submits new entries to
the admin queue.
The function have a check that makes sure the queue is not full and the
function does not override any outstanding command.
It uses head and tail indexes for this check.
The head is increased by ena_com_handle_admin_completion() which runs
from interrupt context, and the tail index is increased by the submit
function (the function is running under ->q_lock, so there is no risk
of multithread increment).
Each command is associated with a completion context. This context
allocated before call to __ena_com_submit_admin_cmd() and freed by
ena_com_wait_and_process_admin_cq_interrupts(), right after the command
was completed.
This can lead to a state where the head was increased, the check passed,
but the completion context is still in use.
Solution:
Use the atomic variable ->outstanding_cmds instead of using the head and
the tail indexes.
This variable is safe for use since it is bumped in get_comp_ctx() in
__ena_com_submit_admin_cmd() and is freed by comp_ctxt_release()
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixing a bug that the driver does not unmask the IO interrupts
in ndo_open():
occasionally, the MSI-X interrupt (for one or more IO queues)
can be masked when ndo_close() was called.
If that is followed by ndo open(),
then the MSI-X will be still masked so no interrupt
will be received by the driver.
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current flow to detect admin completion is:
while (command_not_completed) {
if (timeout)
error
check_for_completion()
sleep()
}
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.
The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull randomness fixes from Ted Ts'o:
"Improve performance by using a lockless update mechanism suggested by
Linus, and make sure we refresh per-CPU entropy returned get_random_*
as soon as the CRNG is initialized"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: invalidate batched entropy after crng init
random: use lockless method of accessing and updating f->reg_idx
Pull ext4 fixes from Ted Ts'o:
"Fix various bug fixes in ext4 caused by races and memory allocation
failures"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix fdatasync(2) after extent manipulation operations
ext4: fix data corruption for mmap writes
ext4: fix data corruption with EXT4_GET_BLOCKS_ZERO
ext4: fix quota charging for shared xattr blocks
ext4: remove redundant check for encrypted file on dio write path
ext4: remove unused d_name argument from ext4_search_dir() et al.
ext4: fix off-by-one error when writing back pages before dio read
ext4: fix off-by-one on max nr_pages in ext4_find_unwritten_pgoff()
ext4: keep existing extra fields when inode expands
ext4: handle the rest of ext4_mb_load_buddy() ENOMEM errors
ext4: fix off-by-in in loop termination in ext4_find_unwritten_pgoff()
ext4: fix SEEK_HOLE
jbd2: preserve original nofs flag during journal restart
ext4: clear lockdep subtype for quota files on quota off
Pull GPIO fixes from Linus Walleij:
"A few overdue GPIO patches for the v4.12 kernel.
- Fix debounce logic on the Aspeed platform.
- Fix the "virtual gpio" things on the Intel Crystal Cove.
- Fix the blink counter selection on the MVEBU platform"
* tag 'gpio-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: mvebu: fix gpio bank registration when pwm is used
gpio: mvebu: fix blink counter register selection
MAINTAINERS: remove self from GPIO maintainers
gpio: crystalcove: Do not write regular gpio registers for virtual GPIOs
gpio: aspeed: Don't attempt to debounce if disabled
Pull char/misc driver fixes from Greg KH:
"Here are some small driver fixes for 4.12-rc5. Nothing major here,
just some small bugfixes found by people testing, and a MAINTAINERS
file update for the genwqe driver.
All have been in linux-next with no reported issues"
[ The cxl driver fix came in through the powerpc tree earlier ]
* tag 'char-misc-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
cxl: Avoid double free_irq() for psl,slice interrupts
mei: make sysfs modalias format similar as uevent modalias
drivers: char: mem: Fix wraparound check to allow mappings up to the end
MAINTAINERS: Change maintainer of genwqe driver
goldfish_pipe: use GFP_ATOMIC under spin lock
firmware: vpd: do not leak kobjects
firmware: vpd: avoid potential use-after-free when destroying section
firmware: vpd: do not leave freed section attributes to the list
Pull staging/IIO fixes from Greg KH:
"These are mostly all IIO driver fixes, resolving a number of tiny
issues. There's also a ccree and lustre fix in here as well, both fix
problems found in those codebases.
All have been in linux-next with no reported issues"
* tag 'staging-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: ccree: fix buffer copy
staging/lustre/lov: remove set_fs() call from lov_getstripe()
staging: ccree: add CRYPTO dependency
iio: adc: sun4i-gpadc-iio: fix parent device being used in devm function
iio: light: ltr501 Fix interchanged als/ps register field
iio: adc: bcm_iproc_adc: swap primary and secondary isr handler's
iio: trigger: fix NULL pointer dereference in iio_trigger_write_current()
iio: adc: max9611: Fix attribute measure unit
iio: adc: ti_am335x_adc: allocating too much in probe
iio: adc: sun4i-gpadc-iio: Fix module autoload when OF devices are registered
iio: adc: sun4i-gpadc-iio: Fix module autoload when PLATFORM devices are registered
iio: proximity: as3935: fix iio_trigger_poll issue
iio: proximity: as3935: fix AS3935_INT mask
iio: adc: Max9611: checking for ERR_PTR instead of NULL in probe
iio: proximity: as3935: recalibrate RCO after resume
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 4.12-rc5
They are for some reported issues in the chipidea and gadget drivers.
Nothing major. All have been in linux-next for a while with no
reported issues"
* tag 'usb-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: gadget: udc: renesas_usb3: Fix PN_INT_ENA disabling timing
usb: gadget: udc: renesas_usb3: lock for PN_ registers access
usb: gadget: udc: renesas_usb3: fix deadlock by spinlock
usb: gadget: udc: renesas_usb3: fix pm_runtime functions calling
usb: gadget: f_mass_storage: Serialize wake and sleep execution
usb: dwc2: add support for the DWC2 controller on Meson8 SoCs
phy: qualcomm: phy-qcom-qmp: fix application of sizeof to pointer
usb: musb: dsps: keep VBUS on for host-only mode
usb: chipidea: core: check before accessing ci_role in ci_role_show
usb: chipidea: debug: check before accessing ci_role
phy: qcom-qmp: fix return value check in qcom_qmp_phy_create()
usb: chipidea: udc: fix NULL pointer dereference if udc_start failed
usb: chipidea: imx: Do not access CLKONOFF on i.MX51
Pull SCSI fixes from James Bottomley:
"This is a set of user visible fixes (excepting one format string
change).
Four of the qla2xxx fixes only affect the firmware dump path, but it's
still important to the enterprise. The rest are various NULL pointer
crash conditions or outright driver hangs"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: cxgb4i: libcxgbi: in error case RST tcp conn
scsi: scsi_debug: Avoid PI being disabled when TPGS is enabled
scsi: qla2xxx: Fix extraneous ref on sp's after adapter break
scsi: lpfc: prevent potential null pointer dereference
scsi: lpfc: Avoid NULL pointer dereference in lpfc_els_abort()
scsi: lpfc: nvmet_fc: fix format string
scsi: qla2xxx: Fix crash due to NULL pointer dereference of ctx
scsi: qla2xxx: Fix mailbox pointer error in fwdump capture
scsi: qla2xxx: Set bit 15 for DIAG_ECHO_TEST MBC
scsi: qla2xxx: Modify T262 FW dump template to specify same start/end to debug customer issues
scsi: qla2xxx: Fix crash due to mismatch mumber of Q-pair creation for Multi queue
scsi: qla2xxx: Fix NULL pointer access due to redundant fc_host_port_name call
scsi: qla2xxx: Fix recursive loop during target mode configuration for ISP25XX leaving system unresponsive
scsi: bnx2fc: fix race condition in bnx2fc_get_host_stats()
scsi: qla2xxx: don't disable a not previously enabled PCI device
Pull libnvdimm fix from Dan Williams:
"We expanded the device-dax fs type in 4.12 to be a generic provider of
a struct dax_device with an embedded inode. However, Sasha found some
basic negative testing was not run to verify that this fs cleanly
handles being mounted directly.
Note that the fresh rebase was done to remove an unnecessary Cc:
<stable> tag, but this commit otherwise had a build success
notification from the 0day robot."
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
device-dax: fix 'dax' device filesystem inode destruction crash
Pull hexagon fix from Guenter Roeck:
"This fixes a build error seen when building hexagon images.
Richard sent me an Ack, but didn't reply when asked if he wants me to
send the patch to you directly, so I figured I'd just do it"
* tag 'hexagon-for-linus-v4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hexagon: Use raw_copy_to_user
meson_sar_adc_clear_fifo passes a 0 as value-pointer to regmap_read().
In case of the meson-saradc driver this ends up in regmap_mmio_read(),
where the value-pointer is de-referenced unconditionally to assign the
value which was read.
Fix this by passing an actual pointer, even though all we want to do is
to discard the value.
As a side-effect this fixes a sparse warning ("Using plain integer as
NULL pointer") as reported by Paolo Cretaro.
Fixes: 3adbf34273 ("iio: adc: add a driver for the SAR ADC found in Amlogic Meson SoCs")
Reported-by: Paolo Cretaro <paolocretaro@gmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
When the page size isn't bigger than 4K, there is no added value of enabling 4K
UAR feature in the Firmware.
Modified the condition of enabling the 4K UAR accordingly.
Fixes: f502d83495 ("net/mlx5: Activate support for 4K UARs")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
changing the channel interrupt moderation values in order to reduce CPU
overhead for all traffic types.
Each iteration of the algorithm, DIM calculates the difference in
throughput, packet rate and interrupt rate from last iteration in order
to make a decision. DIM relies on counters for each metric. When these
counters get to their type's max value they wraparound. In this case
the delta between 'end' and 'start' samples is negative and when
translated to unsigned integers - very high. This results in a false
indication to the algorithm and might result in a wrong decision.
The fix calculates the 'distance' between 'end' and 'start' samples in a
cyclic way around the relevant type's max value. It can also be viewed as
an absolute value around the type's max value instead of around 0.
Testing show higher stability in DIM profile selection and no wraparound
issues.
Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
changing the channel interrupt moderation values in order to reduce CPU
overhead for all traffic types.
Until now only interrupt and packet rate were sampled.
We found a scenario on which we get a false indication since a change in
DIM caused more aggregation and reduced packet rate while increasing BW.
We now regard a change as succesfull iff:
current_BW > (prev_BW + threshold) or
current_BW ~= prev_BW and current_PR > (prev_PR + threshold) or
current_BW ~= prev_BW and current_PR ~= prev_PR and
current_IR < (prev_IR - threshold)
Where BW = Bandwidth, PR = Packet rate and IR = Interrupt rate
Improvements (ConnectX-4Lx 25GbE, single RX queue, LRO off)
--------------------------------------------------
packet size | before[Mb/s] | after[Mb/s] | gain |
2B | 343.4 | 359.4 | 4.5% |
16B | 2739.7 | 2814.8 | 2.7% |
64B | 9739 | 10185.3 | 4.5% |
Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Remove the following module event counters out of ethtool stats. The
reason for removing these event counters is that these events do not
occur without techinician's intervention.
module_pwr_budget_exd
module_long_range
module_no_eeprom
module_enforce_part
module_unknown_id
module_unknown_status
module_plug
Fixes: bedb7c909c ("net/mlx5e: Add port module event counters to ethtool stats")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The issue is that when we get an assert we will stop polling the health
and thus we cant enter error state when we have a real health issue.
Fixes: fd76ee4da5 ('net/mlx5_core: Fix internal error detection conditions')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
INFO: task gnome-terminal-:1734 blocked for more than 120 seconds.
Not tainted 4.12.0-rc4+ #8
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
gnome-terminal- D 0 1734 1015 0x00000000
Call Trace:
__schedule+0x3cd/0xb30
schedule+0x40/0x90
kvm_async_pf_task_wait+0x1cc/0x270
? __vfs_read+0x37/0x150
? prepare_to_swait+0x22/0x70
do_async_page_fault+0x77/0xb0
? do_async_page_fault+0x77/0xb0
async_page_fault+0x28/0x30
This is triggered by running both win7 and win2016 on L1 KVM simultaneously,
and then gives stress to memory on L1, I can observed this hang on L1 when
at least ~70% swap area is occupied on L0.
This is due to async pf was injected to L2 which should be injected to L1,
L2 guest starts receiving pagefault w/ bogus %cr2(apf token from the host
actually), and L1 guest starts accumulating tasks stuck in D state in
kvm_async_pf_task_wait() since missing PAGE_READY async_pfs.
This patch fixes the hang by doing async pf when executing L1 guest.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit ac4691fac8 ("hexagon: switch to RAW_COPY_USER") replaced
__copy_to_user_hexagon() with raw_copy_to_user(), but did not catch
all callers, resulting in the following build error.
arch/hexagon/mm/uaccess.c: In function '__clear_user_hexagon':
arch/hexagon/mm/uaccess.c:40:3: error:
implicit declaration of function '__copy_to_user_hexagon'
Fixes: ac4691fac8 ("hexagon: switch to RAW_COPY_USER")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Thomas Petazzoni says:
====================
net: mvpp2: driver fixes
As requested, here is a series of patches containing only bug fixes
for the mvpp2 driver. It is based on the latest "net" branch.
Changes since v1:
- Fixed a build breakage that occurred when only PATCH 1 was only,
and not later patches in the series. Was reported by the kbuild
report on the first submission.
- Added Tested-by from Marc Zyngier on PATCH 2.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
smp_processor_id() should not be used in migration-enabled contexts. We
originally thought it was OK in the specific situation of this driver,
but it was wrong, and calling smp_processor_id() in a migration-enabled
context prints a big fat warning when CONFIG_DEBUG_PREEMPT=y.
Therefore, this commit replaces the smp_processor_id() in
migration-enabled contexts by the appropriate get_cpu/put_cpu sections.
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Fixes: a786841df7 ("net: mvpp2: handle register mapping and access for PPv2.2")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit removes the useless remove
mvpp2_bm_cookie_{build,pool_get} functions. All what
mvpp2_bm_cookie_build() was doing is compute a 32-bit value by
concatenating the pool number and the CPU number... only to get the pool
number re-extracted by mvpp2_bm_cookie_pool_get() later on.
Instead, just get the pool number directly from RX descriptor status,
and pass it to mvpp2_pool_refill() and mvpp2_rx_refill().
This has the added benefit of dropping a smp_processor_id() call in a
migration-enabled context, which is wrong, and is the original
motivation for making this change.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the
function call path is:
tipc_l2_rcv_msg (acquire the lock by rcu_read_lock)
tipc_rcv
tipc_sk_rcv
tipc_msg_reverse
pskb_expand_head(GFP_KERNEL) --> may sleep
tipc_node_broadcast
tipc_node_xmit_skb
tipc_node_xmit
tipc_sk_rcv
tipc_msg_reverse
pskb_expand_head(GFP_KERNEL) --> may sleep
To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the
function call path is:
cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock)
cfctrl_linkdown_req
cfpkt_create
cfpkt_create_pfx
alloc_skb(GFP_KERNEL) --> may sleep
cfserl_receive (acquire the lock by rcu_read_lock)
cfpkt_split
cfpkt_create_pfx
alloc_skb(GFP_KERNEL) --> may sleep
There is "in_interrupt" in cfpkt_create_pfx to decide use "GFP_KERNEL" or
"GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function
is called under a rcu read lock, instead in interrupt.
To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now sctp holds read_lock when foreach sctp_ep_hashtable without disabling
BH. If CPU schedules to another thread A at this moment, the thread A may
be trying to hold the write_lock with disabling BH.
As BH is disabled and CPU cannot schedule back to the thread holding the
read_lock, while the thread A keeps waiting for the read_lock. A dead
lock would be triggered by this.
This patch is to fix this dead lock by calling read_lock_bh instead to
disable BH when holding the read_lock in sctp_for_each_endpoint.
Fixes: 626d16f50f ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 2b30842b23 ("net: fec: Clear and enable MIB counters on imx51")
introduced fec_enet_clear_ethtool_stats(), but missed to add a stub
for the CONFIG_M5272=y case, causing build failure for the
m5272c3_defconfig.
Add the missing empty stub to fix the build failure.
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a counter problem on 32bit systems:
When the rx_bytes counter reached 2 GiB, it jumpd to (2^64 Bytes - 2GiB) Bytes.
rtnl_link_stats64 has __u64 type and atomic_long_read returns
atomic_long_t which is signed. Due to the conversation
we get an incorrect value on 32bit systems if the MSB of
the atomic_long_t value is set.
CC: Tom Parkin <tparkin@katalix.com>
Fixes: 7b7c0719cd ("l2tp: avoid deadlock in l2tp stats update")
Signed-off-by: Dominik Heidler <dheidler@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function is not defined, so no need to declare it.
As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz says:
====================
bnx2x: Fix malicious VFs indication
It was discovered that for a VF there's a simple [yet uncommon] scenario
which would cause device firmware to declare that VF as malicious -
Add a vlan interface on top of a VF and disable txvlan offloading for
that VF [causing VF to transmit packets where vlan is on payload].
Patch #1 corrects driver transmission to prevent this issue.
Patch #2 is a by-product correcting PF behavior once a VF is declared
malicious.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Once firmware indicates that a given VF is malicious and until
that VF passes an FLR all bets are off - PF can't know anything
is happening to the VF [since VF can't communicate anything to its PF].
But PF is currently still periodically asking device to collect
statistics for the VF which might in turn fill logs by IOMMU blocking
memory access done by the VF's PCI function [in the case VF has unmapped
its buffers].
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VF clients are configured as enforced, meaning firmware is validating
the correctness of their ethertype/vid during transmission.
Once txvlan is disabled, VF would start getting SKBs for transmission
here vlan is on the payload - but it'll pass the packet's ethertype
instead of the vid, leading to firmware declaring it as malicious.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull UFS fixes from Al Viro:
"This is just the obvious backport fodder; I'm pretty sure that there
will be more - definitely so wrt performance and quite possibly
correctness as well"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs: we need to sync inode before freeing it
excessive checks in ufs_write_failed() and ufs_evict_inode()
ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path
ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments()
ufs: set correct ->s_maxsize
ufs: restore maintaining ->i_blocks
fix ufs_isblockset()
ufs: restore proper tail allocation
Pull btrfs fixes from Chris Mason:
"Some fixes that Dave Sterba collected.
We've been hitting an early enospc problem on production machines that
Omar tracked down to an old int->u64 mistake. I waited a bit on this
pull to make sure it was really the problem from production, but it's
on ~2100 hosts now and I think we're good.
Omar also noticed a commit in the queue would make new early ENOSPC
problems. I pulled that out for now, which is why the top three
commits are younger than the rest.
Otherwise these are all fixes, some explaining very old bugs that
we've been poking at for a while"
* 'for-linus-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix delalloc accounting leak caused by u32 overflow
Btrfs: clear EXTENT_DEFRAG bits in finish_ordered_io
btrfs: tree-log.c: Wrong printk information about namelen
btrfs: fix race with relocation recovery and fs_root setup
btrfs: fix memory leak in update_space_info failure path
btrfs: use correct types for page indices in btrfs_page_exists_in_range
btrfs: fix incorrect error return ret being passed to mapping_set_error
btrfs: Make flush bios explicitely sync
btrfs: fiemap: Cache and merge fiemap extent before submit it to user
Pull x86 fixes from Ingo Molnar:
"Misc fixes: a Geode fix plus a microcode loader fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/intel: Clear patch pointer before jettisoning the initrd
x86/cpu/cyrix: Add alternative Device ID of Geode GX1 SoC
Pull CPU hotplug fix from Ingo Molnar:
"An error handling corner case fix"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Drop the device lock on error
Pull RCU fixes from Ingo Molnar:
"Fix an SRCU bug affecting KVM IRQ injection"
* 'rcu-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
srcu: Allow use of Classic SRCU from both process and interrupt context
srcu: Allow use of Tiny/Tree SRCU from both process and interrupt context
Pull perf fixes from Ingo Molnar:
"This is mostly tooling fixes, plus an instruction pointer filtering
fix.
It's more fixes than usual - Arnaldo got back from a longer vacation
and there was a backlog"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
perf symbols: Kill dso__build_id_is_kmod()
perf symbols: Keep DSO->symtab_type after decompress
perf tests: Decompress kernel module before objdump
perf tools: Consolidate error path in __open_dso()
perf tools: Decompress kernel module when reading DSO data
perf annotate: Use dso__decompress_kmodule_path()
perf tools: Introduce dso__decompress_kmodule_{fd,path}
perf tools: Fix a memory leak in __open_dso()
perf annotate: Fix symbolic link of build-id cache
perf/core: Drop kernel samples even though :u is specified
perf script python: Remove dups in documentation examples
perf script python: Updated trace_unhandled() signature
perf script python: Fix wrong code snippets in documentation
perf script: Fix documentation errors
perf script: Fix outdated comment for perf-trace-python
perf probe: Fix examples section of documentation
perf report: Ensure the perf DSO mapping matches what libdw sees
perf report: Include partial stacks unwound with libdw
perf annotate: Add missing powerpc triplet
perf test: Disable breakpoint signal tests for powerpc
...
Pull EFI fix from Ingo Molnar:
"A boot crash fix for certain systems where the kernel would trust a
piece of firmware data it should not have"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: Fix boot panic because of invalid BGRT image address
Pull IOMMU fixes from Joerg Roedel:
- another compile-fix for my header cleanup
- a couple of fixes for the recently merged IOMMU probe deferal code
- fixes for ACPI/IORT code necessary with IOMMU probe deferal
* tag 'iommu-fixes-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
arm: dma-mapping: Reset the device's dma_ops
ACPI/IORT: Move the check to get iommu_ops from translated fwspec
ARM: dma-mapping: Don't tear down third-party mappings
ACPI/IORT: Ignore all errors except EPROBE_DEFER
iommu/of: Ignore all errors except EPROBE_DEFER
iommu/of: Fix check for returning EPROBE_DEFER
iommu/dma: Fix function declaration
Pull input fixes from Dmitry Torokhov:
- mark "guest" RMI device as pass-through port to avoid "phantom" ALPS
toouchpad on newer Lenovo Carbons
- add two more laptops to the Elantech's lists of devices using CRC
mode
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - register F03 port as pass-through serio
Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled
Pull MD bugfix from Shaohua Li:
"One bug fix from Neil Brown for MD. The bug was introduced in this
cycle"
* tag 'md/4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md: initialise ->writes_pending in personality modules.
Pull block fixes from Jens Axboe:
"A set of fixes in the area of block IO, that should go into the next
-rc release. This contains:
- An OOPS fix from Dmitry, fixing a regression with the bio integrity
code in this series.
- Fix truncation of elevator io context cache name, from Eric
Biggers.
- NVMe pull from Christoph includes FC fixes from James, APST
fixes/tweaks from Kai-Heng, removal fix from Rakesh, and an RDMA
fix from Sagi.
- Two tweaks for the block throttling code. One from Joseph Qi,
fixing an oops from the timer code, and one from Shaohua, improving
the behavior on rotatonal storage.
- Two blk-mq fixes from Ming, fixing corner cases with the direct
issue code.
- Locking fix for bfq cgroups from Paolo"
* 'for-linus' of git://git.kernel.dk/linux-block:
block, bfq: access and cache blkg data only when safe
Fix loop device flush before configure v3
blk-throttle: set default latency baseline for harddisk
blk-throttle: fix NULL pointer dereference in throtl_schedule_pending_timer
nvme: relax APST default max latency to 100ms
nvme: only consider exit latency when choosing useful non-op power states
nvme-fc: fix missing put reference on controller create failure
nvme-fc: on lldd/transport io error, terminate association
nvme-rdma: fast fail incoming requests while we reconnect
nvme-pci: fix multiple ctrl removal scheduling
nvme: fix hang in remove path
elevator: fix truncation of icq_cache_name
blk-mq: fix direct issue
blk-mq: pass correct hctx to blk_mq_try_issue_directly
bio-integrity: Do not allocate integrity context for bio w/o data
Pull sound fixes from Takashi Iwai:
"This update contains a slightly hight amount of changes due to the
pending ASoC fixes:
- ALSA timer core got a couple of fixes for races between read and
ioctl, leading to potential read of uninitialized kmalloced memory
- ASoC core fixed the de-registration pattern for use-after-free bug
- The rewrite of probe code in ASoC Intel Skylake for i915 component
- ASoC R-snd got a series of fixes for SSI
- ASoC simple-card, atmel, da7213, and rt286 trivial fixes
- HD-audio ALC269 quirk and rearrangement of quirk table"
* tag 'sound-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT
ALSA: timer: Fix race between read and ioctl
ALSA: hda/realtek - Reorder ALC269 ASUS quirk entries
ALSA: hda/realtek: Fix mic and headset jack sense on Asus X705UD
ASoC: rsnd: fixup parent_clk_name of AUDIO_CLKOUTx
ASoC: Intel: Skylake: Fix to parse consecutive string tkns in manifest
ASoC: Intel: Skylake: Fix IPC rx_list corruption
ASoC: rsnd: SSI PIO adjust to 24bit mode
MAINTAINERS: Update email address for patches to Wolfson parts
ASoC: Fix use-after-free at card unregistration
ASoC: simple-card: fix mic jack initialization
ASoC: rsnd: don't call free_irq() on Parent SSI
ASoC: atmel-classd: sync regcache when resuming
ASoC: rsnd: don't use PDTA bit for 24bit on SSI
ASoC: da7213: Fix incorrect usage of bitwise '&' operator for SRM check
rt286: add Thinkpad Helix 2 to force_combo_jack_table
ASoC: Intel: Skylake: Move i915 registration to worker thread
Pull drm fixes from Dave Airlie:
"Intel, nouveau, rockchip, vmwgfx, imx, meson, mediatek and core fixes.
Bit more spread out fixes this time, fixes for 7 drivers + a couple of
core fixes.
i915 and vmwgfx are the main ones. The vmwgfx ones fix a bunch of
regressions in their atomic rework, and a few fixes destined for
stable. i915 has some 4.12 regressions and older things that need to
be fixed in stable as well.
nouveau also has some runtime pm fixes and a timer list handling fix,
otherwise a couple of core and small driver regression fixes"
* tag 'drm-fixes-for-v4.12-rc5' of git://people.freedesktop.org/~airlied/linux: (37 commits)
drm/i915: fix warning for unused variable
drm/meson: Fix driver bind when only CVBS is available
drm/i915: Fix 90/270 rotated coordinates for FBC
drm/i915: Restore has_fbc=1 for ILK-M
drm/i915: Workaround VLV/CHV DSI scanline counter hardware fail
drm/i915: Fix logical inversion for gen4 quirking
drm/i915: Guard against i915_ggtt_disable_guc() being invoked unconditionally
drm/i915: Always recompute watermarks when distrust_bios_wm is set, v2.
drm/i915: Prevent the system suspend complete optimization
drm/i915/psr: disable psr2 for resolution greater than 32X20
drm/i915: Hold a wakeref for probing the ring registers
drm/i915: Short-circuit i915_gem_wait_for_idle() if already idle
drm/i915: Disable decoupled MMIO
drm/i915/guc: Remove stale comment for q_fail
drm/vmwgfx: Bump driver minor and date
drm/vmwgfx: Remove unused legacy cursor functions
drm/vmwgfx: fix spelling mistake "exeeds" -> "exceeds"
drm/vmwgfx: Fix large topology crash
drm/vmwgfx: Make sure to update STDU when FB is updated
drm/vmwgfx: Make sure backup_handle is always valid
...
As it is, short copy in write() to append-only file will fail
to truncate the excessive allocated blocks. As the matter of
fact, all checks in ufs_truncate_blocks() are either redundant
or wrong for that caller. As for the only other caller
(ufs_evict_inode()), we only need the file type checks there.
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
btrfs_calc_trans_metadata_size() does an unsigned 32-bit multiplication,
which can overflow if num_items >= 4 GB / (nodesize * BTRFS_MAX_LEVEL * 2).
For a nodesize of 16kB, this overflow happens at 16k items. Usually,
num_items is a small constant passed to btrfs_start_transaction(), but
we also use btrfs_calc_trans_metadata_size() for metadata reservations
for extent items in btrfs_delalloc_{reserve,release}_metadata().
In drop_outstanding_extents(), num_items is calculated as
inode->reserved_extents - inode->outstanding_extents. The difference
between these two counters is usually small, but if many delalloc
extents are reserved and then the outstanding extents are merged in
btrfs_merge_extent_hook(), the difference can become large enough to
overflow in btrfs_calc_trans_metadata_size().
The overflow manifests itself as a leak of a multiple of 4 GB in
delalloc_block_rsv and the metadata bytes_may_use counter. This in turn
can cause early ENOSPC errors. Additionally, these WARN_ONs in
extent-tree.c will be hit when unmounting:
WARN_ON(fs_info->delalloc_block_rsv.size > 0);
WARN_ON(fs_info->delalloc_block_rsv.reserved > 0);
WARN_ON(space_info->bytes_pinned > 0 ||
space_info->bytes_reserved > 0 ||
space_info->bytes_may_use > 0);
Fix it by casting nodesize to a u64 so that
btrfs_calc_trans_metadata_size() does a full 64-bit multiplication.
While we're here, do the same in btrfs_calc_trunc_metadata_size(); this
can't overflow with any existing uses, but it's better to be safe here
than have another hard-to-debug problem later on.
Cc: stable@vger.kernel.org
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Before this, we use 'filled' mode here, ie. if all range has been
filled with EXTENT_DEFRAG bits, get to clear it, but if the defrag
range joins the adjacent delalloc range, then we'll have EXTENT_DEFRAG
bits in extent_state until releasing this inode's pages, and that
prevents extent_data from being freed.
This clears the bit if any was found within the ordered extent.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
In verify_dir_item, it wants to printk name_len of dir_item but
printk data_len acutally.
Fix it by calling btrfs_dir_name_len instead of btrfs_dir_data_len.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Marc Kleine-Budde says:
====================
pull-request: can 2017-06-09
this is a pull request of 6 patches for net/master.
There's a patch by Stephane Grosjean that fixes an uninitialized symbol warning
in the peak_canfd driver. A patch by Johan Hovold to fix the product-id
endianness in an error message in the the peak_usb driver. A patch by Oliver
Hartkopp to enable CAN FD for virtual CAN devices by default. Three patches by
me, one makes the helper function can_change_state() robust to be called with
cf == NULL. The next patch fixes a memory leak in the gs_usb driver. And the
last one fixes a lockdep splat by properly initialize the per-net
can_rcvlists_lock spin_lock.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The change to remove free_netdev() from ieee80211_if_free()
erroneously didn't add the necessary free_netdev() for when
ieee80211_if_free() is called directly in one place, rather
than as the priv_destructor. Add the missing call.
Fixes: cf124db566 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPI's from the victim cpu are not handled in dev_cpu_callback.
So these pending IPI's would be sent to the remote cpu only when
NET_RX is scheduled on the victim cpu and since this trigger is
unpredictable it would result in packet latencies on the remote cpu.
This patch add support to send the pending ipi's of victim cpu.
Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull xen fix from Juergen Gross:
"A fix for Xen on ARM when dealing with 64kB page size of a guest"
* tag 'for-linus-4.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/privcmd: Support correctly 64KB page granularity when mapping memory
The 5th generation Thinkpad X1 Carbons use Synaptics touchpads accessible
over SMBus/RMI, combined with ALPS or Elantech trackpoint devices instead
of classic IBM/Lenovo trackpoints. Unfortunately there is no way for ALPS
driver to detect whether it is dealing with touchpad + trackpoint
combination or just a trackpoint, so we end up with a "phantom" dualpoint
ALPS device in addition to real touchpad and trackpoint.
Given that we do not have any special advanced handling for ALPS or
Elantech trackpoints (unlike IBM trackpoints that have separate driver and
a host of options) we are better off keeping the trackpoints in PS/2
emulation mode. We achieve that by setting serio type to SERIO_PS_PSTHRU,
which will limit number of protocols psmouse driver will try. In addition
to getting rid of the "phantom" touchpads, this will also speed up probing
of F03 pass-through port.
Reported-by: Damjan Georgievski <gdamjan@gmail.com>
Suggested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Pull powerpc fixes from Michael Ellerman:
"Mostly fairly minor, of note are:
- Fix percpu allocations to be NUMA aware
- Limit 4k page size config to 64TB virtual address space
- Avoid needlessly restoring FP and vector registers
Thanks to Aneesh Kumar K.V, Breno Leitao, Christophe Leroy, Frederic
Barrat, Madhavan Srinivasan, Michael Bringmann, Nicholas Piggin,
Vaibhav Jain"
* tag 'powerpc-4.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/book3s64: Move PPC_DT_CPU_FTRs and enable it by default
powerpc/mm/4k: Limit 4k page size config to 64TB virtual address space
cxl: Fix error path on bad ioctl
powerpc/perf: Fix Power9 test_adder fields
powerpc/numa: Fix percpu allocations to be NUMA aware
cxl: Avoid double free_irq() for psl,slice interrupts
powerpc/kernel: Initialize load_tm on task creation
powerpc/kernel: Fix FP and vector register restoration
powerpc/64: Reclaim CPU_FTR_SUBCORE
powerpc/hotplug-mem: Fix missing endian conversion of aa_index
powerpc/sysdev/simple_gpio: Fix oops in gpio save_regs function
powerpc/spufs: Fix coredump of SPU contexts
powerpc/64s: Add dt_cpu_ftrs boot time setup option
Pull ARM SoC fixes from Olof Johansson:
"Been sitting on these for a couple of weeks waiting on some larger
batches to come in but it's been pretty quiet.
Just your garden variety fixes here:
- A few maintainers updates (ep93xx, Exynos, TI, Marvell)
- Some PM fixes for Atmel/at91 and Marvell
- A few DT fixes for Marvell, Versatile, TI Keystone, bcm283x
- A reset driver patch to set module license for symbol access"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
MAINTAINERS: EP93XX: Update maintainership
MAINTAINERS: remove kernel@stlinux.com obsolete mailing list
ARM: dts: versatile: use #include "..." to include local DT
MAINTAINERS: add device-tree files to TI DaVinci entry
ARM: at91: select CONFIG_ARM_CPU_SUSPEND
ARM: dts: keystone-k2l: fix broken Ethernet due to disabled OSR
arm64: defconfig: enable some core options for 64bit Rockchip socs
arm64: marvell: dts: fix interrupts in 7k/8k crypto nodes
reset: hi6220: Set module license so that it can be loaded
MAINTAINERS: add irqchip related drivers to Marvell EBU maintainers
MAINTAINERS: sort F entries for Marvell EBU maintainers
ARM: davinci: PM: Do not free useful resources in normal path in 'davinci_pm_init'
ARM: davinci: PM: Free resources in error handling path in 'davinci_pm_init'
ARM: dts: bcm283x: Reserve first page for firmware
memory: atmel-ebi: mark PM ops as __maybe_unused
MAINTAINERS: Remove Javier Martinez Canillas as reviewer for Exynos
1.) Bugfix of function stmmac_get_tx_hwtstamp.
Corrected the tx timestamp available check (same as 4.8 and older)
Change printout from info syslevel to debug.
2.) Bugfix of function stmmac_get_rx_hwtstamp.
Corrected the rx timestamp available check (same as 4.8 and older)
Change printout from info syslevel to debug.
Fixes: ba1ffd74df ("stmmac: fix PTP support for GMAC4")
Signed-off-by: Mario Molitor <mario_molitor@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
According the CYCLON V documention only the bit 16 of snaptypesel should
set.
(more information see Table 17-20 (cv_5v4.pdf) :
Timestamp Snapshot Dependency on Register Bits)
Fixes: d2042052a0 ("stmmac: update the PTP header file")
Signed-off-by: Mario Molitor <mario_molitor@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like this:
Message from syslogd@flamingo at Apr 26 00:45:00 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4
They seem to coincide with net namespace teardown.
The message is emitted by netdev_wait_allrefs().
Forced a kdump in netdev_run_todo, but found that the refcount on the lo
device was already 0 at the time we got to the panic.
Used bcc to check the blocking in netdev_run_todo. The only places
where we're off cpu there are in the rcu_barrier() and msleep() calls.
That behavior is expected. The msleep time coincides with the amount of
time we spend waiting for the refcount to reach zero; the rcu_barrier()
wait times are not excessive.
After looking through the list of callbacks that the netdevice notifiers
invoke in this path, it appears that the dst_dev_event is the most
interesting. The dst_ifdown path places a hold on the loopback_dev as
part of releasing the dev associated with the original dst cache entry.
Most of our notifier callbacks are straight-forward, but this one a)
looks complex, and b) places a hold on the network interface in
question.
I constructed a new bcc script that watches various events in the
liftime of a dst cache entry. Note that dst_ifdown will take a hold on
the loopback device until the invalidated dst entry gets freed.
[ __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183
__dst_free
rcu_nocb_kthread
kthread
ret_from_fork
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The inode destruction path for the 'dax' device filesystem incorrectly
assumes that the inode was initialized through 'alloc_dax()'. However,
if someone attempts to directly mount the dax filesystem with 'mount -t
dax dax mnt' that will bypass 'alloc_dax()' and the following failure
signatures may occur as a result:
kill_dax() must be called before final iput()
WARNING: CPU: 2 PID: 1188 at drivers/dax/super.c:243 dax_destroy_inode+0x48/0x50
RIP: 0010:dax_destroy_inode+0x48/0x50
Call Trace:
destroy_inode+0x3b/0x60
evict+0x139/0x1c0
iput+0x1f9/0x2d0
dentry_unlink_inode+0xc3/0x160
__dentry_kill+0xcf/0x180
? dput+0x37/0x3b0
dput+0x3a3/0x3b0
do_one_tree+0x36/0x40
shrink_dcache_for_umount+0x2d/0x90
generic_shutdown_super+0x1f/0x120
kill_anon_super+0x12/0x20
deactivate_locked_super+0x43/0x70
deactivate_super+0x4e/0x60
general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
RIP: 0010:kfree+0x6d/0x290
Call Trace:
<IRQ>
dax_i_callback+0x22/0x60
? dax_destroy_inode+0x50/0x50
rcu_process_callbacks+0x298/0x740
ida_remove called for id=0 which is not allocated.
WARNING: CPU: 0 PID: 0 at lib/idr.c:383 ida_remove+0x110/0x120
[..]
Call Trace:
<IRQ>
ida_simple_remove+0x2b/0x50
? dax_destroy_inode+0x50/0x50
dax_i_callback+0x3c/0x60
rcu_process_callbacks+0x298/0x740
Add missing initialization of the 'struct dax_device' and inode so that
the destruction path does not kfree() or ida_simple_remove()
uninitialized data.
Fixes: 7b6be8444e ("dax: refactor dax-fs into a generic provider of 'struct dax_device' instances")
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maniaxx reported a kernel boot crash in the EFI code, which I emulated
by using same invalid phys addr in code:
BUG: unable to handle kernel paging request at ffffffffff280001
IP: efi_bgrt_init+0xfb/0x153
...
Call Trace:
? bgrt_init+0xbc/0xbc
acpi_parse_bgrt+0xe/0x12
acpi_table_parse+0x89/0xb8
acpi_boot_init+0x445/0x4e2
? acpi_parse_x2apic+0x79/0x79
? dmi_ignore_irq0_timer_override+0x33/0x33
setup_arch+0xb63/0xc82
? early_idt_handler_array+0x120/0x120
start_kernel+0xb7/0x443
? early_idt_handler_array+0x120/0x120
x86_64_start_reservations+0x29/0x2b
x86_64_start_kernel+0x154/0x177
secondary_startup_64+0x9f/0x9f
There is also a similar bug filed in bugzilla.kernel.org:
https://bugzilla.kernel.org/show_bug.cgi?id=195633
The crash is caused by this commit:
7b0a911478 efi/x86: Move the EFI BGRT init code to early init code
The root cause is the firmware on those machines provides invalid BGRT
image addresses.
In a kernel before above commit BGRT initializes late and uses ioremap()
to map the image address. Ioremap validates the address, if it is not a
valid physical address ioremap() just fails and returns. However in current
kernel EFI BGRT initializes early and uses early_memremap() which does not
validate the image address, and kernel panic happens.
According to ACPI spec the BGRT image address should fall into
EFI_BOOT_SERVICES_DATA, see the section 5.2.22.4 of below document:
http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
Fix this issue by validating the image address in efi_bgrt_init(). If the
image address does not fall into any EFI_BOOT_SERVICES_DATA areas we just
bail out with a warning message.
Reported-by: Maniaxx <tripleshiftone@gmail.com>
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 7b0a911478 ("efi/x86: Move the EFI BGRT init code to early init code")
Link: http://lkml.kernel.org/r/20170609084558.26766-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
CAN FD capable CAN interfaces can handle (classic) CAN 2.0 frames too.
New users usually fail at their first attempt to explore CAN FD on
virtual CAN interfaces due to the current CAN_MTU default.
Set the MTU to CANFD_MTU by default to reduce this confusion.
If someone *really* needs a 'classic CAN'-only device this can be set
with the 'ip' tool with e.g. 'ip link set vcan0 mtu 16' as before.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch adds the missing kfree() in gs_cmd_reset() to free the
memory that is not used anymore after usb_control_msg().
Cc: linux-stable <stable@vger.kernel.org>
Cc: Maximilian Schneider <max@schneidersoft.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Make sure to use the USB device product-id stored in host-byte order in
a probe error message.
Also remove a redundant reassignment of the local usb_dev variable which
had already been used to retrieve the product id.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch fixes two uninitialized symbol warnings in the new code adding
support of the PEAK-System PCAN-PCI Express FD boards, in the socket-CAN
network protocol family.
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
In OOM situations where no skb can be allocated, can_change_state() may
be called with cf == NULL. As this function updates the state and error
statistics it's not an option to skip the call to can_change_state() in
OOM situations.
This patch makes can_change_state() robust, so that it can be called
with cf == NULL.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
During an eeh call to cxl_remove can result in double free_irq of
psl,slice interrupts. This can happen if perst_reloads_same_image == 1
and call to cxl_configure_adapter() fails during slot_reset
callback. In such a case we see a kernel oops with following back-trace:
Oops: Kernel access of bad area, sig: 11 [#1]
Call Trace:
free_irq+0x88/0xd0 (unreliable)
cxl_unmap_irq+0x20/0x40 [cxl]
cxl_native_release_psl_irq+0x78/0xd8 [cxl]
pci_deconfigure_afu+0xac/0x110 [cxl]
cxl_remove+0x104/0x210 [cxl]
pci_device_remove+0x6c/0x110
device_release_driver_internal+0x204/0x2e0
pci_stop_bus_device+0xa0/0xd0
pci_stop_and_remove_bus_device+0x28/0x40
pci_hp_remove_devices+0xb0/0x150
pci_hp_remove_devices+0x68/0x150
eeh_handle_normal_event+0x140/0x580
eeh_handle_event+0x174/0x360
eeh_event_handler+0x1e8/0x1f0
This patch fixes the issue of double free_irq by checking that
variables that hold the virqs (err_hwirq, serr_hwirq, psl_virq) are
not '0' before un-mapping and resetting these variables to '0' when
they are un-mapped.
Cc: stable@vger.kernel.org
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If more than one gpio bank has the "pwm" property, only one will be
registered successfully, all the others will fail with:
mvebu-gpio: probe of f1018140.gpio failed with error -17
That's because in alloc_pwms(), the chip->base (aka "int pwm"), was not
set (thus, ==0) ; and 0 is a meaningful start value in alloc_pwm().
What was intended is mvpwm->chip->base = -1.
Like that, the numbering will be done auto-magically
Moreover, as the region might be already occupied by another pwm, we
shouldn't force:
mvpwm->chip->base = 0
nor
mvpwm->chip->base = id * MVEBU_MAX_GPIO_PER_BANK;
Tested on clearfog-pro (Marvell 88F6828)
Fixes: 757642f9a5 ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The blink counter A was always selected because 0 was forced in the
blink select counter register.
The variable 'set' was obviously there to be used as the register value,
selecting the B counter when id==1 and A counter when id==0.
Tested on clearfog-pro (Marvell 88F6828)
Fixes: 757642f9a5 ("gpio: mvebu: Add limited PWM support")
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull RCU fix from Paul E. McKenney:
" This series enables srcu_read_lock() and srcu_read_unlock() to be used from
interrupt handlers, which fixes a bug in KVM's use of SRCU in delivery
of interrupts to guest OSes. "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If a key's refcount is dropped to zero between key_lookup() peeking at
the refcount and subsequently attempting to increment it, refcount_inc()
will see a zero refcount. Here, refcount_inc() will WARN_ONCE(), and
will *not* increment the refcount, which will remain zero.
Once key_lookup() drops key_serial_lock, it is possible for the key to
be freed behind our back.
This patch uses refcount_inc_not_zero() to perform the peek and increment
atomically.
Fixes: fff292914d ("security, keys: convert key.usage from atomic_t to refcount_t")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
The initial Diffie-Hellman computation made direct use of the MPI
library because the crypto module did not support DH at the time. Now
that KPP is implemented, KEYCTL_DH_COMPUTE should use it to get rid of
duplicate code and leverage possible hardware acceleration.
This fixes an issue whereby the input to the KDF computation would
include additional uninitialized memory when the result of the
Diffie-Hellman computation was shorter than the input prime number.
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
If userspace called KEYCTL_DH_COMPUTE with kdf_params containing NULL
otherinfo but nonzero otherinfolen, the kernel would allocate a buffer
for the otherinfo, then feed it into the KDF without initializing it.
Fix this by always doing the copy from userspace (which will fail with
EFAULT in this scenario).
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Requesting "digest_null" in the keyctl_kdf_params caused an infinite
loop in kdf_ctr() because the "null" hash has a digest size of 0. Fix
it by rejecting hash algorithms with a digest size of 0.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
While a 'struct key' itself normally does not contain sensitive
information, Documentation/security/keys.txt actually encourages this:
"Having a payload is not required; and the payload can, in fact,
just be a value stored in the struct key itself."
In case someone has taken this advice, or will take this advice in the
future, zero the key structure before freeing it. We might as well, and
as a bonus this could make it a bit more difficult for an adversary to
determine which keys have recently been in use.
This is safe because the key_jar cache does not use a constructor.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
As the previous patch did for encrypted-keys, zero sensitive any
potentially sensitive data related to the "trusted" key type before it
is freed. Notably, we were not zeroing the tpm_buf structures in which
the actual key is stored for TPM seal and unseal, nor were we zeroing
the trusted_key_payload in certain error paths.
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Safford <safford@us.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
For keys of type "encrypted", consistently zero sensitive key material
before freeing it. This was already being done for the decrypted
payloads of encrypted keys, but not for the master key and the keys
derived from the master key.
Out of an abundance of caution and because it is trivial to do so, also
zero buffers containing the key payload in encrypted form, although
depending on how the encrypted-keys feature is used such information
does not necessarily need to be kept secret.
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Safford <safford@us.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Zero the payloads of user and logon keys before freeing them. This
prevents sensitive key material from being kept around in the slab
caches after a key is released.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Before returning from add_key() or one of the keyctl() commands that
takes in a key payload, zero the temporary buffer that was allocated to
hold the key payload copied from userspace. This may contain sensitive
key material that should not be kept around in the slab caches.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
sys_add_key() and the KEYCTL_UPDATE operation of sys_keyctl() allowed a
NULL payload with nonzero length to be passed to the key type's
->preparse(), ->instantiate(), and/or ->update() methods. Various key
types including asymmetric, cifs.idmap, cifs.spnego, and pkcs7_test did
not handle this case, allowing an unprivileged user to trivially cause a
NULL pointer dereference (kernel oops) if one of these key types was
present. Fix it by doing the copy_from_user() when 'plen' is nonzero
rather than when '_payload' is non-NULL, causing the syscall to fail
with EFAULT as expected when an invalid buffer is specified.
Cc: stable@vger.kernel.org # 2.6.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
The encrypted-keys module was using a single global HMAC transform,
which could be rekeyed by multiple threads concurrently operating on
different keys, causing incorrect HMAC values to be calculated. Fix
this by allocating a new HMAC transform whenever we need to calculate a
HMAC. Also simplify things a bit by allocating the shash_desc's using
SHASH_DESC_ON_STACK() for both the HMAC and unkeyed hashes.
The following script reproduces the bug:
keyctl new_session
keyctl add user master "abcdefghijklmnop" @s
for i in $(seq 2); do
(
set -e
for j in $(seq 1000); do
keyid=$(keyctl add encrypted desc$i "new user:master 25" @s)
datablob="$(keyctl pipe $keyid)"
keyctl unlink $keyid > /dev/null
keyid=$(keyctl add encrypted desc$i "load $datablob" @s)
keyctl unlink $keyid > /dev/null
done
) &
done
Output with bug:
[ 439.691094] encrypted_key: bad hmac (-22)
add_key: Invalid argument
add_key: Invalid argument
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Since v4.9, the crypto API cannot (normally) be used to encrypt/decrypt
stack buffers because the stack may be virtually mapped. Fix this for
the padding buffers in encrypted-keys by using ZERO_PAGE for the
encryption padding and by allocating a temporary heap buffer for the
decryption padding.
Tested with CONFIG_DEBUG_SG=y:
keyctl new_session
keyctl add user master "abcdefghijklmnop" @s
keyid=$(keyctl add encrypted desc "new user:master 25" @s)
datablob="$(keyctl pipe $keyid)"
keyctl unlink $keyid
keyid=$(keyctl add encrypted desc "load $datablob" @s)
datablob2="$(keyctl pipe $keyid)"
[ "$datablob" = "$datablob2" ] && echo "Success!"
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
In join_session_keyring(), if install_session_keyring_to_cred() were to
fail, we would leak the keyring reference, just like in the bug fixed by
commit 23567fd052 ("KEYS: Fix keyring ref leak in
join_session_keyring()"). Fortunately this cannot happen currently, but
we really should be more careful. Do this by adding and using a new
error label at which the keyring reference is dropped.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
We forgot to set the error code on this path so it could result in
returning NULL which leads to a NULL dereference.
Fixes: db6c43bd21 ("crypto: KEYS: convert public key and digsig asym to the akcipher api")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
With the new standardized functions, we can replace all ACCESS_ONCE()
calls across relevant security/keyrings/.
ACCESS_ONCE() does not work reliably on non-scalar types. For example
gcc 4.6 and 4.7 might remove the volatile tag for such accesses during
the SRA (scalar replacement of aggregates) step:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
Update the new calls regardless of if it is a scalar type, this is
cleaner than having three alternatives.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
CONFIG_KEYS_COMPAT is defined in arch-specific Kconfigs and is missing for
several 64-bit architectures : mips, parisc, tile.
At the moment and for those architectures, calling in 32-bit userspace the
keyctl syscall would return an ENOSYS error.
This patch moves the CONFIG_KEYS_COMPAT option to security/keys/Kconfig, to
make sure the compatibility wrapper is registered by default for any 64-bit
architecture as long as it is configured with CONFIG_COMPAT.
[DH: Modified to remove arm64 compat enablement also as requested by Eric
Biggers]
Signed-off-by: Bilal Amarni <bilal.amarni@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
cc: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
A bunch of fixes for vmwgfx 4.12 regressions and older stuff. In the latter
case either trivial, cc'd stable or requiring backports for stable.
* 'vmwgfx-fixes-4.12' of git://people.freedesktop.org/~thomash/linux:
drm/vmwgfx: Bump driver minor and date
drm/vmwgfx: Remove unused legacy cursor functions
drm/vmwgfx: fix spelling mistake "exeeds" -> "exceeds"
drm/vmwgfx: Fix large topology crash
drm/vmwgfx: Make sure to update STDU when FB is updated
drm/vmwgfx: Make sure backup_handle is always valid
drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve()
drm/vmwgfx: Don't create proxy surface for cursor
drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()
drm/i915 fixes for v4.12-rc5
* tag 'drm-intel-fixes-2017-06-08' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: fix warning for unused variable
drm/i915: Fix 90/270 rotated coordinates for FBC
drm/i915: Restore has_fbc=1 for ILK-M
drm/i915: Workaround VLV/CHV DSI scanline counter hardware fail
drm/i915: Fix logical inversion for gen4 quirking
drm/i915: Guard against i915_ggtt_disable_guc() being invoked unconditionally
drm/i915: Always recompute watermarks when distrust_bios_wm is set, v2.
drm/i915: Prevent the system suspend complete optimization
drm/i915/psr: disable psr2 for resolution greater than 32X20
drm/i915: Hold a wakeref for probing the ring registers
drm/i915: Short-circuit i915_gem_wait_for_idle() if already idle
drm/i915: Disable decoupled MMIO
drm/i915/guc: Remove stale comment for q_fail
drm/i915: Serialize GTT/Aperture accesses on BXT
Driver Changes:
- kirin: Use correct dt port for the bridge (John)
- meson: Fix regression caused by adding HDMI support to allow board
configurations without HDMI (Neil)
Cc: John Stultz <john.stultz@linaro.org>
Cc: Neil Armstrong <narmstrong@baylibre.com>
* tag 'drm-misc-fixes-2017-06-07' of git://anongit.freedesktop.org/git/drm-misc:
drm/meson: Fix driver bind when only CVBS is available
drm: kirin: Fix drm_of_find_panel_or_bridge conversion
imx-drm: PRE clock gating, panelless LDB, and VDIC CSI selection fixes
- Keep the external clock input to the PRE ungated and only use the internal
soft reset to keep the module in low power state, to avoid sporadic startup
failures.
- Ignore -ENODEV return values from drm_of_find_panel_or_bridge in the LDB
driver to fix probing for devices that still do not specify a panel in the
device tree.
- Fix the CSI input selection to the VDIC. According to experiments, the real
behaviour differs a bit from the documentation.
* tag 'imx-drm-fixes-2017-06-08' of git://git.pengutronix.de/git/pza/linux:
gpu: ipu-v3: Fix CSI selection for VDIC
drm/imx: imx-ldb: Accept drm_of_find_panel_or_bridge failure
gpu: ipu-v3: pre: only use internal clock gating
Pull power management fixes from Rafael Wysocki:
"These revert one problematic commit related to system sleep and fix
one recent intel_pstate regression.
Specifics:
- Revert a recent commit that attempted to avoid spurious wakeups
from suspend-to-idle via ACPI SCI, but introduced regressions on
some systems (Rafael Wysocki).
We will get back to the problem it tried to address in the next
cycle.
- Fix a possible division by 0 during intel_pstate initialization
due to a missing check (Rafael Wysocki)"
* tag 'pm-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle"
cpufreq: intel_pstate: Avoid division by 0 in min_perf_pct_min()
Pull module maintainer address change from Jessica Yu:
"A single patch that advertises my email address change"
* tag 'modules-for-v4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
MAINTAINERS: update email address for Jessica Yu
Commit 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
adds the l3mdev FIB rule the first time a VRF device is created. However,
it only creates the rule once and only in the namespace the first device
is created - which may not be init_net. Fix by using the net_generic
capability to make the add_fib_rules flag per network namespace.
Fixes: 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
Reported-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I noticed that test_l4lb was failing in selftests:
# ./test_progs
test_pkt_access:PASS:ipv4 77 nsec
test_pkt_access:PASS:ipv6 44 nsec
test_xdp:PASS:ipv4 2933 nsec
test_xdp:PASS:ipv6 1500 nsec
test_l4lb:PASS:ipv4 377 nsec
test_l4lb:PASS:ipv6 544 nsec
test_l4lb:FAIL:stats 6297600000 200000
test_tcp_estats:PASS: 0 nsec
Summary: 7 PASSED, 1 FAILED
Tracking down the issue actually revealed that endianness selection
in bpf_endian.h is broken when compiled with clang with bpf target.
test_pkt_access.c, test_l4lb.c is compiled with __BYTE_ORDER as
__BIG_ENDIAN, test_xdp.c as __LITTLE_ENDIAN! test_l4lb noticeably
fails, because the test accounts bytes via bpf_ntohs(ip6h->payload_len)
and bpf_ntohs(iph->tot_len), and compares them against a defined
value and given a wrong endianness, the test outcome is different,
of course.
Turns out that there are actually two bugs: i) when we do __BYTE_ORDER
comparison with __LITTLE_ENDIAN/__BIG_ENDIAN, then depending on the
include order we see different outcomes. Reason is that __BYTE_ORDER
is undefined due to missing endian.h include. Before we include the
asm/byteorder.h (e.g. through linux/in.h), then __BYTE_ORDER equals
__LITTLE_ENDIAN since both are undefined, after the include which
correctly pulls in linux/byteorder/little_endian.h, __LITTLE_ENDIAN
is defined, but given __BYTE_ORDER is still undefined, we match on
__BYTE_ORDER equals to __BIG_ENDIAN since __BIG_ENDIAN is also
undefined at that point, sigh. ii) But even that would be wrong,
since when compiling the test cases with clang, one can select between
bpfeb and bpfel targets for cross compilation. Hence, we can also not
rely on what the system's endian.h provides, but we need to look at
the compiler's defined endianness. The compiler defines __BYTE_ORDER__,
and we can match __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__,
which also reflects targets bpf (native), bpfel, bpfeb correctly,
thus really only rely on that. After patch:
# ./test_progs
test_pkt_access:PASS:ipv4 74 nsec
test_pkt_access:PASS:ipv6 42 nsec
test_xdp:PASS:ipv4 2340 nsec
test_xdp:PASS:ipv6 1461 nsec
test_l4lb:PASS:ipv4 400 nsec
test_l4lb:PASS:ipv6 530 nsec
test_tcp_estats:PASS: 0 nsec
Summary: 7 PASSED, 0 FAILED
Fixes: 43bcf707cc ("bpf: fix _htons occurences in test_progs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each time a new speed is added, the bonding 802.3ad isn't updated. Add a
comment to remind the developer to update this driver.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds 14 Gbps enum definition, and fixes
aggregated bandwidth calculation based on above slave links.
Fixes: 0d7e2d2166 ("IB/ipoib: add get_link_ksettings in ethtool")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds [5|50] Gbps enum definition, and fixes
aggregated bandwidth calculation based on above slave links.
Fixes: c9a70d4346 ("net-next: ethtool: Added port speed macros.")
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first netlink attribute (value 0) must always be defined
as none/unspec.
Because we cannot change an existing UAPI, I add a comment to point the
mistake and avoid to propagate it in a new ovs API in the future.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix messages like this:
adv7842.c:(.text+0x2edadd): undefined reference to `cec_unregister_adapter'
when CEC_CORE=m but the driver including media/cec.h is built-in. In that case
the static inlines provided in media/cec.h should be used by that driver.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
While discussing the possible merits of clang warning about unused initialized
functions, I found one function that was clearly meant to be called but
never actually is.
__ila_hash_secret_init() initializes the hash value for the ila locator,
apparently this is intended to prevent hash collision attacks, but this ends
up being a read-only zero constant since there is no caller. I could find
no indication of why it was never called, the earliest patch submission
for the module already was like this. If my interpretation is right, we
certainly want to backport the patch to stable kernels as well.
I considered adding it to the ila_xlat_init callback, but for best effect
the random data is read as late as possible, just before it is first used.
The underlying net_get_random_once() is already highly optimized to avoid
overhead when called frequently.
Fixes: 7f00feaf10 ("ila: Add generic ILA translation facility")
Cc: stable@vger.kernel.org
Link: https://www.spinics.net/lists/kernel/msg2527243.html
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull printk fix from Petr Mladek:
"This reverts a fix added into 4.12-rc1. It caused the kernel log to be
printed on another console when two consoles of the same type were
defined, e.g. console=ttyS0 console=ttyS1.
This configuration was never supported by kernel itself, but it
started to make sense with systemd. In other words, the commit broke
userspace"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
Revert "printk: fix double printing with earlycon"
Pull crypto fixes from Herbert Xu:
"This fixes a couple of places in the crypto code that were doing
interruptible sleeps dangerously. They have been converted to use
non-interruptible sleeps.
This also fixes a bug in asymmetric_keys where it would trigger a
use-after-free if a request returned EBUSY due to a full device queue"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: gcm - wait for crypto op not signal safe
crypto: drbg - wait for crypto op not signal safe
crypto: asymmetric_keys - handle EBUSY due to backlog correctly
drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function ‘rtw_cfg80211_add_monitor_if’:
drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:2670:10: error: ‘struct net_device’ has no member named ‘destructor’
mon_ndev->destructor = rtw_ndev_destructor;
^
Signed-off-by: David S. Miller <davem@davemloft.net>
In blk-cgroup, operations on blkg objects are protected with the
request_queue lock. This is no more the lock that protects
I/O-scheduler operations in blk-mq. In fact, the latter are now
protected with a finer-grained per-scheduler-instance lock. As a
consequence, although blkg lookups are also rcu-protected, blk-mq I/O
schedulers may see inconsistent data when they access blkg and
blkg-related objects. BFQ does access these objects, and does incur
this problem, in the following case.
The blkg_lookup performed in bfq_get_queue, being protected (only)
through rcu, may happen to return the address of a copy of the
original blkg. If this is the case, then the blkg_get performed in
bfq_get_queue, to pin down the blkg, is useless: it does not prevent
blk-cgroup code from destroying both the original blkg and all objects
directly or indirectly referred by the copy of the blkg. BFQ accesses
these objects, which typically causes a crash for NULL-pointer
dereference of memory-protection violation.
Some additional protection mechanism should be added to blk-cgroup to
address this issue. In the meantime, this commit provides a quick
temporary fix for BFQ: cache (when safe) blkg data that might
disappear right after a blkg_lookup.
In particular, this commit exploits the following facts to achieve its
goal without introducing further locks. Destroy operations on a blkg
invoke, as a first step, hooks of the scheduler associated with the
blkg. And these hooks are executed with bfqd->lock held for BFQ. As a
consequence, for any blkg associated with the request queue an
instance of BFQ is attached to, we are guaranteed that such a blkg is
not destroyed, and that all the pointers it contains are consistent,
while that instance is holding its bfqd->lock. A blkg_lookup performed
with bfqd->lock held then returns a fully consistent blkg, which
remains consistent until this lock is held. In more detail, this holds
even if the returned blkg is a copy of the original one.
Finally, also the object describing a group inside BFQ needs to be
protected from destruction on the blkg_free of the original blkg
(which invokes bfq_pd_free). This commit adds private refcounting for
this object, to let it disappear only after no bfq_queue refers to it
any longer.
This commit also removes or updates some stale comments on locking
issues related to blk-cgroup operations.
Reported-by: Tomas Konir <tomas.konir@gmail.com>
Reported-by: Lee Tibbert <lee.tibbert@gmail.com>
Reported-by: Marco Piazza <mpiazza@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Tested-by: Tomas Konir <tomas.konir@gmail.com>
Tested-by: Lee Tibbert <lee.tibbert@gmail.com>
Tested-by: Marco Piazza <mpiazza@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Stephen Hemminger says:
====================
netvsc: bug fixes
These are bugfixes for netvsc driver in 4.12.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The work queue and handling of network filter parameters should
be in rndis_device. This gets rid of warning from RCU checks,
eliminates a race and cleans up code.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ndo_poll_controller function needs to schedule NAPI to pick
up arriving packets and send completions. Otherwise no data
will ever be received. For simple case of netconsole, it also
will allow send completions to happen. Without this netpoll
will eventually get stuck.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool info command calls the netvsc get_sset_count with RTNL
but not with RCU. Which causes warning:
drivers/net/hyperv/netvsc_drv.c:1010 suspicious rcu_dereference_check() usage!
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting
down a guest running iperf on a VFIO assigned device. This happens
because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt
context, while a worker thread does the same inside kvm_set_irq(). If the
interrupt happens while the worker thread is executing __srcu_read_lock(),
updates to the Classic SRCU ->lock_count[] field or the Tree SRCU
->srcu_lock_count[] field can be lost.
The docs say you are not supposed to call srcu_read_lock() and
srcu_read_unlock() from irq context, but KVM interrupt injection happens
from (host) interrupt context and it would be nice if SRCU supported the
use case. KVM is using SRCU here not really for the "sleepable" part,
but rather due to its IPI-free fast detection of grace periods. It is
therefore not desirable to switch back to RCU, which would effectively
revert commit 719d93cd5f ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING",
2014-01-16).
However, the docs are overly conservative. You can have an SRCU instance
only has users in irq context, and you can mix process and irq context
as long as process context users disable interrupts. In addition,
__srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and
Classic SRCU. For those two implementations, only srcu_read_lock()
is unsafe.
When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(),
in commit 5a41344a3d ("srcu: Simplify __srcu_read_unlock() via
this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments.
Therefore it kept __this_cpu_inc(), with preempt_disable/enable in
the caller. Tree SRCU however only does one increment, so on most
architectures it is more efficient for __srcu_read_lock() to use
this_cpu_inc(), and any performance differences appear to be down in
the noise.
Cc: stable@vger.kernel.org
Fixes: 719d93cd5f ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING")
Reported-by: Linu Cherian <linuc.decode@gmail.com>
Suggested-by: Linu Cherian <linuc.decode@gmail.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting
down a guest running iperf on a VFIO assigned device. This happens
because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt
context, while a worker thread does the same inside kvm_set_irq(). If the
interrupt happens while the worker thread is executing __srcu_read_lock(),
updates to the Classic SRCU ->lock_count[] field or the Tree SRCU
->srcu_lock_count[] field can be lost.
The docs say you are not supposed to call srcu_read_lock() and
srcu_read_unlock() from irq context, but KVM interrupt injection happens
from (host) interrupt context and it would be nice if SRCU supported the
use case. KVM is using SRCU here not really for the "sleepable" part,
but rather due to its IPI-free fast detection of grace periods. It is
therefore not desirable to switch back to RCU, which would effectively
revert commit 719d93cd5f ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING",
2014-01-16).
However, the docs are overly conservative. You can have an SRCU instance
only has users in irq context, and you can mix process and irq context
as long as process context users disable interrupts. In addition,
__srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and
Classic SRCU. For those two implementations, only srcu_read_lock()
is unsafe.
When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(),
in commit 5a41344a3d ("srcu: Simplify __srcu_read_unlock() via
this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments.
Therefore it kept __this_cpu_inc(), with preempt_disable/enable in
the caller. Tree SRCU however only does one increment, so on most
architectures it is more efficient for __srcu_read_lock() to use
this_cpu_inc(), and any performance differences appear to be down in
the noise.
Unlike Classic and Tree SRCU, Tiny SRCU does increments and decrements on
a single variable. Therefore, as Peter Zijlstra pointed out, Tiny SRCU's
implementation already supports mixed-context use of srcu_read_lock()
and srcu_read_unlock(), at least as long as uses of srcu_read_lock()
and srcu_read_unlock() in each handler are nested and paired properly.
In other words, it is still illegal to (say) invoke srcu_read_lock()
in an interrupt handler and to invoke the matching srcu_read_unlock()
in a softirq handler. Therefore, the only change required for Tiny SRCU
is to its comments.
Fixes: 719d93cd5f ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING")
Reported-by: Linu Cherian <linuc.decode@gmail.com>
Suggested-by: Linu Cherian <linuc.decode@gmail.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
The 0-day kernel test robot reports assertion failures on
!CONFIG_SMP kernels due to failed spin_is_locked() checks. As it
turns out, spin_is_locked() is hardcoded to return zero on
!CONFIG_SMP kernels and so this function cannot be relied on to
verify spinlock state in this configuration.
To avoid this problem, replace the associated asserts with lockdep
variants that do the right thing regardless of kernel configuration.
Drop the one assert that checks for an unlocked lock as there is no
suitable lockdep variant for that case. This moves the spinlock
checks from XFS debug code to lockdep, but generally provides the
same level of protection.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Roopa reported attempts to delete a bond device that is referenced in a
multipath route is hanging:
$ ifdown bond2 # ifupdown2 command that deletes virtual devices
unregister_netdevice: waiting for bond2 to become free. Usage count = 2
Steps to reproduce:
echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown
ip link add dev bond12 type bond
ip link add dev bond13 type bond
ip addr add 2001:db8:2::0/64 dev bond12
ip addr add 2001:db8:3::0/64 dev bond13
ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2
ip link del dev bond12
ip link del dev bond13
The root cause is the recent change to keep routes on a linkdown. Update
the check to detect when the device is unregistering and release the
route for that case.
Fixes: a1a22c1206 ("net: ipv6: Keep nexthop of multipath route on admin down")
Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the structure's fields are not initialized by the
rtnetlink. If driver doesn't set those in ndo_get_vf_config(),
they'd leak memory to user.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
CC: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Verify that the length of the socket buffer is sufficient to cover the
nlmsghdr structure before accessing the nlh->nlmsg_len field for further
input sanitization. If the client only supplies 1-3 bytes of data in
sk_buff, then nlh->nlmsg_len remains partially uninitialized and
contains leftover memory from the corresponding kernel allocation.
Operating on such data may result in indeterminate evaluation of the
nlmsg_len < sizeof(*nlh) expression.
The bug was discovered by a runtime instrumentation designed to detect
use of uninitialized memory in the kernel. The patch prevents this and
other similar tools (e.g. KMSAN) from flagging this behavior in the future.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 85eac2ba35.
There is an updated version of this fix which we should
use instead.
Signed-off-by: David S. Miller <davem@davemloft.net>
emac_mdio_read_link() was not copying the requested phy settings
back into the emac driver's own phy api. This has caused a link
speed mismatch issue for the AR8035 as the emac driver kept
trying to connect with 10/100MBps on a 1GBit/s link.
This patch also unifies shared code between emac_setup_aneg()
and emac_mdio_setup_forced(). And furthermore it removes
a chunk of emac_mdio_init_phy(), that was copying the same
data into itself.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a problem where the AR8035 PHY can't be
detected on an Cisco Meraki MR24, if the ethernet cable is
not connected on boot.
Russell Senior provided steps to reproduce the issue:
|Disconnect ethernet cable, apply power, wait until device has booted,
|plug in ethernet, check for interfaces, no eth0 is listed.
|
|This appears to be a problem during probing of the AR8035 Phy chip.
|When ethernet has no link, the phy detection fails, and eth0 is not
|created. Plugging ethernet later has no effect, because there is no
|interface as far as the kernel is concerned. The relevant part of
|the boot log looks like this:
|this is the failing case:
|
|[ 0.876611] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.882532] /plb/opb/ethernet@ef600c00: reset timeout
|[ 0.888546] /plb/opb/ethernet@ef600c00: can't find PHY!
|and the succeeding case:
|
|[ 0.876672] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.883952] eth0: EMAC-0 /plb/opb/ethernet@ef600c00, MAC 00:01:..
|[ 0.890822] eth0: found Atheros 8035 Gigabit Ethernet PHY (0x01)
Based on the comment and the commit message of
commit 23fbb5a87c ("emac: Fix EMAC soft reset on 460EX/GT").
This is because the AR8035 PHY doesn't provide the TX Clock,
if the ethernet cable is not attached. This causes the reset
to timeout and the PHY detection code in emac_init_phy() is
unable to detect the AR8035 PHY. As a result, the emac driver
bails out early and the user left with no ethernet.
In order to stay compatible with existing configurations, the driver
tries the current reset approach at first. Only if the first attempt
timed out, it does perform one more retry with the clock temporarily
switched to the internal source for just the duration of the reset.
LEDE-Bug: #687 <https://bugs.lede-project.org/index.php?do=details&task_id=687>
Cc: Chris Blake <chrisrblake93@gmail.com>
Reported-by: Russell Senior <russell@personaltelco.net>
Fixes: 23fbb5a87c ("emac: Fix EMAC soft reset on 460EX/GT")
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Verify that the length of the socket buffer is sufficient to cover the
entire nlh->nlmsg_len field before accessing that field for further
input sanitization. If the client only supplies 1-3 bytes of data in
sk_buff, then nlh->nlmsg_len remains partially uninitialized and
contains leftover memory from the corresponding kernel allocation.
Operating on such data may result in indeterminate evaluation of the
nlmsg_len < sizeof(*nlh) expression.
The bug was discovered by a runtime instrumentation designed to detect
use of uninitialized memory in the kernel. The patch prevents this and
other similar tools (e.g. KMSAN) from flagging this behavior in the future.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph writes:
"A few NVMe fixes for 4.12-rc, PCIe reset fixes and APST fixes, a
RDMA reconnect fix, two FC fixes and a general controller removal fix."
> ../drivers/hsi/clients/ssi_protocol.c:1069:5: error: 'struct net_device' has no member named 'destructor'
Reported-by: Mark Brown <broonie@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/gpu/drm/i915/intel_engine_cs.c: In function ‘intel_engine_is_idle’:
drivers/gpu/drm/i915/intel_engine_cs.c:1103:27: error: unused variable ‘dev_priv’ [-Werror=unused-variable]
struct drm_i915_private *dev_priv = engine->i915;
^~~~~~~~
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
While installing SLES-12 (based on v4.4), I found that the installer
will stall for 60+ seconds during LVM disk scan. The root cause was
determined to be the removal of a bound device check in loop_flush()
by commit b5dd2f6047 ("block: loop: improve performance via blk-mq").
Restoring this check, examining ->lo_state as set by loop_set_fd()
eliminates the bad behavior.
Test method:
modprobe loop max_loop=64
dd if=/dev/zero of=disk bs=512 count=200K
for((i=0;i<4;i++))do losetup -f disk; done
mkfs.ext4 -F /dev/loop0
for((i=0;i<4;i++))do mkdir t$i; mount /dev/loop$i t$i;done
for f in `ls /dev/loop[0-9]*|sort`; do \
echo $f; dd if=$f of=/dev/null bs=512 count=1; \
done
Test output: stock patched
/dev/loop0 18.1217e-05 8.3842e-05
/dev/loop1 6.1114e-05 0.000147979
/dev/loop10 0.414701 0.000116564
/dev/loop11 0.7474 6.7942e-05
/dev/loop12 0.747986 8.9082e-05
/dev/loop13 0.746532 7.4799e-05
/dev/loop14 0.480041 9.3926e-05
/dev/loop15 1.26453 7.2522e-05
Note that from loop10 onward, the device is not mounted, yet the
stock kernel consumes several orders of magnitude more wall time
than it does for a mounted device.
(Thanks for Mike Galbraith <efault@gmx.de>, give a changelog review.)
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: James Wang <jnwang@suse.com>
Fixes: b5dd2f6047 ("block: loop: improve performance via blk-mq")
Signed-off-by: Jens Axboe <axboe@fb.com>
If "i" is the last element in the vcpu->arch.cpuid_entries[] array, it
potentially can be exploited the vulnerability. this will out-of-bounds
read and write. Luckily, the effect is small:
/* when no next entry is found, the current entry[i] is reselected */
for (j = i + 1; ; j = (j + 1) % nent) {
struct kvm_cpuid_entry2 *ej = &vcpu->arch.cpuid_entries[j];
if (ej->function == e->function) {
It reads ej->maxphyaddr, which is user controlled. However...
ej->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
After cpuid_entries there is
int maxphyaddr;
struct x86_emulate_ctxt emulate_ctxt; /* 16-byte aligned */
So we have:
- cpuid_entries at offset 1B50 (6992)
- maxphyaddr at offset 27D0 (6992 + 3200 = 10192)
- padding at 27D4...27DF
- emulate_ctxt at 27E0
And it writes in the padding. Pfew, writing the ops field of emulate_ctxt
would have been much worse.
This patch fixes it by modding the index to avoid the out-of-bounds
access. Worst case, i == j and ej->function == e->function,
the loop can bail out.
Reported-by: Moguofang <moguofang@huawei.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Guofang Mo <moguofang@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM/ARM Fixes for v4.12-rc5 - Take 2
Changes include:
- Fix an issue with migrating GICv2 VMs on GICv3 systems.
- Squashed a bug for gicv3 when figuring out preemption levels.
- Fix a potential null pointer derefence in KVM happening under memory
pressure.
- Maintain RES1 bits in the SCTLR_EL2 to make sure KVM works on new
architecture revisions.
- Allow unaligned accesses at EL2/HYP
Since introduction of tracing for init functions the in_kernel_space()
check is no longer correct, as it ignores the init sections. As a
result, when probes are inserted (and disabled) in the init functions,
a branch instruction is inserted instead of a nop, which is likely to
result in random crashes during boot.
Remove the MIPS-specific in_kernel_space() method and replace it with a
generic core_kernel_text() that also checks for init sections during
system boot stage.
Fixes: 42c269c88d ("ftrace: Allow for function tracing to record init functions on boot up")
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Tested-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16092/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Space reserved for PKMap should span from PKMAP_BASE to FIXADDR_START.
For large page sizes this is not the case as eg. for 64k pages the range
currently defined is from 0xfe000000 to 0x102000000(!!) which obviously
isn't right.
Remove the hardcoded location and set the BASE address as an offset from
FIXADDR_START.
Since all PKMAP ptes have to be placed in a contiguous memory, ensure
that this is the case by placing them all in a single page. This is
achieved by aligning the end address to pkmap pages count pages.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15950/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
All PTEs used by PKMAP should be allocated in a contiguous memory area,
but we do not currently have a mechanism to enforce that, so ensure that
we don't try to allocate more entries than would fit in a single page.
Current fixed value of 1024 would not work with XPA enabled when
sizeof(pte_t)==8 and we need two pages to store pte tables.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15949/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
fixrange_init operates at PMD-granularity and expects the addresses to
be PMD-size aligned, but currently that might not be the case for
PKMAP_BASE unless it is defined properly, so ensure a correct alignment
is used before passing the address to fixrange_init.
fixed mappings: only align the start address that is passed to
fixrange_init rather than the value before adding the size, as we may
end up with uninitialised upper part of the range.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15948/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The PPC_DT_CPU_FTRs is a bit misplaced in menuconfig, it shows up with
other general kernel options. It's really more at home in the "Platform
Support" section, so move it there.
Also enable it by default, for Book3s 64. It does mostly nothing unless
the device tree properties are found, and we will want it enabled
eventually in distro kernels, so turn it on to start getting more
testing.
Fixes: 5a61ef74f2 ("powerpc/64s: Support new device tree binding for discovering CPU features")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Supporting 512TB requires us to do a order 3 allocation for level 1 page
table (pgd). This results in page allocation failures with certain workloads.
For now limit 4k linux page size config to 64TB.
Fixes: f6eedbba7a ("powerpc/mm/hash: Increase VA range to 128TB")
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When calling CEC_RECEIVE do not check if the adapter is configured.
Typically CEC_RECEIVE is called after a select() and if that indicates
that there are messages in the receive queue, then you should always be
able to dequeue a message.
The race condition here is that a message has been received and is
queued, so select() tells userspace that a message is available. But
before the application calls CEC_RECEIVE the adapter is unconfigured
(e.g. the HDMI cable is removed). Now select will always report that
there is a message, but calling CEC_RECEIVE will always return -ENONET
because the adapter is no longer configured and so will never actually
dequeue the message.
There is really no need for this check, and in fact the ENONET error
code was never documented for CEC_RECEIVE. This may have been a left-over
of old code that was never updated.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: <stable@vger.kernel.org> # for v4.10 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This reverts commit cf39bf58af.
The commit regression to users that define both console=ttyS1
and console=ttyS0 on the command line, see
https://lkml.kernel.org/r/20170509082915.GA13236@bistromath.localdomain
The kernel log messages always appeared only on one serial port. It is
even documented in Documentation/admin-guide/serial-console.rst:
"Note that you can only define one console per device type (serial,
video)."
The above mentioned commit changed the order in which the command line
parameters are searched. As a result, the kernel log messages go to
the last mentioned ttyS* instead of the first one.
We long thought that using two console=ttyS* on the command line
did not make sense. But then we realized that console= parameters
were handled also by systemd, see
http://0pointer.de/blog/projects/serial-console.html
"By default systemd will instantiate one serial-getty@.service on
the main kernel console, if it is not a virtual terminal."
where
"[4] If multiple kernel consoles are used simultaneously, the main
console is the one listed first in /sys/class/tty/console/active,
which is the last one listed on the kernel command line."
This puts the original report into another light. The system is running
in qemu. The first serial port is used to store the messages into a file.
The second one is used to login to the system via a socket. It depends
on systemd and the historic kernel behavior.
By other words, systemd causes that it makes sense to define both
console=ttyS1 console=ttyS0 on the command line. The kernel fix
caused regression related to userspace (systemd) and need to be
reverted.
In addition, it went out that the fix helped only partially.
The messages still were duplicated when the boot console was
removed early by late_initcall(printk_late_init). Then the entire
log was replayed when the same console was registered as a normal one.
Link: 20170606160339.GC7604@pathway.suse.cz
Cc: Aleksey Makarov <aleksey.makarov@linaro.org>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Robin Murphy <robin.murphy@arm.com>,
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "Nair, Jayachandran" <Jayachandran.Nair@cavium.com>
Cc: linux-serial@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
On sparc, if we have an alloca() like situation, as is the case with
SHASH_DESC_ON_STACK(), we can end up referencing deallocated stack
memory. The result can be that the value is clobbered if a trap
or interrupt arrives at just the right instruction.
It only occurs if the function ends returning a value from that
alloca() area and that value can be placed into the return value
register using a single instruction.
For example, in lib/libcrc32c.c:crc32c() we end up with a return
sequence like:
return %i7+8
lduw [%o5+16], %o0 ! MEM[(u32 *)__shash_desc.1_10 + 16B],
%o5 holds the base of the on-stack area allocated for the shash
descriptor. But the return released the stack frame and the
register window.
So if an intererupt arrives between 'return' and 'lduw', then
the value read at %o5+16 can be corrupted.
Add a data compiler barrier to work around this problem. This is
exactly what the gcc fix will end up doing as well, and it absolutely
should not change the code generated for other cpus (unless gcc
on them has the same bug :-)
With crucial insight from Eric Sandeen.
Cc: <stable@vger.kernel.org>
Reported-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
During early boot, load_ucode_intel_ap() uses __load_ucode_intel()
to obtain a pointer to the relevant microcode patch (embedded in the
initrd), and stores this value in 'intel_ucode_patch' to speed up the
microcode patch application for subsequent CPUs.
On resuming from suspend-to-RAM, however, load_ucode_ap() calls
load_ucode_intel_ap() for each non-boot-CPU. By then the initramfs is
long gone so the pointer stored in 'intel_ucode_patch' no longer points to
a valid microcode patch.
Clear that pointer so that we effectively fall back to the CPU hotplug
notifier callbacks to update the microcode.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
[ Edit and massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.10..
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170607095819.9754-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I will be traveling in the upcoming months and it'll be much easier for me
to access my kernel.org email rather than my work one. Change my email
address in the MAINTAINERS file from jeyu@redhat.com to jeyu@kernel.org.
Signed-off-by: Jessica Yu <jeyu@redhat.com>
It's possible that get_random_{u32,u64} is used before the crng has
initialized, in which case, its output might not be cryptographically
secure. For this problem, directly, this patch set is introducing the
*_wait variety of functions, but even with that, there's a subtle issue:
what happens to our batched entropy that was generated before
initialization. Prior to this commit, it'd stick around, supplying bad
numbers. After this commit, we force the entropy to be re-extracted
after each phase of the crng has initialized.
In order to avoid a race condition with the position counter, we
introduce a simple rwlock for this invalidation. Since it's only during
this awkward transition period, after things are all set up, we stop
using it, so that it doesn't have an impact on performance.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org # v4.11+
Linus pointed out that there is a much more efficient way of avoiding
the problem that we were trying to address in commit 9dfa7bba35:
"fix race in drivers/char/random.c:get_reg()".
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Network devices can allocate reasources and private memory using
netdev_ops->ndo_init(). However, the release of these resources
can occur in one of two different places.
Either netdev_ops->ndo_uninit() or netdev->destructor().
The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.
netdev_ops->ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.
netdev->destructor(), on the other hand, does not run until the
netdev references all go away.
Further complicating the situation is that netdev->destructor()
almost universally does also a free_netdev().
This creates a problem for the logic in register_netdevice().
Because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.
If netdev_ops->ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call ndo_ops->ndo_uninit(). But
it is not able to invoke netdev->destructor().
This is because netdev->destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.
However, this means that the resources that would normally be released
by netdev->destructor() will not be.
Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.
Many drivers do not try to deal with this, and instead we have leaks.
Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev->destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().
netdev->priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev->destructor(), except for
free_netdev().
netdev->needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().
Now, register_netdevice() can sanely release all resources after
ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
and netdev->priv_destructor().
And at the end of unregister_netdevice(), we invoke
netdev->priv_destructor() and optionally call free_netdev().
Signed-off-by: David S. Miller <davem@davemloft.net>
Will reported that in BPF_XADD we must use a different register in stxr
instruction for the status flag due to otherwise CONSTRAINED UNPREDICTABLE
behavior per architecture. Reference manual says [1]:
If s == t, then one of the following behaviors must occur:
* The instruction is UNDEFINED.
* The instruction executes as a NOP.
* The instruction performs the store to the specified address, but
the value stored is UNKNOWN.
Thus, use a different temporary register for the status flag to fix it.
Disassembly extract from test 226/STX_XADD_DW from test_bpf.ko:
[...]
0000003c: c85f7d4b ldxr x11, [x10]
00000040: 8b07016b add x11, x11, x7
00000044: c80c7d4b stxr w12, x11, [x10]
00000048: 35ffffac cbnz w12, 0x0000003c
[...]
[1] https://static.docs.arm.com/ddi0487/b/DDI0487B_a_armv8_arm.pdf, p.6132
Fixes: 85f68fe898 ("bpf, arm64: implement jiting of BPF_XADD")
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvpp22_port_mii_set() function was added by 2697582144, but the
function directly returns without doing anything. This return was used
when debugging and wasn't removed before sending the patch. Fix this.
Fixes: 2697582144 ("net: mvpp2: handle misc PPv2.1/PPv2.2 differences")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changing the mtu is currently not supported in the ibmvnic driver.
Implement .ndo_change_mtu in the driver so that attempting to use ifconfig
to change the mtu will fail and present the user with an error message.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit eea40b8f62 ("infiniband: call ipv6 route lookup via the stub
interface") introduced a regression in address resolution when connecting
to IPv6 destination addresses. The old code called ip6_route_output(),
while the new code calls ipv6_stub->ipv6_dst_lookup(). The two are almost
the same, except that ipv6_dst_lookup() also calls ip6_route_get_saddr()
if the source address is in6addr_any.
This means that the test of ipv6_addr_any(&fl6.saddr) now never succeeds,
and so we never copy the source address out. This ends up causing
rdma_resolve_addr() to fail, because without a resolved source address,
cma_acquire_dev() will fail to find an RDMA device to use. For me, this
causes connecting to an NVMe over Fabrics target via RoCE / IPv6 to fail.
Fix this by copying out fl6.saddr if ipv6_addr_any() is true for the original
source address passed into addr6_resolve(). We can drop our call to
ipv6_dev_get_saddr() because ipv6_dst_lookup() already does that work.
Fixes: eea40b8f62 ("infiniband: call ipv6 route lookup via the stub interface")
Cc: <stable@vger.kernel.org> # 3.12+
Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The Lifebook E546 and E557 touchpad were also not functioning and
worked after running:
echo "1" > /sys/devices/platform/i8042/serio2/crc_enabled
Add them to the list of machines that need this workaround.
Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
Reviewed-by: Arjan Opmeer <arjan@opmeer.net>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
When freeing VF's DMA mappings, an already NULLed pointer was checked
again due to an apparent copy&paste error. Consequently, the pf2vf
bulletin DMA mapping was not freed.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KMSAN reported a use of uninitialized memory in dev_set_alias(),
which was caused by calling strlcpy() (which in turn called strlen())
on the user-supplied non-terminated string.
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Only print NMI watchdog hint in 'perf stat' when it is enabled (Andi Kleen)
- Fix sys_mmap/sys_old_mmap shandling in s390 in 'perf trace' (Jiri Olsa)
- Disable breakpoint signal tests in powerpc, that lacks the perf kernel
glue to set breakpoint events and makes 'perf test' always fail (Jiri Olsa)
- Fix 'perf annotate' for branch instruction with multiple operands (Kim Phillips)
- Add missing powerpc triplet when disassembling with 'objdump' in 'perf
annotate' (Kim Phillips)
- Do not trow away partial unwound stacks when using libdw, making
callchains produced with it similar to those produced when linked with
the other DWARF unwind library supported in perf, libunwind (Milian Wolff)
- Fixes to properly handle kernel modules when processing build-id meta
events (Namhyung Kim)
- Fix handling of compressed modules in the build-id cache (Namhyung Kim)
- Fix 'perf annotate' failure when filename has special chars (Ravi Bangoria)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
hard disk IO latency varies a lot depending on spindle move. The latency
range could be from several microseconds to several milliseconds. It's
pretty hard to get the baseline latency used by io.low.
We will use a different stragety here. The idea is only using IO with
spindle move to determine if cgroup IO is in good state. For HD, if io
latency is small (< 1ms), we ignore the IO. Such IO is likely from
sequential IO, and is helpless to help determine if a cgroup's IO is
impacted by other cgroups. With this, we only account IO with big
latency. Then we can choose a hardcoded baseline latency for HD (4ms,
which is typical IO latency with seek). With all these settings, the
io.low latency works for both HD and SSD.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
While introducing HDMI support, component matching on connectors node
were bypassed since no driver would actually bind on the DT node.
But when only a CVBS connector is present, only a single node is found
in the graph, but ignored and a NULL match table is given to the
component code.
This code permits bypassing the components framework by binding directly
the DRM driver when no components needs to be loaded.
Fixes: a41e82e6c4 ("drm/meson: Add support for components")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1496067352-8733-1-git-send-email-narmstrong@baylibre.com
I have encountered a NULL pointer dereference in
throtl_schedule_pending_timer:
[ 413.735396] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
[ 413.735535] IP: [<ffffffff812ebbbf>] throtl_schedule_pending_timer+0x3f/0x210
[ 413.735643] PGD 22c8cf067 PUD 22cb34067 PMD 0
[ 413.735713] Oops: 0000 [#1] SMP
......
This is caused by the following case:
blk_throtl_bio
throtl_schedule_next_dispatch <= sq is top level one without parent
throtl_schedule_pending_timer
sq_to_tg(sq)->td->throtl_slice <= sq_to_tg(sq) returns NULL
Fix it by using sq_to_td instead of sq_to_tg(sq)->td, which will always
return a valid td.
Fixes: 297e3d8547 ("blk-throttle: make throtl_slice tunable")
Signed-off-by: Joseph Qi <qijiang.qj@alibaba-inc.com>
Reviewed-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The scanline counter is bonkers on VLV/CHV DSI. The scanline counter
increment is not lined up with the start of vblank like it is on
every other platform and output type. This causes problems for
both the vblank timestamping and atomic update vblank evasion.
On my FFRD8 machine at least, the scanline counter increment
happens about 1/3 of a scanline ahead of the start of vblank (which
is where all register latching happens still). That means we can't
trust the scanline counter to tell us whether we're in vblank or not
while we're on that particular line. In order to keep vblank
timestamping in working condition when called from the vblank irq,
we'll leave scanline_offset at one, which means that the entire
line containing the start of vblank is considered to be inside
the vblank.
For the vblank evasion we'll need to consider that entire line
to be bad, since we can't tell whether the registers already
got latched or not. And we can't actually use the start of vblank
interrupt to get us past that line as the interrupt would fire
too soon, and then we'd up waiting for the next start of vblank
instead. One way around that would using the frame start
interrupt instead since that wouldn't fire until the next
scanline, but that would require some bigger changes in the
interrupt code. So for simplicity we'll just poll until we get
past the bad line.
v2: Adjust the comments a bit
Cc: stable@vger.kernel.org
Cc: Jonas Aaberg <cja@gmx.net>
Tested-by: Jonas Aaberg <cja@gmx.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99086
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161215174734.28779-1-ville.syrjala@linux.intel.com
Tested-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
(cherry picked from commit ec1b4ee283)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Since
commit bac2a909a0
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Wed Jan 21 02:17:42 2015 +0100
PCI / PM: Avoid resuming PCI devices during system suspend
PCI devices will default to allowing the system suspend complete
optimization where devices are not woken up during system suspend if
they were already runtime suspended. This however breaks the i915/HDA
drivers for two reasons:
- The i915 driver has system suspend specific steps that it needs to
run, that bring the device to a different state than its runtime
suspended state.
- The HDA driver's suspend handler requires power that it will request
from the i915 driver's power domain handler. This in turn requires the
i915 driver to runtime resume itself, but this won't be possible if the
suspend complete optimization is in effect: in this case the i915
runtime PM is disabled and trying to get an RPM reference returns
-EACCESS.
Solve this by requiring the PCI/PM core to resume the device during
system suspend which in effect disables the suspend complete optimization.
Regardless of the above commit the optimization stayed disabled for DRM
devices until
commit d14d2a8453
Author: Lukas Wunner <lukas@wunner.de>
Date: Wed Jun 8 12:49:29 2016 +0200
drm: Remove dev_pm_ops from drm_class
so this patch is in practice a fix for this commit. Another reason for
the bug staying hidden for so long is that the optimization for a device
is disabled if it's disabled for any of its children devices. i915 may
have a backlight device as its child which doesn't support runtime PM
and so doesn't allow the optimization either. So if this backlight
device got registered the bug stayed hidden.
Credits to Marta, Tomi and David who enabled pstore logging,
that caught one instance of this issue across a suspend/
resume-to-ram and Ville who rememberd that the optimization was enabled
for some devices at one point.
The first WARN triggered by the problem:
[ 6250.746445] WARNING: CPU: 2 PID: 17384 at drivers/gpu/drm/i915/intel_runtime_pm.c:2846 intel_runtime_pm_get+0x6b/0xd0 [i915]
[ 6250.746448] pm_runtime_get_sync() failed: -13
[ 6250.746451] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core ptp mei_me pps_core snd_pcm lpc_ich mei prime_
numbers i2c_hid i2c_designware_platform i2c_designware_core [last unloaded: i915]
[ 6250.746512] CPU: 2 PID: 17384 Comm: kworker/u8:0 Tainted: G U W 4.11.0-rc5-CI-CI_DRM_334+ #1
[ 6250.746515] Hardware name: /NUC5i5RYB, BIOS RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
[ 6250.746521] Workqueue: events_unbound async_run_entry_fn
[ 6250.746525] Call Trace:
[ 6250.746530] dump_stack+0x67/0x92
[ 6250.746536] __warn+0xc6/0xe0
[ 6250.746542] ? pci_restore_standard_config+0x40/0x40
[ 6250.746546] warn_slowpath_fmt+0x46/0x50
[ 6250.746553] ? __pm_runtime_resume+0x56/0x80
[ 6250.746584] intel_runtime_pm_get+0x6b/0xd0 [i915]
[ 6250.746610] intel_display_power_get+0x1b/0x40 [i915]
[ 6250.746646] i915_audio_component_get_power+0x15/0x20 [i915]
[ 6250.746654] snd_hdac_display_power+0xc8/0x110 [snd_hda_core]
[ 6250.746661] azx_runtime_resume+0x218/0x280 [snd_hda_intel]
[ 6250.746667] pci_pm_runtime_resume+0x76/0xa0
[ 6250.746672] __rpm_callback+0xb4/0x1f0
[ 6250.746677] ? pci_restore_standard_config+0x40/0x40
[ 6250.746682] rpm_callback+0x1f/0x80
[ 6250.746686] ? pci_restore_standard_config+0x40/0x40
[ 6250.746690] rpm_resume+0x4ba/0x740
[ 6250.746698] __pm_runtime_resume+0x49/0x80
[ 6250.746703] pci_pm_suspend+0x57/0x140
[ 6250.746709] dpm_run_callback+0x6f/0x330
[ 6250.746713] ? pci_pm_freeze+0xe0/0xe0
[ 6250.746718] __device_suspend+0xf9/0x370
[ 6250.746724] ? dpm_watchdog_set+0x60/0x60
[ 6250.746730] async_suspend+0x1a/0x90
[ 6250.746735] async_run_entry_fn+0x34/0x160
[ 6250.746741] process_one_work+0x1f2/0x6d0
[ 6250.746749] worker_thread+0x49/0x4a0
[ 6250.746755] kthread+0x107/0x140
[ 6250.746759] ? process_one_work+0x6d0/0x6d0
[ 6250.746763] ? kthread_create_on_node+0x40/0x40
[ 6250.746768] ret_from_fork+0x2e/0x40
[ 6250.746778] ---[ end trace 102a62fd2160f5e6 ]---
v2:
- Use the new pci_dev->needs_resume flag, to avoid any overhead during
the ->pm_prepare hook. (Rafael)
v3:
- Update commit message to reference the actual regressing commit.
(Lukas)
v4:
- Rebase on v4 of patch 1/2.
Fixes: d14d2a8453 ("drm: Remove dev_pm_ops from drm_class")
References: https://bugs.freedesktop.org/show_bug.cgi?id=100378
References: https://bugs.freedesktop.org/show_bug.cgi?id=100770
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: linux-pci@vger.kernel.org
Cc: <stable@vger.kernel.org> # v4.10.x: 4d071c3 - PCI/PM: Add needs_resume flag
Cc: <stable@vger.kernel.org> # v4.10.x
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-and-tested-by: Marta Lofstedt <marta.lofstedt@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1493726649-32094-2-git-send-email-imre.deak@intel.com
(cherry picked from commit adfdf85d79)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
While the atomic modesetting capability is signaled also elsewhere, also
reflect it by a driver minor bump.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
These function implementations and/or declarations are no longer used
now that atomic is enabled.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reported-by: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Trivial fix to spelling mistake in DRM_ERROR error message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
The previous attempt at this had an issue with with num_clips > 1
because it would always end up using the coordinates of the last
clip while using width and height calculated from the bounding
box of all the clips.
So if the last clip happens to be not at the top-left corner of
the bounding box, the CPU blit operation would go out of bounds.
The original intent was to coalesce all the clips into one blit,
and to do that we need to also track the starting point of the
content buffer.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
When a new FB is bound, we have to send an update command otherwise
the new FB may not be shown
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
When vmw_gb_surface_define_ioctl() is called with an existing buffer,
we end up returning an uninitialized variable in the backup_handle.
The fix is to first initialize backup_handle to 0 just to be sure, and
second, when a user-provided buffer is found, we will use the
req->buffer_handle as the backup_handle.
Cc: <stable@vger.kernel.org>
Reported-by: Murray McAllister <murray.mcallister@insomniasec.com>
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
With atomic, the cursor surface is treated like a FB. Creating
a proxy surface for cursor doesn't gain us much benefit.
This fixes the issue on atomic enabled 2D VMs where the cursor
disappears.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
BXT has a H/W issue with IOMMU which can lead to system hangs when
Aperture accesses are queued within the GAM behind GTT Accesses.
This patch avoids the condition by wrapping all GTT updates in stop_machine
and using a flushing read prior to restarting the machine.
The stop_machine guarantees no new Aperture accesses can begin while
the PTE writes are being emmitted. The flushing read ensures that
any following Aperture accesses cannot begin until the PTE writes
have been cleared out of the GAM's fifo.
Only FOLLOWING Aperture accesses need to be separated from in flight
PTE updates. PTE Writes may follow tightly behind already in flight
Aperture accesses, so no flushing read is required at the start of
a PTE update sequence.
This issue was reproduced by running
igt/gem_readwrite and
igt/gem_render_copy
simultaneously from different processes, each in a tight loop,
with INTEL_IOMMU enabled.
This patch was originally published as:
drm/i915: Serialize GTT Updates on BXT
[Note: This will cause a performance penalty for some use cases, but
avoiding hangs trumps performance hits. This may need to be worked
around in Mesa to recover the lost performance.]
v2: Move bxt/iommu detection into static function
Remove #ifdef CONFIG_INTEL_IOMMU protection
Make function names more reflective of purpose
Move flushing read into static function
v3: Tidy up for checkpatch.pl
Testcase: igt/gem_concurrent_blit
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/1495641251-30022-1-git-send-email-jon.bloomfield@intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 0ef34ad622)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Commit 5995a68 "xen/privcmd: Add support for Linux 64KB page granularity" did
not go far enough to support 64KB in mmap_batch_fn.
The variable 'nr' is the number of 4KB chunk to map. However, when Linux
is using 64KB page granularity the array of pages (vma->vm_private_data)
contain one page per 64KB. Fix it by incrementing st->index correctly.
Furthermore, st->va is not correctly incremented as PAGE_SIZE !=
XEN_PAGE_SIZE.
Fixes: 5995a68 ("xen/privcmd: Add support for Linux 64KB page granularity")
CC: stable@vger.kernel.org
Reported-by: Feng Kan <fkan@apm.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Christoph Hellwig suggests we should to make APST work out of the box.
Hence relax the the default max latency to make them able to enter
deepest power state on default.
Here are id-ctrl excerpts from two high latency NVMes:
vid : 0x14a4
ssvid : 0x1b4b
mn : CX2-GB1024-Q11 NVMe LITEON 1024GB
ps 3 : mp:0.1000W non-operational enlat:5000 exlat:5000 rrt:3 rrl:3
rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0100W non-operational enlat:50000 exlat:100000 rrt:4 rrl:4
rwt:4 rwl:4 idle_power:- active_power:-
vid : 0x15b7
ssvid : 0x1b4b
mn : A400 NVMe SanDisk 512GB
ps 3 : mp:0.0500W non-operational enlat:51000 exlat:10000 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
ps 4 : mp:0.0055W non-operational enlat:1000000 exlat:100000 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When a NVMe is in non-op states, the latency is exlat.
The latency will be enlat + exlat only when the NVMe tries to transit
from operational state right atfer it begins to transit to
non-operational state, which should be a rare case.
Therefore, as Andy Lutomirski suggests, use exlat only when deciding power
states to trainsit to.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The failure case, of a create controller request, called
nvme_uninit_ctrl() but didn't do a put to allow the nvme
controller to be deleted.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Per FC-NVME, when lldd or transport detects an i/o error, the
connection must be terminated, which in turn requires the association
to be termianted. Currently the transport simply creates a nvme
completion status of transport error and returns the io. The FC-NVME
spec makes the mandate as initiator and host, depending on the error,
can get out of sync on outstanding io counts (sqhd/sqtail).
Implement the association teardown on lldd or transport detected
errors.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
When we encounter an transport/controller errors, error recovery
kicks in which performs:
1. stops io/admin queues
2. moves transport queues out of LIVE state
3. fast fail pending io
4. schedule periodic reconnects.
But we also need to fast fail incoming IO taht enters after we
already scheduled. Given that our queue is not LIVE anymore, simply
restart the request queues to fail in .queue_rq
Reported-by: Alex Turin <alex@vastdata.com>
Reported-by: shahar.salzman <shahar.salzman@gmail.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
We need to start admin queues too in nvme_kill_queues()
for avoiding hang in remove path[1].
This patch is very similar with 806f026f9b901eaf(nvme: use
blk_mq_start_hw_queues() in nvme_kill_queues()).
[1] hang stack trace
[<ffffffff813c9716>] blk_execute_rq+0x56/0x80
[<ffffffff815cb6e9>] __nvme_submit_sync_cmd+0x89/0xf0
[<ffffffff815ce7be>] nvme_set_features+0x5e/0x90
[<ffffffff815ce9f6>] nvme_configure_apst+0x166/0x200
[<ffffffff815cef45>] nvme_set_latency_tolerance+0x35/0x50
[<ffffffff8157bd11>] apply_constraint+0xb1/0xc0
[<ffffffff8157cbb4>] dev_pm_qos_constraints_destroy+0xf4/0x1f0
[<ffffffff8157b44a>] dpm_sysfs_remove+0x2a/0x60
[<ffffffff8156d951>] device_del+0x101/0x320
[<ffffffff8156db8a>] device_unregister+0x1a/0x60
[<ffffffff8156dc4c>] device_destroy+0x3c/0x50
[<ffffffff815cd295>] nvme_uninit_ctrl+0x45/0xa0
[<ffffffff815d4858>] nvme_remove+0x78/0x110
[<ffffffff81452b69>] pci_device_remove+0x39/0xb0
[<ffffffff81572935>] device_release_driver_internal+0x155/0x210
[<ffffffff81572a02>] device_release_driver+0x12/0x20
[<ffffffff815d36fb>] nvme_remove_dead_ctrl_work+0x6b/0x70
[<ffffffff810bf3bc>] process_one_work+0x18c/0x3a0
[<ffffffff810bf61e>] worker_thread+0x4e/0x3b0
[<ffffffff810c5ac9>] kthread+0x109/0x140
[<ffffffff8185800c>] ret_from_fork+0x2c/0x40
[<ffffffffffffffff>] 0xffffffffffffffff
Fixes: c5552fde102fc("nvme: Enable autonomous power state transitions")
Reported-by: Rakesh Pandit <rakesh@tuxera.com>
Tested-by: Rakesh Pandit <rakesh@tuxera.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
snd_timer_user_tselect() reallocates the queue buffer dynamically, but
it forgot to reset its indices. Since the read may happen
concurrently with ioctl and snd_timer_user_tselect() allocates the
buffer via kmalloc(), this may lead to the leak of uninitialized
kernel-space data, as spotted via KMSAN:
BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10
CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x143/0x1b0 lib/dump_stack.c:52
kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007
kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086
copy_to_user ./arch/x86/include/asm/uaccess.h:725
snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004
do_loop_readv_writev fs/read_write.c:716
__do_readv_writev+0x94c/0x1380 fs/read_write.c:864
do_readv_writev fs/read_write.c:894
vfs_readv fs/read_write.c:908
do_readv+0x52a/0x5d0 fs/read_write.c:934
SYSC_readv+0xb6/0xd0 fs/read_write.c:1021
SyS_readv+0x87/0xb0 fs/read_write.c:1018
This patch adds the missing reset of queue indices. Together with the
previous fix for the ioctl/read race, we cover the whole problem.
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The read from ALSA timer device, the function snd_timer_user_tread(),
may access to an uninitialized struct snd_timer_user fields when the
read is concurrently performed while the ioctl like
snd_timer_user_tselect() is invoked. We have already fixed the races
among ioctls via a mutex, but we seem to have forgotten the race
between read vs ioctl.
This patch simply applies (more exactly extends the already applied
range of) tu->ioctl_lock in snd_timer_user_tread() for closing the
race window.
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Revert commit eed4d47efe (ACPI / sleep: Ignore spurious SCI wakeups
from suspend-to-idle) as it turned out to be premature and triggered
a number of different issues on various systems.
That includes, but is not limited to, premature suspend-to-RAM aborts
on Dell XPS 13 (9343) reported by Dominik.
The issue the commit in question attempted to address is real and
will need to be taken care of going forward, but evidently more work
is needed for this purpose.
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull networking fixes from David Miller:
1) Made TCP congestion control documentation match current reality,
from Anmol Sarma.
2) Various build warning and failure fixes from Arnd Bergmann.
3) Fix SKB list leak in ipv6_gso_segment().
4) Use after free in ravb driver, from Eugeniu Rosca.
5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet.
6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme
Piccoli.
7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping
Zhang.
8) Use after free in vxlan deletion, from Mark Bloch.
9) Fix ordering of NAPI poll enabled in ethoc driver, from Max
Filippov.
10) Fix stmmac hangs with TSO, from Niklas Cassel.
11) Fix crash in CALIPSO ipv6, from Richard Haines.
12) Clear nh_flags properly on mpls link up. From Roopa Prabhu.
13) Fix regression in sk_err socket error queue handling, noticed by
ping applications. From Soheil Hassas Yeganeh.
14) Update mlx4/mlx5 MAINTAINERS information.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
net: stmmac: fix a broken u32 less than zero check
net: stmmac: fix completely hung TX when using TSO
net: ethoc: enable NAPI before poll may be scheduled
net: bridge: fix a null pointer dereference in br_afspec
ravb: Fix use-after-free on `ifconfig eth0 down`
net/ipv6: Fix CALIPSO causing GPF with datagram support
net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value
Revert "sit: reload iphdr in ipip6_rcv"
i40e/i40evf: proper update of the page_offset field
i40e: Fix state flags for bit set and clean operations of PF
iwlwifi: fix host command memory leaks
iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265
iwlwifi: mvm: clear new beacon command template struct
iwlwifi: mvm: don't fail when removing a key from an inexisting sta
iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3
iwlwifi: mvm: fix firmware debug restart recording
iwlwifi: tt: move ucode_loaded check under mutex
iwlwifi: mvm: support ibss in dqa mode
iwlwifi: mvm: Fix command queue number on d0i3 flow
iwlwifi: mvm: rs: start using LQ command color
...
Pull sparc fixes from David Miller:
1) Fix TLB context wrap races, from Pavel Tatashin.
2) Cure some gcc-7 build issues.
3) Handle invalid setup_hugepagesz command line values properly, from
Liam R Howlett.
4) Copy TSB using the correct address shift for the huge TSB, from Mike
Kravetz.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: delete old wrap code
sparc64: new context wrap
sparc64: add per-cpu mm of secondary contexts
sparc64: redefine first version
sparc64: combine activate_mm and switch_mm
sparc64: reset mm cpumask after wrap
sparc/mm/hugepages: Fix setup_hugepagesz for invalid values.
sparc: Machine description indices can vary
sparc64: mm: fix copy_tsb to correctly copy huge page TSBs
arch/sparc: support NR_CPUS = 4096
sparc64: Add __multi3 for gcc 7.x and later.
sparc64: Fix build warnings with gcc 7.
arch/sparc: increase CONFIG_NODES_SHIFT on SPARC64 to 5
GCC explicitly does not warn for unused static inline functions for
-Wunused-function. The manual states:
Warn whenever a static function is declared but not defined or
a non-inline static function is unused.
Clang does warn for static inline functions that are unused.
It turns out that suppressing the warnings avoids potentially complex
#ifdef directives, which also reduces LOC.
Suppress the warning for clang.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pavel Tatashin says:
====================
sparc64: context wrap fixes
This patch series contains fixes for context wrap: when we are out of
context ids, and need to get a new version.
It fixes memory corruption issues which happen when more than number of
context ids (currently set to 8K) number of processes are started
simultaneously, and processes can get a wrong context.
sparc64: new context wrap:
- contains explanation of new wrap method, and also explanation of races
that it solves
sparc64: reset mm cpumask after wrap
- explains issue of not reseting cpu mask on a wrap
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The old method that is using xcall and softint to get new context id is
deleted, as it is replaced by a method of using per_cpu_secondary_mm
without xcall to perform the context wrap.
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current wrap implementation has a race issue: it is called outside of
the ctx_alloc_lock, and also does not wait for all CPUs to complete the
wrap. This means that a thread can get a new context with a new version
and another thread might still be running with the same context. The
problem is especially severe on CPUs with shared TLBs, like sun4v. I used
the following test to very quickly reproduce the problem:
- start over 8K processes (must be more than context IDs)
- write and read values at a memory location in every process.
Very quickly memory corruptions start happening, and what we read back
does not equal what we wrote.
Several approaches were explored before settling on this one:
Approach 1:
Move smp_new_mmu_context_version() inside ctx_alloc_lock, and wait for
every process to complete the wrap. (Note: every CPU must WAIT before
leaving smp_new_mmu_context_version_client() until every one arrives).
This approach ends up with deadlocks, as some threads own locks which other
threads are waiting for, and they never receive softint until these threads
exit smp_new_mmu_context_version_client(). Since we do not allow the exit,
deadlock happens.
Approach 2:
Handle wrap right during mondo interrupt. Use etrap/rtrap to enter into
into C code, and issue new versions to every CPU.
This approach adds some overhead to runtime: in switch_mm() we must add
some checks to make sure that versions have not changed due to wrap while
we were loading the new secondary context. (could be protected by PSTATE_IE
but that degrades performance as on M7 and older CPUs as it takes 50 cycles
for each access). Also, we still need a global per-cpu array of MMs to know
where we need to load new contexts, otherwise we can change context to a
thread that is going way (if we received mondo between switch_mm() and
switch_to() time). Finally, there are some issues with window registers in
rtrap() when context IDs are changed during CPU mondo time.
The approach in this patch is the simplest and has almost no impact on
runtime. We use the array with mm's where last secondary contexts were
loaded onto CPUs and bump their versions to the new generation without
changing context IDs. If a new process comes in to get a context ID, it
will go through get_new_mmu_context() because of version mismatch. But the
running processes do not need to be interrupted. And wrap is quicker as we
do not need to xcall and wait for everyone to receive and complete wrap.
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CTX_FIRST_VERSION defines the first context version, but also it defines
first context. This patch redefines it to only include the first context
version.
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only difference between these two functions is that in activate_mm we
unconditionally flush context. However, there is no need to keep this
difference after fixing a bug where cpumask was not reset on a wrap. So, in
this patch we combine these.
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After a wrap (getting a new context version) a process must get a new
context id, which means that we would need to flush the context id from
the TLB before running for the first time with this ID on every CPU. But,
we use mm_cpumask to determine if this process has been running on this CPU
before, and this mask is not reset after a wrap. So, there are two possible
fixes for this issue:
1. Clear mm cpumask whenever mm gets a new context id
2. Unconditionally flush context every time process is running on a CPU
This patch implements the first solution
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hugetlb_bad_size needs to be called on invalid values. Also change the
pr_warn to a pr_err to better align with other platforms.
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VIO devices were being looked up by their index in the machine
description node block, but this often varies over time as devices are
added and removed. Instead, store the ID and look up using the type,
config handle and ID.
Signed-off-by: James Clarke <jrtc27@jrtc27.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112541
Signed-off-by: David S. Miller <davem@davemloft.net>
When a TSB grows beyond its current capacity, a new TSB is allocated
and copy_tsb is called to copy entries from the old TSB to the new.
A hash shift based on page size is used to calculate the index of an
entry in the TSB. copy_tsb has hard coded PAGE_SHIFT in these
calculations. However, for huge page TSBs the value REAL_HPAGE_SHIFT
should be used. As a result, when copy_tsb is called for a huge page
TSB the entries are placed at the incorrect index in the newly
allocated TSB. When doing hardware table walk, the MMU does not
match these entries and we end up in the TSB miss handling code.
This code will then create and write an entry to the correct index
in the TSB. We take a performance hit for the table walk miss and
recreation of these entries.
Pass a new parameter to copy_tsb that is the page size shift to be
used when copying the TSB.
Suggested-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linux SPARC64 limits NR_CPUS to 4064 because init_cpu_send_mondo_info()
only allocates a single page for NR_CPUS mondo entries. Thus we cannot
use all 4096 CPUs on some SPARC platforms.
To fix, allocate (2^order) pages where order is set according to the size
of cpu_list for possible cpus. Since cpu_list_pa and cpu_mondo_block_pa
are not used in asm code, there are no imm13 offsets from the base PA
that will break because they can only reach one page.
Orabug: 25505750
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Atish Patra <atish.patra@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The check that queue is less or equal to zero is always true
because queue is a u32; queue is decremented and will wrap around
and never go -ve. Fix this by making queue an int.
Detected by CoverityScan, CID#1428988 ("Unsigned compared against 0")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stmmac_tso_allocator can fail to set the Last Descriptor bit
on a descriptor that actually was the last descriptor.
This happens when the buffer of the last descriptor ends
up having a size of exactly TSO_MAX_BUFF_SIZE.
When the IP eventually reaches the next last descriptor,
which actually has the bit set, the DMA will hang.
When the DMA hangs, we get a tx timeout, however,
since stmmac does not do a complete reset of the IP
in stmmac_tx_timeout, we end up in a state with
completely hung TX.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ethoc_reset enables device interrupts, ethoc_interrupt may schedule a
NAPI poll before NAPI is enabled in the ethoc_open, which results in
device being unable to send or receive anything until it's closed and
reopened. In case the device is flooded with ingress packets it may be
unable to recover at all.
Move napi_enable above ethoc_reset in the ethoc_open to fix that.
Fixes: a170285772 ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently have the HSCTLR.A bit set, trapping unaligned accesses
at HYP, but we're not really prepared to deal with it.
Since the rest of the kernel is pretty happy about that, let's follow
its example and set HSCTLR.A to zero. Modern CPUs don't really care.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
We currently have the SCTLR_EL2.A bit set, trapping unaligned accesses
at EL2, but we're not really prepared to deal with it. So far, this
has been unnoticed, until GCC 7 started emitting those (in particular
64bit writes on a 32bit boundary).
Since the rest of the kernel is pretty happy about that, let's follow
its example and set SCTLR_EL2.A to zero. Modern CPUs don't really
care.
Cc: stable@vger.kernel.org
Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
__do_hyp_init has the rather bad habit of ignoring RES1 bits and
writing them back as zero. On a v8.0-8.2 CPU, this doesn't do anything
bad, but may end-up being pretty nasty on future revisions of the
architecture.
Let's preserve those bits so that we don't have to fix this later on.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
We might call br_afspec() with p == NULL which is a valid use case if
the action is on the bridge device itself, but the bridge tunnel code
dereferences the p pointer without checking, so check if p is null
first.
Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Fixes: efa5356b0d ("bridge: per vlan dst_metadata netlink support")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using CALIPSO with IPPROTO_UDP it is possible to trigger a GPF as the
IP header may have moved.
Also update the payload length after adding the CALIPSO option.
Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current comparison of entry < 0 will never be true since entry is an
unsigned integer. Make entry an int to ensure -ve error return values
from the call to jumbo_frm are correctly being caught.
Detected by CoverityScan, CID#1238760 ("Macro compares unsigned to 0")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ASoC: Fixes for v4.12
This is the usual collection of device specific fixes, all accumilated
since the merge window, plus one fix from Takashi for a nasty use after
free bug that bit some things with deferred probe and an update to the
maintainer address for the former Wolfson parts.
gcc 7.1 reports the following warning:
block/elevator.c: In function ‘elv_register’:
block/elevator.c:898:5: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=]
"%s_io_cq", e->elevator_name);
^~~~~~~~~~
block/elevator.c:897:3: note: ‘snprintf’ output between 7 and 22 bytes into a destination of size 21
snprintf(e->icq_cache_name, sizeof(e->icq_cache_name),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"%s_io_cq", e->elevator_name);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The bug is that the name of the icq_cache is 6 characters longer than
the elevator name, but only ELV_NAME_MAX + 5 characters were reserved
for it --- so in the case of a maximum-length elevator name, the 'q'
character in "_io_cq" would be truncated by snprintf(). Fix it by
reserving ELV_NAME_MAX + 6 characters instead.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Kalle Valo says:
====================
wireless-drivers fixes for 4.12
It has been a slow start of cycle and this the first set of fixes for
4.12. Nothing really major here.
wcn36xx
* fix an issue with module reload
brcmfmac
* fix aligment regression on 64 bit systems
iwlwifi
* fixes for memory leaks, runtime PM, memory initialisation and other
smaller problems
* fix IBSS on devices using DQA mode (7260 and up)
* fix the minimum firmware API requirement for 7265D, 3168, 8000 and
8265
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull media fixes from Mauro Carvalho Chehab:
"Some bug fixes:
- Don't fail build if atomisp has warnings
- Some CEC Kconfig changes to allow it to be used by DRM without
media dependencies
- A race fix at RC initialization code
- A driver fix at rainshadow-cec
IMHO, the one that affects most people in this series is a build fix:
if you try to build the Kernel with W=1 or using gcc7 and
all[yes|mod]config, build will fail due to -Werror at atomisp
makefiles"
* tag 'media/v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] rc-core: race condition during ir_raw_event_register()
[media] cec: drop MEDIA_CEC_DEBUG
[media] cec: rename MEDIA_CEC_NOTIFIER to CEC_NOTIFIER
[media] cec: select CEC_CORE instead of depend on it
[media] rainshadow-cec: ensure exit_loop is intialized
[media] atomisp: don't treat warnings as errors
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2017-06-06
This series contains fixes to i40e and i40evf only.
Mauro S. M. Rodrigues fixes a flood in the kernel log which was introduced
in a previous commit because of a mistaken substitution of __I40E_VSI_DOWN
instead of __I40E_DOWN when testing the state of the PF.
Björn Töpel fixes an issue introduced in a previous commit where the
offset was incorrect and could lead to data corruption for architectures
using PAGE_SIZE larger than 8191. Fixed the issue by updating the
page_offset correctly using the proper setting for truesize.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When direct issue is done on request picked up from plug list,
the hctx need to be updated with the actual hw queue, otherwise
wrong hctx is used and may hurt performance, especially when
wrong SRCU readlock is acquired/released
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This reverts commit b699d00358.
As per Eric Dumazet, the pskb_may_pull() is a NOP in this
particular case, so the 'iph' reload is unnecessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a bug where the copying of scatterlist buffers incorrectly
ignored bytes to skip in a scatterlist and ended 1 byte short.
This fixes testmgr hmac and hash test failures currently obscured
by hash import/export not being supported.
Fixes: abefd6741d ("staging: ccree: introduce CryptoCell HW driver").
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
WARNING: CPU: 3 PID: 2840 at arch/x86/kvm/vmx.c:10966 nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel]
CPU: 3 PID: 2840 Comm: qemu-system-x86 Tainted: G OE 4.12.0-rc3+ #23
RIP: 0010:nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel]
Call Trace:
? kvm_check_async_pf_completion+0xef/0x120 [kvm]
? rcu_read_lock_sched_held+0x79/0x80
vmx_queue_exception+0x104/0x160 [kvm_intel]
? vmx_queue_exception+0x104/0x160 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0x1171/0x1ce0 [kvm]
? kvm_arch_vcpu_load+0x47/0x240 [kvm]
? kvm_arch_vcpu_load+0x62/0x240 [kvm]
kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? __fget+0xf3/0x210
do_vfs_ioctl+0xa4/0x700
? __fget+0x114/0x210
SyS_ioctl+0x79/0x90
do_syscall_64+0x81/0x220
entry_SYSCALL64_slow_path+0x25/0x25
This is triggered occasionally by running both win7 and win2016 in L2, in
addition, EPT is disabled on both L1 and L2. It can't be reproduced easily.
Commit 0b6ac343fc (KVM: nVMX: Correct handling of exception injection) mentioned
that "KVM wants to inject page-faults which it got to the guest. This function
assumes it is called with the exit reason in vmcs02 being a #PF exception".
Commit e011c663 (KVM: nVMX: Check all exceptions for intercept during delivery to
L2) allows to check all exceptions for intercept during delivery to L2. However,
there is no guarantee the exit reason is exception currently, when there is an
external interrupt occurred on host, maybe a time interrupt for host which should
not be injected to guest, and somewhere queues an exception, then the function
nested_vmx_check_exception() will be called and the vmexit emulation codes will
try to emulate the "Acknowledge interrupt on exit" behavior, the warning is
triggered.
Reusing the exit reason from the L2->L0 vmexit is wrong in this case,
the reason must always be EXCEPTION_NMI when injecting an exception into
L1 as a nested vmexit.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Fixes: e011c663b9 ("KVM: nVMX: Check all exceptions for intercept during delivery to L2")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
If a function sets bind_deactivated flag, upon removal we will be left
with an unbalanced deactivation. Let's make sure that we conditionally
call usb_function_activate() from usb_remove_function() and make sure
usb_remove_function() is called from remove_config().
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Commit 8d911904f3 ('powerpc/perf: Add restrictions to PMC5 in power9 DD1')
was added to restrict the use of PMC5 in Power9 DD1. Intention was to disable
the use of PMC5 using raw event code. But instead of updating the
power9_isa207_pmu structure (used on DD1), the commit incorrectly updated the
power9_pmu structure. Fix it.
Fixes: 8d911904f3 ("powerpc/perf: Add restrictions to PMC5 in power9 DD1")
Reported-by: Shriya <shriyak@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Tested-by: Shriya <shriyak@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In commit 8c27226119 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we
switched to the generic implementation of cpu_to_node(), which uses a percpu
variable to hold the NUMA node for each CPU.
Unfortunately we neglected to notice that we use cpu_to_node() in the allocation
of our percpu areas, leading to a chicken and egg problem. In practice what
happens is when we are setting up the percpu areas, cpu_to_node() reports that
all CPUs are on node 0, so we allocate all percpu areas on node 0.
This is visible in the dmesg output, as all pcpu allocs being in group 0:
pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31
pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39
pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47
To fix it we need an early_cpu_to_node() which can run prior to percpu being
setup. We already have the numa_cpu_lookup_table we can use, so just plumb it
in. With the patch dmesg output shows two groups, 0 and 1:
pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31
pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39
pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47
We can also check the data_offset in the paca of various CPUs, with the fix we
see:
CPU 0: data_offset = 0x0ffe8b0000
CPU 24: data_offset = 0x1ffe5b0000
And we can see from dmesg that CPU 24 has an allocation on node 1:
node 0: [mem 0x0000000000000000-0x0000000fffffffff]
node 1: [mem 0x0000001000000000-0x0000001fffffffff]
Cc: stable@vger.kernel.org # v3.16+
Fixes: 8c27226119 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
A disorder is found in some ALC269 quirk entries for ASUS (1043:xxxx),
which should have been sorted in PCI SSID order. Rearrange them, so
that I won't overlook the already existing entry like I did a couple
of times in the past...
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The ASUS X705UD laptop requires the known fixup ALC256_FIXUP_ASUS_MIC
in order to fix headphone jack sensing and to enable use of the internal
microphone.
Unfortunately jack sensing for the headset mic is still not working.
[rearranged the position to keep the PCI SSID order -- tiwai]
Signed-off-by: Chris Chiu <chiu@endlessm.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Since this driver does no detection of hardware, it might be used with
a non-sir port. Escape out if we are spinning.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Fix a link error in this specific combination of config options:
CONFIG_MEDIA_CEC_SUPPORT=y
CONFIG_CEC_CORE=m
CONFIG_MEDIA_CEC_NOTIFIER=y
CONFIG_VIDEO_STI_HDMI_CEC=m
CONFIG_DRM_STI=y
drivers/gpu/drm/sti/sti_hdmi.o: In function `sti_hdmi_remove':
sti_hdmi.c:(.text.sti_hdmi_remove+0x10): undefined reference to
`cec_notifier_set_phys_addr'
sti_hdmi.c:(.text.sti_hdmi_remove+0x34): undefined reference to
`cec_notifier_put'
drivers/gpu/drm/sti/sti_hdmi.o: In function `sti_hdmi_connector_get_modes':
sti_hdmi.c:(.text.sti_hdmi_connector_get_modes+0x4a): undefined
reference to `cec_notifier_set_phys_addr_from_edid'
drivers/gpu/drm/sti/sti_hdmi.o: In function `sti_hdmi_probe':
sti_hdmi.c:(.text.sti_hdmi_probe+0x204): undefined reference to
`cec_notifier_get'
drivers/gpu/drm/sti/sti_hdmi.o: In function `sti_hdmi_connector_detect':
sti_hdmi.c:(.text.sti_hdmi_connector_detect+0x36): undefined reference
to `cec_notifier_set_phys_addr'
drivers/gpu/drm/sti/sti_hdmi.o: In function `sti_hdmi_disable':
sti_hdmi.c:(.text.sti_hdmi_disable+0xc0): undefined reference to
`cec_notifier_set_phys_addr'
The version below seems to work, though I don't particularly
like the IS_REACHABLE() addition since that can be confusing
to users.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Changing the IS_REACHABLE() into a plain #ifdef broke the case of
CONFIG_MEDIA_RC=m && CONFIG_MEDIA_CEC=y:
drivers/media/cec/cec-core.o: In function `cec_unregister_adapter':
cec-core.c:(.text.cec_unregister_adapter+0x18): undefined reference to `rc_unregister_device'
drivers/media/cec/cec-core.o: In function `cec_delete_adapter':
cec-core.c:(.text.cec_delete_adapter+0x54): undefined reference to `rc_free_device'
drivers/media/cec/cec-core.o: In function `cec_register_adapter':
cec-core.c:(.text.cec_register_adapter+0x94): undefined reference to `rc_register_device'
cec-core.c:(.text.cec_register_adapter+0xa4): undefined reference to `rc_free_device'
cec-core.c:(.text.cec_register_adapter+0x110): undefined reference to `rc_unregister_device'
drivers/media/cec/cec-core.o: In function `cec_allocate_adapter':
cec-core.c:(.text.cec_allocate_adapter+0x234): undefined reference to `rc_allocate_device'
drivers/media/cec/cec-adap.o: In function `cec_received_msg':
cec-adap.c:(.text.cec_received_msg+0x734): undefined reference to `rc_keydown'
cec-adap.c:(.text.cec_received_msg+0x768): undefined reference to `rc_keyup'
This adds an additional dependency to explicitly forbid this combination.
Fixes: 5f2c467c54 ("[media] cec: add MEDIA_CEC_RC config option")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The driver allocates the spinlock but not initialize it.
Use spin_lock_init() on it to initialize it correctly.
This is detected by Coccinelle semantic patch.
Fixes: 0f314f6c2e ("[media] rainshadow-cec: new RainShadow Tech HDMI
CEC driver")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
In f8b45b74cc ("i40e/i40evf: Use build_skb to build frames")
i40e_build_skb updates the page_offset field with an incorrect offset,
which can lead to data corruption. This patch updates page_offset
correctly, by properly setting truesize.
Note that the bug only appears on architectures where PAGE_SIZE is
8192 or larger.
Fixes: f8b45b74cc ("i40e/i40evf: Use build_skb to build frames")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Commit 0da36b9774 ("i40e: use DECLARE_BITMAP for state fields")
introduced changes in the way i40e works with state flags converting
them to bitmaps using kernel bitmap API. This change introduced a
regression due to a mistaken substitution using __I40E_VSI_DOWN instead
of __I40E_DOWN when testing state of a PF at i40e_reset_subtask()
function. This caused a flood in the kernel log with the follow message:
[49.013] i40e 0002:01:00.0: bad reset request 0x00000020
Commit d19cb64b92 ("i40e: separate PF and VSI state flags")
also introduced some misuse of the VSI and PF flags, so both could be
considered as the offenders.
This patch simply fixes the flags where it makes sense by changing
__I40E_VSI_DOWN to __I40E_DOWN.
Fixes: 0da36b9774 ("i40e: use DECLARE_BITMAP for state fields")
Fixes: d19cb64b92 ("i40e: separate PF and VSI state flags")
Reviewed-by: "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
During an eeh call to cxl_remove can result in double free_irq of
psl,slice interrupts. This can happen if perst_reloads_same_image == 1
and call to cxl_configure_adapter() fails during slot_reset
callback. In such a case we see a kernel oops with following back-trace:
Oops: Kernel access of bad area, sig: 11 [#1]
Call Trace:
free_irq+0x88/0xd0 (unreliable)
cxl_unmap_irq+0x20/0x40 [cxl]
cxl_native_release_psl_irq+0x78/0xd8 [cxl]
pci_deconfigure_afu+0xac/0x110 [cxl]
cxl_remove+0x104/0x210 [cxl]
pci_device_remove+0x6c/0x110
device_release_driver_internal+0x204/0x2e0
pci_stop_bus_device+0xa0/0xd0
pci_stop_and_remove_bus_device+0x28/0x40
pci_hp_remove_devices+0xb0/0x150
pci_hp_remove_devices+0x68/0x150
eeh_handle_normal_event+0x140/0x580
eeh_handle_event+0x174/0x360
eeh_event_handler+0x1e8/0x1f0
This patch fixes the issue of double free_irq by checking that
variables that hold the virqs (err_hwirq, serr_hwirq, psl_virq) are
not '0' before un-mapping and resetting these variables to '0' when
they are un-mapped.
Cc: stable@vger.kernel.org
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently tsk->thread.load_tm is not initialized in the task creation
and can contain garbage on a new task.
This is an undesired behaviour, since it affects the timing to enable
and disable the transactional memory laziness (disabling and enabling
the MSR TM bit, which affects TM reclaim and recheckpoint in the
scheduling process).
Fixes: 5d176f751e ("powerpc: tm: Enable transactional memory (TM) lazily for userspace")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The description of the CSI_SEL bit in the i.MX6 reference manual is
incorrect. It states "This bit defines which CSI is the input to the
IC. This bit is effective only if IC_INPUT is bit cleared".
From experiment it was found this is in fact not correct. The CSI_SEL
bit selects which CSI is input to _both_ the VDIC _and_ the IC. If the
IC_INPUT bit is set so that the IC is receiving from the VDIC, the IC
ignores the CSI_SEL bit, but CSI_SEL still selects which CSI the VDIC
receives from in that case.
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Steve Longerbeam <steve_longerbeam@mentor.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Not having an endpoint bound in DT should not cause a failure here,
there are fallbacks. So explicitly accept a missing endpoint.
This behavior change was introduced by refactoring in drm_of parsing
code and it should not require dts changes.
In particular this fixes imx6qdl-sabreauto boards.
Link: https://lists.freedesktop.org/archives/dri-devel/2017-May/141233.html
Fixes: ebc9446135 ("drm: convert drivers to use drm_of_find_panel_or_bridge")
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
By setting the SFTRST bit, the PRE will be held in the lowest power state
with clocks to the internal blocks gated. When external clock gating is
used (from the external clock controller, or by setting the CLKGATE bit)
the PRE will sporadically fail to start.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Fixes: d2a3423258 ("gpu: ipu-v3: add driver for Prefetch Resolve Engine")
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
We used to extract PRIbits from the ICH_VT_EL2 which was the upper field
in the register word, so a mask wasn't necessary, but as we switched to
looking at PREbits, which is bits 26 through 28 with the PRIbits field
being potentially non-zero, we really need to mask off the field value,
otherwise fun things may happen.
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Core Changes:
- Grab locks in drm_atomic_helper_resume() (Daniel)
- Fix oops when unplugging USB device (expand cleanup in drm_unplug_dev) (Hans)
Driver Changes:
- rockchip: Don't output 10-bit format to 8-bit encoders (Mark)
Cc: Mark yao <mark.yao@rock-chips.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Hans de Goede <hdegoede@redhat.com>
* tag 'drm-misc-fixes-2017-06-02' of git://anongit.freedesktop.org/git/drm-misc:
drm: Fix oops + Xserver hang when unplugging USB drm devices
drm: Fix locking in drm_atomic_helper_resume
drm/rockchip: Correct vop out_mode configure
4 nouveau regression fixes.
* 'linux-4.12' of git://github.com/skeggsb/linux:
drm/nouveau/tmr: fully separate alarm execution/pending lists
drm/nouveau: enable autosuspend only when it'll actually be used
drm/nouveau: replace multiple open-coded runpm support checks with function
drm/nouveau/kms/nv50: add null check before pointer dereference
Reusing the list_head for both is a bad idea. Callback execution is done
with the lock dropped so that alarms can be rescheduled from the callback,
which means that with some unfortunate timing, lists can get corrupted.
The execution list should not require its own locking, the single function
that uses it can only be called from a single context.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
Add null check before dereferencing pointer asyc
Addresses-Coverity-ID: 1397932
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Linux IRQ #0 is reserved for error reporting and may not be used.
Increase NR_IRQS for one additional slot and increase
irq_domain_add_legacy parameter first_irq value to 1, so that linux
IRQ #0 is not associated with hardware IRQ #0 in legacy IRQ domains.
Introduce macro XTENSA_PIC_LINUX_IRQ for static translation of xtensa
PIC hardware IRQ # to linux IRQ #. Use this macro in XTFPGA platform
data definitions.
This fixes inability to use hardware IRQ #0 in configurations that don't
use device tree and allows for non-identity mapping between linux IRQ #
and hardware IRQ #.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
The new per-cpu counter for writes_pending is initialised in
md_alloc(), which is not called by dm-raid.
So dm-raid fails when md_write_start() is called.
Move the initialization to the personality modules
that need it. This way it is always initialised when needed,
but isn't unnecessarily initialized (requiring memory allocation)
when the personality doesn't use writes_pending.
Reported-by: Heinz Mauelshagen <heinzm@redhat.com>
Fixes: 4ad23a9764 ("MD: use per-cpu counter for writes_pending")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Pull cgroup fixes from Tejun Heo:
"Two cgroup fixes. One to address RCU delay of cpuset removal affecting
userland visible behaviors. The other fixes a race condition between
controller disable and cgroup removal"
* 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: consider dying css as offline
cgroup: Prevent kill_css() from being called more than once
Pull libata fixes from Tejun Heo:
- Revert of sata_mv devm_ioremap_resource() conversion. It made init
fail if there are overlapping resources which led to detection
failures on some setups.
- A workaround for an Acer laptop which sometimes reports corrupt port
map.
- Other non-critical fixes.
* 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: fix error checking in in ata_parse_force_one()
Revert "ata: sata_mv: Convert to devm_ioremap_resource()"
ata: libahci: properly propagate return value of platform_get_irq()
ata: sata_rcar: Handle return value of clk_prepare_enable
ahci: Acer SA5-271 SSD Not Detected Fix
Revert commit da28e1955d (ACPICA: Disassembler: Enhance resource
descriptor detection) as it is based on an assumption that doesn't
hold all the time and causes problems to happen because of that.
Reported-by: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Double exception vector only needs 20 bytes of space for 5 literals, not
48. Reduce the reservation for double exception vector literals
accordingly. This fixes build for configurations with small user
exception vector size.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Sending host command with CMD_WANT_SKB flag demands the release of the
response buffer with iwl_free_resp function.
The patch adds the memory release in all the relevant places
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In a previous commit, we removed support for API versions earlier than
22 for these NICs. By mistake, the *_UCODE_API_MIN definitions were
set to 17. Fix that.
Fixes: 4b87e5af63 ("iwlwifi: remove support for fw older than -17 and -22")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Clear the struct so that all reserved fields are zero when we
send the struct down to the device.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The iwl_mvm_remove_sta_key() function handles removing a key when the
sta doesn't exist anymore. Mistakenly, this was changed to return an
error while fixing another bug.
If the mvm_sta doesn't exist, we continue normally, but just don't try
to remove the igtk key.
Fixes: cd4d23c1ea ("iwlwifi: mvm: Fix removal of IGTK")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When we want to stop the recording of the firmware debug
and restart it later without reloading the firmware we
don't need to resend the configuration that comes with
host commands.
Sending those commands confused the hardware and led to
an NMI 0x66.
Change the flow as following:
* read the relevant registers (DBGC_IN_SAMPLE, DBGC_OUT_CTRL)
* clear those registers
* wait for the hardware to complete its write to the buffer
* get the data
* restore the value of those registers (to restart the
recording)
For early start (where the configuration is already
compiled in the firmware), we don't need to set those
registers after the firmware has been loaded, but only
when we want to restart the recording without having
restarted the firmware.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The ucode_loaded check should be under the mutex, since it can
otherwise change state after we looked at it and before we got
the mutex. Fix that.
Fixes: 5c89e7bc55 ("iwlwifi: mvm: add registration to cooling device")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Allow working IBSS also when working in DQA mode.
This is done by setting it to treat the queues the
same as a BSS AP treats the queues.
Fixes: 7948b87308 ("iwlwifi: mvm: enable dynamic queue allocation mode")
Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
During d0i3 flow we flush all the queue except from the command queue.
Currently, in this flow the command queue is hard coded to 9.
In DQA the command queue number has changed from 9 to 0.
Fix that.
This fixes a problem in runtime PM resume flow.
Fixes: 097129c9e6 ("iwlwifi: mvm: move cmd queue to be #0 in dqa mode")
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Up until now, the driver was comparing the rate reported by the FW and
the rate of the latest LQ command to avoid processing data belonging
to the old LQ command. Recently, FW changed the meaning of the initial
rate field in tx response and it holds the actual rate (which is not
necessarily the initial rate of LQ's rate table). Use instead LQ cmd
color to be able to filter out tx responses/BA notifications which
where sent during earlier LQ commands' time frame.
This fixes some throughput degradation in noisy environments.
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Pull ARM fixes from Russell King:
"Three fixes this time around:
- Two fixes for noMMU, fixing the decompressor header layout, and
preventing a build error with some configurations.
- Fixing the hyp-stub updates that went in during the merge window
for platforms that use MCPM"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8677/1: boot/compressed: fix decompressor header layout for v7-M
ARM: 8676/1: NOMMU: provide pgprot_device() macro
ARM: 8675/1: MCPM: ensure not to enter __hyp_soft_restart from loopback and cpu_power_down
In some situations the libdw unwinder stopped working properly. I.e.
with libunwind we see:
~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
f199 call_init.part.0 (/usr/lib/ld-2.25.so)
f2a5 _dl_init (/usr/lib/ld-2.25.so)
db9 _dl_start_user (/usr/lib/ld-2.25.so)
~~~~~
But with libdw and without this patch this sample is not properly
unwound:
~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
~~~~~
Debug output showed me that libdw found a module for the last frame
address, but it thinks it belongs to /usr/lib/ld-2.25.so. This patch
double-checks what libdw sees and what perf knows. If the mappings
mismatch, we now report the elf known to perf. This fixes the situation
above, and the libdw unwinder produces the same stack as libunwind.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20170602143753.16907-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The following tests are failing on powerpc:
# perf test break
18: Breakpoint overflow signal handler : FAILED!
19: Breakpoint overflow sampling : FAILED!
The powerpc kenel so far does not have support to even create
instruction breakpoints using the perf event interface, so those tests
fail early in the config phase.
I added a '->is_supported()' callback to test struct to be able to
disable specific tests. It seems better than putting ifdefs directly to
the test array.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170601205450.GA398@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The decompress_kmodule() decompresses kernel modules in order to load
symbols from it. In the DSO_BINARY_TYPE__BUILD_ID_CACHE case, it needs
the full file path to extract the file extension to determine the
decompression method. But overwriting 'name' will fail the
decompression since it might point to a non-existing old file.
Instead, use dso->long_name for having the correct extension and use the
real filename to decompress.
In the DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP case, both names should
be the same. This allows resolving symbols in the old modules.
Before:
$ perf report -i perf.data.old | grep scsi_mod
0.00% cc1 [scsi_mod] [k] 0x0000000000004aa6
0.00% as [scsi_mod] [k] 0x00000000000099e1
0.00% cc1 [scsi_mod] [k] 0x0000000000009830
0.00% cc1 [scsi_mod] [k] 0x0000000000001b8f
After:
0.00% cc1 [scsi_mod] [k] scsi_handle_queue_ramp_up
0.00% as [scsi_mod] [k] scsi_sg_alloc
0.00% cc1 [scsi_mod] [k] scsi_setup_cmnd
0.00% cc1 [scsi_mod] [k] scsi_get_command
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When perf processes build-id event, it creates DSOs with the build-id.
But it didn't set the module short name (like '[module-name]') so when
processing a kernel mmap event of the module, it cannot found the DSO as
it only checks the short names.
That leads for perf to create a same DSO without the build-id info and
it'll lookup the system path even if the DSO is already in the build-id
cache. After kernel was updated, perf cannot find the DSO and cannot
show symbols in it anymore.
You can see this if you have an old data file (w/ old kernel version):
$ perf report -i perf.data.old -v |& grep scsi_mod
build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz : cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
Failed to open /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz, continuing without symbols
...
The second message didn't show the build-id. With this patch:
$ perf report -i perf.data.old -v |& grep scsi_mod
build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz: cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
/lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz with build id cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1 not found, continuing without symbols
...
Now it shows the build-id but still cannot load the symbol table. This
is a different problem which will be fixed in the next patch.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-1-namhyung@kernel.org
[ Fix the build on older compilers (debian <= 8, fedora <= 21, etc) wrt kmod_path var init ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Granular QoS per VF feature must be enabled in FW before it can be
used.
Thus, the driver cannot modify a QP's qos_vport value (via the UPDATE_QP FW
command) if the feature has not been enabled -- the FW returns an error if
this is attempted.
Fixes: 08068cd568 ("net/mlx4: Added qos_vport QP configuration in VST mode")
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix kernel-doc warnings (typo) in drivers/net/phy/phy.c:
..//drivers/net/phy/phy.c:259: warning: No description found for parameter 'features'
..//drivers/net/phy/phy.c:259: warning: Excess function parameter 'feature' description in 'phy_lookup_setting'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update tcp.txt to fix mandatory congestion control ops and default
CCA selection. Also, fix comment in tcp.h for undo_cwnd.
Signed-off-by: Anmol Sarma <me@anmolsarma.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit c5a2ee7dde (cpufreq: intel_pstate: Active mode P-state
limits rework) incorrectly assumed that pstate.turbo_pstate would
always be nonzero for CPU0 in min_perf_pct_min() if
cpufreq_register_driver() had succeeded which may not be the case
in virtualized environments.
If that assumption doesn't hold, it leads to an early crash on boot
in intel_pstate_register_driver(), so add a sanity check to
min_perf_pct_min() to prevent the crash from happening.
Fixes: c5a2ee7dde (cpufreq: intel_pstate: Active mode P-state limits rework)
Reported-and-tested-by: Jongman Heo <jongman.heo@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As reported by Patrice, the header layout of the decompressor is
incorrect when building for v7-M. In this case, the __nop macro
resolves to 'mov r0, r0', which is emitted as a narrow encoding,
resulting in the header data fields to end up at lower offsets than
required.
Given the variety of targets we need to support with the same code,
the startup sequence is a bit of a jumble, and uses instructions
and macros whose encoding widths cannot be specified (badr), or only
exist in a narrow encoding (bx)
So force the use of a wide encoding in __nop, and replace the start
sequence with a simple jump to the label marking the start of code,
preceded by a Thumb2 mode switch if required (using explicit wide
encodings where appropriate). The label itself can be moved to the
start of code [where it belongs] due to the larger range of branch
instructions as compared to adr instructions.
Reported-by: Patrice CHOTARD <patrice.chotard@st.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
NOMMU build leads to the following error:
CC drivers/pci/mmap.o
drivers/pci/mmap.c: In function 'pci_mmap_resource_range':
drivers/pci/mmap.c:60:3: error: implicit declaration of function 'pgprot_device' [-Werror=implicit-function-declaration]
vma->vm_page_prot = pgprot_device(vma->vm_page_prot);
^
cc1: some warnings being treated as errors
scripts/Makefile.build:302: recipe for target 'drivers/pci/mmap.o' failed
make[2]: *** [drivers/pci/mmap.o] Error 1
scripts/Makefile.build:561: recipe for target 'drivers/pci' failed
make[1]: *** [drivers/pci] Error 2
Makefile:1016: recipe for target 'drivers' failed
make: *** [drivers] Error 2
Fix it with support of pgprot_device() macro for NOMMU.
Fixes: 00d2904ffe ("ARM/PCI: Use generic pci_mmap_resource_range()")
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
A SoC variant of Geode GX1, notably NSC branded SC1100, seems to
report an inverted Device ID in its DIR0 configuration register,
specifically 0xb instead of the expected 0x4.
Catch this presumably quirky version so it's properly recognized
as GX1 and has its cache switched to write-back mode, which provides
a significant performance boost in most workloads.
SC1100's datasheet "Geode™ SC1100 Information Appliance On a Chip",
states in section 1.1.7.1 "Device ID" that device identification
values are specified in SC1100's device errata. These, however,
seem to not have been publicly released.
Wading through a number of boot logs and /proc/cpuinfo dumps found on
pastebin and blogs, this patch should mostly be relevant for a number
of now admittedly aging Soekris NET4801 and PC Engines WRAP devices,
the latter being the platform this issue was discovered on.
Performance impact was verified using "openssl speed", with
write-back caching scaling throughput between -3% and +41%.
Signed-off-by: Christian Sünkenberg <christian.suenkenberg@student.kit.edu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1496596719.26725.14.camel@student.kit.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently tsk->thread->load_vec and load_fp are not initialized during
task creation, which can lead to garbage values in these variables (non-zero
values).
These variables will be checked later in restore_math() to validate if the
FP and vector registers are being utilized. Since these values might be
non-zero, the restore_math() will continue to save the FP and vectors even if
they were never utilized by the userspace application. load_fp and load_vec
counters will then overflow (they wrap at 255) and the FP and Altivec will be
finally disabled, but before that condition is reached (counter overflow)
several context switches will have restored FP and vector registers without
need, causing a performance degradation.
Fixes: 70fe3d980f ("powerpc: Restore FPU/VEC/VSX if previously used")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Gustavo Romero <gusbromero@gmail.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Our previous patch (cited below) introduced a regression
for RAW Eth QPs.
Fix it by checking if the QP number provided by user-space
exists, hence allowing steering rules to be added for valid
QPs only.
Fixes: 89c557687a ("net/mlx4_en: Avoid adding steering rules with invalid ring")
Reported-by: Or Gerlitz <gerlitz.or@gmail.com>
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since iptunnel_pull_header() can call pskb_may_pull(),
we must reload any pointer that was related to skb->head.
Fixes: a09a4c8dd1 ("tunnels: Remove encapsulation offloads on decap")
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander reported various KASAN messages triggered in recent kernels
The problem is that ping sockets should not use udp_poll() in the first
place, and recent changes in UDP stack finally exposed this old bug.
Fixes: c319b4d76b ("net: ipv4: add IPPROTO_ICMP socket kind")
Fixes: 6d0bfe2261 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Tested-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 9520ed8fb8 ("net: dsa: use cpu_switch instead of ds[0]")
replaced the use of dst->ds[0] with dst->cpu_switch since that is
functionally equivalent, however, we can now run into an use after free
scenario after unbinding then rebinding the switch driver.
The use after free happens because we do correctly initialize
dst->cpu_switch the first time we probe in dsa_cpu_parse(), then we
unbind the driver: dsa_dst_unapply() is called, and we rebind again.
dst->cpu_switch now points to a freed "ds" structure, and so when we
finally dereference it in dsa_cpu_port_ethtool_setup(), we oops.
To fix this, simply set dst->cpu_switch to NULL in dsa_dst_unapply()
which guarantees that we always correctly re-assign dst->cpu_switch in
dsa_cpu_parse().
Fixes: 9520ed8fb8 ("net: dsa: use cpu_switch instead of ds[0]")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If ip6_find_1stfragopt() fails and we return an error we have to free
up 'segs' because nobody else is going to.
Fixes: 2423496af3 ("ipv6: Prevent overrun when parsing v6 header options")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 9b4437a5b8 ("geneve: Unify LWT and netdev handling.")
when using COLLECT_METADATA geneve devices are created with too small of
a needed_headroom and too large of a max_mtu. This is because
ip_tunnel_info_af() is not valid with the device level info when using
COLLECT_METADATA and we mistakenly fall into the IPv4 case.
For COLLECT_METADATA, always use the worst case of ipv6 since both
sockets are created.
Fixes: 9b4437a5b8 ("geneve: Unify LWT and netdev handling.")
Signed-off-by: Eric Garver <e@erig.me>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior to f5f99309fa (sock: do not set sk_err in
sock_dequeue_err_skb), sk_err was reset to the error of
the skb on the head of the error queue.
Applications, most notably ping, are relying on this
behavior to reset sk_err for ICMP packets.
Set sk_err to the ICMP error when there is an ICMP packet
at the head of the error queue.
Fixes: f5f99309fa (sock: do not set sk_err in sock_dequeue_err_skb)
Reported-by: Cyril Hrubis <chrubis@suse.cz>
Tested-by: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
xgbe_map_rx_buffer is rather confused about what PAGE_ALLOC_COSTLY_ORDER
means. It uses PAGE_ALLOC_COSTLY_ORDER-1 assuming that
PAGE_ALLOC_COSTLY_ORDER is the first costly order which is not the case
actually because orders larger than that are costly. And even that
applies only to sleeping allocations which is not the case here. We
simply do not perform any costly operations like reclaim or compaction
for those. Simplify the code by dropping the order calculation and use
PAGE_ALLOC_COSTLY_ORDER directly.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip6_route_output() requires that the flowlabel contains the traffic
class for policy routing.
Commit 0e9a709560 ("ip6_tunnel, ip6_gre: fix setting of DSCP on
encapsulated packets") removed the code which previously added the
traffic class to the flowlabel.
The traffic class is added here because only route lookup needs the
flowlabel to contain the traffic class.
Fixes: 0e9a709560 ("ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets")
Signed-off-by: Liam McBirnie <liam.mcbirnie@boeing.com>
Acked-by: Peter Dawson <peter.a.dawson@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a problem with reading files larger than 2GB from a UFS-2
file system:
https://bugzilla.kernel.org/show_bug.cgi?id=195721
The incorrect UFS s_maxsize limit became a problem as of commit
c2a9737f45 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()")
which started using s_maxbytes to avoid a page index overflow in
do_generic_file_read().
That caused files to be truncated on UFS-2 file systems because the
default maximum file size is 2GB (MAX_NON_LFS) and UFS didn't update it.
Here I simply increase the default to a common value used by other file
systems.
Signed-off-by: Richard Narron <comet.berkeley@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Will B <will.brokenbourgh2877@gmail.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: <stable@vger.kernel.org> # v4.9 and backports of c2a9737f45
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use software polling (PHY_POLL) to check for link state changes instead
of relying on the EMAC's hardware polling feature. Some PHY drivers
are unable to get a functioning link because the HW polling is not
robust enough.
The EMAC is able to poll the PHY on the MDIO bus looking for link state
changes (via the Link Status bit in the Status Register at address 0x1).
When the link state changes, the EMAC triggers an interrupt and tells the
driver what the new state is. The feature eliminates the need for
software to poll the MDIO bus.
Unfortunately, this feature is incompatible with phylib, because it
ignores everything that the PHY core and PHY drivers are trying to do.
In particular:
1. It assumes a compatible register set, so PHYs with different registers
may not work.
2. It doesn't allow for hardware errata that have work-arounds implemented
in the PHY driver.
3. It doesn't support multiple register pages. If the PHY core switches
the register set to another page, the EMAC won't know the page has
changed and will still attempt to read the same PHY register.
4. It only checks the copper side of the link, not the SGMII side. Some
PHY drivers (e.g. at803x) may also check the SGMII side, and
report the link as not ready during autonegotiation if the SGMII link
is still down. Phylib then waits for another interrupt to query
the PHY again, but the EMAC won't send another interrupt because it
thinks the link is up.
Cc: stable@vger.kernel.org # 4.11.x
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull NFS client bugfixes from Trond Myklebust:
"Bugfixes include:
- Fix a typo in commit e092693443 ("NFS append COMMIT after
synchronous COPY") that breaks copy offload
- Fix the connect error propagation in xs_tcp_setup_socket()
- Fix a lock leak in nfs40_walk_client_list
- Verify that pNFS requests lie within the offset range of the layout
segment"
* tag 'nfs-for-4.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs: Mark unnecessarily extern functions as static
SUNRPC: ensure correct error is reported by xs_tcp_setup_socket()
NFSv4.0: Fix a lock leak in nfs40_walk_client_list
pnfs: Fix the check for requests in range of layout segment
xprtrdma: Delete an error message for a failed memory allocation in xprt_rdma_bc_setup()
pNFS/flexfiles: missing error code in ff_layout_alloc_lseg()
NFS fix COMMIT after COPY
Pull tty fix from Greg KH:
"Here is a single tty core fix for 4.12-rc4. It reverts a patch that a
lot of people reported as causing lockdep and other warnings.
Right after I reverted this in my tree, it seems like another
"correct" fix might have shown up, but it's too late in the release
cycle to be messing with tty core locking, so let's just revert this
for now to go back how things always have been and try it again for
4.13.
This has not been in linux-next as I only reverted it a few hours ago"
* tag 'tty-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "tty: fix port buffer locking"
Pull input subsystem fixes from Dmitry Torokhov:
- a couple of regression fixes in synaptics and axp20x-pek drivers
- try to ease transition from PS/2 to RMI for Synaptics touchpad users
by ensuring we do not try to activate RMI mode when RMI SMBus support
is not enabled, and nag users a bit to enable it
- plus a couple of other changes that seemed worthwhile for this
release
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: axp20x-pek - switch to acpi_dev_present and check for ACPI0011 too
Input: axp20x-pek - only check for "INTCFD9" ACPI device on Cherry Trail
Input: tm2-touchkey - use LEN_ON as boolean value instead of LED_FULL
Input: synaptics - tell users to report when they should be using rmi-smbus
Input: synaptics - warn the users when there is a better mode
Input: synaptics - keep PS/2 around when RMI4_SMB is not enabled
Input: synaptics - clear device info before filling in
Input: silead - disable interrupt during suspend
Pull RTC fixlet from Alexandre Belloni:
"A single patch, not really a fix but I don't think there is any reason
to delay it.
Change the mailing list address"
* tag 'rtc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
MAINTAINERS: update RTC mailing list
A rc device can call ir_raw_event_handle() after rc_allocate_device(),
but before rc_register_device() has completed. This is racey because
rcdev->raw is set before rcdev->raw->thread has a valid value.
Cc: stable@kernel.org
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Just depend on DEBUG_FS, no need to invent a new kernel config.
Especially since CEC can be enabled by drm without enabling
MEDIA_SUPPORT.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This config option is strictly speaking independent of the
media subsystem since it can be used by drm as well.
Besides, it looks odd when drivers select CEC_CORE and
MEDIA_CEC_NOTIFIER, that's inconsistent naming.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The CEC framework is used by both drm and media. That makes it tricky
to get the dependencies right.
This patch moves the CEC_CORE and MEDIA_CEC_NOTIFIER config options
out of the media menu and instead drivers that want to use CEC should
select CEC_CORE and MEDIA_CEC_NOTIFIER (if needed).
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
exit_loop is not being initialized, so it contains garbage. Ensure it is
initialized to false.
Detected by CoverityScan, CID#1436409 ("Uninitialized scalar variable")
Fixes: ea6a69defd ("[media] rainshadow-cec: avoid -Wmaybe-uninitialized warning")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Several atomisp files use:
ccflags-y += -Werror
As, on media, our usual procedure is to use W=1, and atomisp
has *a lot* of warnings with such flag enabled,like:
./drivers/staging/media/atomisp/pci/atomisp2/css2400/hive_isp_css_common/host/system_local.h:62:26: warning: 'DDR_BASE' defined but not used [-Wunused-const-variable=]
At the end, it causes our build to fail, impacting our workflow.
So, remove this crap. If one wants to force -Werror, he
can still build with it enabled by passing a parameter to
make.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Pull SCSI fixes from James Bottomley:
"This is nine fixes, seven of which are for the qedi driver (new as of
4.10) the other two are a use after free in the cxgbi drivers and a
potential NULL dereference in the rdac device handler"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: libcxgbi: fix skb use after free
scsi: qedi: Fix endpoint NULL panic during recovery.
scsi: qedi: set max_fin_rt default value
scsi: qedi: Set firmware tcp msl timer value.
scsi: qedi: Fix endpoint NULL panic in qedi_set_path.
scsi: qedi: Set dma_boundary to 0xfff.
scsi: qedi: Correctly set firmware max supported BDs.
scsi: qedi: Fix bad pte call trace when iscsiuio is stopped.
scsi: scsi_dh_rdac: Use ctlr directly in rdac_failover_get()
Pull rdma fixes from Doug Ledford:
"For the most part this is just a minor -rc cycle for the rdma
subsystem. Even given that this is all of the -rc patches since the
merge window closed, it's still only about 25 patches:
- Multiple i40iw, nes, iw_cxgb4, hfi1, qib, mlx4, mlx5 fixes
- A few upper layer protocol fixes (IPoIB, iSER, SRP)
- A modest number of core fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (26 commits)
RDMA/SA: Fix kernel panic in CMA request handler flow
RDMA/umem: Fix missing mmap_sem in get umem ODP call
RDMA/core: not to set page dirty bit if it's already set.
RDMA/uverbs: Declare local function static and add brackets to sizeof
RDMA/netlink: Reduce exposure of RDMA netlink functions
RDMA/srp: Fix NULL deref at srp_destroy_qp()
RDMA/IPoIB: Limit the ipoib_dev_uninit_default scope
RDMA/IPoIB: Replace netdev_priv with ipoib_priv for ipoib_get_link_ksettings
RDMA/qedr: add null check before pointer dereference
RDMA/mlx5: set UMR wqe fence according to HCA cap
net/mlx5: Define interface bits for fencing UMR wqe
RDMA/mlx4: Fix MAD tunneling when SRIOV is enabled
RDMA/qib,hfi1: Fix MR reference count leak on write with immediate
RDMA/hfi1: Defer setting VL15 credits to link-up interrupt
RDMA/hfi1: change PCI bar addr assignments to Linux API functions
RDMA/hfi1: fix array termination by appending NULL to attr array
RDMA/iw_cxgb4: fix the calculation of ipv6 header size
RDMA/iw_cxgb4: calculate t4_eq_status_entries properly
RDMA/iw_cxgb4: Avoid touch after free error in ARP failure handlers
RDMA/nes: ACK MPA Reply frame
...
The alarmtimer code has another source of potentially rearming itself too
fast. Interval timers with a very samll interval have a similar CPU hog
effect as the previously fixed overflow issue.
The reason is that alarmtimers do not implement the normal protection
against this kind of problem which the other posix timer use:
timer expires -> queue signal -> deliver signal -> rearm timer
This scheme brings the rearming under scheduler control and prevents
permanently firing timers which hog the CPU.
Bringing this scheme to the alarm timer code is a major overhaul because it
lacks all the necessary mechanisms completely.
So for a quick fix limit the interval to one jiffie. This is not
problematic in practice as alarmtimers are usually backed by an RTC for
suspend which have 1 second resolution. It could be therefor argued that
the resolution of this clock should be set to 1 second in general, but
that's outside the scope of this fix.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kostya Serebryany <kcc@google.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170530211655.896767100@linutronix.de
Andrey reported a alartimer related RCU stall while fuzzing the kernel with
syzkaller.
The reason for this is an overflow in ktime_add() which brings the
resulting time into negative space and causes immediate expiry of the
timer. The following rearm with a small interval does not bring the timer
back into positive space due to the same issue.
This results in a permanent firing alarmtimer which hogs the CPU.
Use ktime_add_safe() instead which detects the overflow and clamps the
result to KTIME_SEC_MAX.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kostya Serebryany <kcc@google.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170530211655.802921648@linutronix.de
nfs_initialise_sb() and nfs_clone_super() are declared as extern even
though they are used only in fs/nfs/super.c. Mark them as static.
Also remove explicit 'inline' directive from nfs_initialise_sb() and
leave it upto compiler to decide whether inlining is worth it.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Pull hwmon fixes from Guenter Roeck:
"A couple of patches for the aspeed pwm fan driver"
* tag 'hwmon-for-linus-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (aspeed-pwm-tacho) make fan/pwm names start with index 1
hwmon: (aspeed-pwm-tacho) Call of_node_put() on a node not claimed
hwmon: (aspeed-pwm-tacho) On read failure return -ETIMEDOUT
hwmon: (aspeed-pwm-tacho) Select REGMAP
Pull MTD fixes from Brian Norris:
"NAND updates from Boris:
tango fixes:
- Add missing MODULE_DEVICE_TABLE() in tango_nand.c
- Update the number of corrected bitflips
core fixes:
- Fix a long standing memory leak in nand_scan_tail()
- Fix several bugs introduced by the per-vendor init/detection
infrastructure (introduced in 4.12)
- Add a static specifier to nand_ooblayout_lp_hamming_ops definition"
* tag 'for-linus-20170602' of git://git.infradead.org/linux-mtd:
mtd: nand: make nand_ooblayout_lp_hamming_ops static
mtd: nand: tango: Update ecc_stats.corrected
mtd: nand: tango: Export OF device ID table as module aliases
mtd: nand: samsung: warn about un-parseable ECC info
mtd: nand: free vendor-specific resources in init failure paths
mtd: nand: drop unneeded module.h include
mtd: nand: don't leak buffers when ->scan_bbt() fails
Make fan and pwm names in sysfs start with index 1 in accordance to
Documentation/hwmon/sysfs-interface conventions.
Current implementation starts with index 0, making tools such as
sensors(1) skip the first fan.
Signed-off-by: Stefan Schaeckeler <sschaeck@cisco.com>
Fixes: 2d7a548a3e ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Call of_node_put() on a node claimed with of_node_get() or by any other
means such as for_each_child_of_node().
Signed-off-by: Stefan Schaeckeler <sschaeck@cisco.com>
Fixes: 2d7a548a3e ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
modprobe is not able to resolve sysfs modalias for mei devices.
# cat
/sys/class/watchdog/watchdog0/device/watchdog/watchdog0/device/modalias
mei::05b79a6f-4628-4d7f-899d-a91514cb32ab:
# modprobe --set-version 4.9.6-200.fc25.x86_64 -R
mei::05b79a6f-4628-4d7f-899d-a91514cb32ab:
modprobe: FATAL: Module mei::05b79a6f-4628-4d7f-899d-a91514cb32ab: not
found in directory /lib/modules/4.9.6-200.fc25.x86_64
# cat /lib/modules/4.9.6-200.fc25.x86_64/modules.alias | grep
05b79a6f-4628-4d7f-899d-a91514cb32ab
alias mei:*:05b79a6f-4628-4d7f-899d-a91514cb32ab:*:* mei_wdt
commit b26864cad1 ("mei: bus: add client protocol
version to the device alias"), however sysfs modalias
is still in formmat mei:S:uuid:*.
This patch equates format of uevent and sysfs modalias so that modprobe
is able to resolve the aliases.
Cc: <stable@vger.kernel.org> 4.7+
Fixes: commit b26864cad1 ("mei: bus: add client protocol version to the device alias")
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A recent fix to /dev/mem prevents mappings from wrapping around the end
of physical address space. However, the check was written in a way that
also prevents a mapping reaching just up to the end of physical address
space, which may be a valid use case (especially on 32-bit systems).
This patch fixes it by checking the last mapped address (instead of the
first address behind that) for overflow.
Fixes: b299cde245 ("drivers: char: mem: Check for address space wraparound with mmap()")
Cc: <stable@vger.kernel.org>
Reported-by: Nico Huber <nico.h@gmx.de>
Signed-off-by: Julius Werner <jwerner@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case of error, the function devm_ioremap() returns NULL pointer
not ERR_PTR(). The IS_ERR() test in the return value check should
be replaced with NULL test. Also add NULL test for iores.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Starting from MPU6500, accelerometer dlpf is set in a separate
register named ACCEL_CONFIG_2.
Add this new register in the map and set it for the corresponding
chips.
Signed-off-by: Jean-Baptiste Maneyrol <jmaneyrol@invensense.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct
lov_user_md pointer from user- or kernel-space. This changes the
behavior of copy_from_user() on SPARC and may result in a misaligned
access exception which in turn oopses the kernel. In fact the
relevant argument to lov_getstripe() is never called with a
kernel-space pointer and so changing the address limits is unnecessary
and so we remove the calls to save, set, and restore the address
limits.
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-on: http://review.whamcloud.com/6150
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a custom CPU target is specified and that one is not available _or_
can't be interrupted then the code returns to userland without dropping a
lock as notices by lockdep:
|echo 133 > /sys/devices/system/cpu/cpu7/hotplug/target
| ================================================
| [ BUG: lock held when returning to user space! ]
| ------------------------------------------------
| bash/503 is leaving the kernel with locks still held!
| 1 lock held by bash/503:
| #0: (device_hotplug_lock){+.+...}, at: [<ffffffff815b5650>] lock_device_hotplug_sysfs+0x10/0x40
So release the lock then.
Fixes: 757c989b99 ("cpu/hotplug: Make target state writeable")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170602142714.3ogo25f2wbq6fjpj@linutronix.de
The AR100 clock within the R_CCU (PRCM) has the PLL_PERIPH0 as one of
its parents.
This adds the reference in the device tree describing this relationship.
This patch uses a raw number for the clock index to ease merging by
avoiding cross tree dependencies.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
The AR100 clock within the R_CCU (PRCM) has the PLL_PERIPH0 as one of
its parents.
This adds the reference in the device tree describing this relationship.
This patch uses a raw number for the clock index to ease merging by
avoiding cross tree dependencies.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Kishon writes:
phy: for 4.12-rc
*) Fix return value check in phy-qcom-qmp driver
*) Fix memory allocation bug in phy-qcom-qmp driver
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
acpi_dev_found checks that there is a matching ACPI node, but it
may be disabled (_STA method returns 0) in which case the
soc_button_array driver will not bind to it and axp20x-pek should
handle the power-button.
This commit switches from acpi_dev_found to acpi_dev_present to
avoid not registering an input-dev for the powerbutton when there
is a disabled PNP0C40 device.
The ACPI-6.0 standard defines a standard gpio button device using
the ACPI0011 HID replacing the custom PNP0C40 gpio device, many
newer devices define both PNP0C40 and ACPI0011 devices enabling one
or the other depending on whether the BIOS thinks it is going to boot
Android or Windows.
This commit adds a check for the ACPI0011 device, so that if
either device is present *and* enabled we don't register an input-dev
for the powerbutton.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Commit 9b13a4ca8d ("Input: axp20x-pek - do not register input device
on some systems") added a check for the INTCFD9 ACPI device which also
handles the powerbutton as on some systems the powerbutton is connected
to both the PMIC, handled by axp20x-pek, and to a gpio on the SoC, handled
by soc_button_array which attaches itself to the INTCFD9 ACPI device.
Testing + comparing DSDTs has shown that this only happens on Cherry
Trail devices with an AXP288 PMIC, the AXP288 PMIC is also used on
Bay Trail devices but there the power button is only connected to
the PMIC and not handled by soc_button_array.
This means that the INTCFD9 check has caused a regression on Bay Trail
devices, causing power-button presses to no longer be seen.
This commit fixes this by limiting the check to devices where the ACPI
node for the AXP288 contains a _HRV (hardware revision) attribute with
a value of 3 which indicates we are dealing with a Cherry Trail platform.
Fixes: 9b13a4ca8d ("Input: axp20x-pek - do not register input ...")
Reported-by: Сергей Трусов <t.rus76@ya.ru>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Felipe writes:
usb: fixes for v4.12-rc4
A fix to a really old synchronization bug on mass storage gadget.
Support for Meson8 SoCs on dwc2
Synchronization fixes on renesas USB driver.
Pull ACPI fixes from Rafael Wysocki:
"These revert one more problematic commit related to the ACPI-based
handling of laptop lids and make some unuseful error messages coming
from ACPICA go away.
Specifics:
- Revert one more commit related to the ACPI-based handling of laptop
lids that changed the default behavior on laptops that booted with
closed lids and introduced a regression there (Benjamin Tissoires).
- Add a missing acpi_put_table() to the code implementing the
/sys/firmware/acpi/tables interface to prevent a counter in the
ACPICA core from overflowing (Dan Williams).
- Drop error messages printed by ACPICA on acpi_get_table() reference
counting mismatches as they need not indicate real errors at this
point (Lv Zheng)"
* tag 'acpi-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Tables: Fix regression introduced by a too early mechanism enabling
Revert "ACPI / button: Change default behavior to lid_init_state=open"
ACPI / sysfs: fix acpi_get_table() leak / acpi-sysfs denial of service
Pull power management fixes from Rafael Wysocki:
"These fix two bugs in error code paths in the cpufreq core and in the
kirkwood-cpufreq driver.
Specifics:
- Make cpufreq_register_driver() return an error if the ->init()
calls fail for all CPUs to prevent non-functional drivers from
hanging around for no reason (David Arcari).
- Make kirkwood-cpufreq check the return value of
clk_prepare_enable() (which may fail) as appropriate (Arvind
Yadav)"
* tag 'pm-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: kirkwood-cpufreq:- Handle return value of clk_prepare_enable()
cpufreq: cpufreq_register_driver() should return -ENODEV if init fails
Pull /dev/random bug fix from Ted Ts'o:
"Fix a race on architectures with prioritized interrupts (such as m68k)
which can causes crashes in drivers/char/random.c:get_reg()"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
fix race in drivers/char/random.c:get_reg()
Merge misc fixes from Andrew Morton:
"15 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
scripts/gdb: make lx-dmesg command work (reliably)
mm: consider memblock reservations for deferred memory initialization sizing
mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified
mlock: fix mlock count can not decrease in race condition
mm/migrate: fix refcount handling when !hugepage_migration_supported()
dax: fix race between colliding PMD & PTE entries
mm: avoid spurious 'bad pmd' warning messages
mm/page_alloc.c: make sure OOM victim can try allocations with no watermarks once
pcmcia: remove left-over %Z format
slub/memcg: cure the brainless abuse of sysfs attributes
initramfs: fix disabling of initramfs (and its compression)
mm: clarify why we want kmalloc before falling backto vmallock
frv: declare jiffies to be located in the .data section
include/linux/gfp.h: fix ___GFP_NOLOCKDEP value
ksm: prevent crash after write_protect_page fails
lx-dmesg needs access to the log_buf symbol from printk.c.
Unfortunately, the symbol log_buf also exists in BPF's verifier.c and
hence gdb can pick one or the other. If it happens to pick BPF's
log_buf, lx-dmesg doesn't work:
(gdb) lx-dmesg
Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x0:
Error occurred in Python command: Cannot access memory at address 0x0
(gdb) p log_buf
$15 = 0x0
Luckily, GDB has a way to deal with this, see
https://sourceware.org/gdb/onlinedocs/gdb/Symbols.html
(gdb) info variables ^log_buf$
All variables matching regular expression "^log_buf$":
File <linux.git>/kernel/bpf/verifier.c:
static char *log_buf;
File <linux.git>/kernel/printk/printk.c:
static char *log_buf;
(gdb) p 'verifier.c'::log_buf
$1 = 0x0
(gdb) p 'printk.c'::log_buf
$2 = 0x811a6aa0 <__log_buf> ""
(gdb) p &log_buf
$3 = (char **) 0x8120fe40 <log_buf>
(gdb) p &'verifier.c'::log_buf
$4 = (char **) 0x8120fe40 <log_buf>
(gdb) p &'printk.c'::log_buf
$5 = (char **) 0x8048b7d0 <log_buf>
By being explicit about the location of the symbol, we can make lx-dmesg
work again. While at it, do the same for the other symbols we need from
printk.c
Link: http://lkml.kernel.org/r/20170526112222.3414-1-git@andred.net
Signed-off-by: André Draszik <git@andred.net>
Tested-by: Kieran Bingham <kieran@bingham.xyz>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have seen an early OOM killer invocation on ppc64 systems with
crashkernel=4096M:
kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0
kthreadd cpuset=/ mems_allowed=7
CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1
Call Trace:
dump_stack+0xb0/0xf0 (unreliable)
dump_header+0xb0/0x258
out_of_memory+0x5f0/0x640
__alloc_pages_nodemask+0xa8c/0xc80
kmem_getpages+0x84/0x1a0
fallback_alloc+0x2a4/0x320
kmem_cache_alloc_node+0xc0/0x2e0
copy_process.isra.25+0x260/0x1b30
_do_fork+0x94/0x470
kernel_thread+0x48/0x60
kthreadd+0x264/0x330
ret_from_kernel_thread+0x5c/0xa4
Mem-Info:
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:5 slab_unreclaimable:73
mapped:0 shmem:0 pagetables:0 bounce:0
free:0 free_pcp:0 free_cma:0
Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Node 7 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB
0 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
819200 pages RAM
0 pages HighMem/MovableOnly
817481 pages reserved
0 pages cma reserved
0 pages hwpoisoned
the reason is that the managed memory is too low (only 110MB) while the
rest of the the 50GB is still waiting for the deferred intialization to
be done. update_defer_init estimates the initial memoty to initialize
to 2GB at least but it doesn't consider any memory allocated in that
range. In this particular case we've had
Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB)
so the low 2GB is mostly depleted.
Fix this by considering memblock allocations in the initial static
initialization estimation. Move the max_initialise to
reset_deferred_meminit and implement a simple memblock_reserved_memory
helper which iterates all reserved blocks and sums the size of all that
start below the given address. The cumulative size is than added on top
of the initial estimation. This is still not ideal because
reset_deferred_meminit doesn't consider holes and so reservation might
be above the initial estimation whihch we ignore but let's make the
logic simpler until we really need to handle more complicated cases.
Fixes: 3a80a7fa79 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> [4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KVM uses get_user_pages() to resolve its stage2 faults. KVM sets the
FOLL_HWPOISON flag causing faultin_page() to return -EHWPOISON when it
finds a VM_FAULT_HWPOISON. KVM handles these hwpoison pages as a
special case. (check_user_page_hwpoison())
When huge pages are involved, this doesn't work so well.
get_user_pages() calls follow_hugetlb_page(), which stops early if it
receives VM_FAULT_HWPOISON from hugetlb_fault(), eventually returning
-EFAULT to the caller. The step to map this to -EHWPOISON based on the
FOLL_ flags is missing. The hwpoison special case is skipped, and
-EFAULT is returned to user-space, causing Qemu or kvmtool to exit.
Instead, move this VM_FAULT_ to errno mapping code into a header file
and use it from faultin_page() and follow_hugetlb_page().
With this, KVM works as expected.
This isn't a problem for arm64 today as we haven't enabled
MEMORY_FAILURE, but I can't see any reason this doesn't happen on x86
too, so I think this should be a fix. This doesn't apply earlier than
stable's v4.11.1 due to all sorts of cleanup.
[james.morse@arm.com: add vm_fault_to_errno() call to faultin_page()]
suggested.
Link: http://lkml.kernel.org/r/20170525171035.16359-1-james.morse@arm.com
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170524160900.28786-1-james.morse@arm.com
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> [4.11.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kefeng reported that when running the follow test, the mlock count in
meminfo will increase permanently:
[1] testcase
linux:~ # cat test_mlockal
grep Mlocked /proc/meminfo
for j in `seq 0 10`
do
for i in `seq 4 15`
do
./p_mlockall >> log &
done
sleep 0.2
done
# wait some time to let mlock counter decrease and 5s may not enough
sleep 5
grep Mlocked /proc/meminfo
linux:~ # cat p_mlockall.c
#include <sys/mman.h>
#include <stdlib.h>
#include <stdio.h>
#define SPACE_LEN 4096
int main(int argc, char ** argv)
{
int ret;
void *adr = malloc(SPACE_LEN);
if (!adr)
return -1;
ret = mlockall(MCL_CURRENT | MCL_FUTURE);
printf("mlcokall ret = %d\n", ret);
ret = munlockall();
printf("munlcokall ret = %d\n", ret);
free(adr);
return 0;
}
In __munlock_pagevec() we should decrement NR_MLOCK for each page where
we clear the PageMlocked flag. Commit 1ebb7cc6a5 ("mm: munlock: batch
NR_MLOCK zone state updates") has introduced a bug where we don't
decrement NR_MLOCK for pages where we clear the flag, but fail to
isolate them from the lru list (e.g. when the pages are on some other
cpu's percpu pagevec). Since PageMlocked stays cleared, the NR_MLOCK
accounting gets permanently disrupted by this.
Fix it by counting the number of page whose PageMlock flag is cleared.
Fixes: 1ebb7cc6a5 (" mm: munlock: batch NR_MLOCK zone state updates")
Link: http://lkml.kernel.org/r/1495678405-54569-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Joern Engel <joern@logfs.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michel Lespinasse <walken@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: zhongjiang <zhongjiang@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On failing to migrate a page, soft_offline_huge_page() performs the
necessary update to the hugepage ref-count.
But when !hugepage_migration_supported() , unmap_and_move_hugepage()
also decrements the page ref-count for the hugepage. The combined
behaviour leaves the ref-count in an inconsistent state.
This leads to soft lockups when running the overcommitted hugepage test
from mce-tests suite.
Soft offlining pfn 0x83ed600 at process virtual address 0x400000000000
soft offline: 0x83ed600: migration failed 1, type 1fffc00000008008 (uptodate|head)
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-7): P2715
(detected by 7, t=5254 jiffies, g=963, c=962, q=321)
thugetlb_overco R running task 0 2715 2685 0x00000008
Call trace:
dump_backtrace+0x0/0x268
show_stack+0x24/0x30
sched_show_task+0x134/0x180
rcu_print_detail_task_stall_rnp+0x54/0x7c
rcu_check_callbacks+0xa74/0xb08
update_process_times+0x34/0x60
tick_sched_handle.isra.7+0x38/0x70
tick_sched_timer+0x4c/0x98
__hrtimer_run_queues+0xc0/0x300
hrtimer_interrupt+0xac/0x228
arch_timer_handler_phys+0x3c/0x50
handle_percpu_devid_irq+0x8c/0x290
generic_handle_irq+0x34/0x50
__handle_domain_irq+0x68/0xc0
gic_handle_irq+0x5c/0xb0
Address this by changing the putback_active_hugepage() in
soft_offline_huge_page() to putback_movable_pages().
This only triggers on systems that enable memory failure handling
(ARCH_SUPPORTS_MEMORY_FAILURE) but not hugepage migration
(!ARCH_ENABLE_HUGEPAGE_MIGRATION).
I imagine this wasn't triggered as there aren't many systems running
this configuration.
[akpm@linux-foundation.org: remove dead comment, per Naoya]
Link: http://lkml.kernel.org/r/20170525135146.32011-1-punit.agrawal@arm.com
Reported-by: Manoj Iyer <manoj.iyer@canonical.com>
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Suggested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> [3.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We currently have two related PMD vs PTE races in the DAX code. These
can both be easily triggered by having two threads reading and writing
simultaneously to the same private mapping, with the key being that
private mapping reads can be handled with PMDs but private mapping
writes are always handled with PTEs so that we can COW.
Here is the first race:
CPU 0 CPU 1
(private mapping write)
__handle_mm_fault()
create_huge_pmd() - FALLBACK
handle_pte_fault()
passes check for pmd_devmap()
(private mapping read)
__handle_mm_fault()
create_huge_pmd()
dax_iomap_pmd_fault() inserts PMD
dax_iomap_pte_fault() does a PTE fault, but we already have a DAX PMD
installed in our page tables at this spot.
Here's the second race:
CPU 0 CPU 1
(private mapping read)
__handle_mm_fault()
passes check for pmd_none()
create_huge_pmd()
dax_iomap_pmd_fault() inserts PMD
(private mapping write)
__handle_mm_fault()
create_huge_pmd() - FALLBACK
(private mapping read)
__handle_mm_fault()
passes check for pmd_none()
create_huge_pmd()
handle_pte_fault()
dax_iomap_pte_fault() inserts PTE
dax_iomap_pmd_fault() inserts PMD,
but we already have a PTE at
this spot.
The core of the issue is that while there is isolation between faults to
the same range in the DAX fault handlers via our DAX entry locking,
there is no isolation between faults in the code in mm/memory.c. This
means for instance that this code in __handle_mm_fault() can run:
if (pmd_none(*vmf.pmd) && transparent_hugepage_enabled(vma)) {
ret = create_huge_pmd(&vmf);
But by the time we actually get to run the fault handler called by
create_huge_pmd(), the PMD is no longer pmd_none() because a racing PTE
fault has installed a normal PMD here as a parent. This is the cause of
the 2nd race. The first race is similar - there is the following check
in handle_pte_fault():
} else {
/* See comment in pte_alloc_one_map() */
if (pmd_devmap(*vmf->pmd) || pmd_trans_unstable(vmf->pmd))
return 0;
So if a pmd_devmap() PMD (a DAX PMD) has been installed at vmf->pmd, we
will bail and retry the fault. This is correct, but there is nothing
preventing the PMD from being installed after this check but before we
actually get to the DAX PTE fault handlers.
In my testing these races result in the following types of errors:
BUG: Bad rss-counter state mm:ffff8800a817d280 idx:1 val:1
BUG: non-zero nr_ptes on freeing mm: 15
Fix this issue by having the DAX fault handlers verify that it is safe
to continue their fault after they have taken an entry lock to block
other racing faults.
[ross.zwisler@linux.intel.com: improve fix for colliding PMD & PTE entries]
Link: http://lkml.kernel.org/r/20170526195932.32178-1-ross.zwisler@linux.intel.com
Link: http://lkml.kernel.org/r/20170522215749.23516-2-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Pawel Lebioda <pawel.lebioda@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Pawel Lebioda <pawel.lebioda@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Xiong Zhou <xzhou@redhat.com>
Cc: Eryu Guan <eguan@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the pmd_devmap() checks were added by 5c7fb56e5e ("mm, dax:
dax-pmd vs thp-pmd vs hugetlbfs-pmd") to add better support for DAX huge
pages, they were all added to the end of if() statements after existing
pmd_trans_huge() checks. So, things like:
- if (pmd_trans_huge(*pmd))
+ if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd))
When further checks were added after pmd_trans_unstable() checks by
commit 7267ec008b ("mm: postpone page table allocation until we have
page to map") they were also added at the end of the conditional:
+ if (pmd_trans_unstable(fe->pmd) || pmd_devmap(*fe->pmd))
This ordering is fine for pmd_trans_huge(), but doesn't work for
pmd_trans_unstable(). This is because DAX huge pages trip the bad_pmd()
check inside of pmd_none_or_trans_huge_or_clear_bad() (called by
pmd_trans_unstable()), which prints out a warning and returns 1. So, we
do end up doing the right thing, but only after spamming dmesg with
suspicious looking messages:
mm/pgtable-generic.c:39: bad pmd ffff8808daa49b88(84000001006000a5)
Reorder these checks in a helper so that pmd_devmap() is checked first,
avoiding the error messages, and add a comment explaining why the
ordering is important.
Fixes: commit 7267ec008b ("mm: postpone page table allocation until we have page to map")
Link: http://lkml.kernel.org/r/20170522215749.23516-1-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Pawel Lebioda <pawel.lebioda@intel.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Xiong Zhou <xzhou@redhat.com>
Cc: Eryu Guan <eguan@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Gushchin has reported that the OOM killer can trivially selects
next OOM victim when a thread doing memory allocation from page fault
path was selected as first OOM victim.
allocate invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
allocate cpuset=/ mems_allowed=0
CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Call Trace:
oom_kill_process+0x219/0x3e0
out_of_memory+0x11d/0x480
__alloc_pages_slowpath+0xc84/0xd40
__alloc_pages_nodemask+0x245/0x260
alloc_pages_vma+0xa2/0x270
__handle_mm_fault+0xca9/0x10c0
handle_mm_fault+0xf3/0x210
__do_page_fault+0x240/0x4e0
trace_do_page_fault+0x37/0xe0
do_async_page_fault+0x19/0x70
async_page_fault+0x28/0x30
...
Out of memory: Kill process 492 (allocate) score 899 or sacrifice child
Killed process 492 (allocate) total-vm:2052368kB, anon-rss:1894576kB, file-rss:4kB, shmem-rss:0kB
allocate: page allocation failure: order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null)
allocate cpuset=/ mems_allowed=0
CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Call Trace:
__alloc_pages_slowpath+0xd32/0xd40
__alloc_pages_nodemask+0x245/0x260
alloc_pages_vma+0xa2/0x270
__handle_mm_fault+0xca9/0x10c0
handle_mm_fault+0xf3/0x210
__do_page_fault+0x240/0x4e0
trace_do_page_fault+0x37/0xe0
do_async_page_fault+0x19/0x70
async_page_fault+0x28/0x30
...
oom_reaper: reaped process 492 (allocate), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
...
allocate invoked oom-killer: gfp_mask=0x0(), nodemask=(null), order=0, oom_score_adj=0
allocate cpuset=/ mems_allowed=0
CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Call Trace:
oom_kill_process+0x219/0x3e0
out_of_memory+0x11d/0x480
pagefault_out_of_memory+0x68/0x80
mm_fault_error+0x8f/0x190
? handle_mm_fault+0xf3/0x210
__do_page_fault+0x4b2/0x4e0
trace_do_page_fault+0x37/0xe0
do_async_page_fault+0x19/0x70
async_page_fault+0x28/0x30
...
Out of memory: Kill process 233 (firewalld) score 10 or sacrifice child
Killed process 233 (firewalld) total-vm:246076kB, anon-rss:20956kB, file-rss:0kB, shmem-rss:0kB
There is a race window that the OOM reaper completes reclaiming the
first victim's memory while nothing but mutex_trylock() prevents the
first victim from calling out_of_memory() from pagefault_out_of_memory()
after memory allocation for page fault path failed due to being selected
as an OOM victim.
This is a side effect of commit 9a67f6488e ("mm: consolidate
GFP_NOFAIL checks in the allocator slowpath") because that commit
silently changed the behavior from
/* Avoid allocations with no watermarks from looping endlessly */
to
/*
* Give up allocations without trying memory reserves if selected
* as an OOM victim
*/
in __alloc_pages_slowpath() by moving the location to check TIF_MEMDIE
flag. I have noticed this change but I didn't post a patch because I
thought it is an acceptable change other than noise by warn_alloc()
because !__GFP_NOFAIL allocations are allowed to fail. But we
overlooked that failing memory allocation from page fault path makes
difference due to the race window explained above.
While it might be possible to add a check to pagefault_out_of_memory()
that prevents the first victim from calling out_of_memory() or remove
out_of_memory() from pagefault_out_of_memory(), changing
pagefault_out_of_memory() does not suppress noise by warn_alloc() when
allocating thread was selected as an OOM victim. There is little point
with printing similar backtraces and memory information from both
out_of_memory() and warn_alloc().
Instead, if we guarantee that current thread can try allocations with no
watermarks once when current thread looping inside
__alloc_pages_slowpath() was selected as an OOM victim, we can follow "who
can use memory reserves" rules and suppress noise by warn_alloc() and
prevent memory allocations from page fault path from calling
pagefault_out_of_memory().
If we take the comment literally, this patch would do
- if (test_thread_flag(TIF_MEMDIE))
- goto nopage;
+ if (alloc_flags == ALLOC_NO_WATERMARKS || (gfp_mask & __GFP_NOMEMALLOC))
+ goto nopage;
because gfp_pfmemalloc_allowed() returns false if __GFP_NOMEMALLOC is
given. But if I recall correctly (I couldn't find the message), the
condition is meant to apply to only OOM victims despite the comment.
Therefore, this patch preserves TIF_MEMDIE check.
Fixes: 9a67f6488e ("mm: consolidate GFP_NOFAIL checks in the allocator slowpath")
Link: http://lkml.kernel.org/r/201705192112.IAF69238.OQOHSJLFOFFMtV@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Roman Gushchin <guro@fb.com>
Tested-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org> [4.11]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memcg_propagate_slab_attrs() abuses the sysfs attribute file functions
to propagate settings from the root kmem_cache to a newly created
kmem_cache. It does that with:
attr->show(root, buf);
attr->store(new, buf, strlen(bug);
Aside of being a lazy and absurd hackery this is broken because it does
not check the return value of the show() function.
Some of the show() functions return 0 w/o touching the buffer. That
means in such a case the store function is called with the stale content
of the previous show(). That causes nonsense like invoking
kmem_cache_shrink() on a newly created kmem_cache. In the worst case it
would cause handing in an uninitialized buffer.
This should be rewritten proper by adding a propagate() callback to
those slub_attributes which must be propagated and avoid that insane
conversion to and from ASCII, but that's too large for a hot fix.
Check at least the return value of the show() function, so calling
store() with stale content is prevented.
Steven said:
"It can cause a deadlock with get_online_cpus() that has been uncovered
by recent cpu hotplug and lockdep changes that Thomas and Peter have
been doing.
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(cpu_hotplug.lock);
lock(slab_mutex);
lock(cpu_hotplug.lock);
lock(slab_mutex);
*** DEADLOCK ***"
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705201244540.2255@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit db2aa7fd15 ("initramfs: allow again choice of the embedded
initram compression algorithm") introduced the possibility to select the
initramfs compression algorithm from Kconfig and while this is a nice
feature it broke the use case described below.
Here is what my build system does:
- kernel is initially configured not to have an initramfs included
- build the user space root file system
- re-configure the kernel to have an initramfs included
(CONFIG_INITRAMFS_SOURCE="/path/to/romfs") and set relevant
CONFIG_INITRAMFS options, in my case, no compression option
(CONFIG_INITRAMFS_COMPRESSION_NONE)
- kernel is re-built with these options -> kernel+initramfs image is
copied
- kernel is re-built again without these options -> kernel image is
copied
Building a kernel without an initramfs means setting this option:
CONFIG_INITRAMFS_SOURCE="" (and this one only)
whereas building a kernel with an initramfs means setting these options:
CONFIG_INITRAMFS_SOURCE="/home/fainelli/work/uclinux-rootfs/romfs /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev"
CONFIG_INITRAMFS_ROOT_UID=1000
CONFIG_INITRAMFS_ROOT_GID=1000
CONFIG_INITRAMFS_COMPRESSION_NONE=y
CONFIG_INITRAMFS_COMPRESSION=""
Commit db2aa7fd15 ("initramfs: allow again choice of the embedded
initram compression algorithm") is problematic because
CONFIG_INITRAMFS_COMPRESSION which is used to determine the
initramfs_data.cpio extension/compression is a string, and due to how
Kconfig works it will evaluate in order, how to assign it.
Setting CONFIG_INITRAMFS_COMPRESSION_NONE with CONFIG_INITRAMFS_SOURCE=""
cannot possibly work (because of the depends on INITRAMFS_SOURCE!=""
imposed on CONFIG_INITRAMFS_COMPRESSION ) yet we still get
CONFIG_INITRAMFS_COMPRESSION assigned to ".gz" because CONFIG_RD_GZIP=y
is set in my kernel, even when there is no initramfs being built.
So we basically end-up generating two initramfs_data.cpio* files, one
without extension, and one with .gz. This causes usr/Makefile to track
usr/initramfs_data.cpio.gz, and not usr/initramfs_data.cpio anymore,
that is also largely problematic after 9e3596b0c6 ("kbuild:
initramfs cleanup, set target from Kconfig") because we used to track
all possible initramfs_data files in the $(targets) variable before that
commit.
The end result is that the kernel with an initramfs clearly does not
contain what we expect it to, it has a stale initramfs_data.cpio file
built into it, and we keep re-generating an initramfs_data.cpio.gz file
which is not the one that we want to include in the kernel image proper.
The fix consists in hiding CONFIG_INITRAMFS_COMPRESSION when
CONFIG_INITRAMFS_SOURCE="". This puts us back in a state to the
pre-4.10 behavior where we can properly disable and re-enable initramfs
within the same kernel .config file, and be in control of what
CONFIG_INITRAMFS_COMPRESSION is set to.
Fixes: db2aa7fd15 ("initramfs: allow again choice of the embedded initram compression algorithm")
Fixes: 9e3596b0c6 ("kbuild: initramfs cleanup, set target from Kconfig")
Link: http://lkml.kernel.org/r/20170521033337.6197-1-f.fainelli@gmail.com
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Cc: P J P <ppandit@redhat.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While converting drm_[cm]alloc* helpers to kvmalloc* variants Chris
Wilson has wondered why we want to try kmalloc before vmalloc fallback
even for larger allocations requests. Let's clarify that one larger
physically contiguous block is less likely to fragment memory than many
scattered pages which can prevent more large blocks from being created.
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170517080932.21423-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 7c30f352c8 ("jiffies.h: declare jiffies and jiffies_64 with
____cacheline_aligned_in_smp") removed a section specification from the
jiffies declaration that caused conflicts on some platforms.
Unfortunately this change broke the build for frv:
kernel/built-in.o: In function `__do_softirq': (.text+0x6460): relocation truncated to fit: R_FRV_GPREL12 against symbol
`jiffies' defined in *ABS* section in .tmp_vmlinux1
kernel/built-in.o: In function `__do_softirq': (.text+0x6574): relocation truncated to fit: R_FRV_GPREL12 against symbol
`jiffies' defined in *ABS* section in .tmp_vmlinux1
kernel/built-in.o: In function `pwq_activate_delayed_work': workqueue.c:(.text+0x15b9c): relocation truncated to fit: R_FRV_GPREL12 against
symbol `jiffies' defined in *ABS* section in .tmp_vmlinux1
...
Add __jiffy_arch_data to the declaration of jiffies and use it on frv to
include the section specification. For all other platforms
__jiffy_arch_data (currently) has no effect.
Fixes: 7c30f352c8 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp")
Link: http://lkml.kernel.org/r/20170516221333.177280-1-mka@chromium.org
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Igor Stoppa has noticed that __GFP_NOLOCKDEP can use a lower bit. At
the time commit 7e7844226f ("lockdep: allow to disable reclaim lockup
detection") was written we still had __GFP_OTHER_NODE but I have removed
it in commit 41b6167e8f ("mm: get rid of __GFP_OTHER_NODE") and forgot
to lower the bit value.
The current value is outside of __GFP_BITS_SHIFT so it cannot be used
actually.
Fixes: 7e7844226f ("lockdep: allow to disable reclaim lockup detection")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Igor Stoppa <igor.stoppa@nokia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
"err" needs to be left set to -EFAULT if split_huge_page succeeds.
Otherwise if "err" gets clobbered with zero and write_protect_page
fails, try_to_merge_one_page() will succeed instead of returning -EFAULT
and then try_to_merge_with_ksm_page() will continue thinking kpage is a
PageKsm when in fact it's still an anonymous page. Eventually it'll
crash in page_add_anon_rmap.
This has been reproduced on Fedora25 kernel but I can reproduce with
upstream too.
The bug was introduced in commit f765f54059 ("ksm: prepare to new THP
semantics") introduced in v4.5.
page:fffff67546ce1cc0 count:4 mapcount:2 mapping:ffffa094551e36e1 index:0x7f0f46673
flags: 0x2ffffc0004007c(referenced|uptodate|dirty|lru|active|swapbacked)
page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
page->mem_cgroup:ffffa09674bf0000
------------[ cut here ]------------
kernel BUG at mm/rmap.c:1222!
CPU: 1 PID: 76 Comm: ksmd Not tainted 4.9.3-200.fc25.x86_64 #1
RIP: do_page_add_anon_rmap+0x1c4/0x240
Call Trace:
page_add_anon_rmap+0x18/0x20
try_to_merge_with_ksm_page+0x50b/0x780
ksm_scan_thread+0x1211/0x1410
? prepare_to_wait_event+0x100/0x100
? try_to_merge_with_ksm_page+0x780/0x780
kthread+0xd9/0xf0
? kthread_park+0x60/0x60
ret_from_fork+0x25/0x30
Fixes: f765f54059 ("ksm: prepare to new THP semantics")
Link: http://lkml.kernel.org/r/20170513131040.21732-1-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Federico Simoncelli <fsimonce@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm-cpufreq:
cpufreq: kirkwood-cpufreq:- Handle return value of clk_prepare_enable()
cpufreq: cpufreq_register_driver() should return -ENODEV if init fails
Pull XFS fix from Darrick Wong:
"I've one more bugfix for you for 4.12-rc4: Fix an unmount hang due to
a race in io buffer accounting"
* tag 'xfs-4.12-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: use ->b_state to fix buffer I/O accounting release race
Pull ceph fix from Ilya Dryomov:
"A small fix for rbd FALLOC_FL_ZERO_RANGE/PUNCH_HOLE handling breakage
introduced in -rc1"
* tag 'ceph-for-4.12-rc4' of git://github.com/ceph/ceph-client:
rbd: implement REQ_OP_WRITE_ZEROES
If logout response is not received and ->ep_disconnect() is called then
close tcp conn by RST instead of FIN to cleanup conn resources
immediately.
Also move ->csk_push_tx_frames() above 'done:' to avoid calling
->csk_push_tx_frames() in error cases.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull device mapper fixes from Mike Snitzer:
- a DM verity fix for a mode when no salt is used
- a fix to DM to account for the possibility that PREFLUSH or FUA are
used without the SYNC flag if the underlying storage doesn't have a
volatile write-cache
- a DM ioctl memory allocation flag fix to use __GFP_HIGH to allow
emergency forward progress (by using memory reserves as last resort)
- a small DM integrity cleanup to use kvmalloc() instead of duplicating
the same
* tag 'for-4.12/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: make flush bios explicitly sync
dm ioctl: restore __GFP_HIGH in copy_params()
dm integrity: use kvmalloc() instead of dm_integrity_kvmalloc()
dm verity: fix no salt use case
Pull MD fixes from Shaohua Li:
"Several patches for MD. One notable is making flush bios sync, others
fix small issues"
* tag 'md/4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md: Make flush bios explicitely sync
md: report sector of stripes with check mismatches
md: uuid debug statement now in processor byte order.
md-cluster: fix potential lock issue in add_new_disk
Pull block fixes from Jens Axboe:
"A set of fixes that should go into the next -rc. This contains:
- A use-after-free in the request_list exit for the legacy IO path,
from Bart.
- A fix for CFQ, fixing a recent regression with the conversion to
higher resolution timing for iops mode. From Hou Tao.
- A single fix for nbd, split in two patches, fixing a leak of a data
structure.
- A regression fix from Keith, ensuring that callers of
blk_mq_update_nr_hw_queues() hold the right lock"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: Avoid that blk_exit_rl() triggers a use-after-free
cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode
blk-mq: Take tagset lock when updating hw queues
nbd: don't leak nbd_config
nbd: nbd_reset() call in nbd_dev_add() is redundant
Pull drm displayport quirk support:
"DP quirk for usb c dongles.
As mentioned I have a separate request for fixing a regression, but
also keeping the broken hw working, for certain USB-C DP adapters they
require a minimised n/m parameters, but an attempt to do this
generically has failed, we need to quirk these specific adapters.
However doing it generically regressed some eDP panels.
This pull adds the infrastructure and a quirk for the adapter"
* tag 'drm-dp-quirk-for-v4.12-rc4' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Detect USB-C specific dongles before reducing M and N
drm/dp: start a DPCD based DP sink/branch device quirk database
drm/i915: use drm DP helper to read DPCD desc
drm/dp: add helper for reading DP sink/branch device desc from DPCD
commit d85b758f72 ("virtio_net: fix support for small rings")
was supposed to increase the buffer size for small rings but had an
unintentional side effect of decreasing it for large rings. This seems
to break some setups - it's not yet clear why, but increasing buffer
size back to what it was before helps.
Fixes: d85b758f72 ("virtio_net: fix support for small rings")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Qlogic's 82xx series adapter doesn't support
tunnel offloads, driver incorrectly assumes that it is
supported and causes firmware hang while running tunnel IO.
This patch fixes this by not advertising tunnel offloads
for 82xx adapters.
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding a vxlan interface to a socket isn't symmetrical, while adding
is done in vxlan_open() the deletion is done in vxlan_dellink().
This can cause a use-after-free error when we close the vxlan
interface before deleting it.
We add vxlan_vs_del_dev() to match vxlan_vs_add_dev() and call
it from vxlan_stop() to match the call from vxlan_open().
Fixes: 56ef9c909b ("vxlan: Move socket initialization to within rtnl scope")
Acked-by: Jiri Benc <jbenc@redhat.com>
Tested-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the sender switches its congestion control during loss
recovery, if the recovery is spurious then it may incorrectly
revert cwnd and ssthresh to the older values set by a previous
congestion control. Consider a congestion control (like BBR)
that does not use ssthresh and keeps it infinite: the connection
may incorrectly revert cwnd to an infinite value when switching
from BBR to another congestion control.
This patch fixes it by disallowing such cwnd undo operation
upon switching congestion control. Note that undo_marker
is not reset s.t. the packets that were incorrectly marked
lost would be corrected. We only avoid undoing the cwnd in
tcp_undo_cwnd_reduction().
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Take uld mutex to avoid race between cxgb_up() and
cxgb4_register_uld() to enable napi for the same uld
queue.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
xfrm6_find_1stfragopt() may now return an error code and we must
not treat it as a length.
Fixes: 2423496af3 ("ipv6: Prevent overrun when parsing v6 header options")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull sound fixes from Takashi Iwai:
"This contains the fixes for a few reported regression for HD-audio and
USB-audio. All small, trivial, and boring"
* tag 'sound-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix applying MSI dual-codec mobo quirk
ALSA: usb: Avoid VLA in mixer_us16x08.c
ALSA: usb: Fix a typo in Tascam US-16x08 mixer element
Revert "ALSA: usb-audio: purge needless variable length array"
Pull dmaengine fixes from Vinod Koul:
"Here is the dmaengine fixes request for 4.12. Fixes bunch of issues in
the driver, npthing exciting though..
- mv_xor_v2 driver fixes for handling descriptors, tx_submit
implementation, removing interrupt coalescing and setting DMA mask
properly
- fix usb-dmac DMAOR AE bit definition
- fix ep93xx start buffer from BASE0 and not drain the transfers in
terminate_all
- fix rcar-dmac to use right descriptor pointer for residue
calculation
- pl330 fix warn for irq freeup"
* tag 'dmaengine-fix-4.12-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: pl330: fix warning in pl330_remove
rcar-dmac: fixup descriptor pointer for descriptor mode
dmaengine: ep93xx: Don't drain the transfers in terminate_all()
dmaengine: ep93xx: Always start from BASE0
dmaengine: usb-dmac: Fix DMAOR AE bit definition
dmaengine: mv_xor_v2: set DMA mask to 40 bits
dmaengine: mv_xor_v2: remove interrupt coalescing
dmaengine: mv_xor_v2: fix tx_submit() implementation
dmaengine: mv_xor_v2: enable XOR engine after its configuration
dmaengine: mv_xor_v2: do not use descriptors not acked by async_tx
dmaengine: mv_xor_v2: properly handle wrapping in the array of HW descriptors
dmaengine: mv_xor_v2: handle mv_xor_v2_prep_sw_desc() error properly
Pull HID fixes from Jiri Kosina:
- corner-case oops fixes for Asus and Wacom drivers from Carlo Caione
and Jason Gerecke
- power management fix (reported on SIS0817 touchscreen) for i2c-hid
devices from Hans de Goede
- device-id-specific fixes and quirks from Hans de Goede, Diego Elio
Pettenò and Che-Liang Chiou
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: asus: Stop underlying hardware on remove
HID: i2c: Call acpi_device_fix_up_power for ACPI-enumerated devices
HID: asus: Add support for T100 keyboard
HID: elecom: extend to fix the descriptor for DEFT trackballs
HID: magicmouse: Set multi-touch keybits for Magic Mouse
HID: wacom: Have wacom_tpc_irq guard against possible NULL dereference
Pull livepatching fix from Jiri Kosina:
"Kconfig dependency fix for livepatching infrastructure from Miroslav
Benes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: Make livepatch dependent on !TRIM_UNUSED_KSYMS
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- revert a broken PAT commit that broke a number of systems
- fix two preemptability warnings/bugs that can trigger under certain
circumstances, in the debug code and in the microcode loader"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "x86/PAT: Fix Xorg regression on CPUs that don't support PAT"
x86/debug/32: Convert a smp_processor_id() call to raw to avoid DEBUG_PREEMPT warning
x86/microcode/AMD: Change load_microcode_amd()'s param to bool to fix preemptibility bug
Pull EFI fixes from Ingo Molnar:
"Misc fixes:
- three boot crash fixes for uncommon configurations
- silence a boot warning under virtualization
- plus a GCC 7 related (harmless) build warning fix"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/bgrt: Skip efi_bgrt_init() in case of non-EFI boot
x86/efi: Correct EFI identity mapping under 'efi=old_map' when KASLR is enabled
x86/efi: Disable runtime services on kexec kernel if booted with efi=old_map
efi: Remove duplicate 'const' specifiers
efi: Don't issue error message when booted under Xen
commit a39be606f9 ("drm: Do a full device unregister when unplugging")
causes backtraces like this one when unplugging an usb drm device while
it is in use:
usb 2-3: USB disconnect, device number 25
------------[ cut here ]------------
WARNING: CPU: 0 PID: 242 at drivers/gpu/drm/drm_mode_config.c:424
drm_mode_config_cleanup+0x220/0x280 [drm]
...
RIP: 0010:drm_mode_config_cleanup+0x220/0x280 [drm]
...
Call Trace:
gm12u320_modeset_cleanup+0xe/0x10 [gm12u320]
gm12u320_driver_unload+0x35/0x70 [gm12u320]
drm_dev_unregister+0x3c/0xe0 [drm]
drm_unplug_dev+0x12/0x60 [drm]
gm12u320_usb_disconnect+0x36/0x40 [gm12u320]
usb_unbind_interface+0x72/0x280
device_release_driver_internal+0x158/0x210
device_release_driver+0x12/0x20
bus_remove_device+0x104/0x180
device_del+0x1d2/0x350
usb_disable_device+0x9f/0x270
usb_disconnect+0xc6/0x260
...
[drm:drm_mode_config_cleanup [drm]] *ERROR* connector Unknown-1 leaked!
------------[ cut here ]------------
WARNING: CPU: 0 PID: 242 at drivers/gpu/drm/drm_mode_config.c:458
drm_mode_config_cleanup+0x268/0x280 [drm]
...
<same Call Trace>
---[ end trace 80df975dae439ed6 ]---
general protection fault: 0000 [#1] SMP
...
Call Trace:
? __switch_to+0x225/0x450
drm_mode_rmfb_work_fn+0x55/0x70 [drm]
process_one_work+0x193/0x3c0
worker_thread+0x4a/0x3a0
...
RIP: drm_framebuffer_remove+0x62/0x3f0 [drm] RSP: ffffb776c39dfd98
---[ end trace 80df975dae439ed7 ]---
After which the system is unusable this is caused by drm_dev_unregister
getting called immediately on unplug, which calls the drivers unload
function which calls drm_mode_config_cleanup which removes the framebuffer
object while userspace is still holding a reference to it.
Reverting commit a39be606f9 ("drm: Do a full device unregister
when unplugging") leads to the following oops on unplug instead,
when userspace closes the last fd referencing the drm_dev:
sysfs group 'power' not found for kobject 'card1-Unknown-1'
------------[ cut here ]------------
WARNING: CPU: 0 PID: 2459 at fs/sysfs/group.c:237
sysfs_remove_group+0x80/0x90
...
RIP: 0010:sysfs_remove_group+0x80/0x90
...
Call Trace:
dpm_sysfs_remove+0x57/0x60
device_del+0xfd/0x350
device_unregister+0x1a/0x60
drm_sysfs_connector_remove+0x39/0x50 [drm]
drm_connector_unregister+0x5a/0x70 [drm]
drm_connector_unregister_all+0x45/0xa0 [drm]
drm_modeset_unregister_all+0x12/0x30 [drm]
drm_dev_unregister+0xca/0xe0 [drm]
drm_put_dev+0x32/0x60 [drm]
drm_release+0x2f3/0x380 [drm]
__fput+0xdf/0x1e0
...
---[ end trace ecfb91ac85688bbe ]---
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
IP: down_write+0x1f/0x40
...
Call Trace:
debugfs_remove_recursive+0x55/0x1b0
drm_debugfs_connector_remove+0x21/0x40 [drm]
drm_connector_unregister+0x62/0x70 [drm]
drm_connector_unregister_all+0x45/0xa0 [drm]
drm_modeset_unregister_all+0x12/0x30 [drm]
drm_dev_unregister+0xca/0xe0 [drm]
drm_put_dev+0x32/0x60 [drm]
drm_release+0x2f3/0x380 [drm]
__fput+0xdf/0x1e0
...
---[ end trace ecfb91ac85688bbf ]---
This is caused by the revert moving back to drm_unplug_dev calling
drm_minor_unregister which does:
device_del(minor->kdev);
dev_set_drvdata(minor->kdev, NULL); /* safety belt */
drm_debugfs_cleanup(minor);
Causing the sysfs entries to already be removed even though we still
have references to them in e.g. drm_connector.
Note we must call drm_minor_unregister to notify userspace of the unplug
of the device, so calling drm_dev_unregister is not completely wrong the
problem is that drm_dev_unregister does too much.
This commit fixes drm_unplug_dev by not only reverting
commit a39be606f9 ("drm: Do a full device unregister when unplugging")
but by also adding a call to drm_modeset_unregister_all before the
drm_minor_unregister calls to make sure all sysfs entries are removed
before calling device_del(minor->kdev) thereby also fixing the second
set of oopses caused by just reverting the commit.
Fixes: a39be606f9 ("drm: Do a full device unregister when unplugging")
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeffy <jeffy.chen@rock-chips.com>
Cc: Marco Diego Aurélio Mesquita <marcodiegomesquita@gmail.com>
Reported-by: Marco Diego Aurélio Mesquita <marcodiegomesquita@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601115430.4113-1-hdegoede@redhat.com
Johannes Berg says:
====================
Just two fixes:
* fix the per-CPU drop counters to not be added to the
rx_packets counter, but really the drop counter
* fix TX aggregation start/stop callback races by setting
bits instead of allocating and queueing an skb
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
dsa_switch_suspend() and dsa_switch_resume() are functions that belong in
net/dsa/dsa.c and are not part of the legacy platform support code.
Fixes: a6a71f19fe ("net: dsa: isolate legacy code")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
On SYSTEMPORT Lite, since we have the main interrupt source in the first
cell, the second cell is the Wake-on-LAN interrupt, yet the code was not
properly updated to fetch the second cell, and instead looked at the
third and non-existing cell for Wake-on-LAN.
Fixes: 44a4524c54 ("net: systemport: Add support for SYSTEMPORT Lite")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The BAD_MADT_GICC_ENTRY() macro checks if a GICC MADT entry passes
muster from an ACPI specification standpoint. Current macro detects the
MADT GICC entry length through ACPI firmware version (it changed from 76
to 80 bytes in the transition from ACPI 5.1 to ACPI 6.0 specification)
but always uses (erroneously) the ACPICA (latest) struct (ie struct
acpi_madt_generic_interrupt - that is 80-bytes long) length to check if
the current GICC entry memory record exceeds the MADT table end in
memory as defined by the MADT table header itself, which may result in
false negatives depending on the ACPI firmware version and how the MADT
entries are laid out in memory (ie on ACPI 5.1 firmware MADT GICC
entries are 76 bytes long, so by adding 80 to a GICC entry start address
in memory the resulting address may well be past the actual MADT end,
triggering a false negative).
Fix the BAD_MADT_GICC_ENTRY() macro by reshuffling the condition checks
and update them to always use the firmware version specific MADT GICC
entry length in order to carry out boundary checks.
Fixes: b6cfb27737 ("ACPI / ARM64: add BAD_MADT_GICC_ENTRY() macro")
Reported-by: Julien Grall <julien.grall@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Al Stone <ahs3@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When the association between a queue device and the
driver is released via unbind and later re-associated
the queue device was not operational any more. Reason
was a wrong administration of the card/queue lists
within the ap device driver.
This patch introduces revised card/queue list handling
within the ap device driver: when an ap device is
detected it is initial not added to the card/queue list
any more. With driver probe the card device is added to
the card list/the queue device is added to the queue list
within a card. With driver remove the device is removed
from the card/queue list. Additionally there are some
situations within the ap device live where the lists
need update upon card/queue device release (for example
device hot unplug or suspend/resume).
Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We are missing a call to hid_hw_stop() on the remove hook.
Among other things this is causing an Oops when (re-)starting GNOME /
upowerd / ... after the module has been already rmmod-ed.
Signed-off-by: Carlo Caione <carlo@endlessm.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The PN_INT_ENA register should be used after usb3_pn_change() is called.
So, this patch moves the access from renesas_usb3_stop_controller() to
usb3_disable_pipe_n().
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This controller disallows to change the PIPE until reading/writing
a packet finishes. However. the previous code is not enough to hold
the lock in some functions. So, this patch fixes it.
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This patch fixes an issue that this driver is possible to cause
deadlock by double-spinclocked in renesas_usb3_stop_controller().
So, this patch removes spinlock API calling in renesas_usb3_stop().
(In other words, the previous code had a redundant lock.)
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This patch fixes an issue that this driver is possible to access
the registers before pm_runtime_get_sync() if a gadget driver is
installed first. After that, oops happens on R-Car Gen3 environment.
To avoid it, this patch changes the pm_runtime call timing from
probe/remove to udc_start/udc_stop.
Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
f_mass_storage has a memorry barrier issue with the sleep and wake
functions that can cause a deadlock. This results in intermittent hangs
during MSC file transfer. The host will reset the device after receiving
no response to resume the transfer. This issue is seen when dwc3 is
processing 2 transfer-in-progress events at the same time, invoking
completion handlers for CSW and CBW. Also this issue occurs depending on
the system timing and latency.
To increase the chance to hit this issue, you can force dwc3 driver to
wait and process those 2 events at once by adding a small delay (~100us)
in dwc3_check_event_buf() whenever the request is for CSW and read the
event count again. Avoid debugging with printk and ftrace as extra
delays and memory barrier will mask this issue.
Scenario which can lead to failure:
-----------------------------------
1) The main thread sleeps and waits for the next command in
get_next_command().
2) bulk_in_complete() wakes up main thread for CSW.
3) bulk_out_complete() tries to wake up the running main thread for CBW.
4) thread_wakeup_needed is not loaded with correct value in
sleep_thread().
5) Main thread goes to sleep again.
The pattern is shown below. Note the 2 critical variables.
* common->thread_wakeup_needed
* bh->state
CPU 0 (sleep_thread) CPU 1 (wakeup_thread)
============================== ===============================
bh->state = BH_STATE_FULL;
smp_wmb();
thread_wakeup_needed = 0; thread_wakeup_needed = 1;
smp_rmb();
if (bh->state != BH_STATE_FULL)
sleep again ...
As pointed out by Alan Stern, this is an R-pattern issue. The issue can
be seen when there are two wakeups in quick succession. The
thread_wakeup_needed can be overwritten in sleep_thread, and the read of
the bh->state maybe reordered before the write to thread_wakeup_needed.
This patch applies full memory barrier smp_mb() in both sleep_thread()
and wakeup_thread() to ensure the order which the thread_wakeup_needed
and bh->state are written and loaded.
However, a better solution in the future would be to use wait_queue
method that takes care of managing memory barrier between waker and
waiter.
Cc: <stable@vger.kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
USB support in the Meson8 SoCs is provided by a DWC2 controller which
works with the same settings as Meson8b and GXBB. Using the generic
"snps,dwc2" binding results in an endless stream of "Overcurrent change
detected" messages.
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
When removing a device with less than 9 IRQs (AMBA_NR_IRQS), we'll get a
big WARN_ON from devres.c because pl330_remove calls devm_free_irqs for
unallocated irqs. Similarly to pl330_probe, check that IRQ number is
present before calling devm_free_irq.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Commit 4e552c8cb5 ("leds: add LED_ON brightness as boolean value")
has introduced the LED_ON enumeration value that can be used
instead of LED_FULL which has more of a linear value.
Because the tm2-touchscreen doesn't have brightness levels, but
it's a simple on/off led, use LED_ON instead of LED_FULL.
Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Reviewed-by: Jaechul Lee <jcsing.lee@samsung.com>
Tested-by: Jaechul Lee <jcsing.lee@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
DP sink specific quirks
* tag 'topic/dp-quirks-2017-05-31' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Detect USB-C specific dongles before reducing M and N
drm/dp: start a DPCD based DP sink/branch device quirk database
drm/i915: use drm DP helper to read DPCD desc
drm/dp: add helper for reading DP sink/branch device desc from DPCD
A fix to MAINTAINERS file adding relevant device-tree
files to mach-davinci entry.
* tag 'davinci-fixes-for-v4.12-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
MAINTAINERS: add device-tree files to TI DaVinci entry
Signed-off-by: Olof Johansson <olof@lixom.net>
mvebu non critical fixes for 4.12
Update MAINTAINER file for irqchip related drivers to Marvell EBU
* tag 'mvebu-fixes-non-critical-4.12-1' of git://git.infradead.org/linux-mvebu:
MAINTAINERS: add irqchip related drivers to Marvell EBU maintainers
MAINTAINERS: sort F entries for Marvell EBU maintainers
Signed-off-by: Olof Johansson <olof@lixom.net>
mvebu fixes for 4.12
Fix the interrupt description of the crypto node for device tree of
the Armada 7K/8K SoCs
* tag 'mvebu-fixes-4.12-1' of git://git.infradead.org/linux-mvebu: (316 commits)
arm64: marvell: dts: fix interrupts in 7k/8k crypto nodes
+ Linux 4.12-rc2
Signed-off-by: Olof Johansson <olof@lixom.net>
The STMicroelectronics dedicated mailing list kernel@stlinux.com
is no more available, remove it to avoid bouncing mails.
Several request to create a new mailing list has been send by Benjamin
Gaignard and me but without any answers.
Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Reset controller fixes for v4.12
- Set hi6220_reset driver module license to GPL v2 to fix module loading.
* tag 'reset-fixes-for-4.12' of git://git.pengutronix.de/git/pza/linux:
reset: hi6220: Set module license so that it can be loaded
Signed-off-by: Olof Johansson <olof@lixom.net>
Fixes for 4.12:
Fix two compilation issues
* tag 'at91-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
ARM: at91: select CONFIG_ARM_CPU_SUSPEND
memory: atmel-ebi: mark PM ops as __maybe_unused
Signed-off-by: Olof Johansson <olof@lixom.net>
Most of DT files in ARM use #include "..." to make pre-processor
include DT in the same directory, but this is one of the exceptional
files that use #include <...> for that.
Fix it to remove -I$(srctree)/arch/$(SRCARCH)/boot/dts path from
dtc_cpp_flags.
ARM: dts: versatile: use #include "..." to include DT in the same directory
Most of DT files in ARM use #include "..." to make pre-processor
include DT in the same directory, but we have 3 exceptional files
that use #include <...> for that.
They must be fixed to remove -I$(srctree)/arch/$(SRCARCH)/boot/dts
path from dtc_cpp_flags.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull nfsd fixes from Bruce Fields:
"Revert patch accidentally included in the merge window pull request,
and fix a crash that was likely a result of buggy client behavior"
* tag 'nfsd-4.12-1' of git://linux-nfs.org/~bfields/linux:
nfsd4: fix null dereference on replay
nfsd: Revert "nfsd: check for oversized NFSv2/v3 arguments"
Pull gcc-plugin prepwork from Kees Cook:
"Use designated initializers for mtk-vcodec, powerplay, amdgpu, and
sgi-xp. Use ERR_CAST() to avoid cross-structure cast in ocf2, ntfs,
and NFS.
Christoph Hellwig recommended that I send these fixes now, rather than
waiting for the v4.13 merge window. These are all initializer and cast
fixes needed for the future randstruct plugin that haven't been picked
up by the respective maintainers"
* tag 'gcc-plugins-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
mtk-vcodec: Use designated initializers
drm/amd/powerplay: Use designated initializers
drm/amdgpu: Use designated initializers
sgi-xp: Use designated initializers
ocfs2: Use ERR_CAST() to avoid cross-structure cast
ntfs: Use ERR_CAST() to avoid cross-structure cast
NFS: Use ERR_CAST() to avoid cross-structure cast
Commit 9fdca4da4d (IB/SA: Split struct sa_path_rec based on IB and
ROCE specific fields) moved the service_id to be specific attribute
for IB and OPA SA Path Record, and thus wasn't assigned for RoCE.
This caused to the following kernel panic in the CMA request handler flow:
[ 27.074594] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 27.074731] IP: __radix_tree_lookup+0x1d/0xe0
...
[ 27.075356] Workqueue: ib_cm cm_work_handler [ib_cm]
[ 27.075401] task: ffff88022e3b8000 task.stack: ffffc90001298000
[ 27.075449] RIP: 0010:__radix_tree_lookup+0x1d/0xe0
...
[ 27.075979] Call Trace:
[ 27.076015] radix_tree_lookup+0xd/0x10
[ 27.076055] cma_ps_find+0x59/0x70 [rdma_cm]
[ 27.076097] cma_id_from_event+0xd2/0x470 [rdma_cm]
[ 27.076144] ? ib_init_ah_from_path+0x39a/0x590 [ib_core]
[ 27.076193] cma_req_handler+0x25/0x480 [rdma_cm]
[ 27.076237] cm_process_work+0x25/0x120 [ib_cm]
[ 27.076280] ? cm_get_bth_pkey.isra.62+0x3c/0xa0 [ib_cm]
[ 27.076350] cm_req_handler+0xb03/0xd40 [ib_cm]
[ 27.076430] ? sched_clock_cpu+0x11/0xb0
[ 27.076478] cm_work_handler+0x194/0x1588 [ib_cm]
[ 27.076525] process_one_work+0x160/0x410
[ 27.076565] worker_thread+0x137/0x4a0
[ 27.076614] kthread+0x112/0x150
[ 27.076684] ? max_active_store+0x60/0x60
[ 27.077642] ? kthread_park+0x90/0x90
[ 27.078530] ret_from_fork+0x2c/0x40
This patch moves it back to the common SA Path Record structure
and removes the redundant setter and getter.
Tested on Connect-IB and Connect-X4 in Infiniband and RoCE respectively.
Fixes: 9fdca4da4d (IB/SA: Split struct sa_path_rec based on IB ands
ROCE specific fields)
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This change will optimize kernel memory deregistration operations.
__ib_umem_release() used to call set_page_dirty_lock() against every
writable page in its memory region. Its purpose is to keep data
synced between CPU and DMA device when swapping happens after mem
deregistration ops. Now we choose not to set page dirty bit if it's
already set by kernel prior to calling __ib_umem_release(). This
reduces memory deregistration time by half or even more when we ran
application simulation test program.
Signed-off-by: Qing Huang <qing.huang@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Commit 5752075144 ("IB/SA: Add OPA path record type") introduced
new local function __ib_copy_path_rec_to_user, but didn't limit its
scope. This produces the following sparse warning:
drivers/infiniband/core/uverbs_marshall.c:99:6: warning:
symbol '__ib_copy_path_rec_to_user' was not declared. Should it be
static?
In addition, it used sizeof ... notations instead of sizeof(...), which
is correct in C, but a little bit misleading. Let's change it too.
Fixes: 5752075144 ("IB/SA: Add OPA path record type")
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
RDMA netlink is part of ib_core, hence ibnl_chk_listeners(),
ibnl_init() and ibnl_cleanup() don't need to be published
in public header file.
Let's remove EXPORT_SYMBOL from ibnl_chk_listeners() and move all these
functions to private header file.
CC: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
If srp_init_qp() fails at srp_create_ch_ib() then ch->send_cq
may be NULL.
Calling directly to ib_destroy_qp() is sufficient because
no work requests were posted on the created qp.
Fixes: 9294000d6d ("IB/srp: Drain the send queue before destroying a QP")
Cc: <stable@vger.kernel.org>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>--
Signed-off-by: Doug Ledford <dledford@redhat.com>
ipoib_dev_uninit_default() call is used in ipoib_main.c file only
and it generates the following warning from smatch tool:
drivers/infiniband/ulp/ipoib/ipoib_main.c:1593:6: warning:
symbol 'ipoib_dev_uninit_default' was not declared. Should it
be static?
so let's declare that function as static.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
HW can implement UMR wqe re-transmission in various ways.
Thus, add HCA cap to distinguish the needed fence for UMR to make
sure that the wqe wouldn't fail on mkey checks.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The cited patch added a type field to structures ib_ah and rdma_ah_attr.
Function mlx4_ib_query_ah() builds an rdma_ah_attr structure from the
data in an mlx4_ib_ah structure (which contains both an ib_ah structure
and an address vector).
For mlx4_ib_query_ah() to work properly, the type field in the contained
ib_ah structure must be set correctly.
In the outgoing MAD tunneling flow, procedure mlx4_ib_multiplex_mad()
paravirtualizes a MAD received from a slave and sends the processed
mad out over the wire. During this processing, it populates an
mlx4_ib_ah structure and calls mlx4_ib_query_ah().
The cited commit overlooked setting the type field in the contained
ib_ah structure before invoking mlx4_ib_query_ah(). As a result, the
type field remained uninitialized, and the rdma_ah_attr structure was
incorrectly built. This resulted in improperly built MADs being sent out
over the wire.
This patch properly initializes the type field in the contained ib_ah
structure before calling mlx4_ib_query_ah(). The rdma_ah_attr structure
is then generated correctly.
Fixes: 44c58487d5 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The handling of IB_RDMA_WRITE_ONLY_WITH_IMMEDIATE will leak a memory
reference when a buffer cannot be allocated for returning the immediate
data.
The issue is that the rkey validation has already occurred and the RNR
nak fails to release the reference that was fruitlessly gotten. The
the peer will send the identical single packet request when its RNR
timer pops.
The fix is to release the held reference prior to the rnr nak exit.
This is the only sequence the requires both rkey validation and the
buffer allocation on the same packet.
Cc: Stable <stable@vger.kernel.org> # 4.7+
Tested-by: Tadeusz Struk <tadeusz.struk@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Keep VL15 credits at 0 during LNI, before link-up. Store
VL15 credits value during verify cap interrupt and set
in after link-up. This addresses an issue where VL15 MAD
packets could be sent by one side of the link before
the other side is ready to receive them.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The Omni-Path adapter driver fails to load on the ppc64le platform
due to invalid PCI setup.
This patch makes the PCI configuration more robust and will
fix 64 bit addressing for ppc64le.
Signed-off-by: Steven L Roberts <robers97@gmail.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Take care of ipv6 checks while computing header length for deducing mtu
size of ipv6 servers. Due to the incorrect header length computation for
ipv6 servers, wrong mss is reported to the peer (client).
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The patch 761e19a504 (RDMA/iw_cxgb4: Handle return value of
c4iw_ofld_send() in abort_arp_failure()) from May 6, 2016
leads to the following static checker warning:
drivers/infiniband/hw/cxgb4/cm.c:575 abort_arp_failure()
warn: passing freed memory 'skb'
Also fixes skb leak when l2t resolution fails
Fixes: 761e19a504 (RDMA/iw_cxgb4: Handle return value of
c4iw_ofld_send() in abort_arp_failure())
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Explicitly ACK the MPA Reply frame so the peer
does not retransmit the frame.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Don't set control flag for 0-length FULPDU (Send) RTR indication
in the enhanced MPA Request/Reply frames, because it isn't supported.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Some error paths in i40iw_initialize_dev are doing
additional and unnecessary work before exiting.
Correctly free resources allocated prior to error
and return with correct status code.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intelcom>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Don't set control flag for 0-length FULPDU (Send)
RTR indication in the enhanced MPA Request/Reply
frames, because it isn't supported.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In the commit enabling per-CPU station statistics, I inadvertedly
copy-pasted some code to update rx_packets and forgot to change it
to update rx_dropped_misc. Fix that.
This addresses https://bugzilla.kernel.org/show_bug.cgi?id=195953.
Fixes: c9c5962b56 ("mac80211: enable collecting station statistics per-CPU")
Reported-by: Petru-Florin Mihancea <petrum@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Leonard Crestez says:
====================
ARM: imx6ul-14x14-evk: Fix suspend over nfs by phy
Right now attempting doing suspend/resume while root is mounted over NFS
hangs on imx6ul-14x14-evk. This is happening because ksz8081 phy fixups are
lost on resume.
Fix this by using equivalent devicetree properties instead of a phy fixup
and handling those properties on resume in the micrel driver.
In theory it might now be possible to remove the phy fixup from mach-imx6ul
entirely but it is possible that this would break other imx6ul boards which
use the same phy. The solution would be to patch their dts but it's not
clear how to identify affected boards.
This code is shared with imx6ull-14x14-evk but 6ull suspend needs an
unrelated patch: https://lkml.org/lkml/2017/5/30/584
This is something of a corner case so there is no CC: stable.
Changes since v1: https://lkml.org/lkml/2017/5/30/672
* Split a kszphy_config_reset function for stuff shared between
config_init and resume. Calling config_init directly could be an option but
on some HW variants it does extra stuff like parsing devicetree options.
That would not be appropriate for resume code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
These bits seem to be lost after a suspend/resume cycle so just set them
again. Do this by splitting the handling of these bits into a function
that is also called on resume.
This patch fixes ethernet suspend/resume on imx6ul-14x14-evk boards.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now mach-imx6ul registers a fixup for the ksz8081 phy. The same
register values can be set through the micrel phy driver by using dts
properties.
This seems preferable and allows cleanly fixing suspend/resume.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver may sleep under a read spin lock, and the function call path is:
send_socklist (acquire the lock by read_lock)
skb_copy(GFP_KERNEL) --> may sleep
To fix it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 0c1d70af92 ("net: use dst_cache for vxlan device"),
cached dst entries could be leaked when more than one remote was
present for a given vxlan_fdb entry, causing subsequent netns
operations to block indefinitely and "unregister_netdevice: waiting
for lo to become free." messages to appear in the kernel log.
Fix by properly releasing cached dst and freeing resources in this
case.
Fixes: 0c1d70af92 ("net: use dst_cache for vxlan device")
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
From Boris:
"""
This pull request contains several fixes to the core and the tango
driver.
tango fixes:
* Add missing MODULE_DEVICE_TABLE() in tango_nand.c
* Update the number of corrected bitflips
core fixes:
* Fix a long standing memory leak in nand_scan_tail()
* Fix several bugs introduced by the per-vendor init/detection
infrastructure (introduced in 4.12)
* Add a static specifier to nand_ooblayout_lp_hamming_ops definition
"""
Pull KVM fixes from Paolo Bonzini:
"Many small x86 bug fixes: SVM segment registers access rights, nested
VMX, preempt notifiers, LAPIC virtual wire mode, NMI injection"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix nmi injection failure when vcpu got blocked
KVM: SVM: do not zero out segment attributes if segment is unusable or not present
KVM: SVM: ignore type when setting segment registers
KVM: nVMX: fix nested_vmx_check_vmptr failure paths under debugging
KVM: x86: Fix virtual wire mode
KVM: nVMX: Fix handling of lmsw instruction
KVM: X86: Fix preempt the preemption timer cancel
Pull Reiserfs and GFS2 fixes from Jan Kara:
"Fixes to GFS2 & Reiserfs for the fallout of the recent WRITE_FUA
cleanup from Christoph.
Fixes for other filesystems were already merged by respective
maintainers."
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
reiserfs: Make flush bios explicitely sync
gfs2: Make flush bios explicitely sync
Pull SCSI target fixes from Nicholas Bellinger:
"Here are the target-pending fixes for v4.12-rc4:
- ibmviscsis ABORT_TASK handling fixes that missed the v4.12 merge
window. (Bryant Ly and Michael Cyr)
- Re-add a target-core check enforcing WRITE overflow reject that was
relaxed in v4.3, to avoid unsupported iscsi-target immediate data
overflow. (nab)
- Fix a target-core-user OOPs during device removal. (MNC + Bryant
Ly)
- Fix a long standing iscsi-target potential issue where kthread exit
did not wait for kthread_should_stop(). (Jiang Yi)
- Fix a iscsi-target v3.12.y regression OOPs involving initial login
PDU processing during asynchronous TCP connection close. (MNC +
nab)
This is a little larger than usual for an -rc4, primarily due to the
iscsi-target v3.12.y regression OOPs bug-fix.
However, it's an important patch as MNC + Hannes where both able to
trigger it using a reduced iscsi initiator login timeout combined with
a backend taking a long time to complete I/Os during iscsi login
driven session reinstatement"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iscsi-target: Always wait for kthread_should_stop() before kthread exit
iscsi-target: Fix initial login PDU asynchronous socket close OOPs
tcmu: fix crash during device removal
target: Re-add check to reject control WRITEs with overflow data
ibmvscsis: Fix the incorrect req_lim_delta
ibmvscsis: Clear left-over abort_cmd pointers
arch/sparc/kernel/ds.c: In function ‘register_services’:
arch/sparc/kernel/ds.c:912:3: error: ‘strcpy’: writing at least 1 byte
into a region of size 0 overflows the destination
Reported-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the transition of NO_STP -> KERNEL_STP was fixed by always calling
mod_timer in br_stp_start, it introduced a new regression which causes
the timer to be armed even when the bridge is down, and since we stop
the timers in its ndo_stop() function, they never get disabled if the
device is destroyed before it's upped.
To reproduce:
$ while :; do ip l add br0 type bridge hello_time 100; brctl stp br0 on;
ip l del br0; done;
CC: Xin Long <lucien.xin@gmail.com>
CC: Ivan Vecera <cera@cera.cz>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: 6d18c732b9 ("bridge: start hello_timer when enabling KERNEL_STP in br_stp_start")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Apparently multi-cos isn't working for bnx2x quite some time -
driver implements ndo_select_queue() to allow queue-selection
for FCoE, but the regular L2 flow would cause it to modulo the
fallback's result by the number of queues.
The fallback would return a queue matching the needed tc
[via __skb_tx_hash()], but since the modulo is by the number of TSS
queues where number of TCs is not accounted, transmission would always
be done by a queue configured into using TC0.
Fixes: ada7c19e6d ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change t4fw_version.h to update latest firmware version
number to 1.16.45.0.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NETLINK_F_LISTEN_ALL_NSID otion enables to listen all netns that have a
nsid assigned into the netns where the netlink socket is opened.
The nsid is sent as metadata to userland, but the existence of this nsid is
checked only for netns that are different from the socket netns. Thus, if
no nsid is assigned to the socket netns, NETNSA_NSID_NOT_ASSIGNED is
reported to the userland. This value is confusing and useless.
After this patch, only valid nsid are sent to userland.
Reported-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver may sleep under a write spin lock, and the function
call path is:
qlcnic_82xx_hw_write_wx_2M (acquire the lock by write_lock_irqsave)
crb_win_lock
qlcnic_pcie_sem_lock
usleep_range
qlcnic_82xx_hw_read_wx_2M (acquire the lock by write_lock_irqsave)
crb_win_lock
qlcnic_pcie_sem_lock
usleep_range
To fix it, the usleep_range is replaced with udelay.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we have to recover relocation during mount, we'll ultimately have to
evict the orphan inode. That goes through the reservation dance, where
priority_reclaim_metadata_space and flush_space expect fs_info->fs_root
to be valid. That's the next thing to be set up during mount, so we
crash, almost always in flush_space trying to join the transaction
but priority_reclaim_metadata_space is possible as well. This call
path has been problematic in the past WRT whether ->fs_root is valid
yet. Commit 957780eb27 (Btrfs: introduce ticketed enospc
infrastructure) added new users that are called in the direct path
instead of the async path that had already been worked around.
The thing is that we don't actually need the fs_root, specifically, for
anything. We either use it to determine whether the root is the
chunk_root for use in choosing an allocation profile or as a root to pass
btrfs_join_transaction before immediately committing it. Anything that
isn't the chunk root works in the former case and any root works in
the latter.
A simple fix is to use a root we know will always be there: the
extent_root.
Cc: <stable@vger.kernel.org> # v4.8+
Fixes: 957780eb27 (Btrfs: introduce ticketed enospc infrastructure)
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Variables start_idx and end_idx are supposed to hold a page index
derived from the file offsets. The int type is not the right one though,
offsets larger than 1 << 44 will get silently trimmed off the high bits.
(1 << 44 is 16TiB)
What can go wrong, if start is below the boundary and end gets trimmed:
- if there's a page after start, we'll find it (radix_tree_gang_lookup_slot)
- the final check "if (page->index <= end_idx)" will unexpectedly fail
The function will return false, ie. "there's no page in the range",
although there is at least one.
btrfs_page_exists_in_range is used to prevent races in:
* in hole punching, where we make sure there are not pages in the
truncated range, otherwise we'll wait for them to finish and redo
truncation, but we're going to replace the pages with holes anyway so
the only problem is the intermediate state
* lock_extent_direct: we want to make sure there are no pages before we
lock and start DIO, to prevent stale data reads
For practical occurence of the bug, there are several constaints. The
file must be quite large, the affected range must cross the 16TiB
boundary and the internal state of the file pages and pending operations
must match. Also, we must not have started any ordered data in the
range, otherwise we don't even reach the buggy function check.
DIO locking tries hard in several places to avoid deadlocks with
buffered IO and avoids waiting for ranges. The worst consequence seems
to be stale data read.
CC: Liu Bo <bo.li.liu@oracle.com>
CC: stable@vger.kernel.org # 3.16+
Fixes: fc4adbff82 ("btrfs: Drop EXTENT_UPTODATE check in hole punching and direct locking")
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The s390 architecture maps sys_mmap (nr 90) into sys_old_mmap. For this
reason perf trace can't find the proper syscall event to get args format
from and displays it wrongly as 'continued'.
To fix that fill the "alias" field with "old_mmap" for trace's mmap record
to get the correct translation.
Before:
0.042 ( 0.011 ms): vest/43052 fstat(statbuf: 0x3ffff89fd90 ) = 0
0.042 ( 0.028 ms): vest/43052 ... [continued]: mmap()) = 0x3fffd6e2000
0.072 ( 0.025 ms): vest/43052 read(buf: 0x3fffd6e2000, count: 4096 ) = 6
After:
0.045 ( 0.011 ms): fstat(statbuf: 0x3ffff8a0930 ) = 0
0.057 ( 0.018 ms): mmap(arg: 0x3ffff8a0858 ) = 0x3fffd14a000
0.076 ( 0.025 ms): read(buf: 0x3fffd14a000, count: 4096 ) = 6
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170531113557.19175-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We are running low on CPU feature bits, so we only want to use them when
it's really necessary.
CPU_FTR_SUBCORE is only used in one place, and only in C, so we don't
need it in order to make asm patching work. It can only be set on
"Power8" CPUs, which in practice means POWER8, POWER8E and POWER8NVL.
There are no plans to implement it on future CPUs, but if there ever
were we could retrofit it then.
Although KVM uses subcores, it never looks at the CPU feature, it either
looks at the ISA level or the threads_per_subcore value.
So drop the CPU feature and do a PVR check instead. Drop the device tree
"subcore" feature as we no longer support doing anything with it, and we
will drop it from skiboot too.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When adding or removing memory, the aa_index (affinity value) for the
memblock must also be converted to match the endianness of the rest
of the 'ibm,dynamic-memory' property. Otherwise, subsequent retrieval
of the attribute will likely lead to non-existent nodes, followed by
using the default node in the code inappropriately.
Fixes: 5f97b2a0d1 ("powerpc/pseries: Implement memory hotplug add in the kernel")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
If a process dumps core while it has SPU contexts active then we have
code to also dump information about the SPU contexts.
Unfortunately it's been broken for 3 1/2 years, and we didn't notice. In
commit 7b1f4020d0 ("spufs: get rid of dump_emit() wrappers") the nread
variable was removed and rc used instead. That means when the loop exits
successfully, rc has the number of bytes read, but it's then used as the
return value for the function, which should return 0 on success.
So fix it by setting rc = 0 before returning in the success case.
Fixes: 7b1f4020d0 ("spufs: get rid of dump_emit() wrappers")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Provide a dt_cpu_ftrs= cmdline option to disable the dt_cpu_ftrs CPU
feature discovery, and fall back to the "cputable" based version.
Also allow control of advertising unknown features to userspace and
with this parameter, and remove the clunky CONFIG option.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add explicit early check of bootargs in dt_cpu_ftrs_init()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
drivers/phy/qualcomm/phy-qcom-qmp.c:847:37-43: ERROR: application of sizeof to pointer
sizeof when applied to a pointer typed expression gives the size of
the pointer
Generated by: scripts/coccinelle/misc/noderef.cocci
CC: Vivek Gautam <vivek.gautam@codeaurora.org>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
When spin_lock_irqsave() deadlock occurs inside the guest, vcpu threads,
other than the lock-holding one, would enter into S state because of
pvspinlock. Then inject NMI via libvirt API "inject-nmi", the NMI could
not be injected into vm.
The reason is:
1 It sets nmi_queued to 1 when calling ioctl KVM_NMI in qemu, and sets
cpu->kvm_vcpu_dirty to true in do_inject_external_nmi() meanwhile.
2 It sets nmi_queued to 0 in process_nmi(), before entering guest, because
cpu->kvm_vcpu_dirty is true.
It's not enough just to check nmi_queued to decide whether to stay in
vcpu_block() or not. NMI should be injected immediately at any situation.
Add checking nmi_pending, and testing KVM_REQ_NMI replaces nmi_queued
in vm_vcpu_has_events().
Do the same change for SMIs.
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a fix for the problem [1], where VMCB.CPL was set to 0 and interrupt
was taken on userspace stack. The root cause lies in the specific AMD CPU
behaviour which manifests itself as unusable segment attributes on SYSRET.
The corresponding work around for the kernel is the following:
61f01dd941 ("x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue")
In other turn virtualization side treated unusable segment incorrectly and
restored CPL from SS attributes, which were zeroed out few lines above.
In current patch it is assured only that P bit is cleared in VMCB.save state
and segment attributes are not zeroed out if segment is not presented or is
unusable, therefore CPL can be safely restored from DPL field.
This is only one part of the fix, since QEMU side should be fixed accordingly
not to zero out attributes on its side. Corresponding patch will follow.
[1] Message id: CAJrWOzD6Xq==b-zYCDdFLgSRMPM-NkNuTSDFEtX=7MreT45i7Q@mail.gmail.com
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Mikhail Sennikovskii <mikhail.sennikovskii@profitbricks.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim KrÄmář <rkrcmar@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For SDIO the alignment requirement for transfers from device to host
is configured in firmware. This configuration is limited to minimum
of 4-byte alignment. However, this is not correct for platforms using
64-bit DMA when the minimum alignment should be 8 bytes. This issue
appeared when the ALIGNMENT definition was set according the DMA
configuration. The configuration in firmware was not using that macro
defintion, but a hardcoded value of 4. Hence the driver reported
alignment failures for data coming from the device and causing
transfers to fail.
Fixes: 6e84ab604b ("brcmfmac: properly align buffers on certain platforms
Reported-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The previous commit [63691587f7: ALSA: hda - Apply dual-codec quirk
for MSI Z270-Gaming mobo] attempted to apply the existing dual-codec
quirk for a MSI mobo. But it turned out that this isn't applied
properly due to the MSI-vendor quirk before this entry. I overlooked
such two MSI entries just because they were put in the wrong position,
although we have a list ordered by PCI SSID numbers.
This patch fixes it by rearranging the unordered entries.
Fixes: 63691587f7 ("ALSA: hda - Apply dual-codec quirk for MSI Z270-Gaming mobo")
Reported-by: Rudolf Schmidt <info@rudolfschmidt.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull drm fixes from Dave Airlie:
"This is the main set of fixes for rc4, one amdgpu fix, some exynos
regression fixes, some msm fixes and some i915 and GVT fixes.
I've got a second regression fix for some DP chips that might be a
bit large, but I think we'd like to land it now, I'll send it along
tomorrow, once you are happy with this set"
* tag 'drm-fixes-for-v4.12-rc4' of git://people.freedesktop.org/~airlied/linux: (24 commits)
drm/amdgpu: Program ring for vce instance 1 at its register space
drm/exynos: clean up description of exynos_drm_crtc
drm/exynos: dsi: Remove bridge node reference in removal
drm/exynos: dsi: Fix the parse_dt function
drm/exynos: Merge pre/postclose hooks
drm/msm: Fix the check for the command size
drm/msm: Take the mutex before calling msm_gem_new_impl
drm/msm: for array in-fences, check if all backing fences are from our own context before waiting
drm/msm: constify irq_domain_ops
drm/msm/mdp5: release hwpipe(s) for unused planes
drm/msm: Reuse dma_fence_release.
drm/msm: Expose our reservation object when exporting a dmabuf.
drm/msm/gpu: check legacy clk names in get_clocks()
drm/msm/mdp5: use __drm_atomic_helper_plane_duplicate_state()
drm/msm: select PM_OPP
drm/i915: Stop pretending to mask/unmask LPE audio interrupts
drm/i915/selftests: Silence compiler warning in igt_ctx_exec
Revert "drm/i915: Restore lost "Initialized i915" welcome message"
drm/i915/gvt: clean up unsubmited workloads before destroying kmem cache
drm/i915/gvt: Disable compression workaround for Gen9
...
It was not possible to enable both T10 PI and TPGS because they share
the same byte in the INQUIRY response. Logically OR the TPGS value
instead of using assignment.
Reported-by: Ritika Srivastava <ritika.srivastava@oracle.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hung task timeouts can result if a qlogic board breaks unexpectedly
while running I/O. These tasks become hung because command srb reference
counts are not going to zero, hence the affected srbs and commands do
not get freed. This fix accounts for this extra reference in the srbs in
the case of a board failure.
Fixes: a465537ad1 ("qla2xxx: Disable the adapter and skip error recovery in case of register disconnect")
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Null check at line 966: if (ndlp) {, implies that ndlp might be NULL.
Functions lpfc_nlp_set_state() and lpfc_issue_els_prli() dereference
pointer ndlp. Include these function calls inside the IF block that
tests pointer ndlp.
Addresses-Coverity-ID: 1401856
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We might have a NULL pring in lpfc_els_abort(), for example on error
recovery path, since queues are destroyed during error recovery
mechanism.
In this case, we should just drop the abort since the queues will be
recreated anyway. This patch just verifies for NULL pointer and stop the
abortion of the queue in case of a NULL pring.
Also, this patch converts return type of lpfc_els_abort() from int to
void, since it's not checked anywhere.
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Tested-by: Raphael Silva <raphasil@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The lpfc_nvmeio_data() tracing helper always takes a format string and
three additional arguments. The latest caller has a format string with
only two integer arguments, causing this harmless warning:
drivers/scsi/lpfc/lpfc_nvmet.c: In function 'lpfc_nvmet_xmt_fcp_release':
drivers/scsi/lpfc/lpfc_nvmet.c:802:25: error: too many arguments for format [-Werror=format-extra-args]
lpfc_nvmeio_data(phba, "NVMET FCP FREE: xri x%x ste %d\n", ctxp->oxid,
We could add a dummy argument here, but it seems reasonable to print
the 'abort' flag as the third argument.
Fixes: 19b58d9473 ("nvmet_fc: add req_release to lldd api")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
- Fix a regression to description of exynos_drm_crtc
- Remove preclose hook of Exynos
. This was a exynos change of the patch series[1] merged already.
- Fix one dt broken issue
- Make sure to release bridge_node of Exynos MIPI-DSI driver.
[1] https://lists.freedesktop.org/archives/dri-devel/2017-March/135111.html
* tag 'exynos-drm-fixes-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm/exynos: clean up description of exynos_drm_crtc
drm/exynos: dsi: Remove bridge node reference in removal
drm/exynos: dsi: Fix the parse_dt function
drm/exynos: Merge pre/postclose hooks
a few fixes for 4.12..
* 'msm-fixes-4.12-rc4' of git://people.freedesktop.org/~robclark/linux:
drm/msm: Fix the check for the command size
drm/msm: Take the mutex before calling msm_gem_new_impl
drm/msm: for array in-fences, check if all backing fences are from our own context before waiting
drm/msm: constify irq_domain_ops
drm/msm/mdp5: release hwpipe(s) for unused planes
drm/msm: Reuse dma_fence_release.
drm/msm: Expose our reservation object when exporting a dmabuf.
drm/msm/gpu: check legacy clk names in get_clocks()
drm/msm/mdp5: use __drm_atomic_helper_plane_duplicate_state()
drm/msm: select PM_OPP
drm/i915 fixes for v4.12-rc4
* tag 'drm-intel-fixes-2017-05-29' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Stop pretending to mask/unmask LPE audio interrupts
drm/i915/selftests: Silence compiler warning in igt_ctx_exec
Revert "drm/i915: Restore lost "Initialized i915" welcome message"
drm/i915/gvt: clean up unsubmited workloads before destroying kmem cache
drm/i915/gvt: Disable compression workaround for Gen9
drm/i915: set initialised only when init_context callback is NULL
drm/i915: Fix new -Wint-in-bool-context gcc compiler warning
drm/i915: use vma->size for appgtt allocate_va_range
drm/i915: Do not sync RCU during shrinking
There are three timing problems in the kthread usages of iscsi_target_mod:
- np_thread of struct iscsi_np
- rx_thread and tx_thread of struct iscsi_conn
In iscsit_close_connection(), it calls
send_sig(SIGINT, conn->tx_thread, 1);
kthread_stop(conn->tx_thread);
In conn->tx_thread, which is iscsi_target_tx_thread(), when it receive
SIGINT the kthread will exit without checking the return value of
kthread_should_stop().
So if iscsi_target_tx_thread() exit right between send_sig(SIGINT...)
and kthread_stop(...), the kthread_stop() will try to stop an already
stopped kthread.
This is invalid according to the documentation of kthread_stop().
(Fix -ECONNRESET logout handling in iscsi_target_tx_thread and
early iscsi_target_rx_thread failure case - nab)
Signed-off-by: Jiang Yi <jiangyilism@gmail.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a OOPs originally introduced by:
commit bb048357da
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Thu Sep 5 14:54:04 2013 -0700
iscsi-target: Add sk->sk_state_change to cleanup after TCP failure
which would trigger a NULL pointer dereference when a TCP connection
was closed asynchronously via iscsi_target_sk_state_change(), but only
when the initial PDU processing in iscsi_target_do_login() from iscsi_np
process context was blocked waiting for backend I/O to complete.
To address this issue, this patch makes the following changes.
First, it introduces some common helper functions used for checking
socket closing state, checking login_flags, and atomically checking
socket closing state + setting login_flags.
Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP
connection has dropped via iscsi_target_sk_state_change(), but the
initial PDU processing within iscsi_target_do_login() in iscsi_np
context is still running. For this case, it sets LOGIN_FLAGS_CLOSED,
but doesn't invoke schedule_delayed_work().
The original NULL pointer dereference case reported by MNC is now handled
by iscsi_target_do_login() doing a iscsi_target_sk_check_close() before
transitioning to FFP to determine when the socket has already closed,
or iscsi_target_start_negotiation() if the login needs to exchange
more PDUs (eg: iscsi_target_do_login returned 0) but the socket has
closed. For both of these cases, the cleanup up of remaining connection
resources will occur in iscsi_target_start_negotiation() from iscsi_np
process context once the failure is detected.
Finally, to handle to case where iscsi_target_sk_state_change() is
called after the initial PDU procesing is complete, it now invokes
conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once
existing iscsi_target_sk_check_close() checks detect connection failure.
For this case, the cleanup of remaining connection resources will occur
in iscsi_target_do_login_rx() from delayed workqueue process context
once the failure is detected.
Reported-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Reported-by: Hannes Reinecke <hare@suse.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Varun Prakash <varun@chelsio.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
recent fixes to use WRITE_ONCE for nh_flags on link up,
accidently ended up leaving the deadflags on a nh. This patch
fixes the WRITE_ONCE to use freshly evaluated nh_flags.
Fixes: 39eb8cd175 ("net: mpls: rt_nhn_alive and nh_flags should be accessed using READ_ONCE")
Reported-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver may sleep under a spin lock, the function call path is:
isdn_ppp_mp_receive (acquire the lock)
isdn_ppp_mp_reassembly
isdn_ppp_push_higher
isdn_ppp_decompress
isdn_ppp_ccp_reset_trans
isdn_ppp_ccp_reset_alloc_state
kzalloc(GFP_KERNEL) --> may sleep
To fixed it, the "GFP_KERNEL" is replaced with "GFP_ATOMIC".
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ata_parse_force_one() was incorrectly comparing @p to @endp when it
should have been comparing @id. The only consequence is that it may
end up using an invalid port number in "libata.force" module param
instead of rejecting it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Petru-Florin Mihancea <petrum@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=195785
Add NULL check before dereferencing pointer _id_ in order to avoid
a potential NULL pointer dereference.
Addresses-Coverity-ID: 1397995
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Auto-loading of the Marvell DSA driver has stopped working with recent
kernels. This seems to be due to the change of binding for DSA devices,
moving them from the platform bus to the MDIO bus.
In order for module auto-loading to work, we need to provide a MODALIAS
string in the uevent file for the device. However, the device core does
not automatically provide this, and needs each bus_type to implement a
uevent method to generate these strings. The MDIO bus does not provide
such a method, so no MODALIAS string is provided:
.# cat /sys/bus/mdio_bus/devices/f1072004.mdio-mii\:04/uevent
DRIVER=mv88e6085
OF_NAME=switch
OF_FULLNAME=/soc/internal-regs/mdio@72004/switch@4
OF_COMPATIBLE_0=marvell,mv88e6085
OF_COMPATIBLE_N=1
In the case of OF-based devices, the solution is easy -
of_device_uevent_modalias() does the work for us. After this is done,
the uevent file looks like this:
.# cat /sys/bus/mdio_bus/devices/f1072004.mdio-mii\:04/uevent
DRIVER=mv88e6085
OF_NAME=switch
OF_FULLNAME=/soc/internal-regs/mdio@72004/switch@4
OF_COMPATIBLE_0=marvell,mv88e6085
OF_COMPATIBLE_N=1
MODALIAS=of:NswitchT<NULL>Cmarvell,mv88e6085
which results in auto-loading of the Marvell DSA driver on Clearfog
platforms.
Fixes: c0405563a6 ("ARM: dts: armada-388-clearfog: Utilize new DSA binding")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Marvell driver incorrectly provides phydev->lp_advertising as the
logical and of the link partner's advert and our advert. This is
incorrect - this field is supposed to store the link parter's unmodified
advertisment.
This allows ethtool to report the correct link partner auto-negotiation
status.
Fixes: be937f1f89 ("Marvell PHY m88e1111 driver fix")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need program ring buffer on instance 1 register space domain,
when only if instance 1 available, with two instances or instance 0,
and we need only program instance 0 regsiter space domain for ring.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MTU probing initialization occurred only at connect() and at SYN or
SYN-ACK reception, but the former sets MSS to either the default or the
user set value (through TCP_MAXSEG sockopt) and the latter never happens
with repaired sockets.
The result was that, with MTU probing enabled and unless TCP_MAXSEG
sockopt was used before connect(), probing would be stuck at
tcp_base_mss value until tcp_probe_interval seconds have passed.
Signed-off-by: Douglas Caetano dos Santos <douglascs@taghos.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
If you attempt a TCP mount from an host that is unreachable in a way
that triggers an immediate error from kernel_connect(), that error
does not propagate up, instead EAGAIN is reported.
This results in call_connect_status receiving the wrong error.
A case that it easy to demonstrate is to attempt to mount from an
address that results in ENETUNREACH, but first deleting any default
route.
Without this patch, the mount.nfs process is persistently runnable
and is hard to kill. With this patch it exits as it should.
The problem is caused by the fact that xs_tcp_force_close() eventually
calls
xprt_wake_pending_tasks(xprt, -EAGAIN);
which causes an error return of -EAGAIN. so when xs_tcp_setup_sock()
calls
xprt_wake_pending_tasks(xprt, status);
the status is ignored.
Fixes: 4efdd92c92 ("SUNRPC: Remove TCP client connection reset hack")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
CC: linux-raid@vger.kernel.org
CC: Shaohua Li <shli@kernel.org>
Fixes: b685d3d65a
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Shaohua Li <shli@fb.com>
Pull overlayfs fixes from Miklos Szeredi:
"Fix regressions:
- missing CONFIG_EXPORTFS dependency
- failure if upper fs doesn't support xattr
- bad error cleanup
This also adds the concept of "impure" directories complementing the
"origin" marking introduced in -rc1. Together they enable getting
consistent st_ino and d_ino for directory listings.
And there's a bug fix and a cleanup as well"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: filter trusted xattr for non-admin
ovl: mark upper merge dir with type origin entries "impure"
ovl: mark upper dir with type origin entries "impure"
ovl: remove unused arg from ovl_lookup_temp()
ovl: handle rename when upper doesn't support xattr
ovl: don't fail copy-up if upper doesn't support xattr
ovl: check on mount time if upper fs supports setting xattr
ovl: fix creds leak in copy up error path
ovl: select EXPORTFS
When adding a cfq_group into the cfq service tree, we use CFQ_IDLE_DELAY
as the delay of cfq_group's vdisktime if there have been other cfq_groups
already.
When cfq is under iops mode, commit 9a7f38c42c ("cfq-iosched: Convert
from jiffies to nanoseconds") could result in a large iops delay and
lead to an abnormal io schedule delay for the added cfq_group. To fix
it, we just need to revert to the old CFQ_IDLE_DELAY value: HZ / 5
when iops mode is enabled.
Despite having the same value, the delay of a cfq_queue in idle class
and the delay of cfq_group are different things, so I define two new
macros for the delay of a cfq_group under time-slice mode and iops mode.
Fixes: 9a7f38c42c ("cfq-iosched: Convert from jiffies to nanoseconds")
Cc: <stable@vger.kernel.org> # 4.8+
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
We've had user reports of unmount hangs in xfs_wait_buftarg() that
analysis shows is due to btp->bt_io_count == -1. bt_io_count
represents the count of in-flight asynchronous buffers and thus
should always be >= 0. xfs_wait_buftarg() waits for this value to
stabilize to zero in order to ensure that all untracked (with
respect to the lru) buffers have completed I/O processing before
unmount proceeds to tear down in-core data structures.
The value of -1 implies an I/O accounting decrement race. Indeed,
the fact that xfs_buf_ioacct_dec() is called from xfs_buf_rele()
(where the buffer lock is no longer held) means that bp->b_flags can
be updated from an unsafe context. While a user-level reproducer is
currently not available, some intrusive hacks to run racing buffer
lookups/ioacct/releases from multiple threads was used to
successfully manufacture this problem.
Existing callers do not expect to acquire the buffer lock from
xfs_buf_rele(). Therefore, we can not safely update ->b_flags from
this context. It turns out that we already have separate buffer
state bits and associated serialization for dealing with buffer LRU
state in the form of ->b_state and ->b_lock. Therefore, replace the
_XBF_IN_FLIGHT flag with a ->b_state variant, update the I/O
accounting wrappers appropriately and make sure they are used with
the correct locking. This ensures that buffer in-flight state can be
modified at buffer release time without racing with modifications
from a buffer lock holder.
Fixes: 9c7504aa72 ("xfs: track and serialize in-flight async buffers against unmount")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Libor Pechacek <lpechacek@suse.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Commit b685d3d65a ("block: treat REQ_FUA and REQ_PREFLUSH as
synchronous") removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions.
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
Fixes: b685d3d65a ("block: treat REQ_FUA and REQ_PREFLUSH as synchronous")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
In the conversion to drop drm_modeset_lock_all and the magic implicit
context I failed to realize that _resume starts out with a pile of
state copies, but not with the locks. And hence drm_atomic_commit
won't grab these for us.
v2: Add locking checks in helpers to make sure we catch this in the
future. Note we can only require the locks in the atomic_check phase,
not in the commit phase. But since any commit is guaranteed to first
run the checks (even for the resume stuff where we use stored
duplicated old state) this should give us full coverage. Requested by
Maarten.
Cc: Jyri Sarha <jsarha@ti.com>
Fixes: a5b8444e28 ("drm/atomic-helper: remove modeset_lock_all from helper_resume")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170531083813.1390-1-daniel.vetter@ffwll.ch
This is another attempt to work around the VLA used in
mixer_us16x08.c. Basically the temporary array is used individually
for two cases, and we can declare locally in each block, instead of
hackish max() usage.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This reverts commit 89b593c30e ("ALSA: usb-audio: purge needless
variable length array"). The patch turned out to cause a severe
regression, triggering an Oops at snd_usb_ctl_msg(). It was overseen
that snd_usb_ctl_msg() writes back the response to the given buffer,
while the patch changed it to a read-only const buffer. (One should
always double-check when an extra pointer cast is present...)
As a simple fix, just revert the affected commit. It was merely a
cleanup. Although it brings VLA again, it's clearer as a fix. We'll
address the VLA later in another patch.
Fixes: 89b593c30e ("ALSA: usb-audio: purge needless variable length array")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195875
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Force vop output mode on encoder driver seem not a good idea,
EDP, HDMI, DisplayPort all have 10bit input on rk3399,
On non-10bit vop, vop 8bit output bit[0-7] connect to the
encoder high 8bit [2-9].
So force RGB10 to RGB888 on vop driver would be better.
And another problem, EDP check crtc id on atomic_check,
but encoder maybe NULL, so out_mode configure would fail,
it cause edp no display.
Signed-off-by: Mark Yao <mark.yao@rock-chips.com>
Reviewed-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1495885416-22216-1-git-send-email-mark.yao@rock-chips.com
When the controller fails to provide an RPM reading within the alloted
time; the driver returns -ETIMEDOUT and no file contents.
Signed-off-by: Patrick Venture <venture@google.com>
Fixes: 2d7a548a3e ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The driver uses regmap and thus has to select it to avoid build
errors such as the following.
drivers/hwmon/aspeed-pwm-tacho.c:337:21: error: variable
'aspeed_pwm_tacho_regmap_config' has initializer but incomplete type
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Joel Stanley <joel@jms.id.au>
Fixes: 2d7a548a3e ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
This effectively reverts commit 8ee74a91ac ("proc: try to remove use
of FOLL_FORCE entirely")
It turns out that people do depend on FOLL_FORCE for the /proc/<pid>/mem
case, and we're talking not just debuggers. Talking to the affected people, the use-cases are:
Keno Fischer:
"We used these semantics as a hardening mechanism in the julia JIT. By
opening /proc/self/mem and using these semantics, we could avoid
needing RWX pages, or a dual mapping approach. We do have fallbacks to
these other methods (though getting EIO here actually causes an assert
in released versions - we'll updated that to make sure to take the
fall back in that case).
Nevertheless the /proc/self/mem approach was our favored approach
because it a) Required an attacker to be able to execute syscalls
which is a taller order than getting memory write and b) didn't double
the virtual address space requirements (as a dual mapping approach
would).
I think in general this feature is very useful for anybody who needs
to precisely control the execution of some other process. Various
debuggers (gdb/lldb/rr) certainly fall into that category, but there's
another class of such processes (wine, various emulators) which may
want to do that kind of thing.
Now, I suspect most of these will have the other process under ptrace
control, so maybe allowing (same_mm || ptraced) would be ok, but at
least for the sandbox/remote-jit use case, it would be perfectly
reasonable to not have the jit server be a ptracer"
Robert O'Callahan:
"We write to readonly code and data mappings via /proc/.../mem in lots
of different situations, particularly when we're adjusting program
state during replay to match the recorded execution.
Like Julia, we can add workarounds, but they could be expensive."
so not only do people use FOLL_FORCE for both reads and writes, but they
use it for both the local mm and remote mm.
With these comments in mind, we likely also cannot add the "are we
actively ptracing" check either, so this keeps the new code organization
and does not do a real revert that would add back the original comment
about "Maybe we should limit FOLL_FORCE to actual ptrace users?"
Reported-by: Keno Fischer <keno@juliacomputing.com>
Reported-by: Robert O'Callahan <robert@ocallahan.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The tagset lock needs to be held when iterating the tag_list, so a
lockdep assert was added when updating number of hardware queues. The
drivers calling this API, however, were unaware of the new requirement,
so are failing the assertion.
This patch takes the lock within the blk-mq function so the drivers do
not have to be modified in order to be safe.
Fixes: 705cda97e ("blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list")
Reported-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Tariq Toukan says:
====================
MAINTAINERS updates
This patchset contains updates to the MAINTAINERS file.
In the first patch, I replace Yishai as the maintainer of
the mlx4_core driver.
In the other two patches we move an RDMA header file from
the list of the mlx4/mlx5 core driver into the respective
IB driver, where it belongs.
Series generated against net commit:
468b0df61a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It belongs there, should not be under mlx5 Core driver.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It belongs there, should not be under mlx4 Core driver.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add myself as a maintainer for mlx4 core driver,
replacing Yishai Hadas.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Building the driver with CONFIG_SMP disabled results in a harmless
warning:
ethernet/mellanox/mlx5/core/main.c: In function 'mlx5_irq_set_affinity_hint':
ethernet/mellanox/mlx5/core/main.c:615:6: error: unused variable 'irq' [-Werror=unused-variable]
It's better to express the conditional compilation using IS_ENABLED()
here, as that lets the compiler see what the intented use for the variable
is, and that it can be silently discarded.
Fixes: b665d98edc ("net/mlx5: Tolerate irq_set_affinity_hint() failures")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'static' was not enough, the helpers must be 'static inline'
net/dsa/mv88e6xxx/global2.h:123:12: error: 'mv88e6xxx_g2_misc_4_bit_port' defined but not used [-Werror=unused-function]
net/dsa/mv88e6xxx/global2.h:117:12: error: 'mv88e6xxx_g2_pvt_write' defined but not used [-Werror=unused-function]
Fixes: c21fbe29f8 ("net: dsa: mv88e6xxx: Add missing static to stub functions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current implementation lacks the logic for providing management
firmware with RDMA-related statistics; [much] worse than that -
it logs such events by default to system logs.
Since the statistics' gathering is done periodically, using sufficiently
new management firmware the system logs would get filled with these
unnecessary prints.
For now, reduce the verbosity of the log so that it would not be
logged by default.
Fixes: 6c75424612 ("qed: Add support for NCSI statistics")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During PCI error recovery process, specifically on eeh_err_detected()
we might have a NULL netdev struct, hence a direct dereference will
lead to a kernel oops. This was observed with latest upstream kernel
(v4.12-rc2) on Chelsio adapter T422-CR in PowerPC machines.
This patch checks for NULL pointer and avoids the crash, both in
eeh_err_detected() and eeh_resume(). Also, we avoid to trigger
a fatal error or to try disabling interrupts on FW during PCI
error recovery, because: (a) driver might not be able to accurately
access PCI regions in this case, and (b) trigger a fatal error
_during_ the recovery steps is a mistake that could prevent the
recovery path to complete successfully.
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 19bca6ab75 ("KVM: SVM: Fix cross vendor migration issue with
unusable bit") added checking type when setting unusable.
So unusable can be set if present is 0 OR type is 0.
According to the AMD processor manual, long mode ignores the type value
in segment descriptor. And type can be 0 if it is read-only data segment.
Therefore type value is not related to unusable flag.
This patch is based on linux-next v4.12.0-rc3.
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_skip_emulated_instruction() will return 0 if userspace is
single-stepping the guest.
kvm_skip_emulated_instruction() uses return status convention of exit
handler: 0 means "exit to userspace" and 1 means "continue vm entries".
The problem is that nested_vmx_check_vmptr() return status means
something else: 0 is ok, 1 is error.
This means we would continue executing after a failure. Static checker
noticed it because vmptr was not initialized.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 6affcbedca ("KVM: x86: Add kvm_skip_emulated_instruction and use it.")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
nbd_config is allocated in nbd_alloc_config(), but never freed.
Fixes: 5ea8d10802 ("nbd: separate out the config information")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
There is nothing to clear -- nbd_device has just been allocated.
Fold nbd_reset() into its other caller, nbd_config_put().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
We saw perf IRQ init failures when running Linux kernel in an ACPI
guest without PMU (i.e. pmu=off). This is because perf IRQ is not
present when pmu=off, but arm_pmu_acpi still tries to register
or unregister GSI. This patch addresses the problem by checking
gicc->performance_interrupt. If it is 0, which is the value set
by qemu when pmu=off, we skip the IRQ register/unregister process.
[ 4.069470] bc00: 0000000000040b00 ffff0000089db190
[ 4.070267] [<ffff000008134f80>] enable_percpu_irq+0xdc/0xe4
[ 4.071192] [<ffff000008667cc4>] arm_perf_starting_cpu+0x108/0x10c
[ 4.072200] [<ffff0000080cbdd4>] cpuhp_invoke_callback+0x14c/0x4ac
[ 4.073210] [<ffff0000080ccd3c>] cpuhp_thread_fun+0xd4/0x11c
[ 4.074132] [<ffff0000080f1394>] smpboot_thread_fn+0x1b4/0x1c4
[ 4.075081] [<ffff0000080ec90c>] kthread+0x10c/0x138
[ 4.075921] [<ffff0000080833c0>] ret_from_fork+0x10/0x50
[ 4.076947] genirq: Setting trigger mode 4 for irq 43 failed
(gic_set_type+0x0/0x74)
Signed-off-by: Wei Huang <wei@redhat.com>
[will: add comment justifying deviation from ACPI spec, removed redundant hunk]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add device-tree files relevant to TI DaVinci platform
to its entry so mach-davinci sub-arch maintainers get
copied on patches with device-tree file updates.
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
,dma_ops should be cleared in the teardown path. Currently, only the
device's iommu mapping structures are cleared in arch_teardown_dma_ops,
but not the dma_ops. So on the next reprobe, dma_ops left in place is
stale from the first IOMMU setup, but iommu mappings has been disposed
of. This is a problem when the probe of the device is deferred and
recalled with the IOMMU probe deferral.
So for fixing this, slightly refactor by moving the code from
__arm_iommu_detach_device to arm_iommu_detach_device and cleanup
the former. This takes care of resetting the dma_ops in the teardown
path.
Fixes: 09515ef5dd ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices")
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With IOMMU probe deferral, iort_iommu_configure can be called
multiple times for the same device. Hence we have a check
to see if the device's fwspec is already translated and return
the iommu_ops from that directly. But the check is wrongly
placed in iort_iommu_xlate, which breaks devices with multiple
sids. Move the check to iort_iommu_configure.
Fixes: 5a1bb638d5 ("drivers: acpi: Handle IOMMU lookup failure with deferred probing or error")
Tested-by: Nate Watterson <nwatters@codeaurora.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
arch_setup_dma_ops() is used in device probe code paths to create an
IOMMU mapping and attach it to the device. The function assumes that the
device is attached to a device-specific IOMMU instance (or at least a
device-specific TLB in a shared IOMMU instance) and thus creates a
separate mapping for every device.
On several systems (Renesas R-Car Gen2 being one of them), that
assumption is not true, and IOMMU mappings must be shared between
multiple devices. In those cases the IOMMU driver knows better than the
generic ARM dma-mapping layer and attaches mapping to devices manually
with arm_iommu_attach_device(), which sets the DMA ops for the device.
The arch_setup_dma_ops() function takes this into account and bails out
immediately if the device already has DMA ops assigned. However, the
corresponding arch_teardown_dma_ops() function, called from driver
unbind code paths (including probe deferral), will tear the mapping down
regardless of who created it. When the device is reprobed
arch_setup_dma_ops() will be called again but won't perform any
operation as the DMA ops will still be set.
We need to reset the DMA ops in arch_teardown_dma_ops() to fix this.
However, we can't do so unconditionally, as then a new mapping would be
created by arch_setup_dma_ops() when the device is reprobed, regardless
of whether the device needs to share a mapping or not. We must thus keep
track of whether arch_setup_dma_ops() created the mapping, and only in
that case tear it down in arch_teardown_dma_ops().
Keep track of that information in the dev_archdata structure. As the
structure is embedded in all instances of struct device let's not grow
it, but turn the existing dma_coherent bool field into a bitfield that
can be used for other purposes.
Fixes: 09515ef5dd ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
While deferring the probe of IOMMU masters, xlate and
add_device callbacks called from iort_iommu_configure
can pass back error values like -ENODEV, which means
the IOMMU cannot be connected with that master for real
reasons. Before the IOMMU probe deferral, all such errors
were ignored. Now all those errors are propagated back,
killing the master's probe for such errors. Instead ignore
all the errors except EPROBE_DEFER, which is the only one
of concern and let the master work without IOMMU, thus
restoring the old behavior. Also make explicit that
acpi_dma_configure handles only -EPROBE_DEFER from
iort_iommu_configure.
Fixes: 5a1bb638d5 ("drivers: acpi: Handle IOMMU lookup failure with deferred probing or error")
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
While deferring the probe of IOMMU masters, xlate and
add_device callbacks called from of_iommu_configure
can pass back error values like -ENODEV, which means
the IOMMU cannot be connected with that master for real
reasons. Before the IOMMU probe deferral, all such errors
were ignored. Now all those errors are propagated back,
killing the master's probe for such errors. Instead ignore
all the errors except EPROBE_DEFER, which is the only one
of concern and let the master work without IOMMU, thus
restoring the old behavior. Also make explicit that
of_dma_configure handles only -EPROBE_DEFER from
of_iommu_configure.
Fixes: 7b07cbefb6 ("iommu: of: Handle IOMMU lookup failure with deferred probing or error")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Magnus Damn <magnus.damn@gmail.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Now with IOMMU probe deferral, we return -EPROBE_DEFER
for masters that are connected to an IOMMU which is not
probed yet, but going to get probed, so that we can attach
the correct dma_ops. So while trying to defer the probe of
the master, check if the of_iommu node that it is connected
to is marked in DT as 'status=disabled', then the IOMMU is never
is going to get probed. So simply return NULL and let the master
work without an IOMMU.
Fixes: 7b07cbefb6 ("iommu: of: Handle IOMMU lookup failure with deferred probing or error")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Tested-by: Magnus Damn <magnus.damn@gmail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Newly added code in the ipmmu-vmsa driver showed a small mistake
in a header file that can't be included by itself without CONFIG_IOMMU_DMA
enabled:
In file included from drivers/iommu/ipmmu-vmsa.c:13:0:
include/linux/dma-iommu.h:105:94: error: 'struct device' declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
This adds a forward declaration for 'struct device', similar to how
we treat the other struct types in this case.
Fixes: 3ae4729202 ("iommu/ipmmu-vmsa: Add new IOMMU_DOMAIN_DMA ops")
Fixes: 273df96353 ("iommu/dma: Make PCI window reservation generic")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When starting or stopping an aggregation session, one of the steps
is that the driver calls back to mac80211 that the start/stop can
proceed. This is handled by queueing up a fake SKB and processing
it from the normal iface/sdata work. Since this isn't flushed when
disassociating, the following race is possible:
* associate
* start aggregation session
* driver callback
* disassociate
* associate again to the same AP
* callback processing runs, leading to a WARN_ON() that
the TID hadn't requested aggregation
If the second association isn't to the same AP, there would only
be a message printed ("Could not find station: <addr>"), but the
same race could happen.
Fix this by not going the whole detour with a fake SKB etc. but
simply looking up the aggregation session in the driver callback,
marking it with a START_CB/STOP_CB bit and then scheduling the
regular aggregation work that will now process these bits as well.
This also simplifies the code and gets rid of the whole problem
with allocation failures of said skb, which could have left the
session in limbo.
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In descriptor mode, the descriptor running pointer is not maintained
by the interrupt handler, thus, driver finds the running descriptor
from the descriptor pointer field in the CHCRB register.
But, CHCRB::DPTR indicates *next* descriptor pointer, not current.
Thus, The residue calculation will be missed. This patch fixup it.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Conntrack SCTP CRC32c checksum mangling may operate on non-linear
skbuff, patch from Davide Caratti.
2) nf_tables rb-tree set backend does not handle element re-addition
after deletion in the same transaction, leading to infinite loop.
3) Atomically unclear the IPS_SRC_NAT_DONE_BIT on nat module removal,
from Liping Zhang.
4) Conntrack hashtable resizing while ctnetlink dump is progress leads
to a dead reference to released objects in the lists, also from
Liping.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Users should really consider switching to rmi-smbus instead of plain PS/2.
Notify them that they should report a missing pnpID in the file.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The Synaptics touchpads are now either using i2c-hid or rmi-smbus.
Warn the users if they are missing the rmi-smbus modules and have no
chance of reporting correct data.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
synaptics_query_hardware() was being passed a 'struct synaptics_device_info'
in uninitialized stack memory, then not always initializing all fields.
This caused garbage to show up in certain fields, making the touchpad
unusable.
Fix by zeroing the device info, so all fields default to 0.
Fixes: 6c53694fb2 ("Input: synaptics - split device info into a separate structure")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
When we put the touchscreen controller in low-power mode the irq
pin may trigger (float) and if we then try to read a data packet we
get the following error in dmesg:
[ 478.801017] silead_ts i2c-MSSL1680:00: Data read error -121
This commit disables the irq during suspend/resume fixing this error.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
For a driver that does not set the CPUFREQ_STICKY flag, if all of the
->init() calls fail, cpufreq_register_driver() should return an error.
This will prevent the driver from loading.
Fixes: ce1bcfe94d (cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs)
Cc: 4.0+ <stable@vger.kernel.org> # 4.0+
Signed-off-by: David Arcari <darcari@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the Linux kernel, acpi_get_table() "clones" haven't been fully
balanced by acpi_put_table() invocations. In upstream ACPICA, due to
the design change, there are also unbalanced acpi_get_table_by_index()
invocations requiring special care.
acpi_get_table() reference counting mismatches may occor due to that
and printing error messages related to them is not useful at this
point. The strict balanced validation count check should only be
enabled after confirming that all invocations are safe and aligned
with their designed purposes.
Thus this patch removes the error value returned by acpi_tb_get_table()
in that case along with the accompanying error message to fix the
issue.
Fixes: 174cc7187e (ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel)
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Revert commit 77e9a4aa9d (ACPI / button: Change default behavior to
lid_init_state=open) which changed the kernel's behavior on laptops
that boot with closed lids and expect the lid switch state to be
reported accurately by the kernel.
If you boot or resume your laptop with the lid closed on a docking
station while using an external monitor connected to it, both internal
and external displays will light on, while only the external should.
There is a design choice in gdm to only provide the greeter on the
internal display when lit on, so users only see a gray area on the
external monitor. Also, the cursor will not show up as it's by
default on the internal display too.
To "fix" that, users have to open the laptop once and close it once
again to sync the state of the switch with the hardware state.
Even if the "method" operation mode implementation can be buggy on
some platforms, the "open" choice is worse. It breaks docking
stations basically and there is no way to have a user-space hwdb to
fix that.
On the contrary, it's rather easy in user-space to have a hwdb
with the problematic platforms. Then, libinput (1.7.0+) can fix
the state of the lid switch for us: you need to set the udev
property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
When libinput detects internal keyboard events, it will overwrite the
state of the switch to open, making it reliable again. Given that
logind only checks the lid switch value after a timeout, we can
assume the user will use the internal keyboard before this timeout
expires.
For example, such a hwdb entry is:
libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
Link: https://bugzilla.gnome.org/show_bug.cgi?id=782380
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, extent manipulation operations such as hole punch, range
zeroing, or extent shifting do not record the fact that file data has
changed and thus fdatasync(2) has a work to do. As a result if we crash
e.g. after a punch hole and fdatasync, user can still possibly see the
punched out data after journal replay. Test generic/392 fails due to
these problems.
Fix the problem by properly marking that file data has changed in these
operations.
CC: stable@vger.kernel.org
Fixes: a4bb6b64e3
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Pull pin control fixes from Linus Walleij:
"Here is an overdue pull request for pin control fixes, the most
prominent feature is to make Intel Chromebooks (and I suspect any
other Cherryview-based Intel thing) happy again, which we really want
to see.
There is a patch hitting drivers/firmware/* that I was uncertain to
who actually manages, but I got Andy Shevchenko's and Dmitry Torokov's
review tags on it and I trust them both 100% to do the right thing for
Intel platform drivers.
Summary:
- Make a few Intel Chromebooks with Cherryview DMI firmware work
smoothly.
- A fix for some bogus allocations in the generic group management
code.
- Some GPIO descriptor lookup table stubs. Merged through the pin
control tree for administrative reasons.
- Revert the "bi-directional" and "output-enable" generic properties:
we need more discussions around this. It seems other SoCs are using
input/output gate enablement and these terms are not correct.
- Fix mux and drive strength atomically in the MXS driver.
- Fix the SPDIF function on sunxi A83T.
- OF table terminators and other small fixes"
* tag 'pinctrl-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: sunxi: Fix SPDIF function name for A83T
pinctrl: mxs: atomically switch mux and drive strength config
pinctrl: cherryview: Extend the Chromebook DMI quirk to Intel_Strago systems
firmware: dmi: Add DMI_PRODUCT_FAMILY identification string
pinctrl: core: Fix warning by removing bogus code
gpiolib: Add stubs for gpiod lookup table interface
Revert "pinctrl: generic: Add bi-directional and output-enable"
pinctrl: cherryview: Add terminate entry for dmi_system_id tables
This fixes a regression in commit 4d6501dce0 where I didn't notice
that MIPS and OpenRISC were reinitialising p->{set,clear}_child_tid to
NULL after our initialisation in copy_process().
We can simply get rid of the arch-specific initialisation here since it
is now always done in copy_process() before hitting copy_thread{,_tls}().
Review notes:
- As far as I can tell, copy_process() is the only user of
copy_thread_tls(), which is the only caller of copy_thread() for
architectures that don't implement copy_thread_tls().
- After this patch, there is no arch-specific code touching
p->set_child_tid or p->clear_child_tid whatsoever.
- It may look like MIPS/OpenRISC wanted to always have these fields be
NULL, but that's not true, as copy_process() would unconditionally
set them again _after_ calling copy_thread_tls() before commit
4d6501dce0.
Fixes: 4d6501dce0 ("kthread: Fix use-after-free if kthread fork fails")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net> # MIPS only
Acked-by: Stafford Horne <shorne@gmail.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: openrisc@lists.librecores.org
Cc: Jamie Iles <jamie.iles@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Filesystems filter out extended attributes in the "trusted." domain for
unprivlieged callers.
Overlay calls underlying filesystem's method with elevated privs, so need
to do the filtering in overlayfs too.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
For ACPI devices which do not have a _PSC method, the ACPI subsys cannot
query their initial state at boot, so these devices are assumed to have
been put in D0 by the BIOS, but for touchscreens that is not always true.
This commit adds a call to acpi_device_fix_up_power to explicitly put
devices without a _PSC method into D0 state (for devices with a _PSC
method it is a nop). Note we only need to do this on probe, after a
resume the ACPI subsys knows the device is in D3 and will properly
put it in D0.
This fixes the SIS0817 i2c-hid touchscreen on a Peaq C1010 2-in-1
device failing to probe with a "hid_descr_cmd failed" error.
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Switch to using the common DP helpers instead of using our own.
v2: also remove leftover struct intel_dp_desc (Daniel)
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
An upper dir is marked "impure" to let ovl_iterate() know that this
directory may contain non pure upper entries whose d_ino may need to be
read from the origin inode.
We already mark a non-merge dir "impure" when moving a non-pure child
entry inside it, to let ovl_iterate() know not to iterate the non-merge
dir directly.
Mark also a merge dir "impure" when moving a non-pure child entry inside
it and when copying up a child entry inside it.
This can be used to optimize ovl_iterate() to perform a "pure merge" of
upper and lower directories, merging the content of the directories,
without having to read d_ino from origin inodes.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Commit 93c1defedc ("rbd: remove the discard_zeroes_data flag")
explicitly didn't implement REQ_OP_WRITE_ZEROES for rbd, while the
following commit 48920ff2a5 ("block: remove the discard_zeroes_data
flag") dropped ->discard_zeroes_data in favor of REQ_OP_WRITE_ZEROES.
rbd does support efficient zeroing via CEPH_OSD_OP_ZERO opcode and will
release either some or all blocks depending on whether the zeroing
request is rbd_obj_bytes() aligned. This is how we currently implement
discards, so REQ_OP_WRITE_ZEROES can be identical to REQ_OP_DISCARD for
now. Caveats:
- REQ_NOUNMAP is ignored, but AFAICT that's true of at least two other
current implementations - nvme and loop
- there is no ->write_zeroes_alignment and blk_bio_write_zeroes_split()
is hence less helpful than blk_bio_discard_split(), but this can (and
should) be fixed on the rbd side
In the future we will split these into two code paths to respect
REQ_NOUNMAP on zeroout and save on zeroing blocks that couldn't be
released on discard.
Fixes: 93c1defedc ("rbd: remove the discard_zeroes_data flag")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The reference to cpu_resume requires the corresponding
generic code to be enabled when CONFIG_PM is set:
arch/arm/mach-at91/pm.o: In function `sama5d2_pm_init':
pm.c:(.init.text+0x5e8): undefined reference to `cpu_resume'
Fixes: 24a0f5c539 ("ARM: at91: pm: Add sama5d2 backup mode")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
With CONFIG_DEBUG_PREEMPT enabled, I get:
BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
caller is debug_smp_processor_id
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc2+ #2
Call Trace:
dump_stack
check_preemption_disabled
debug_smp_processor_id
save_microcode_in_initrd_amd
? microcode_init
save_microcode_in_initrd
...
because, well, it says it above, we're using smp_processor_id() in
preemptible code.
But passing the CPU number is not really needed. It is only used to
determine whether we're on the BSP, and, if so, to save the microcode
patch for early loading.
[ We don't absolutely need to do it on the BSP but we do that
customarily there. ]
Instead, convert that function parameter to a boolean which denotes
whether the patch should be saved or not, thereby avoiding the use of
smp_processor_id() in preemptible code.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170528200414.31305-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch removes unnecessary descriptions on
exynos_drm_crtc structure and adds one description
which specifies what pipe_clk member does.
pipe_clk support had been added by below patch without any description,
drm/exynos: add support for pipeline clock to the framework
Commit-id : f26b9343f5
Signed-off-by: Inki Dae <inki.dae@samsung.com>
The dsi + panel is a parental relationship, so OF grpah is not needed.
Therefore, the current dsi_parse_dt function will throw an error,
because there is no linked OF graph for the case fimd + dsi + panel.
Parse the Pll burst and esc clock frequency properties in dsi_parse_dt()
and create a bridge_node only if there is an OF graph associated with dsi.
Signed-off-by: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Andi Shyti <andi.shyti@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Pull thermal SoC management fixes from Eduardo Valentin:
- fixes to TI SoC driver, Broadcom, qoriq
- small sparse warning fix on thermal core
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: broadcom: ns-thermal: default on iProc SoCs
ti-soc-thermal: Fix a typo in a comment line
ti-soc-thermal: Delete error messages for failed memory allocations in ti_bandgap_build()
ti-soc-thermal: Use devm_kcalloc() in ti_bandgap_build()
thermal: core: make thermal_emergency_poweroff static
thermal: qoriq: remove useless call for of_thermal_get_trip_points()
Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. In this case, no initializers
are needed (they can be NULL initialized and callers adjusted to check
for NULL, which is more efficient than an indirect call).
Cc: Robin Holt <robinmholt@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
When trying to propagate an error result, the error return path attempts
to retain the error, but does this with an open cast across very different
types, which the upcoming structure layout randomization plugin flags as
being potentially dangerous in the face of randomization. This is a false
positive, but what this code actually wants to do is use ERR_CAST() to
retain the error value.
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
When trying to propagate an error result, the error return path attempts
to retain the error, but does this with an open cast across very different
types, which the upcoming structure layout randomization plugin flags as
being potentially dangerous in the face of randomization. This is a false
positive, but what this code actually wants to do is use ERR_CAST() to
retain the error value.
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
When the call to nfs_devname() fails, the error path attempts to retain
the error via the mnt variable, but this requires a cast across very
different types (char * to struct vfsmount *), which the upcoming
structure layout randomization plugin flags as being potentially
dangerous in the face of randomization. This is a false positive, but
what this code actually wants to do is retain the error value, so this
patch explicitly sets it, instead of using what seems to be an
unexpected cast.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Stub functions in header files need to be static, or we can have
multiple definitions errors.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 6335e9f244 ("net: dsa: mv88e6xxx: mv88e6390X SERDES support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
ad7152_write_raw_samp_freq() is called by ad7152_write_raw() with
chip->state_lock held. So, there is unavoidable deadlock when
ad7152_write_raw_samp_freq() locks the mutex itself.
The patch removes unneeded locking.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Fixes: 6572389bcc ("staging: iio: cdc: ad7152: Implement
IIO_CHAN_INFO_SAMP_FREQ attribute")
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Sabrina Dubroca reported an early panic:
BUG: unable to handle kernel paging request at ffffffffff240001
IP: efi_bgrt_init+0xdc/0x134
[...]
---[ end Kernel panic - not syncing: Attempted to kill the idle task!
... which was introduced by:
7b0a911478 ("efi/x86: Move the EFI BGRT init code to early init code")
The cause is that on this machine the firmware provides the EFI ACPI BGRT
table even on legacy non-EFI bootups - which table should be EFI only.
The garbage BGRT data causes the efi_bgrt_init() panic.
Add a check to skip efi_bgrt_init() in case non-EFI bootup to work around
this firmware bug.
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: <stable@vger.kernel.org> # v4.11+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 7b0a911478 ("efi/x86: Move the EFI BGRT init code to early init code")
Link: http://lkml.kernel.org/r/20170526113652.21339-6-matt@codeblueprint.co.uk
[ Rewrote the changelog to be more readable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For EFI with the 'efi=old_map' kernel option specified, the kernel will panic
when KASLR is enabled:
BUG: unable to handle kernel paging request at 000000007febd57e
IP: 0x7febd57e
PGD 1025a067
PUD 0
Oops: 0010 [#1] SMP
Call Trace:
efi_enter_virtual_mode()
start_kernel()
x86_64_start_reservations()
x86_64_start_kernel()
start_cpu()
The root cause is that the identity mapping is not built correctly
in the 'efi=old_map' case.
On 'nokaslr' kernels, PAGE_OFFSET is 0xffff880000000000 which is PGDIR_SIZE
aligned. We can borrow the PUD table from the direct mappings safely. Given a
physical address X, we have pud_index(X) == pud_index(__va(X)).
However, on KASLR kernels, PAGE_OFFSET is PUD_SIZE aligned. For a given physical
address X, pud_index(X) != pud_index(__va(X)). We can't just copy the PGD entry
from direct mapping to build identity mapping, instead we need to copy the
PUD entries one by one from the direct mapping.
Fix it.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Frank Ramsay <frank.ramsay@hpe.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170526113652.21339-5-matt@codeblueprint.co.uk
[ Fixed and reworded the changelog and code comments to be more readable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Booting kexec kernel with "efi=old_map" in kernel command line hits
kernel panic as shown below.
BUG: unable to handle kernel paging request at ffff88007fe78070
IP: virt_efi_set_variable.part.7+0x63/0x1b0
PGD 7ea28067
PUD 7ea2b067
PMD 7ea2d067
PTE 0
[...]
Call Trace:
virt_efi_set_variable()
efi_delete_dummy_variable()
efi_enter_virtual_mode()
start_kernel()
x86_64_start_reservations()
x86_64_start_kernel()
start_cpu()
[ efi=old_map was never intended to work with kexec. The problem with
using efi=old_map is that the virtual addresses are assigned from the
memory region used by other kernel mappings; vmalloc() space.
Potentially there could be collisions when booting kexec if something
else is mapped at the virtual address we allocated for runtime service
regions in the initial boot - Matt Fleming ]
Since kexec was never intended to work with efi=old_map, disable
runtime services in kexec if booted with efi=old_map, so that we don't
panic.
Tested-by: Lee Chun-Yi <jlee@suse.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170526113652.21339-4-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
syszkaller fuzzer triggered a divide by zero, when set calibration
through ioctl().
To fix it, test 'bitrate' if it is negative or 0, just return -EINVAL.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Firo Yang <firogm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The binding documentation for the mv88e6xxx switch is missing the
eeprom-length property, which has been implemented since May 2016,
commit f8cd8753de ("dsa: mv88e6xxx: Handle eeprom-length property")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The overrun check for the size of submitted commands is off by one.
It should allow the offset plus the size to be equal to the
size of the memory object when the command stream is very tightly
constructed.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Amongst its other duties, msm_gem_new_impl adds the newly created
GEM object to the shared inactive list which may also be actively
modifiying the list during submission. All the paths to modify
the list are protected by the mutex except for the one through
msm_gem_import which can end up causing list corruption.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[add extra WARN_ON(!mutex_is_locked(&dev->struct_mutex))]
Signed-off-by: Rob Clark <robdclark@gmail.com>
struct irq_domain_ops is not modified, so it can be made const.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Otherwise if someone was using old bindings with "core_clk" instead of
"core" as the clock name, we'd never find it and gpu would be stuck at
27MHz (or whatever it's slowest rate is).
Fixes: 98db803 ("msm/drm: gpu: Dynamically locate the clocks from the device tree")
Signed-off-by: Rob Clark <robdclark@gmail.com>
Otherwise, if nothing else enabled selects it, dev_pm_opp_of_add_table()
will return -ENOTSUPP.
Fixes: e2af8b6 ("drm/msm: gpu: Use OPP tables if we can")
Signed-off-by: Rob Clark <robdclark@gmail.com>
Pull tty/serial fixes from Greg KH:
"Here are some serial and tty fixes for 4.12-rc3. They are a bit bigger
than normal, which is why I had them bake in linux-next for a few
weeks and didn't send them to you for -rc2.
They revert a few of the serdev patches from 4.12-rc1, and bring
things back to how they were in 4.11, to try to make things a bit more
stable there. Rob and Johan both agree that this is the way forward,
so this isn't people squabbling over semantics. Other than that, just
a few minor serial driver fixes that people have had problems with.
All of these have been in linux-next for a few weeks with no reported
issues"
* tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: altera_uart: call iounmap() at driver remove
serial: imx: ensure UCR3 and UFCR are setup correctly
MAINTAINERS/serial: Change maintainer of jsm driver
serial: enable serdev support
tty/serdev: add serdev registration interface
serdev: Restore serdev_device_write_buf for atomic context
serial: core: fix crash in uart_suspend_port
tty: fix port buffer locking
tty: ehv_bytechan: clean up init error handling
serial: ifx6x60: fix use-after-free on module unload
serial: altera_jtaguart: adding iounmap()
serial: exar: Fix stuck MSIs
serial: efm32: Fix parity management in 'efm32_uart_console_get_options()'
serdev: fix tty-port client deregistration
Revert "tty_port: register tty ports with serdev bus"
drivers/tty: 8250: only call fintek_8250_probe when doing port I/O
Pull powerpc fixes from Michael Ellerman:
"Fix running SPU programs on Cell, and a few other minor fixes.
Thanks to Alistair Popple, Jeremy Kerr, Michael Neuling, Nicholas
Piggin"
* tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Add PPC_FEATURE userspace bits for SCV and DARN instructions
powerpc/spufs: Fix hash faults for kernel regions
powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N
powerpc/powernv/npu-dma.c: Fix opal_npu_destroy_context() call
selftests/powerpc: Fix TM resched DSCR test with some compilers
Pull x86 fixes from Thomas Gleixner:
"A series of fixes for X86:
- The final fix for the end-of-stack issue in the unwinder
- Handle non PAT systems gracefully
- Prevent access to uninitiliazed memory
- Move early delay calaibration after basic init
- Fix Kconfig help text
- Fix a cross compile issue
- Unbreak older make versions"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/timers: Move simple_udelay_calibration past init_hypervisor_platform
x86/alternatives: Prevent uninitialized stack byte read in apply_alternatives()
x86/PAT: Fix Xorg regression on CPUs that don't support PAT
x86/watchdog: Fix Kconfig help text file path reference to lockup watchdog documentation
x86/build: Permit building with old make versions
x86/unwind: Add end-of-stack check for ftrace handlers
Revert "x86/entry: Fix the end of the stack for newly forked tasks"
x86/boot: Use CROSS_COMPILE prefix for readelf
Pull timer fixlet from Thomas Gleixner:
"Silence dmesg spam by making the posix cpu timer printks depend on
print_fatal_signals"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-timers: Make signal printks conditional
Pull RAS fixes from Thomas Gleixner:
"Two fixlets for RAS:
- Export memory_error() so the NFIT module can utilize it
- Handle memory errors in NFIT correctly"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
acpi, nfit: Fix the memory error check in nfit_handle_mce()
x86/MCE: Export memory_error()
Pull perf tooling fixes from Thomas Gleixner:
- Synchronization of tools and kernel headers
- A series of fixes for perf report addressing various failures:
* Handle invalid maps proper
* Plug a memory leak
* Handle frames and callchain order correctly
- Fixes for handling inlines and children mode
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools/include: Sync kernel ABI headers with tooling headers
perf tools: Put caller above callee in --children mode
perf report: Do not drop last inlined frame
perf report: Always honor callchain order for inlined nodes
perf script: Add --inline option for debugging
perf report: Fix off-by-one for non-activation frames
perf report: Fix memory leak in addr2line when called by addr2inlines
perf report: Don't crash on invalid maps in `-g srcline` mode
Pull locking fix from Thomas Gleixner:
"A fix for a state leak which was introduced in the recent rework of
futex/rtmutex interaction"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()
Pull kthread fix from Thomas Gleixner:
"A single fix which prevents a use after free when kthread fork fails"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kthread: Fix use-after-free if kthread fork fails
Pull ftrace fixes from Steven Rostedt:
"There's been a few memory issues found with ftrace.
One was simply a memory leak where not all was being freed that should
have been in releasing a file pointer on set_graph_function.
Then Thomas found that the ftrace trampolines were marked for
read/write as well as execute. To shrink the possible attack surface,
he added calls to set them to ro. Which also uncovered some other
issues with freeing module allocated memory that had its permissions
changed.
Kprobes had a similar issue which is fixed and a selftest was added to
trigger that issue again"
* tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
x86/ftrace: Make sure that ftrace trampolines are not RWX
x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()
selftests/ftrace: Add a testcase for many kprobe events
kprobes/x86: Fix to set RWX bits correctly before releasing trampoline
ftrace: Fix memory leak in ftrace_graph_release()
Currently VBUS is turned off while a usb device is detached, and turned
on again by the polling routine. This short period VBUS loss prevents
usb modem to switch mode.
VBUS should be constantly on for host-only mode, so this changes the
driver to not turn off VBUS for host-only mode.
Fixes: 2f3fd2c5bd ("usb: musb: Prepare dsps glue layer for PM runtime support")
Cc: stable@vger.kernel.org #v4.11
Reported-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ftrace use module_alloc() to allocate trampoline pages. The mapping of
module_alloc() is RWX, which makes sense as the memory is written to right
after allocation. But nothing makes these pages RO after writing to them.
Add proper set_memory_rw/ro() calls to protect the trampolines after
modification.
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705251056410.1862@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
With function tracing starting in early bootup and having its trampoline
pages being read only, a bug triggered with the following:
kernel BUG at arch/x86/mm/pageattr.c:189!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc2-test+ #3
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
task: ffffffffb4222500 task.stack: ffffffffb4200000
RIP: 0010:change_page_attr_set_clr+0x269/0x302
RSP: 0000:ffffffffb4203c88 EFLAGS: 00010046
RAX: 0000000000000046 RBX: 0000000000000000 RCX: 00000001b6000000
RDX: ffffffffb4203d40 RSI: 0000000000000000 RDI: ffffffffb4240d60
RBP: ffffffffb4203d18 R08: 00000001b6000000 R09: 0000000000000001
R10: ffffffffb4203aa8 R11: 0000000000000003 R12: ffffffffc029b000
R13: ffffffffb4203d40 R14: 0000000000000001 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff9a639ea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff9a636b384000 CR3: 00000001ea21d000 CR4: 00000000000406b0
Call Trace:
change_page_attr_clear+0x1f/0x21
set_memory_ro+0x1e/0x20
arch_ftrace_update_trampoline+0x207/0x21c
? ftrace_caller+0x64/0x64
? 0xffffffffc029b000
ftrace_startup+0xf4/0x198
register_ftrace_function+0x26/0x3c
function_trace_init+0x5e/0x73
tracer_init+0x1e/0x23
tracing_set_tracer+0x127/0x15a
register_tracer+0x19b/0x1bc
init_function_trace+0x90/0x92
early_trace_init+0x236/0x2b3
start_kernel+0x200/0x3f5
x86_64_start_reservations+0x29/0x2b
x86_64_start_kernel+0x17c/0x18f
secondary_startup_64+0x9f/0x9f
? secondary_startup_64+0x9f/0x9f
Interrupts should not be enabled at this early in the boot process. It is
also fine to leave interrupts enabled during this time as there's only one
CPU running, and on_each_cpu() means to only run on the current CPU.
If early_boot_irqs_disabled is set, it is safe to run cpu_flush_range() with
interrupts disabled. Don't trigger a BUG_ON() in that case.
Link: http://lkml.kernel.org/r/20170526093717.0be3b849@gandalf.local.home
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Add a testcase to test kprobes via ftrace interface
with many concurrent kprobe events.
This tries to add many kprobe events (up to 256) on
kernel functions. To avoid making ftrace-based
kprobes (kprobes on fentry), it skips first N bytes
(on x86 N=5, on ppc or arm N=4) of function entry.
After that, it enables all those events, disable it,
and remove it.
Since the unoptimization buffer reclaiming will
be delayed, after removing events, it will wait
enough time.
Link: http://lkml.kernel.org/r/149577388470.11702.11832460851769204511.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fix kprobes to set(recover) RWX bits correctly on trampoline
buffer before releasing it. Releasing readonly page to
module_memfree() crash the kernel.
Without this fix, if kprobes user register a bunch of kprobes
in function body (since kprobes on function entry usually
use ftrace) and unregister it, kernel hits a BUG and crash.
Link: http://lkml.kernel.org/r/149570868652.3518.14120169373590420503.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: d0381c81c2 ("kprobes/x86: Set kprobes pages read-only")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Pull input layer fixes from Dmitry Torokhov:
"Just a few fixups to a couple of drivers"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - ignore signals when finishing updating firmware
Input: elan_i2c - clear INT before resetting controller
Input: atmel_mxt_ts - add T100 as a readable object
Input: edt-ft5x06 - increase allowed data range for threshold parameter
If TRIM_UNUSED_KSYMS is enabled, all unneeded exported symbols are made
unexported. Two-pass build of the kernel is done to find out which
symbols are needed based on a configuration. This effectively
complicates things for out-of-tree modules.
Livepatch exports functions to (un)register and enable/disable a live
patch. The only in-tree module which uses these functions is a sample in
samples/livepatch/. If the sample is disabled, the functions are
trimmed and out-of-tree live patches cannot be built.
Note that live patches are intended to be built out-of-tree.
Suggested-by: Michal Marek <mmarek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
mpage_submit_page() can race with another process growing i_size and
writing data via mmap to the written-back page. As mpage_submit_page()
samples i_size too early, it may happen that ext4_bio_write_page()
zeroes out too large tail of the page and thus corrupts user data.
Fix the problem by sampling i_size only after the page has been
write-protected in page tables by clear_page_dirty_for_io() call.
Reported-by: Michael Zimmer <michael@swarm64.com>
CC: stable@vger.kernel.org
Fixes: cb20d51883
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When ext4_map_blocks() is called with EXT4_GET_BLOCKS_ZERO to zero-out
allocated blocks and these blocks are actually converted from unwritten
extent the following race can happen:
CPU0 CPU1
page fault page fault
... ...
ext4_map_blocks()
ext4_ext_map_blocks()
ext4_ext_handle_unwritten_extents()
ext4_ext_convert_to_initialized()
- zero out converted extent
ext4_zeroout_es()
- inserts extent as initialized in status tree
ext4_map_blocks()
ext4_es_lookup_extent()
- finds initialized extent
write data
ext4_issue_zeroout()
- zeroes out new extent overwriting data
This problem can be reproduced by generic/340 for the fallocated case
for the last block in the file.
Fix the problem by avoiding zeroing out the area we are mapping with
ext4_map_blocks() in ext4_ext_convert_to_initialized(). It is pointless
to zero out this area in the first place as the caller asked us to
convert the area to initialized because he is just going to write data
there before the transaction finishes. To achieve this we delete the
special case of zeroing out full extent as that will be handled by the
cases below zeroing only the part of the extent that needs it. We also
instruct ext4_split_extent() that the middle of extent being split
contains data so that ext4_split_extent_at() cannot zero out full extent
in case of ENOSPC.
CC: stable@vger.kernel.org
Fixes: 12735f8819
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Callers normally treat the config space accessors as returning PCBIOS_*
error codes, not Linux error codes (or they don't look at them at all). We
have pcibios_err_to_errno() in case the error code needs to be translated.
Fixes: 4b10388347 ("PCI: Don't attempt config access to disconnected devices")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Pull LED fix from Jacek Anaszewski:
"A single LED fix for 4.12-rc3.
leds-pca955x driver uses only i2c_smbus API and thus it should pass
I2C_FUNC_SMBUS_BYTE_DATA flag to i2c_check_functionality"
* tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds: pca955x: Correct I2C Functionality
Pull networking fixes from David Miller:
1) Fix state pruning in bpf verifier wrt. alignment, from Daniel
Borkmann.
2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide
Caratti.
3) Fix bit field definitions for rss_hash_type of descriptors in mlx5
driver, from Jesper Brouer.
4) Defer slave->link updates until bonding is ready to do a full commit
to the new settings, from Nithin Sujir.
5) Properly reference count ipv4 FIB metrics to avoid use after free
situations, from Eric Dumazet and several others including Cong Wang
and Julian Anastasov.
6) Fix races in llc_ui_bind(), from Lin Zhang.
7) Fix regression of ESP UDP encapsulation for TCP packets, from
Steffen Klassert.
8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap.
9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter
Dawson.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
ipv4: add reference counting to metrics
net: ethernet: ax88796: don't call free_irq without request_irq first
ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets
sctp: fix ICMP processing if skb is non-linear
net: llc: add lock_sock in llc_ui_bind to avoid a race condition
bonding: Don't update slave->link until ready to commit
test_bpf: Add a couple of tests for BPF_JSGE.
bpf: add various verifier test cases
bpf: fix wrong exposure of map_flags into fdinfo for lpm
bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data
bpf: properly reset caller saved regs after helper call and ld_abs/ind
bpf: fix incorrect pruning decision when alignment must be tracked
arp: fixed -Wuninitialized compiler warning
tcp: avoid fastopen API to be used on AF_UNSPEC
net: move somaxconn init from sysctl code
net: fix potential null pointer dereference
geneve: fix fill_info when using collect_metadata
virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
be2net: Fix offload features for Q-in-Q packets
vlan: Fix tcp checksum offloads in Q-in-Q vlans
...
Pull XFS fixes from Darrick Wong:
"A few miscellaneous bug fixes & cleanups:
- Fix indlen block reservation accounting bug when splitting delalloc
extent
- Fix warnings about unused variables that appeared in -rc1.
- Don't spew errors when bmapping a local format directory
- Fix an off-by-one error in a delalloc eof assertion
- Make fsmap only return inode information for CAP_SYS_ADMIN
- Fix a potential mount time deadlock recovering cow extents
- Fix unaligned memory access in _btree_visit_blocks
- Fix various SEEK_HOLE/SEEK_DATA bugs"
* tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: Move handling of missing page into one place in xfs_find_get_desired_pgoff()
xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff()
xfs: Fix missed holes in SEEK_HOLE implementation
xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff()
xfs: fix unaligned access in xfs_btree_visit_blocks
xfs: avoid mount-time deadlock in CoW extent recovery
xfs: only return detailed fsmap info if the caller has CAP_SYS_ADMIN
xfs: bad assertion for delalloc an extent that start at i_size
xfs: fix warnings about unused stack variables
xfs: BMAPX shouldn't barf on inline-format directories
xfs: fix indlen accounting error on partial delalloc conversion
Andrey Konovalov reported crashes in ipv4_mtu()
I could reproduce the issue with KASAN kernels, between
10.246.7.151 and 10.246.7.152 :
1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 &
2) At the same time run following loop :
while :
do
ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
done
Cong Wang attempted to add back rt->fi in commit
82486aa6f1 ("ipv4: restore rt->fi for reference counting")
but this proved to add some issues that were complex to solve.
Instead, I suggested to add a refcount to the metrics themselves,
being a standalone object (in particular, no reference to other objects)
I tried to make this patch as small as possible to ease its backport,
instead of being super clean. Note that we believe that only ipv4 dst
need to take care of the metric refcount. But if this is wrong,
this patch adds the basic infrastructure to extend this to other
families.
Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang
for his efforts on this problem.
Fixes: 2860583fe8 ("ipv4: Kill rt->fi")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function ax_init_dev (which is called only from the driver's .probe
function) calls free_irq in the error path without having requested the
irq in the first place. So drop the free_irq call in the error path.
Fixes: 825a2ff189 ("AX88796 network driver")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fix addresses two problems in the way the DSCP field is formulated
on the encapsulating header of IPv6 tunnels.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195661
1) The IPv6 tunneling code was manipulating the DSCP field of the
encapsulating packet using the 32b flowlabel. Since the flowlabel is
only the lower 20b it was incorrect to assume that the upper 12b
containing the DSCP and ECN fields would remain intact when formulating
the encapsulating header. This fix handles the 'inherit' and
'fixed-value' DSCP cases explicitly using the extant dsfield u8 variable.
2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was
incorrect and resulted in the DSCP value always being set to 0.
Commit 90427ef5d2 ("ipv6: fix flow labels when the traffic class
is non-0") caused the regression by masking out the flowlabel
which exposed the incorrect handling of the DSCP portion of the
flowlabel in ip6_tunnel and ip6_gre.
Fixes: 90427ef5d2 ("ipv6: fix flow labels when the traffic class is non-0")
Signed-off-by: Peter Dawson <peter.a.dawson@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sometimes ICMP replies to INIT chunks are ignored by the client, even if
the encapsulated SCTP headers match an open socket. This happens when the
ICMP packet is carried by a paged skb: use skb_header_pointer() to read
packet contents beyond the SCTP header, so that chunk header and initiate
tag are validated correctly.
v2:
- don't use skb_header_pointer() to read the transport header, since
icmp_socket_deliver() already puts these 8 bytes in the linear area.
- change commit message to make specific reference to INIT chunks.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a race condition in llc_ui_bind if two or more processes/threads
try to bind a same socket.
If more processes/threads bind a same socket success that will lead to
two problems, one is this action is not what we expected, another is
will lead to kernel in unstable status or oops(in my simple test case,
cause llc2.ko can't unload).
The current code is test SOCK_ZAPPED bit to avoid a process to
bind a same socket twice but that is can't avoid more processes/threads
try to bind a same socket at the same time.
So, add lock_sock in llc_ui_bind like others, such as llc_ui_connect.
Signed-off-by: Lin Zhang <xiaolou4617@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull block fixes from Jens Axboe:
"A collection of fixes that should go into this series. This contains:
- A set of NVMe fixes, pulled from Christoph. This includes a set of
fixes for the fiber channel bits from James Smart, rdma queue depth
fix from Marta, controller removal fixes from Ming, and some more
APST quirk updates from Andy.
- A blk-mq debugfs fix from Bart, fixing a problem with the
untangling of the sysfs and debugfs blk-mq bits that was added in
this series.
- Error code fix in add_partition() from Dan.
- A small series of fixes for the new blk-throttle code from Shaohua"
* 'for-linus' of git://git.kernel.dk/linux-block: (21 commits)
blk-mq: Only register debugfs attributes for blk-mq queues
nvme: Quirk APST on Intel 600P/P3100 devices
nvme: only setup block integrity if supported by the driver
nvme: replace is_flags field in nvme_ctrl_ops with a flags field
nvme-pci: consistencly use ctrl->device for logging
partitions/msdos: FreeBSD UFS2 file systems are not recognized
block: fix an error code in add_partition()
blk-throttle: force user to configure all settings for io.low
blk-throttle: respect 0 bps/iops settings for io.low
blk-throttle: output some debug info in trace
blk-throttle: add hierarchy support for latency target and idle time
nvme_fc: remove extra controller reference taken on reconnect
nvme_fc: correct nvme status set on abort
nvme_fc: set logging level on resets/deletes
nvme_fc: revise comment on teardown
nvme_fc: Support ctrl_loss_tmo
nvme_fc: get rid of local reconnect_delay
blk-mq: remove blk_mq_abort_requeue_list()
nvme: avoid to use blk_mq_abort_requeue_list()
nvme: use blk_mq_start_hw_queues() in nvme_kill_queues()
...
Pull PCI fixes from Bjorn Helgaas:
- fix PCI_ENDPOINT build error (merged for v4.12)
- fix Switchtec driver (merged for v4.12)
- fix imx6 config read timeouts, fallout from changing to non-postable
reads
- add PM "needs_resume" flag for i915 suspend issue
* tag 'pci-v4.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/PM: Add needs_resume flag to avoid suspend complete optimization
PCI: imx6: Fix config read timeout handling
switchtec: Fix minor bug with partition ID register
switchtec: Use new cdev_device_add() helper function
PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA
Pul ceph fixes from Ilya Dryomov:
"A bunch of make W=1 and static checker fixups, a RECONNECT_SEQ
messenger patch from Zheng and Luis' fallocate fix"
* tag 'ceph-for-4.12-rc3' of git://github.com/ceph/ceph-client:
ceph: check that the new inode size is within limits in ceph_fallocate()
libceph: cleanup old messages according to reconnect seq
libceph: NULL deref on crush_decode() error path
libceph: fix error handling in process_one_ticket()
libceph: validate blob_struct_v in process_one_ticket()
libceph: drop version variable from ceph_monmap_decode()
libceph: make ceph_msg_data_advance() return void
libceph: use kbasename() and kill ceph_file_part()
Pull MMC fixes from Ulf Hansson:
"This contains fixes to make the WiFi work again for the ARM64 Hikey
board.
Together with a couple of DTS updates for the Hikey board we have also
extended the mmc pwrseq_simple, to support a new power-off-delay-us DT
property, as that was required to enable a graceful power off sequence
for the WiFi chip"
* tag 'mmc-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
arm64: dts: hikey: Fix WiFi support
arm64: dts: hi6220: Move board data from the dwmmc nodes to hikey dts
arm64: dts: hikey: Add the SYS_5V and the VDD_3V3 regulators
arm64: dts: hi6220: Move the fixed_5v_hub regulator to the hikey dts
arm64: dts: hikey: Add clock for the pmic mfd
mfd: dts: hi655x: Add clock binding for the pmic
mmc: pwrseq_simple: Parse DTS for the power-off-delay-us property
mmc: dt: pwrseq-simple: Invent power-off-delay-us
Pull sound fixes from Takashi Iwai:
"This contains a few HD-audio device-specific quirks and an endianess
fix for USB-audio, as well as the update of quirk model list document.
All fixes are small and trivial.
The document update could have been postponed, but it's a good thing
for user and has absolutely zero risk of breakage, so included here"
* tag 'sound-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430
ALSA: hda - Update the list of quirk models
ALSA: hda - Provide dual-codecs model option for a few Realtek codecs
ALSA: hda - Apply dual-codec quirk for MSI Z270-Gaming mobo
ALSA: hda - No loopback on ALC299 codec
ALSA: usb-audio: fix Amanero Combo384 quirk on big-endian hosts
Intel SDM says, that at most one LAPIC should be configured with ExtINT
delivery. KVM configures all LAPICs this way. This causes pic_unlock()
to kick the first available vCPU from the internal KVM data structures.
If this vCPU is not the BSP, but some not-yet-booted AP, the BSP may
never realize that there is an interrupt.
Fix that by enabling ExtINT delivery only for the BSP.
This allows booting a Linux guest without a TSC in the above situation.
Otherwise the BSP gets stuck in calibrate_delay_converge().
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The decision whether or not to exit from L2 to L1 on an lmsw instruction is
based on bogus values: instead of using the information encoded within the
exit qualification, it uses the data also used for the mov-to-cr
instruction, which boils down to using whatever is in %eax at that point.
Use the correct values instead.
Without this fix, an L1 may not get notified when a 32-bit Linux L2
switches its secondary CPUs to protected mode; the L1 is only notified on
the next modification of CR0. This short time window poses a problem, when
there is some other reason to exit to L1 in between. Then, L2 will be
resumed in real mode and chaos ensues.
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull drm fixes from Dave Airlie:
"Not a whole lot happening here, a set of amdgpu fixes and one core
deadlock fix, and some misc drivers fixes"
* tag 'drm-fixes-for-v4.12-rc3' of git://people.freedesktop.org/~airlied/linux:
drm/amdgpu: fix null point error when rmmod amdgpu.
drm/amd/powerplay: fix a signedness bugs
drm/amdgpu: fix NULL pointer panic of emit_gds_switch
drm/radeon: Unbreak HPD handling for r600+
drm/amd/powerplay/smu7: disable mclk switching for high refresh rates
drm/amd/powerplay/smu7: add vblank check for mclk switching (v2)
drm/radeon/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu: fix fundamental suspend/resume issue
drm/gma500/psb: Actually use VBT mode when it is found
drm: Fix deadlock retry loop in page_flip_ioctl
drm: qxl: Delay entering atomic context during cursor update
drm/radeon: Fix oops upon driver load on PowerXpress laptops
Preemption can occur during cancel preemption timer, and there will be
inconsistent status in lapic, vmx and vmcs field.
CPU0 CPU1
preemption timer vmexit
handle_preemption_timer(vCPU0)
kvm_lapic_expired_hv_timer
vmx_cancel_hv_timer
vmx->hv_deadline_tsc = -1
vmcs_clear_bits
/* hv_timer_in_use still true */
sched_out
sched_in
kvm_arch_vcpu_load
vmx_set_hv_timer
write vmx->hv_deadline_tsc
vmcs_set_bits
/* back in kvm_lapic_expired_hv_timer */
hv_timer_in_use = false
...
vmx_vcpu_run
vmx_arm_hv_run
write preemption timer deadline
spurious preemption timer vmexit
handle_preemption_timer(vCPU0)
kvm_lapic_expired_hv_timer
WARN_ON(!apic->lapic_timer.hv_timer_in_use);
This can be reproduced sporadically during boot of L2 on a
preemptible L1, causing a splat on L1.
WARNING: CPU: 3 PID: 1952 at arch/x86/kvm/lapic.c:1529 kvm_lapic_expired_hv_timer+0xb5/0xd0 [kvm]
CPU: 3 PID: 1952 Comm: qemu-system-x86 Not tainted 4.12.0-rc1+ #24 RIP: 0010:kvm_lapic_expired_hv_timer+0xb5/0xd0 [kvm]
Call Trace:
handle_preemption_timer+0xe/0x20 [kvm_intel]
vmx_handle_exit+0xc9/0x15f0 [kvm_intel]
? lock_acquire+0xdb/0x250
? lock_acquire+0xdb/0x250
? kvm_arch_vcpu_ioctl_run+0xdf3/0x1ce0 [kvm]
kvm_arch_vcpu_ioctl_run+0xe55/0x1ce0 [kvm]
kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? __fget+0xf3/0x210
do_vfs_ioctl+0xa4/0x700
? __fget+0x114/0x210
SyS_ioctl+0x79/0x90
do_syscall_64+0x8f/0x750
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL64_slow_path+0x25/0x25
This patch fixes it by disabling preemption while cancelling
preemption timer. This way cancel_hv_timer is atomic with
respect to kvm_arch_vcpu_load.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We need to return an error for any call that asks for MSI / MSI-X
vectors only, so that non-trivial fallback logic can work properly.
Also valid dev->irq and use the "correct" errno value based on feedback
from Linus.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Fixes: aff17164 ("PCI: Provide sensible IRQ vector alloc/free routines")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We don't need to bitbang these pins anymore, instead we muxed these
pins as SPI, after this change, done in commit 6c69f726, we introduced
the following error:
pinctrl-single 44e10800.pinmux: pin PIN85 already requested \
by 44e10800.pinmux; cannot claim for 48030000.spi
pinctrl-single 44e10800.pinmux: pin-85 (48030000.spi) status -22
Fixes: 6c69f726 ("ARM: dts: am335x-sl50: Enable SPI0 interface and Flash Memory")
Cc: <stable@vger.kernel.org> # 4.11
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The second version of the hardware moved the card detect pin from gpio0_6
to gpio1_9, as we won't support the first hardware version fix the pinmux
configuration of this pin.
Fixes: 8584d4fc ("ARM: dts: am335x-sl50: Add Toby-Churchill SL50 board support.")
Cc: <stable@vger.kernel.org> # 4.11
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Christoph writes:
"A couple of fixes for the next rc on the nvme front. Various FC fixes
from James, controller removal fixes from Ming (including a block layer
patch), a APST related device quirk from Andy, a RDMA fix for small
queue depth device from Marta, as well as fixes for the lack of
metadata support in non-PCIe drivers and the printk logging format from
me."
The code in blk-mq-debugfs.c assumes that it is working on a blk-mq
queue and is not intended to work on a blk-sq queue. Hence only
register blk-mq debugfs attributes for blk-mq queues.
Fixes: commit 9c1051aacd ("blk-mq: untangle debugfs and sysfs")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
commit 25165f79ad ("ASoC: rsnd: enable clock-frequency for both
44.1kHz/48kHz") supported both 44.1kHz/48kHz for AUDIO_CLKOUTx,
but it didn't care its parent clock name.
This patch fixes it.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
vlv_display_irq_postinstall() enables the LPE audio interrupts
regardless of whether the LPE audio irq chip has masked/unmasked
them. Also the irqchip masking/unmasking doesn't consider the state
of the display power well or the device, and hence just leads to
dmesg spew when it tries to access the hardware while it's powered
down.
If the current way works, then we don't need to do anything in the
mask/unmask hooks. If it doesn't work, well, then we'd need to properly
track whether the irqchip has masked/unmasked the interrupts when
we enable display interrupts. And the mask/unmask hooks would need
to check whether display interrupts are even enabled before frobbing
with he registers.
So let's just assume the current way works and neuter the mask/unmask
hooks. Also clean up vlv_display_irq_postinstall() a bit and stop
it from trying to unmask/enable the LPE C interrupt on VLV since it
doesn't exist.
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170427160231.13337-4-ville.syrjala@linux.intel.com
Reviewed-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit ebf5f92147)
Reference: http://mid.mail-archive.com/874cf6d3-4e45-d4cf-e662-eb972490d2ce@redhat.com
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Ethernet networking on K2L has been broken since v4.11-rc1. This was
caused by commit 32a34441a9 ("ARM: keystone: dts: fix netcp clocks
and add names"). This commit inadvertently moves on-chip static RAM
clock to the end of list of clocks provided for netcp. Since keystone
PM domain support does not have a list of recognized con_ids, only the
first clock in the list comes under runtime PM management. This means
the OSR (On-chip Static RAM) clock remains disabled and that broke
networking on K2L.
The OSR is used by QMSS on K2L as an external linking RAM. However this
is a standalone RAM that can be used for non-QMSS usage (as well as from
DSP side). So add a SRAM device node for the same and add the OSR clock
to the node.
Remove the now redundant OSR clock node from netcp.
To manage all clocks defined for netCP's use by runtime PM needs keystone
generic power domain (genpd) driver support which is under works.
Meanwhile, this patch restores K2L networking and is correct irrespective
of any future genpd work since OSR is an independent module and not part
of NetCP anyway.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Acked-by: Tero Kristo <t-kristo@ti.com>
[nsekhar@ti.com: commit message updates, port to latest mainline]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Cc: stable@vger.kernel.org # for 4.11
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Currently only the PCIe driver supports metadata, so we should not claim
integrity support for the other drivers. This prevents nasty crashes
with targets that advertise metadata support on fabrics.
Also use the opportunity to factor out some code into a separate helper
that isn't even compiled if CONFIG_BLK_DEV_INTEGRITY is disabled.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
So that we can have more flags for transport-specific behavior.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
This is what most of the code already does and gives much more useful
prefixes than the device embedded in the pci_dev.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
A bunch of bug fixes:
- Fix display flickering on some chips at high refresh rates
- suspend/resume fix
- hotplug fix
- a couple of segfault fixes for certain cases
* 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: fix null point error when rmmod amdgpu.
drm/amd/powerplay: fix a signedness bugs
drm/amdgpu: fix NULL pointer panic of emit_gds_switch
drm/radeon: Unbreak HPD handling for r600+
drm/amd/powerplay/smu7: disable mclk switching for high refresh rates
drm/amd/powerplay/smu7: add vblank check for mclk switching (v2)
drm/radeon/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)
drm/amdgpu: fix fundamental suspend/resume issue
Core Changes:
- Don't drop vblank reference more than once in cases of ww retry (Daniel)
Driver Changes:
- radeon: Fix oops during radeon probe trying to reference wrong device (Lukas)
- qxl: Avoid sleeping while in atomic context on cursor update (Gabriel)
- gma500: Use VBT mode instead of pre-programmed mode for LVDS (Patrik)
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
* tag 'drm-misc-fixes-2017-05-25' of git://anongit.freedesktop.org/git/drm-misc:
drm/gma500/psb: Actually use VBT mode when it is found
drm: Fix deadlock retry loop in page_flip_ioctl
drm: qxl: Delay entering atomic context during cursor update
drm/radeon: Fix oops upon driver load on PowerXpress laptops
Update of maintainers entry for Samsung SoC.
* tag 'samsung-fixes-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
MAINTAINERS: Remove Javier Martinez Canillas as reviewer for Exynos
Signed-off-by: Olof Johansson <olof@lixom.net>
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for 4.12,
please pull the following:
- Phil provides a fix for the BCM283x (Raspberry Pi) by flagging the first
4KiB of physical memory as a reserved region in order to let the secondary
cores successfully spin until they are brought online
* tag 'arm-soc/for-4.12/devicetree-fixes-2' of http://github.com/Broadcom/stblinux:
ARM: dts: bcm283x: Reserve first page for firmware
Signed-off-by: Olof Johansson <olof@lixom.net>
Enable some very core config options used on 64bit Rockchip socs.
As built-in driver enable the Rockchip spi driver as well as the
cros-ec-spi and cros-ec keyboard driver, as this may be helpful
in case an initrd does not work as expected and drops the user
into a shell. Another built-in is the fan53555 regulator driver,
as it and its register-compatible cousins Silergy syr827 and syr828
are often used on Rockchip socs as cpu-supply next to regular pmic.
The rest can be enabled as modules and contains the pcie host
controller and its phy, the sucessive approximation adc (saradc)
that gets often used for additional buttons on Rockchip boards
as well as the adc-keys Keyboard driver for these keys.
The cros-ec-pwm also can be a module, as it is normally only used to
drive display backlights as well as the Rockchip thermal controller
that allows to read the cpu and gpu temperatures and affect frequency
scaling if necessary.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
These fix issues with power management initialization code
on DaVinci. Some resources were getting freed prematurely.
And there was an issue with resources not being on error.
* tag 'davinci-fixes-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
ARM: davinci: PM: Do not free useful resources in normal path in 'davinci_pm_init'
ARM: davinci: PM: Free resources in error handling path in 'davinci_pm_init'
Signed-off-by: Olof Johansson <olof@lixom.net>
In the loadbalance arp monitoring scheme, when a slave link change is
detected, the slave->link is immediately updated and slave_state_changed
is set. Later down the function, the rtnl_lock is acquired and the
changes are committed, updating the bond link state.
However, the acquisition of the rtnl_lock can fail. The next time the
monitor runs, since slave->link is already updated, it determines that
link is unchanged. This results in the bond link state permanently out
of sync with the slave link.
This patch modifies bond_loadbalance_arp_mon() to handle link changes
identical to bond_ab_arp_{inspect/commit}(). The new link state is
maintained in slave->new_link until we're ready to commit at which point
it's copied into slave->link.
NOTE: miimon_{inspect/commit}() has a more complex state machine
requiring the use of the bond_{propose,commit}_link_state() functions
which maintains the intermediate state in slave->link_new_state. The arp
monitors don't require that.
Testing: This bug is very easy to reproduce with the following steps.
1. In a loop, toggle a slave link of a bond slave interface.
2. In a separate loop, do ifconfig up/down of an unrelated interface to
create contention for rtnl_lock.
Within a few iterations, the bond link goes out of sync with the slave
link.
Signed-off-by: Nithin Nayak Sujir <nsujir@tintri.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Jay Vosburgh <jay.vosburgh@canonical.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some JITs can optimize comparisons with zero. Add a couple of
BPF_JSGE tests against immediate zero.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
Various BPF fixes
Follow-up to fix incorrect pruning when alignment tracking is
in use and to properly clear regs after call to not leave stale
data behind, also a fix that adds bpf_clone_redirect to the
bpf_helper_changes_pkt_data helper and exposes correct map_flags
for lpm map into fdinfo. For details, please see individual
patches.
v1 -> v2:
- Reworked first patch so that env->strict_alignment is the
final indicator on whether we have to deal with strict
alignment rather than having CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
checks on various locations, so only checking env->strict_alignment
is sufficient after that. Thanks for spotting, Dave!
- Added patch 3 and 4.
- Rest as is.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds various verifier test cases:
1) A test case for the pruning issue when tracking alignment
is used.
2) Various PTR_TO_MAP_VALUE_OR_NULL tests to make sure pointer
arithmetic turns such register into UNKNOWN_VALUE type.
3) Test cases for the special treatment of LD_ABS/LD_IND to
make sure verifier doesn't break calling convention here.
Latter is needed, since f.e. arm64 JIT uses r1 - r5 for
storing temporary data, so they really must be marked as
NOT_INIT.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
trie_alloc() always needs to have BPF_F_NO_PREALLOC passed in via
attr->map_flags, since it does not support preallocation yet. We
check the flag, but we never copy the flag into trie->map.map_flags,
which is later on exposed into fdinfo and used by loaders such as
iproute2. Latter uses this in bpf_map_selfcheck_pinned() to test
whether a pinned map has the same spec as the one from the BPF obj
file and if not, bails out, which is currently the case for lpm
since it exposes always 0 as flags.
Also copy over flags in array_map_alloc() and stack_map_alloc().
They always have to be 0 right now, but we should make sure to not
miss to copy them over at a later point in time when we add actual
flags for them to use.
Fixes: b95a5c4db0 ("bpf: add a longest prefix match trie map implementation")
Reported-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bpf_clone_redirect() still needs to be listed in
bpf_helper_changes_pkt_data() since we call into
bpf_try_make_head_writable() from there, thus we need
to invalidate prior pkt regs as well.
Fixes: 36bbef52c7 ("bpf: direct packet write and access for helpers for clsact progs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, after performing helper calls, we clear all caller saved
registers, that is r0 - r5 and fill r0 depending on struct bpf_func_proto
specification. The way we reset these regs can affect pruning decisions
in later paths, since we only reset register's imm to 0 and type to
NOT_INIT. However, we leave out clearing of other variables such as id,
min_value, max_value, etc, which can later on lead to pruning mismatches
due to stale data.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when we enforce alignment tracking on direct packet access,
the verifier lets the following program pass despite doing a packet
write with unaligned access:
0: (61) r2 = *(u32 *)(r1 +76)
1: (61) r3 = *(u32 *)(r1 +80)
2: (61) r7 = *(u32 *)(r1 +8)
3: (bf) r0 = r2
4: (07) r0 += 14
5: (25) if r7 > 0x1 goto pc+4
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
6: (2d) if r0 > r3 goto pc+1
R0=pkt(id=0,off=14,r=14) R1=ctx R2=pkt(id=0,off=0,r=14)
R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
7: (63) *(u32 *)(r0 -4) = r0
8: (b7) r0 = 0
9: (95) exit
from 6 to 8:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1 R10=fp
8: (b7) r0 = 0
9: (95) exit
from 5 to 10:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=2 R10=fp
10: (07) r0 += 1
11: (05) goto pc-6
6: safe <----- here, wrongly found safe
processed 15 insns
However, if we enforce a pruning mismatch by adding state into r8
which is then being mismatched in states_equal(), we find that for
the otherwise same program, the verifier detects a misaligned packet
access when actually walking that path:
0: (61) r2 = *(u32 *)(r1 +76)
1: (61) r3 = *(u32 *)(r1 +80)
2: (61) r7 = *(u32 *)(r1 +8)
3: (b7) r8 = 1
4: (bf) r0 = r2
5: (07) r0 += 14
6: (25) if r7 > 0x1 goto pc+4
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
7: (2d) if r0 > r3 goto pc+1
R0=pkt(id=0,off=14,r=14) R1=ctx R2=pkt(id=0,off=0,r=14)
R3=pkt_end R7=inv,min_value=0,max_value=1
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
8: (63) *(u32 *)(r0 -4) = r0
9: (b7) r0 = 0
10: (95) exit
from 7 to 9:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=0,max_value=1
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
9: (b7) r0 = 0
10: (95) exit
from 6 to 11:
R0=pkt(id=0,off=14,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
R3=pkt_end R7=inv,min_value=2
R8=imm1,min_value=1,max_value=1,min_align=1 R10=fp
11: (07) r0 += 1
12: (b7) r8 = 0
13: (05) goto pc-7 <----- mismatch due to r8
7: (2d) if r0 > r3 goto pc+1
R0=pkt(id=0,off=15,r=15) R1=ctx R2=pkt(id=0,off=0,r=15)
R3=pkt_end R7=inv,min_value=2
R8=imm0,min_value=0,max_value=0,min_align=2147483648 R10=fp
8: (63) *(u32 *)(r0 -4) = r0
misaligned packet access off 2+15+-4 size 4
The reason why we fail to see it in states_equal() is that the
third test in compare_ptrs_to_packet() ...
if (old->off <= cur->off &&
old->off >= old->range && cur->off >= cur->range)
return true;
... will let the above pass. The situation we run into is that
old->off <= cur->off (14 <= 15), meaning that prior walked paths
went with smaller offset, which was later used in the packet
access after successful packet range check and found to be safe
already.
For example: Given is R0=pkt(id=0,off=0,r=0). Adding offset 14
as in above program to it, results in R0=pkt(id=0,off=14,r=0)
before the packet range test. Now, testing this against R3=pkt_end
with 'if r0 > r3 goto out' will transform R0 into R0=pkt(id=0,off=14,r=14)
for the case when we're within bounds. A write into the packet
at offset *(u32 *)(r0 -4), that is, 2 + 14 -4, is valid and
aligned (2 is for NET_IP_ALIGN). After processing this with
all fall-through paths, we later on check paths from branches.
When the above skb->mark test is true, then we jump near the
end of the program, perform r0 += 1, and jump back to the
'if r0 > r3 goto out' test we've visited earlier already. This
time, R0 is of type R0=pkt(id=0,off=15,r=0), and we'll prune
that part because this time we'll have a larger safe packet
range, and we already found that with off=14 all further insn
were already safe, so it's safe as well with a larger off.
However, the problem is that the subsequent write into the packet
with 2 + 15 -4 is then unaligned, and not caught by the alignment
tracking. Note that min_align, aux_off, and aux_off_align were
all 0 in this example.
Since we cannot tell at this time what kind of packet access was
performed in the prior walk and what minimal requirements it has
(we might do so in the future, but that requires more complexity),
fix it to disable this pruning case for strict alignment for now,
and let the verifier do check such paths instead. With that applied,
the test cases pass and reject the program due to misalignment.
Fixes: d117441674 ("bpf: Track alignment of register values in the verifier.")
Reference: http://patchwork.ozlabs.org/patch/761909/
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7d472a59c0 ("arp: always override
existing neigh entries with gratuitous ARP") introduced a compiler
warning:
net/ipv4/arp.c:880:35: warning: 'addr_type' may be used uninitialized in
this function [-Wmaybe-uninitialized]
While the code logic seems to be correct and doesn't allow the variable
to be used uninitialized, and the warning is not consistently
reproducible, it's still worth fixing it for other people not to waste
time looking at the warning in case it pops up in the build environment.
Yes, compiler is probably at fault, but we will need to accommodate.
Fixes: 7d472a59c0 ("arp: always override existing neigh entries with gratuitous ARP")
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fastopen API should be used to perform fastopen operations on the TCP
socket. It does not make sense to use fastopen API to perform disconnect
by calling it with AF_UNSPEC. The fastopen data path is also prone to
race conditions and bugs when using with AF_UNSPEC.
One issue reported and analyzed by Vegard Nossum is as follows:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Thread A: Thread B:
------------------------------------------------------------------------
sendto()
- tcp_sendmsg()
- sk_stream_memory_free() = 0
- goto wait_for_sndbuf
- sk_stream_wait_memory()
- sk_wait_event() // sleep
| sendto(flags=MSG_FASTOPEN, dest_addr=AF_UNSPEC)
| - tcp_sendmsg()
| - tcp_sendmsg_fastopen()
| - __inet_stream_connect()
| - tcp_disconnect() //because of AF_UNSPEC
| - tcp_transmit_skb()// send RST
| - return 0; // no reconnect!
| - sk_stream_wait_connect()
| - sock_error()
| - xchg(&sk->sk_err, 0)
| - return -ECONNRESET
- ... // wake up, see sk->sk_err == 0
- skb_entail() on TCP_CLOSE socket
If the connection is reopened then we will send a brand new SYN packet
after thread A has already queued a buffer. At this point I think the
socket internal state (sequence numbers etc.) becomes messed up.
When the new connection is closed, the FIN-ACK is rejected because the
sequence number is outside the window. The other side tries to
retransmit,
but __tcp_retransmit_skb() calls tcp_trim_head() on an empty skb which
corrupts the skb data length and hits a BUG() in copy_and_csum_bits().
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Hence, this patch adds a check for AF_UNSPEC in the fastopen data path
and return EOPNOTSUPP to user if such case happens.
Fixes: cf60af03ca ("tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The default value for somaxconn is set in sysctl_core_net_init(), but this
function is not called when kernel is configured without CONFIG_SYSCTL.
This results in the kernel not being able to accept TCP connections,
because the backlog has zero size. Usually, the user ends up with:
"TCP: request_sock_TCP: Possible SYN flooding on port 7. Dropping request. Check SNMP counters."
If SYN cookies are not enabled the connection is rejected.
Before ef547f2ac1 (tcp: remove max_qlen_log), the effects were less
severe, because the backlog was always at least eight slots long.
Signed-off-by: Roman Kapl <roman.kapl@sysgo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use wait_for_completion_timeout() instead of
wait_for_completion_interruptible_timeout() to avoid stray signals ruining
firmware update. Our timeout is only 300 msec so we are fine simply letting
it expire in case device misbehaves.
Signed-off-by: KT Liao <kt.liao@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Some old touchpad FWs need to have interrupt cleared before issuing reset
command after updating firmware. We clear interrupt by attempting to read
full report from the controller, and discarding any data read.
Signed-off-by: KT Liao <kt.liao@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Add null check to avoid a potential null pointer dereference.
Addresses-Coverity-ID: 1408831
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since 9b4437a5b8 ("geneve: Unify LWT and netdev handling.") fill_info
does not return UDP_ZERO_CSUM6_RX when using COLLECT_METADATA. This is
because it uses ip_tunnel_info_af() with the device level info, which is
not valid for COLLECT_METADATA.
Fix by checking for the presence of the actual sockets.
Fixes: 9b4437a5b8 ("geneve: Unify LWT and netdev handling.")
Signed-off-by: Eric Garver <e@erig.me>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently several places in xfs_find_get_desired_pgoff() handle the case
of a missing page. Make them all handled in one place after the loop has
terminated.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
There is an off-by-one error in loop termination conditions in
xfs_find_get_desired_pgoff() since 'end' may index a page beyond end of
desired range if 'endoff' is page aligned. It doesn't have any visible
effects but still it is good to fix it.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
XFS SEEK_HOLE implementation could miss a hole in an unwritten extent as
can be seen by the following command:
xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
-c "seek -h 0" file
wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (49.312 MiB/sec and 12623.9856 ops/sec)
wrote 8192/8192 bytes at offset 131072
8 KiB, 2 ops; 0.0000 sec (70.383 MiB/sec and 18018.0180 ops/sec)
Whence Result
HOLE 139264
Where we can see that hole at offset 56k was just ignored by SEEK_HOLE
implementation. The bug is in xfs_find_get_desired_pgoff() which does
not properly detect the case when pages are not contiguous.
Fix the problem by properly detecting when found page has larger offset
than expected.
CC: stable@vger.kernel.org
Fixes: d126d43f63
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
xfs_find_get_desired_pgoff() is used to search for offset of hole or
data in page range [index, end] (both inclusive), and the max number
of pages to search should be at least one, if end == index.
Otherwise the only page is missed and no hole or data is found,
which is not correct.
When block size is smaller than page size, this can be demonstrated
by preallocating a file with size smaller than page size and writing
data to the last block. E.g. run this xfs_io command on a 1k block
size XFS on x86_64 host.
# xfs_io -fc "falloc 0 3k" -c "pwrite 2k 1k" \
-c "seek -d 0" /mnt/xfs/testfile
wrote 1024/1024 bytes at offset 2048
1 KiB, 1 ops; 0.0000 sec (33.675 MiB/sec and 34482.7586 ops/sec)
Whence Result
DATA EOF
Data at offset 2k was missed, and lseek(2) returned ENXIO.
This is uncovered by generic/285 subtest 07 and 08 on ppc64 host,
where pagesize is 64k. Because a recent change to generic/285
reduced the preallocated file size to smaller than 64k.
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
This structure copy was throwing unaligned access warnings on sparc64:
Kernel unaligned access at TPC[1043c088] xfs_btree_visit_blocks+0x88/0xe0 [xfs]
xfs_btree_copy_ptrs does a memcpy, which avoids it.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
The function get_free_pipe_id_locked() is called from
goldfish_pipe_open() with a lock is held, so we should
use GFP_ATOMIC instead of GFP_KERNEL.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 093d24a204 ("arm64: PCI: Manage controller-specific data on
per-controller basis") added code to allocate ACPI PCI root_ops
dynamically on a per host bridge basis but failed to update the
corresponding memory allocation failure path in pci_acpi_scan_root()
leading to a potential memory leakage.
Fix it by adding the required kfree call.
Fixes: 093d24a204 ("arm64: PCI: Manage controller-specific data on per-controller basis")
Reviewed-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Timmy Li <lixiaoping3@huawei.com>
[lorenzo.pieralisi@arm.com: refactored code, rewrote commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
kobject_del() only unlinks kobject, we need to use kobject_put() to
make sure kobject will go away completely.
Fixes: 049a59db34 ("firmware: Google VPD sysfs driver")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We should not free info->key before we remove sysfs attribute that uses
this data as its name.
Fixes: 049a59db34 ("firmware: Google VPD sysfs driver")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We should only add section attribute to the list of section attributes
if we successfully created corresponding sysfs attribute.
Fixes: 049a59db34 ("firmware: Google VPD sysfs driver")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Providing "scv" support to userspace requires kernel support, so it
must be advertised as independently to the base ISA 3 instruction set.
The darn instruction relies on firmware enablement, so it has been
decided to split this out from the core ISA 3 feature as well.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit ac29c64089 ("powerpc/mm: Replace _PAGE_USER with
_PAGE_PRIVILEGED") swapped _PAGE_USER for _PAGE_PRIVILEGED, and
introduced check_pte_access() which denied kernel access to
non-_PAGE_PRIVILEGED pages.
However, it didn't add _PAGE_PRIVILEGED to the hash fault handler
for spufs' kernel accesses, so the DMAs required to establish SPE
memory no longer work.
This change adds _PAGE_PRIVILEGED to the hash fault handler for
kernel accesses.
Fixes: ac29c64089 ("powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Reported-by: Sombat Tragolgosol <sombat3960@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently if you disable CONFIG_PPC_RADIX_MMU you'll crash on boot on
a P9. This is because we still set MMU_FTR_TYPE_RADIX via
ibm,pa-features and MMU_FTR_TYPE_RADIX is what's used for code patching
in much of the asm code (ie. slb_miss_realmode)
This patch fixes the problem by stopping MMU_FTR_TYPE_RADIX from being
set from ibm.pa-features.
We may eventually end up removing the CONFIG_PPC_RADIX_MMU option
completely but until then this fixes the issue.
Fixes: 17a3dd2f5f ("powerpc/mm/radix: Use firmware feature to enable Radix MMU")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
opal_npu_destroy_context() should be called with the NPU PHB, not the
PCIe PHB.
Fixes: 1ab66d1fba ("powerpc/powernv: Introduce address translation services for Nvlink2")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
A rare randconfig build error shows up when we have CONFIG_CRYPTO=m
in combination with a built-in CCREE driver:
crypto/hmac.o: In function `hmac_update':
hmac.c:(.text.hmac_update+0x28): undefined reference to `crypto_shash_update'
crypto/hmac.o: In function `hmac_setkey':
hmac.c:(.text.hmac_setkey+0x90): undefined reference to `crypto_shash_digest'
hmac.c:(.text.hmac_setkey+0x154): undefined reference to `crypto_shash_update'
drivers/staging/ccree/ssi_cipher.o: In function `ssi_blkcipher_setkey':
ssi_cipher.c:(.text.ssi_blkcipher_setkey+0x350): undefined reference to `crypto_shash_digest'
drivers/staging/ccree/ssi_cipher.o: In function `ssi_blkcipher_exit':
ssi_cipher.c:(.text.ssi_blkcipher_exit+0xd4): undefined reference to `crypto_destroy_tfm'
drivers/staging/ccree/ssi_cipher.o: In function `ssi_blkcipher_init':
ssi_cipher.c:(.text.ssi_blkcipher_init+0x1b0): undefined reference to `crypto_alloc_shash'
drivers/staging/ccree/ssi_cipher.o: In function `ssi_ablkcipher_free':
ssi_cipher.c:(.text.ssi_ablkcipher_free+0x48): undefined reference to `crypto_unregister_alg'
drivers/staging/ccree/ssi_cipher.o: In function `ssi_ablkcipher_alloc':
ssi_cipher.c:(.text.ssi_ablkcipher_alloc+0x138): undefined reference to `crypto_register_alg'
ssi_cipher.c:(.text.ssi_ablkcipher_alloc+0x274): undefined reference to `crypto_blkcipher_type'
We actually need to depend on both CRYPTO and CRYPTO_HW here to avoid the
problem, since CRYPTO_HW is a bool symbol and by itself that does not
force CCREE to be a loadable module when the core cryto support is modular.
Fixes: 50cfbbb7e6 ("staging: ccree: add ahash support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-By: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver calls ioremap() in the probe function but doesn't call
iounmap() in the remove function correspondingly. Do so now.
Follow commit 5c9d6abed9 ("serial: altera_jtaguart: adding iounmap()")
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit e61c38d85b ("serial: imx: setup DCEDTE early and ensure DCD and
RI irqs to be off") has a flaw: While UCR3 and UFCR were modified using
read-modify-write before it switched to write register values
independent of the previous state. That's a good idea in principle (and
that's why I did it) but needs more care.
This patch reinstates read-modify-write for UFCR and for UCR3 ensures
that RXDMUXSEL and ADNIMP are set for post imx1.
Fixes: e61c38d85b ("serial: imx: setup DCEDTE early and ensure DCD and RI irqs to be off")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Mika Penttilä <mika.penttila@nextfour.com>
Tested-by: Mika Penttilä <mika.penttila@nextfour.com>
Acked-by: Steve Twiss <stwiss.opensource@diasemi.com>
Tested-by: Steve Twiss <stwiss.opensource@diasemi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull SCSI fixes from James Bottomley:
"This is quite a big update because it includes a rework of the lpfc
driver to separate the NVMe part from the FC part.
The reason for doing this is because two separate trees (the nvme and
scsi trees respectively) want to update the individual components and
this separation will prevent a really nasty cross tree entanglement by
the time we reach the next merge window.
The rest of the fixes are the usual minor sort with no significant
security implications"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (25 commits)
scsi: zero per-cmd private driver data for each MQ I/O
scsi: csiostor: fix use after free in csio_hw_use_fwconfig()
scsi: ufs: Clean up some rpm/spm level SysFS nodes upon remove
scsi: lpfc: fix build issue if NVME_FC_TARGET is not defined
scsi: lpfc: Fix NULL pointer dereference during PCI error recovery
scsi: lpfc: update version to 11.2.0.14
scsi: lpfc: Add MDS Diagnostic support.
scsi: lpfc: Fix NVMEI's handling of NVMET's PRLI response attributes
scsi: lpfc: Cleanup entry_repost settings on SLI4 queues
scsi: lpfc: Fix debugfs root inode "lpfc" not getting deleted on driver unload.
scsi: lpfc: Fix NVME I+T not registering NVME as a supported FC4 type
scsi: lpfc: Added recovery logic for running out of NVMET IO context resources
scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/context
scsi: lpfc: Separate NVMET data buffer pool fir ELS/CT.
scsi: lpfc: Fix NMI watchdog assertions when running nvmet IOPS tests
scsi: lpfc: Fix NVMEI driver not decrementing counter causing bad rport state.
scsi: lpfc: Fix nvmet RQ resource needs for large block writes.
scsi: lpfc: Adding additional stats counters for nvme.
scsi: lpfc: Fix system crash when port is reset.
scsi: lpfc: Fix used-RPI accounting problem.
...
Fixes following signature in the stack trace:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000374
IP: [<ffffffffa06ec8eb>] qla2x00_sp_free_dma+0xeb/0x2a0 [qla2xxx]
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
when driver is loaded with Multi Queue enabled, it was noticed that
there was one less queue pair created.
Following message would indicate this:
"No resources to create additional q pair."
The result of one less queue pair means that system can crash, if the
block mq layer thinks there is an extra hardware queue available, and
the driver will use a NULL ptr qpair in that instance.
Following stack trace is seen in one of the crash:
irq_create_affinity_masks+0x98/0x530
irq_create_affinity_masks+0x98/0x530
__pci_enable_msix+0x321/0x4e0
mutex_lock+0x12/0x40
pci_alloc_irq_vectors_affinity+0xb5/0x140
qla24xx_enable_msix+0x79/0x530 [qla2xxx]
qla2x00_request_irqs+0x61/0x2d0 [qla2xxx]
qla2x00_probe_one+0xc73/0x2390 [qla2xxx]
ida_simple_get+0x98/0x100
kernfs_next_descendant_post+0x40/0x50
local_pci_probe+0x45/0xa0
pci_device_probe+0xfc/0x140
driver_probe_device+0x2c5/0x470
__driver_attach+0xdd/0xe0
driver_probe_device+0x470/0x470
bus_for_each_dev+0x6c/0xc0
driver_attach+0x1e/0x20
bus_add_driver+0x45/0x270
driver_register+0x60/0xe0
__pci_register_driver+0x4c/0x50
qla2x00_module_init+0x1ce/0x21e [qla2xxx]
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This makes it possible, with appropriate filesystem support, for a
sysadmin to tell what is affected by the mismatch, and whether
it should be ignored (if it's inside a swap partition, for
instance).
We ratelimit to prevent log flooding: if there are so many
mismatches that ratelimiting is necessary, the individual messages
are relatively unlikely to be important (either the machine is
swapping like crazy or something is very wrong with the disk).
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Previously, the uuid debug statements were printed in little-endian
format, which wasn't consistent in machines that might not be in
little-endian byte order. With this change, the output will be
consistent for all machines with different byte-ordering.
Signed-off-by: Kyungchan Koh <kkc6196@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
ext4_xattr_block_set() calls dquot_alloc_block() to charge for an xattr
block when new references are made. However if dquot_initialize() hasn't
been called on an inode, request for charging is effectively ignored
because ext4_inode_info->i_dquot is not initialized yet.
Add dquot_initialize() to call paths that lead to ext4_xattr_block_set().
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Currently we don't allow direct I/O on encrypted regular files, so in
such cases we return 0 early in ext4_direct_IO(). There was also an
additional BUG_ON() check in ext4_direct_IO_write(), but it can never be
hit because of the earlier check for the exact same condition in
ext4_direct_IO(). There was also no matching check on the read path,
which made the write path specific check seem very ad-hoc.
Just remove the unnecessary BUG_ON().
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: David Gstir <david@sigma-star.at>
Reviewed-by: Jan Kara <jack@suse.cz>
Now that we are passing a struct ext4_filename, we do not need to pass
around the original struct qstr too.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
The 'lend' argument of filemap_write_and_wait_range() is inclusive, so
we need to subtract 1 from pos + count.
Note that 'count' is guaranteed to be nonzero since
ext4_file_read_iter() returns early when given a 0 count.
Fixes: 16c5468859 ("ext4: Allow parallel DIO reads")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
ext4_find_unwritten_pgoff() is used to search for offset of hole or
data in page range [index, end] (both inclusive), and the max number
of pages to search should be at least one, if end == index.
Otherwise the only page is missed and no hole or data is found,
which is not correct.
When block size is smaller than page size, this can be demonstrated
by preallocating a file with size smaller than page size and writing
data to the last block. E.g. run this xfs_io command on a 1k block
size ext4 on x86_64 host.
# xfs_io -fc "falloc 0 3k" -c "pwrite 2k 1k" \
-c "seek -d 0" /mnt/ext4/testfile
wrote 1024/1024 bytes at offset 2048
1 KiB, 1 ops; 0.0000 sec (42.459 MiB/sec and 43478.2609 ops/sec)
Whence Result
DATA EOF
Data at offset 2k was missed, and lseek(2) returned ENXIO.
This is unconvered by generic/285 subtest 07 and 08 on ppc64 host,
where pagesize is 64k. Because a recent change to generic/285
reduced the preallocated file size to smaller than 64k.
Signed-off-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
get_reg() can be reentered on architectures with prioritized interrupts
(m68k in this case), causing f->reg_index to be incremented after the
range check. Out of bounds memory access past the pt_regs struct results.
This will go mostly undetected unless access is beyond end of memory.
Prevent the race by disabling interrupts in get_reg().
Tested on m68k (Atari Falcon, and ARAnyM emulator).
Kudos to Geert Uytterhoeven for helping to trace this race.
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We end up reading the interrupt register for HPD5, and then writing it
to HPD6 which on systems without anything using HPD5 results in
permanently disabling hotplug on one of the display outputs after the
first time we acknowledge a hotplug interrupt from the GPU.
This code is really bad. But for now, let's just fix this. I will
hopefully have a large patch series to refactor all of this soon.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lyude <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Daniel Borkmann says:
====================
BPF pruning follow-up
Follow-up to fix incorrect pruning when alignment tracking is
in use and to properly clear regs after call to not leave stale
data behind. For details, please see individual patches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Since virtio does not provide it's own ndo_features_check handler,
TSO, and now checksum offload, are disabled for stacked vlans.
Re-enable the support and let the host take care of it. This
restores/improves Guest-to-Guest performance over Q-in-Q vlans.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At least some of the be2net cards do not seem to be capabled
of performing checksum offload computions on Q-in-Q packets.
In these case, the recevied checksum on the remote is invalid
and TCP syn packets are dropped.
This patch adds a call to check disbled acceleration features
on Q-in-Q tagged traffic.
CC: Sathya Perla <sathya.perla@broadcom.com>
CC: Ajit Khaparde <ajit.khaparde@broadcom.com>
CC: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
CC: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It appears that TCP checksum offloading has been broken for
Q-in-Q vlans. The behavior was execerbated by the
series
commit afb0bc972b ("Merge branch 'stacked_vlan_tso'")
that that enabled accleleration features on stacked vlans.
However, event without that series, it is possible to trigger
this issue. It just requires a lot more specialized configuration.
The root cause is the interaction between how
netdev_intersect_features() works, the features actually set on
the vlan devices and HW having the ability to run checksum with
longer headers.
The issue starts when netdev_interesect_features() replaces
NETIF_F_HW_CSUM with a combination of NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM,
if the HW advertises IP|IPV6 specific checksums. This happens
for tagged and multi-tagged packets. However, HW that enables
IP|IPV6 checksum offloading doesn't gurantee that packets with
arbitrarily long headers can be checksummed.
This patch disables IP|IPV6 checksums on the packet for multi-tagged
packets.
CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
CC: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reinitializing the VM manager during suspend/resume is a very very bad
idea since all the VMs are still active and kicking.
This can lead to random VM faults after resume when new processes
become the same client ID assigned.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The 88m1101 has an errata when configuring autoneg. However, it was
being applied to many other Marvell PHYs as well. Limit its scope to
just the 88m1101.
Fixes: 76884679c6 ("phylib: Add support for Marvell 88e1111S and 88e1145")
Reported-by: Daniel Walker <danielwa@cisco.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Harini Katakam <harinik@xilinx.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix build errors by making this driver depend on OF_MDIO, like
several other similar drivers do.
drivers/built-in.o: In function `octeon_mdiobus_remove':
mdio-octeon.c:(.text+0x196ee0): undefined reference to `mdiobus_unregister'
mdio-octeon.c:(.text+0x196ee8): undefined reference to `mdiobus_free'
drivers/built-in.o: In function `octeon_mdiobus_probe':
mdio-octeon.c:(.text+0x196f1d): undefined reference to `devm_mdiobus_alloc_size'
mdio-octeon.c:(.text+0x196ffe): undefined reference to `of_mdiobus_register'
mdio-octeon.c:(.text+0x197010): undefined reference to `mdiobus_free'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
mlx5-fixes-2017-05-23
Some TC offloads fixes from Or Gerlitz.
From Erez, mlx5 IPoIB RX fix to improve GRO.
From Mohamad, Command interface fix to improve mitigation against FW
commands timeouts.
From Tariq, Driver load Tolerance against affinity settings failures.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
Just two fixes this time:
* fix the scheduled scan "BUG: scheduling while atomic"
* check mesh address extension flags more strictly
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Some PHY require to wait for a bit after the reset GPIO has been
toggled. This adds support for the DT property `phy-reset-post-delay`
which gives the delay in milliseconds to wait after reset.
If the DT property is not given, no delay is observed. Post reset delay
greater than 1000ms are invalid.
Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long says:
====================
sctp: a bunch of fixes for processing dupcookie
After introducing transport hashtable and per stream info into sctp,
some regressions were caused when processing dupcookie, this patchset
is to fix them.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After sctp changed to use transport hashtable, a transport would be
added into global hashtable when adding the peer to an asoc, then
the asoc can be got by searching the transport in the hashtbale.
The problem is when processing dupcookie in sctp_sf_do_5_2_4_dupcook,
a new asoc would be created. A peer with the same addr and port as
the one in the old asoc might be added into the new asoc, but fail
to be added into the hashtable, as they also belong to the same sk.
It causes that sctp's dupcookie processing can not really work.
Since the new asoc will be freed after copying it's information to
the old asoc, it's more like a temp asoc. So this patch is to fix
it by setting it as a temp asoc to avoid adding it's any transport
into the hashtable and also avoid allocing assoc_id.
An extra thing it has to do is to also alloc stream info for any
temp asoc, as sctp dupcookie process needs it to update old asoc.
But I don't think it would hurt something, as a temp asoc would
always be freed after finishing processing cookie echo packet.
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 3dbcc105d5 ("sctp: alloc stream info when initializing
asoc"), stream and stream.out info are always alloced when creating
an asoc.
So it's not correct to check !asoc->stream before updating stream
info when processing dupcookie, but would be better to check asoc
state instead.
Fixes: 3dbcc105d5 ("sctp: alloc stream info when initializing asoc")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If multiple tasks attempt to read the stats, it may happen that the
start_req_done completion is re-initialized while still being used by
another task, causing a list corruption.
This patch fixes the bug by adding a mutex to serialize the calls to
bnx2fc_get_host_stats().
WARNING: at lib/list_debug.c:48 list_del+0x6e/0xa0() (Not tainted)
Hardware name: PowerEdge R820
list_del corruption. prev->next should be ffff882035627d90, but was ffff884069541588
Pid: 40267, comm: perl Not tainted 2.6.32-642.3.1.el6.x86_64 #1
Call Trace:
[<ffffffff8107c691>] ? warn_slowpath_common+0x91/0xe0
[<ffffffff8107c796>] ? warn_slowpath_fmt+0x46/0x60
[<ffffffff812ad16e>] ? list_del+0x6e/0xa0
[<ffffffff81547eed>] ? wait_for_common+0x14d/0x180
[<ffffffff8106c4a0>] ? default_wake_function+0x0/0x20
[<ffffffff81547fd3>] ? wait_for_completion_timeout+0x13/0x20
[<ffffffffa05410b1>] ? bnx2fc_get_host_stats+0xa1/0x280 [bnx2fc]
[<ffffffffa04cf630>] ? fc_stat_show+0x90/0xc0 [scsi_transport_fc]
[<ffffffffa04cf8b6>] ? show_fcstat_tx_frames+0x16/0x20 [scsi_transport_fc]
[<ffffffff8137c647>] ? dev_attr_show+0x27/0x50
[<ffffffff8113b9be>] ? __get_free_pages+0xe/0x50
[<ffffffff812170e1>] ? sysfs_read_file+0x111/0x200
[<ffffffff8119a305>] ? vfs_read+0xb5/0x1a0
[<ffffffff8119b0b6>] ? fget_light_pos+0x16/0x50
[<ffffffff8119a651>] ? sys_read+0x51/0xb0
[<ffffffff810ee1fe>] ? __audit_syscall_exit+0x25e/0x290
[<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When pci_enable_device() or pci_enable_device_mem() fail in
qla2x00_probe_one() we bail out but do a call to
pci_disable_device(). This causes the dev_WARN_ON() in
pci_disable_device() to trigger, as the device wasn't enabled
previously.
So instead of taking the 'probe_out' error path we can directly return
*iff* one of the pci_enable_device() calls fails.
Additionally rename the 'probe_out' goto label's name to the more
descriptive 'disable_device'.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: e315cd28b9 ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
Cc: <stable@vger.kernel.org>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Element size in the manifest should be updated for each token, so that the
loop can parse all the string elements in the manifest. This was not
happening when more than two string elements appear consecutively, as it is
not updated with correct string element size. Fixed with this patch.
Signed-off-by: Shreyas NC <shreyas.nc@intel.com>
Signed-off-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
In SKL+ platforms, all IPC commands are serialised, i.e. the driver sends
a new IPC to DSP, only after receiving a reply from the firmware for the
current IPC.
Hence it seems apparent that there is only a single modifier of the IPC RX
List. However, during an IPC timeout case in a multithreaded environment,
there is a possibility of the list element being deleted two times if not
properly protected.
So, use spin lock save/restore to prevent rx_list corruption.
Signed-off-by: Pardha Saradhi K <pardha.saradhi.kesapragada@intel.com>
Signed-off-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
commit 90431eb49b ("ASoC: rsnd: don't use PDTA bit for 24bit on SSI")
fixups 24bit mode data alignment, but PIO was not cared.
This patch fixes PIO mode 24bit data alignment
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
A somewhat overdue update of the address for sending patches on Wolfson
parts to since our acquision a couple of years ago by Cirrus Logic.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
soc_cleanup_card_resources() call snd_card_free() at the last of its
procedure. This turned out to lead to a use-after-free.
PCM runtimes have been already removed via soc_remove_pcm_runtimes(),
while it's dereferenced later in soc_pcm_free() called via
snd_card_free().
The fix is simple: just move the snd_card_free() call to the beginning
of the whole procedure. This also gives another benefit: it
guarantees that all operations have been shut down before actually
releasing the resources, which was racy until now.
Reported-and-tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: <stable@vger.kernel.org>
In most cases, a cgroup controller don't care about the liftimes of
cgroups. For the controller, a css becomes online when ->css_online()
is called on it and offline when ->css_offline() is called.
However, cpuset is special in that the user interface it exposes cares
whether certain cgroups exist or not. Combined with the RCU delay
between cgroup removal and css offlining, this can lead to user
visible behavior oddities where operations which should succeed after
cgroup removals fail for some time period. The effects of cgroup
removals are delayed when seen from userland.
This patch adds css_is_dying() which tests whether offline is pending
and updates is_cpuset_online() so that the function returns false also
while offline is pending. This gets rid of the userland visible
delays.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Link: http://lkml.kernel.org/r/327ca1f5-7957-fbb9-9e5f-9ba149d40ba2@oracle.com
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Gabriel will no longer maintain this driver, so I'm adding
myself as maintainer.
Thanks for all your work on jsm driver Gabriel.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the ceph client doesn't respect the rlimit in fallocate. This
means that a user can allocate a file with size > RLIMIT_FSIZE. This
patch adds the call to inode_newsize_ok() to verify filesystem limits and
ulimits. This should make ceph successfully run xfstest generic/228.
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Pull ptrace fix from Eric Biederman:
"This fixes a brown paper bag bug. When I fixed the ptrace interaction
with user namespaces I added a new field ptracer_cred in struct_task
and I failed to properly initialize it on fork.
This dangling pointer wound up breaking runing setuid applications run
from the enlightenment window manager.
As this is the worst sort of bug. A regression breaking user space for
no good reason let's get this fixed"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ptrace: Properly initialize ptracer_cred on fork
Pull MMC fixes from Ulf Hansson:
"A couple of MMC host fixes intended for v4.12 rc3:
- sdhci-xenon: Don't free data for phy allocated by devm*
- sdhci-iproc: Suppress spurious interrupts
- cavium: Fix probing race with regulator
- cavium: Prevent crash with incomplete DT
- cavium-octeon: Use proper GPIO name for power control
- cavium-octeon: Fix interrupt enable code"
* tag 'mmc-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read
mmc: cavium: Fix probing race with regulator
of/platform: Make of_platform_device_destroy globally visible
mmc: cavium: Prevent crash with incomplete DT
mmc: cavium-octeon: Use proper GPIO name for power control
mmc: cavium-octeon: Fix interrupt enable code
mmc: sdhci-xenon: kill xenon_clean_phy()
The cryptographic engine nodes have an interrupt which is configured as
both edge and level, which makes no sense at all. Fix this by
configuring it the right way (level).
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
This reverts commit 368e5fbdfc.
devm_ioremap_resource() enforces that there are no overlapping
resources, where as devm_ioremap() does not. The sata phy driver needs
a subset of the sata IO address space, so maps some of the sata
address space. As a result, sata_mv now fails to probe, reporting it
cannot get its resources, and so we don't have any SATA disks.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Tejun Heo <tj@kernel.org>
In the current form of the code, if a->replacementlen is 0, the reference
to *insnbuf for comparison touches potentially garbage memory. While it
doesn't affect the execution flow due to the subsequent a->replacementlen
comparison, it is (rightly) detected as use of uninitialized memory by a
runtime instrumentation currently under my development, and could be
detected as such by other tools in the future, too (e.g. KMSAN).
Fix the "false-positive" by reordering the conditions to first check the
replacement instruction length before referencing specific opcode bytes.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/20170524135500.27223-1-mjurczyk@google.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It's possible and acceptable for NFS to attempt to add requests beyond the
range of the current pgio->pg_lseg, a case which should be caught and
limited by the pg_test operation. However, the current handling of this
case replaces pgio->pg_lseg with a new layout segment (after a WARN) within
that pg_test operation. That will cause all the previously added requests
to be submitted with this new layout segment, which may not be valid for
those requests.
Fix this problem by only returning zero for the number of bytes to coalesce
from pg_test for this case which allows any previously added requests to
complete on the current layout segment. The check for requests starting
out of range of the layout segment moves to pg_init, so that the
replacement of pgio->pg_lseg will be done when the next request is added.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If xdr_inline_decode() fails then we end up returning ERR_PTR(0). The
caller treats NULL returns as -ENOMEM so it doesn't really hurt runtime,
but obviously we intended to set an error code here.
Fixes: d67ae825a5 ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
Fixes: b685d3d65a
CC: reiserfs-devel@vger.kernel.org
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
Fixes: b685d3d65a
CC: Steven Whitehouse <swhiteho@redhat.com>
CC: cluster-devel@redhat.com
CC: stable@vger.kernel.org
Acked-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
If nf_conntrack_htable_size was adjusted by the user during the ct
dump operation, we may invoke nf_ct_put twice for the same ct, i.e.
the "last" ct. This will cause the ct will be freed but still linked
in hash buckets.
It's very easy to reproduce the problem by the following commands:
# while : ; do
echo $RANDOM > /proc/sys/net/netfilter/nf_conntrack_buckets
done
# while : ; do
conntrack -L
done
# iperf -s 127.0.0.1 &
# iperf -c 127.0.0.1 -P 60 -t 36000
After a while, the system will hang like this:
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [bash:20184]
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [iperf:20382]
...
So at last if we find cb->args[1] is equal to "last", this means hash
resize happened, then we can set cb->args[1] to 0 to fix the above
issue.
Fixes: d205dc4079 ("[NETFILTER]: ctnetlink: fix deadlock in table dumping")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The hi6220_reset driver can be built as a standalone module
yet it cannot be loaded because it depends on GPL exported symbols.
Lets set the module license so that the module loads, and things like
the on-board kirin drm starts working.
Signed-off-by: Jeremy Linton <lintonrjeremy@gmail.com>
Reviewed-by: Xinliang Liu <xinliang.liu@linaro.org>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
In the file arch/x86/mm/pat.c, there's a '__pat_enabled' variable. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.
However, on AMD K6-3 CPUs, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.
The result of this bug is that a kernel warning is produced when attempting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers:
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
...
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
To fix the bug change pat_enabled() so that it returns true only if PAT
initialization was actually done.
Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
this_cpu_has(X86_FEATURE_PAT) in pat_ap_init(), so that we check the PAT
feature on the processor that is being initialized.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1704181501450.26399@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We have been a little loose with our intermediate VMCR representation
where we had a 'ctlr' field, but we failed to differentiate between the
GICv2 GICC_CTLR and ICC_CTLR_EL1 layouts, and therefore ended up mapping
the wrong bits into the individual fields of the ICH_VMCR_EL2 when
emulating a GICv2 on a GICv3 system.
Fix this by using explicit fields for the VMCR bits instead.
Cc: Eric Auger <eric.auger@redhat.com>
Reported-by: wanghaibin <wanghaibin.wang@huawei.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Dave Jones and Steven Rostedt reported unwinder warnings like the
following:
WARNING: kernel stack frame pointer at ffff8800bda0ff30 in sshd:1090 has bad value 000055b32abf1fa8
In both cases, the unwinder was attempting to unwind from an ftrace
handler into entry code. The callchain was something like:
syscall entry code
C function
ftrace handler
save_stack_trace()
The problem is that the unwinder's end-of-stack logic gets confused by
the way ftrace lays out the stack frame (with fentry enabled).
I was able to recreate this warning with:
echo call_usermodehelper_exec_async:stacktrace > /sys/kernel/debug/tracing/set_ftrace_filter
(exit login session)
I considered fixing this by changing the ftrace code to rewrite the
stack to make the unwinder happy. But that seemed too intrusive after I
implemented it. Instead, just add another check to the unwinder's
end-of-stack logic to detect this special case.
Side note: We could probably get rid of these end-of-stack checks by
encoding the frame pointer for syscall entry just like we do for
interrupt entry. That would be simpler, but it would also be a lot more
intrusive since it would slightly affect the performance of every
syscall.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: live-patching@vger.kernel.org
Fixes: c32c47c68a ("x86/unwind: Warn on bad frame pointer")
Link: http://lkml.kernel.org/r/671ba22fbc0156b8f7e0cfa5ab2a795e08bc37e1.1495553739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Petr Mladek reported the following warning when loading the livepatch
sample module:
WARNING: CPU: 1 PID: 3699 at arch/x86/kernel/stacktrace.c:132 save_stack_trace_tsk_reliable+0x133/0x1a0
...
Call Trace:
__schedule+0x273/0x820
schedule+0x36/0x80
kthreadd+0x305/0x310
? kthread_create_on_cpu+0x80/0x80
? icmp_echo.part.32+0x50/0x50
ret_from_fork+0x2c/0x40
That warning means the end of the stack is no longer recognized as such
for newly forked tasks. The problem was introduced with the following
commit:
ff3f7e2475 ("x86/entry: Fix the end of the stack for newly forked tasks")
... which was completely misguided. It only partially fixed the
reported issue, and it introduced another bug in the process. None of
the other entry code saves the frame pointer before calling into C code,
so it doesn't make sense for ret_from_fork to do so either.
Contrary to what I originally thought, the original issue wasn't related
to newly forked tasks. It was actually related to ftrace. When entry
code calls into a function which then calls into an ftrace handler, the
stack frame looks different than normal.
The original issue will be fixed in the unwinder, in a subsequent patch.
Reported-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: live-patching@vger.kernel.org
Fixes: ff3f7e2475 ("x86/entry: Fix the end of the stack for newly forked tasks")
Link: http://lkml.kernel.org/r/f350760f7e82f0750c8d1dd093456eb212751caa.1495553739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Sync (copy) the following v4.12 kernel headers to the tooling headers:
arch/x86/include/asm/disabled-features.h:
arch/x86/include/uapi/asm/kvm.h:
arch/powerpc/include/uapi/asm/kvm.h:
arch/s390/include/uapi/asm/kvm.h:
arch/arm/include/uapi/asm/kvm.h:
arch/arm64/include/uapi/asm/kvm.h:
- 'struct kvm_sync_regs' got changed in an ABI-incompatible way,
fortunately none of the (in-kernel) tooling relied on it
- new KVM_DEV calls added
arch/x86/include/asm/required-features.h:
- 5-level paging hardware ABI detail added
arch/x86/include/asm/cpufeatures.h:
- new CPU feature added
arch/x86/include/uapi/asm/vmx.h:
- new VMX exit conditions
None of the changes requires fixes in the tooling source code.
This addresses the following warnings:
Warning: include/uapi/linux/stat.h differs from kernel
Warning: arch/x86/include/asm/disabled-features.h differs from kernel
Warning: arch/x86/include/asm/required-features.h differs from kernel
Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524065721.j2mlch6bgk5klgbc@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The __hpp__sort_acc() sorts entries using callchain depth in order to
put callers above in children mode. But it assumed the callchain order
was callee-first. Now default (for children) is caller-first so the
order of entries is reverted.
For example, consider following case:
$ perf report --no-children
..l
# Overhead Command Shared Object Symbol
# ........ ....... ................... ..........................
#
99.44% a.out a.out [.] main
|
---main
__libc_start_main
_start
Then children mode should show 'start' above '__libc_start_main' since
it's the caller (parent) of the __libc_start_main. But it's reversed:
# Children Self Command Shared Object Symbol
# ........ ........ ....... ............... .....................
#
99.61% 0.00% a.out libc-2.25.so [.] __libc_start_main
99.61% 0.00% a.out a.out [.] _start
99.54% 99.44% a.out a.out [.] main
This patch fixes it.
# Children Self Command Shared Object Symbol
# ........ ........ ....... ............... .....................
#
99.61% 0.00% a.out a.out [.] _start
99.61% 0.00% a.out libc-2.25.so [.] __libc_start_main
99.54% 99.44% a.out a.out [.] main
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-8-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The very last inlined frame, i.e. the one furthest away from the
non-inlined frame, was silently dropped. This is apparent when
comparing the output of `perf script` and `addr2line`:
~~~~~~
$ perf script --inline
...
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
$ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
0x0000000000000a4a
std::__complex_abs(doublecomplex )
/usr/include/c++/6.3.1/complex:589
double std::abs<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:597
double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:654
double std::norm<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:664
main
/tmp/inlining.cpp:14
~~~~~
Note how `std::__complex_abs` is missing from the `perf script`
output. This is similarly showing up in `perf report`. The patch
here fixes this issue, and the output becomes:
~~~~~
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::__complex_abs
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As the documentation for dwfl_frame_pc says, frames that
are no activation frames need to have their program counter
decremented by one to properly find the function of the caller.
This fixes many cases where perf report currently attributes
the cost to the next line. I.e. I have code like this:
~~~~~~~~~~~~~~~
#include <thread>
#include <chrono>
using namespace std;
int main()
{
this_thread::sleep_for(chrono::milliseconds(1000));
this_thread::sleep_for(chrono::milliseconds(100));
this_thread::sleep_for(chrono::milliseconds(10));
return 0;
}
~~~~~~~~~~~~~~~
Now compile and record it:
~~~~~~~~~~~~~~~
g++ -std=c++11 -g -O2 test.cpp
echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
perf record \
--event sched:sched_stat_sleep \
--event sched:sched_process_exit \
--event sched:sched_switch --call-graph=dwarf \
--output perf.data.raw \
./a.out
echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~
Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:9
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:10
| __libc_start_main
| _start
|
--0.91%--main test.cpp:13
__libc_start_main
_start
~~~~~~~~~~~~~~~
With this patch here applied, the issue is fixed. The report becomes
much more usable:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:8
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:9
| __libc_start_main
| _start
|
--0.91%--main test.cpp:10
__libc_start_main
_start
~~~~~~~~~~~~~~~
Similarly it works for signal frames:
~~~~~~~~~~~~~~~
__noinline void bar(void)
{
volatile long cnt = 0;
for (cnt = 0; cnt < 100000000; cnt++);
}
__noinline void foo(void)
{
bar();
}
void sig_handler(int sig)
{
foo();
}
int main(void)
{
signal(SIGUSR1, sig_handler);
raise(SIGUSR1);
foo();
return 0;
}
~~~~~~~~~~~~~~~~
Before, the report wrongly points to `signal.c:29` after raise():
~~~~~~~~~~~~~~~~
$ perf report --stdio --no-children -g srcline -s srcline
...
100.00% signal.c:11
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:29
__libc_start_main
_start
~~~~~~~~~~~~~~~~
With this patch in, the issue is fixed and we instead get:
~~~~~~~~~~~~~~~~
100.00% signal signal [.] bar
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:27
__libc_start_main
_start
~~~~~~~~~~~~~~~~
Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc(). For libunwind, we replace the functionality via
unw_is_signal_frame() for any but the very first frame.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a filename was found in addr2line it was duplicated via strdup()
but never freed. Now we pass NULL and handle this gracefully in
addr2line.
Detected by Valgrind:
==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
==16331== at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16331== by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
==16331== by 0x52769F: addr2line (srcline.c:256)
==16331== by 0x52769F: addr2inlines (srcline.c:294)
==16331== by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
==16331== by 0x574D7A: inline__fprintf (hist.c:41)
==16331== by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
==16331== by 0x57518A: __callchain__fprintf_graph (hist.c:212)
==16331== by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
==16331== by 0x57738E: hist_entry__fprintf (hist.c:628)
==16331== by 0x57738E: hists__fprintf (hist.c:882)
==16331== by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
==16331== by 0x44A20F: report__browse_hists (builtin-report.c:491)
==16331== by 0x44A20F: __cmd_report (builtin-report.c:624)
==16331== by 0x44A20F: cmd_report (builtin-report.c:1054)
==16331== by 0x4A49CE: run_builtin (perf.c:296)
==16331== by 0x4A4CC0: handle_internal_command (perf.c:348)
==16331== by 0x434371: run_argv (perf.c:392)
==16331== by 0x434371: main (perf.c:530)
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-3-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:
==8359== Invalid read of size 8
==8359== at 0x3096D9: map__rip_2objdump (map.c:430)
==8359== by 0x2FC1A3: match_chain_srcline (callchain.c:645)
==8359== by 0x2FC1A3: match_chain (callchain.c:700)
==8359== by 0x2FC1A3: append_chain (callchain.c:895)
==8359== by 0x2FC1A3: append_chain_children (callchain.c:846)
==8359== by 0x2FF719: callchain_append (callchain.c:944)
==8359== by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
==8359== by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
==8359== by 0x33195C: hist_entry_iter__add (hist.c:1050)
==8359== by 0x258F65: process_sample_event (builtin-report.c:204)
==8359== by 0x30D60C: perf_session__deliver_event (session.c:1310)
==8359== by 0x30D60C: ordered_events__deliver_event (session.c:119)
==8359== by 0x310D12: __ordered_events__flush (ordered-events.c:210)
==8359== by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
==8359== by 0x30DD3C: perf_session__process_user_event (session.c:1349)
==8359== by 0x30DD3C: perf_session__process_event (session.c:1475)
==8359== by 0x30FC3C: __perf_session__process_events (session.c:1867)
==8359== by 0x30FC3C: perf_session__process_events (session.c:1921)
==8359== by 0x25A985: __cmd_report (builtin-report.c:575)
==8359== by 0x25A985: cmd_report (builtin-report.c:1054)
==8359== by 0x2B9A80: run_builtin (perf.c:296)
==8359== Address 0x70 is not stack'd, malloc'd or (recently) free'd
This patch fixes the issue.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
[ Remove dependency from another change ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-2-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The current buffer is being reset to zero on device_free_chan_resources()
but not on device_terminate_all(). It could happen that HW is restarted and
expects BASE0 to be used, but the driver is not synchronized and will start
from BASE1. One solution is to reset the buffer explicitly in
m2p_hw_setup().
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Tweak the Kconfig description to mention support for NSP and make the
default on for iProc based platforms.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "devm_kcalloc".
This issue was detected by using the Coccinelle software.
Acked-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Making thermal_emergency_poweroff static fixes sparse warning:
drivers/thermal/thermal_core.c:6: warning: symbol
'thermal_emergency_poweroff' was not declared. Should it be static?
Fixes: ef1d87e06a ("thermal: core: Add a back up thermal shutdown mechanism")
Acked-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Building this driver with W=1 reports:
warning: variable 'trip' set but not used [-Wunused-but-set-variable]
The call for of_thermal_get_trip_points() is useless.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
We currently do
tcmu_free_device ->tcmu_netlink_event(TCMU_CMD_REMOVED_DEVICE) ->
uio_unregister_device -> kfree(tcmu_dev).
The problem is that the kernel does not wait for userspace to
do the close() on the uio device before freeing the tcmu_dev.
We can then hit a race where the kernel frees the tcmu_dev before
userspace does close() and so when close() -> release -> tcmu_release
is done, we try to access a freed tcmu_dev.
This patch made over the target-pending master branch moves the freeing
of the tcmu_dev to when the last reference has been dropped.
This also fixes a leak where if tcmu_configure_device was not called on a
device we did not free udev->name which was allocated at tcmu_alloc_device time.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
skb->data is assigned to task->hdr in cxgbi_conn_alloc_pdu(),
skb gets freed after tx but task->hdr is still dereferenced in
iscsi_tcp_task_xmit() to avoid this call skb_get() after allocating skb
and free the skb in cxgbi_cleanup_task() or before allocating new skb in
cxgbi_conn_alloc_pdu().
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
max_fin_rt is the maximum re-transmission of FIN packets
as part of the termination flow. After reaching this value
the FW will send a single RESET.
Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com>
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
munmap done by iscsiuio during a stop of the service triggers a "bad
pte" warning sometimes. munmap kernel path goes through the mmapped
pages and has a validation check for mapcount (in struct page) to be
zero or above. kzalloc, which we had used to allocate udev->ctrl, uses
slab allocations, which re-uses mapcount (union) for other purposes that
can make the mapcount look negative. Avoid all these trouble by invoking
one of the __get_free_pages wrappers to be used instead of kzalloc for
udev->ctrl.
BUG: Bad page map in process iscsiuio pte:80000000aa624067 pmd:3e6777067
page:ffffea0002a98900 count:2 mapcount:-2143289280
mapping: (null) index:0xffff8800aa624e00
page flags: 0x10075d00000090(dirty|slab)
page dumped because: bad pte
addr:00007fcba70a3000 vm_flags:0c0400fb anon_vma: (null)
mapping:ffff8803edf66e90 index:0
Call Trace:
dump_stack+0x19/0x1b
print_bad_pte+0x1af/0x250
unmap_page_range+0x7a7/0x8a0
unmap_single_vma+0x81/0xf0
unmap_vmas+0x49/0x90
unmap_region+0xbe/0x140
? vma_rb_erase+0x121/0x220
do_munmap+0x245/0x420
vm_munmap+0x41/0x60
SyS_munmap+0x22/0x30
tracesys+0xdd/0xe2
Signed-off-by: Arun Easi <arun.easi@cavium.com>
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A recent commit added extra printks for CPU/RT limits. This can result in
excessive spam in dmesg.
Make the printks conditional on print_fatal_signals.
Fixes: e7ea7c9806 ("rlimits: Print more information when CPU/RT limits are exceeded")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun Raghavan <arun@arunraghavan.net>
We need to clear the IPS_SRC_NAT_DONE_BIT to indicate that the ct has
been removed from nat_bysource table. But unfortunately, we use the
non-atomic bit operation: "ct->status &= ~IPS_NAT_DONE_MASK". So
there's a race condition that we may clear the _DYING_BIT set by
another CPU unexpectedly.
Since we don't care about the IPS_DST_NAT_DONE_BIT, so just using
clear_bit to clear the IPS_SRC_NAT_DONE_BIT is enough.
Also note, this is the last user which use the non-atomic bit operation
to update the confirmed ct->status.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The existing code selects no next branch to be inspected when
re-inserting an inactive element into the rb-tree, looping endlessly.
This patch restricts the check for active elements to the EEXIST case
only.
Fixes: e701001e7c ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates")
Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
sctp_compute_cksum() implementation assumes that at least the SCTP header
is in the linear part of skb: modify conntrack error callback to avoid
false CRC32c mismatch, if the transport header is partially/entirely paged.
Fixes: cf6e007eef ("netfilter: conntrack: validate SCTP crc32c in PREROUTING")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Some drivers - like i915 - may not support the system suspend direct
complete optimization due to differences in their runtime and system
suspend sequence. Add a flag that when set resumes the device before
calling the driver's system suspend handlers which effectively disables
the optimization.
Needed by a future patch fixing suspend/resume on i915.
Suggested by Rafael.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
If there is not enough space then ceph_decode_32_safe() does a goto bad.
We need to return an error code in that situation. The current code
returns ERR_PTR(0) which is NULL. The callers are not expecting that
and it results in a NULL dereference.
Fixes: f24e9980eb ("ceph: OSD client")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
None of these are validated in userspace, but since we do validate
reply_struct_v in ceph_x_proc_ticket_reply(), tkt_struct_v (first) and
CephXServiceTicket struct_v (second) in process_one_ticket(), validate
CephXTicketBlob struct_v as well.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
It's set but not used: CEPH_FEATURE_MONNAMES feature bit isn't
advertised, which guarantees a v1 MonMap.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
if we receive a compound such that:
- the sessionid, slot, and sequence number in the SEQUENCE op
match a cached succesful reply with N ops, and
- the Nth operation of the compound is a PUTFH, PUTPUBFH,
PUTROOTFH, or RESTOREFH,
then nfsd4_sequence will return 0 and set cstate->status to
nfserr_replay_cache. The current filehandle will not be set. This will
cause us to call check_nfsd_access with first argument NULL.
To nfsd4_compound it looks like we just succesfully executed an
operation that set a filehandle, but the current filehandle is not set.
Fix this by moving the nfserr_replay_cache earlier. There was never any
reason to have it after the encode_op label, since the only case where
he hit that is when opdesc->op_func sets it.
Note that there are two ways we could hit this case:
- a client is resending a previously sent compound that ended
with one of the four PUTFH-like operations, or
- a client is sending a *new* compound that (incorrectly) shares
sessionid, slot, and sequence number with a previously sent
compound, and the length of the previously sent compound
happens to match the position of a PUTFH-like operation in the
new compound.
The second is obviously incorrect client behavior. The first is also
very strange--the only purpose of a PUTFH-like operation is to set the
current filehandle to be used by the following operation, so there's no
point in having it as the last in a compound.
So it's likely this requires a buggy or malicious client to reproduce.
Reported-by: Scott Mayhew <smayhew@redhat.com>
Cc: stable@kernel.vger.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull i2c fixes from Wolfram Sang:
"Fix the i2c-designware regression of rc2.
Also, a DMA buffer fix for the tiny-usb driver where the USB core now
loudly complains about the non DMA-capable buffer"
[ I had cherry-picked the designware fix separately because it hit my
laptop, but here is the proper sync with the i2c tree - Linus ]
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: designware: Fix bogus sda_hold_time due to uninitialized vars
i2c: i2c-tiny-usb: fix buffer not being DMA capable
Commit 9da5ac236d ("ARM: soft-reboot into same mode that we entered
the kernel") added support to enter the new kernel in the same processor
mode as the previous one when we soft-reboot from one kernel into
another by pass a flag to cpu_reset() so it knows what to do exactly.
However it missed to make similar changes in MCPM code. Due to the
missing flag, the CPUs enter HYP mode which is not supported with MCPM.
MCPM works only in secure mode as it manages CCI.
This patch aligns the cpu_reset call in MCPM with other changes in the
above mentioned commit by making phys_reset_t to follow the prototype
of cpu_reset().
Fixes: 9da5ac236d ("ARM: soft-reboot into same mode that we entered the kernel")
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
The SMD channel is not the primary WCNSS channel and must explicitly be
closed as the device is removed, or the channel will already by open on
a subsequent probe call in e.g. the case of reloading the kernel module.
This issue was introduced because I simplified the underlying SMD
implementation while the SMD adaptions of the driver sat on the mailing
list, but missed to update these patches. The patch does however only
apply back to the transition to rpmsg, hence the limited Fixes.
Fixes: 5052de8def ("soc: qcom: smd: Transition client drivers from smd to rpmsg")
Reported-by: Eyal Ilsar <c_eilsar@qti.qualcomm.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The code in block/partitions/msdos.c recognizes FreeBSD, OpenBSD
and NetBSD partitions and does a reasonable job picking out OpenBSD
and NetBSD UFS subpartitions.
But for FreeBSD the subpartitions are always "bad".
Kernel: <bsd:bad subpartition - ignored
Though all 3 of these BSD systems use UFS as a file system, only
FreeBSD uses relative start addresses in the subpartition
declarations.
The following patch fixes this for FreeBSD partitions and leaves
the code for OpenBSD and NetBSD intact:
Signed-off-by: Richard Narron <comet.berkeley@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Masks for extracting part of the Completion Queue Entry (CQE)
field rss_hash_type was swapped, namely CQE_RSS_HTYPE_IP and
CQE_RSS_HTYPE_L4.
The bug resulted in setting skb->l4_hash, even-though the
rss_hash_type indicated that hash was NOT computed over the
L4 (UDP or TCP) part of the packet.
Added comments from the datasheet, to make it more clear what
these masks are selecting.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices need their multicast filter reset but others are crashed by that.
So the methods need to be separated.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reported-by: "Ridgway, Keith" <kridgway@harris.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2017-05-23
1) Fix wrong header offset for esp4 udpencap packets.
2) Fix a stack access out of bounds when creating a bundle
with sub policies. From Sabrina Dubroca.
3) Fix slab-out-of-bounds in pfkey due to an incorrect
sadb_x_sec_len calculation.
4) We checked the wrong feature flags when taking down
an interface with IPsec offload enabled.
Fix from Ilan Tayari.
5) Copy the anti replay sequence numbers when doing a state
migration, otherwise we get out of sync with the sequence
numbers. Fix from Antony Antony.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't set an error code on this path. It means that we return NULL
instead of an error pointer and the caller does a NULL dereference.
Fixes: 6d1d8050b4 ("block, partition: add partition_meta_info to hd_struct")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Add tolerance to failures of irq_set_affinity_hint().
Its role is to give hints that optimizes performance,
and should not block the driver load.
In non-SMP systems, functionality is not available as
there is a single core, and all these calls definitely
fail. Hence, do not call the function and avoid the
warning prints.
Fixes: db058a186f ("net/mlx5_core: Set irq affinity hints")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently when firmware command gets stuck or it takes long time to
complete, the driver command will get timeout and the command slot is
freed and can be used for new commands, and if the firmware receive new
command on the old busy slot its behavior is unexpected and this could
be harmful.
To fix this when the driver command gets timeout we return failure,
but we don't free the command slot and we wait for the firmware to
explicitly respond to that command.
Once all the entries are busy we will stop processing new firmware
commands.
Fixes: 9cba4ebcf3 ('net/mlx5: Fix potential deadlock in command mode change')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
IPoIB packet contains the pseudo header area, we need to pull it prior
to reset_mac_header in order to let the GRO work well.
In more details:
GRO checks the mac address of the new coming packet, it does that by
comparing the hard_header_len size of the current packet to the previous
one in that session, the comparison is over hard_header_len size.
Now, the driver prepares that area in the skb by allocating area from the
reserved part and resetting the correct mac header to it.
Fixes: 9d6bd752c6 ("net/mlx5e: IPoIB, RX handler")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The sparse tool emits these correct complaints:
drivers/net/ethernet/mellanox/mlx5/core//en_tc.c:1005:25: warning: cast to restricted __be32
drivers/net/ethernet/mellanox/mlx5/core//en_tc.c:1007:25: warning: cast to restricted __be16
The value is provided from user-space in network order, but there's
no way for them to realize that, avoid the warnings by casting to the
appropriate type.
Fixes: d79b6df6b1 ('net/mlx5e: Add parsing of TC pedit actions to HW format')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently we don't support partial header re-writes through TC pedit
action offloading. However, the code that enforces that wasn't err-ing
on cases where the first and last bits of the mask are set but there is
some zero bit between them, such as in the below example, fix that!
tc filter add dev enp1s0 protocol ip parent ffff: prio 10 flower
ip_proto udp dst_port 2001 skip_sw
action pedit munge ip src set 1.0.0.1 retain 0xff0000ff
Fixes: d79b6df6b1 ('net/mlx5e: Add parsing of TC pedit actions to HW format')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When offloading header re-writes, the HW re-calculates the relevant L3/L4
checksums. Hence, when upper layers (as done by OVS) ask for TC checksum action
offload together with pedit offload, don't err. This command now works:
tc filter add dev ens1f0 protocol ip parent ffff: prio 20 flower skip_sw
ip_proto tcp dst_port 9001
action pedit ex munge tcp dport set 0x1234 pipe
action csum tcp
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add the accessors for realizing if this is a csum action,
and for which fields checksum is needed.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We wrongly direcly invoke hlist_del_rcu() and not hash_del_rcu() which
does a slightly different call now and may change later, fix that.
Fixes: a54e20b4fc ('net/mlx5e: Add basic TC tunnel set action for SRIOV offloads')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When I introduced ptracer_cred I failed to consider the weirdness of
fork where the task_struct copies the old value by default. This
winds up leaving ptracer_cred set even when a process forks and
the child process does not wind up being ptraced.
Because ptracer_cred is not set on non-ptraced processes whose
parents were ptraced this has broken the ability of the enlightenment
window manager to start setuid children.
Fix this by properly initializing ptracer_cred in ptrace_init_task
This must be done with a little bit of care to preserve the current value
of ptracer_cred when ptrace carries through fork. Re-reading the
ptracer_cred from the ptracing process at this point is inconsistent
with how PT_PTRACE_CAP has been maintained all of these years.
Tested-by: Takashi Iwai <tiwai@suse.de>
Fixes: 64b875f7ac ("ptrace: Capture the ptracer's creds not PT_PTRACE_CAP")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
The description of the connection between the dwmmc (SDIO) controller and
the Wifi chip, which is attached to the SDIO bus is wrong. Currently the
SDIO card can't be detected and thus the Wifi doesn't work.
Let's fix this by assigning the correct vmmc supply, which is the always on
regulator VDD_3V3 and remove the WLAN enable regulator altogether. Then to
properly deal with the power on/off sequence, add a mmc-pwrseq node to
describe the resources needed to detect the SDIO card.
Except for the WLAN enable GPIO and its corresponding assert/de-assert
delays, the mmc-pwrseq node also contains a handle to a clock provided by
the hi655x pmic. This clock is also needed to be able to turn on the WiFi
chip.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Move the board specific descriptions for the dwmmc nodes in the hi6220 SoC
dtsi, into the hikey dts as it's there these belongs.
While changing this, let's take the opportunity to drop the use of the
"ti,non-removable" binding for one of the dwmmc device nodes, as it's not a
valid binding and not used. Drop also the unnecessary use of "num-slots =
<0x1>" for all of the dwmmc nodes, as there is no need to set this since
when default number of slots is one.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Add these regulators to better describe the HW, but also because those is
needed in following changes.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The regulator is a part of the hikey board, therefore let's move it from
the hi6220 SoC dtsi file into the hikey dts file . Let's also rename the
regulator according to the datasheet (5V_HUB) to better reflect the HW.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The hi655x PMIC provides the regulators but also a clock. The latter is
missing so let's add it. This clock is used by WiFi/Bluetooth chip, but
that connection is done in a separate change on top of this one.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
[Ulf: Split patch and updated changelog]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The hi655x PMIC provides the regulators but also a clock. The latter is
missing in the definition, so extend the documentation to include this as
well.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
[Ulf: Split patch and updated changelog]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
If the optional power-off-delay-us property is found, insert the
corresponding delay after asserting the GPIO during power off. This enables
a graceful shutdown sequence for some devices.
Cc: linux-mmc@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
During power off, after the GPIO pin has been asserted, some devices like
the Wifi chip from TI, Wl18xx, needs a delay before the host continues with
clock gating and turning off regulators as to follow a graceful shutdown
sequence.
Therefore invent an optional power-off-delay-us DT binding for
mmc-pwrseq-simple, to allow us to support this constraint.
Cc: devicetree@vger.kernel.org
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mmc@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
We use well known standard names for functions that have name, such as
I2C, SPI, SPDIF, etc..
Fix the function name of SPDIF, which was named OWA (One Wire Audio)
based on Allwinner datasheets.
Fixes: 4730f33f0d ("pinctrl: sunxi: add allwinner A83T PIO controller
support")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
To set the mux mode of a pin two bits must be set. Up to now this is
implemented using the following idiom:
writel(mask, reg + CLR);
writel(value, reg + SET);
. This however results in the mux mode being 0 between the two writes.
On my machine there is an IC's reset pin connected to LCD_D20. The
bootloader configures this pin as GPIO output-high (i.e. not holding the
IC in reset). When Linux reconfigures the pin to GPIO the short time
LCD_D20 is muxed as LCD_D20 instead of GPIO_1_20 is enough to confuse
the connected IC.
The same problem is present for the pin's drive strength setting which is
reset to low drive strength before using the right value.
So instead of relying on the hardware to modify the register setting
using two writes implement the bit toggling using read-modify-write.
Fixes: 17723111e6 ("pinctrl: add pinctrl-mxs support")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
I have not been able to dedicate time to the GPIO subsystem since quite
some time, and I don't see the situation improving in the near future.
Update the maintainers list to reflect this unfortunate fact.
Signed-off-by: Alexandre Courbot <gnurou@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
It turns out there are quite many Chromebooks out there that have the
same keyboard issue than Acer Chromebook. All of them are based on
Intel_Strago reference and report their DMI_PRODUCT_FAMILY as
"Intel_Strago" (Samsung Chromebook 3 and Cyan Chromebooks are exceptions
for which we add separate entries).
Instead of adding each machine to the quirk table, we use
DMI_PRODUCT_FAMILY of "Intel_Strago" that hopefully covers most of the
machines out there currently.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=194945
Suggested: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Sometimes it is more convenient to be able to match a whole family of
products, like in case of bunch of Chromebooks based on Intel_Strago to
apply a driver quirk instead of quirking each machine one-by-one.
This adds support for DMI_PRODUCT_FAMILY identification string and also
exports it to the userspace through sysfs attribute just like the
existing ones.
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The Crystal Cove PMIC has 16 real GPIOs but the ACPI code for devices
with this PMIC may address up to 95 GPIOs, these extra GPIOs are
called virtual GPIOs and are used by the ACPI code as a method of
accessing various non GPIO bits of PMIC.
Commit dcdc3018d6 ("gpio: crystalcove: support virtual GPIO") added
dummy support for these to avoid a bunch of ACPI errors, but instead of
ignoring writes / reads to them by doing:
if (gpio >= CRYSTALCOVE_GPIO_NUM)
return 0;
It accidentally introduced the following wrong check:
if (gpio > CRYSTALCOVE_VGPIO_NUM)
return 0;
Which means that attempts by the ACPI code to access these gpios
causes some arbitrary gpio to get touched through for example
GPIO1P0CTLO + gpionr % 8.
Since we do support input/output (but not interrupts) on the 0x5e
virtual GPIO, this commit makes to_reg return -ENOTSUPP for unsupported
virtual GPIOs so as to not have to check for (gpio >= CRYSTALCOVE_GPIO_NUM
&& gpio != 0x5e) everywhere and to make it easier to add support for more
virtual GPIOs in the future.
It then adds a check for to_reg returning an error to all callers where
this may happen fixing the ACPI code accessing virtual GPIOs accidentally
causing changes to real GPIOs.
Fixes: dcdc3018d6 ("gpio: crystalcove: support virtual GPIO")
Cc: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This model is actually called 92XXM2-8 in Windows driver. But since pin
configs for M22 and M28 are identical, just reuse M22 quirk.
Fixes external microphone (tested) and probably docking station ports
(not tested).
Signed-off-by: Alexander Tsoy <alexander@tsoy.me>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
With enabling this workaround, can observe GPU hang issue on Gen9. As
currently host side doesn't have this workaround, disable it from GVT
side.
v2:
- Fix indent error.(Zhenyu)
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
crypto_gcm_setkey() was using wait_for_completion_interruptible() to
wait for completion of async crypto op but if a signal occurs it
may return before DMA ops of HW crypto provider finish, thus
corrupting the data buffer that is kfree'ed in this case.
Resolve this by using wait_for_completion() instead.
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
CC: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
drbg_kcapi_sym_ctr() was using wait_for_completion_interruptible() to
wait for completion of async crypto op but if a signal occurs it
may return before DMA ops of HW crypto provider finish, thus
corrupting the output buffer.
Resolve this by using wait_for_completion() instead.
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
CC: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
public_key_verify_signature() was passing the CRYPTO_TFM_REQ_MAY_BACKLOG
flag to akcipher_request_set_callback() but was not handling correctly
the case where a -EBUSY error could be returned from the call to
crypto_akcipher_verify() if backlog was used, possibly casuing
data corruption due to use-after-free of buffers.
Resolve this by handling -EBUSY correctly.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
CC: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fix from Herbert Xu:
"This fixes a regression in the skcipher interface that allows bogus
key parameters to hit underlying implementations which can cause
crashes"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: skcipher - Add missing API setkey checks
Pull pstore fix from Kees Cook:
"Marta noticed another misbehavior in EFI pstore, which this fixes.
Hopefully this is the last of the v4.12 fixes for pstore!"
* tag 'pstore-v4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
efi-pstore: Fix write/erase id tracking
Pull ACPI fixes from Rafael Wysocki:
"These revert a 4.11 change that turned out to be problematic and add a
.gitignore file.
Specifics:
- Revert a 4.11 commit related to the ACPI-based handling of laptop
lids that made changes incompatible with existing user space stacks
and broke things there (Lv Zheng).
- Add .gitignore to the ACPI tools directory (Prarit Bhargava)"
* tag 'acpi-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI / button: Remove lid_init_state=method mode"
tools/power/acpi: Add .gitignore file
Pull power management fixes from Rafael Wysocki:
"These fix RTC wakeup from suspend-to-idle broken recently, fix CPU
idleness detection condition in the schedutil cpufreq governor, fix a
cpufreq driver build failure, fix an error code path in the power
capping framework, clean up the hibernate core and update the
intel_pstate documentation.
Specifics:
- Fix RTC wakeup from suspend-to-idle broken by the recent rework of
ACPI wakeup handling (Rafael Wysocki).
- Update intel_pstate driver documentation to reflect the current
code and explain how it works in more detail (Rafael Wysocki).
- Fix an issue related to CPU idleness detection on systems with
shared cpufreq policies in the schedutil governor (Juri Lelli).
- Fix a possible build issue in the dbx500 cpufreq driver (Arnd
Bergmann).
- Fix a function in the power capping framework core to return an
error code instead of 0 when there's an error (Dan Carpenter).
- Clean up variable definition in the hibernation core (Pushkar
Jambhlekar)"
* tag 'pm-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: dbx500: add a Kconfig symbol
PM / hibernate: Declare variables as static
PowerCap: Fix an error code in powercap_register_zone()
RTC: rtc-cmos: Fix wakeup from suspend-to-idle
PM / wakeup: Fix up wakeup_source_report_event()
cpufreq: intel_pstate: Document the current behavior and user interface
cpufreq: schedutil: use now as reference when aggregating shared policy requests
ci_role BUGs when the role is >= CI_ROLE_END.
This is the case while the role is changing.
Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
The datasheet and application note does not mention an allowed range for
the M09_REGISTER_THRESHOLD parameter. One of our customers needs to set
lower values than 20 and they seem to work just fine on EDT EP0xx0M09 with
T5x06 touch.
So, lacking a known lower limit, we increase the range for thresholds,
and set the lower limit to 0. The documentation is updated accordingly.
Signed-off-by: Schoefegger Stefan <stefan.schoefegger@ginzinger.com>
Signed-off-by: Manfred Schlaegl <manfred.schlaegl@ginzinger.com>
Signed-off-by: Martin Kepplinger <martin.kepplinger@ginzinger.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Prior to the pstore interface refactoring, the "id" generated during
a backend pstore_write() was only retained by the internal pstore
inode tracking list. Additionally the "part" was ignored, so EFI
would encode this in the id. This corrects the misunderstandings
and correctly sets "id" during pstore_write(), and uses "part"
directly during pstore_erase().
Reported-by: Marta Lofstedt <marta.lofstedt@intel.com>
Fixes: 76cc9580e3 ("pstore: Replace arguments for write() API")
Fixes: a61072aae6 ("pstore: Replace arguments for erase() API")
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Marta Lofstedt <marta.lofstedt@intel.com>
Commit d224e93818 ("drivers/md/dm-ioctl.c: use kvmalloc rather than
opencoded variant") left out the __GFP_HIGH flag when converting from
__vmalloc to kvmalloc. This can cause the DM ioctl to fail in some low
memory situations where it wouldn't have failed earlier. Add __GFP_HIGH
back to avoid any potential regression.
Fixes: d224e93818 ("drivers/md/dm-ioctl.c: use kvmalloc rather than opencoded variant")
Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Commit cc7b0d4955 ("PCI: designware: Update PCI config space remap
function") made PCI configuration requests non-posted, which means we now
get a synchronous abort when the CFG space read to probe for downstream
devices times out.
Synchronous aborts need to be handled differently from the async aborts we
were getting before, in particular the PC needs to be advanced when
resolving the abort. This is mostly a copy of what other PCI drivers do on
ARM to handle those aborts.
[bhelgaas: changelog, "Fixes"]
Fixes: cc7b0d4955 ("PCI: designware: Update PCI config space remap function")
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Richard Zhu <hongxing.zhu@nxp.com>
When a switch endpoint is configured without NTB, the mmio_ntb registers
will read all zeros. However, in corner case configurations where the
partition ID is not zero and NTB is not enabled, the code will have the
wrong partition ID and this causes the driver to use the wrong set of
drivers. To fix this we simply take the partition ID from the system info
region.
Reported-by: Dingbao Chen <dingbao.chen@microsemi.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Convert from "cdev_add() + device_add()" to cdev_device_add(), and from
"device_del() + cdev_del()" to cdev_device_del().
[bhelgaas: changelog]
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If NO_DMA=y:
drivers/built-in.o: In function `__pci_epc_create':
(.text+0xef4e): undefined reference to `bad_dma_ops'
drivers/built-in.o: In function `pci_epc_add_epf':
(.text+0xf676): undefined reference to `bad_dma_ops'
drivers/built-in.o: In function `pci_epf_alloc_space':
(.text+0xfa32): undefined reference to `bad_dma_ops'
drivers/built-in.o: In function `pci_epf_free_space':
(.text+0xfac4): undefined reference to `bad_dma_ops'
Add a dependency on HAS_DMA to fix this.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Default value of io.low limit is 0. If user doesn't configure the limit,
last patch makes cgroup be throttled to very tiny bps/iops, which could
stall the system. A cgroup with default settings of io.low limit really
means nothing, so we force user to configure all settings, otherwise
io.low limit doesn't take effect. With this stragety, default setting of
latency/idle isn't important, so just set them to very conservative and
safe value.
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
If a cgroup with low limit 0 for both bps/iops, the cgroup's low limit
is ignored and we throttle the cgroup with its max limit. In this way,
other cgroups with a low limit will not get protected. To fix this, we
don't do the exception any more. cgroup will be throttled to a limit 0
if it uese default setting. To avoid completed stall, we give such
cgroup tiny IO resources.
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
These info are important to understand what's happening and help debug.
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
For idle time, children's setting should not be bigger than parent's.
For latency target, children's setting should not be smaller than
parent's. The leaf nodes will adjust their settings according to the
hierarchy and compare their IO with the settings and do
upgrade/downgrade. parents nodes don't need to track their IO
latency/idle time.
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
If a kthread forks (e.g. usermodehelper since commit 1da5c46fa9) but
fails in copy_process() between calling dup_task_struct() and setting
p->set_child_tid, then the value of p->set_child_tid will be inherited
from the parent and get prematurely freed by free_kthread_struct().
kthread()
- worker_thread()
- process_one_work()
| - call_usermodehelper_exec_work()
| - kernel_thread()
| - _do_fork()
| - copy_process()
| - dup_task_struct()
| - arch_dup_task_struct()
| - tsk->set_child_tid = current->set_child_tid // implied
| - ...
| - goto bad_fork_*
| - ...
| - free_task(tsk)
| - free_kthread_struct(tsk)
| - kfree(tsk->set_child_tid)
- ...
- schedule()
- __schedule()
- wq_worker_sleeping()
- kthread_data(task)->flags // UAF
The problem started showing up with commit 1da5c46fa9 since it reused
->set_child_tid for the kthread worker data.
A better long-term solution might be to get rid of the ->set_child_tid
abuse. The comment in set_kthread_struct() also looks slightly wrong.
Debugged-by: Jamie Iles <jamie.iles@oracle.com>
Fixes: 1da5c46fa9 ("kthread: Make struct kthread kmalloc'ed")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jamie Iles <jamie.iles@oracle.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170509073959.17858-1-vegard.nossum@oracle.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Markus reported that the glibc/nptl/tst-robustpi8 test was failing after
commit:
cfafcd117d ("futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()")
The following trace shows the problem:
ld-linux-x86-64-2161 [019] .... 410.760971: SyS_futex: 00007ffbeb76b028: 80000875 op=FUTEX_LOCK_PI
ld-linux-x86-64-2161 [019] ...1 410.760972: lock_pi_update_atomic: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000875 ret=0
ld-linux-x86-64-2165 [011] .... 410.760978: SyS_futex: 00007ffbeb76b028: 80000875 op=FUTEX_UNLOCK_PI
ld-linux-x86-64-2165 [011] d..1 410.760979: do_futex: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000871 ret=0
ld-linux-x86-64-2165 [011] .... 410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=0000
ld-linux-x86-64-2161 [019] .... 410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=ETIMEDOUT
Task 2165 does an UNLOCK_PI, assigning the lock to the waiter task 2161
which then returns with -ETIMEDOUT. That wrecks the lock state, because now
the owner isn't aware it acquired the lock and removes the pending robust
list entry.
If 2161 is killed, the robust list will not clear out this futex and the
subsequent acquire on this futex will then (correctly) result in -ESRCH
which is unexpected by glibc, triggers an internal assertion and dies.
Task 2161 Task 2165
rt_mutex_wait_proxy_lock()
timeout();
/* T2161 is still queued in the waiter list */
return -ETIMEDOUT;
futex_unlock_pi()
spin_lock(hb->lock);
rtmutex_unlock()
remove_rtmutex_waiter(T2161);
mark_lock_available();
/* Make the next waiter owner of the user space side */
futex_uval = 2161;
spin_unlock(hb->lock);
spin_lock(hb->lock);
rt_mutex_cleanup_proxy_lock()
if (rtmutex_owner() !== current)
...
return FAIL;
....
return -ETIMEOUT;
This means that rt_mutex_cleanup_proxy_lock() needs to call
try_to_take_rt_mutex() so it can take over the rtmutex correctly which was
assigned by the waker. If the rtmutex is owned by some other task then this
call is harmless and just confirmes that the waiter is not able to acquire
it.
While there, fix what looks like a merge error which resulted in
rt_mutex_cleanup_proxy_lock() having two calls to
fixup_rt_mutex_waiters() and rt_mutex_wait_proxy_lock() not having any.
Both should have one, since both potentially touch the waiter list.
Fixes: 38d589f2fd ("futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()")
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Bug-Spotted-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Link: http://lkml.kernel.org/r/20170519154850.mlomgdsd26drq5j6@hirez.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Jonathan writes:
First set of IIO fixes in the 4.12 cycle.
Matt finally set up the lightning storm he needed to test the as3935.
* core
- Fix a null pointer deference in iio_trigger_write_current when changing
from a non existent trigger to another non existent trigger.
* a3935
- Recalibrate the RCO after resume.
- Fix interrupt mask so that we actually get some interrupts.
- Use iio_trigger_poll_chained as we aren't in interrupt context.
* am335x
- Fix wrong allocation size provided for private data to iio_device_alloc.
* bcm_iproc
- Swapped primary and secondary isr handlers.
* ltr501
- Fix swapped als/ps register fields when enabling interrupts
* max9611
- Wrong scale factor for the shunt_resistor attribute.
* sun4-gpadc
- Module autoloading fixes by adding the device table declarations.
- Fix parent device being used in devm functions.
Pull networking fixes from David Miller:
"Mostly netfilter bug fixes in here, but we have some bits elsewhere as
well.
1) Don't do SNAT replies for non-NATed connections in IPVS, from
Julian Anastasov.
2) Don't delete conntrack helpers while they are still in use, from
Liping Zhang.
3) Fix zero padding in xtables's xt_data_to_user(), from Willem de
Bruijn.
4) Add proper RCU protection to nf_tables_dump_set() because we
cannot guarantee that we hold the NFNL_SUBSYS_NFTABLES lock. From
Liping Zhang.
5) Initialize rcv_mss in tcp_disconnect(), from Wei Wang.
6) smsc95xx devices can't handle IPV6 checksums fully, so don't
advertise support for offloading them. From Nisar Sayed.
7) Fix out-of-bounds access in __ip6_append_data(), from Eric
Dumazet.
8) Make atl2_probe() propagate the error code properly on failures,
from Alexey Khoroshilov.
9) arp_target[] in bond_check_params() is used uninitialized. This
got changes from a global static to a local variable, which is how
this mistake happened. Fix from Jarod Wilson.
10) Fix fallout from unnecessary NULL check removal in cls_matchall,
from Jiri Pirko. This is definitely brown paper bag territory..."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
net: sched: cls_matchall: fix null pointer dereference
vsock: use new wait API for vsock_stream_sendmsg()
bonding: fix randomly populated arp target array
net: Make IP alignment calulations clearer.
bonding: fix accounting of active ports in 3ad
net: atheros: atl2: don't return zero on failure path in atl2_probe()
ipv6: fix out of bound writes in __ip6_append_data()
bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
smsc95xx: Support only IPv4 TCP/UDP csum offload
arp: always override existing neigh entries with gratuitous ARP
arp: postpone addr_type calculation to as late as possible
arp: decompose is_garp logic into a separate function
arp: fixed error in a comment
tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0
netfilter: xtables: fix build failure from COMPAT_XT_ALIGN outside CONFIG_COMPAT
ebtables: arpreply: Add the standard target sanity check
netfilter: nf_tables: revisit chain/object refcounting from elements
netfilter: nf_tables: missing sanitization in data from userspace
netfilter: nf_tables: can't assume lock is acquired when dumping set elems
netfilter: synproxy: fix conntrackd interaction
...
The driver checks an incorrect flag of functionality of adapter.
When a driver requires i2c_smbus_read_byte_data and
i2c_smbus_write_byte_data, it should check I2C_FUNC_SMBUS_BYTE_DATA
instead I2C_FUNC_I2C.
This patch fixes the problem.
Signed-off-by: Tin Huynh <tnhuynh@apm.com>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
fix extra controller reference taken on reconnect by moving
reference to initial controller create
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
correct nvme status set on abort. Patch that changed status to being actual
nvme status crossed in the night with the patch that added abort values.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Per the review by Sagi on:
http://lists.infradead.org/pipermail/linux-nvme/2017-April/009261.html
Looked at existing warn vs info vs err dev_xxx levels for the messages
printed on reconnects and deletes:
- Resets due to error and resets transitioned to deletes are dev_warn
- Other reset/disconnect messages are dev_info
- Removed chatty io queue related messages
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Sync with Sagi's recent addition of ctrl_loss_tmo in the core fabrics
layer.
Remove local connect limits and connect_attempts variable.
Use fabrics new nr_connects variable and use of nvmf_should_reconnect()
Refactor duplicate reconnect failure code.
Addresses review comment by Sagi on controller reset support:
http://lists.infradead.org/pipermail/linux-nvme/2017-April/009261.html
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Remove the local copy of reconnect_delay.
Use the value in the controller options directly.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Since the head is guaranteed by the check above to be null, the call_rcu
would explode. Remove the previously logically dead code that was made
logically very much alive and kicking.
Fixes: 985538eee0 ("net/sched: remove redundant null check on head")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NVMe may add request into requeue list simply and not kick off the
requeue if hw queues are stopped. Then blk_mq_abort_requeue_list()
is called in both nvme_kill_queues() and nvme_ns_remove() for
dealing with this issue.
Unfortunately blk_mq_abort_requeue_list() is absolutely a
race maker, for example, one request may be requeued during
the aborting. So this patch just calls blk_mq_kick_requeue_list() in
nvme_kill_queues() to handle this issue like what nvme_start_queues()
does. Now all requests in requeue list when queues are stopped will be
handled by blk_mq_kick_requeue_list() when queues are restarted, either
in nvme_start_queues() or in nvme_kill_queues().
Cc: stable@vger.kernel.org
Reported-by: Zhang Yi <yizhan@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Inside nvme_kill_queues(), we have to start hw queues for
draining requests in sw queues, .dispatch list and requeue list,
so use blk_mq_start_hw_queues() instead of blk_mq_start_stopped_hw_queues()
which only run queues if queues are stopped, but the queues may have
been started already, for example nvme_start_queues() is called in reset work
function.
blk_mq_start_hw_queues() run hw queues in current context, instead
of running asynchronously like before. Given nvme_kill_queues() is
run from either remove context or reset worker context, both are fine
to run hw queue directly. And the mutex of namespaces_mutex isn't a
problem too becasue nvme_start_freeze() runs hw queue in this way
already.
Cc: stable@vger.kernel.org
Reported-by: Zhang Yi <yizhan@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In the case of small NVMe-oF queue size (<32) we may enter a deadlock
caused by the fact that the IB completions aren't sent waiting for 32
and the send queue will fill up.
The error is seen as (using mlx5):
[ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273):
[ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12
This patch changes the way the signaling is done so that it depends on
the queue depth now. The magic define has been removed completely.
Cc: stable@vger.kernel.org
Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
Signed-off-by: Samuel Jones <sjones@kalray.eu>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
As reported by Michal, vsock_stream_sendmsg() could still
sleep at vsock_stream_has_space() after prepare_to_wait():
vsock_stream_has_space
vmci_transport_stream_has_space
vmci_qpair_produce_free_space
qp_lock
qp_acquire_queue_mutex
mutex_lock
Just switch to the new wait API like we did for commit
d9dc8b0f8b ("net: fix sleeping for sk_wait_event()").
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit dc9c4d0fe0, the arp_target array moved from a static global
to a local variable. By the nature of static globals, the array used to
be initialized to all 0. At present, it's full of random data, which
that gets interpreted as arp_target values, when none have actually been
specified. Systems end up booting with spew along these lines:
[ 32.161783] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.168475] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.175089] 8021q: adding VLAN 0 to HW filter on device lacp0
[ 32.193091] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.204892] lacp0: Setting MII monitoring interval to 100
[ 32.211071] lacp0: Removing ARP target 216.124.228.17
[ 32.216824] lacp0: Removing ARP target 218.160.255.255
[ 32.222646] lacp0: Removing ARP target 185.170.136.184
[ 32.228496] lacp0: invalid ARP target 255.255.255.255 specified for removal
[ 32.236294] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
[ 32.243987] lacp0: Removing ARP target 56.125.228.17
[ 32.249625] lacp0: Removing ARP target 218.160.255.255
[ 32.255432] lacp0: Removing ARP target 15.157.233.184
[ 32.261165] lacp0: invalid ARP target 255.255.255.255 specified for removal
[ 32.268939] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
[ 32.276632] lacp0: Removing ARP target 16.0.0.0
[ 32.281755] lacp0: Removing ARP target 218.160.255.255
[ 32.287567] lacp0: Removing ARP target 72.125.228.17
[ 32.293165] lacp0: Removing ARP target 218.160.255.255
[ 32.298970] lacp0: Removing ARP target 8.125.228.17
[ 32.304458] lacp0: Removing ARP target 218.160.255.255
None of these were actually specified as ARP targets, and the driver does
seem to clean up the mess okay, but it's rather noisy and confusing, leaks
values to userspace, and the 255.255.255.255 spew shows up even when debug
prints are disabled.
The fix: just zero out arp_target at init time.
While we're in here, init arp_all_targets_value in the right place.
Fixes: dc9c4d0fe0 ("bonding: reduce scope of some global variables")
CC: Mahesh Bandewar <maheshb@google.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: netdev@vger.kernel.org
CC: stable@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* intel_pstate:
cpufreq: intel_pstate: Document the current behavior and user interface
* pm-cpufreq:
cpufreq: dbx500: add a Kconfig symbol
* pm-cpufreq-sched:
cpufreq: schedutil: use now as reference when aggregating shared policy requests
DM-Verity has an (undocumented) mode where no salt is used. This was
never handled directly by the DM-Verity code, instead working due to the
fact that calling crypto_shash_update() with a zero length data is an
implicit noop.
This is no longer the case now that we have switched to
crypto_ahash_update(). Fix the issue by introducing explicit handling
of the no salt use case to DM-Verity.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Reported-by: Marian Csontos <mcsontos@redhat.com>
Fixes: d1ac3ff ("dm verity: switch to using asynchronous hash crypto API")
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The assignmnet:
ip_align = strict ? 2 : NET_IP_ALIGN;
in compare_pkt_ptr_alignment() trips up Coverity because we can only
get to this code when strict is true, therefore ip_align will always
be 2 regardless of NET_IP_ALIGN's value.
So just assign directly to '2' and explain the situation in the
comment above.
Reported-by: "Gustavo A. R. Silva" <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The stingray SDHCI hardware supports ACMD12 and automatically
issues after multi block transfer completed.
If ACMD12 in SDHCI is disabled, spurious tx done interrupts are seen
on multi block read command with below error message:
Got data interrupt 0x00000002 even though no data
operation was in progress.
This patch uses SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 to enable
ACM12 support in SDHCI hardware and suppress spurious interrupt.
Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: b580c52d58 ("mmc: sdhci-iproc: add IPROC SDHCI driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
As of 7bb11dc9f5 and 0622cab034, bond slaves in a 3ad bond are not
removed from the aggregator when they are down, and the active slave count
is NOT equal to number of ports in the aggregator, but rather the number
of ports in the aggregator that are still enabled. The sysfs spew for
bonding_show_ad_num_ports() has a comment that says "Show number of active
802.3ad ports.", but it's currently showing total number of ports, both
active and inactive. Remedy it by using the same logic introduced in
0622cab034 in __bond_3ad_get_active_agg_info(), so sysfs, procfs and
netlink all report the number of active ports. Note that this means that
IFLA_BOND_AD_INFO_NUM_PORTS really means NUM_ACTIVE_PORTS instead of
NUM_PORTS, and thus perhaps should be renamed for clarity.
Lightly tested on a dual i40e lacp bond, simulating link downs with an ip
link set dev <slave2> down, was able to produce the state where I could
see both in the same aggregator, but a number of ports count of 1.
MII Status: up
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 2 <---
Slave Interface: ens10
MII Status: up <---
Aggregator ID: 1
Slave Interface: ens11
MII Status: up
Aggregator ID: 1
MII Status: up
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1 <---
Slave Interface: ens10
MII Status: down <---
Aggregator ID: 1
Slave Interface: ens11
MII Status: up
Aggregator ID: 1
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If dma mask checks fail in atl2_probe(), it breaks off initialization,
deallocates all resources, but returns zero.
The patch adds proper error code return value and
make error code setup unified.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the regulator probing is not yet finished this driver
might catch a -EPROBE_DEFER. Returning after this condition
did not remove the created platform device. On a repeated
call to the probe function the of_platform_device_create
fails.
Calling of_platform_device_destroy after EPROBE_DEFER resolves
this bug.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
of_platform_device_destroy is the counterpart to
of_platform_device_create which is a non-static function.
After creating a platform device it might be neccessary
to destroy it to deal with -EPROBE_DEFER where a
repeated of_platform_device_create call would fail otherwise.
Therefore also make of_platform_device_destroy globally visible.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In case the DT specifies neither a regulator nor a gpio
for the shared power the driver will crash accessing the regulator.
Prevent the crash by checking the regulator before use.
Use mmc_regulator_get_supply() instead of open coding the same
logic.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Andrey Konovalov and idaifish@gmail.com reported crashes caused by
one skb shared_info being overwritten from __ip6_append_data()
Andrey program lead to following state :
copy -4200 datalen 2000 fraglen 2040
maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200
The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen,
fraggap, 0); is overwriting skb->head and skb_shared_info
Since we apparently detect this rare condition too late, move the
code earlier to even avoid allocating skb and risking crashes.
Once again, many thanks to Andrey and syzkaller team.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: <idaifish@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andre Przywara <andre.przywara@arm.com> noticed that we can get the
following warning with -EPROBE_DEFER:
"WARNING: CPU: 1 PID: 89 at drivers/base/dd.c:349
driver_probe_device+0x2ac/0x2e8"
Let's fix the issue by removing the indices as suggested by
Tejun Heo <tj@kernel.org>. All we have to do here is kill the radix
tree.
I probably ended up with the indices after grepping for removal
of all entries using radix_tree_for_each_slot() and the first
match found was gmap_radix_tree_free(). Anyways, no need for
indices here, and we can just do remove all the entries using
radix_tree_for_each_slot() along how the item_kill_tree() test
case does.
Fixes: c7059c5ac7 ("pinctrl: core: Add generic pinctrl functions for managing groups")
Fixes: a76edc89b1 ("pinctrl: core: Add generic pinctrl functions for managing groups")
Reported-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
I've forgotten to sync the documentation with the actually available
options for some time. Now all updated.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Recently some laptops and mobos are equipped with the dual Realtek
codecs that require special quirks. For making the debugging easier,
add the model "dual-codecs" to be passed via module option.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
MSI Z270-Gamin mobo has also two ALC1220 codecs like Gigabyte AZ370-
Gaming mobo. Apply the same quirk to this one.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Make some symbols static to fix sparse warnings like:
drivers/s390/cio/vfio_ccw_ops.c:73:1: warning: symbol 'mdev_type_attr_name' was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
The keyboard dock used with the Asus Transformer T100 series, uses
the same vendor-defined 0xff31 usage-page as some other Asus
keyboards. But with a small twist, it has a small descriptor bug which
needs to be fixed up for things to work.
This commit adds the USB-ID for this keyboard to the hid-asus driver
and makes asus_report_fixup fix the descriptor issue, fixing
various special function keys on this keyboard not working.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Add stubs for gpiod_add_lookup_table() and gpiod_remove_lookup_table()
for the !GPIOLIB case to prevent build errors.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This reverts commit 8c58f1a7a4.
It turns out that applying these generic properties was
premature: the properties used in the driver using this
are of unclear electrical nature and the subject need to
be discussed.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
We warn the user at driver probe time that debouncing is disabled.
However, if they request debouncing later on we print a confusing error
message:
gpio_aspeed 1e780000.gpio: Failed to convert 5000us to cycles at 0Hz: -524
Instead bail out when the clock is not present.
Fixes: 5ae4cb94b3 (gpio: aspeed: Add debounce support)
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
We need to initializes those variables to 0 for platforms that do not
provide ACPI parameters. Otherwise, we set sda_hold_time to random
values, breaking e.g. Galileo and IOT2000 boards.
Fixes: 9d64084330 ("i2c: designware: don't infer timings described by ACPI from clock rate")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
According to Boris, some user-space tools expect MTD drivers to
update ecc_stats.corrected, and it's better to provide a lower
bound than to provide no information at all.
Fixes: 6956e2385a ("mtd: nand: add tango NAND flash controller support")
Cc: stable@vger.kernel.org
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The device table is required to load modules based on
modaliases. After adding MODULE_DEVICE_TABLE, below entries
for example will be added to module.alias:
alias: of:N*T*Csigma,smp8758-nandC*
alias: of:N*T*Csigma,smp8758-nand
Fixes: 6956e2385a ("mtd: nand: add tango NAND flash controller support")
Cc: stable@vger.kernel.org
Signed-off-by: Andres Galacho <andresgalacho@gmail.com>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
We don't handle cases larger than 7. We probably shouldn't pretend we
know the ECC step size in this case, and it's probably also good to
WARN() like we do in many other similar cases.
Fixes: 8fc82d456e ("mtd: nand: samsung: Retrieve ECC requirements from extended ID")
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
If we fail any time after calling nand_detect(), then we don't call the
vendor-specific ->cleanup() callback, and we'll leak any resources the
vendor-specific code might have allocated.
Mark the "fix" against the first commit that started allocating anything
in ->init().
Fixes: 626994e074 ("mtd: nand: hynix: Add read-retry support for 1x nm MLC NANDs")
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
mtk_hdmi_setup_vendor_specific_infoframe will return before handle
mtk_hdmi_hw_send_info_frame.Because hdmi_vendor_infoframe_pack
returns the number of bytes packed into the binary buffer or
a negative error code on failure.
So correct it.
Fixes: 8f83f26891 ("drm/mediatek: Add HDMI support")
Signed-off-by: Nickey Yang <nickey.yang@rock-chips.com>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
This code causes a static checker warning because it treats "i == 0" as
a timeout but, because it's a post-op, the loop actually ends with "i"
set to -1. Philipp Zabel points out that it would be cleaner to use
readl_poll_timeout() instead.
Fixes: 2189881683 ("drm/mediatek: add dsi transfer function")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: CK Hu <ck.hu@mediatek.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
The add_new_disk returns with communication locked if
__sendmsg returns failure, fix it with call unlock_comm
before return.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
CC: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
I've got another report about breaking ext4 by ENOMEM error returned from
ext4_mb_load_buddy() caused by memory shortage in memory cgroup.
This time inside ext4_discard_preallocations().
This patch replaces ext4_error() with ext4_warning() where errors returned
from ext4_mb_load_buddy() are not fatal and handled by caller:
* ext4_mb_discard_group_preallocations() - called before generating ENOSPC,
we'll try to discard other group or return ENOSPC into user-space.
* ext4_trim_all_free() - just stop trimming and return ENOMEM from ioctl.
Some callers cannot handle errors, thus __GFP_NOFAIL is used for them:
* ext4_discard_preallocations()
* ext4_mb_discard_lg_preallocations()
Fixes: adb7ef600c ("ext4: use __GFP_NOFAIL in ext4_free_blocks()")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
There is an off-by-one error in loop termination conditions in
ext4_find_unwritten_pgoff() since 'end' may index a page beyond end of
desired range if 'endoff' is page aligned. It doesn't have any visible
effects but still it is good to fix it.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently, SEEK_HOLE implementation in ext4 may both return that there's
a hole at some offset although that offset already has data and skip
some holes during a search for the next hole. The first problem is
demostrated by:
xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "seek -h 0" file
wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (2.054 GiB/sec and 538461.5385 ops/sec)
Whence Result
HOLE 0
Where we can see that SEEK_HOLE wrongly returned offset 0 as containing
a hole although we have written data there. The second problem can be
demonstrated by:
xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
-c "seek -h 0" file
wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (1.978 GiB/sec and 518518.5185 ops/sec)
wrote 8192/8192 bytes at offset 131072
8 KiB, 2 ops; 0.0000 sec (2 GiB/sec and 500000.0000 ops/sec)
Whence Result
HOLE 139264
Where we can see that hole at offsets 56k..128k has been ignored by the
SEEK_HOLE call.
The underlying problem is in the ext4_find_unwritten_pgoff() which is
just buggy. In some cases it fails to update returned offset when it
finds a hole (when no pages are found or when the first found page has
higher index than expected), in some cases conditions for detecting hole
are just missing (we fail to detect a situation where indices of
returned pages are not contiguous).
Fix ext4_find_unwritten_pgoff() to properly detect non-contiguous page
indices and also handle all cases where we got less pages then expected
in one place and handle it properly there.
CC: stable@vger.kernel.org
Fixes: c8c0df241c
CC: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When a transaction starts, start_this_handle() saves current
PF_MEMALLOC_NOFS value so that it can be restored at journal stop time.
Journal restart is a special case that calls start_this_handle() without
stopping the transaction. start_this_handle() isn't aware that the
original value is already stored so it overwrites it with current value.
For instance, a call sequence like below leaves PF_MEMALLOC_NOFS flag set
at the end:
jbd2_journal_start()
jbd2__journal_restart()
jbd2_journal_stop()
Make jbd2__journal_restart() restore the original value before calling
start_this_handle().
Fixes: 81378da64d ("jbd2: mark the transaction context with the scope GFP_NOFS context")
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Quota files have special ranking of i_data_sem lock. We inform lockdep
about it when turning on quotas however when turning quotas off, we
don't clear the lockdep subclass from i_data_sem lock and thus when the
inode gets later reused for a normal file or directory, lockdep gets
confused and complains about possible deadlocks. Fix the problem by
resetting lockdep subclass of i_data_sem on quota off.
Cc: stable@vger.kernel.org
Fixes: daf647d2dd
Reported-and-tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The code to fetch a 64-bit value from user space was entirely buggered,
and has been since the code was merged in early 2016 in commit
b2f680380d ("x86/mm/32: Add support for 64-bit __get_user() on 32-bit
kernels").
Happily the buggered routine is almost certainly entirely unused, since
the normal way to access user space memory is just with the non-inlined
"get_user()", and the inlined version didn't even historically exist.
The normal "get_user()" case is handled by external hand-written asm in
arch/x86/lib/getuser.S that doesn't have either of these issues.
There were two independent bugs in __get_user_asm_u64():
- it still did the STAC/CLAC user space access marking, even though
that is now done by the wrapper macros, see commit 11f1a4b975
("x86: reorganize SMAP handling in user space accesses").
This didn't result in a semantic error, it just means that the
inlined optimized version was hugely less efficient than the
allegedly slower standard version, since the CLAC/STAC overhead is
quite high on modern Intel CPU's.
- the double register %eax/%edx was marked as an output, but the %eax
part of it was touched early in the asm, and could thus clobber other
inputs to the asm that gcc didn't expect it to touch.
In particular, that meant that the generated code could look like
this:
mov (%eax),%eax
mov 0x4(%eax),%edx
where the load of %edx obviously was _supposed_ to be from the 32-bit
word that followed the source of %eax, but because %eax was
overwritten by the first instruction, the source of %edx was
basically random garbage.
The fixes are trivial: remove the extraneous STAC/CLAC entries, and mark
the 64-bit output as early-clobber to let gcc know that no inputs should
alias with the output register.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@kernel.org # v4.8+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al noticed that unsafe_put_user() had type problems, and fixed them in
commit a7cc722fff ("fix unsafe_put_user()"), which made me look more
at those functions.
It turns out that unsafe_get_user() had a type issue too: it limited the
largest size of the type it could handle to "unsigned long". Which is
fine with the current users, but doesn't match our existing normal
get_user() semantics, which can also handle "u64" even when that does
not fit in a long.
While at it, also clean up the type cast in unsafe_put_user(). We
actually want to just make it an assignment to the expected type of the
pointer, because we actually do want warnings from types that don't
convert silently. And it makes the code more readable by not having
that one very long and complex line.
[ This patch might become stable material if we ever end up back-porting
any new users of the unsafe uaccess code, but as things stand now this
doesn't matter for any current existing uses. ]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Export the function which checks whether an MCE is a memory error to
other users so that we can reuse the logic. Drop the boot_cpu_data use,
while at it, as mce.cpuvendor already has the CPU vendor in there.
Integrate a piece from a patch from Vishal Verma
<vishal.l.verma@intel.com> to export it for modules (nfit).
The main reason we're exporting it is that the nfit handler
nfit_handle_mce() needs to detect a memory error properly before doing
its recovery actions.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170519093915.15413-2-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull misc uaccess fixes from Al Viro:
"Fix for unsafe_put_user() (no callers currently in mainline, but
anyone starting to use it will step into that) + alpha osf_wait4()
infoleak fix"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
osf_wait4(): fix infoleak
fix unsafe_put_user()
Pull scheduler fix from Thomas Gleixner:
"A single scheduler fix:
Prevent idle task from ever being preempted. That makes sure that
synchronize_rcu_tasks() which is ignoring idle task does not pretend
that no task is stuck in preempted state. If that happens and idle was
preempted on a ftrace trampoline the machine crashes due to
inconsistent state"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Call __schedule() from do_idle() without enabling preemption
Pull irq fixes from Thomas Gleixner:
"A set of small fixes for the irq subsystem:
- Cure a data ordering problem with chained interrupts
- Three small fixlets for the mbigen irq chip"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Fix chained interrupt data ordering
irqchip/mbigen: Fix the clear register offset calculation
irqchip/mbigen: Fix potential NULL dereferencing
irqchip/mbigen: Fix memory mapping code
Since commit 76b91c32dd ("bridge: stp: when using userspace stp stop
kernel hello and hold timers"), bridge would not start hello_timer if
stp_enabled is not KERNEL_STP when br_dev_open.
The problem is even if users set stp_enabled with KERNEL_STP later,
the timer will still not be started. It causes that KERNEL_STP can
not really work. Users have to re-ifup the bridge to avoid this.
This patch is to fix it by starting br->hello_timer when enabling
KERNEL_STP in br_stp_start.
As an improvement, it's also to start hello_timer again only when
br->stp_enabled is KERNEL_STP in br_hello_timer_expired, there is
no reason to start the timer again when it's NO_STP.
Fixes: 76b91c32dd ("bridge: stp: when using userspace stp stop kernel hello and hold timers")
Reported-by: Haidong Li <haili@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
When TX checksum offload is used, if the computed checksum is 0 the
LAN95xx device do not alter the checksum to 0xffff. In the case of ipv4
UDP checksum, it indicates to receiver that no checksum is calculated.
Under ipv6, UDP checksum yields a result of zero must be changed to
0xffff. Hence disabling checksum offload for ipv6 packets.
Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com>
Reported-by: popcorn mix <popcornmix@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ihar Hrachyshka says:
====================
arp: always override existing neigh entries with gratuitous ARP
This patchset is spurred by discussion started at
https://patchwork.ozlabs.org/patch/760372/ where we figured that there is no
real reason for enforcing override by gratuitous ARP packets only when
arp_accept is 1. Same should happen when it's 0 (the default value).
changelog v2: handled review comments by Julian Anastasov
- fixed a mistake in a comment;
- postponed addr_type calculation to as late as possible.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when arp_accept is 1, we always override existing neigh
entries with incoming gratuitous ARP replies. Otherwise, we override
them only if new replies satisfy _locktime_ conditional (packets arrive
not earlier than _locktime_ seconds since the last update to the neigh
entry).
The idea behind locktime is to pick the very first (=> close) reply
received in a unicast burst when ARP proxies are used. This helps to
avoid ARP thrashing where Linux would switch back and forth from one
proxy to another.
This logic has nothing to do with gratuitous ARP replies that are
generally not aligned in time when multiple IP address carriers send
them into network.
This patch enforces overriding of existing neigh entries by all incoming
gratuitous ARP packets, irrespective of their time of arrival. This will
make the kernel honour all incoming gratuitous ARP packets.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The addr_type retrieval can be costly, so it's worth trying to avoid its
calculation as much as possible. This patch makes it calculated only
for gratuitous ARP packets. This is especially important since later we
may want to move is_garp calculation outside of arp_accept block, at
which point the costly operation will be executed for all setups.
The patch is the result of a discussion in net-dev:
http://marc.info/?l=linux-netdev&m=149506354216994
Suggested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code is quite involving already to earn a separate function for
itself. If anything, it helps arp_process readability.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tcp_disconnect() is called, inet_csk_delack_init() sets
icsk->icsk_ack.rcv_mss to 0.
This could potentially cause tcp_recvmsg() => tcp_cleanup_rbuf() =>
__tcp_select_window() call path to have division by 0 issue.
So this patch initializes rcv_mss to TCP_MIN_MSS instead of 0.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__put_user_size() relies upon its first argument having the same type as what
the second one points to; the only other user makes sure of that and
unsafe_put_user() should do the same.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:
1) When using IPVS in direct-routing mode, normal traffic from the LVS
host to a back-end server is sometimes incorrectly NATed on the way
back into the LVS host. Patch to fix this from Julian Anastasov.
2) Calm down clang compilation warning in ctnetlink due to type
mismatch, from Matthias Kaehlcke.
3) Do not re-setup NAT for conntracks that are already confirmed, this
is fixing a problem that was introduced in the previous nf-next batch.
Patch from Liping Zhang.
4) Do not allow conntrack helper removal from userspace cthelper
infrastructure if already in used. This comes with an initial patch
to introduce nf_conntrack_helper_put() that is required by this fix.
From Liping Zhang.
5) Zero the pad when copying data to userspace, otherwise iptables fails
to remove rules. This is a follow up on the patchset that sorts out
the internal match/target structure pointer leak to userspace. Patch
from the same author, Willem de Bruijn. This also comes with a build
failure when CONFIG_COMPAT is not on, coming in the last patch of
this series.
6) SYNPROXY crashes with conntrack entries that are created via
ctnetlink, more specifically via conntrackd state sync. Patch from
Eric Leblond.
7) RCU safe iteration on set element dumping in nf_tables, from
Liping Zhang.
8) Missing sanitization of immediate date for the bitwise and cmp
expressions in nf_tables.
9) Refcounting logic for chain and objects from set elements does not
integrate into the nf_tables 2-phase commit protocol.
10) Missing sanitization of target verdict in ebtables arpreply target,
from Gao Feng.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
For the sake of DT binding stability, this IIO driver is a child of an
MFD driver for Allwinner A10, A13 and A31 because there already exists a
DT binding for this IP. The MFD driver has a DT node but the IIO driver
does not.
The IIO device registers the temperature sensor in the thermal framework
using the DT node of the parent, the MFD device, so the thermal
framework could match the phandle to the MFD device in the DT and the
struct device used to register in the thermal framework.
devm_thermal_zone_of_sensor_register was previously used to register the
thermal sensor with the parent struct device of the IIO device,
representing the MFD device. By doing so, we registered actually the
parent in the devm routine and not the actual IIO device.
This lead to the devm unregister function not being called when the IIO
module driver is removed. It resulted in the thermal framework still
polling the get_temp function of the IIO module while the device doesn't
exist anymore, thus generated a kernel panic.
Use the non-devm function instead and do the unregister manually in the
remove function.
Fixes: d1caa99055 ("iio: adc: add support for Allwinner SoCs ADC")
Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The third argument of devm_request_threaded_irq() is the primary
handler. It is called in hardirq context and checks whether the
interrupt is relevant to the device. If the primary handler returns
IRQ_WAKE_THREAD, the secondary handler (a.k.a. handler thread) is
scheduled to run in process context.
bcm_iproc_adc.c uses the secondary handler as the primary one
and the other way around. So this patch fixes the same, along with
re-naming the secondary handler and primary handler names properly.
Tested on the BCM9583XX iProc SoC based boards.
Fixes: 4324c97ece ("iio: Add driver for Broadcom iproc-static-adc")
Reported-by: Pavel Roskin <plroskin@gmail.com>
Signed-off-by: Raveendra Padasalagi <raveendra.padasalagi@broadcom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Pull tracing fixes from Steven Rostedt:
- Fix a bug caused by not cleaning up the new instance unique triggers
when deleting an instance. It also creates a selftest that triggers
that bug.
- Fix the delayed optimization happening after kprobes boot up self
tests being removed by freeing of init memory.
- Comment kprobes on why the delay optimization is not a problem for
removal of modules, to keep other developers from searching that
riddle.
- Fix another case of rcu not watching in stack trace tracing.
* tag 'trace-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Make sure RCU is watching before calling a stack trace
kprobes: Document how optimized kprobes are removed from module unload
selftests/ftrace: Add test to remove instance with active event triggers
selftests/ftrace: Fix bashisms
ftrace: Remove #ifdef from code and add clear_ftrace_function_probes() stub
ftrace/instances: Clear function triggers when removing instances
ftrace: Simplify glob handling in unregister_ftrace_function_probe_func()
tracing/kprobes: Enforce kprobes teardown after testing
tracing: Move postpone selftests to core from early_initcall
Pull block fixes from Jens Axboe:
"A small collection of fixes that should go into this cycle.
- a pull request from Christoph for NVMe, which ended up being
manually applied to avoid pulling in newer bits in master. Mostly
fibre channel fixes from James, but also a few fixes from Jon and
Vijay
- a pull request from Konrad, with just a single fix for xen-blkback
from Gustavo.
- a fuseblk bdi fix from Jan, fixing a regression in this series with
the dynamic backing devices.
- a blktrace fix from Shaohua, replacing sscanf() with kstrtoull().
- a request leak fix for drbd from Lars, fixing a regression in the
last series with the kref changes. This will go to stable as well"
* 'for-linus' of git://git.kernel.dk/linux-block:
nvmet: release the sq ref on rdma read errors
nvmet-fc: remove target cpu scheduling flag
nvme-fc: stop queues on error detection
nvme-fc: require target or discovery role for fc-nvme targets
nvme-fc: correct port role bits
nvme: unmap CMB and remove sysfs file in reset path
blktrace: fix integer parse
fuseblk: Fix warning in super_setup_bdi_name()
block: xen-blkback: add null check to avoid null pointer dereference
drbd: fix request leak introduced by locking/atomic, kref: Kill kref_sub()
On rdma read errors, release the sq ref that was taken
when the req was initialized. This avoids a hang in
nvmet_sq_destroy() when the queue is being freed.
Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Per the recommendation by Sagi on:
http://lists.infradead.org/pipermail/linux-nvme/2017-April/009261.html
Rather than waiting for reset work thread to stop queues and abort the ios,
immediately stop the queues on error detection. Reset thread will restop
the queues (as it's called on other paths), but it does not appear to have
a side effect.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
In order to create an association, the remoteport must be
serving either a target role or a discovery role.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
FC Port roles is a bit mask, not individual values.
Correct nvme definitions to unique bits.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
CMB doesn't get unmapped until removal while getting remapped on every
reset. Add the unmapping and sysfs file removal to the reset path in
nvme_pci_disable to match the mapping path in nvme_pci_enable.
Fixes: 202021c1a ("nvme : Add sysfs entry for NVMe CMBs when appropriate")
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Reviewed-By: Stephen Bates <sbates@raithlin.com>
Cc: <stable@vger.kernel.org> # 4.9+
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Pull staging driver fixes from Greg KH:
"Here are a number of staging driver fixes for 4.12-rc2
Most of them are typec driver fixes found by reviewers and users of
the code. There are also some removals of files no longer needed in
the tree due to the ion driver rewrite in 4.12-rc1, as well as some
wifi driver fixes. And to round it out, a MAINTAINERS file update.
All have been in linux-next with no reported issues"
* tag 'staging-4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (22 commits)
MAINTAINERS: greybus-dev list is members-only
staging: fsl-dpaa2/eth: add ETHERNET dependency
staging: typec: fusb302: refactor resume retry mechanism
staging: typec: fusb302: reset i2c_busy state in error
staging: rtl8723bs: remove re-positioned call to kfree in os_dep/ioctl_cfg80211.c
staging: rtl8192e: GetTs Fix invalid TID 7 warning.
staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD.
staging: rtl8192e: fix 2 byte alignment of register BSSIDR.
staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory.
staging: vc04_services: Fix bulk cache maintenance
staging: ccree: remove extraneous spin_unlock_bh() in error handler
staging: typec: Fix sparse warnings about incorrect types
staging: typec: fusb302: do not free gpio from managed resource
staging: typec: tcpm: Fix Port Power Role field in PS_RDY messages
staging: typec: tcpm: Respond to Discover Identity commands
staging: typec: tcpm: Set correct flags in PD request messages
staging: typec: tcpm: Drop duplicate PD messages
staging: typec: fusb302: Fix chip->vbus_present init value
staging: typec: fusb302: Fix module autoload
staging: typec: tcpci: declare private structure as static
...
Pull USB fixes from Greg KH:
"Here are a number of small USB fixes for 4.12-rc2
Most of them come from Johan, in his valiant quest to fix up all
drivers that could be affected by "malicious" USB devices. There's
also some fixes for more "obscure" drivers to handle some of the
vmalloc stack fallout (which for USB drivers, was always the case, but
very few people actually ran those systems...)
Other than that, the normal set of xhci and gadget and musb driver
fixes as well.
All have been in linux-next with no reported issues"
* tag 'usb-4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (42 commits)
usb: musb: tusb6010_omap: Do not reset the other direction's packet size
usb: musb: Fix trying to suspend while active for OTG configurations
usb: host: xhci-plat: propagate return value of platform_get_irq()
xhci: Fix command ring stop regression in 4.11
xhci: remove GFP_DMA flag from allocation
USB: xhci: fix lock-inversion problem
usb: host: xhci-ring: don't need to clear interrupt pending for MSI enabled hcd
usb: host: xhci-mem: allocate zeroed Scratchpad Buffer
xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton
usb: xhci: trace URB before giving it back instead of after
USB: serial: qcserial: add more Lenovo EM74xx device IDs
USB: host: xhci: use max-port define
USB: hub: fix SS max number of ports
USB: hub: fix non-SS hub-descriptor handling
USB: hub: fix SS hub-descriptor handling
USB: usbip: fix nonconforming hub descriptor
USB: gadget: dummy_hcd: fix hub-descriptor removable fields
doc-rst: fixed kernel-doc directives in usb/typec.rst
USB: core: of: document reference taken by companion helper
USB: ehci-platform: fix companion-device leak
...
Pull char/misc driver fixes from Greg KH:
"Here are five small bugfixes for reported issues with 4.12-rc1 and
earlier kernels. Nothing huge here, just a lp, mem, vpd, and uio
driver fix, along with a Kconfig fixup for one of the misc drivers.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
firmware: Google VPD: Fix memory allocation error handling
drivers: char: mem: Check for address space wraparound with mmap()
uio: fix incorrect memory leak cleanup
misc: pci_endpoint_test: select CRC32
char: lp: fix possible integer overflow in lp_setup()
Pull drm fixes from Dave Airlie:
"Mostly nouveau and i915, fairly quiet as usual for rc2"
* tag 'drm-fixes-for-v4.12-rc2' of git://people.freedesktop.org/~airlied/linux:
drm/atmel-hlcdc: Fix output initialization
gpu: host1x: select IOMMU_IOVA
drm/nouveau/fifo/gk104-: Silence a locking warning
drm/nouveau/secboot: plug memory leak in ls_ucode_img_load_gr() error path
drm/nouveau: Fix drm poll_helper handling
drm/i915: don't do allocate_va_range again on PIN_UPDATE
drm/i915: Fix rawclk readout for g4x
drm/i915: Fix runtime PM for LPE audio
drm/i915/glk: Fix DSI "*ERROR* ULPS is still active" messages
drm/i915/gvt: avoid unnecessary vgpu switch
drm/i915/gvt: not to restore in-context mmio
drm/etnaviv: don't put fence in case of submit failure
drm/i915/gvt: fix typo: "supporte" -> "support"
drm: hdlcd: Fix the calculation of the scanout start address
The arm64 H5 and arm H3 SoCs share roughly the same base, and therefore
share a significant part of their device tree.
The approach we took was to add a symlink from the arm64 DTSI to the arm
DTSI.
Now that the arm DT folder is exposed in the include path, we can just use
it and remove our symlink.
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Pull SCSI fixes from James Bottomley:
"This is the first sweep of mostly minor fixes. There's one security
one: the read past the end of a buffer in qedf, and a panic fix for
lpfc SLI-3 adapters, but the rest are a set of include and build
dependency tidy ups and assorted other small fixes and updates"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: pmcraid: remove redundant check to see if request_size is less than zero
scsi: lpfc: ensure els_wq is being checked before destroying it
scsi: cxlflash: Select IRQ_POLL
scsi: qedf: Avoid reading past end of buffer
scsi: qedf: Cleanup the type of io_log->op
scsi: lpfc: double lock typo in lpfc_ns_rsp()
scsi: qedf: properly update arguments position in function call
scsi: scsi_lib: Add #include <scsi/scsi_transport.h>
scsi: MAINTAINERS: update OSD entries
scsi: Skip deleted devices in __scsi_device_lookup
scsi: lpfc: Fix panic on BFS configuration
scsi: libfc: do not flood console with messages 'libfc: queue full ...'
Pull libnvdimm fixes from Dan Williams:
"A couple of compile fixes.
With the removal of the ->direct_access() method from
block_device_operations in favor of a new dax_device + dax_operations
we broke two configurations.
The CONFIG_BLOCK=n case is fixed by compiling out the block+dax
helpers in the dax core. Configurations with FS_DAX=n EXT4=y / XFS=y
and DAX=m fail due to the helpers the builtin filesystem needs being
in a module, so we stub out the helpers in the FS_DAX=n case."
* 'libnvdimm-for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax, xfs, ext4: compile out iomap-dax paths in the FS_DAX=n case
dax: fix false CONFIG_BLOCK dependency
Pull i2c fix from Wolfram Sang:
"A regression fix for I2C that would be great to have in rc2"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: designware: don't infer timings described by ACPI from clock rate
Pull IOMMU fixes from Joerg Roedel:
- another compile-fix as a fallout of the recent header-file cleanup
- add a missing IO/TLB flush to the Intel VT-d kdump code path
- a fix for ARM64 dma code to only access initialized iova_domain
members
* tag 'iommu-fixes-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/mediatek: Include linux/dma-mapping.h
iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappings
iommu/dma: Don't touch invalid iova_domain members
Pull KVM fixes from Radim Krčmář:
"ARM:
- a fix for a build failure introduced in -rc1 when tracepoints are
enabled on 32-bit ARM.
- disable use of stack pointer protection in the hyp code which can
cause panics.
- a handful of VGIC fixes.
- a fix to the init of the redistributors on GICv3 systems that
prevented boot with kvmtool on GICv3 systems introduced in -rc1.
- a number of race conditions fixed in our MMU handling code.
- a fix for the guest being able to program the debug extensions for
the host on the 32-bit side.
PPC:
- fixes for build failures with PR KVM configurations.
- a fix for a host crash that can occur on POWER9 with radix guests.
x86:
- fixes for nested PML and nested EPT.
- a fix for crashes caused by reserved bits in SSE MXCSR that could
have been set by userspace.
- an optimization of halt polling that fixes high CPU overhead.
- fixes for four reports from Dan Carpenter's static checker.
- a protection around code that shouldn't have been preemptible.
- a fix for port IO emulation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (27 commits)
KVM: x86: prevent uninitialized variable warning in check_svme()
KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh()
KVM: x86: zero base3 of unusable segments
KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation
KVM: x86: Fix potential preemption when get the current kvmclock timestamp
KVM: Silence underflow warning in avic_get_physical_id_entry()
KVM: arm/arm64: Hold slots_lock when unregistering kvm io bus devices
KVM: arm/arm64: Fix bug when registering redist iodevs
KVM: x86: lower default for halt_poll_ns
kvm: arm/arm64: Fix use after free of stage2 page table
kvm: arm/arm64: Force reading uncached stage2 PGD
KVM: nVMX: fix EPT permissions as reported in exit qualification
KVM: VMX: Don't enable EPT A/D feature if EPT feature is disabled
KVM: x86: Fix load damaged SSEx MXCSR register
kvm: nVMX: off by one in vmx_write_pml_buffer()
KVM: arm: rename pm_fake handler to trap_raz_wi
KVM: arm: plug potential guest hardware debug leakage
kvm: arm/arm64: Fix race in resetting stage2 PGD
KVM: arm/arm64: vgic-v3: Use PREbits to infer the number of ICH_APxRn_EL2 registers
KVM: arm/arm64: vgic-v3: Do not use Active+Pending state for a HW interrupt
...
Pull xen fixes from Juergen Gross:
"Some fixes for the new Xen 9pfs frontend and some minor cleanups"
* tag 'for-linus-4.12b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: make xen_flush_tlb_all() static
xen: cleanup pvh leftovers from pv-only sources
xen/9pfs: p9_trans_xen_init and p9_trans_xen_exit can be static
xen/9pfs: fix return value check in xen_9pfs_front_probe()
Pull ARM SoC fixes from Olof Johansson:
"We had a small batch of fixes before -rc1, but here is a larger one.
It contains a backmerge of 4.12-rc1 since some of the downstream
branches we merge had that as base; at the same time we already had
merged contents before -rc1 and rebase wasn't the right solution.
A mix of random smaller fixes and a few things worth pointing out:
- We've started telling people to avoid cross-tree shared branches if
all they're doing is picking up one or two DT-used constants from a
shared include file, and instead to use the numeric values on first
submission. Follow-up moving over to symbolic names are sent in
right after -rc1, i.e. here. It's only a few minor patches of this
type.
- Linus Walleij and others are resurrecting the 'Gemini' platform,
and wanted a cut-down platform-specific defconfig for it. So I
picked that up for them.
- Rob Herring ran 'savedefconfig' on arm64, it's a bit churny but it
helps people to prepare patches since it's a pain when defconfig
and current savedefconfig contents differs too much.
- Devicetree additions for some pinctrl drivers for Armada that were
merged this window. I'd have preferred to see those earlier but
it's not a huge deail.
The biggest change worth pointing out though since it's touching other
parts of the tree: We added prefixes to be used when cross-including
DT contents between arm64 and arm, allowing someone to #include
<arm/foo.dtsi> from arm64, and likewise. As part of that, we needed
arm/foo.dtsi to work on arm as well. The way I suggested this to Heiko
resulted in a recursive symlink.
Instead, I've now moved it out of arch/*/boot/dts/include, into a
shared location under scripts/dtc. While I was at it, I consolidated
so all architectures now behave the same way in this manner.
Rob Herring (DT maintainer) has acked it. I cc:d most other arch
maintainers but nobody seems to care much; it doesn't really affect
them since functionality is unchanged for them by default"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (29 commits)
arm64: dts: rockchip: fix include reference
firmware: ti_sci: fix strncat length check
ARM: remove duplicate 'const' annotations'
arm64: defconfig: enable options needed for QCom DB410c board
arm64: defconfig: sync with savedefconfig
ARM: configs: add a gemini defconfig
devicetree: Move include prefixes from arch to separate directory
ARM: dts: dra7: Reduce cpu thermal shutdown temperature
memory: omap-gpmc: Fix debug output for access width
ARM: dts: LogicPD Torpedo: Fix camera pin mux
ARM: dts: omap4: enable CEC pin for Pandaboard A4 and ES
ARM: dts: gta04: fix polarity of clocks for mcbsp4
ARM: dts: dra7: Add power hold and power controller properties to palmas
soc: imx: add PM dependency for IMX7_PM_DOMAINS
ARM: dts: imx6sx-sdb: Remove OPP override
ARM: dts: imx53-qsrb: Pulldown PMIC IRQ pin
soc: bcm: brcmstb: Correctly match 7435 SoC
tee: add ARM_SMCCC dependency
ARM: omap2+: make omap4_get_cpu1_ns_pa_addr declaration usable
ARM64: dts: mediatek: configure some fixed mmc parameters
...
Pull arm64 fixes/cleanups from Catalin Marinas:
- Avoid taking a mutex in the secondary CPU bring-up path when
interrupts are disabled
- Ignore perf exclude_hv when the kernel is running in Hyp mode
- Remove redundant instruction in cmpxchg
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/cpufeature: don't use mutex in bringup path
arm64: perf: Ignore exclude_hv when kernel is running in HYP
arm64: Remove redundant mov from LL/SC cmpxchg
Pull powerpc fixes from Michael Ellerman:
"The headliner is a fix for FP/VMX register corruption when using
transactional memory, and a new selftest to go with it.
Then there's the virt_addr_valid() fix, currently HARDENDED_USERCOPY
is tripping on that causing some machines to crash.
A few other fairly minor fixes for long tail things, and a couple of
fixes for code we just merged.
Thanks to: Breno Leitao, Gautham Shenoy, Michael Neuling, Naveen Rao.
Nicholas Piggin, Paul Mackerras"
* tag 'powerpc-4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Fix virt_addr_valid() etc. on 64-bit hash
powerpc/mm: Fix crash in page table dump with huge pages
powerpc/kprobes: Fix handling of instruction emulation on probe re-entry
powerpc/powernv: Set NAPSTATELOST after recovering paca on P9 DD1
selftests/powerpc: Test TM and VMX register state
powerpc/tm: Fix FP and VMX register corruption
powerpc/modules: If mprofile-kernel is enabled add it to vermagic
get_msr() of MSR_EFER is currently always going to succeed, but static
checker doesn't see that far.
Don't complicate stuff and just use 0 for the fallback -- it means that
the feature is not present.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Static analysis noticed that pmu->nr_arch_gp_counters can be 32
(INTEL_PMC_MAX_GENERIC) and therefore cannot be used to shift 'int'.
I didn't add BUILD_BUG_ON for it as we have a better checker.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 25462f7f52 ("KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Static checker noticed that base3 could be used uninitialized if the
segment was not present (useable). Random stack values probably would
not pass VMCS entry checks.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 1aa366163b ("KVM: x86 emulator: consolidate segment accessors")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Huawei folks reported a read out-of-bounds vulnerability in kvm pio emulation.
- "inb" instruction to access PIT Mod/Command register (ioport 0x43, write only,
a read should be ignored) in guest can get a random number.
- "rep insb" instruction to access PIT register port 0x43 can control memcpy()
in emulator_pio_in_emulated() to copy max 0x400 bytes but only read 1 bytes,
which will disclose the unimportant kernel memory in host but no crash.
The similar test program below can reproduce the read out-of-bounds vulnerability:
void hexdump(void *mem, unsigned int len)
{
unsigned int i, j;
for(i = 0; i < len + ((len % HEXDUMP_COLS) ? (HEXDUMP_COLS - len % HEXDUMP_COLS) : 0); i++)
{
/* print offset */
if(i % HEXDUMP_COLS == 0)
{
printf("0x%06x: ", i);
}
/* print hex data */
if(i < len)
{
printf("%02x ", 0xFF & ((char*)mem)[i]);
}
else /* end of block, just aligning for ASCII dump */
{
printf(" ");
}
/* print ASCII dump */
if(i % HEXDUMP_COLS == (HEXDUMP_COLS - 1))
{
for(j = i - (HEXDUMP_COLS - 1); j <= i; j++)
{
if(j >= len) /* end of block, not really printing */
{
putchar(' ');
}
else if(isprint(((char*)mem)[j])) /* printable char */
{
putchar(0xFF & ((char*)mem)[j]);
}
else /* other char */
{
putchar('.');
}
}
putchar('\n');
}
}
}
int main(void)
{
int i;
if (iopl(3))
{
err(1, "set iopl unsuccessfully\n");
return -1;
}
static char buf[0x40];
/* test ioport 0x40,0x41,0x42,0x43,0x44,0x45 */
memset(buf, 0xab, sizeof(buf));
asm volatile("push %rdi;");
asm volatile("mov %0, %%rdi;"::"q"(buf));
asm volatile ("mov $0x40, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("mov $0x41, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("mov $0x42, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("mov $0x43, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("mov $0x44, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("mov $0x45, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("pop %rdi;");
hexdump(buf, 0x40);
printf("\n");
/* ins port 0x40 */
memset(buf, 0xab, sizeof(buf));
asm volatile("push %rdi;");
asm volatile("mov %0, %%rdi;"::"q"(buf));
asm volatile ("mov $0x20, %rcx;");
asm volatile ("mov $0x40, %rdx;");
asm volatile ("rep insb;");
asm volatile ("pop %rdi;");
hexdump(buf, 0x40);
printf("\n");
/* ins port 0x43 */
memset(buf, 0xab, sizeof(buf));
asm volatile("push %rdi;");
asm volatile("mov %0, %%rdi;"::"q"(buf));
asm volatile ("mov $0x20, %rcx;");
asm volatile ("mov $0x43, %rdx;");
asm volatile ("rep insb;");
asm volatile ("pop %rdi;");
hexdump(buf, 0x40);
printf("\n");
return 0;
}
The vcpu->arch.pio_data buffer is used by both in/out instrutions emulation
w/o clear after using which results in some random datas are left over in
the buffer. Guest reads port 0x43 will be ignored since it is write only,
however, the function kernel_pio() can't distigush this ignore from successfully
reads data from device's ioport. There is no new data fill the buffer from
port 0x43, however, emulator_pio_in_emulated() will copy the stale data in
the buffer to the guest unconditionally. This patch fixes it by clearing the
buffer before in instruction emulation to avoid to grant guest the stale data
in the buffer.
In addition, string I/O is not supported for in kernel device. So there is no
iteration to read ioport %RCX times for string I/O. The function kernel_pio()
just reads one round, and then copy the io size * %RCX to the guest unconditionally,
actually it copies the one round ioport data w/ other random datas which are left
over in the vcpu->arch.pio_data buffer to the guest. This patch fixes it by
introducing the string I/O support for in kernel device in order to grant the right
ioport datas to the guest.
Before the patch:
0x000000: fe 38 93 93 ff ff ab ab .8......
0x000008: ab ab ab ab ab ab ab ab ........
0x000010: ab ab ab ab ab ab ab ab ........
0x000018: ab ab ab ab ab ab ab ab ........
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
0x000000: f6 00 00 00 00 00 00 00 ........
0x000008: 00 00 00 00 00 00 00 00 ........
0x000010: 00 00 00 00 4d 51 30 30 ....MQ00
0x000018: 30 30 20 33 20 20 20 20 00 3
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
0x000000: f6 00 00 00 00 00 00 00 ........
0x000008: 00 00 00 00 00 00 00 00 ........
0x000010: 00 00 00 00 4d 51 30 30 ....MQ00
0x000018: 30 30 20 33 20 20 20 20 00 3
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
After the patch:
0x000000: 1e 02 f8 00 ff ff ab ab ........
0x000008: ab ab ab ab ab ab ab ab ........
0x000010: ab ab ab ab ab ab ab ab ........
0x000018: ab ab ab ab ab ab ab ab ........
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
0x000000: d2 e2 d2 df d2 db d2 d7 ........
0x000008: d2 d3 d2 cf d2 cb d2 c7 ........
0x000010: d2 c4 d2 c0 d2 bc d2 b8 ........
0x000018: d2 b4 d2 b0 d2 ac d2 a8 ........
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
0x000000: 00 00 00 00 00 00 00 00 ........
0x000008: 00 00 00 00 00 00 00 00 ........
0x000010: 00 00 00 00 00 00 00 00 ........
0x000018: 00 00 00 00 00 00 00 00 ........
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
Reported-by: Moguofang <moguofang@huawei.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Moguofang <moguofang@huawei.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
BUG: using __this_cpu_read() in preemptible [00000000] code: qemu-system-x86/2809
caller is __this_cpu_preempt_check+0x13/0x20
CPU: 2 PID: 2809 Comm: qemu-system-x86 Not tainted 4.11.0+ #13
Call Trace:
dump_stack+0x99/0xce
check_preemption_disabled+0xf5/0x100
__this_cpu_preempt_check+0x13/0x20
get_kvmclock_ns+0x6f/0x110 [kvm]
get_time_ref_counter+0x5d/0x80 [kvm]
kvm_hv_process_stimers+0x2a1/0x8a0 [kvm]
? kvm_hv_process_stimers+0x2a1/0x8a0 [kvm]
? kvm_arch_vcpu_ioctl_run+0xac9/0x1ce0 [kvm]
kvm_arch_vcpu_ioctl_run+0x5bf/0x1ce0 [kvm]
kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? __fget+0xf3/0x210
do_vfs_ioctl+0xa4/0x700
? __fget+0x114/0x210
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x23/0xc2
RIP: 0033:0x7f9d164ed357
? __this_cpu_preempt_check+0x13/0x20
This can be reproduced by run kvm-unit-tests/hyperv_stimer.flat w/
CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT enabled.
Safe access to per-CPU data requires a couple of constraints, though: the
thread working with the data cannot be preempted and it cannot be migrated
while it manipulates per-CPU variables. If the thread is preempted, the
thread that replaces it could try to work with the same variables; migration
to another CPU could also cause confusion. However there is no preemption
disable when reads host per-CPU tsc rate to calculate the current kvmclock
timestamp.
This patch fixes it by utilizing get_cpu/put_cpu pair to guarantee both
__this_cpu_read() and rdtsc() are not preempted.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Initialize asoc_simple_card_init_mic with the correct struct
asoc_simple_jack.
Fixes: 9eac361877 ("ASoC: simple-card: add new asoc_simple_jack and use it")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
If SSI uses shared pin, some SSI will be used as parent SSI.
Then, normal SSI's remove and Parent SSI's remove
(these are same SSI) will be called when unbind or remove timing.
In this case, free_irq() will be called twice.
This patch solve this issue.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Reported-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
sscanf is a very poor way to parse integer. For example, I input
"discard" for act_mask, it gets 0xd and completely messes up. Using
correct API to do integer parse.
This patch also makes attributes accept any base of integer.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
If a malicious user corrupts the refcount btree to cause a cycle between
different levels of the tree, the next mount attempt will deadlock in
the CoW recovery routine while grabbing buffer locks. We can use the
ability to re-grab a buffer that was previous locked to a transaction to
avoid deadlocks, so do that here.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Commit bd698d24b1 ("i2c: designware: Get selected speed mode
sda-hold-time via ACPI") updated the logic that reads the timing
parameters for various I2C bus rates from the DSDT, to only read
the timing parameters for the currently selected mode.
This causes a WARN_ON() splat on platforms that legally omit the clock
frequency from the ACPI description, because in the new situation, the
core I2C designware driver still accesses the fields in the driver
struct that we no longer populate, and proceeds to calculate them from
the clock frequency. Since the clock frequency is unspecified, the
driver complains loudly using a WARN_ON().
So revert back to the old situation, where the struct fields for all
timings are populated, but retain the new logic which chooses the SDA
hold time from the timing mode that is currently in use.
Fixes: bd698d24b1 ("i2c: designware: Get selected speed mode ...")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The way we handle include paths for DT has changed a bit, which
broke a file that had an unconventional way to reference a common
header file:
arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts:47:10: fatal error: include/dt-bindings/input/linux-event-codes.h: No such file or directory
This removes the leading "include/" from the path name, which fixes it.
Fixes: d5d332d3f7 ("devicetree: Move include prefixes from arch to separate directory")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
During xfrm migration copy replay and preplay sequence numbers
from the previous state.
Here is a tcpdump output showing the problem.
10.0.10.46 is running vanilla kernel, is the IKE/IPsec responder.
After the migration it sent wrong sequence number, reset to 1.
The migration is from 10.0.0.52 to 10.0.0.53.
IP 10.0.0.52.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7cf), length 136
IP 10.0.10.46.4500 > 10.0.0.52.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x7cf), length 136
IP 10.0.0.52.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d0), length 136
IP 10.0.10.46.4500 > 10.0.0.52.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x7d0), length 136
IP 10.0.0.53.4500 > 10.0.10.46.4500: NONESP-encap: isakmp: child_sa inf2[I]
IP 10.0.10.46.4500 > 10.0.0.53.4500: NONESP-encap: isakmp: child_sa inf2[R]
IP 10.0.0.53.4500 > 10.0.10.46.4500: NONESP-encap: isakmp: child_sa inf2[I]
IP 10.0.10.46.4500 > 10.0.0.53.4500: NONESP-encap: isakmp: child_sa inf2[R]
IP 10.0.0.53.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d1), length 136
NOTE: next sequence is wrong 0x1
IP 10.0.10.46.4500 > 10.0.0.53.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x1), length 136
IP 10.0.0.53.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d2), length 136
IP 10.0.10.46.4500 > 10.0.0.53.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x2), length 136
Signed-off-by: Antony Antony <antony@phenome.org>
Reviewed-by: Richard Guy Briggs <rgb@tricolour.ca>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The skb must be released in the receive handler since b91a2543b4
("batman-adv: Consume skb in receive handlers"). Just returning NET_RX_DROP
will no longer automatically free the memory. This results in memory leaks
when unicast packets from other backbones must be dropped because they
share a common backbone.
Fixes: 9e794b6bf4 ("batman-adv: drop unicast packets from other backbone gw")
Signed-off-by: Andreas Pape <apape@phoenixcontact.com>
[sven@narfation.org: adjust commit message]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The stats are generated by batadv_interface_stats and must not be stored
directly in the net_device stats member variable. The batadv_priv
bat_counters information is assembled when ndo_get_stats is called. The
stats previously stored in net_device::stats is then overwritten.
The batman-adv counters must therefore be increased when an ARP packet is
answered locally via the distributed arp table.
Fixes: c384ea3ec9 ("batman-adv: Distributed ARP Table - add snooping functions for ARP messages")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The tm-resched-dscr test has started failing sometimes, depending on
what compiler it's built with, eg:
test: tm_resched_dscr
Check DSCR TM context switch: tm-resched-dscr: tm-resched-dscr.c:76: test_body: Assertion `rv' failed.
!! child died by signal 6
When it fails we see that the compiler doesn't initialise rv to 1 before
entering the inline asm block. Although that's counter intuitive, it
is allowed because we tell the compiler that the inline asm will write
to rv (using "=r"), meaning the original value is irrelevant.
Marking it as a read/write parameter would presumably work, but it seems
simpler to fix it by setting the initial value of rv in the inline asm.
Fixes: 96d0161086 ("powerpc: Correct DSCR during TM context switch")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Michael Neuling <mikey@neuling.org>
In case of error, the function of_iomap() returns NULL pointer
not ERR_PTR(). The IS_ERR() test in the return value check should
be replaced with NULL test.
Fixes: e78f3d15e1 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets")
Reviewed-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
The ICH9 is listed as having TCO v2, and indeed the behavior in the
datasheet corresponds to v2 (for example the NO_REBOOT flag is
accessible via the 16KiB-aligned Root Complex Base Address).
However, the TCO counts twice just like in v1; the documentation
of the SECOND_TO_STS bit says: "ICH9 sets this bit to 1 to indicate
that the TIMEOUT bit had been (or is currently) set and a second
timeout occurred before the TCO_RLD register was written. If this
bit is set and the NO_REBOOT config bit is 0, then the ICH9 will
reboot the system after the second timeout. The same can be found
in the BayTrail (Atom E3800) datasheet, and even HOWTOs around
the Internet say that it will reboot after _twice_ the specified
heartbeat.
I did not find the Apollo Lake datasheet, but because v4/v5 has
a SECOND_TO_STS bit just like the previous version I'm enabling
this for Apollo Lake as well.
Cc: linux-watchdog@vger.kernel.org
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
gcc-7 notices that the length we pass to strncat is wrong:
drivers/firmware/ti_sci.c: In function 'ti_sci_probe':
drivers/firmware/ti_sci.c:204:32: error: specified bound 50 equals the size of the destination [-Werror=stringop-overflow=]
Instead of the total length, we must pass the length of the
remaining space here.
Fixes: aa276781a6 ("firmware: Add basic support for TI System Control Interface (TI-SCI) protocol")
Cc: stable@vger.kernel.org
Acked-by: Nishanth Menon <nm@ti.com>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Johan writes:
USB-serial fixes for v4.12-rc2
Here's a fix for a long-standing issue in the ftdi_sio driver that
prevented unprivileged users from updating the low-latency flag,
something which became apparent after a recent change that restored the
older setting of not using low-latency mode by default.
A run of sparse revealed a couple of endianness issues that are now
fixed, and addressed is also a user-triggerable division-by-zero in
io_ti when debugging is enabled.
Finally there are some new device ids, including a simplification of how
we deal with a couple of older Olimex JTAG adapters.
All have been in linux-next with no reported issues.
Signed-off-by: Johan Hovold <johan@kernel.org>
When moving a merge dir or non-dir with copy up origin into a non-merge
upper dir (a.k.a pure upper dir), we are marking the target parent dir
"impure". ovl_iterate() iterates pure upper dirs directly, because there is
no need to filter out whiteouts and merge dir content with lower dir. But
for the case of an "impure" upper dir, ovl_iterate() will not be able to
iterate the real upper dir directly, because it will need to lookup the
origin inode and use it to fill d_ino.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
On failure to set opaque/redirect xattr on rename, skip setting xattr and
return -EXDEV.
On failure to set opaque xattr when creating a new directory, -EIO is
returned instead of -EOPNOTSUPP.
Any failure to set those xattr will be recorded in super block and
then setting any xattr on upper won't be attempted again.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
The devm_gpiod_get_optional() function appends a "-gpios" to the
string passed to it, so if we want to find the "power-gpios" signal,
we must pass "power" to this function.
Fixes: 01d9584333 ("mmc: cavium: Add MMC support for Octeon SOCs.")
Signed-off-by: David Daney <david.daney@cavium.com>
[jglauber@cavium.com: removed point after subject line]
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
OCTEON SoCs with CIU3 do not have interrupt masking local to the MMC
bus interface. Unfortunately, some even have a diagnostic register at
the same address of the enable register, which causes the interrupts
to fire immediately if stored to, thus breaking the driver. The proper
action on these SoCs is not to touch this register.
Fixes: 01d9584333 ("mmc: cavium: Add MMC support for Octeon SOCs.")
Signed-off-by: David Daney <david.daney@cavium.com>
[jglauber@cavium.com: removed point after subject line]
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Fixes for omaps for v4.12-rc cycle most consisting of few minor dts fixes
for various devices. Also included is a memory controller (GPMC) debug output
fix as without that the shown bootloader configured GPMC bus width will
be wrong and won't work for kernel timings:
- Add dra7 powerhold configuration to be able to shut down pmic correctly
- Fix polarity for gta04 mcbsp4 clocks for modem
- Fix Pandaboard CEC pin pull making it usable
- Fix LogicPD Torpedo camera pin mux
- Fix GPMC debug bus width
- Reduce cpu thermal shutdown temperature
* tag 'omap-for-v4.12/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: dra7: Reduce cpu thermal shutdown temperature
memory: omap-gpmc: Fix debug output for access width
ARM: dts: LogicPD Torpedo: Fix camera pin mux
ARM: dts: omap4: enable CEC pin for Pandaboard A4 and ES
ARM: dts: gta04: fix polarity of clocks for mcbsp4
ARM: dts: dra7: Add power hold and power controller properties to palmas
Signed-off-by: Olof Johansson <olof@lixom.net>
i.MX fixes for 4.12:
- A fix on GPCv2 power domain driver Kconfig which causes a build
failure when CONFIG_PM is not set.
- Pull down PMIC IRQ pin for imx53-qsrb board to prevent spurious
PMIC interrupts from happening.
- Remove board level OPP override for imx6sx-sdb to fix a boot crash
seen on Rev.C boards.
* tag 'imx-fixes-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
soc: imx: add PM dependency for IMX7_PM_DOMAINS
ARM: dts: imx6sx-sdb: Remove OPP override
ARM: dts: imx53-qsrb: Pulldown PMIC IRQ pin
Signed-off-by: Olof Johansson <olof@lixom.net>
Enable Qualcomm drivers needed to boot Dragonboard 410c with HDMI. This
enables support for clocks, regulators, and USB PHY.
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
[Olof: Turned off _RPM configs per follow-up email]
Signed-off-by: Olof Johansson <olof@lixom.net>
Currently, the xenon_clean_phy() is only used for freeing phy_params.
The phy_params is allocated by devm_kzalloc(), there's no need to free
is explicitly.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Acked-by: Hu Ziji <huziji@marvell.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Sync the defconfig with savedefconfig as config options change/move over
time.
Generated with the following commands:
make defconfig
make savedefconfig
cp defconfig arch/arm64/configs/defconfig
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
It makes sense to have a stripped-down defconfig for just Gemini, as
it is a pretty small platform used in NAS etc, and will use appended
device tree. It is also quick to compile and test. Hopefully this
defconfig can be a good base for distributions such as OpenWRT.
I plan to add in the config options needed for the different
variants of Gemini as we go along.
Cc: Janos Laube <janos.dev@gmail.com>
Cc: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
This pull request contains Broadcom SoC drivers fixes for 4.12, please
pull the following:
- Florian removes the duplicate compatible string matched by the
SUN_TOP_CTRL driver and instead uses the correct one for 7435
* tag 'arm-soc/for-4.12/drivers-fixes' of http://github.com/Broadcom/stblinux:
soc: bcm: brcmstb: Correctly match 7435 SoC
Signed-off-by: Olof Johansson <olof@lixom.net>
This pull request contains Broadcom ARM-based SoC Device Tree fixes for
4.12, please pull the following:
- Baruch provides several fixes for the Raspberry Pi (BCM2835) Device
Tree source include file: uart0 pinctrl node names, pin number for
i2c0, uart0 rts/cts pins and invalid uart1 pin, missing numbers for
ethernet aliases
* tag 'arm-soc/for-4.12/devicetree-fixes' of http://github.com/Broadcom/stblinux:
ARM: dts: bcm2835: add index to the ethernet alias
ARM: dts: bcm2835: fix uart0/uart1 pins
ARM: dts: bcm2835: fix i2c0 pins
ARM: dts: bcm2835: fix uart0 pinctrl node names
Signed-off-by: Olof Johansson <olof@lixom.net>
We use a directory under arch/$ARCH/boot/dts as an include path
that has links outside of the subtree to find dt-bindings from under
include/dt-bindings. That's been working well, but new DT architectures
haven't been adding them by default.
Recently there's been a desire to share some of the DT material between
arm and arm64, which originally caused developers to create symlinks or
relative includes between the subtrees. This isn't ideal -- it breaks
if the DT files aren't stored in the exact same hierarchy as the kernel
tree, and generally it's just icky.
As a somewhat cleaner solution we decided to add a $ARCH/ prefix link
once, and allow DTS files to reference dtsi (and dts) files in other
architectures that way.
Original approach was to create these links under each architecture,
but it lead to the problem of recursive symlinks.
As a remedy, move the include link directories out of the architecture
trees into a common location. At the same time, they can now share one
directory and one dt-bindings/ link as well.
Fixes: 4027494ae6 ('ARM: dts: add arm/arm64 include symlinks')
Reported-by: Russell King <linux@armlinux.org.uk>
Reported-by: Omar Sandoval <osandov@osandov.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: linux-arch <linux-arch@vger.kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
We've received a few fixes branches with -rc1 as base, but our contents was
still at pre-rc1. Merge it in expliticly to make 'git merge --log' clear on
hat was actually merged.
Signed-off-by: Olof Johansson <olof@lixom.net>
There are some leftovers testing for pvh guest mode in pv-only source
files. Remove them.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
virt_addr_valid() is supposed to tell you if it's OK to call virt_to_page() on
an address. What this means in practice is that it should only return true for
addresses in the linear mapping which are backed by a valid PFN.
We are failing to properly check that the address is in the linear mapping,
because virt_to_pfn() will return a valid looking PFN for more or less any
address. That bug is actually caused by __pa(), used in virt_to_pfn().
eg: __pa(0xc000000000010000) = 0x10000 # Good
__pa(0xd000000000010000) = 0x10000 # Bad!
__pa(0x0000000000010000) = 0x10000 # Bad!
This started happening after commit bdbc29c19b ("powerpc: Work around gcc
miscompilation of __pa() on 64-bit") (Aug 2013), where we changed the definition
of __pa() to work around a GCC bug. Prior to that we subtracted PAGE_OFFSET from
the value passed to __pa(), meaning __pa() of a 0xd or 0x0 address would give
you something bogus back.
Until we can verify if that GCC bug is no longer an issue, or come up with
another solution, this commit does the minimal fix to make virt_addr_valid()
work, by explicitly checking that the address is in the linear mapping region.
Fixes: bdbc29c19b ("powerpc: Work around gcc miscompilation of __pa() on 64-bit")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Tested-by: Breno Leitao <breno.leitao@gmail.com>
In lower layer driver's (LLD) scsi_host_template, the driver may
optionally ask SCSI to allocate its private driver memory for each
command, by specifying cmd_size. This memory is allocated at the end of
scsi_cmnd by SCSI. Later when SCSI queues a command, the LLD can use
scsi_cmd_priv to get to its private data.
Some LLD, e.g. hv_storvsc, doesn't clear its private data before use. In
this case, the LLD may get to stale or uninitialized data in its private
driver memory. This may result in unexpected driver and hardware
behavior.
Fix this problem by also zeroing the private driver memory before
passing them to LLD.
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: KY Srinivasan <kys@microsoft.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
CC: <stable@vger.kernel.org> # 4.11+
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
mbp pointer is passed to csio_hw_validate_caps() so call mempool_free()
after calling csio_hw_validate_caps().
Signed-off-by: Varun Prakash <varun@chelsio.com>
Fixes: 541c571fa2 ("csiostor:Use firmware version from cxgb4/t4fw_version.h")
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When reloading module these two attributes aren't cleaned up properly
and they persist causing warnings when trying to load module
again. Additionally they are not recreated properly due to that.
Signed-off-by: Michał Potomski <michalx.potomski@intel.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drm/i915 fixes for v4.12-rc2
* tag 'drm-intel-fixes-2017-05-18-1' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: don't do allocate_va_range again on PIN_UPDATE
drm/i915: Fix rawclk readout for g4x
drm/i915: Fix runtime PM for LPE audio
drm/i915/glk: Fix DSI "*ERROR* ULPS is still active" messages
drm/i915/gvt: avoid unnecessary vgpu switch
drm/i915/gvt: not to restore in-context mmio
drm/i915/gvt: fix typo: "supporte" -> "support"
Driver Changes:
- host1x: Fix link error when host1x is built-in and iova is a module (Arnd)
- hlcdc: Fix arguments passed to drm_of_find_panel_or_bridge (Boris)
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
* tag 'drm-misc-fixes-2017-05-18' of git://anongit.freedesktop.org/git/drm-misc:
drm/atmel-hlcdc: Fix output initialization
gpu: host1x: select IOMMU_IOVA
Pull MD fixes from Shaohua Li:
- Several bug fixes for raid5-cache from Song Liu, mainly handle
journal disk error
- Fix bad block handling in choosing raid1 disk from Tomasz Majchrzak
- Simplify external metadata array sysfs handling from Artur
Paszkiewicz
- Optimize raid0 discard handling from me, now raid0 will dispatch
large discard IO directly to underlayer disks.
* tag 'md/4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
raid1: prefer disk without bad blocks
md/r5cache: handle sync with data in write back cache
md/r5cache: gracefully handle journal device errors for writeback mode
md/raid1/10: avoid unnecessary locking
md/raid5-cache: in r5l_do_submit_io(), submit io->split_bio first
md/md0: optimize raid0 discard handling
md: don't return -EAGAIN in md_allow_write for external metadata arrays
md/raid5: make use of spin_lock_irq over local_irq_disable + spin_lock
Fixes the following sparse warnings:
net/9p/trans_xen.c:528:5: warning:
symbol 'p9_trans_xen_init' was not declared. Should it be static?
net/9p/trans_xen.c:540:6: warning:
symbol 'p9_trans_xen_exit' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
In case of error, the function xenbus_read() returns ERR_PTR() and never
returns NULL. The NULL test in the return value check should be replaced
with IS_ERR().
Fixes: 71ebd71921 ("xen/9pfs: connect to the backend")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Pull networking fixes from David Miller:
1) Don't allow negative TCP reordering values, from Soheil Hassas
Yeganeh.
2) Don't overflow while parsing ipv6 header options, from Craig Gallek.
3) Handle more cleanly the case where an individual route entry during
a dump will not fit into the allocated netlink SKB, from David
Ahern.
4) Add missing CONFIG_INET dependency for mlx5e, from Arnd Bergmann.
5) Allow neighbour updates to converge more quickly via gratuitous
ARPs, from Ihar Hrachyshka.
6) Fix compile error from CONFIG_INET is disabled, from Eric Dumazet.
7) Fix use after free in x25 protocol init, from Lin Zhang.
8) Valid VLAN pvid ranges passed into br_validate(), from Tobias
Jungel.
9) NULL out address lists in child sockets in SCTP, this is similar to
the fix we made for inet connection sockets last week. From Eric
Dumazet.
10) Fix NULL deref in mlxsw driver, from Ido Schimmel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
mlxsw: spectrum: Avoid possible NULL pointer dereference
sh_eth: Do not print an error message for probe deferral
sh_eth: Use platform device for printing before register_netdev()
mlxsw: spectrum_router: Fix rif counter freeing routine
mlxsw: spectrum_dpipe: Fix incorrect entry index
cxgb4: update latest firmware version supported
qmi_wwan: add another Lenovo EM74xx device ID
sctp: do not inherit ipv6_{mc|ac|fl}_list from parent
udp: make *udp*_queue_rcv_skb() functions static
bridge: netlink: check vlan_default_pvid range
net: ethernet: faraday: To support device tree usage.
net: x25: fix one potential use-after-free issue
bpf: adjust verifier heuristics
ipv6: Check ip6_find_1stfragopt() return value properly.
selftests/bpf: fix broken build due to types.h
bnxt_en: Check status of firmware DCBX agent before setting DCB_CAP_DCBX_HOST.
bnxt_en: Call bnxt_dcb_init() after getting firmware DCBX configuration.
net: fix compile error in skb_orphan_partial()
ipv6: Prevent overrun when parsing v6 header options
neighbour: update neigh timestamps iff update is effective
...
Pull Kbuild fix from Masahiro Yamada:
"Fix headers_install to not delete pre-existing headers in the install
destination"
* tag 'kbuild-fixes-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: skip install/check of headers right under uapi directories
Pull pid namespace fixes from Eric Biederman:
"These are two bugs that turn out to have simple fixes that were
reported during the merge window. Both of these issues have existed
for a while and it just happens that they both were reported at almost
the same time"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes()
pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes
To fix following build error when SOFTWARE_REBOOT is defined:
CC [M] driver/watchdog/wdt_pci.o
driver/watchdog/wdt_pci.c: In function 'wdtpci_interrupt':
driver/watchdog/wdt_pci.c:335:3: error: too many arguments to function 'emergency_restart'
emergency_restart(NULL);
^
In file included from driver/watchdog/wdt_pci.c:51:0:
include/linux/reboot.h:80:13: note: declared here
extern void emergency_restart(void);
^
Signed-off-by: Shile Zhang <shile.zhang@nokia.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
wdt_timeout must not be initialized to CDNS_WDT_DEFAULT_TIMEOUT in
order to allow the value to be overriddden by a device tree setting.
This way, the default timeout value will be used only in case module_param
has not been set, or device tree timeout-sec has not been defined.
Signed-off-by: Tomas Melin <tomas.melin@vaisala.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
WDT_MR and WDT_CR must not updated within three slow clock periods after
the last ping (write to WDT_CR or WDT_MR). Ensure enough time has elapsed
before writing those registers.
wdt_write() waits for 4 periods to ensure at least 3 edges are seen by the
IP.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Wenyou.Yang <wenyou.yang@microchip.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
The datasheet states: "When setting the WDDIS bit, and while it is set, the
fields WDV and WDD must not be modified."
Because the whole configuration is already cached inside .mr, wait for the
user to enable the watchdog to configure it so it is enabled and configured
at the same time (what the IP is actually expecting).
When the watchdog is already enabled, it is not an issue to reconfigure it.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Wenyou.Yang <wenyou.yang@microchip.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
I ran into one corner case with the orion watchdog using the
atomic_io_modify interface:
drivers/watchdog/orion_wdt.o: In function `orion_stop':
orion_wdt.c:(.text.orion_stop+0x28): undefined reference to `atomic_io_modify'
drivers/watchdog/orion_wdt.o: In function `armada375_stop':
orion_wdt.c:(.text.armada375_stop+0x28): undefined reference to `atomic_io_modify'
This function is available on all 32-bit ARM builds except for ebsa110, so
we have to specifically exclude that from compile-testing.
Fixes: da2a68b3eb ("watchdog: Enable COMPILE_TEST where possible")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Pull hwmon fix from Guenter Roeck:
"Fix problem with hotplug state machine in coretemp driver"
* tag 'hwmon-for-linus-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (coretemp) Handle frozen hotplug state correctly
Add a new interface for registering a serdev controller and clients, and
a helper function to deregister serdev devices (or a tty device) that
were previously registered using the new interface.
Once every driver currently using the tty_port_register_device() helpers
have been vetted and converted to use the new serdev registration
interface (at least for deregistration), we can move serdev registration
to the current helpers and get rid of the serdev-specific functions.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case we got an FDB notification for a port that doesn't exist we
execute an FDB entry delete to prevent it from re-appearing the next
time we poll for notifications.
If the operation failed we would trigger a NULL pointer dereference as
'mlxsw_sp_port' is NULL.
Fix it by reporting the error using the underlying bus device instead.
Fixes: 12f1501e75 ("mlxsw: spectrum: remove FDB entry in case we get unknown object notification")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EPROBE_DEFER is not an error, hence printing an error message like
sh-eth ee700000.ethernet: failed to initialise MDIO
may confuse the user.
To fix this, suppress the error message in case of probe deferral.
While at it, shorten the message, and add the actual error code.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MDIO initialization failure message is printed using the network
device, before it has been registered, leading to:
(null): failed to initialise MDIO
Use the platform device instead to fix this:
sh-eth ee700000.ethernet: failed to initialise MDIO
Fixes: daacf03f0b ("sh_eth: Register MDIO bus before registering the network device")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko says:
====================
mlxsw: couple of fixes
Couple of fixes from Arkadi
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
During rif counter freeing the counter index can be invalid. Add check
of validity before freeing the counter.
Fixes: e0c0afd8aa ("mlxsw: spectrum: Support for counters on router interfaces")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of disabled counters the entry index will be incorrect. Fix this
by moving the entry index set before the counter status check.
Fixes: 2ba5999f00 ("mlxsw: spectrum: Add Support for erif table entries access")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes several issues:
- if the 1st 'kzalloc' fails, we dereference a NULL pointer
- if the 2nd 'kzalloc' fails, there is a memory leak
- if 'sysfs_create_bin_file' fails there is also a memory leak
Fix it by adding a test after the first memory allocation and some error
handling paths to correctly free memory if needed.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
/dev/mem currently allows mmap() mappings that wrap around the end of
the physical address space, which should probably be illegal. It
circumvents the existing STRICT_DEVMEM permission check because the loop
immediately terminates (as the start address is already higher than the
end address). On the x86_64 architecture it will then cause a panic
(from the BUG(start >= end) in arch/x86/mm/pat.c:reserve_memtype()).
This patch adds an explicit check to make sure offset + size will not
wrap around in the physical address type.
Signed-off-by: Julius Werner <jwerner@chromium.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Starting with commit 6fe729c4bd ("serdev: Add serdev_device_write
subroutine") the function serdev_device_write_buf cannot be used in
atomic context anymore (mutex_lock is sleeping). So restore the old
behavior.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: 6fe729c4bd ("serdev: Add serdev_device_write subroutine")
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With serdev we might end up with serial ports that have no cdev exported
to userspace, as they are used as the bus interface to other devices. In
that case serial_match_port() won't be able to find a matching tty_dev.
Skip the irq wakeup enabling in that case, as serdev will make sure to
keep the port active, as long as there are devices depending on it.
Fixes: 8ee3fde047 (tty_port: register tty ports with serdev bus)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tty_insert_flip_string_fixed_flag() is racy against itself when called
from the ioctl(TCXONC, TCION/TCIOFF) path [1] and the flush_to_ldisc()
workqueue path [2].
The problem is that port->buf.tail->used is modified without consistent
locking; the ioctl path takes tty->atomic_write_lock, whereas the workqueue
path takes ldata->output_lock.
We cannot simply take ldata->output_lock, since that is specific to the
N_TTY line discipline.
It might seem natural to try to take port->buf.lock inside
tty_insert_flip_string_fixed_flag() and friends (where port->buf is
actually used/modified), but this creates problems for flush_to_ldisc()
which takes it before grabbing tty->ldisc_sem, o_tty->termios_rwsem,
and ldata->output_lock.
Therefore, the simplest solution for now seems to be to take
tty->atomic_write_lock inside tty_port_default_receive_buf(). This lock
is also used in the write path [3] with a consistent ordering.
[1]: Call Trace:
tty_insert_flip_string_fixed_flag
pty_write
tty_send_xchar // down_read(&o_tty->termios_rwsem)
// mutex_lock(&tty->atomic_write_lock)
n_tty_ioctl_helper
n_tty_ioctl
tty_ioctl // down_read(&tty->ldisc_sem)
do_vfs_ioctl
SyS_ioctl
[2]: Workqueue: events_unbound flush_to_ldisc
Call Trace:
tty_insert_flip_string_fixed_flag
pty_write
tty_put_char
__process_echoes
commit_echoes // mutex_lock(&ldata->output_lock)
n_tty_receive_buf_common
n_tty_receive_buf2
tty_ldisc_receive_buf // down_read(&o_tty->termios_rwsem)
tty_port_default_receive_buf // down_read(&tty->ldisc_sem)
flush_to_ldisc // mutex_lock(&port->buf.lock)
process_one_work
[3]: Call Trace:
tty_insert_flip_string_fixed_flag
pty_write
n_tty_write // mutex_lock(&ldata->output_lock)
// down_read(&tty->termios_rwsem)
do_tty_write (inline) // mutex_lock(&tty->atomic_write_lock)
tty_write // down_read(&tty->ldisc_sem)
__vfs_write
vfs_write
SyS_write
The bug can result in about a dozen different crashes depending on what
exactly gets corrupted when port->buf.tail->used points outside the
buffer.
The patch passes my LOCKDEP/PROVE_LOCKING testing but more testing is
always welcome.
Found using syzkaller.
Cc: <stable@vger.kernel.org>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Straighten out the initcall error handling to avoid deregistering a
never-registered tty driver (something which would lead to a
NULL-pointer dereference) in the most unlikely event that driver
registration fails (e.g. we've run out of major numbers).
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make sure to deregister the SPI driver before releasing the tty driver
to avoid use-after-free in the SPI remove callback where the tty
devices are deregistered.
Fixes: 72d4724ea5 ("serial: ifx6x60: Add modem power off function in the platform reboot process")
Cc: stable <stable@vger.kernel.org> # 3.8
Cc: Jun Chen <jun.d.chen@intel.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver does ioremap(port->mapbase, ALTERA_JTAGUART_SIZE),
but there is no any iounmap().
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After migrating 8250_exar to MSI in 172c33cb61, we can get stuck
without further interrupts because of the special wake-up event these
chips send. They are only cleared by reading INT0. As we fail to do so
during startup and shutdown, we can leave the interrupt line asserted,
which is fatal with edge-triggered MSIs.
Add the required reading of INT0 to startup and shutdown. Also account
for the fact that a pending wake-up interrupt means we have to return 1
from exar_handle_irq. Drop the unneeded reading of INT1..3 along with
this - those never reset anything.
An alternative approach would have been disabling the wake-up interrupt.
Unfortunately, this feature (REGB[17] = 1) is not available on the
XR17D15X.
Fixes: 172c33cb61 ("serial: exar: Enable MSI support")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UARTn_FRAME_PARITY_ODD is 0x0300
UARTn_FRAME_PARITY_EVEN is 0x0200
So if the UART is configured for EVEN parity, it would be reported as ODD.
Fix it by correctly testing if the 2 bits are set.
Fixes: 3afbd89c96 ("serial/efm32: add new driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The port client data must be set when registering the serdev controller
or client deregistration will fail (and the serdev devices are left
registered and allocated) if the port was never opened in between.
Make sure to clear the port client data on any probe errors to avoid a
use-after-free when the client is later deregistered unconditionally
(e.g. in a tty-port deregistration helper).
Also move port client operation initialisation to registration. Note
that the client ops must be restored on failed probe.
Fixes: bed35c6dfa ("serdev: add a tty port controller driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 8ee3fde047.
The new serdev bus hooked into the tty layer in
tty_port_register_device() by registering a serdev controller instead of
a tty device whenever a serdev client is present, and by deregistering
the controller in the tty-port destructor. This is broken in several
ways:
Firstly, it leads to a NULL-pointer dereference whenever a tty driver
later deregisters its devices as no corresponding character device will
exist.
Secondly, far from every tty driver uses tty-port refcounting (e.g.
serial core) so the serdev devices might never be deregistered or
deallocated.
Thirdly, deregistering at tty-port destruction is too late as the
underlying device and structures may be long gone by then. A port is not
released before an open tty device is closed, something which a
registered serdev client can prevent from ever happening. A driver
callback while the device is gone typically also leads to crashes.
Many tty drivers even keep their ports around until the driver is
unloaded (e.g. serial core), something which even if a late callback
never happens, leads to leaks if a device is unbound from its driver and
is later rebound.
The right solution here is to add a new tty_port_unregister_device()
helper and to never call tty_device_unregister() whenever the port has
been claimed by serdev, but since this requires modifying just about
every tty driver (and multiple subsystems) it will need to be done
incrementally.
Reverting the offending patch is the first step in fixing the broken
lifetime assumptions. A follow-up patch will add a new pair of
tty-device registration helpers, which a vetted tty driver can use to
support serdev (initially serial core). When every tty driver uses the
serdev helpers (at least for deregistration), we can add serdev
registration to tty_port_register_device() again.
Note that this also fixes another issue with serdev, which currently
allocates and registers a serdev controller for every tty device
registered using tty_port_device_register() only to immediately
deregister and deallocate it when the corresponding OF node or serdev
child node is missing. This should be addressed before enabling serdev
for hot-pluggable buses.
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When IOMMU_IOVA is not built-in but host1x is, we get a link error:
drivers/gpu/host1x/dev.o: In function `host1x_remove':
dev.c:(.text.host1x_remove+0x50): undefined reference to `put_iova_domain'
drivers/gpu/host1x/dev.o: In function `host1x_probe':
dev.c:(.text.host1x_probe+0x31c): undefined reference to `init_iova_domain'
dev.c:(.text.host1x_probe+0x38c): undefined reference to `put_iova_domain'
drivers/gpu/host1x/cdma.o: In function `host1x_cdma_init':
cdma.c:(.text.host1x_cdma_init+0x238): undefined reference to `alloc_iova'
cdma.c:(.text.host1x_cdma_init+0x2c0): undefined reference to `__free_iova'
drivers/gpu/host1x/cdma.o: In function `host1x_cdma_deinit':
cdma.c:(.text.host1x_cdma_deinit+0xb0): undefined reference to `free_iova'
This adds the same select statement that we have for drm_tegra.
Fixes: 404bfb78da ("gpu: host1x: Add IOMMU support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170419182449.885312-1-arnd@arndb.de
Commit fa01e2ca9f ("serial: 8250: Integrate Fintek into 8250_base")
modified the probing logic for PNP0501 devices, to remove a collision
between the generic 16550A driver and the Fintek driver, which reused
the same ACPI _HID.
The Fintek device probe is now incorporated into the common 8250 probe
path, and gets called for all discovered 16550A compatible devices,
including ones that are MMIO mapped rather than IO mapped. However,
the Fintek driver assumes the port base is a I/O address, and proceeds
to probe some arbitrary offsets above it.
This is generally a wrong thing to do, but on ARM systems (having no
native port I/O), this may result in faulting accesses of completely
unrelated MMIO regions in the PCI I/O space. Given that this is at
serial probe time, this results in hard to diagnose crashes at boot.
So let's restrict the Fintek probe to devices that we know are using
port I/O in the first place.
Fixes: fa01e2ca9f ("serial: 8250: Integrate Fintek into 8250_base")
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ricardo Ribalda <ricardo.ribalda@gmail.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Change t4fw_version.h to update latest firmware version
number to 1.16.43.0.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In their infinite wisdom, and never ending quest for end user frustration,
Lenovo has decided to use a new USB device ID for the wwan modules in
their 2017 laptops. The actual hardware is still the Sierra Wireless
EM7455 or EM7430, depending on region.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the udp memory accounting refactor, we don't need any more
to export the *udp*_queue_rcv_skb(). Make them static and fix
a couple of sparse warnings:
net/ipv4/udp.c:1615:5: warning: symbol 'udp_queue_rcv_skb' was not
declared. Should it be static?
net/ipv6/udp.c:572:5: warning: symbol 'udpv6_queue_rcv_skb' was not
declared. Should it be static?
Fixes: 850cbaddb5 ("udp: use it's own memory accounting schema")
Fixes: c915fe13cb ("udplite: fix NULL pointer dereference")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently it is allowed to set the default pvid of a bridge to a value
above VLAN_VID_MASK (0xfff). This patch adds a check to br_validate and
returns -EINVAL in case the pvid is out of bounds.
Reproduce by calling:
[root@test ~]# ip l a type bridge
[root@test ~]# ip l a type dummy
[root@test ~]# ip l s bridge0 type bridge vlan_filtering 1
[root@test ~]# ip l s bridge0 type bridge vlan_default_pvid 9999
[root@test ~]# ip l s dummy0 master bridge0
[root@test ~]# bridge vlan
port vlan ids
bridge0 9999 PVID Egress Untagged
dummy0 9999 PVID Egress Untagged
Fixes: 0f963b7592 ("bridge: netlink: add support for default_pvid")
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Tobias Jungel <tobias.jungel@bisdn.de>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
xattr are needed by overlayfs for setting opaque dir, redirect dir
and copy up origin.
Check at mount time by trying to set the overlay.opaque xattr on the
workdir and if that fails issue a warning message.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
The function x25_init is not properly unregister related resources
on error handler.It is will result in kernel oops if x25_init init
failed, so add properly unregister call on error handler.
Also, i adjust the coding style and make x25_register_sysctl properly
return failure.
Signed-off-by: linzhang <xiaolou4617@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have one register for each EP to set the maximum packet size for both
TX and RX.
If for example an RX programming would happen before the previous TX
transfer finishes we would reset the TX packet side.
To fix this issue, only modify the TX or RX part of the register.
Fixes: 550a7375fe ("USB: Add MUSB and TUSB support")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit d8e5f0eca1 ("usb: musb: Fix hardirq-safe hardirq-unsafe
lock order error") caused a regression where musb keeps trying to
enable host mode with no cable connected. This seems to be caused
by the fact that now phy is enabled earlier, and we are wrongly
trying to force USB host mode on an OTG port. The errors we are
getting are "trying to suspend as a_idle while active".
For ports configured as OTG, we should not need to do anything
to try to force USB host mode on it's OTG port. Trying to force host
mode in this case just seems to completely confuse the musb state
machine.
Let's fix the issue by making musb_host_setup() attempt to force the
mode only if port_mode is configured for host mode.
Fixes: d8e5f0eca1 ("usb: musb: Fix hardirq-safe hardirq-unsafe lock order error")
Cc: Johan Hovold <johan@kernel.org>
Cc: stable <stable@vger.kernel.org>
Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reported-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In 4.11 TRB completion codes were renamed to match spec.
Completion codes for command ring stopped and endpoint stopped
were mixed, leading to failures while handling a stopped command ring.
Use the correct completion code for command ring stopped events.
Fixes: 0b7c105a04 ("usb: host: xhci: rename completion codes to match spec")
Cc: <stable@vger.kernel.org> # 4.11
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With threaded interrupts, bottom-half handlers are called with
interrupts enabled. Therefore they can't safely use spin_lock(); they
have to use spin_lock_irqsave(). Lockdep warns about a violation
occurring in xhci_irq():
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
4.11.0-rc8-dbg+ #1 Not tainted
---------------------------------------------------------
swapper/7/0 just changed the state of lock:
(&(&ehci->lock)->rlock){-.-...}, at: [<ffffffffa0130a69>]
ehci_hrtimer_func+0x29/0xc0 [ehci_hcd]
but this lock took another, HARDIRQ-unsafe lock in the past:
(hcd_urb_list_lock){+.....}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(hcd_urb_list_lock);
local_irq_disable();
lock(&(&ehci->lock)->rlock);
lock(hcd_urb_list_lock);
<Interrupt>
lock(&(&ehci->lock)->rlock);
*** DEADLOCK ***
no locks held by swapper/7/0.
the shortest dependencies between 2nd lock and 1st lock:
-> (hcd_urb_list_lock){+.....} ops: 252 {
HARDIRQ-ON-W at:
__lock_acquire+0x602/0x1280
lock_acquire+0xd5/0x1c0
_raw_spin_lock+0x2f/0x40
usb_hcd_unlink_urb_from_ep+0x1b/0x60 [usbcore]
xhci_giveback_urb_in_irq.isra.45+0x70/0x1b0 [xhci_hcd]
finish_td.constprop.60+0x1d8/0x2e0 [xhci_hcd]
xhci_irq+0xdd6/0x1fa0 [xhci_hcd]
usb_hcd_irq+0x26/0x40 [usbcore]
irq_forced_thread_fn+0x2f/0x70
irq_thread+0x149/0x1d0
kthread+0x113/0x150
ret_from_fork+0x2e/0x40
This patch fixes the problem.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to xHCI spec Figure 30: Interrupt Throttle Flow Diagram
If PCI Message Signaled Interrupts (MSI or MSI-X) are enabled,
then the assertion of the Interrupt Pending (IP) flag in Figure 30
generates a PCI Dword write. The IP flag is automatically cleared
by the completion of the PCI write.
the MSI enabled HCs don't need to clear interrupt pending bit, but
hcd->irq = 0 doesn't equal to MSI enabled HCD. At some Dual-role
controller software designs, it sets hcd->irq as 0 to avoid HCD
requesting interrupt, and they want to decide when to call usb_hcd_irq
by software.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to xHCI ch4.20 Scratchpad Buffers, the Scratchpad
Buffer needs to be zeroed.
...
The following operations take place to allocate
Scratchpad Buffers to the xHC:
...
b. Software clears the Scratchpad Buffer to '0'
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Smatch complains that we check cap the upper bound of "index" but don't
check for negatives. It's a false positive because "index" is never
negative. But it's also simple enough to make it unsigned which makes
the code easier to audit.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
KVM/ARM Fixes for v4.12-rc2.
Includes:
- A fix for a build failure introduced in -rc1 when tracepoints are
enabled on 32-bit ARM.
- Disabling use of stack pointer protection in the hyp code which can
cause panics.
- A handful of VGIC fixes.
- A fix to the init of the redistributors on GICv3 systems that
prevented boot with kvmtool on GICv3 systems introduced in -rc1.
- A number of race conditions fixed in our MMU handling code.
- A fix for the guest being able to program the debug extensions for
the host on the 32-bit side.
The patch in the Fixes references COMPAT_XT_ALIGN in the definition
of XT_DATA_TO_USER, outside an #ifdef CONFIG_COMPAT block.
Split XT_DATA_TO_USER into separate compat and non compat variants and
define the first inside an CONFIG_COMPAT block.
This simplifies both variants by removing branches inside the macro.
Fixes: 324318f024 ("netfilter: xtables: zero padding in data_to_user")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We were not holding the kvm->slots_lock as required when calling
kvm_io_bus_unregister_dev() as required.
This only affects the error path, but still, let's do our due
diligence.
Reported by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
If userspace creates the VCPUs after initializing the VGIC, then we end
up in a situation where we trigger a bug in kvm_vcpu_get_idx(), because
it is called prior to adding the VCPU into the vcpus array on the VM.
There is no tight coupling between the VCPU index and the area of the
redistributor region used for the VCPU, so we can simply ensure that all
creations of redistributors are serialized per VM, and increment an
offset when we successfully add a redistributor.
The vgic_register_redist_iodev() function can be called from two paths:
vgic_redister_all_redist_iodev() which is called via the kvm_vgic_addr()
device attribute handler. This patch already holds the kvm->lock mutex.
The other path is via kvm_vgic_vcpu_init, which is called through a
longer chain from kvm_vm_ioctl_create_vcpu(), which releases the
kvm->lock mutex just before calling kvm_arch_vcpu_create(), so we can
simply take this mutex again later for our purposes.
Fixes: ab6f468c10 ("KVM: arm/arm64: Register iodevs when setting redist base and creating VCPUs")
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Tested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
The API setkey checks for key sizes and alignment went AWOL during the
skcipher conversion. This patch restores them.
Cc: <stable@vger.kernel.org>
Fixes: 4e6c3df4d7 ("crypto: skcipher - Add low-level skcipher...")
Reported-by: Baozeng <sploving1@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Current limits with regards to processing program paths do not
really reflect today's needs anymore due to programs becoming
more complex and verifier smarter, keeping track of more data
such as const ALU operations, alignment tracking, spilling of
PTR_TO_MAP_VALUE_ADJ registers, and other features allowing for
smarter matching of what LLVM generates.
This also comes with the side-effect that we result in fewer
opportunities to prune search states and thus often need to do
more work to prove safety than in the past due to different
register states and stack layout where we mismatch. Generally,
it's quite hard to determine what caused a sudden increase in
complexity, it could be caused by something as trivial as a
single branch somewhere at the beginning of the program where
LLVM assigned a stack slot that is marked differently throughout
other branches and thus causing a mismatch, where verifier
then needs to prove safety for the whole rest of the program.
Subsequently, programs with even less than half the insn size
limit can get rejected. We noticed that while some programs
load fine under pre 4.11, they get rejected due to hitting
limits on more recent kernels. We saw that in the vast majority
of cases (90+%) pruning failed due to register mismatches. In
case of stack mismatches, majority of cases failed due to
different stack slot types (invalid, spill, misc) rather than
differences in spilled registers.
This patch makes pruning more aggressive by also adding markers
that sit at conditional jumps as well. Currently, we only mark
jump targets for pruning. For example in direct packet access,
these are usually error paths where we bail out. We found that
adding these markers, it can reduce number of processed insns
by up to 30%. Another option is to ignore reg->id in probing
PTR_TO_MAP_VALUE_OR_NULL registers, which can help pruning
slightly as well by up to 7% observed complexity reduction as
stand-alone. Meaning, if a previous path with register type
PTR_TO_MAP_VALUE_OR_NULL for map X was found to be safe, then
in the current state a PTR_TO_MAP_VALUE_OR_NULL register for
the same map X must be safe as well. Last but not least the
patch also adds a scheduling point and bumps the current limit
for instructions to be processed to a more adequate value.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not use unsigned variables to see if it returns a negative
error or not.
Fixes: 2423496af3 ("ipv6: Prevent overrun when parsing v6 header options")
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas discovered a bug where the kprobe trace tests had a race
condition where the kprobe_optimizer called from a delayed work queue
that does the optimizing and "unoptimizing" of a kprobe, can try to
modify the text after it has been freed by the init code.
The kprobe trace selftest is a special case, and Thomas and myself
investigated to see if there's a chance that this could also be a bug
with module unloading, as the code is not obvious to how it handles
this. After adding lots of printks, I figured it out. Thomas suggested
that this should be commented so that others will not have to go
through this exercise again.
Link: http://lkml.kernel.org/r/20170516145835.3827d3aa@gandalf.local.home
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Enabling the tracer selftest triggers occasionally the warning in
text_poke(), which warns when the to be modified page is not marked
reserved.
The reason is that the tracer selftest installs kprobes on functions marked
__init for testing. These probes are removed after the tests, but that
removal schedules the delayed kprobes_optimizer work, which will do the
actual text poke. If the work is executed after the init text is freed,
then the warning triggers. The bug can be reproduced reliably when the work
delay is increased.
Flush the optimizer work and wait for the optimizing/unoptimizing lists to
become empty before returning from the kprobes tracer selftest. That
ensures that all operations which were queued due to the probes removal
have completed.
Link: http://lkml.kernel.org/r/20170516094802.76a468bb@gandalf.local.home
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 6274de498 ("kprobes: Support delayed unoptimizing")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Recent commit on patchset "lpfc updates for 11.2.0.14" fixed an issue
about dereferencing a NULL pointer on port reset. The specific commit,
named "lpfc: Fix system crash when port is reset.", is missing a check
against NULL pointer on lpfc_els_flush_cmd() though.
Since we destroy the queues on adapter resets, like in PCI error
recovery path, we need the validation present on this patch in order to
avoid a NULL pointer dereference when trying to flush commands of ELS
wq, after it has been destroyed (which would lead to a kernel oops).
Tested-by: Raphael Silva <raphasil@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 0a5539f661 ("bpf: Provide a linux/types.h override
for bpf selftests.") caused a build failure for tools/testing/selftest/bpf
because of some missing types:
$ make -C tools/testing/selftests/bpf/
...
In file included from /home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
../../../include/uapi/linux/bpf.h:170:3: error: unknown type name '__aligned_u64'
__aligned_u64 key;
...
/usr/include/linux/swab.h:160:8: error: unknown type name '__always_inline'
static __always_inline __u16 __swab16p(const __u16 *p)
...
The type __aligned_u64 is defined in linux:include/uapi/linux/types.h.
The fix is to copy missing type definition into
tools/testing/selftests/bpf/include/uapi/linux/types.h.
Adding additional include "string.h" resolves __always_inline issue.
Fixes: 0a5539f661 ("bpf: Provide a linux/types.h override for bpf selftests.")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull device mapper fixes from Mike Snitzer:
- a couple DM thin provisioning fixes
- a few request-based DM and DM multipath fixes for issues that were
made when merging Christoph's changes with Bart's changes for 4.12
- a DM bufio unsigned overflow fix
- a couple pure fixes for the DM cache target.
- various very small tweaks to the DM cache target that enable
considerable speed improvements in the face of continuous IO. Given
that the cache target was significantly reworked for 4.12 I see no
reason to sit on these advances until 4.13 considering the favorable
results associated with such minimalist tweaks.
* tag 'for-4.12/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache: handle kmalloc failure allocating background_tracker struct
dm bufio: make the parameter "retain_bytes" unsigned long
dm mpath: multipath_clone_and_map must not return -EIO
dm mpath: don't return -EIO from dm_report_EIO
dm rq: add a missing break to map_request
dm space map disk: fix some book keeping in the disk space map
dm thin metadata: call precommit before saving the roots
dm cache policy smq: don't do any writebacks unless IDLE
dm cache: simplify the IDLE vs BUSY state calculation
dm cache: track all IO to the cache rather than just the origin device's IO
dm cache policy smq: stop preemptively demoting blocks
dm cache policy smq: put newly promoted entries at the top of the multiqueue
dm cache policy smq: be more aggressive about triggering a writeback
dm cache policy smq: only demote entries in bottom half of the clean multiqueue
dm cache: fix incorrect 'idle_time' reset in IO tracker
Pull i2c fixes from Wolfram Sang:
"Here are some bugfixes from I2C, especially removing a wrongly
displayed error message for all i2c muxes"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: xgene: Set ACPI_COMPANION_I2C
i2c: mv64xxx: don't override deferred probing when getting irq
i2c: mux: only print failure message on error
i2c: mux: reg: rename label to indicate what it does
i2c: mux: reg: put away the parent i2c adapter on probe failure
The kill_css() function may be called more than once under the condition
that the css was killed but not physically removed yet followed by the
removal of the cgroup that is hosting the css. This patch prevents any
harmm from being done when that happens.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v4.5+
Michael Chan says:
====================
bnxt_en: DCBX fixes.
2 bug fixes for the case where the NIC's firmware DCBX agent is enabled.
With these fixes, we will return the proper information to lldpad.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise, all the host based DCBX settings from lldpad will fail if the
firmware DCBX agent is running.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current code, bnxt_dcb_init() is called too early before we
determine if the firmware DCBX agent is running or not. As a result,
we are not setting the DCB_CAP_DCBX_HOST and DCB_CAP_DCBX_LLD_MANAGED
flags properly to report to DCBNL.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_INET is not set, net/core/sock.c can not compile :
net/core/sock.c: In function ‘skb_orphan_partial’:
net/core/sock.c:1810:2: error: implicit declaration of function
‘skb_is_tcp_pure_ack’ [-Werror=implicit-function-declaration]
if (skb_is_tcp_pure_ack(skb))
^
Fix this by always including <net/tcp.h>
Fixes: f6ba8d33cf ("netem: fix skb_orphan_partial()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ftrace function_graph time measurements of a given function is not
accurate according to those recorded by ftrace using the function
filters. This change pulls the x86_64 fix from 'commit 722b3c7469
("ftrace/graph: Trace function entry before updating index")' into the
sparc specific prepare_ftrace_return which stops ftrace from
counting interrupted tasks in the time measurement.
Example measurements for select_task_rq_fair running "hackbench 100
process 1000":
| tracing/trace_stat/function0 | function_graph
Before patch | 2.802 us | 4.255 us
After patch | 2.749 us | 3.094 us
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greetings,
GCC 7 introduced the -Wstringop-overflow flag to detect buffer overflows
in calls to string handling functions [1][2]. Due to the way
``empty_zero_page'' is declared in arch/sparc/include/setup.h, this
causes a warning to trigger at compile time in the function mem_init(),
which is subsequently converted to an error. The ensuing patch fixes
this issue and aligns the declaration of empty_zero_page to that of
other architectures. Thank you.
Cheers,
Orlando.
[1] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02308.html
[2] https://gcc.gnu.org/gcc-7/changes.html
Signed-off-by: Orlando Arias <oarias@knights.ucf.edu>
--------------------------------------------------------------------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
An incorrect huge page alignment check caused
mmap failure for 64K pages when MAP_FIXED is used
with address not aligned to HPAGE_SIZE.
Orabug: 25885991
Fixes: dcd1912d21 ("sparc64: Add 64K page size support")
Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 61562f981e ("uapi: export all arch specifics
directories"), "make INSTALL_HDR_PATH=$root/usr headers_install"
deletes standard glibc headers and others in $(root)/usr/include.
The cause of the issue is that headers_install now starts descending
from arch/$(hdr-arch)/include/uapi with $(root)/usr/include for its
destination when installing asm headers. So, headers already there
are assumed to be unwanted.
When headers_install starts descending from include/uapi with
$(root)/usr/include for its destination, it works around the problem
by creating an dummy destination $(root)/usr/include/uapi, but this
is tricky.
To fix the problem in a clean way is to skip headers install/check
in include/uapi and arch/$(hdr-arch)/include/uapi because we know
there are only sub-directories in uapi directories. A good side
effect is the empty destination $(root)/usr/include/uapi will go
away.
I am also removing the trailing slash in the headers_check target to
skip checking in arch/$(hdr-arch)/include/uapi.
Fixes: 61562f981e ("uapi: export all arch specifics directories")
Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
The memory allocator passed to __unflatten_device_tree() (e.g. a wrapped
kzalloc) can fail so add the missing sanity check to avoid dereferencing
a NULL pointer.
Fixes: fe14042358 ("of/flattree: Refactor unflatten_device_tree and add fdt_unflatten_tree")
Cc: stable <stable@vger.kernel.org> # 2.6.38
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Fix the following compile error found on odroid-xu4:
checks.c: In function ‘check_simple_bus_reg’:
checks.c:876:41: error: format ‘%lx’ expects argument of type
‘long unsigned int’, but argument 4 has type
‘uint64_t{aka long long unsigned int}’ [-Werror=format=]
snprintf(unit_addr, sizeof(unit_addr), "%lx", reg);
^
checks.c:876:41: error: format ‘%lx’ expects argument of type
‘long unsigned int’, but argument 4 has type
‘uint64_t {aka long long unsigned int}’ [-Werror=format=]
cc1: all warnings being treated as errors
Makefile:304: recipe for target 'checks.o' failed
make: *** [checks.o] Error 1
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
[dwg: Correct new format to be correct in general]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[robh: cherry-picked from upstream dtc commit 2a42b14d0d03]
Signed-off-by: Rob Herring <robh@kernel.org>
Currently, cpus_set_cap() calls static_branch_enable_cpuslocked(), which
must take the jump_label mutex.
We call cpus_set_cap() in the secondary bringup path, from the idle
thread where interrupts are disabled. Taking a mutex in this path "is a
NONO" regardless of whether it's contended, and something we must avoid.
We didn't spot this until recently, as ___might_sleep() won't warn for
this case until all CPUs have been brought up.
This patch avoids taking the mutex in the secondary bringup path. The
poking of static keys is deferred until enable_cpu_capabilities(), which
runs in a suitable context on the boot CPU. To account for the static
keys being set later, cpus_have_const_cap() is updated to use another
static key to check whether the const cap keys have been initialised,
falling back to the caps bitmap until this is the case.
This means that users of cpus_have_const_cap() gain should only gain a
single additional NOP in the fast path once the const caps are
initialised, but should always see the current cap value.
The hyp code should never dereference the caps array, since the caps are
initialized before we run the module initcall to initialise hyp. A check
is added to the hyp init code to document this requirement.
This change will sidestep a number of issues when the upcoming hotplug
locking rework is merged.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyniger <marc.zyngier@arm.com>
Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
It's a common practice to send gratuitous ARPs after moving an
IP address to another device to speed up healing of a service. To
fulfill service availability constraints, the timing of network peers
updating their caches to point to a new location of an IP address can be
particularly important.
Sometimes neigh_update calls won't touch neither lladdr nor state, for
example if an update arrives in locktime interval. The neigh->updated
value is tested by the protocol specific neigh code, which in turn
will influence whether NEIGH_UPDATE_F_OVERRIDE gets set in the
call to neigh_update() or not. As a result, we may effectively ignore
the update request, bailing out of touching the neigh entry, except that
we still bump its timestamps inside neigh_update.
This may be a problem for updates arriving in quick succession. For
example, consider the following scenario:
A service is moved to another device with its IP address. The new device
sends three gratuitous ARP requests into the network with ~1 seconds
interval between them. Just before the first request arrives to one of
network peer nodes, its neigh entry for the IP address transitions from
STALE to DELAY. This transition, among other things, updates
neigh->updated. Once the kernel receives the first gratuitous ARP, it
ignores it because its arrival time is inside the locktime interval. The
kernel still bumps neigh->updated. Then the second gratuitous ARP
request arrives, and it's also ignored because it's still in the (new)
locktime interval. Same happens for the third request. The node
eventually heals itself (after delay_first_probe_time seconds since the
initial transition to DELAY state), but it just wasted some time and
require a new ARP request/reply round trip. This unfortunate behaviour
both puts more load on the network, as well as reduces service
availability.
This patch changes neigh_update so that it bumps neigh->updated (as well
as neigh->confirmed) only once we are sure that either lladdr or entry
state will change). In the scenario described above, it means that the
second gratuitous ARP request will actually update the entry lladdr.
Ideally, we would update the neigh entry on the very first gratuitous
ARP request. The locktime mechanism is designed to ignore ARP updates in
a short timeframe after a previous ARP update was honoured by the kernel
layer. This would require tracking timestamps for state transitions
separately from timestamps when actual updates are received. This would
probably involve changes in neighbour struct. Therefore, the patch
doesn't tackle the issue of the first gratuitous APR ignored, leaving
it for a follow-up.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When arp_accept is 1, gratuitous ARPs are supposed to override matching
entries irrespective of whether they arrive during locktime. This was
implemented in commit 56022a8fdd ("ipv4: arp: update neighbour address
when a gratuitous arp is received and arp_accept is set")
There is a glitch in the patch though. RFC 2002, section 4.6, "ARP,
Proxy ARP, and Gratuitous ARP", defines gratuitous ARPs so that they can
be either of Request or Reply type. Those Reply gratuitous ARPs can be
triggered with standard tooling, for example, arping -A option does just
that.
This patch fixes the glitch, making both Request and Reply flavours of
gratuitous ARPs to behave identically.
As per RFC, if gratuitous ARPs are of Reply type, their Target Hardware
Address field should also be set to the link-layer address to which this
cache entry should be updated. The field is present in ARP over Ethernet
but not in IEEE 1394. In this patch, I don't consider any broadcasted
ARP replies as gratuitous if the field is not present, to conform the
standard. It's not clear whether there is such a thing for IEEE 1394 as
a gratuitous ARP reply; until it's cleared up, we will ignore such
broadcasts. Note that they will still update existing ARP cache entries,
assuming they arrive out of locktime time interval.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In their infinite wisdom, and never ending quest for end user frustration,
Lenovo has decided to use new USB device IDs for the wwan modules in
their 2017 laptops. The actual hardware is still the Sierra Wireless
EM7455 or EM7430, depending on region.
Cc: <stable@vger.kernel.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Johan Hovold <johan@kernel.org>
Commit 5f7f7543f5 "fuse: Convert to separately allocated bdi" didn't
properly handle fuseblk filesystem. When fuse_bdi_init() is called for
that filesystem type, sb->s_bdi is already initialized (by
set_bdev_super()) to point to block device's bdi and consequently
super_setup_bdi_name() complains about this fact when reseting bdi to
the private one.
Fix the problem by properly dropping bdi reference in fuse_bdi_init()
before creating a private bdi in super_setup_bdi_name().
Fixes: 5f7f7543f5 ("fuse: Convert to separately allocated bdi")
Reported-by: Rakesh Pandit <rakesh@tuxera.com>
Tested-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
On dra7, as per TRM, the HW shutdown (TSHUT) temperature is hardcoded
to 123C and cannot be modified by SW. This means when the temperature
reaches 123C HW asserts TSHUT output which signals a warm reset.
This reset is held until the temperature goes below the TSHUT low (105C).
While in SW, the thermal driver continuously monitors current temperature
and takes decisions based on whether it reached an alert or a critical point.
The intention of setting a SW critical point is to prevent force reset by HW
and instead do an orderly_poweroff(). But if the SW critical temperature is
greater than or equal to that of HW then it defeats the purpose. To address
this and let SW take action before HW does keep the SW critical temperature
less than HW TSHUT value.
The value for SW critical temperature was chosen as 120C just to ensure
we give SW sometime before HW catches up.
Document reference
SPRUI30C – DRA75x, DRA74x Technical Reference Manual - November 2016
SPRUHZ6H - AM572x Technical Reference Manual - November 2016
Tested on:
DRA75x PG 2.0 Rev H EVM
Signed-off-by: Ravikumar Kattekola <rk@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Currently there is no kmalloc failure check on the allocation of
the background_tracker struct in btracker_create(), and so a NULL return
will lead to a NULL pointer dereference. Add a NULL check.
Detected by CoverityScan, CID#1416587 ("Dereference null return value")
Fixes: b29d4986d ("dm cache: significant rework to leverage dm-bio-prison-v2")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The mediatek iommu driver relied on an implicit include of dma-mapping.h,
but for some reason that is no longer there in 4.12-rc1:
drivers/iommu/mtk_iommu_v1.c: In function 'mtk_iommu_domain_finalise':
drivers/iommu/mtk_iommu_v1.c:233:16: error: implicit declaration of function 'dma_zalloc_coherent'; did you mean 'debug_dma_alloc_coherent'? [-Werror=implicit-function-declaration]
drivers/iommu/mtk_iommu_v1.c: In function 'mtk_iommu_domain_free':
drivers/iommu/mtk_iommu_v1.c:265:2: error: implicit declaration of function 'dma_free_coherent'; did you mean 'debug_dma_free_coherent'? [-Werror=implicit-function-declaration]
This adds an explicit #include to make it build again.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 208480bb27 ('iommu: Remove trace-events include from iommu.h')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Ever since commit 091d42e43d ("iommu/vt-d: Copy translation tables from
old kernel") the kdump kernel copies the IOMMU context tables from the
previous kernel. Each device mappings will be destroyed once the driver
for the respective device takes over.
This unfortunately breaks the workflow of mapping and unmapping a new
context to the IOMMU. The mapping function assumes that either:
1) Unmapping did the proper IOMMU flushing and it only ever flush if the
IOMMU unit supports caching invalid entries.
2) The system just booted and the initialization code took care of
flushing all IOMMU caches.
This assumption is not true for the kdump kernel since the context
tables have been copied from the previous kernel and translations could
have been cached ever since. So make sure to flush the IOTLB as well
when we destroy these old copied mappings.
Cc: Joerg Roedel <joro@8bytes.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: stable@vger.kernel.org v4.2+
Fixes: 091d42e43d ("iommu/vt-d: Copy translation tables from old kernel")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When __iommu_dma_map() and iommu_dma_free_iova() are called from
iommu_dma_get_msi_page(), various iova_*() helpers are still invoked in
the process, whcih is unwise since they access a different member of the
union (the iova_domain) from that which was last written, and there's no
guarantee that sensible values will result anyway.
CLean up the code paths that are valid for an MSI cookie to ensure we
only do iova_domain-specific things when we're actually dealing with one.
Fixes: a44e665758 ("iommu/dma: Clean up MSI IOVA allocation")
Reported-by: Nate Watterson <nwatters@codeaurora.org>
Tested-by: Shanker Donthineni <shankerd@codeaurora.org>
Tested-by: Bharat Bhushan <bharat.bhushan@nxp.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Mesh forwarding path checks for address extension mode to fetch
appropriate proxied address and MPP address. Existing condition
that looks for 6 address format is not strict enough so that
frames with improper values are processed and invalid entries
are added into MPP table. Fix that by adding a stricter check before
processing the packet.
Per IEEE Std 802.11s-2011 spec. Table 7-6g1 lists address extension
mode 0x3 as reserved one. And also Table Table 9-13 does not specify
0x3 as valid address field.
Fixes: 9b395bc3be ("mac80211: verify that skb data is present")
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
For most cases a protection exception in the host (e.g. copy
on write or dirty tracking) on the sie instruction will indicate
an instruction length of 4. Turns out that there are some corner
cases (e.g. runtime instrumentation) where this is not necessarily
true and the ILC is unpredictable.
Let's replace our 4 byte rewind_pad with 3 byte nops to prepare for
all possible ILCs.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
It is wrong to iounmap resources in the normal path of davinci_pm_init()
The 3 ioremap'ed fields of 'pm_config' can be accessed later on in other
functions, so we should return 'success' instead of unrolling everything.
Fixes: aa9aa1ec2d ("ARM: davinci: PM: rework init, remove platform device")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[nsekhar@ti.com: commit message and minor style fixes]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Use the new define for the maximum number of SuperSpeed ports instead of
a constant when allocating xHCI root hubs.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add define for the maximum number of ports on a SuperSpeed hub as per
USB 3.1 spec Table 10-5, and use it when verifying the retrieved hub
descriptor.
This specifically avoids benign attempts to update the DeviceRemovable
mask for non-existing ports (should we get that far).
Fixes: dbe79bbe9d ("USB 3.0 Hub Changes")
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add missing sanity check on the non-SuperSpeed hub-descriptor length in
order to avoid parsing and leaking two bytes of uninitialised slab data
through sysfs removable-attributes (or a compound-device debug
statement).
Note that we only make sure that the DeviceRemovable field is always
present (and specifically ignore the unused PortPwrCtrlMask field) in
order to continue support any hubs with non-compliant descriptors. As a
further safeguard, the descriptor buffer is also cleared.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable <stable@vger.kernel.org> # 2.6.12
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A SuperSpeed hub descriptor does not have any variable-length fields so
bail out when reading a short descriptor.
This avoids parsing and leaking two bytes of uninitialised slab data
through sysfs removable-attributes.
Fixes: dbe79bbe9d ("USB 3.0 Hub Changes")
Cc: stable <stable@vger.kernel.org> # 2.6.39
Cc: John Youn <John.Youn@synopsys.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix up the root-hub descriptor to accommodate the variable-length
DeviceRemovable and PortPwrCtrlMask fields, while marking all ports as
removable (and leaving the reserved bit zero unset).
Also add a build-time constraint on VHCI_HC_PORTS which must never be
greater than USB_MAXCHILDREN (but this was only enforced through a
KConfig constant).
This specifically fixes the descriptor layout whenever VHCI_HC_PORTS is
greater than seven (default is 8).
Fixes: 04679b3489 ("Staging: USB/IP: add client driver")
Cc: Takahiro Hirofuchi <hirofuchi@users.sourceforge.net>
Cc: Valentina Manea <valentina.manea.m@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Flag the first and only port as removable while also leaving the
remaining bits (including the reserved bit zero) unset in accordance
with the specifications:
"Within a byte, if no port exists for a given location, the bit
field representing the port characteristics shall be 0."
Also add a comment marking the legacy PortPwrCtrlMask field.
Fixes: 1cd8fd2887 ("usb: gadget: dummy_hcd: add SuperSpeed support")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: Tatyana Brokhman <tlinder@codeaurora.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Even if this file is not yet included in any toctree, it is parsed by
Sphinx since it is named '.rst'. This patch fixes the following two
ERRORs from Sphinx build:
Documentation/usb/typec.rst:116: ERROR: Error in "kernel-doc" directive:
invalid option block.
.. kernel-doc:: drivers/usb/typec/typec.c
:functions: typec_register_cable typec_unregister_cable typec_register_plug
typec_unregister_plug
Documentation/usb/typec.rst:139: ERROR: Error in "kernel-doc" directive:
invalid option block.
.. kernel-doc:: drivers/usb/typec/typec.c
:functions: typec_set_data_role typec_set_pwr_role typec_set_vconn_role
typec_set_pwr_opmode
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Document that the new companion-device lookup helper takes a reference
to the companion device which needs to be dropped after use.
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If multiple endpoints on a single device have pending IN URBs and one
endpoint times out due to NAKs (perfectly legal), select a different
endpoint URB to try.
The existing code only checked to see another device address has pending
URBs and ignores other IN endpoints on the current device address. This
leads to endpoints never getting serviced if one endpoint is using NAK as
a flow control method.
Fixes: 5d3043586d ("usb: r8a66597-hcd: host controller driver for R8A6659")
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The timeout for BULK packets was 300ms which is a long time if other
endpoints or devices are waiting for their turn. Changing it to 50ms
greatly increased the overall performance for multi-endpoint devices.
Fixes: 5d3043586d ("usb: r8a66597-hcd: host controller driver for R8A6659")
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The PM functions used in this driver are the ones defined in
sounc/soc/soc-core.c.
When suspending (using snd_soc_suspend), the regcache is marked dirty
but is never synced on resume.
Sync regcache on resume of Atmel ClassD device.
Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Current SSI uses PDTA bit which indicates data that Input/Output
data are Right-Aligned. But, 24bit sound should be Left-Aligned
in this HW. Because Linux is using Right-Aligned data, and HW uses
Left-Aligned data, current 24bit data is missing lower 8bit.
To fix this issue, this patch removes PDTA bit, and shift 8bit
in necessary module
Reported-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Drop erroneous le16_to_cpu when returning the USB device speed which is
already in host byte order.
Found using sparse:
warning: cast to restricted __le16
Fixes: 946b960d13 ("USB: add driver for iowarrior devices.")
Cc: stable <stable@vger.kernel.org> # 2.6.21
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit d705ff3818 (tty: vt, cleanup and document con_scroll), in
the coccinelle output, we can see:
drivers/usb/misc/sisusbvga/sisusb_con.c:852:8-9: WARNING: return of 0/1 in function 'sisusbcon_scroll_area' with return type bool
Return true instead of 1 in the function returning bool which was
intended to do in d705ff3818 but omitted.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Fixes: d705ff3818 (tty: vt, cleanup and document con_scroll)
Cc: Thomas Winischhofer <thomas@winischhofer.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-usb@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add missing endianness conversion when using the USB device-descriptor
idProduct field to apply a hardware quirk.
Fixes: 1ba47da527 ("uwb: add the i1480 DFU driver")
Cc: stable <stable@vger.kernel.org> # 2.6.28
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Format specifier %p can leak kernel addresses while not valuing the
kptr_restrict system settings. When kptr_restrict is set to (1), kernel
pointers printed using the %pK format specifier will be replaced with
Zeros. Debugging Note : &pK prints only Zeros as address. If you need
actual address information, write 0 to kptr_restrict.
echo 0 > /proc/sys/kernel/kptr_restrict
[Found by poking around in a random vendor kernel tree, it would be nice
if someone would actually send these types of patches upstream - gkh]
Signed-off-by: Vamsi Krishna Samavedam <vskrishn@codeaurora.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ene_usb6250 sub-driver in usb-storage does USB I/O to buffers on
the stack, which doesn't work with vmapped stacks. This patch fixes
the problem by allocating a separate 512-byte buffer at probe time and
using it for all of the offending I/O operations.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: Andreas Hartmann <andihartmann@01019freenet.de>
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Felipe writes:
usb: fixes for v4.12-rc2
- New device ID for Intel Canonlake CPUs
- fix for Isochronous performance regression on dwc3
- fix for out-of-bounds access on comp_desc on f_fs
- fix for lost events on dwc3 in case of spurious interrupts
This patch adds support for recognition of ARM-USB-TINY(H) devices which
are almost identical to ARM-USB-OCD(H) but lacking separate barrel jack
and serial console.
By suggestion from Johan Hovold it is possible to replace
ftdi_jtag_quirk with a bit more generic construction. Since all
Olimex-ARM debuggers has exactly two ports, we could safely always use
only second port within the debugger family.
Signed-off-by: Andrey Korolyov <andrey@xdel.ru>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
With ACPI, i2c-core requires ACPI companion to be set in order for it
to create slave device.
This patch sets the ACPI companion accordingly.
Signed-off-by: Tin Huynh <tnhuynh@apm.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The page table dump code doesn't know about huge pages, so currently
it crashes (or walks random memory, usually leading to a crash), if it
finds a huge page. On Book3S we only see huge pages in the Linux page
tables when we're using the P9 Radix MMU.
Teaching the code to properly handle huge pages is a bit more involved,
so for now just prevent the crash.
Cc: stable@vger.kernel.org # v4.10+
Fixes: 8eb07b1870 ("powerpc/mm: Dump linux pagetables")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Added code to support Cisco MDS loopback diagnostic. The diagnostics run
various loopbacks including one which loops-back frame through the
driver.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Code review of NVMEI's FC_PORT_ROLE_NVME_DISCOVERY looked wrong.
Discussions with storage architecture team clarified NVMEI's audit of
the PRLI response port roles. Following up discussion with code review
showed a few minor corrections were required - especially in
anticipation of NVME auto discovery.
During PRLI, NVMEI should sent prli_init - which it it does. NVMET
should send prli_tgt and prli_disc - which it does. When NVMEI receives
a PRLI Response now, it audits the incoming target bits and stores the
attributes in the corresponding NDLP. Later, when NVMEI registers the
NVME rport, it uses the stored ndlp attributes to set the rport
port_roles correctly.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Too many work items being processed in IRQ context take a lot of CPU
time and cause problems.
With a recent change, we get out of the ISR after hitting entry_repost
work items on a queue. However, the actual values for entry repost are
still high. EQ is 128 and CQ is 128, this could translate into
processing 128 * 128 (16384) work items under IRQ context.
Set entry_repost in the actual queue creation routine now. Limit EQ
repost to 8 and CQ repost to 64 to further limit the amount of time
spent in the IRQ.
Fix fof IRQ routines as well.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When unloading and reloading the driver, the driver fails to recreate
the lpfc root inode in the debugfs tree.
The driver is incorrectly removing the lpfc root inode in
lpfc_debugfs_terminate in the first driver instance that unloads and
then sets the lpfc_debugfs_root global parameter to NULL. When the
final driver instance unloads, the debugfs calls quietly ignore the
remove on a NULL pointer. The bug is that the debugfs_remove call
returns void so the driver doesn't know to correctly set the global
parameter to NULL.
Base the debugfs_remove of the lpfc_debugfs_root parameter on
lpfc_debugfs_hba_count because this parameter tracks the fnX instance
tracked per driver instance.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the driver send the RPA command, it does not send supported FC4
Type NVME to the management server.
Encode NVME (type x28) in the AttribEntry in the RPA command.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Previous logic would just drop the IO.
Added logic to queue the IO to wait for an IO context resource from an
IO thats already in progress.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently IO resources are mapped 1 to 1 with RQ buffers posted
Added logic to separate RQE buffers from IO op resources
(sgl/iocbq/context). During initialization, the driver will determine
how many SGLs it will allocate for NVMET (based on what the firmware
reports) and associate a NVMET IOCBq and NVMET context structure with
each one.
Now that hdr/data buffers are immediately reposted back to the RQ, 512
RQEs for each MRQ is sufficient. Also, since NVMET data buffers are now
128 bytes, lpfc_nvmet_mrq_post is not necessary anymore as we will
always post the max (512) buffers per NVMET MRQ.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
After running IOPS test for 30 second we get kernel:NMI watchdog:
Watchdog detected hard LOCKUP on cpu 0
The driver is speend too much time in its ISR.
In ISR EQ and CQ processing routines, if we hit the entry_repost numbers
of EQE/CQEs just break out of the routine as opposed to hitting the
doorbell with NOARM and continue processing.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During driver boot, a latency in the NVMET driver side causes the
incoming NVMEI PRLI to get rejected by the NVMET driver. When this
happens, the NVMEI driver runs out of PRLI retries. Bouncing the link
does not fix the situation.
If the NVMEI driver decides, on PRLI completion failures, to retry the
PRLI, always decrement the fc4_prli_sent counter. This allows the PRLI
completion to resolve to UNMAPPED when NVMET rejects the PRLI.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Large block writes to the nvme target were failing because the default
number of RQs posted was insufficient.
Expand the NVMET RQs to 2048 RQEs and ensure a minimum of 512 RQEs are
posted, no matter how many MRQs are configured.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
With 255 vports created a link trasition can casue a crash.
When going through discovery after a link bounce the driver is using
rpis before the cmd FCOE_POST_HDR_TEMPLATES completes. By doing that the
next rpi bumps the rpi range out of the boundary.
The fix it to increment the next_rpi only when the
FCOE_POST_HDR_TEMPLATE succeeds.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Previous assignment was causing the use of the uninitialized variable
_explan_ inside fc_seq_ls_rjt() function, which in this particular case
is being called by fc_seq_els_rsp_send().
[mkp: fixed typo]
Addresses-Coverity-ID: 1398125
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some external hard drives don't support the sync command even though the
hard drive has write cache enabled. In this case, upon suspend request,
sync cache failures are ignored if the error code in the sense header is
ILLEGAL_REQUEST. There's not much we can do for these drives, so we
shouldn't fail to suspend for this error case. The drive may stay
powered if that's the setup for the port it's plugged into.
Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Presumably we can never actually hit this return, but static checkers
complain that we should unlock before we return.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The last goto looks spurious because it releases less resources than the
previous one.
Also free 'img->sig' if 'ls_ucode_img_build()' fails.
Fixes: 9d896f3e41 ("drm/nouveau/secboot: abstract LS firmware loading functions")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Commit cae9ff036e effectively disabled the drm poll_helper by checking
the wrong flag to see if the driver should enable the poll or not:
mode_config.poll_enabled is only set to true by poll_init and it is not
indicating if the poll is enabled or not.
nouveau_display_create() will initialize the poll and going to disable it
right away. After poll_init() the mode_config.poll_enabled will be true,
but the poll itself is disabled.
To avoid the race caused by calling the poll_enable() from different paths,
this patch will enable the poll from one place, in the
nouveau_display_hpd_work().
In case the pm_runtime is disabled we will enable the poll in
nouveau_drm_load() once.
Fixes: cae9ff036e ("drm/nouveau: Don't enabling polling twice on runtime resume")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Lyude <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There is no reason to use platform_get_irq() for non-DT probing and
irq_of_parse_and_map() for DT probing. Indeed, platform_get_irq()
works fine for both.
In addition, using platform_get_irq() properly returns -EPROBE_DEFER
when the interrupt controller is not yet available, so instead of
inventing our own error code (-ENXIO), return the one provided by
platform_get_irq().
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Commit 75f0aef622 ("uio: fix memory leak") has fixed up some
memory leaks during the failure paths of the addition of uio
attributes, but still is not correct entirely. A kobject_uevent()
failure still needs a kobject_put() and the kobject container
structure allocation failure before the kobject_init() doesn't
need a kobject_put(). Fix this properly.
Fixes: 75f0aef622 ("uio: fix memory leak")
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is the following link error with CONFIG_PCI_ENDPOINT_TEST=y and
CONFIG_CRC32=m:
drivers/built-in.o: In function 'pci_endpoint_test_ioctl':
pci_endpoint_test.c:(.text+0xf1251): undefined reference to 'crc32_le'
pci_endpoint_test.c:(.text+0xf1322): undefined reference to 'crc32_le'
pci_endpoint_test.c:(.text+0xf13b2): undefined reference to 'crc32_le'
pci_endpoint_test.c:(.text+0xf141e): undefined reference to 'crc32_le'
Fix this by selecting CRC32 in the PCI_ENDPOINT_TEST kconfig entry.
Fixes: 2c156ac71c ("misc: Add host side PCI driver for PCI test function device")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The lp_setup() code doesn't apply any bounds checking when passing
"lp=none", and only in this case, resulting in an overflow of the
parport_nr[] array. All versions in Git history are affected.
Reported-By: Roee Hay <roee.hay@hcl.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull pstore fix from Kees Cook:
"Fix bad EFI vars iterator usage"
* tag 'pstore-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
efi-pstore: Fix read iter after pstore API refactor
This reverts commit 51f5677777 "nfsd: check for oversized NFSv2/v3
arguments", which breaks support for NFSv3 ACLs.
That patch was actually an earlier draft of a fix for the problem that
was eventually fixed by e6838a29ec "nfsd: check for oversized NFSv2/v3
arguments". But somehow I accidentally left this earlier draft in the
branch that was part of my 2.12 pull request.
Reported-by: Eryu Guan <eguan@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We now reference the arp_tbl, which requires IPv4 support to be
enabled in the kernel, otherwise we get a link error:
drivers/net/built-in.o: In function `mlx5e_tc_update_neigh_used_value':
(.text+0x16afec): undefined reference to `arp_tbl'
drivers/net/built-in.o: In function `mlx5e_rep_neigh_init':
en_rep.c:(.text+0x16c16d): undefined reference to `arp_tbl'
drivers/net/built-in.o: In function `mlx5e_rep_netevent_event':
en_rep.c:(.text+0x16cbb5): undefined reference to `arp_tbl'
This adds a Kconfig dependency for it.
Fixes: 232c001398 ("net/mlx5e: Add support to neighbour update flow")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
There were a number of handwaving complaints that one could "possibly"
use inode numbers and extent maps to fingerprint a filesystem hosting
multiple containers and somehow use the information to guess at the
contents of other containers and attack them. Despite the total lack of
any demonstration that this is actually possible, it's easier to
restrict access now and broaden it later, so use the rmapbt fsmap
backends only if the caller has CAP_SYS_ADMIN. Unprivileged users will
just have to make do with only getting the free space and static
metadata placement information.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
In some fio benchmarks, halt_poll_ns=400000 caused CPU utilization to
increase heavily even in cases where the performance improvement was
small. In particular, bandwidth divided by CPU usage was as much as
60% lower.
To some extent this is the expected effect of the patch, and the
additional CPU utilization is only visible when running the
benchmarks. However, halving the threshold also halves the extra
CPU utilization (from +30-130% to +20-70%) and has no negative
effect on performance.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Change the type of the parameter "retain_bytes" from unsigned to
unsigned long, so that on 64-bit machines the user can set more than
4GiB of data to be retained.
Also, change the type of the variable "count" in the function
"__evict_old_buffers" to unsigned long. The assignment
"count = c->n_buffers[LIST_CLEAN] + c->n_buffers[LIST_DIRTY];"
could result in unsigned long to unsigned overflow and that could result
in buffers not being freed when they should.
While at it, avoid division in get_retain_buffers(). Division is slow,
we can change it to shift because we have precalculated the log2 of
block size.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
In general, rtnetlink dumps do not anticipate failure to dump a single
object (e.g., link or route) on a single pass. As both route and link
objects have grown via more attributes, that is no longer a given.
netlink dumps can handle a failure if the dump function returns an
error; specifically, netlink_dump adds the return code to the response
if it is <= 0 so userspace is notified of the failure. The missing
piece is the rtnetlink dump functions returning the error.
Fix route and link dump functions to return the errors if no object is
added to an skb (detected by skb->len != 0). IPv6 route dumps
(rt6_dump_route) already return the error; this patch updates IPv4 and
link dumps. Other dump functions may need to be ajusted as well.
Reported-by: Jan Moskyto Matejka <mq@ucw.cz>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver explicitly bypasses APIs to register all memory once a
connection is made, and thus allows remote access to memory.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, SMC enables remote access to physical memory when a user
has successfully configured and established an SMC-connection until ten
minutes after the last SMC connection is closed. Because this is considered
a security risk, drivers are supposed to use IB_PD_UNSAFE_GLOBAL_RKEY in
such a case.
This patch changes the current SMC code to use IB_PD_UNSAFE_GLOBAL_RKEY.
This improves user awareness, but does not remove the security risk itself.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During the internal pstore API refactoring, the EFI vars read entry was
accidentally made to update a stack variable instead of the pstore
private data pointer. This corrects the problem (and removes the now
needless argument).
Fixes: 125cc42baf ("pstore: Replace arguments for read() API")
Signed-off-by: Kees Cook <keescook@chromium.org>
Pull bugfixes from the i2c mux subsubsystem:
This fixes an old bug in resource cleanup on failure in i2c-mux-reg and
a new log spamming bug from this merge window in the i2c-mux core.
tcp_ack() can call tcp_fragment() which may dededuct the
value tp->fackets_out when MSS changes. When prior_fackets
is larger than tp->fackets_out, tcp_clean_rtx_queue() can
invoke tcp_update_reordering() with negative values. This
results in absurd tp->reodering values higher than
sysctl_tcp_max_reordering.
Note that tcp_update_reordering indeeds sets tp->reordering
to min(sysctl_tcp_max_reordering, metric), but because
the comparison is signed, a negative metric always wins.
Fixes: c7caf8d3ed ("[TCP]: Fix reord detection due to snd_una covered holes")
Reported-by: Rebecca Isaacs <risaacs@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull s390 fixes from Martin Schwidefsky:
- convert the debug feature to refcount_t
- reduce the copy size for strncpy_from_user
- 8 bug fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/virtio: change virtio_feature_desc:features type to __le32
s390: convert debug_info.ref_count from atomic_t to refcount_t
s390: move _text symbol to address higher than zero
s390/qdio: increase string buffer size
s390/ccwgroup: increase string buffer size
s390/topology: let topology_mnest_limit() return unsigned char
s390/uaccess: use sane length for __strncpy_from_user()
s390/uprobes: fix compile for !KPROBES
s390/ftrace: fix compile for !MODULES
s390/cputime: fix incorrect system time
By run fsstress long enough time enough in RHEL-7, I find an
assertion failure (harder to reproduce on linux-4.11, but problem
is still there):
XFS: Assertion failed: (iflags & BMV_IF_DELALLOC) != 0, file: fs/xfs/xfs_bmap_util.c
The assertion is in xfs_getbmap() funciton:
if (map[i].br_startblock == DELAYSTARTBLOCK &&
--> map[i].br_startoff <= XFS_B_TO_FSB(mp, XFS_ISIZE(ip)))
ASSERT((iflags & BMV_IF_DELALLOC) != 0);
When map[i].br_startoff == XFS_B_TO_FSB(mp, XFS_ISIZE(ip)), the
startoff is just at EOF. But we only need to make sure delalloc
extents that are within EOF, not include EOF.
Signed-off-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reduce stack usage and get rid of compiler warnings by eliminating
unused variables.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
When we're fulfilling a BMAPX request, jump out early if the data fork
is in local format. This prevents us from hitting a debugging check in
bmapi_read and barfing errors back to userspace. The on-disk extent
count check later isn't sufficient for IF_DELALLOC mode because da
extents are in memory and not on disk.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The delalloc -> real block conversion path uses an incorrect
calculation in the case where the middle part of a delalloc extent
is being converted. This is documented as a rare situation because
XFS generally attempts to maximize contiguity by converting as much
of a delalloc extent as possible.
If this situation does occur, the indlen reservation for the two new
delalloc extents left behind by the conversion of the middle range
is calculated and compared with the original reservation. If more
blocks are required, the delta is allocated from the global block
pool. This delta value can be characterized as the difference
between the new total requirement (temp + temp2) and the currently
available reservation minus those blocks that have already been
allocated (startblockval(PREV.br_startblock) - allocated).
The problem is that the current code does not account for previously
allocated blocks correctly. It subtracts the current allocation
count from the (new - old) delta rather than the old indlen
reservation. This means that more indlen blocks than have been
allocated end up stashed in the remaining extents and free space
accounting is broken as a result.
Fix up the calculation to subtract the allocated block count from
the original extent indlen and thus correctly allocate the
reservation delta based on the difference between the new total
requirement and the unused blocks from the original reservation.
Also remove a bogus assert that contradicts the fact that the new
indlen reservation can be larger than the original indlen
reservation.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Pull EDAC fix from Borislav Petkov:
"A single amd64_edac fix correcting chip select sizes reporting on
F17h"
* tag 'edac_fix_for_4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, amd64: Fix reporting of Chip Select sizes on Fam17h
When platform_get_irq() fails, it returns an error code, which
libahci_platform and replaces it by -EINVAL. This commit fixes that by
propagating the error code. It fixes the situation where
platform_get_irq() returns -EPROBE_DEFER because the interrupt
controller is not available yet, and generally looks like the right
thing to do.
We pay attention to not show the "no irq" message when we are in an
EPROBE_DEFER situation, because the driver probing will be retried
later on, once the interrupt controller becomes available to provide
the interrupt.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Here, Clock enable can failed. So adding an error check for
clk_prepare_enable.
tj: minor style updates
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
(Correction in this resend: fixed function name acer_sa5_271_workaround; fixed
the always-true condition in the function; fixed description.)
On the Acer Switch Alpha 12 (model number: SA5-271), the internal SSD may not
get detected because the port_map and CAP.nr_ports combination causes the driver
to skip the port that is actually connected to the SSD. More specifically,
either all SATA ports are identified as DUMMY, or all ports get ``link down''
and never get up again.
This problem occurs occasionally. When this problem occurs, CAP may hold a
value of 0xC734FF00 or 0xC734FF01 and port_map may hold a value of 0x00 or 0x01.
When this problem does not occur, CAP holds a value of 0xC734FF02 and port_map
may hold a value of 0x07. Overriding the CAP value to 0xC734FF02 and port_map to
0x7 significantly reduces the occurrence of this problem.
Link: https://bugzilla.kernel.org/attachment.cgi?id=253091
Signed-off-by: Sui Chen <suichen6@gmail.com>
Tested-by: Damian Ivanov <damianatorrpm@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
The width needs to be configured in bytes with 1 meaning 8-bit
access and 2 meaning 16-bit access.
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Fix commit 05c4ffc3a2 ("ARM: dts: LogicPD Torpedo: Add MT9P031 Support")
In the previous commit, I indicated that the only testing was done by
showing the camera showed up when probing. This patch fixes an incorrect
pin muxing on cam_d0, cam_d1 and cam_d2.
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The CEC pin was always pulled up, making it impossible to use it.
Change to PIN_INPUT so it can be used by the new CEC support.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
The clock polarity setting of the mcbsp connected to
the modem was wrong so almost only noise
was received.
With this patch it is also the same as it was on
earlier non-dt kernels where it was working properly
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Add power hold and power controller properties to palmas node.
This is needed to shutdown pmic correctly on boards with
powerhold set.
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The setting of return code ret should be based on the error code
passed into function end_extent_writepage and not on ret. Thanks
to Liu Bo for spotting this mistake in the original fix I submitted.
Detected by CoverityScan, CID#1414312 ("Logically dead code")
Fixes: 5dca6eea91 ("Btrfs: mark mapping with error flag to report errors to userspace")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
CC: David Sterba <dsterba@suse.com>
CC: linux-btrfs@vger.kernel.org
Fixes: b685d3d65a
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Cycle mount btrfs can cause fiemap to return different result.
Like:
# mount /dev/vdb5 /mnt/btrfs
# dd if=/dev/zero bs=16K count=4 oflag=dsync of=/mnt/btrfs/file
# xfs_io -c "fiemap -v" /mnt/btrfs/file
/mnt/test/file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 25088..25215 128 0x1
# umount /mnt/btrfs
# mount /dev/vdb5 /mnt/btrfs
# xfs_io -c "fiemap -v" /mnt/btrfs/file
/mnt/test/file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..31]: 25088..25119 32 0x0
1: [32..63]: 25120..25151 32 0x0
2: [64..95]: 25152..25183 32 0x0
3: [96..127]: 25184..25215 32 0x1
But after above fiemap, we get correct merged result if we call fiemap
again.
# xfs_io -c "fiemap -v" /mnt/btrfs/file
/mnt/test/file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 25088..25215 128 0x1
[REASON]
Btrfs will try to merge extent map when inserting new extent map.
btrfs_fiemap(start=0 len=(u64)-1)
|- extent_fiemap(start=0 len=(u64)-1)
|- get_extent_skip_holes(start=0 len=64k)
| |- btrfs_get_extent_fiemap(start=0 len=64k)
| |- btrfs_get_extent(start=0 len=64k)
| | Found on-disk (ino, EXTENT_DATA, 0)
| |- add_extent_mapping()
| |- Return (em->start=0, len=16k)
|
|- fiemap_fill_next_extent(logic=0 phys=X len=16k)
|
|- get_extent_skip_holes(start=0 len=64k)
| |- btrfs_get_extent_fiemap(start=0 len=64k)
| |- btrfs_get_extent(start=16k len=48k)
| | Found on-disk (ino, EXTENT_DATA, 16k)
| |- add_extent_mapping()
| | |- try_merge_map()
| | Merge with previous em start=0 len=16k
| | resulting em start=0 len=32k
| |- Return (em->start=0, len=32K) << Merged result
|- Stripe off the unrelated range (0~16K) of return em
|- fiemap_fill_next_extent(logic=16K phys=X+16K len=16K)
^^^ Causing split fiemap extent.
And since in add_extent_mapping(), em is already merged, in next
fiemap() call, we will get merged result.
[FIX]
Here we introduce a new structure, fiemap_cache, which records previous
fiemap extent.
And will always try to merge current fiemap_cache result before calling
fiemap_fill_next_extent().
Only when we failed to merge current fiemap extent with cached one, we
will call fiemap_fill_next_extent() to submit cached one.
So by this method, we can merge all fiemap extents.
It can also be done in fs/ioctl.c, however the problem is if
fieinfo->fi_extents_max == 0, we have no space to cache previous fiemap
extent.
So I choose to merge it in btrfs.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
irq_set_chained_handler_and_data() sets up the chained interrupt and then
stores the handler data.
That's racy against an immediate interrupt which gets handled before the
store of the handler data happened. The handler will dereference a NULL
pointer and crash.
Cure it by storing handler data before installing the chained handler.
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
The i2c functions need to test the pm_suspend state and do, if needed, some
retry before i2c operations. This code was repeated 4x.
To isolate this, create a new function to check suspend state and call it in
every need place.
As at it, move the error message from pr_err to dev_err.
Signed-off-by: Rui Miguel Silva <rmfrfs@gmail.com>
Acked-by: Yueyao Zhu <yueyao.zhu@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Function devm_clk_get() returns an ERR_PTR when it fails. However, in
function kdwc3_probe(), its return value is not checked, which may
result in a bad memory access bug. This patch fixes the bug.
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Companion descriptor is only used for SuperSpeed endpoints,
if the endpoints are HighSpeed or FullSpeed, the Companion
descriptor will not allocated, so we can only access it if
gadget is SuperSpeed.
I can reproduce this issue on Rockchip platform rk3368 SoC
which supports USB 2.0, and use functionfs for ADB. Kernel
build with CONFIG_KASAN=y and CONFIG_SLUB_DEBUG=y report
the following BUG:
==================================================================
BUG: KASAN: slab-out-of-bounds in ffs_func_set_alt+0x224/0x3a0 at addr ffffffc0601f6509
Read of size 1 by task swapper/0/0
============================================================================
BUG kmalloc-256 (Not tainted): kasan: bad access detected
----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Allocated in ffs_func_bind+0x52c/0x99c age=1275 cpu=0 pid=1
alloc_debug_processing+0x128/0x17c
___slab_alloc.constprop.58+0x50c/0x610
__slab_alloc.isra.55.constprop.57+0x24/0x34
__kmalloc+0xe0/0x250
ffs_func_bind+0x52c/0x99c
usb_add_function+0xd8/0x1d4
configfs_composite_bind+0x48c/0x570
udc_bind_to_driver+0x6c/0x170
usb_udc_attach_driver+0xa4/0xd0
gadget_dev_desc_UDC_store+0xcc/0x118
configfs_write_file+0x1a0/0x1f8
__vfs_write+0x64/0x174
vfs_write+0xe4/0x200
SyS_write+0x68/0xc8
el0_svc_naked+0x24/0x28
INFO: Freed in inode_doinit_with_dentry+0x3f0/0x7c4 age=1275 cpu=7 pid=247
...
Call trace:
[<ffffff900808aab4>] dump_backtrace+0x0/0x230
[<ffffff900808acf8>] show_stack+0x14/0x1c
[<ffffff90084ad420>] dump_stack+0xa0/0xc8
[<ffffff90082157cc>] print_trailer+0x188/0x198
[<ffffff9008215948>] object_err+0x3c/0x4c
[<ffffff900821b5ac>] kasan_report+0x324/0x4dc
[<ffffff900821aa38>] __asan_load1+0x24/0x50
[<ffffff90089eb750>] ffs_func_set_alt+0x224/0x3a0
[<ffffff90089d3760>] composite_setup+0xdcc/0x1ac8
[<ffffff90089d7394>] android_setup+0x124/0x1a0
[<ffffff90089acd18>] _setup+0x54/0x74
[<ffffff90089b6b98>] handle_ep0+0x3288/0x4390
[<ffffff90089b9b44>] dwc_otg_pcd_handle_out_ep_intr+0x14dc/0x2ae4
[<ffffff90089be85c>] dwc_otg_pcd_handle_intr+0x1ec/0x298
[<ffffff90089ad680>] dwc_otg_pcd_irq+0x10/0x20
[<ffffff9008116328>] handle_irq_event_percpu+0x124/0x3ac
[<ffffff9008116610>] handle_irq_event+0x60/0xa0
[<ffffff900811af30>] handle_fasteoi_irq+0x10c/0x1d4
[<ffffff9008115568>] generic_handle_irq+0x30/0x40
[<ffffff90081159b4>] __handle_domain_irq+0xac/0xdc
[<ffffff9008080e9c>] gic_handle_irq+0x64/0xa4
...
Memory state around the buggy address:
ffffffc0601f6400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffffffc0601f6480: 00 00 00 00 00 00 00 00 00 00 06 fc fc fc fc fc
>ffffffc0601f6500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffffffc0601f6580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffffffc0601f6600: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
==================================================================
Signed-off-by: William Wu <william.wu@rock-chips.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Check for bad pointer that may result because of kthread_create failure.
This check is needed since the gserial setup callback function
(gs_console_setup()) is only freeing the info->con_buf in case of
kthread_create failure which will result into bad info->console_thread
pointer.
Without checking info->console_thread pointer validity in the
gserial_console_exit() function, before calling kthread_stop(), the
rmmod will generate Kernel Oops.
Signed-off-by: Bogdan Mirea <Bogdan-Stefan_mirea@mentor.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The dwc3 driver can overwite its previous events if its top-half IRQ
handler (TH) gets invoked again before processing the events in the
cache. We see this as a hang in the file transfer and the host will
attempt to reset the device. TH gets the event count and deasserts the
interrupt line by writing DWC3_GEVNTSIZ_INTMASK to DWC3_GEVNTSIZ. If
there's a new event coming between reading the event count and interrupt
deassertion, dwc3 will lose previous pending events. More generally, we
will see 0 event count, which should not affect anything.
This shouldn't be possible in the current dwc3 implementation. However,
through testing and reading the PCIe trace, the TH occasionally still
gets invoked one more time after HW interrupt deassertion. (With PCIe
legacy interrupts, TH is called repeatedly as long as the interrupt line
is asserted). We suspect that there is a small detection delay in the
SW.
To avoid this issue, Check DWC3_EVENT_PENDING flag to determine if the
events are processed in the bottom-half IRQ handler. If not, return
IRQ_HANDLED and don't process new event.
Cc: stable@vger.kernel.org
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Commit 08a36b5438 ("usb: dwc3: gadget: simplify __dwc3_gadget_ep_queue()")
caused a small change in the way ISO transfer is handled in the case
when XferInProgress event happens on Isoc EP with an active transfer.
This caused a performance degradation of 50%. e.g. using g_webcam on DUT
and luvcview on host the video frame rate dropped from 16fps to 8fps
@high-speed.
Make the ISO transfer handling equivalent to that prior to that commit
to get back the original ISO performance numbers.
Fixes: 08a36b5438 ("usb: dwc3: gadget: simplify __dwc3_gadget_ep_queue()")
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
We yield the kvm->mmu_lock occassionaly while performing an operation
(e.g, unmap or permission changes) on a large area of stage2 mappings.
However this could possibly cause another thread to clear and free up
the stage2 page tables while we were waiting for regaining the lock and
thus the original thread could end up in accessing memory that was
freed. This patch fixes the problem by making sure that the stage2
pagetable is still valid after we regain the lock. The fact that
mmu_notifer->release() could be called twice (via __mmu_notifier_release
and mmu_notifier_unregsister) enhances the possibility of hitting
this race where there are two threads trying to unmap the entire guest
shadow pages.
While at it, cleanup the redudant checks around cond_resched_lock in
stage2_wp_range(), as cond_resched_lock already does the same checks.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: andreyknvl@google.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
The info->target comes from userspace and it would be used directly.
So we need to add the sanity check to make sure it is a valid standard
target, although the ebtables tool has already checked it. Kernel needs
to validate anything coming from userspace.
If the target is set as an evil value, it would break the ebtables
and cause a panic. Because the non-standard target is treated as one
offset.
Now add one helper function ebt_invalid_target, and we would replace
the macro INVALID_TARGET later.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ALC299 has no loopback mixer, but the driver still tries to add a beep
control over the mixer NID which leads to the error at accessing it.
This patch fixes it by properly declaring mixer_nid=0 for this codec.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195775
Fixes: 28f1f9b26c ("ALSA: hda/realtek - Add new codec ID ALC299")
Cc: stable@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
During v4.3 when the overflow/underflow check was relaxed by
commit c72c525022:
commit c72c525022
Author: Roland Dreier <roland@purestorage.com>
Date: Wed Jul 22 15:08:18 2015 -0700
target: allow underflow/overflow for PR OUT etc. commands
to allow underflow/overflow for Windows compliance + FCP, a
consequence was to allow control CDBs to process overflow
data for iscsi-target with immediate data as well.
As per Roland's original change, continue to allow underflow
cases for control CDBs to make Windows compliance + FCP happy,
but until overflow for control CDBs is supported tree-wide,
explicitly reject all control WRITEs with overflow following
pre v4.3.y logic.
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The current code is not correctly calculating the req_lim_delta.
We want to make sure vscsi->credit is always incremented when
we do not send a response for the scsi op. Thus for the case where
there is a successfully aborted task we need to make sure the
vscsi->credit is incremented.
v2 - Moves the original location of the vscsi->credit increment
to a better spot. Since if we increment credit, the next command
we send back will have increased req_lim_delta. But we probably
shouldn't be doing that until the aborted cmd is actually released.
Otherwise the client will think that it can send a new command, and
we could find ourselves short of command elements. Not likely, but could
happen.
This patch depends on both:
commit 25e7853126 ("ibmvscsis: Do not send aborted task response")
commit 98883f1b54 ("ibmvscsis: Clear left-over abort_cmd pointers")
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-by: Michael Cyr <mikecyr@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
With the addition of ibmvscsis->abort_cmd pointer within
commit 25e7853126 ("ibmvscsis: Do not send aborted task response"),
make sure to explicitly NULL these pointers when clearing
DELAY_SEND flag.
Do this for two cases, when getting the new new ibmvscsis
descriptor in ibmvscsis_get_free_cmd() and before posting
the response completion in ibmvscsis_send_messages().
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-by: Michael Cyr <mikecyr@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Commit 22d8b3dec2 ("powerpc/kprobes: Emulate instructions on kprobe
handler re-entry") enabled emulating instructions on kprobe re-entry,
rather than single-stepping always. However, we didn't update the single
stepping code to only be run if the emulation fails. Also, we missed
re-enabling preemption if the instruction emulation was successful. Fix
those issues.
Fixes: 22d8b3dec2 ("powerpc/kprobes: Emulate instructions on kprobe handler re-entry")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit 17ed4c8f81 ("powerpc/powernv: Recover correct PACA on wakeup
from a stop on P9 DD1") promises to set the NAPSTATELOST bit in paca
after recovering the correct paca for the thread waking up from stop1
on DD1, so that the GPRs can be correctly restored on the stop exit
path. However, it loads the value 1 into r3, but stores the value in
r0 into NAPSTATELOST(r13).
Fix this by correctly set the NAPSTATELOST bit in paca after
recovering the paca on POWER9 DD1.
Fixes: 17ed4c8f81 ("powerpc/powernv: Recover correct PACA on wakeup from a stop on P9 DD1")
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Test that the VMX checkpointed register state is maintained when a VMX
unavailable exception is taken during a transaction.
Thanks to Breno Leitao <brenohl@br.ibm.com> and
Gustavo Bueno Romero <gromero@br.ibm.com> for the original test this
is based heavily on.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Add to .gitignore, always build it 64-bit to fix build errors]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull networking fixes from David Miller:
1) Track alignment in BPF verifier so that legitimate programs won't be
rejected on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS architectures.
2) Make tail calls work properly in arm64 BPF JIT, from Deniel
Borkmann.
3) Make the configuration and semantics Generic XDP make more sense and
don't allow both generic XDP and a driver specific instance to be
active at the same time. Also from Daniel.
4) Don't crash on resume in xen-netfront, from Vitaly Kuznetsov.
5) Fix use-after-free in VRF driver, from Gao Feng.
6) Use netdev_alloc_skb_ip_align() to avoid unaligned IP headers in
qca_spi driver, from Stefan Wahren.
7) Always run cleanup routines in BPF samples when we get SIGTERM, from
Andy Gospodarek.
8) The mdio phy code should bring PHYs out of reset using the shared
GPIO lines before invoking bus->reset(). From Florian Fainelli.
9) Some USB descriptor access endian fixes in various drivers from
Johan Hovold.
10) Handle PAUSE advertisements properly in mlx5 driver, from Gal
Pressman.
11) Fix reversed test in mlx5e_setup_tc(), from Saeed Mahameed.
12) Cure netdev leak in AF_PACKET when using timestamping via control
messages. From Douglas Caetano dos Santos.
13) netcp doesn't support HWTSTAMP_FILTER_ALl, reject it. From Miroslav
Lichvar.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
ldmvsw: stop the clean timer at beginning of remove
ldmvsw: unregistering netdev before disable hardware
net: netcp: fix check of requested timestamping filter
ipv6: avoid dad-failures for addresses with NODAD
qed: Fix uninitialized data in aRFS infrastructure
mdio: mux: fix device_node_continue.cocci warnings
net/packet: fix missing net_device reference release
net/mlx4_core: Use min3 to select number of MSI-X vectors
macvlan: Fix performance issues with vlan tagged packets
net: stmmac: use correct pointer when printing normal descriptor ring
net/mlx5: Use underlay QPN from the root name space
net/mlx5e: IPoIB, Only support regular RQ for now
net/mlx5e: Fix setup TC ndo
net/mlx5e: Fix ethtool pause support and advertise reporting
net/mlx5e: Use the correct pause values for ethtool advertising
vmxnet3: ensure that adapter is in proper state during force_close
sfc: revert changes to NIC revision numbers
net: ch9200: add missing USB-descriptor endianness conversions
net: irda: irda-usb: fix firmware name on big-endian hosts
net: dsa: mv88e6xxx: add default case to switch
...
Pull cifs fixes from Steve French:
"A set of minor cifs fixes"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
[CIFS] Minor cleanup of xattr query function
fs: cifs: transport: Use time_after for time comparison
SMB2: Fix share type handling
cifs: cifsacl: Use a temporary ops variable to reduce code length
Don't delay freeing mids when blocked on slow socket write of request
CIFS: silence lockdep splat in cifs_relock_file()
The Raspberry Pi startup stub files for multi-core BCM283X processors
make the secondary CPUs spin until the corresponding mailbox is
written. These stubs are loaded at physical address 0x00000xxx (as seen
by the ARMs), but this page will be reused by the kernel unless it is
explicitly reserved, causing the waiting cores to execute random code.
Use the /memreserve/ Device Tree directive to mark the first page as
off-limits to the kernel.
See: https://github.com/raspberrypi/linux/issues/1989
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Shannon Nelson says:
====================
ldmvsw: port removal stability
Under heavy reboot stress testing we found a couple of timing issues
when removing the device that could cause the kernel great heartburn,
addressed by these two patches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stop the clean timer earlier to be sure there's no asynchronous
interference while stopping the port.
Orabug: 25748241
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When running LDom binding/unbinding test, kernel may panic
in ldmvsw_open(). It is more likely that because we're removing
the ldc connection before unregistering the netdev in vsw_port_remove(),
we set up a window of time where one process could be removing the
device while another trying to UP the device. This also sometimes causes
vio handshake error due to opening a device without closing it completely.
We should unregister the netdev before we disable the "hardware".
Orabug: 25980913, 25925306
Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since 412445ac ("dm: introduce a new DM_MAPIO_KILL return value"), the
clone_and_map_rq methods must not return errno values, so fix it up
to properly return DM_MAPIO_KILL, instead of the -EIO value that snuck
in due to a conflict between two patches.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Instead just turn the macro into a helper for the warning message.
This removes an unnecessary assignment and will allow the next commit to
fix a place where -EIO is the wrong return value.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
We don't want to bug when receiving a DM_MAPIO_KILL value..
Fixes: 412445ac ("dm: introduce a new DM_MAPIO_KILL return value")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
When decrementing the reference count for a block, the free count wasn't
being updated if the reference count went to zero.
Cc: stable@vger.kernel.org
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2017-05-12
This series contains some mlx5 fixes for net.
Please pull and let me know if there's any problem.
For -stable:
("net/mlx5e: Fix ethtool pause support and advertise reporting") kernels >= 4.8
("net/mlx5e: Use the correct pause values for ethtool advertising") kernels >= 4.8
v1->v2:
Dropped statistics spinlock patch, it needs some extra work.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Every address gets added with TENTATIVE flag even for the addresses with
IFA_F_NODAD flag and dad-work is scheduled for them. During this DAD process
we realize it's an address with NODAD and complete the process without
sending any probe. However the TENTATIVE flags stays on the
address for sometime enough to cause misinterpretation when we receive a NS.
While processing NS, if the address has TENTATIVE flag, we mark it DADFAILED
and endup with an address that was originally configured as NODAD with
DADFAILED.
We can't avoid scheduling dad_work for addresses with NODAD but we can
avoid adding TENTATIVE flag to avoid this racy situation.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current memset is using incorrect type of variable, causing the
upper-half of the strucutre to be left uninitialized and causing:
ethernet/qlogic/qed/qed_init_fw_funcs.c: In function 'qed_set_rfs_mode_disable':
ethernet/qlogic/qed/qed_init_fw_funcs.c:993:3: error: '*((void *)&ramline+4)' is used uninitialized in this function [-Werror=uninitialized]
Fixes: d51e4af5c2 ("qed: aRFS infrastructure support")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device node iterators put the previous value of the index variable, so an
explicit put causes a double put.
In particular, of_mdiobus_register can fail before doing anything
interesting, so one could view it as a no-op from the reference count
point of view.
Generated by: scripts/coccinelle/iterators/device_node_continue.cocci
CC: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using a TX ring buffer, if an error occurs processing a control
message (e.g. invalid message), the net_device reference is not
released.
Fixes c14ac9451c ("sock: enable timestamping using control messages")
Signed-off-by: Douglas Caetano dos Santos <douglascs@taghos.com.br>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Macvlan always turns on offload features that have sofware
fallback (NETIF_GSO_SOFTWARE). This allows much higher guest-guest
communications over macvtap.
However, macvtap does not turn on these features for vlan tagged traffic.
As a result, depending on the HW that mactap is configured on, the
performance of guest-guest communication over a vlan is very
inconsistent. If the HW supports TSO/UFO over vlans, then the
performance will be fine. If not, the the performance will suffer
greatly since the VM may continue using TSO/UFO, and will force the host
segment the traffic and possibly overlow the macvtap queue.
This patch adds the always on offloads to vlan_features. This
makes sure that any vlan tagged traffic between 2 guest will not
be segmented needlessly.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit d98ecdaca2 ("arm64: perf: Count EL2 events if the kernel is
running in HYP") returns -EINVAL when perf system call perf_event_open is
called with exclude_hv != exclude_kernel. This change breaks applications
on VHE enabled ARMv8.1 platforms. The issue was observed with HHVM
application, which calls perf_event_open with exclude_hv = 1 and
exclude_kernel = 0.
There is no separate hypervisor privilege level when VHE is enabled, the
host kernel runs at EL2. So when VHE is enabled, we should ignore
exclude_hv from the application. This behaviour is consistent with PowerPC
where the exclude_hv is ignored when the hypervisor is not present and with
x86 where this flag is ignored.
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
[will: added comment to justify the behaviour of exclude_hv]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The cmpxchg implementation introduced by commit c342f78217 ("arm64:
cmpxchg: patch in lse instructions when supported by the CPU") performs
an apparently redundant register move of [old] to [oldval] in the
success case - it always uses the same register width as [oldval] was
originally loaded with, and is only executed when [old] and [oldval] are
known to be equal anyway.
The only effect it seemingly does have is to take up a surprising amount
of space in the kernel text, as removing it reveals:
text data bss dec hex filename
12426658 1348614 4499749 18275021 116dacd vmlinux.o.new
12429238 1348614 4499749 18277601 116e4e1 vmlinux.o.old
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As is, a failure message is printed unconditionally, which is confusing.
And noisy.
Fixes: 8d4d159f25 ("i2c: mux: provide more info on failure in i2c_mux_add_adapter")
Signed-off-by: Peter Rosin <peda@axentia.se>
It is only prudent to let go of resources that are not used.
Fixes: b3fdd32799 ("i2c: mux: Add register-based mux i2c-mux-reg")
Signed-off-by: Peter Rosin <peda@axentia.se>
This fixes the new ept_access_test_read_only and ept_access_test_read_write
testcases from vmx.flat.
The problem is that gpte_access moves bits around to switch from EPT
bit order (XWR) to ACC_*_MASK bit order (RWX). This results in an
incorrect exit qualification. To fix this, make pt_access and
pte_access operate on raw PTE values (only with NX flipped to mean
"can execute") and call gpte_access at the end of the walk. This
lets us use pte_access to compute the exit qualification with XWR
bit order.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
We can observe eptad kvm_intel module parameter is still Y
even if ept is disabled which is weird. This patch will
not enable EPT A/D feature if EPT feature is disabled.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reported by syzkaller:
BUG: unable to handle kernel paging request at ffffffffc07f6a2e
IP: report_bug+0x94/0x120
PGD 348e12067
P4D 348e12067
PUD 348e14067
PMD 3cbd84067
PTE 80000003f7e87161
Oops: 0003 [#1] SMP
CPU: 2 PID: 7091 Comm: kvm_load_guest_ Tainted: G OE 4.11.0+ #8
task: ffff92fdfb525400 task.stack: ffffbda6c3d04000
RIP: 0010:report_bug+0x94/0x120
RSP: 0018:ffffbda6c3d07b20 EFLAGS: 00010202
do_trap+0x156/0x170
do_error_trap+0xa3/0x170
? kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm]
? mark_held_locks+0x79/0xa0
? retint_kernel+0x10/0x10
? trace_hardirqs_off_thunk+0x1a/0x1c
do_invalid_op+0x20/0x30
invalid_op+0x1e/0x30
RIP: 0010:kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm]
? kvm_load_guest_fpu.part.175+0x1c/0x170 [kvm]
kvm_arch_vcpu_ioctl_run+0xed6/0x1b70 [kvm]
kvm_vcpu_ioctl+0x384/0x780 [kvm]
? kvm_vcpu_ioctl+0x384/0x780 [kvm]
? sched_clock+0x13/0x20
? __do_page_fault+0x2a0/0x550
do_vfs_ioctl+0xa4/0x700
? up_read+0x1f/0x40
? __do_page_fault+0x2a0/0x550
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x23/0xc2
SDM mentioned that "The MXCSR has several reserved bits, and attempting to write
a 1 to any of these bits will cause a general-protection exception(#GP) to be
generated". The syzkaller forks' testcase overrides xsave area w/ random values
and steps on the reserved bits of MXCSR register. The damaged MXCSR register
values of guest will be restored to SSEx MXCSR register before vmentry. This
patch fixes it by catching userspace override MXCSR register reserved bits w/
random values and bails out immediately.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
There are PML_ENTITY_NUM elements in the pml_address[] array so the >
should be >= or we write beyond the end of the array when we do:
pml_address[vmcs12->guest_pml_index--] = gpa;
Fixes: c5f983f6e8 ("nVMX: Implement emulated Page Modification Logging")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
There are two pointers in sysfs_display_ring,
one that increments if using normal dma descriptors,
another if using extended dma descriptors.
When printing the normal dma descriptors, the wrong pointer is used,
thus the printed descriptor addresses are incorrect.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pm_fake doesn't quite describe what the handler does (ignoring writes
and returning 0 for reads).
As we're about to use it (a lot) in a different context, rename it
with a (admitedly cryptic) name that make sense for all users.
Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
Reviewed-by: Alex Bennee <alex.bennee@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Hardware debugging in guests is not intercepted currently, it means
that a malicious guest can bring down the entire machine by writing
to the debug registers.
This patch enable trapping of all debug registers, preventing the
guests to access the debug registers. This includes access to the
debug mode(DBGDSCR) in the guest world all the time which could
otherwise mess with the host state. Reads return 0 and writes are
ignored (RAZ_WI).
The result is the guest cannot detect any working hardware based debug
support. As debug exceptions are still routed to the guest normal
debug using software based breakpoints still works.
To support debugging using hardware registers we need to implement a
debug register aware world switch as well as special trapping for
registers that may affect the host state.
Cc: stable@vger.kernel.org
Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
If a vma is already bound to a ppgtt, we incorrectly call
allocate_va_range again when doing a PIN_UPDATE, which will result in
over accounting within our paging structures, such that when we do
unbind something we don't actually destroy the structures and end up
inadvertently recycling them. In reality this probably isn't too bad,
but once we start touching PDEs and PDPEs for 64K/2M/1G pages this
apparent recycling will manifest into lots of really, really subtle
bugs.
v2: Fix the testing of vma->flags for aliasing_ppgtt_bind_vma
Fixes: ff685975d9 ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170512091423.26085-1-chris@chris-wilson.co.uk
(cherry picked from commit 1f23475c89)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Turns out our skills in decoding the CLKCFG register weren't good
enough. On this particular elk the answer we got was 400 MHz when
in reality the clock was running at 266 MHz, which then caused us
to program a bogus AUX clock divider that caused all AUX communication
to fail.
Sadly the docs are now in bit heaven, so the fix will have to be based
on empirical evidence. Using another elk machine I was able to frob
the FSB frequency from the BIOS and see how it affects the CLKCFG
register. The machine seesm to use a frequency of 266 MHz by default,
and fortunately it still boot even with the 50% CPU overclock that
we get when we bump the FSB up to 400 MHz.
It turns out the actual FSB frequency and the register have no real
link whatsoever. The register value is based on some straps or something,
but fortunately those too can be configured from the BIOS on this board,
although it doesn't seem to respect the settings 100%. In the end I was
able to derive the following relationship:
BIOS FSB / strap | CLKCFG
-------------------------
200 | 0x2
266 | 0x0
333 | 0x4
400 | 0x4
So only the 200 and 400 MHz cases actually match how we're currently
decoding that register. But as the comment next to some of the defines
says, we have been just guessing anyway.
So let's fix things up so that at least the 266 MHz case will work
correctly as that is actually the setting used by both the buggy
machine and my test machine.
The fact that 333 and 400 MHz BIOS settings result in the same register
value is a little disappointing, as that means we can't tell them apart.
However, according to the gmch datasheet for both elk and ctg 400 Mhz is
not even a supported FSB frequency, so I'm going to make the assumption
that we should decode it as 333 MHz instead.
Cc: stable@vger.kernel.org
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100926
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170504181530.6908-1-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
(cherry picked from commit 6f38123eca)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Not calling pm_runtime_enable() means that runtime PM can't be
enabled at all via sysfs. So we definitely need to call it
from somewhere.
Calling it from the driver seems like a bad idea because it
would have to be paired with a pm_runtime_disable() at driver
unload time, otherwise the core gets upset. Also if there's
no LPE audio driver loaded then we couldn't runtime suspend
i915 either.
So it looks like a better plan is to call it from i915 when
we register the platform device. That seems to match how
pci generally does things. I cargo culted the
pm_runtime_forbid() and pm_runtime_set_active() calls from
pci as well.
The exposed runtime PM API is massive an thorougly misleading, so
I don't actually know if this is how you're supposed to use the API
or not. But it seems to work. I can now runtime suspend i915 again
with or without the LPE audio driver loaded, and reloading the
LPE audio driver also seems to work.
Note that powertop won't auto-tune runtime PM for platform devices,
which is a little annoying. So I'm not sure that leaving runtime
PM in "on" mode by default is the best choice here. But I've left
it like that for now at least.
Also remove the comment about there not being much benefit from
LPE audio runtime PM. Not allowing runtime PM blocks i915 runtime
PM, which will also block s0ix, and that could have a measurable
impact on power consumption.
Cc: stable@vger.kernel.org
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Fixes: 0b6b524f39 ("ALSA: x86: Don't enable runtime PM as default")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170427160231.13337-2-ville.syrjala@linux.intel.com
Reviewed-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 183c00350c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Andreas reports that the following incremental update using our commit
protocol doesn't work.
# nft -f incremental-update.nft
delete element ip filter client_to_any { 10.180.86.22 : goto CIn_1 }
delete chain ip filter CIn_1
... Error: Could not process rule: Device or resource busy
The existing code is not well-integrated into the commit phase protocol,
since element deletions do not result in refcount decrement from the
preparation phase. This results in bogus EBUSY errors like the one
above.
Two new functions come with this patch:
* nft_set_elem_activate() function is used from the abort path, to
restore the set element refcounting on objects that occurred from
the preparation phase.
* nft_set_elem_deactivate() that is called from nft_del_setelem() to
decrement set element refcounting on objects from the preparation
phase in the commit protocol.
The nft_data_uninit() has been renamed to nft_data_release() since this
function does not uninitialize any data store in the data register,
instead just releases the references to objects. Moreover, a new
function nft_data_hold() has been introduced to be used from
nft_set_elem_activate().
Reported-by: Andreas Schultz <aschultz@tpip.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Do not assume userspace always sends us NFT_DATA_VALUE for bitwise and
cmp expressions. Although NFT_DATA_VERDICT does not make any sense, it
is still possible to handcraft a netlink message using this incorrect
data type.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When dumping the elements related to a specified set, we may invoke the
nf_tables_dump_set with the NFNL_SUBSYS_NFTABLES lock not acquired. So
we should use the proper rcu operation to avoid race condition, just
like other nft dump operations.
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch fixes the creation of connection tracking entry from
netlink when synproxy is used. It was missing the addition of
the synproxy extension.
This was causing kernel crashes when a conntrack entry created by
conntrackd was used after the switch of traffic from active node
to the passive node.
Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When looking up an iptables rule, the iptables binary compares the
aligned match and target data (XT_ALIGN). In some cases this can
exceed the actual data size to include padding bytes.
Before commit f77bc5b23f ("iptables: use match, target and data
copy_to_user helpers") the malloc()ed bytes were overwritten by the
kernel with kzalloced contents, zeroing the padding and making the
comparison succeed. After this patch, the kernel copies and clears
only data, leaving the padding bytes undefined.
Extend the clear operation from data size to aligned data size to
include the padding bytes, if any.
Padding bytes can be observed in both match and target, and the bug
triggered, by issuing a rule with match icmp and target ACCEPT:
iptables -t mangle -A INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT
iptables -t mangle -D INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT
Fixes: f77bc5b23f ("iptables: use match, target and data copy_to_user helpers")
Reported-by: Paul Moore <pmoore@redhat.com>
Reported-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Simon Horman says:
====================
IPVS Fixes for v4.12
please consider this fix to IPVS for v4.12.
* It is a fix from Julian Anastasov to only SNAT SNAT packet replies only for
NATed connections
My understanding is that this fix is appropriate for 4.9.25, 4.10.13, 4.11
as well as the nf tree. Julian has separately posted backports for other
-stable kernels; please see:
* [PATCH 3.2.88,3.4.113 -stable 1/3] ipvs: SNAT packet replies only for
NATed connections
* [PATCH 3.10.105,3.12.73,3.16.43,4.1.39 -stable 2/3] ipvs: SNAT packet
replies only for NATed connections
* [PATCH 4.4.65 -stable 3/3] ipvs: SNAT packet replies only for NATed
connections
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We can still delete the ct helper even if it is in use, this will cause
a use-after-free error. In more detail, I mean:
# nfct helper add ssdp inet udp
# iptables -t raw -A OUTPUT -p udp -j CT --helper ssdp
# nfct helper delete ssdp //--> oops, succeed!
BUG: unable to handle kernel paging request at 000026ca
IP: 0x26ca
[...]
Call Trace:
? ipv4_helper+0x62/0x80 [nf_conntrack_ipv4]
nf_hook_slow+0x21/0xb0
ip_output+0xe9/0x100
? ip_fragment.constprop.54+0xc0/0xc0
ip_local_out+0x33/0x40
ip_send_skb+0x16/0x80
udp_send_skb+0x84/0x240
udp_sendmsg+0x35d/0xa50
So add reference count to fix this issue, if ct helper is used by
others, reject the delete request.
Apply this patch:
# nfct helper delete ssdp
nfct v1.4.3: netlink error: Device or resource busy
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
And convert module_put invocation to nf_conntrack_helper_put, this is
prepared for the followup patch, which will add a refcnt for cthelper,
so we can reject the deleting request when cthelper is in use.
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We cannot setup nat info if the ct has been confirmed already, else,
different cpu may race to handle the same ct. In extreme situation,
we may hit the "BUG_ON(nf_nat_initialized(ct, maniptype))" in the
nf_nat_setup_info.
Also running the following commands will easily hit NF_CT_ASSERT in
nf_conntrack_alter_reply:
# nft flush ruleset
# ping -c 2 -W 1 1.1.1.111 &
# nft add table t
# nft add chain t c {type nat hook postrouting priority 0 \;}
# nft add rule t c snat to 4.5.6.7
WARNING: CPU: 1 PID: 10065 at net/netfilter/nf_conntrack_core.c:1472
nf_conntrack_alter_reply+0x9a/0x1a0 [nf_conntrack]
[...]
Call Trace:
nf_nat_setup_info+0xad/0x840 [nf_nat]
? deactivate_slab+0x65d/0x6c0
nft_nat_eval+0xcd/0x100 [nft_nat]
nft_do_chain+0xff/0x5d0 [nf_tables]
? mark_held_locks+0x6f/0xa0
? __local_bh_enable_ip+0x70/0xa0
? trace_hardirqs_on_caller+0x11f/0x190
? ipt_do_table+0x310/0x610
? trace_hardirqs_on+0xd/0x10
? __local_bh_enable_ip+0x70/0xa0
? ipt_do_table+0x32b/0x610
? __lock_acquire+0x2ac/0x1580
? ipt_do_table+0x32b/0x610
nft_nat_do_chain+0x65/0x80 [nft_chain_nat_ipv4]
nf_nat_ipv4_fn+0x1ae/0x240 [nf_nat_ipv4]
nf_nat_ipv4_out+0x4a/0xf0 [nf_nat_ipv4]
nft_nat_ipv4_out+0x15/0x20 [nft_chain_nat_ipv4]
nf_hook_slow+0x2c/0xf0
ip_output+0x154/0x270
So for the confirmed ct, just ignore it and return NF_ACCEPT.
Fixes: 9a08ecfe74 ("netfilter: don't attach a nat extension by default")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
A re-positioned call to kfree() in
drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c
causes a segmentation error. This patch removed the kfree() call.
Fixes 6557ddfec3 ("staging: rtl8723bs: Fix various errors in os_dep/ioctl_cfg80211.c")
Signed-off-by: Ian W Morrison <ianwmorrison@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The feature member of virtio_feature_desc contains little endian
values, given that it contents will be converted with
le32_to_cpu(). The "wrong" __u32 type leads to the sparse warnings
below.
In order to avoid them, use the correct __le32 type instead.
drivers/s390/virtio/virtio_ccw.c:749:14: warning: cast to restricted __le32
drivers/s390/virtio/virtio_ccw.c:762:28: warning: cast to restricted __le32
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Not all parameters passed to ctnetlink_parse_tuple() and
ctnetlink_exp_dump_tuple() match the enum type in the signatures of these
functions. Since this is intended change the argument type of to be an
unsigned integer value.
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In kvm_free_stage2_pgd() we check the stage2 PGD before holding
the lock and proceed to take the lock if it is valid. And we unmap
the page tables, followed by releasing the lock. We reset the PGD
only after dropping this lock, which could cause a race condition
where another thread waiting on or even holding the lock, could
potentially see that the PGD is still valid and proceed to perform
a stage2 operation and later encounter a NULL PGD.
[223090.242280] Unable to handle kernel NULL pointer dereference at
virtual address 00000040
[223090.262330] PC is at unmap_stage2_range+0x8c/0x428
[223090.262332] LR is at kvm_unmap_hva_handler+0x2c/0x3c
[223090.262531] Call trace:
[223090.262533] [<ffff0000080adb78>] unmap_stage2_range+0x8c/0x428
[223090.262535] [<ffff0000080adf40>] kvm_unmap_hva_handler+0x2c/0x3c
[223090.262537] [<ffff0000080ace2c>] handle_hva_to_gpa+0xb0/0x104
[223090.262539] [<ffff0000080af988>] kvm_unmap_hva+0x5c/0xbc
[223090.262543] [<ffff0000080a2478>]
kvm_mmu_notifier_invalidate_page+0x50/0x8c
[223090.262547] [<ffff0000082274f8>]
__mmu_notifier_invalidate_page+0x5c/0x84
[223090.262551] [<ffff00000820b700>] try_to_unmap_one+0x1d0/0x4a0
[223090.262553] [<ffff00000820c5c8>] rmap_walk+0x1cc/0x2e0
[223090.262555] [<ffff00000820c90c>] try_to_unmap+0x74/0xa4
[223090.262557] [<ffff000008230ce4>] migrate_pages+0x31c/0x5ac
[223090.262561] [<ffff0000081f869c>] compact_zone+0x3fc/0x7ac
[223090.262563] [<ffff0000081f8ae0>] compact_zone_order+0x94/0xb0
[223090.262564] [<ffff0000081f91c0>] try_to_compact_pages+0x108/0x290
[223090.262569] [<ffff0000081d5108>] __alloc_pages_direct_compact+0x70/0x1ac
[223090.262571] [<ffff0000081d64a0>] __alloc_pages_nodemask+0x434/0x9f4
[223090.262572] [<ffff0000082256f0>] alloc_pages_vma+0x230/0x254
[223090.262574] [<ffff000008235e5c>] do_huge_pmd_anonymous_page+0x114/0x538
[223090.262576] [<ffff000008201bec>] handle_mm_fault+0xd40/0x17a4
[223090.262577] [<ffff0000081fb324>] __get_user_pages+0x12c/0x36c
[223090.262578] [<ffff0000081fb804>] get_user_pages_unlocked+0xa4/0x1b8
[223090.262579] [<ffff0000080a3ce8>] __gfn_to_pfn_memslot+0x280/0x31c
[223090.262580] [<ffff0000080a3dd0>] gfn_to_pfn_prot+0x4c/0x5c
[223090.262582] [<ffff0000080af3f8>] kvm_handle_guest_abort+0x240/0x774
[223090.262584] [<ffff0000080b2bac>] handle_exit+0x11c/0x1ac
[223090.262586] [<ffff0000080ab99c>] kvm_arch_vcpu_ioctl_run+0x31c/0x648
[223090.262587] [<ffff0000080a1d78>] kvm_vcpu_ioctl+0x378/0x768
[223090.262590] [<ffff00000825df5c>] do_vfs_ioctl+0x324/0x5a4
[223090.262591] [<ffff00000825e26c>] SyS_ioctl+0x90/0xa4
[223090.262595] [<ffff000008085d84>] el0_svc_naked+0x38/0x3c
This patch moves the stage2 PGD manipulation under the lock.
Reported-by: Alexander Graf <agraf@suse.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Fix a division-by-zero in set_termios when debugging is enabled and a
high-enough speed has been requested so that the divisor value becomes
zero.
Instead of just fixing the offending debug statement, cap the baud rate
at the base as a zero divisor value also appears to crash the firmware.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable <stable@vger.kernel.org> # 2.6.12
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Drop erroneous cpu_to_le32 when setting the baud rate, something which
corrupted the divisor on big-endian hosts.
Found using sparse:
warning: incorrect type in argument 1 (different base types)
expected unsigned int [unsigned] [usertype] val
got restricted __le32 [usertype] <noident>
Fixes: af2ac1a091 ("USB: serial mct_usb232: move DMA buffers to heap")
Cc: stable <stable@vger.kernel.org> # 2.6.34
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-By: Pete Zaitcev <zaitcev@yahoo.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Add missing endianness conversion when printing the supported baud
rates.
Found using sparse:
warning: restricted __le16 degrades to integer
Fixes: e0d795e4f3 ("usb: irda: cleanup on ir-usb module")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
TID 7 is a valid value for QoS IEEE 802.11e.
The switch statement that follows states 7 is valid.
Remove function IsACValid and use the default case to filter
invalid TIDs.
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
EPROM_CMD is 2 byte aligned on PCI map so calling with rtl92e_readl
will return invalid data so use rtl92e_readw.
The device is unable to select the right eeprom type.
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
BSSIDR has two byte alignment on PCI ioremap correct the write
by swapping to 16 bits first.
This fixes a problem that the device associates fail because
the filter is not set correctly.
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver attempts to alter memory that is mapped to PCI device.
This is because tx_fwinfo_8190pci points to skb->data
Move the pci_map_single to when completed buffer is ready to be mapped with
psdec is empty to drop on mapping error.
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vchiq_arm supports transfers less than one page and at arbitrary
alignment, using the dma-mapping API to perform its cache maintenance
(even though the VPU drives the DMA hardware). Read (DMA_FROM_DEVICE)
operations use cache invalidation for speed, falling back to
clean+invalidate on partial cache lines, with writes (DMA_TO_DEVICE)
using flushes.
If a read transfer has ends which aren't page-aligned, performing cache
maintenance as if they were whole pages can lead to memory corruption
since the partial cache lines at the ends (and any cache lines before or
after the transfer area) will be invalidated. This bug was masked until
the disabling of the cache flush in flush_dcache_page().
Honouring the requested transfer start- and end-points prevents the
corruption.
Fixes: cf9caf1929 ("staging: vc04_services: Replace dmac_map_area with dmac_map_sg")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Cc: stable <stable@vger.kernel.org> # 4.10
Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
An early error handler in send_request() tries to release a spinlock,
but the lock isn't acquired until the loop below it is entered.
Signed-off-by: Ian Chard <ian@chard.org>
Acked-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We get a harmless warning without CONFIG_PM:
drivers/memory/atmel-ebi.c:584:12: error: 'atmel_ebi_resume' defined but not used [-Werror=unused-function]
Marking the function as __maybe_unused does the right thing here
and drops it silently when unused.
Fixes: a483fb10e5ea ("memory: atmel-ebi: Add PM ops")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Fix the following sparse warnings about incorrect type usage:
fusb302.c:1028:32: warning: incorrect type in argument 1 (different base types)
fusb302.c:1028:32: expected unsigned short [unsigned] [usertype] header
fusb302.c:1028:32: got restricted __le16 const [usertype] header
fusb302.c:1484:32: warning: incorrect type in argument 1 (different base types)
fusb302.c:1484:32: expected unsigned short [unsigned] [usertype] header
fusb302.c:1484:32: got restricted __le16 [usertype] header
Signed-off-by: Guru Das Srinagesh <gurooodas@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When allocating a gpio using the managed resource devm_, we can avoid freeing it
manually. But even if we did it we should use devm_gpio_free.
So, just remove the free of the gpio in the error path.
Signed-off-by: Rui Miguel Silva <rmfrfs@gmail.com>
Acked-by: Yueyao Zhu <yueyao.zhu@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The GICv3 documentation is extremely confusing, as it talks about
the number of priorities represented by the ICH_APxRn_EL2 registers,
while it should really talk about the number of preemption levels.
This leads to a bug where we may access undefined ICH_APxRn_EL2
registers, since PREbits is allowed to be smaller than PRIbits.
Thankfully, nobody seem to have taken this path so far...
The fix is to use ICH_VTR_EL2.PREbits instead.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
When an interrupt is injected with the HW bit set (indicating that
deactivation should be propagated to the physical distributor),
special care must be taken so that we never mark the corresponding
LR with the Active+Pending state (as the pending state is kept in
the physycal distributor).
Cc: stable@vger.kernel.org
Fixes: 59529f69f5 ("KVM: arm/arm64: vgic-new: Add GICv3 world switch backend")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
When an interrupt is injected with the HW bit set (indicating that
deactivation should be propagated to the physical distributor),
special care must be taken so that we never mark the corresponding
LR with the Active+Pending state (as the pending state is kept in
the physycal distributor).
Cc: stable@vger.kernel.org
Fixes: 140b086dd1 ("KVM: arm/arm64: vgic-new: Add GICv2 world switch backend")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
In commit dc3106690b ("powerpc: tm: Always use fp_state and vr_state
to store live registers"), a section of code was removed that copied
the current state to checkpointed state. That code should not have been
removed.
When an FP (Floating Point) unavailable is taken inside a transaction,
we need to abort the transaction. This is because at the time of the
tbegin, the FP state is bogus so the state stored in the checkpointed
registers is incorrect. To fix this, we treclaim (to get the
checkpointed GPRs) and then copy the thread_struct FP live state into
the checkpointed state. We then trecheckpoint so that the FP state is
correctly restored into the CPU.
The copying of the FP registers from live to checkpointed is what was
missing.
This simplifies the logic slightly from the original patch.
tm_reclaim_thread() will now always write the checkpointed FP
state. Either the checkpointed FP state will be written as part of
the actual treclaim (in tm.S), or it'll be a copy of the live
state. Which one we use is based on MSR[FP] from userspace.
Similarly for VMX.
Fixes: dc3106690b ("powerpc: tm: Always use fp_state and vr_state to store live registers")
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: cyrilbur@gmail.com
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We like living dangerously. Nothing explicitely forbids stack-protector
to be used in the HYP code, while distributions routinely compile their
kernel with it. We're just lucky that no code actually triggers the
instrumentation.
Let's not try our luck for much longer, and disable stack-protector
for code living at HYP.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
On powerpc we can build the kernel with two different ABIs for mcount(), which
is used by ftrace. Kernels built with one ABI do not know how to load modules
built with the other ABI. The new style ABI is called "mprofile-kernel", for
want of a better name.
Currently if we build a module using the old style ABI, and the kernel with
mprofile-kernel, when we load the module we'll oops something like:
# insmod autofs4-no-mprofile-kernel.ko
ftrace-powerpc: Unexpected instruction f8810028 around bl _mcount
------------[ cut here ]------------
WARNING: CPU: 6 PID: 3759 at ../kernel/trace/ftrace.c:2024 ftrace_bug+0x2b8/0x3c0
CPU: 6 PID: 3759 Comm: insmod Not tainted 4.11.0-rc3-gcc-5.4.1-00017-g5a61ef74f269 #11
...
NIP [c0000000001eaa48] ftrace_bug+0x2b8/0x3c0
LR [c0000000001eaff8] ftrace_process_locs+0x4a8/0x590
Call Trace:
alloc_pages_current+0xc4/0x1d0 (unreliable)
ftrace_process_locs+0x4a8/0x590
load_module+0x1c8c/0x28f0
SyS_finit_module+0x110/0x140
system_call+0x38/0xfc
...
ftrace failed to modify
[<d000000002a31024>] 0xd000000002a31024
actual: 35:65:00:48
We can avoid this by including in the vermagic whether the kernel/module was
built with mprofile-kernel. Which results in:
# insmod autofs4-pg.ko
autofs4: version magic
'4.11.0-rc3-gcc-5.4.1-00017-g5a61ef74f269 SMP mod_unload modversions '
should be
'4.11.0-rc3-gcc-5.4.1-00017-g5a61ef74f269-dirty SMP mod_unload modversions mprofile-kernel'
insmod: ERROR: could not insert module autofs4-pg.ko: Invalid module format
Fixes: 8c50b72a3b ("powerpc/ftrace: Add Kconfig & Make glue for mprofile-kernel")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We like living dangerously. Nothing explicitely forbids stack-protector
to be used in the EL2 code, while distributions routinely compile their
kernel with it. We're just lucky that no code actually triggers the
instrumentation.
Let's not try our luck for much longer, and disable stack-protector
for code living at EL2.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
PS_RDY messages sent during power swap sequences are expected to reflect
the new power role.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the lower level driver provided a list of VDOs in its configuration
data, send it to the partner as response to a Discover Identity command
if in device mode (UFP).
Cc: Yueyao Zhu <yueyao.zhu@gmail.com>
Originally-from: Puma Hsu <puma_hsu@htc.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We do support USB PD communication, and devices supported by this driver
typically use USB power for purposes other than USB communication.
Originally-from: Puma Hsu <puma_hsu@htc.com>
Cc: Yueyao Zhu <yueyao.zhu@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Per USB PD standard, we have to drop duplicate PD messages.
We can not expect lower protocol layers to drop such messages,
since lower layers don't know if a message was dropped somewhere
else in the stack.
Originally-from: Puma Hsu <puma_hsu@htc.com>
Cc: Yueyao Zhu <yueyao.zhu@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
FUSB_REG_STATUS0 & FUSB_REG_STATUS0_VBUSOK = 0x40 & 0x80 is always
zero. Fix the code to what it is intended to be: check the VBUSOK
bit of the value read from address FUSB_REG_STATUS0.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Yueyao Zhu <yueyao.zhu@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the OF and I2C device ID table entries as module aliases, using the
MODULE_DEVICE_TABLE() macro.
Before this patch:
$ modinfo drivers/staging/typec/fusb302/fusb302.ko | grep alias
$
After this patch:
$ modinfo drivers/staging/typec/fusb302/fusb302.ko | grep alias
alias: of:N*T*Cfcs,fusb302C*
alias: of:N*T*Cfcs,fusb302
alias: i2c:typec_fusb302
Signed-off-by: Javier Martinez Canillas <javier@dowhile0.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes a sparse warning regarding an undeclared symbol. Since the
structure tcpci_tcpc_config is private to tcpci.c, it should be declared as
static.
Signed-off-by: Olivier Leveque <o_leveque@orange.fr>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit 9828282e33 ("staging: android: ion: Remove old platform
support"), the document about devicetree of ion is no need anymore, so
just remove it.
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add missing endianness conversion when using the USB device-descriptor
bcdDevice field when applying the Amanero Combo384 (endianness!) quirk.
Fixes: 3eff682d76 ("ALSA: usb-audio: Support both DSD LE/BE Amanero firmware versions")
Cc: Jussi Laako <jussi@sonarnerd.net>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
We get a link error when EXPORTFS is not enabled:
ERROR: "exportfs_encode_fh" [fs/overlayfs/overlay.ko] undefined!
ERROR: "exportfs_decode_fh" [fs/overlayfs/overlay.ko] undefined!
This adds a Kconfig 'select' statement for overlayfs, the same way that
it is done for the other users of exportfs.
Fixes: 3a1e819b4e ("ovl: store file handle of lower inode on copy up")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Commit 557aaa7ffa ("ft232: support the ASYNC_LOW_LATENCY
flag") enables unprivileged users to set the FTDI latency timer,
but there was a logic flaw that skipped sending the corresponding
USB control message to the device.
Specifically, the device latency timer would not be updated until next
open, something which was later also inadvertently broken by commit
c19db4c9e4 ("USB: ftdi_sio: set device latency timeout at port
probe").
A recent commit c6dce26266 ("USB: serial: ftdi_sio: fix extreme
low-latency setting") disabled the low-latency mode by default so we now
need this fix to allow unprivileged users to again enable it.
Signed-off-by: Anthony Mallet <anthony.mallet@laas.fr>
[johan: amend commit message]
Fixes: 557aaa7ffa ("ft232: support the ASYNC_LOW_LATENCY flag")
Fixes: c19db4c9e4 ("USB: ftdi_sio: set device latency timeout at port probe").
Cc: stable <stable@vger.kernel.org> # 2.6.31
Signed-off-by: Johan Hovold <johan@kernel.org>
I finally got around to creating trampolines for dynamically allocated
ftrace_ops with using synchronize_rcu_tasks(). For users of the ftrace
function hook callbacks, like perf, that allocate the ftrace_ops
descriptor via kmalloc() and friends, ftrace was not able to optimize
the functions being traced to use a trampoline because they would also
need to be allocated dynamically. The problem is that they cannot be
freed when CONFIG_PREEMPT is set, as there's no way to tell if a task
was preempted on the trampoline. That was before Paul McKenney
implemented synchronize_rcu_tasks() that would make sure all tasks
(except idle) have scheduled out or have entered user space.
While testing this, I triggered this bug:
BUG: unable to handle kernel paging request at ffffffffa0230077
...
RIP: 0010:0xffffffffa0230077
...
Call Trace:
schedule+0x5/0xe0
schedule_preempt_disabled+0x18/0x30
do_idle+0x172/0x220
What happened was that the idle task was preempted on the trampoline.
As synchronize_rcu_tasks() ignores the idle thread, there's nothing
that lets ftrace know that the idle task was preempted on a trampoline.
The idle task shouldn't need to ever enable preemption. The idle task
is simply a loop that calls schedule or places the cpu into idle mode.
In fact, having preemption enabled is inefficient, because it can
happen when idle is just about to call schedule anyway, which would
cause schedule to be called twice. Once for when the interrupt came in
and was returning back to normal context, and then again in the normal
path that the idle loop is running in, which would be pointless, as it
had already scheduled.
The only reason schedule_preempt_disable() enables preemption is to be
able to call sched_submit_work(), which requires preemption enabled. As
this is a nop when the task is in the RUNNING state, and idle is always
in the running state, there's no reason that idle needs to enable
preemption. But that means it cannot use schedule_preempt_disable() as
other callers of that function require calling sched_submit_work().
Adding a new function local to kernel/sched/ that allows idle to call
the scheduler without enabling preemption, fixes the
synchronize_rcu_tasks() issue, as well as removes the pointless spurious
schedule calls caused by interrupts happening in the brief window where
preemption is enabled just before it calls schedule.
Reviewed: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170414084809.3dacde2a@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Moving most of the shared code to virt/kvm/arm had for consequence
that KVM/ARM doesn't build anymore, because the code that used to
define the tracepoints is now somewhere else.
Fix this by defining CREATE_TRACE_POINTS in coproc.c, and clean-up
trace.h as well.
Fixes: 35d2d5d490 ("KVM: arm/arm64: Move shared files to virt/kvm/arm")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
If there are no clean blocks to be demoted the writeback will be
triggered at that point. Preemptively writing back can hurt high IO
load scenarios.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Drop the MODERATE state since it wasn't buying us much.
Also, in check_migrations(), prepare for the next commit ("dm cache
policy smq: don't do any writebacks unless IDLE") by deferring to the
policy to make the final decision on whether writebacks can be
serviced.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
IO tracking used to throttle writebacks when the origin device is busy.
Even if all the IO is going to the fast device, writebacks can
significantly degrade performance. So track all IO to gauge whether the
cache is busy or not.
Otherwise, synthetic IO tests (e.g. fio) that might send all IO to the
fast device wouldn't cause writebacks to get throttled.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
It causes a lot of churn if the working set's size is close to the fast
device's size.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This stops entries bouncing in and out of the cache quickly.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
If there are no clean entries to demote we really want to writeback
immediately.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Heavy IO load may mean there are very few clean blocks in the cache, and
we risk demoting entries that get hit a lot.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Some bios have no payload (eg, a FLUSH), don't reset the idle_time when
these come in.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The new pm domain driver causes a build failure when CONFIG_PM
is not set:
warning: (IMX7_PM_DOMAINS) selects PM_GENERIC_DOMAINS which has unmet direct dependencies (PM)
drivers/base/power/domain_governor.c: In function 'default_suspend_ok':
drivers/base/power/domain_governor.c:75:17: error: 'struct dev_pm_info' has no member named 'ignore_children'
This adds a dependency to ensure that we don't attempt to build the
driver without CONFIG_PM.
Fixes: 03aa12629f ("soc: imx: Add GPCv2 power gating driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The board file for imx6sx-sdb overrides cpufreq operating points to use
higher voltages. This is done because the board has a shared rail for
VDD_ARM_IN and VDD_SOC_IN and when using LDO bypass the shared voltage
needs to be a value suitable for both ARM and SOC.
This only applies to LDO bypass mode, a feature not present in upstream.
When LDOs are enabled the effect is to use higher voltages than necessary
for no good reason.
Setting these higher voltages can make some boards fail to boot with ugly
semi-random crashes reminiscent of memory corruption. These failures only
happen on board rev. C, rev. B is reported to still work.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Fixes: 54183bd7f7 ("ARM: imx6sx-sdb: add revb board and make it default")
Cc: stable@vger.kernel.org
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Currently the following errors are seen:
[ 14.015056] mc13xxx 0-0008: Failed to read IRQ status: -6
[ 27.321093] mc13xxx 0-0008: Failed to read IRQ status: -6
[ 27.411681] mc13xxx 0-0008: Failed to read IRQ status: -6
[ 27.456281] mc13xxx 0-0008: Failed to read IRQ status: -6
[ 30.527106] mc13xxx 0-0008: Failed to read IRQ status: -6
[ 36.596900] mc13xxx 0-0008: Failed to read IRQ status: -6
Also when reading the interrupts via 'cat /proc/interrupts' the
PMIC GPIO interrupt counter does not stop increasing.
The reason for the storm of interrupts is that the PUS field of
register IOMUXC_SW_PAD_CTL_PAD_CSI0_DAT5 is currently configured as:
10 : 100k pullup
and the PMIC interrupt is being registered as IRQ_TYPE_LEVEL_HIGH type,
which is the correct type as per the MC34708 datasheet.
Use the default power on value for the IOMUX, which sets PUS field as:
00: 360k pull down
This prevents the spurious PMIC interrupts from happening.
Commit e1ffceb078 ("ARM: imx53: qsrb: fix PMIC interrupt level")
correctly described the irq type as IRQ_TYPE_LEVEL_HIGH, but
missed to update the IOMUX of the PMIC GPIO as pull down.
Fixes: e1ffceb078 ("ARM: imx53: qsrb: fix PMIC interrupt level")
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The power and current "shunt-resistor" attribute's 'show' function
displays the resistor value in milli-Ohms, while the ABI description
specifies it should be displayed in Ohms. Fix it.
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
We should be allocating enough information for a tiadc_device struct
which is about 400 bytes but instead we allocate enough for a second
iio_dev struct which is over 2000 bytes.
Fixes: fea89e2dfc ("iio: adc: ti_am335x_adc: use variable names for sizeof() operator")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
The recent conversion to the hotplug state machine missed that the original
hotplug notifiers did not execute in the frozen state, which is used on
suspend on resume.
This does not matter on single socket machines, but on multi socket systems
this breaks when the device for a non-boot socket is removed when the last
CPU of that socket is brought offline. The device removal locks up the
machine hard w/o any debug output.
Prevent executing the hotplug callbacks when cpuhp_tasks_frozen is true.
Thanks to Tommi for providing debug information patiently while I failed to
spot the obvious.
Fixes: e00ca5df37 ("hwmon: (coretemp) Convert to hotplug state machine")
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The current implementation of interrupt coalescing doesn't work, because
it doesn't configure the coalescing timer, which is needed to make sure
we get an interrupt at some point.
As a fix for stable, we simply remove the interrupt coalescing
functionality. It will be re-introduced properly in a future commit.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The mv_xor_v2_tx_submit() gets the next available HW descriptor by
calling mv_xor_v2_get_desq_write_ptr(), which reads a HW register
telling the next available HW descriptor. This was working fine when HW
descriptors were issued for processing directly in tx_submit().
However, as part of the review process of the driver, a change was
requested to move the actual kick-off of HW descriptors processing to
->issue_pending(). Due to this, reading the HW register to know the next
available HW descriptor no longer works.
So instead of using this HW register, we implemented a software index
pointing to the next available HW descriptor.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The engine was enabled prior to its configuration, which isn't
correct. This patch relocates the activation of the XOR engine, to be
after the configuration of the XOR engine.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hanna Hawa <hannah@marvell.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Descriptors that have not been acknowledged by the async_tx layer
should not be re-used, so this commit adjusts the implementation of
mv_xor_v2_prep_sw_desc() to skip descriptors for which
async_tx_test_ack() is false.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
mv_xor_v2_tasklet() is looping over completed HW descriptors. Before the
loop, it initializes 'next_pending_hw_desc' to the first HW descriptor
to handle, and then the loop simply increments this point, without
taking care of wrapping when we reach the last HW descriptor. The
'pending_ptr' index was being wrapped back to 0 at the end, but it
wasn't used in each iteration of the loop to calculate
next_pending_hw_desc.
This commit fixes that, and makes next_pending_hw_desc a variable local
to the loop itself.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The mv_xor_v2_prep_sw_desc() is called from a few different places in
the driver, but we never take into account the fact that it might
return NULL. This commit fixes that, ensuring that we don't panic if
there are no more descriptors available.
Fixes: 19a340b1a8 ("dmaengine: mv_xor_v2: new driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Moving the cooling code into the cpufreq driver caused a possible build failure
when the cpu_thermal helper code is a loadable module or disabled:
drivers/cpufreq/dbx500-cpufreq.o: In function `dbx500_cpufreq_ready':
dbx500-cpufreq.c:(.text.dbx500_cpufreq_ready+0x4): undefined reference to `cpufreq_cooling_register'
This adds the same dependency that we have in other cpufreq drivers,
forcing the driver to be disabled when we can't possibly link it.
Fixes: 19678ffb9f (cpufreq: dbx500: Manage cooling device from cpufreq driver)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the current code we accidentally return the successful result from
idr_alloc() instead of a negative error pointer. The caller is looking
for an error pointer and so it treats the returned value as a valid
pointer.
This one might be a bit serious because if it lets people get around the
kernel's protection for remapping NULL. I'm not sure.
Fixes: 75d2364ea0 (PowerCap: Add class driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Root flow table is dynamically changed by the underlying flow steering
layer, and IPoIB/ULPs have no idea what will be the root flow table in
the future, hence we need a dynamic infrastructure to move Underlay QPs
with the root flow table.
Fixes: b3ba51498b ("net/mlx5: Refactor create flow table method to accept underlay QP")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
IPoIB doesn't support striding RQ at the moment, for this
we need to explicitly choose non striding RQ in IPoIB init,
even if the HW supports it.
Fixes: 8f493ffd88 ("net/mlx5e: IPoIB, RX steering RSS RQTs and TIRs")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Fail-safe support patches introduced a trivial bug,
setup tc callback is doing a wrong check of the netdevice state,
the fix is simply to invert the condition.
Fixes: 6f9485af40 ("net/mlx5e: Fail safe tc setup")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Pause bit should set when RX pause is on, not TX pause.
Also, setting Asym_Pause is incorrect, and should be turned off.
Fixes: 665bc53969 ("net/mlx5e: Use new ethtool get/set link ksettings API")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Query the operational pause from firmware (PFCC register) instead of
always passing zeros.
Fixes: 665bc53969 ("net/mlx5e: Use new ethtool get/set link ksettings API")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In the SRM lock check section of code the '&' bitwise operator is
used as part of checking lock status. Functionally the code works
as intended, but the conditional statement is a boolean comparison
so should really use '&&' logical operator instead. This commit
rectifies this discrepancy.
Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Thinkpad Helix 2 is a tablet PC, the audio is powered by Core M
broadwell-audio and rt286 codec. For all versions of Linux kernel,
the stereo output doesn't work properly when earphones are plugged
in, the sound was coming out from both channels even if the audio
contains only the left or right channel. Furthermore, if a music
recorded in stereo is played, the two channels cancle out each other
out, as a result, no voice but only distorted background music can be
heard, like a sound card with builtin a Karaoke sount effect.
Apparently this tablet uses a combo jack with polarity incorrectly
set by rt286 driver. This patch adds DMI information of Thinkpad Helix 2
to force_combo_jack_table[] and the issue is resolved. The microphone
input doesn't work regardless to the presence of this patch and still
needs help from other developers to investigate.
This is my first patch to LKML directly, sorry for CC-ing too many
people here.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=93841
Signed-off-by: Yifeng Li <tomli@tomli.me>
Signed-off-by: Mark Brown <broonie@kernel.org>
The i915 component framework expects the caller to be invoking
snd_hdac_i915_init() from a thread context. Otherwise it results in
lockups on drm side.
So move the registering of component interface and probing of codecs on
this bus to a worker thread.
init_failed in skl structure is not used currently, so renamed to
init_done and used to track the initialization done in worker thread.
Reported-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sodhi, VunnyX <vunnyx.sodhi@intel.com>
Signed-off-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The R_CCU of H3/H5 currently wrongly used A64 R_CCU compatible.
Fix it by changing it to the correct H3 compatible.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Remove the duplicate brcm,bcm7425-sun-top-ctrl compatible string and
replace it with brcm,bcm7435-sun-top-ctrl which was intended.
Fixes: bd0faf08dc ("soc: bcm: brcmstb: Match additional compatible strings")
Reported-by: Andreas Oberritter <obi@saftware.de>
Acked-by: Gregory Fong <gregory.0xf0@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
This pull request brings back bcm2835 DT fixups from Baruch Siach that
got misplaced after a PR for 4.11 got rejected.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tetsuo reports:
fs/built-in.o: In function `xfs_file_iomap_end':
xfs_iomap.c:(.text+0xe0ef9): undefined reference to `put_dax'
fs/built-in.o: In function `xfs_file_iomap_begin':
xfs_iomap.c:(.text+0xe1a7f): undefined reference to `dax_get_by_host'
make: *** [vmlinux] Error 1
$ grep DAX .config
CONFIG_DAX=m
# CONFIG_DEV_DAX is not set
# CONFIG_FS_DAX is not set
When FS_DAX=n we can/must throw away the dax code in filesystems.
Implement 'fs_' versions of dax_get_by_host() and put_dax() that are
nops in the FS_DAX=n case.
Cc: <linux-xfs@vger.kernel.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Jan Kara <jack@suse.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Fixes: ef51042472 ("block, dax: move 'select DAX' from BLOCK to FS_DAX")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Commit eed4d47efe (ACPI / sleep: Ignore spurious SCI wakeups from
suspend-to-idle) modified the core suspend-to-idle code to filter
out spurious SCI interrupts received while suspended, which requires
ACPI event source handlers to report wakeup events in a way that
will trigger a wakeup from suspend to idle (or abort system suspends
in progress, which is equivalent).
That needs to be done in the rtc-cmos driver too, which was overlooked
by the above commit, so do that now.
Fixes: eed4d47efe (ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle)
Reported-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 8a537ece3d (PM / wakeup: Integrate mechanism to abort
transitions in progress) modified wakeup_source_report_event()
and wakeup_source_activate() to make it possible to call
pm_system_wakeup() from the latter if so indicated by the
caller of the former (via a new function argument added by that
commit), but it overlooked the fact that in some situations
wakeup_source_report_event() is called to signal a "hard" event
(ie. such that should abort a system suspend in progress) after
pm_stay_awake() has been called for the same wakeup source object,
in which case the pm_system_wakeup() will not trigger.
To work around this issue, modify wakeup_source_activate() and
wakeup_source_report_event() again so that pm_system_wakeup() is
called by the latter directly (if its last argument is true), in
which case the additional argument does not need to be passed
to wakeup_source_activate() any more, so drop it from there.
Fixes: 8a537ece3d (PM / wakeup: Integrate mechanism to abort transitions in progress)
Reported-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a document describing the current behavior and user space
interface of the intel_pstate driver in the RST format and
drop the existing outdated intel_pstate.txt document.
Also update admin-guide/pm/cpufreq.rst with proper RST references
to the new intel_pstate.rst document.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the BLOCK=n case the dax core does not need to / must not emit the
block-device-dax helpers. Otherwise it leads to compile errors.
Cc: Arnd Bergmann <arnd@arndb.de>
Reported-by: Fabian Frederick <fabf@skynet.be>
Fixes: ef51042472 ("block, dax: move 'select DAX' from BLOCK to FS_DAX")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Imagine we have a pid namespace and a task from its parent's pid_ns,
which made setns() to the pid namespace. The task is doing fork(),
while the pid namespace's child reaper is dying. We have the race
between them:
Task from parent pid_ns Child reaper
copy_process() ..
alloc_pid() ..
.. zap_pid_ns_processes()
.. disable_pid_allocation()
.. read_lock(&tasklist_lock)
.. iterate over pids in pid_ns
.. kill tasks linked to pids
.. read_unlock(&tasklist_lock)
write_lock_irq(&tasklist_lock); ..
attach_pid(p, PIDTYPE_PID); ..
.. ..
So, just created task p won't receive SIGKILL signal,
and the pid namespace will be in contradictory state.
Only manual kill will help there, but does the userspace
care about this? I suppose, the most users just inject
a task into a pid namespace and wait a SIGCHLD from it.
The patch fixes the problem. It simply checks for
(pid_ns->nr_hashed & PIDNS_HASH_ADDING) in copy_process().
We do it under the tasklist_lock, and can't skip
PIDNS_HASH_ADDING as noted by Oleg:
"zap_pid_ns_processes() does disable_pid_allocation()
and then takes tasklist_lock to kill the whole namespace.
Given that copy_process() checks PIDNS_HASH_ADDING
under write_lock(tasklist) they can't race;
if copy_process() takes this lock first, the new child will
be killed, otherwise copy_process() can't miss
the change in ->nr_hashed."
If allocation is disabled, we just return -ENOMEM
like it's made for such cases in alloc_pid().
v2: Do not move disable_pid_allocation(), do not
introduce a new variable in copy_process() and simplify
the patch as suggested by Oleg Nesterov.
Account the problem with double irq enabling
found by Eric W. Biederman.
Fixes: c876ad7682 ("pidns: Stop pid allocation when init dies")
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Mike Rapoport <rppt@linux.vnet.ibm.com>
CC: Michal Hocko <mhocko@suse.com>
CC: Andy Lutomirski <luto@kernel.org>
CC: "Eric W. Biederman" <ebiederm@xmission.com>
CC: Andrei Vagin <avagin@openvz.org>
CC: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Serge Hallyn <serge@hallyn.com>
Cc: stable@vger.kernel.org
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
The code can potentially sleep for an indefinite amount of time in
zap_pid_ns_processes triggering the hung task timeout, and increasing
the system average. This is undesirable. Sleep with a task state of
TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these
undesirable side effects.
Apparently under heavy load this has been allowing Chrome to trigger
the hung time task timeout error and cause ChromeOS to reboot.
Reported-by: Vovo Yang <vovoy@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 6347e90091 ("pidns: guarantee that the pidns init will be the last pidns process reaped")
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Some minor cleanup of cifs query xattr functions (will also make
SMB3 xattr implementation cleaner as well).
Signed-off-by: Steve French <steve.french@primarydata.com>
Use time_after kernel macro for time comparison
that has safety check.
Signed-off-by: Karim Eshapa <karim.eshapa@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
In fs/cifs/smb2pdu.h, we have:
#define SMB2_SHARE_TYPE_DISK 0x01
#define SMB2_SHARE_TYPE_PIPE 0x02
#define SMB2_SHARE_TYPE_PRINT 0x03
Knowing that, with the current code, the SMB2_SHARE_TYPE_PRINT case can
never trigger and printer share would be interpreted as disk share.
So, test the ShareType value for equality instead.
Fixes: faaf946a7d ("CIFS: Add tree connect/disconnect capability for SMB2")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Create an ops variable to store tcon->ses->server->ops and cache
indirections and reduce code size a trivial bit.
$ size fs/cifs/cifsacl.o*
text data bss dec hex filename
5338 136 8 5482 156a fs/cifs/cifsacl.o.new
5371 136 8 5515 158b fs/cifs/cifsacl.o.old
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
If an array consists of two drives and the first drive has the bad
block, the read request to the region overlapping the bad block chooses
the same disk (with bad block) as device to read from over and over and
the request gets stuck. If the first disk only partially overlaps with
bad block, it becomes a candidate ("best disk") for shorter range of
sectors. The second disk is capable of reading the entire requested
range and it is updated accordingly, however it is not recorded as a
best device for the request. In the end the request is sent to the first
disk to read entire range of sectors. It fails and is re-tried in a
moment but with the same outcome.
Actually it is quite likely scenario but it had little exposure in my
test until commit 715d40b93b10 ("md/raid1: add failfast handling for
reads.") removed preference for idle disk. Such scenario had been
passing as second disk was always chosen when idle.
Reset a candidate ("best disk") to read from if disk can read entire
range. Do it only if other disk has already been chosen as a candidate
for a smaller range. The head position / disk type logic will select
the best disk to read from - it is fine as disk with bad block won't be
considered for it.
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
This reverts commit ecb10b694b.
The only expected ACPI control method lid device's usage model is
1. Listen to the lid notification,
2. Evaluate _LID after being notified by BIOS,
3. Suspend the system (if users configure to do so) after seeing "close".
It's not ensured that BIOS will notify OS after boot/resume, and
it's not ensured that BIOS will always generate "open" event upon
opening the lid.
But there are 2 wrong usage models:
1. When the lid device is responsible for suspend/resume the system,
userspace requires to see "open" event to be paired with "close" after
the system is resumed, or it will suspend the system again.
2. When an external monitor connects to the laptop attached docks,
userspace requires to see "close" event after the system is resumed so
that it can determine whether the internal display should remain dark
and the external display should be lit on.
After we made default kernel behavior to be suitable for usage model 1,
users of usage model 2 start to report regressions for such behavior
change.
Reversion of button.lid_init_state=method doesn't actually reverts to old
default behavior as doing so can enter a regression loop, but facilitates
users to work the reported regressions around with
button.lid_init_state=method.
Fixes: ecb10b694b (ACPI / button: Remove lid_init_state=method mode)
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Link: https://bugzilla.kernel.org/show_bug.cgi?id=195455
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1430259
Tested-by: Steffen Weber <steffen.weber@gmail.com>
Tested-by: Julian Wiedmann <julian.wiedmann@jwi.name>
Reported-by: Joachim Frieben <jfrieben@hotmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a .gitignore file so that git commands do not pick up the resulting
binaries and directories.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are several paths in vmxnet3, where settings changes cause the
adapter to be brought down and back up (vmxnet3_set_ringparam among
them). Should part of the reset operation fail, these paths call
vmxnet3_force_close, which enables all napi instances prior to calling
dev_close (with the expectation that vmxnet3_close will then properly
disable them again). However, vmxnet3_force_close neglects to clear
VMXNET3_STATE_BIT_QUIESCED prior to calling dev_close. As a result
vmxnet3_quiesce_dev (called from vmxnet3_close), returns early, and
leaves all the napi instances in a enabled state while the device itself
is closed. If a device in this state is activated again, napi_enable
will be called on already enabled napi_instances, leading to a BUG halt.
The fix is to simply enausre that the QUIESCED bit is cleared in
vmxnet3_force_close to allow quesence to be completed properly on close.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Shrikrishna Khare <skhare@vmware.com>
CC: "VMware, Inc." <pv-drivers@vmware.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The revision enum values (eg EFX_REV_HUNT_A0) form part of our API,
and are included in ethtool. If these are inconsistent then ethtool
will print garbage for a register dump (ethtool -d).
Fixes: 5a6681e22c ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the missing endianness conversions to a debug statement printing
the USB device-descriptor idVendor and idProduct fields during probe.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add default case to switch in order to avoid any chance of using an
uninitialized variable _low_, in case s->type does not match any of
the listed case values.
Addresses-Coverity-ID: 1398130
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 0ca50d12fe ("sctp: fix src address selection if using secondary
addresses") has fixed a src address selection issue when using secondary
addresses for ipv4.
Now sctp ipv6 also has the similar issue. When using a secondary address,
sctp_v6_get_dst tries to choose the saddr which has the most same bits
with the daddr by sctp_v6_addr_match_len. It may make some cases not work
as expected.
hostA:
[1] fd21:356b:459a:cf10::11 (eth1)
[2] fd21:356b:459a:cf20::11 (eth2)
hostB:
[a] fd21:356b:459a:cf30::2 (eth1)
[b] fd21:356b:459a:cf40::2 (eth2)
route from hostA to hostB:
fd21:356b:459a:cf30::/64 dev eth1 metric 1024 mtu 1500
The expected path should be:
fd21:356b:459a:cf10::11 <-> fd21:356b:459a:cf30::2
But addr[2] matches addr[a] more bits than addr[1] does, according to
sctp_v6_addr_match_len. It causes the path to be:
fd21:356b:459a:cf20::11 <-> fd21:356b:459a:cf30::2
This patch is to fix it with the same way as Marcelo's fix for sctp ipv4.
As no ip_dev_find for ipv6, this patch is to use ipv6_chk_addr to check
if the saddr is in a dev instead.
Note that for backwards compatibility, it will still do the addr_match_len
check here when no optimal is found.
Reported-by: Patrick Talbert <ptalbert@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The register array offset for clearing an interrupt is calculated by:
offset = (hwirq - RESERVED_IRQ_PER_MBIGEN_CHIP) / 32;
This is wrong because the clear register array includes the reserved
interrupts. So the clear operation ends up in the wrong register.
This went unnoticed so far, because the hardware clears the real bit
through a timeout mechanism when the hardware is configured in debug
mode. That debug mode was enabled on early generations of the hardware, so
the problem was papered over.
On newer hardware with updated firmware the debug mode was disabled, so the
bits did not get cleared which causes the system to malfunction.
Remove the subtraction of RESERVED_IRQ_PER_MBIGEN_CHIP, so the correct
register is accessed.
[ tglx: Rewrote changelog ]
Fixes: a6c2f87b88 ("irqchip/mbigen: Implement the mbigen irq chip operation functions")
Signed-off-by: MaJun <majun258@huawei.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: linuxarm@huawei.com
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Link: http://lkml.kernel.org/r/1494561328-39514-4-git-send-email-guohanjun@huawei.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit e91aa8e6ec ("KVM: PPC: Enable IOMMU_API for KVM_BOOK3S_64
permanently", 2017-03-22) enabled the SPAPR TCE code for all 64-bit
Book 3S kernel configurations in order to simplify the code and
reduce #ifdefs. However, 64-bit Book 3S PPC platforms other than
pseries and powernv don't implement the necessary IOMMU callbacks,
leading to build failures like the following (for a pasemi config):
scripts/kconfig/conf --silentoldconfig Kconfig
warning: (KVM_BOOK3S_64) selects SPAPR_TCE_IOMMU which has unmet direct dependencies (IOMMU_SUPPORT && (PPC_POWERNV || PPC_PSERIES))
...
CC [M] arch/powerpc/kvm/book3s_64_vio.o
/home/paulus/kernel/kvm/arch/powerpc/kvm/book3s_64_vio.c: In function ‘kvmppc_clear_tce’:
/home/paulus/kernel/kvm/arch/powerpc/kvm/book3s_64_vio.c:363:2: error: implicit declaration of function ‘iommu_tce_xchg’ [-Werror=implicit-function-declaration]
iommu_tce_xchg(tbl, entry, &hpa, &dir);
^
To fix this, we make the inclusion of the SPAPR TCE support, and the
code that uses it in book3s_vio.c and book3s_vio_hv.c, depend on
the inclusion of support for the pseries and/or powernv platforms.
This means that when running a 'pseries' guest on those platforms,
the guest won't have in-kernel acceleration of the PAPR TCE hypercalls,
but at least now they compile.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Currently, sync of raid456 array cannot make progress when hitting
data in writeback r5cache.
This patch fixes this issue by flushing cached data of the stripe
before processing the sync request. This is achived by:
1. In handle_stripe(), do not set STRIPE_SYNCING if the stripe is
in write back cache;
2. In r5c_try_caching_write(), handle the stripe in sync with write
through;
3. In do_release_stripe(), make stripe in sync write out and send
it to the state machine.
Shaohua: explictly set STRIPE_HANDLE after write out completed
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
For the raid456 with writeback cache, when journal device failed during
normal operation, it is still possible to persist all data, as all
pending data is still in stripe cache. However, it is necessary to handle
journal failure gracefully.
During journal failures, the following logic handles the graceful shutdown
of journal:
1. raid5_error() marks the device as Faulty and schedules async work
log->disable_writeback_work;
2. In disable_writeback_work (r5c_disable_writeback_async), the mddev is
suspended, set to write through, and then resumed. mddev_suspend()
flushes all cached stripes;
3. All cached stripes need to be flushed carefully to the RAID array.
This patch fixes issues within the process above:
1. In r5c_update_on_rdev_error() schedule disable_writeback_work for
journal failures;
2. In r5c_disable_writeback_async(), wait for MD_SB_CHANGE_PENDING,
since raid5_error() updates superblock.
3. In handle_stripe(), allow stripes with data in journal (s.injournal > 0)
to make progress during log_failed;
4. In delay_towrite(), if log failed only process data in the cache (skip
new writes in dev->towrite);
5. In __get_priority_stripe(), process loprio_list during journal device
failures.
6. In raid5_remove_disk(), wait for all cached stripes are flushed before
calling log_exit().
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
The PR KVM implementation of the PAPR HPT hypercalls (H_ENTER etc.)
access an image of the HPT in userspace memory using copy_from_user
and copy_to_user. Recently, the declarations of those functions were
annotated to indicate that the return value must be checked. Since
this code doesn't currently check the return value, this causes
compile warnings like the ones shown below, and since on PPC the
default is to compile arch/powerpc with -Werror, this causes the
build to fail.
To fix this, we check the return values, and if non-zero, fail the
hypercall being processed with a H_FUNCTION error return value.
There is really no good error return value to use since PAPR didn't
envisage the possibility that the hypervisor may not be able to access
the guest's HPT, and H_FUNCTION (function not supported) seems as
good as any.
The typical compile warnings look like this:
CC arch/powerpc/kvm/book3s_pr_papr.o
/home/paulus/kernel/kvm/arch/powerpc/kvm/book3s_pr_papr.c: In function ‘kvmppc_h_pr_enter’:
/home/paulus/kernel/kvm/arch/powerpc/kvm/book3s_pr_papr.c:53:2: error: ignoring return value of ‘copy_from_user’, declared with attribute warn_unused_result [-Werror=unused-result]
copy_from_user(pteg, (void __user *)pteg_addr, sizeof(pteg));
^
/home/paulus/kernel/kvm/arch/powerpc/kvm/book3s_pr_papr.c:74:2: error: ignoring return value of ‘copy_to_user’, declared with attribute warn_unused_result [-Werror=unused-result]
copy_to_user((void __user *)pteg_addr, hpte, HPTE_SIZE);
^
... etc.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
POWER9 running a radix guest will take some hypervisor interrupts
without going to real mode (turning off the MMU). This means that
early hypercall handlers may now be called in virtual mode. Most of
the handlers work just fine in both modes, but there are some that
can crash the host if called in virtual mode, notably the TCE (IOMMU)
hypercalls H_PUT_TCE, H_STUFF_TCE and H_PUT_TCE_INDIRECT. These
already have both a real-mode and a virtual-mode version, so we
arrange for the real-mode version to return H_TOO_HARD for radix
guests, which will result in the virtual-mode version being called.
The other hypercall which is sensitive to the MMU mode is H_RANDOM.
It doesn't have a virtual-mode version, so this adds code to enable
it to be called in either mode.
An alternative solution was considered which would refuse to call any
of the early hypercall handlers when doing a virtual-mode exit from a
radix guest. However, the XICS-on-XIVE code depends on the XICS
hypercalls being handled early even for virtual-mode exits, because
the handlers need to be called before the XIVE vCPU state has been
pulled off the hardware. Therefore that solution would have become
quite invasive and complicated, and was rejected in favour of the
simpler, though less elegant, solution presented here.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
If the list search in sg_get_rq_mark() fails to find a valid request, we
return a bogus element. This then can later lead to a GPF in
sg_remove_scat().
So don't return bogus Sg_requests in sg_get_rq_mark() but NULL in case
the list search doesn't find a valid request.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Doug Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Doug Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For a zoned block device, sd_zbc_complete() handles zone write unlock on
completion of a REQ_OP_WRITE_ZEROES command but the zone write locking
is missing from sd_setup_write_zeroes_cmnd(). This patch fixes this
problem by locking the target zone of a REQ_OP_WRITE_ZEROES request.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi_io_init() may fail, leaving a zone of a zoned block device locked.
Fix this by properly unlocking the write same request target zone if
scsi_io_init() fails.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The API convention makes it that a given MDIO bus reset should be able
to access PHY devices in its reset() callback and perform additional
MDIO accesses in order to bring the bus and PHYs in a working state.
Commit 69226896ad ("mdio_bus: Issue GPIO RESET to PHYs.") broke that
contract by first calling bus->reset() and then release all PHYs from
reset using their shared GPIO line, so restore the expected
functionality here.
Fixes: 69226896ad ("mdio_bus: Issue GPIO RESET to PHYs.")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We must accumulate into reg->aux_off rather than use a plain assignment.
Add a test for this situation to test_align.
Reported-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The macro tipc_wait_for_cond() is embedding the macro sk_wait_event()
to fulfil its task. The latter, in turn, is evaluating the stated
condition outside the socket lock context. This is problematic if
the condition is accessing non-trivial data structures which may be
altered by incoming interrupts, as is the case with the cong_links()
linked list, used by socket to keep track of the current set of
congested links. We sometimes see crashes when this list is accessed
by a condition function at the same time as a SOCK_WAKEUP interrupt
is removing an element from the list.
We fix this by expanding selected parts of sk_wait_event() into the
outer macro, while ensuring that all evaluations of a given condition
are performed under socket lock protection.
Fixes: commit 365ad353c2 ("tipc: reduce risk of user starvation during link congestion")
Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shahid Habib noticed that when xdp1 was killed from a different console the xdp
program was not cleaned-up properly in the kernel and it continued to forward
traffic.
Most of the applications in samples/bpf cleanup properly, but only when getting
SIGINT. Since kill defaults to using SIGTERM, add support to cleanup when the
application receives either SIGINT or SIGTERM.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Reported-by: Shahid Habib <shahid.habib@broadcom.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The error status err is initialized as zero and then being checked
several times to see if it is less than zero even when it has not
been updated. It may seem that the err should be assigned to the
return code of the call to the various *offload_en_set calls and
then we check for failure, however, these functions are void and
never actually return any status.
Since these error checks are redundant we can remove these
as well as err and the error exit label err_exit.
Detected by CoverityScan, CID#1398313 and CID#1398306 ("Logically
dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Acked-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra says:
====================
qlcnic: Bug fix and update version
This series has one fix and bumps up driver version.
Please consider applying to "net"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently driver returns error on speed configurations
for 83xx adapter's non XGBE ports, due to this link doesn't
come up on the ports using 1000Base-T as a connector with
autoneg disabled. This patch fixes this with initializing
appropriate port type based on queried module/connector
types from hardware before any speed/autoneg configuration.
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unavoidable crashes in netfront_resume() and netback_changed() after a
previous fail in talk_to_netback() (e.g. when we fail to read MAC from
xenstore) were discovered. The failure path in talk_to_netback() does
unregister/free for netdev but we don't reset drvdata and we try accessing
it after resume.
Fix the bug by removing the whole xen device completely with
device_unregister(), this guarantees we won't have any calls into netfront
after a failure.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 59cc1f61f0 ("net: sched: convert qdisc linked list to
hashtable") we missed the opportunity to considerably speed up
tc_dump_tclass_root() if a qdisc handle is provided by user.
Instead of iterating all the qdiscs, use qdisc_match_from_root()
to directly get the one we look for.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a bug in splitting an SKB during SACK
processing. Specifically if an skb contains multiple
packets and is only partially sacked in the higher sequences,
tcp_match_sack_to_skb() splits the skb and marks the second fragment
as SACKed.
The current code further attempts rounding up the first fragment
to MSS boundaries. But it misses a boundary condition when the
rounded-up fragment size (pkt_len) is exactly skb size. Spliting
such an skb is pointless and causses a kernel warning and aborts
the SACK processing. This patch universally checks such over-split
before calling tcp_fragment to prevent these unnecessary warnings.
Fixes: adb92db857 ("tcp: Make SACK code to split only at mss boundaries")
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I should have known that lowering skb->truesize was dangerous :/
In case packets are not leaving the host via a standard Ethernet device,
but looped back to local sockets, bad things can happen, as reported
by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 )
So instead of tweaking skb->truesize, lets change skb->destructor
and keep a reference on the owner socket via its sk_refcnt.
Fixes: f2f872f927 ("netem: Introduce skb_orphan_partial() helper")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Michael Madsen <mkm@nabto.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
Two generic xdp related follow-ups
Two follow-ups for the generic XDP API, would be great if
both could still be considered, since the XDP API is not
frozen yet. For details please see individual patches.
v1 -> v2:
- Implemented feedback from Jakub Kicinski (reusing
attribute on dump), thanks!
- Rest as is.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
While working on the iproute2 generic XDP frontend, I noticed that
as of right now it's possible to have native *and* generic XDP
programs loaded both at the same time for the case when a driver
supports native XDP.
The intended model for generic XDP from b5cdae3291 ("net: Generic
XDP") is, however, that only one out of the two can be present at
once which is also indicated as such in the XDP netlink dump part.
The main rationale for generic XDP is to ease accessibility (in
case a driver does not yet have XDP support) and to generically
provide a semantical model as an example for driver developers
wanting to add XDP support. The generic XDP option for an XDP
aware driver can still be useful for comparing and testing both
implementations.
However, it is not intended to have a second XDP processing stage
or layer with exactly the same functionality of the first native
stage. Only reason could be to have a partial fallback for future
XDP features that are not supported yet in the native implementation
and we probably also shouldn't strive for such fallback and instead
encourage native feature support in the first place. Given there's
currently no such fallback issue or use case, lets not go there yet
if we don't need to.
Therefore, change semantics for loading XDP and bail out if the
user tries to load a generic XDP program when a native one is
present and vice versa. Another alternative to bailing out would
be to handle the transition from one flavor to another gracefully,
but that would require to bring the device down, exchange both
types of programs, and bring it up again in order to avoid a tiny
window where a packet could hit both hooks. Given this complicates
the logic for just a debugging feature in the native case, I went
with the simpler variant.
For the dump, remove IFLA_XDP_FLAGS that was added with b5cdae3291
and reuse IFLA_XDP_ATTACHED for indicating the mode. Dumping all
or just a subset of flags that were used for loading the XDP prog
is suboptimal in the long run since not all flags are useful for
dumping and if we start to reuse the same flag definitions for
load and dump, then we'll waste bit space. What we really just
want is to dump the mode for now.
Current IFLA_XDP_ATTACHED semantics are: nothing was installed (0),
a program is running at the native driver layer (1). Thus, add a
mode that says that a program is running at generic XDP layer (2).
Applications will handle this fine in that older binaries will
just indicate that something is attached at XDP layer, effectively
this is similar to IFLA_XDP_FLAGS attr that we would have had
modulo the redundancy.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit b5cdae3291 ("net: Generic XDP") we automatically fall
back to a generic XDP variant if the driver does not support native
XDP. Allow for an option where the user can specify that always the
native XDP variant should be selected and in case it's not supported
by a driver, just bail out.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we add bios to block plugging list, locking is unnecessry, since the block
unplug is guaranteed not to run at that time.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
We do not want to use the architecture's type.h header when
building BPF programs which are always 64-bit.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller says:
====================
bpf: Add alignment tracker to verifier.
First we add the alignment tracking logic to the verifier.
Next, we work on building up infrastructure to facilitate regression
testing of this facility.
Finally, we add the "test_align" test case.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows a test case to load a BPF program and unconditionally
acquire the verifier log.
It also allows specification of the strict alignment flag.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Add a new field, "prog_flags", and an initial flag value
BPF_F_STRICT_ALIGNMENT.
When set, the verifier will enforce strict pointer alignment
regardless of the setting of CONFIG_EFFICIENT_UNALIGNED_ACCESS.
The verifier, in this mode, will also use a fixed value of "2" in
place of NET_IP_ALIGN.
This facilitates test cases that will exercise and validate this part
of the verifier even when run on architectures where alignment doesn't
matter.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
If log_level > 1, do a state dump every instruction and emit it in
a more compact way (without a leading newline).
This will facilitate more sophisticated test cases which inspect the
verifier log for register state.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Currently if we add only constant values to pointers we can fully
validate the alignment, and properly check if we need to reject the
program on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS architectures.
However, once an unknown value is introduced we only allow byte sized
memory accesses which is too restrictive.
Add logic to track the known minimum alignment of register values,
and propagate this state into registers containing pointers.
The most common paradigm that makes use of this new logic is computing
the transport header using the IP header length field. For example:
struct ethhdr *ep = skb->data;
struct iphdr *iph = (struct iphdr *) (ep + 1);
struct tcphdr *th;
...
n = iph->ihl;
th = ((void *)iph + (n * 4));
port = th->dest;
The existing code will reject the load of th->dest because it cannot
validate that the alignment is at least 2 once "n * 4" is added the
the packet pointer.
In the new code, the register holding "n * 4" will have a reg->min_align
value of 4, because any value multiplied by 4 will be at least 4 byte
aligned. (actually, the eBPF code emitted by the compiler in this case
is most likely to use a shift left by 2, but the end result is identical)
At the critical addition:
th = ((void *)iph + (n * 4));
The register holding 'th' will start with reg->off value of 14. The
pointer addition will transform that reg into something that looks like:
reg->aux_off = 14
reg->aux_off_align = 4
Next, the verifier will look at the th->dest load, and it will see
a load offset of 2, and first check:
if (reg->aux_off_align % size)
which will pass because aux_off_align is 4. reg_off will be computed:
reg_off = reg->off;
...
reg_off += reg->aux_off;
plus we have off==2, and it will thus check:
if ((NET_IP_ALIGN + reg_off + off) % size != 0)
which evaluates to:
if ((NET_IP_ALIGN + 14 + 2) % size != 0)
On strict alignment architectures, NET_IP_ALIGN is 2, thus:
if ((2 + 14 + 2) % size != 0)
which passes.
These pointer transformations and checks work regardless of whether
the constant offset or the variable with known alignment is added
first to the pointer register.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Shubham was recently asking on netdev why in arm64 JIT we don't multiply
the index for accessing the tail call map by 8. That led me into testing
out arm64 JIT wrt tail calls and it turned out I got a NULL pointer
dereference on the tail call.
The buggy access is at:
prog = array->ptrs[index];
if (prog == NULL)
goto out;
[...]
00000060: d2800e0a mov x10, #0x70 // #112
00000064: f86a682a ldr x10, [x1,x10]
00000068: f862694b ldr x11, [x10,x2]
0000006c: b40000ab cbz x11, 0x00000080
[...]
The code triggering the crash is f862694b. x1 at the time contains the
address of the bpf array, x10 offsetof(struct bpf_array, ptrs). Meaning,
above we load the pointer to the program at map slot 0 into x10. x10
can then be NULL if the slot is not occupied, which we later on try to
access with a user given offset in x2 that is the map index.
Fix this by emitting the following instead:
[...]
00000060: d2800e0a mov x10, #0x70 // #112
00000064: 8b0a002a add x10, x1, x10
00000068: d37df04b lsl x11, x2, #3
0000006c: f86b694b ldr x11, [x10,x11]
00000070: b40000ab cbz x11, 0x00000084
[...]
This basically adds the offset to ptrs to the base address of the bpf
array we got and we later on access the map with an index * 8 offset
relative to that. The tail call map itself is basically one large area
with meta data at the head followed by the array of prog pointers.
This makes tail calls working again, tested on Cavium ThunderX ARMv8.
Fixes: ddb55992b0 ("arm64: bpf: implement bpf_tail_call() helper")
Reported-by: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix error path while dma open channel issue. Also, no need to check output
on NULL if it's never returned.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann says:
====================
s390/net fixes
some qeth fixes for -net, the OSM/OSN one being the most crucial.
Please also queue these up for stable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
commit b4d72c08b3 ("qeth: bridgeport support - basic control")
broke the support for OSM and OSN devices as follows:
As OSM and OSN are L2 only, qeth_core_probe_device() does an early
setup by loading the l2 discipline and calling qeth_l2_probe_device().
In this context, adding the l2-specific bridgeport sysfs attributes
via qeth_l2_create_device_attributes() hits a BUG_ON in fs/sysfs/group.c,
since the basic sysfs infrastructure for the device hasn't been
established yet.
Note that OSN actually has its own unique sysfs attributes
(qeth_osn_devtype), so the additional attributes shouldn't be created
at all.
For OSM, add a new qeth_l2_devtype that contains all the common
and l2-specific sysfs attributes.
When qeth_core_probe_device() does early setup for OSM or OSN, assign
the corresponding devtype so that the ccwgroup probe code creates the
full set of sysfs attributes.
This allows us to skip qeth_l2_create_device_attributes() in case
of an early setup.
Any device that can't do early setup will initially have only the
generic sysfs attributes, and when it's probed later
qeth_l2_probe_device() adds the l2-specific attributes.
If an early-setup device is removed (by calling ccwgroup_ungroup()),
device_unregister() will - using the devtype - delete the
l2-specific attributes before qeth_l2_remove_device() is called.
So make sure to not remove them twice.
What complicates the issue is that qeth_l2_probe_device() and
qeth_l2_remove_device() is also called on a device when its
layer2 attribute changes (ie. its layer mode is switched).
For early-setup devices this wouldn't work properly - we wouldn't
remove the l2-specific attributes when switching to L3.
But switching the layer mode doesn't actually make any sense;
we already decided that the device can only operate in L2!
So just refuse to switch the layer mode on such devices. Note that
OSN doesn't have a layer2 attribute, so we only need to special-case
OSM.
Based on an initial patch by Ursula Braun.
Fixes: b4d72c08b3 ("qeth: bridgeport support - basic control")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When setting up the device from within the layer discipline's
probe routine, creating the layer-specific sysfs attributes can fail.
Report this error back to the caller, and handle it by
releasing the layer discipline.
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
[jwi: updated commit msg, moved an OSN change to a subsequent patch]
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a potential unnecessary refcount decrement on error path of
put_device(&pb->mii_bus->dev), as it is possible to avoid the
of_mdio_find_bus() call if mux_bus is specified by the calling function.
The same put_device() is not called in the error path if the
devm_kzalloc of pb fails. This caused the variable used in the
put_device() to be changed, as the pb pointer was obviously not set up.
There is an unnecessary of_node_get() on child_bus_node if the
of_mdiobus_register() is successful, as the
for_each_available_child_of_node() automatically increments this.
Thus the refcount on this node will always be +1 more than it should be.
There is no of_node_put() on child_bus_node if the of_mdiobus_register()
call fails.
Finally, it is lacking devm_kfree() of pb in the error path. While this
might not be technically necessary, it was present in other parts of the
function. So, I am adding it where necessary to make it uniform.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Fixes: f20e6657a8 ("mdio: mux: Enhanced MDIO mux framework for integrated multiplexers")
Fixes: 0ca2997d14 ("netdev/of/phy: Add MDIO bus multiplexer support.")
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently rcode is being initialized to NX_RCODE_SUCCESS and later it
is checked to see if it is not NX_RCODE_SUCCESS which is never true. It
appears that there is an unintentional missing assignment of rcode from
the return of the call to netxen_issue_cmd() that was dropped in
an earlier fix, so add it in.
Detected by CoverityScan, CID#401900 ("Logically dead code")
Fixes: 2dcd5d95ad ("netxen_nic: fix cdrp race condition")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The qca_spi driver causes alignment issues on ARM devices.
So fix this by using netdev_alloc_skb_ip_align().
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: 291ab06ecf ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
Signed-off-by: David S. Miller <davem@davemloft.net>
The current codes only deal with the case that the skb is dropped, it
may meet one use-after-free issue when NF_HOOK returns 0 that means
the skb is stolen by one netfilter rule or hook.
When one netfilter rule or hook stoles the skb and return NF_STOLEN,
it means the skb is taken by the rule, and other modules should not
touch this skb ever. Maybe the skb is queued or freed directly by the
rule.
Now uses the nf_hook instead of NF_HOOK to get the result of netfilter,
and check the return value of nf_hook. Only when its value equals 1, it
means the skb could go ahead. Or reset the skb as NULL.
BTW, because vrf_rcv_finish is empty function, so needn't invoke it
even though nf_hook returns 1. But we need to modify vrf_rcv_finish
to deal with the NF_STOLEN case.
There are two cases when skb is stolen.
1. The skb is stolen and freed directly.
There is nothing we need to do, and vrf_rcv_finish isn't invoked.
2. The skb is queued and reinjected again.
The vrf_rcv_finish would be invoked as okfn, so need to free the
skb in it.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When killing kref_sub(), the unconditional additional kref_get()
was not properly paired with the necessary kref_put(), causing
a leak of struct drbd_requests (~ 224 Bytes) per submitted bio,
and breaking DRBD in general, as the destructor of those "drbd_requests"
does more than just the mempoll_free().
Fixes: bdfafc4ffd ("locking/atomic, kref: Kill kref_sub()")
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: stable@vger.kernel.org # v4.11
Signed-off-by: Jens Axboe <axboe@fb.com>
A change to function pointers that was meant to address a sparse warning
turned out to cause hundreds of new gcc-7 warnings:
include/linux/of_irq.h:11:13: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]
drivers/of/of_reserved_mem.c: In function '__reserved_mem_init_node':
drivers/of/of_reserved_mem.c:200:7: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]
int const (*initfn)(struct reserved_mem *rmem) = i->data;
Turns out the sparse warnings were spurious and have been fixed in
upstream sparse since 0.5.0 in commit "sparse: treat function pointers
as pointers to const data".
This partially reverts commit 17a70355ea.
Fixes: 17a70355ea ("of: fix sparse warnings in fdt, irq, reserved mem, and resolver code")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Herring <robh@kernel.org>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The ELECOM DEFT trackballs report only five buttons, when the device
actually has 8. Change the descriptor so that the HID driver can see all of
them.
For completeness and future reference, I included a side-by-side diff of
the part of the descriptor that is being edited.
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Yuxuan Shui <yshuiv7@gmail.com>
Signed-off-by: Diego Elio Pettenò <flameeyes@flameeyes.eu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
For the moment, the tee subsystem only makes sense in combination with
the op-tee driver that depends on ARM_SMCCC, so let's hide the subsystem
from users that can't select that.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This is a dependency for the following fix
* tee/initial-merge:
arm64: dt: hikey: Add optee node
Documentation: tee subsystem and op-tee driver
tee: add OP-TEE driver
tee: generic TEE subsystem
dt/bindings: add bindings for optee
In r5l_do_submit_io(), it is necessary to check io->split_bio before
submit io->current_bio. This is because, endio of current_bio may
free the whole IO unit, and thus change io->split_bio.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Merge "mvebu arm64 for 4.12" from Gregory CLEMENT:
enable the Armada 37xx pinctrl driver
* tag 'mvebu-arm64-4.12-1' of git://git.infradead.org/linux-mvebu:
arm64: marvell: enable the Armada 37xx pinctrl driver
Pull "mvebu dt64 for 4.12 (part 3)" from Gregory CLEMENT:
pinctrl and GPIO description for Armada 37xx SoCs
* tag 'mvebu-dt64-4.12-3' of git://git.infradead.org/linux-mvebu:
ARM64: dts: marvell: armada37xx: add pinctrl definition
ARM64: dts: marvell: Add pinctrl nodes for Armada 3700
Pull "Renesas ARM Based SoC Fixes for v4.12" from Simon Horman:
* Provide dummy rcar_rst_read_mode_pins() for compile-testing
* tag 'renesas-fixes-for-v4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
soc: renesas: Provide dummy rcar_rst_read_mode_pins() for compile-testing
It's no need to switch vgpu if next vgpu is the same with current
vgpu, otherwise it will make performance drop in some case.
v2: correct the comments.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
When processing responses, and in particular freeing mids (DeleteMidQEntry),
which is very important since it also frees the associated buffers (cifs_buf_release),
we can block a long time if (writes to) socket is slow due to low memory or networking
issues.
We can block in send (smb request) waiting for memory, and be blocked in processing
responess (which could free memory if we let it) - since they both grab the
server->srv_mutex.
In practice, in the DeleteMidQEntry case - there is no reason we need to
grab the srv_mutex so remove these around DeleteMidQEntry, and it allows
us to free memory faster.
Signed-off-by: Steve French <steve.french@primarydata.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
cifs_relock_file() can perform a down_write() on the inode's lock_sem even
though it was already performed in cifs_strict_readv(). Lockdep complains
about this. AFAICS, there is no problem here, and lockdep just needs to be
told that this nesting is OK.
=============================================
[ INFO: possible recursive locking detected ]
4.11.0+ #20 Not tainted
---------------------------------------------
cat/701 is trying to acquire lock:
(&cifsi->lock_sem){++++.+}, at: cifs_reopen_file+0x7a7/0xc00
but task is already holding lock:
(&cifsi->lock_sem){++++.+}, at: cifs_strict_readv+0x177/0x310
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&cifsi->lock_sem);
lock(&cifsi->lock_sem);
*** DEADLOCK ***
May be due to missing lock nesting notation
1 lock held by cat/701:
#0: (&cifsi->lock_sem){++++.+}, at: cifs_strict_readv+0x177/0x310
stack backtrace:
CPU: 0 PID: 701 Comm: cat Not tainted 4.11.0+ #20
Call Trace:
dump_stack+0x85/0xc2
__lock_acquire+0x17dd/0x2260
? trace_hardirqs_on_thunk+0x1a/0x1c
? preempt_schedule_irq+0x6b/0x80
lock_acquire+0xcc/0x260
? lock_acquire+0xcc/0x260
? cifs_reopen_file+0x7a7/0xc00
down_read+0x2d/0x70
? cifs_reopen_file+0x7a7/0xc00
cifs_reopen_file+0x7a7/0xc00
? printk+0x43/0x4b
cifs_readpage_worker+0x327/0x8a0
cifs_readpage+0x8c/0x2a0
generic_file_read_iter+0x692/0xd00
cifs_strict_readv+0x29f/0x310
generic_file_splice_read+0x11c/0x1c0
do_splice_to+0xa5/0xc0
splice_direct_to_actor+0xfa/0x350
? generic_pipe_buf_nosteal+0x10/0x10
do_splice_direct+0xb5/0xe0
do_sendfile+0x278/0x3a0
SyS_sendfile64+0xc4/0xe0
entry_SYSCALL_64_fastpath+0x1f/0xbe
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Pull "DTS fixes" from Matthias Brugger:
- mt8173: fix mmc parameters
- set timer frequency explicitly and force the driver to set it
* tag 'v4.11-next-fixes' of https://github.com/mbgg/linux-mediatek:
ARM64: dts: mediatek: configure some fixed mmc parameters
arm: dts: mt7623: add clock-frequency to the a7 timer node to mt7623.dtsi
When CONFIG_PM is disabled, we get a build error:
arch/arm/mach-omap2/omap-smp.c: In function 'omap4_smp_maybe_reset_cpu1':
arch/arm/mach-omap2/omap-smp.c:309:20: error: implicit declaration of function 'omap4_get_cpu1_ns_pa_addr'; did you mean 'omap4_get_scu_base'? [-Werror=implicit-function-declaration]
We need to fix this in multiple files, to ensure the declaration is visible,
to actually build the function without CONFIG_PM, and to only call it
when OMAP4 and/or OMAP5 are enabled.
Fixes: 351b7c4907 ("ARM: omap2+: Revert omap-smp.c changes resetting CPU1 during boot")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tony Lindgren <tony@atomide.com>
We need to tell the driver what the timers frequency is and that the core
has not be configured by the bootrom. Not doing so makes the unit not
boot.
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
The perf tool assumes that kernel symbols are never present at address
zero. In fact it assumes if functions that map symbols to addresses
return zero, that the symbol was not found.
Given that s390's _text symbol historically is located at address zero
this yields at least a couple of false errors and warnings in one of
perf's test cases about not present symbols ("perf test 1").
To fix this simply move the _text symbol to address 0x200, just behind
the initial psw and channel program located at the beginning of the
kernel image. This is now hard coded within the linker script.
I tried a nicer solution which moves the initial psw and channel
program into an own section. However that would move the symbols
within the "real" head.text section to different addresses, since the
".org" statements within head.S are relative to the head.text
section. If there is a new section in front, everything else will be
moved. Alternatively I could have adjusted all ".org" statements. But
this current solution seems to be the easiest one, since nobody really
cares where the _text symbol is actually located.
Reported-by: Zvonko Kosic <zkosic@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Avoid false positive warnings like this with gcc 7.1:
drivers/s390/cio/qdio_debug.h:63:4:
note: 'snprintf' output between 8 and 17 bytes into a destination of size 16
snprintf(debug_buffer, QDIO_DBF_LEN, text);
and simply increase the size of the string buffer.
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Avoid false positive warnings like this with gcc 7.1:
drivers/s390/cio/ccwgroup.c:41:21:
warning: '%d' directive writing between 1 and 10 bytes into a region of size 4
sprintf(str, "cdev%d", i);
and simply increase the size of the string buffer.
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes a couple of compile warnings with gcc 7.1.0 :
arch/s390/kernel/sysinfo.c:578:20:
note: directive argument in the range [-2147483648, 4]
sprintf(link_to, "15_1_%d", topology_mnest_limit());
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The average string that is copied from user space to kernel space is
rather short. E.g. booting a system involves about 50.000
strncpy_from_user() calls where the NULL terminated string has an
average size of 27 bytes.
By default our s390 specific strncpy_from_user() implementation
however copies up to 4096 bytes, which is a waste of cpu cycles and
cache lines. Therefore reduce the default length to L1_CACHE_BYTES
(256 bytes), which also reduces the average execution time of
strncpy_from_user() by 30-40%.
Alternatively we could have switched to the generic
strncpy_from_user() implementation, however it turned out that that
variant would be slower than the now optimized s390 variant.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There are complaints that raid0 discard handling is slow. Currently we
divide discard request into chunks and dispatch to underlayer disks. The
block layer will do merge to form big requests. This causes a lot of
request split/merge and uses significant CPU time.
A simple idea is to calculate the range for each raid disk for an IO
request and send a discard request to raid disks, which will avoid the
split/merge completely. Previously Coly tried the approach, but the
implementation was too complex because of raid0 zones. This patch always
split bio in zone boundary and handle bio within one zone. It simplifies
the implementation a lot.
Reviewed-by: NeilBrown <neilb@suse.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Shaohua Li <shli@fb.com>
The 2nd check to see if request_size is less than zero is redundant
because the first check takes error exit path on this condition. So,
since it is redundant, remove it.
Detected by CoverityScan, CID#146149 ("Logically Dead Code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
I believe there is a typo on the wq destroy of els_wq, currently the
driver is checking if els_cq is not null and I think this should be a
check on els_wq instead.
Detected by CoverityScan, CID#1411629 ("Copy-paste error")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver now uses IRQ_POLL and needs to select it to avoid the
following build error.
ERROR: ".irq_poll_complete" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
ERROR: ".irq_poll_sched" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
ERROR: ".irq_poll_disable" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
ERROR: ".irq_poll_init" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
Fixes: cba06e6de4 ("scsi: cxlflash: Implement IRQ polling for RRQ processing")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Using memcpy() from a string that is shorter than the length copied
means the destination buffer is being filled with arbitrary data from
the kernel rodata segment. Instead, use strncpy() which will fill the
trailing bytes with zeros.
This was found with the future CONFIG_FORTIFY_SOURCE feature.
Cc: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We store sc_cmd->cmnd[0] which is an unsigned char in io_log->op so
this should also be unsigned char. The other thing is that this is
displayed in the debugfs:
seq_printf(s, "0x%02x:", io_log->op);
Smatch complains that the formatting won't work for negative values so
changing it to unsigned silences that warning as well.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The open-osd domain doesn't exist anymore, and mails to the list lead to
really annoying bounced that repeat every day.
Also the primarydata address for Benny bounces, and while I have a new
one for him he doesn't seem to be maintaining the OSD code any more.
Which beggs the question: should we really leave the Supported status in
MAINTAINERS given that the code is barely maintained?
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Boaz Harrosh <ooo@electrozaur.com>
Acked-by: Benny Halevy <bhalevy@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a device is unplugged from a SCSI controller, if the scsi_device is
still in use by application layer, it won't get released until users
close it.
In this case, scsi_device_remove just set the scsi_device's state to be
SDEV_DEL. But if you plug the disk just before the old scsi_device is
released, then there will be two scsi_device structures in
scsi_host->__devices. When the next unplug event happens, some low-level
drivers will check whether the scsi_device has been added to host (for
example the MegaRAID SAS series controller) by calling
scsi_device_lookup(call __scsi_device_lookup) in function
megasas_aen_polling. __scsi_device_lookup will return the first
scsi_device. Because its state is SDEV_DEL, the scsi_device_lookup will
return NULL, making the low-level driver assume that the scsi_device has
been removed, and won't call scsi_device_remove which will lead to hot
swap failure.
Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com>
Tested-by: Zeng Rujia <ZengRujia@sangfor.com.cn>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195607
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
To select the appropriate shost template, the driver is issuing a
mailbox command to retrieve the wwn. Turns out the sending of the
command precedes the reset of the function. On SLI-4 adapters, this is
inconsequential as the mailbox command location is specified by dma via
the BMBX register. However, on SLI-3 adapters, the location of the
mailbox command submission area changes. When the function is first
powered on or reset, the cmd is submitted via PCI bar memory. Later the
driver changes the function config to use host memory and DMA. The
request to start a mailbox command is the same, a simple doorbell write,
regardless of submission area. So.. if there has not been a boot driver
run against the adapter, the mailbox command works as defaults are
ok. But, if the boot driver has configured the card and, and if no
platform pci function/slot reset occurs as the os starts, the mailbox
command will fail. The SLI-3 device will use the stale boot driver dma
location. This can cause PCI eeh errors.
Fix is to reset the sli-3 function before sending the mailbox command,
thus synchronizing the function/driver on mailbox location.
Note: The fix uses routines that are typically invoked later in the call
flow to reset the sli-3 device. The issue in using those routines is
that the normal (non-fix) flow does additional initialization, namely
the allocation of the pport structure. So, rather than significantly
reworking the initialization flow so that the pport is alloc'd first,
pointer checks are added to work around it. Checks are limited to the
routines invoked by a sli-3 adapter (s3 routines) as this fix/early call
is only invoked on a sli3 adapter. Nothing changes post the
fix. Subsequent initialization, and another adapter reset, still occur -
both on sli-3 and sli-4 adapters.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Fixes: 96418b5e2c ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
Cc: stable@vger.kernel.org # v4.11+
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the FCoE sending side becomes congested libfc tries to reduce the
queue depth on the host; however due to the built-in lag before
attempting to ramp down the queue depth _again_ the message log is
flooded with the following message:
libfc: queue full, reducing can_queue to 512
With this patch the message is printed only once (ie when it's
actually changed).
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "kmalloc_array".
This issue was detected by using the Coccinelle software.
* Replace the specification of a data type by a pointer dereference
to make the corresponding size determination a bit safer according to
the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
A string which did not contain a data format specification should be put
into a sequence. Thus use the corresponding function "seq_puts".
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
This essentially reverts commit b5470dc5fc ("md: resolve external
metadata handling deadlock in md_allow_write") with some adjustments.
Since commit 6791875e2e ("md: make reconfig_mutex optional for writes
to md sysfs files.") changing array_state to 'active' does not use
mddev_lock() and will not cause a deadlock with md_allow_write(). This
revert simplifies userspace tools that write to sysfs attributes like
"stripe_cache_size" or "consistency_policy" because it removes the need
for special handling for external metadata arrays, checking for EAGAIN
and retrying the write.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
We do not check if packet from real server is for NAT
connection before performing SNAT. This causes problems
for setups that use DR/TUN and allow local clients to
access the real server directly, for example:
- local client in director creates IPVS-DR/TUN connection
CIP->VIP and the request packets are routed to RIP.
Talks are finished but IPVS connection is not expired yet.
- second local client creates non-IPVS connection CIP->RIP
with same reply tuple RIP->CIP and when replies are received
on LOCAL_IN we wrongly assign them for the first client
connection because RIP->CIP matches the reply direction.
As result, IPVS SNATs replies for non-IPVS connections.
The problem is more visible to local UDP clients but in rare
cases it can happen also for TCP or remote clients when the
real server sends the reply traffic via the director.
So, better to be more precise for the reply traffic.
As replies are not expected for DR/TUN connections, better
to not touch them.
Reported-by: Nick Moriarty <nick.moriarty@york.ac.uk>
Tested-by: Nick Moriarty <nick.moriarty@york.ac.uk>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Needn't to restore the in-context MMIO when SCHEDULE_OUT. Sometimes
with restoring the in-context MMIO, some GPU hang can be observed. So
remove the in-context MMIO restore
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Upon NETDEV_DOWN event, all xfrm_state objects which are bound to
the device are flushed.
The condition for this is wrong, though, testing dev->hw_features
instead of dev->features. If a device has non-user-modifiable
NETIF_F_HW_ESP, then its xfrm_state objects are not flushed,
causing a crash later on after the device is deleted.
Check dev->features instead of dev->hw_features.
Fixes: d77e38e612 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The sadb_x_sec_len is stored in the unit 'byte divided by eight'.
So we have to multiply this value by eight before we can do
size checks. Otherwise we may get a slab-out-of-bounds when
we memcpy the user sec_ctx.
Fixes: df71837d50 ("[LSM-IPSec]: Security association restriction.")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
I left Samsung and lost access to most Exynos hardware and documentation.
Also, I likely won't be able to keep an eye on the platform anymore in the
short term so remove myself as a reviewer for Exynos.
Signed-off-by: Javier Martinez Canillas <javier@dowhile0.org>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
If the driver is built as a module, it won't be autloaded if the devices
are registered via OF code because the OF device table
entries are not exported as aliases
Before the patch:
$ modinfo drivers/iio/adc/sun4i-gpadc-iio.ko | grep alias
alias: platform:sun6i-a31-gpadc-iio
alias: platform:sun5i-a13-gpadc-iio
alias: platform:sun4i-a10-gpadc-iio
After the patch:
$ modinfo drivers/iio/adc/sun4i-gpadc-iio.ko | grep alias
alias: of:N*T*Callwinner,sun8i-a33-thsC*
alias: of:N*T*Callwinner,sun8i-a33-ths
alias: platform:sun6i-a31-gpadc-iio
alias: platform:sun5i-a13-gpadc-iio
alias: platform:sun4i-a10-gpadc-iio
Signed-off-by: Eduardo Molinas <edu.molinas@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
If the driver is built as a module, it won't be autloaded if the devices
are registered via PLATFORM code because the PLATFORM device table
entries are not exported as aliases
Before the patch:
$ modinfo drivers/iio/adc/sun4i-gpadc-iio.ko | grep alias
$
After the patch:
$ modinfo drivers/iio/adc/sun4i-gpadc-iio.ko | grep alias
alias: platform:sun6i-a31-gpadc-iio
alias: platform:sun5i-a13-gpadc-iio
alias: platform:sun4i-a10-gpadc-iio
Signed-off-by: Eduardo Molinas <edu.molinas@gmail.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Using iio_trigger_poll() can oops when multiple interrupts
happen before the first is handled.
Use iio_trigger_poll_chained() instead and use the timestamp
when processed, since it will be in theory be 2 ms max latency.
Fixes: 24ddb0e4bb ("iio: Add AS3935 lightning sensor support")
Cc: stable@vger.kernel.org
Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Currently, sugov_next_freq_shared() uses last_freq_update_time as a
reference to decide when to start considering CPU contributions as
stale.
However, since last_freq_update_time is set by the last CPU that issued
a frequency transition, this might cause problems in certain cases. In
practice, the detection of stale utilization values fails whenever the
CPU with such values was the last to update the policy. For example (and
please note again that the SCHED_CPUFREQ_RT flag is not the problem
here, but only the detection of after how much time that flag has to be
considered stale), suppose a policy with 2 CPUs:
CPU0 | CPU1
|
| RT task scheduled
| SCHED_CPUFREQ_RT is set
| CPU1->last_update = now
| freq transition to max
| last_freq_update_time = now
|
more than TICK_NSEC nsecs
|
a small CFS wakes up |
CPU0->last_update = now1 |
delta_ns(CPU0) < TICK_NSEC* |
CPU0's util is considered |
delta_ns(CPU1) = |
last_freq_update_time - |
CPU1->last_update = 0 |
< TICK_NSEC |
CPU1 is still considered |
CPU1->SCHED_CPUFREQ_RT is set |
we stay at max (until CPU1 |
exits from idle) |
* delta_ns is actually negative as now1 > last_freq_update_time
While last_freq_update_time is a sensible reference for rate limiting,
it doesn't seem to be useful for working around stale CPU states.
Fix the problem by always considering now (time) as the reference for
deciding when CPUs have stale contributions.
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If we bail out of the submit before actually adding the cmdstream
to the kernel ring there is no valid fence to put. Make sure to skip
the fence_put in that case.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The driver emits multi-touch events for Magic Trackpad as well as Magic
Mouse, but it does not set keybits that are related to multi-touch event
for Magic Mouse; so set these keybits.
The keybits that are not set cause trouble because user programs often
probe these keybits for self-configuration and thus they cannot operate
properly if the keybits are not set.
One of such troubles is that libevdev will not be able to emit correct
touch count, causing gestures library failed to do fling stop.
Signed-off-by: Che-Liang Chiou <clchiou@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The following Smatch complaint was generated in response to commit
2a6cdbd ("HID: wacom: Introduce new 'touch_input' device"):
drivers/hid/wacom_wac.c:1586 wacom_tpc_irq()
error: we previously assumed 'wacom->touch_input' could be null (see line 1577)
The 'touch_input' and 'pen_input' variables point to the 'struct input_dev'
used for relaying touch and pen events to userspace, respectively. If a
device does not have a touch interface or pen interface, the associated
input variable is NULL. The 'wacom_tpc_irq()' function is responsible for
forwarding input reports to a more-specific IRQ handler function. An
unknown report could theoretically be mistaken as e.g. a touch report
on a device which does not have a touch interface. This can be prevented
by only calling the pen/touch functions are called when the pen/touch
pointers are valid.
Fixes: 2a6cdbd ("HID: wacom: Introduce new 'touch_input' device")
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
On mainline, there is no functional difference, just less code, and
symmetric lock/unlock paths.
On PREEMPT_RT builds, this fixes the following warning, seen by
Alexander GQ Gerasiov, due to the sleeping nature of spinlocks.
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:993
in_atomic(): 0, irqs_disabled(): 1, pid: 58, name: kworker/u12:1
CPU: 5 PID: 58 Comm: kworker/u12:1 Tainted: G W 4.9.20-rt16-stand6-686 #1
Hardware name: Supermicro SYS-5027R-WRF/X9SRW-F, BIOS 3.2a 10/28/2015
Workqueue: writeback wb_workfn (flush-253:0)
Call Trace:
dump_stack+0x47/0x68
? migrate_enable+0x4a/0xf0
___might_sleep+0x101/0x180
rt_spin_lock+0x17/0x40
add_stripe_bio+0x4e3/0x6c0 [raid456]
? preempt_count_add+0x42/0xb0
raid5_make_request+0x737/0xdd0 [raid456]
Reported-by: Alexander GQ Gerasiov <gq@redlab-i.ru>
Tested-by: Alexander GQ Gerasiov <gq@redlab-i.ru>
Signed-off-by: Julia Cartwright <julia@ni.com>
Signed-off-by: Shaohua Li <shli@fb.com>
When CONFIG_XFRM_SUB_POLICY=y, xfrm_dst stores a copy of the flowi for
that dst. Unfortunately, the code that allocates and fills this copy
doesn't care about what type of flowi (flowi, flowi4, flowi6) gets
passed. In multiple code paths (from raw_sendmsg, from TCP when
replying to a FIN, in vxlan, geneve, and gre), the flowi that gets
passed to xfrm is actually an on-stack flowi4, so we end up reading
stuff from the stack past the end of the flowi4 struct.
Since xfrm_dst->origin isn't used anywhere following commit
ca116922af ("xfrm: Eliminate "fl" and "pol" args to
xfrm_bundle_ok()."), just get rid of it. xfrm_dst->partner isn't used
either, so get rid of that too.
Fixes: 9d6ec93801 ("ipv4: Use flowi4 in public route lookup interfaces.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Locally generated TCP packets are usually cloned, so we
do skb_cow_data() on this packets. After that we need to
reload the pointer to the esp header. On udpencap this
header has an offset to skb_transport_header, so take this
offset into account.
Fixes: 67d349ed60 ("net/esp4: Fix invalid esph pointer crash")
Fixes: fca11ebde3 ("esp4: Reorganize esp_output")
Reported-by: Don Bowman <db@donbowman.ca>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Fix the following compile error(s) if CONFIG_KPROBES is disabled:
arch/s390/kernel/uprobes.c:79:14:
error: implicit declaration of function 'probe_get_fixup_type'
arch/s390/kernel/uprobes.c:87:14:
error: 'FIXUP_PSW_NORMAL' undeclared (first use in this function)
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix this compile error if CONFIG_MODULES is disabled:
arch/s390/built-in.o: In function `ftrace_plt_init':
arch/s390/kernel/ftrace.o:(.init.text+0x34cc): undefined reference to `module_alloc'
Reported-by: Rob Landley <rob@landley.net>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
git commit c5328901aa "[S390] entry[64].S improvements" removed
the update of the exit_timer lowcore field from the critical section
cleanup of the .Lsysc_restore/.Lsysc_done and .Lio_restore/.Lio_done
blocks. If the PSW is updated by the critical section cleanup to point to
user space again, the interrupt entry code will do a vtime calculation
after the cleanup completed with an exit_timer value which has *not* been
updated. Due to this incorrect system time deltas are calculated.
If an interrupt occured with an old PSW between .Lsysc_restore/.Lsysc_done
or .Lio_restore/.Lio_done update __LC_EXIT_TIMER with the system entry
time of the interrupt.
Cc: stable@vger.kernel.org # 3.3+
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Start to populate the device tree of the Armada 37xx with the pincontrol
configuration used on the board providing a dts.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Add the nodes for the two pin controller present in the Armada 37xx SoCs.
Initially the node was named gpio1 using the same name that for the
register range in the datasheet. However renaming it pinctr_nb (nb for
North Bridge) makes more sens.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
If the R-Car RST driver is not included, compile-testing R-Car clock
drivers fails with a link error:
undefined reference to `rcar_rst_read_mode_pins'
To fix this, provide a dummy version. Use the exact same test logic as
in drivers/soc/renesas/Makefile, as there is no Kconfig symbol (yet) to
control compilation of the R-Car RST driver.
Fixes: 527c02f66d ("soc: renesas: Add R-Car RST driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
AS3935 interrupt mask has been incorrect so valid lightning events
would never trigger an buffer event. Also noise interrupt should be
BIT(0).
Fixes: 24ddb0e4bb ("iio: Add AS3935 lightning sensor support")
CC: stable@vger.kernel.org
Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Reading an ACPI table through the /sys/firmware/acpi/tables interface
more than 65,536 times leads to the following log message:
ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment
(20170119/tbutils-423)
...and the table being unavailable until the next reboot. Add the
missing acpi_put_table() so the table ->validation_count is decremented
after each read.
Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
Fixes: 174cc7187e "ACPICA: Tables: Back port acpi_get_table_with_size() ..."
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to the datasheet the RCO must be recalibrated
on every power-on-reset. Also remove mutex locking in the
calibration function since callers other than the probe
function (which doesn't need it) will have a lock.
Fixes: 24ddb0e4bb ("iio: Add AS3935 lightning sensor support")
Cc: George McCollister <george.mccollister@gmail.com>
Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
An alias name should have an index number even when it is the only of its type.
This allows U-Boot to add the local-mac-address property. Otherwise U-Boot
skips the alias.
Fixes: 6a93792774 ("ARM: bcm2835: dt: Add the ethernet to the device trees")
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by: Lubomir Rintel <lkundrak@v3.sk>
Reviewed-by: Eric Anholt <eric@anholt.net>
According to the BCM2835 ARM Peripherals document uart1 doesn't map to pins
36-39, but uart0 does.
Also, split into separate Rx/Tx and CST/RTS groups to match other uart nodes.
Fixes: 21ff843931 ("ARM: dts: bcm283x: Define standard pinctrl groups in the gpio node.")
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: Eric Anholt <eric@anholt.net>
According to the BCM2835 ARM Peripherals document i2c0 doesn't map to pins 32,
34 but to 28, 29.
Fixes: 21ff843931 ("ARM: dts: bcm283x: Define standard pinctrl groups in the gpio node.")
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: Eric Anholt <eric@anholt.net>
Downstream kernel uses pins 32, 33 as UART0 (PL011) Rx/Tx to communicate with
the Bluetooth chip. So ALT3 of these pins is most likely not CTS/RTS. Change
the node name to reflect that. This matches section 6.2 "Alternative Function
Assignments" in the BCM2835 ARM Peripherals document.
With this change in place, adding
&uart0 {
pinctrl-names = "default";
pinctrl-0 = <&uart0_gpio32 &gpclk2_gpio43>;
status = "okay";
};
to bcm2837-rpi-3-b.dts does the right thing on my Raspberry Pi 3.
Pins 30, 31 are CTS/RTS of UART0 in alternate function 3. Rename uart0_gpio30
as well.
While at it, fix a little typo in a nearby comment.
Fixes: 21ff843931 ("ARM: dts: bcm283x: Define standard pinctrl groups in the gpio node.")
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: Eric Anholt <eric@anholt.net>
The calculation of the framebuffer's start address was wrongly using
the CRTC's x and y position rather than the one of the source
framebuffer. To fix that we need to update the plane_check code to
call drm_plane_helper_check_state() to clip the src and dst coordinates.
While there so some minor cleanup of redundant freeing of
devm_alloc-ated memory.
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
2017-03-27 14:55:58 +01:00
1481 changed files with 16263 additions and 8926 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.