I've seen this trigger twice now, where the i915_gem_object_to_ggtt()
call in intel_unpin_fb_obj() returns NULL, resulting in an oops
immediately afterwards as the (inlined) call to i915_vma_unpin_fence()
tries to dereference it.
It seems to be some race condition where the object is going away at
shutdown time, since both times happened when shutting down the X
server. The call chains were different:
- VT ioctl(KDSETMODE, KD_TEXT):
intel_cleanup_plane_fb+0x5b/0xa0 [i915]
drm_atomic_helper_cleanup_planes+0x6f/0x90 [drm_kms_helper]
intel_atomic_commit_tail+0x749/0xfe0 [i915]
intel_atomic_commit+0x3cb/0x4f0 [i915]
drm_atomic_commit+0x4b/0x50 [drm]
restore_fbdev_mode+0x14c/0x2a0 [drm_kms_helper]
drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper]
drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper]
intel_fbdev_set_par+0x18/0x70 [i915]
fb_set_var+0x236/0x460
fbcon_blank+0x30f/0x350
do_unblank_screen+0xd2/0x1a0
vt_ioctl+0x507/0x12a0
tty_ioctl+0x355/0xc30
do_vfs_ioctl+0xa3/0x5e0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x13/0x94
- i915 unpin_work workqueue:
intel_unpin_work_fn+0x58/0x140 [i915]
process_one_work+0x1f1/0x480
worker_thread+0x48/0x4d0
kthread+0x101/0x140
and this patch purely papers over the issue by adding a NULL pointer
check and a WARN_ON_ONCE() to avoid the oops that would then generally
make the machine unresponsive.
Other callers of i915_gem_object_to_ggtt() seem to also check for the
returned pointer being NULL and warn about it, so this clearly has
happened before in other places.
[ Reported it originally to the i915 developers on Jan 8, applying the
ugly workaround on my own now after triggering the problem for the
second time with no feedback.
This is likely to be the same bug reported as
https://bugs.freedesktop.org/show_bug.cgi?id=98829https://bugs.freedesktop.org/show_bug.cgi?id=99134
which has a patch for the underlying problem, but it hasn't gotten to
me, so I'm applying the workaround. ]
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull two parisc fixes from Helge Deller:
"One fix to avoid usage of BITS_PER_LONG in user-space exported swab.h
header which breaks compiling qemu, and one trivial fix for printk
continuation in the parisc parport driver"
* 'parisc-4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Don't use BITS_PER_LONG in userspace-exported swab.h header
parisc, parport_gsc: Fixes for printk continuation lines
Pull i2c fixes from Wolfram Sang:
"Two I2C driver bugfixes.
The 'VLLS mode support' patch should have been entitled 'reconfigure
pinctrl after suspend' to make the bugfix more clear. Sorry, I missed
that, yet didn't want to rebase"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx-lpi2c: add VLLS mode support
i2c: i2c-cadence: Initialize configuration before probing devices
In swab.h the "#if BITS_PER_LONG > 32" breaks compiling userspace programs if
BITS_PER_LONG is #defined by userspace with the sizeof() compiler builtin.
Solve this problem by using __BITS_PER_LONG instead. Since we now
#include asm/bitsperlong.h avoid further potential userspace pollution
by moving the #define of SHIFT_PER_LONG to bitops.h which is not
exported to userspace.
This patch unbreaks compiling qemu on hppa/parisc.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>
Pull NFS client bugfixes from Trond Myklebust:
"Stable patches:
- NFSv4.1: Fix a deadlock in layoutget
- NFSv4 must not bump sequence ids on NFS4ERR_MOVED errors
- NFSv4 Fix a regression with OPEN EXCLUSIVE4 mode
- Fix a memory leak when removing the SUNRPC module
Bugfixes:
- Fix a reference leak in _pnfs_return_layout"
* tag 'nfs-for-4.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
pNFS: Fix a reference leak in _pnfs_return_layout
nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"
SUNRPC: cleanup ida information when removing sunrpc module
NFSv4.0: always send mode in SETATTR after EXCLUSIVE4
nfs: Don't increment lock sequence ID after NFS4ERR_MOVED
NFSv4.1: Fix a deadlock in layoutget
Pull MD fixes from Shaohua Li:
"This fixes several corner cases for raid5 cache, which is merged into
this cycle"
* tag 'md/4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md/r5cache: disable write back for degraded array
md/r5cache: shift complex rmw from read path to write path
md/r5cache: flush data only stripes in r5l_recovery_log()
md/raid5: move comment of fetch_block to right location
md/r5cache: read data into orig_page for prexor of cached data
md/raid5-cache: delete meaningless code
Pull arm64 fix from Catalin Marinas:
"Fix kernel panic on ACPI-based systems where CPU capacity description
is not currently handled"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: skip register_cpufreq_notifier on ACPI-based systems
Pull ARC fixes from Vineet Gupta:
"Hopefully last set of changes for ARC for 4.10:
- fix for unaligned access emulation corner case
- fix for udelay loop inline asm regression
- fix irq affinity finally for AXS103 board [Yuriy]
- final fixes for setting IO-coherency sanely in SMP"
* tag 'arc-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [arcompact] handle unaligned access delay slot corner case
ARCv2: smp-boot: wake_flag polling by non-Masters needs to be uncached
ARC: smp-boot: Decouple Non masters waiting API from jump to entry point
ARCv2: MCIP: update the BCR per current changes
ARC: udelay: fix inline assembler by adding LP_COUNT to clobber list
ARCv2: MCIP: Deprecate setting of affinity in Device Tree
Pull networking fixes from David Miller:
1) GTP fixes from Andreas Schultz (missing genl module alias, clear IP
DF on transmit).
2) Netfilter needs to reflect the fwmark when sending resets, from Pau
Espin Pedrol.
3) nftable dump OOPS fix from Liping Zhang.
4) Fix erroneous setting of VIRTIO_NET_HDR_F_DATA_VALID on transmit,
from Rolf Neugebauer.
5) Fix build error of ipt_CLUSTERIP when procfs is disabled, from Arnd
Bergmann.
6) Fix regression in handling of NETIF_F_SG in harmonize_features(),
from Eric Dumazet.
7) Fix RTNL deadlock wrt. lwtunnel module loading, from David Ahern.
8) tcp_fastopen_create_child() needs to setup tp->max_window, from
Alexey Kodanev.
9) Missing kmemdup() failure check in ipv6 segment routing code, from
Eric Dumazet.
10) Don't execute unix_bind() under the bindlock, otherwise we deadlock
with splice. From WANG Cong.
11) ip6_tnl_parse_tlv_enc_lim() potentially reallocates the skb buffer,
therefore callers must reload cached header pointers into that skb.
Fix from Eric Dumazet.
12) Fix various bugs in legacy IRQ fallback handling in alx driver, from
Tobias Regnery.
13) Do not allow lwtunnel drivers to be unloaded while they are
referenced by active instances, from Robert Shearman.
14) Fix truncated PHY LED trigger names, from Geert Uytterhoeven.
15) Fix a few regressions from virtio_net XDP support, from John
Fastabend and Jakub Kicinski.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (102 commits)
ISDN: eicon: silence misleading array-bounds warning
net: phy: micrel: add support for KSZ8795
gtp: fix cross netns recv on gtp socket
gtp: clear DF bit on GTP packet tx
gtp: add genl family modules alias
tcp: don't annotate mark on control socket from tcp_v6_send_response()
ravb: unmap descriptors when freeing rings
virtio_net: reject XDP programs using header adjustment
virtio_net: use dev_kfree_skb for small buffer XDP receive
r8152: check rx after napi is enabled
r8152: re-schedule napi for tx
r8152: avoid start_xmit to schedule napi when napi is disabled
r8152: avoid start_xmit to call napi_schedule during autosuspend
net: dsa: Bring back device detaching in dsa_slave_suspend()
net: phy: leds: Fix truncated LED trigger names
net: phy: leds: Break dependency of phy.h on phy_led_triggers.h
net: phy: leds: Clear phy_num_led_triggers on failure to avoid crash
net-next: ethernet: mediatek: change the compatible string
Documentation: devicetree: change the mediatek ethernet compatible string
bnxt_en: Fix RTNL lock usage on bnxt_get_port_module_status().
...
Pull xfs uodates from Darrick Wong:
"I have some more fixes this week: better input validation, corruption
avoidance, build fixes, memory leak fixes, and a couple from Christoph
to avoid an ENOSPC failure.
Summary:
- Fix race conditions in the CoW code
- Fix some incorrect input validation checks
- Avoid crashing fs by running out of space when freeing inodes
- Fix toctou race wrt whether or not an inode has an attr
- Fix build error on arm
- Fix page refcount corruption when readahead fails
- Don't corrupt userspace in the bmap ioctl"
* tag 'xfs-for-linus-4.10-rc6-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: prevent quotacheck from overloading inode lru
xfs: fix bmv_count confusion w/ shared extents
xfs: clear _XBF_PAGES from buffers when readahead page
xfs: extsize hints are not unlikely in xfs_bmap_btalloc
xfs: remove racy hasattr check from attr ops
xfs: use per-AG reservations for the finobt
xfs: only update mount/resv fields on success in __xfs_ag_resv_init
xfs: verify dirblocklog correctly
xfs: fix COW writeback race
Pull btrfs updates from Chris Mason:
"Some fixes that we've collected from the list.
We still have one more pending to nail down a regression in lzo
compression, but I wanted to get this batch out the door"
* 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: remove ->{get, set}_acl() from btrfs_dir_ro_inode_operations
Btrfs: disable xattr operations on subvolume directories
Btrfs: remove old tree_root case in btrfs_read_locked_inode()
Btrfs: fix truncate down when no_holes feature is enabled
Btrfs: Fix deadlock between direct IO and fast fsync
btrfs: fix false enospc error when truncating heavily reflinked file
Pull block fixes from Jens Axboe:
"A set of fixes for this series. This contains:
- Set of fixes for the nvme target code
- A revert of patch from this merge window, causing a regression with
WRITE_SAME on iSCSI targets at least.
- A fix for a use-after-free in the new O_DIRECT bdev code.
- Two fixes for the xen-blkfront driver"
* 'for-linus' of git://git.kernel.dk/linux-block:
Revert "sd: remove __data_len hack for WRITE SAME"
nvme-fc: use blk_rq_nr_phys_segments
nvmet-rdma: Fix missing dma sync to nvme data structures
nvmet: Call fatal_error from keep-alive timout expiration
nvmet: cancel fatal error and flush async work before free controller
nvmet: delete controllers deletion upon subsystem release
nvmet_fc: correct logic in disconnect queue LS handling
block: fix use after free in __blkdev_direct_IO
xen-blkfront: correct maximum segment accounting
xen-blkfront: feature flags handling adjustments
Pull rdma fixes from Doug Ledford:
"Second round of -rc fixes for 4.10.
This -rc cycle has been slow for the rdma subsystem. I had already
sent you the first batch before the Holiday break. After that, we kept
only getting a few here or there. Up until this week, when I got a
drop of 13 to one driver (qedr). So, here's the -rc patches I have. I
currently have none held in reserve, so unless something new comes in,
this is it until the next merge window opens.
Summary:
- series of iw_cxgb4 fixes to make it work with the drain cq API
- one or two patches each to: srp, iser, cxgb3, vmw_pvrdma, umem,
rxe, and ipoib
- one big series (13 patches) for the new qedr driver"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (27 commits)
RDMA/cma: Fix unknown symbol when CONFIG_IPV6 is not enabled
IB/rxe: Prevent from completer to operate on non valid QP
IB/rxe: Fix rxe dev insertion to rxe_dev_list
IB/umem: Release pid in error and ODP flow
RDMA/qedr: Dispatch port active event from qedr_add
RDMA/qedr: Fix and simplify memory leak in PD alloc
RDMA/qedr: Fix RDMA CM loopback
RDMA/qedr: Fix formatting
RDMA/qedr: Mark three functions as static
RDMA/qedr: Don't reset QP when queues aren't flushed
RDMA/qedr: Don't spam dmesg if QP is in error state
RDMA/qedr: Remove CQ spinlock from CM completion handlers
RDMA/qedr: Return max inline data in QP query result
RDMA/qedr: Return success when not changing QP state
RDMA/qedr: Add uapi header qedr-abi.h
RDMA/qedr: Fix MTU returned from QP query
RDMA/core: Add the function ib_mtu_int_to_enum
IB/vmw_pvrdma: Fix incorrect cleanup on pvrdma_pci_probe error path
IB/vmw_pvrdma: Don't leak info from alloc_ucontext
IB/cxgb3: fix misspelling in header guard
...
Pull s390 fixes from Martin Schwidefsky:
"Another two bug fixes:
- ptrace partial write information leak
- a guest page hinting regression introduced with v4.6"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: Fix cmma unused transfer from pgste into pte
s390/ptrace: Preserve previous registers for short regset write
Pull swiotlb fix from Konrad Rzeszutek Wilk:
"An ARM fix in the Xen SWIOTLB - mainly the translation of physical to
bus addresses was done just a tad too late"
* 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
swiotlb-xen: update dev_addr after swapping pages
Pull VFIO fix from Alex Williamson:
"mdev IOMMU groups are not yet compatible with the powerpc SPAPR IOMMU
backend, detect and fail group attach (Greg Kurz)"
* tag 'vfio-v4.10-rc6' of git://github.com/awilliam/linux-vfio:
vfio/spapr: fail tce_iommu_attach_group() when iommu_data is null
If IPV6 has not been enabled in the underlying kernel, we must avoid
calling IPV6 procedures in rdma_cm.ko.
This requires using "IS_ENABLED(CONFIG_IPV6)" in "if" statements
surrounding any code which calls external IPV6 procedures.
In the instance fixed here, procedure cma_bind_addr() called
ipv6_addr_type() -- which resulted in calling external procedure
__ipv6_addr_type().
Fixes: 6c26a77124 ("RDMA/cma: fix IPv6 address resolution")
Cc: <stable@vger.kernel.org> # v4.2+
Cc: Spencer Baugh <sbaugh@catern.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Konrad writes:
Please pull in your 'for-linus' branch two little fixes for Xen
block front:
One fix is for handling the XEN_PAGE_SIZE != PAGE_SIZE (4KB vs 64KB
on ARM for example) mishandling while the other is fixing
the accounting for the configuration changes.
After emulating an unaligned access in delay slot of a branch, we
pretend as the delay slot never happened - so return back to actual
branch target (or next PC if branch was not taken).
Curently we did this by handling STATUS32.DE, we also need to clear the
BTA.T bit, which is disregarded when returning from original misaligned
exception, but could cause weirdness if it took the interrupt return
path (in case interrupt was acive too)
One ARC700 customer ran into this when enabling unaligned access fixup
for kernel mode accesses as well
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Pull media fixes from Mauro Carvalho Chehab:
- fix a regression on tvp5150 causing failures at input selection and
image glitches
- CEC was moved out of staging for v4.10. Fix some bugs on it while not
too late
- fix a regression on pctv452e caused by VM stack changes
- fix suspend issued with smiapp
- fix a regression on cobalt driver
- fix some warnings and Kconfig issues with some random configs.
* tag 'media/v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] s5k4ecgx: select CRC32 helper
[media] dvb: avoid warning in dvb_net
[media] v4l: tvp5150: Don't override output pinmuxing at stream on/off time
[media] v4l: tvp5150: Fix comment regarding output pin muxing
[media] v4l: tvp5150: Reset device at probe time, not in get/set format handlers
[media] pctv452e: move buffer to heap, no mutex
[media] media/cobalt: use pci_irq_allocate_vectors
[media] cec: fix race between configuring and unconfiguring
[media] cec: move cec_report_phys_addr into cec_config_thread_func
[media] cec: replace cec_report_features by cec_fill_msg_report_features
[media] cec: update log_addr[] before finishing configuration
[media] cec: CEC_MSG_GIVE_FEATURES should abort for CEC version < 2
[media] cec: when canceling a message, don't overwrite old status info
[media] cec: fix report_current_latency
[media] smiapp: Make suspend and resume functions __maybe_unused
[media] smiapp: Implement power-on and power-off sequences without runtime PM
Pull MMC fix from Ulf Hansson:
"MMC host: fix runtime PM resume path in dw_mmc"
* tag 'mmc-v4.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: dw_mmc: force setup bus if active slots exist
Pull thermal management fix from Zhang Rui:
"A single revert from a recently introduced problem.
Specifics:
Commit 7611fb6806 ("thermal: thermal_hwmon: Convert to
hwmon_device_register_with_info()"), which was introduced in 4.10-rc5,
uses new hwmon API. But this breaks some soc thermal driver because
the new hwmon API has a strict rule for the hwmon device name. Revert
the offending commit as a quick solution for 4.10"
* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
Revert "thermal: thermal_hwmon: Convert to hwmon_device_register_with_info()"
Quotacheck runs at mount time in situations where quota accounting must
be recalculated. In doing so, it uses bulkstat to visit every inode in
the filesystem. Historically, every inode processed during quotacheck
was released and immediately tagged for reclaim because quotacheck runs
before the superblock is marked active by the VFS. In other words,
the final iput() lead to an immediate ->destroy_inode() call, which
allowed the XFS background reclaim worker to start reclaiming inodes.
Commit 17c12bcd3 ("xfs: when replaying bmap operations, don't let
unlinked inodes get reaped") marks the XFS superblock active sooner as
part of the mount process to support caching inodes processed during log
recovery. This occurs before quotacheck and thus means all inodes
processed by quotacheck are inserted to the LRU on release. The
s_umount lock is held until the mount has completed and thus prevents
the shrinkers from operating on the sb. This means that quotacheck can
excessively populate the inode LRU and lead to OOM conditions on systems
without sufficient RAM.
Update the quotacheck bulkstat handler to set XFS_IGET_DONTCACHE on
inodes processed by quotacheck. This causes ->drop_inode() to return 1
and in turn causes iput_final() to evict the inode. This preserves the
original quotacheck behavior and prevents it from overloading the LRU
and running out of memory.
CC: stable@vger.kernel.org # v4.9
Reported-by: Martin Svec <martin.svec@zoner.cz>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
With some gcc versions, we get a warning about the eicon driver,
and that currently shows up as the only remaining warning in one
of the build bots:
In file included from ../drivers/isdn/hardware/eicon/message.c:30:0:
eicon/message.c: In function 'mixer_notify_update':
eicon/platform.h:333:18: warning: array subscript is above array bounds [-Warray-bounds]
The code is easily changed to open-code the unusual PUT_WORD() line
causing this to avoid the warning.
Cc: stable@vger.kernel.org
Link: http://arm-soc.lixom.net/buildlogs/stable-rc/v4.4.45/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is adds support for the PHYs in the KSZ8795 5port managed switch.
It will allow to detect the link between the switch and the soc
and uses the same read_status functions as the KSZ8873MLL switch.
Signed-off-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Schultz says:
====================
various gtp fixes
I'm sorry for the compile error mess up in the last version.
It's no excuse for not test compiling, but the hunks got lost in
a rebase.
This is the part of the previous "simple gtp improvements" series
that Pablo indicated should go into net.
The addition of the module alias fixes genl family autoloading,
clearing the DF bit fixes a protocol violation in regard to the
specification and the netns comparison fixes a corner case of
cross netns recv.
v2->v3: fix compiler error introduced in rebase
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The use of the passed through netlink src_net to check for a
cross netns operation was wrong. Using the GTP socket and the
GTP netdevice is always correct (even if the netdev has been
moved to new netns after link creation).
Remove the now obsolete net field from gtp_dev.
Signed-off-by: Andreas Schultz <aschultz@tpip.net>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3GPP TS 29.281 and 3GPP TS 29.060 imply that GTP-U packets should be
sent with the DF bit cleared. For example 3GPP TS 29.060, Release 8,
Section 13.2.2:
> Backbone router: Any router in the backbone may fragment the GTP
> packet if needed, according to IPv4.
Signed-off-by: Andreas Schultz <aschultz@tpip.net>
Acked-by: Harald Welte <laforge@netfilter.org>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unlike ipv4, this control socket is shared by all cpus so we cannot use
it as scratchpad area to annotate the mark that we pass to ip6_xmit().
Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket
family caches the flowi6 structure in the sctp_transport structure, so
we cannot use to carry the mark unless we later on reset it back, which
I discarded since it looks ugly to me.
Fixes: bf99b4ded5 ("tcp: fix mark propagation with fwmark_reflect enabled")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
On ACPI based systems where the topology is setup using the API
store_cpu_topology, at the moment we do not have necessary code
to parse cpu capacity and handle cpufreq notifier, thus
resulting in a kernel panic.
Stack:
init_cpu_capacity_callback+0xb4/0x1c8
notifier_call_chain+0x5c/0xa0
__blocking_notifier_call_chain+0x58/0xa0
blocking_notifier_call_chain+0x3c/0x50
cpufreq_set_policy+0xe4/0x328
cpufreq_init_policy+0x80/0x100
cpufreq_online+0x418/0x710
cpufreq_add_dev+0x118/0x180
subsys_interface_register+0xa4/0xf8
cpufreq_register_driver+0x1c0/0x298
cppc_cpufreq_init+0xdc/0x1000 [cppc_cpufreq]
do_one_initcall+0x5c/0x168
do_init_module+0x64/0x1e4
load_module+0x130c/0x14d0
SyS_finit_module+0x108/0x120
el0_svc_naked+0x24/0x28
Fixes: 7202bde8b7 ("arm64: parse cpu capacity-dmips-mhz from DT")
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull drm fixes from Dave Airlie:
"This is the main request for rc6, since really the one earlier was the
rc5 one :-)
The main thing are the nouveau specific race fixes for the connector
locking bug we fixed in -next and reverted here as it has quite large
prereqs. These two fixes should solve the problem at that level and we
can fix it properly in 4.11
Otherwise i915 has a bunch of changes, one ABI change for GVT related
stuff, some VC4 leak fixes, one core fence fix and some AMD changes,
oh and one ast hang avoidance fix.
Hoping it calms down around now"
* tag 'drm-fixes-for-v4.10-rc6-part-two' of git://people.freedesktop.org/~airlied/linux: (25 commits)
drm/nouveau: Handle fbcon suspend/resume in seperate worker
drm/nouveau: Don't enabling polling twice on runtime resume
drm/ast: Fixed system hanged if disable P2A
Revert "drm/radeon: always apply pci shutdown callbacks"
drm/i915: reinstate call to trace_i915_vma_bind
drm/i915: Move atomic state free from out of fence release
drm/i915: Check for NULL atomic state in intel_crtc_disable_noatomic()
drm/i915: Fix calculation of rotated x and y offsets for planar formats
drm/i915: Don't init hpd polling for vlv and chv from runtime_suspend()
drm/i915: Don't leak edid in intel_crt_detect_ddc()
drm/i915: Release temporary load-detect state upon switching
drm/i915: prevent crash with .disable_display parameter
drm/i915: Avoid drm_atomic_state_put(NULL) in intel_display_resume
MAINTAINERS: update new mail list for intel gvt driver
drm/i915/gvt: Fix kmem_cache_create() name
drm/i915/gvt/kvmgt: mdev ABI is available_instances, not available_instance
drm/amdgpu: fix unload driver issue for virtual display
drm/amdgpu: check ring being ready before using
drm/vc4: Return -EINVAL on the overflow checks failing.
drm/vc4: Fix an integer overflow in temporary allocation layout.
...
More fixes than I'd like at this stage, but I think the holidays and
conferences have delayed finding and fixing the stuff a bit. Almost all
of them have Fixes: tags, so it's not just random fixes, we can point
fingers at the commits that broke stuff.
There's an ABI fix to GVT from Alex, before we go on an release a kernel
with the wrong attribute name.
* tag 'drm-intel-fixes-2017-01-26' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: reinstate call to trace_i915_vma_bind
drm/i915: Move atomic state free from out of fence release
drm/i915: Check for NULL atomic state in intel_crtc_disable_noatomic()
drm/i915: Fix calculation of rotated x and y offsets for planar formats
drm/i915: Don't init hpd polling for vlv and chv from runtime_suspend()
drm/i915: Don't leak edid in intel_crt_detect_ddc()
drm/i915: Release temporary load-detect state upon switching
drm/i915: prevent crash with .disable_display parameter
drm/i915: Avoid drm_atomic_state_put(NULL) in intel_display_resume
MAINTAINERS: update new mail list for intel gvt driver
drm/i915/gvt: Fix kmem_cache_create() name
drm/i915/gvt/kvmgt: mdev ABI is available_instances, not available_instance
drm/i915/gvt: Fix relocation of shadow bb
drm/i915/gvt: Enable the shadow batch buffer
Pull ACPI fixes from Rafael Wysocki:
"These fix two regressions introduced recently, one by reverting the
problematic commit and one by fixing up locking in the ACPICA core.
Specifics:
- Revert a recent change that added an ACPI video blacklist entry for
HP Pavilion dv6 as it turned to introduce backlight handling
regressions on some systems (Hans de Goede).
- Fix locking in the ACPICA core to avoid deadlocks related to table
loading that were exposed by a recent change in that area (Lv
Zheng)"
* tag 'acpi-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI / video: Add force_native quirk for HP Pavilion dv6"
ACPICA: Tables: Fix hidden logic related to acpi_tb_install_standard_table()
Pull power management fixes from Rafael Wysocki:
"These fix two regressions introduced recently, one by reverting the
problematic commit and one by fixing up the behavior in an overlooked
case.
Specifics:
- Revert the recent change that caused suspend-to-idle to be used as
the default suspend method on systems where it is indicated to be
efficient by the ACPI tables, as that turned out to be premature
and introduced suspend regressions on some systems with missing
power management support in device drivers (Rafael Wysocki).
- Fix up the intel_pstate driver to take changes of the global limits
via sysfs correctly when the performance policy is used which has
been broken by a recent change in it (Srinivas Pandruvada)"
* tag 'pm-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy
Revert "PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag"
Resuming from RPM can happen while already holding
dev->mode_config.mutex. This means we can't actually handle fbcon in
any RPM resume workers, since restoring fbcon requires grabbing
dev->mode_config.mutex again. So move the fbcon suspend/resume code into
it's own worker, and rely on that instead to avoid deadlocking.
This fixes more deadlocks for runtime suspending the GPU on the ThinkPad
W541. Reproduction recipe:
- Get a machine with both optimus and a nvidia card with connectors
attached to it
- Wait for the nvidia GPU to suspend
- Attempt to manually reprobe any of the connectors on the nvidia GPU
using sysfs
- *deadlock*
[airlied: use READ_ONCE to address Hans's comment]
Signed-off-by: Lyude <lyude@redhat.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Kilian Singer <kilian.singer@quantumtechnology.info>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: David Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
As it turns out, on cards that actually have CRTCs on them we're already
calling drm_kms_helper_poll_enable(drm_dev) from
nouveau_display_resume() before we call it in
nouveau_pmops_runtime_resume(). This leads us to accidentally trying to
enable polling twice, which results in a potential deadlock between the
RPM locks and drm_dev->mode_config.mutex if we end up trying to enable
polling the second time while output_poll_execute is running and holding
the mode_config lock. As such, make sure we only enable polling in
nouveau_pmops_runtime_resume() if we need to.
This fixes hangs observed on the ThinkPad W541
Signed-off-by: Lyude <lyude@redhat.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Kilian Singer <kilian.singer@quantumtechnology.info>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: David Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The original ast driver will access some BMC configuration through P2A bridge
that can be disabled since AST2300 and after.
It will cause system hanged if P2A bridge is disabled.
Here is the update to fix it.
Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This pull request brings in a few little error checking fixes and one
slow memory leak fix.
* tag 'drm-vc4-fixes-2017-01-23' of https://github.com/anholt/linux:
drm/vc4: Return -EINVAL on the overflow checks failing.
drm/vc4: Fix an integer overflow in temporary allocation layout.
drm/vc4: fix a bounds check
drm/vc4: Fix memory leak of the CRTC state.
Just a few small fixes.
* 'drm-fixes-4.10' of git://people.freedesktop.org/~agd5f/linux:
Revert "drm/radeon: always apply pci shutdown callbacks"
drm/amdgpu: fix unload driver issue for virtual display
drm/amdgpu: check ring being ready before using
Single fence fix.
* tag 'drm-misc-fixes-2017-01-23' of git://anongit.freedesktop.org/git/drm-misc:
drm/fence: fix memory overwrite when setting out_fence fd
When you snapshot a subvolume containing a subvolume, you get a
placeholder directory where the subvolume would be. These directory
inodes have ->i_ops set to btrfs_dir_ro_inode_operations. Previously,
these i_ops didn't include the xattr operation callbacks. The conversion
to xattr_handlers missed this case, leading to bogus attempts to set
xattrs on these inodes. This manifested itself as failures when running
delayed inodes.
To fix this, clear IOP_XATTR in ->i_opflags on these inodes.
Fixes: 6c6ef9f26e ("xattr: Stop calling {get,set,remove}xattr inode operations")
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Reported-by: Chris Murphy <lists@colorremedies.com>
Tested-by: Chris Murphy <lists@colorremedies.com>
Cc: <stable@vger.kernel.org> # 4.9.x
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
As Jeff explained in c2951f32d3 ("btrfs: remove old tree_root dirent
processing in btrfs_real_readdir()"), supporting this old format is no
longer necessary since the Btrfs magic number has been updated since we
changed to the current format. There are other places where we still
handle this old format, but since this is part of a fix that is going to
stable, I'm only removing this one for now.
Cc: <stable@vger.kernel.org> # 4.9.x
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
"swiotlb buffer is full" errors occur after repeated initialisation of a
device - f.e. suspend/resume or ip link set up/down. This is because memory
mapped using dma_map_single() in ravb_ring_format() and ravb_start_xmit()
is not released. Resolve this problem by unmapping descriptors when
freeing rings.
Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
[simon: reworked]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IF NFS_LAYOUT_RETURN_REQUESTED is not set, then we currently exit
without freeing the list of invalidated layout segments, leading
to a reference leak.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 24408f5282 ("pNFS: Fix bugs in _pnfs_return_layout")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Lock sequence IDs are bumped in decode_lock by calling
nfs_increment_seqid(). nfs_increment_sequid() does not use the
seqid_mutating_err() function fixed in commit 059aa73482 ("Don't
increment lock sequence ID after NFS4ERR_MOVED").
Fixes: 059aa73482 ("Don't increment lock sequence ID after ...")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Xuan Qi <xuan.qi@oracle.com>
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains a large batch with Netfilter fixes for
your net tree, they are:
1) Two patches to solve conntrack garbage collector cpu hogging, one to
remove GC_MAX_EVICTS and another to look at the ratio (scanned entries
vs. evicted entries) to make a decision on whether to reduce or not
the scanning interval. From Florian Westphal.
2) Two patches to fix incorrect set element counting if NLM_F_EXCL is
is not set. Moreover, don't decrenent set->nelems from abort patch
if -ENFILE which leaks a spare slot in the set. This includes a
patch to deconstify the set walk callback to update set->ndeact.
3) Two fixes for the fwmark_reflect sysctl feature: Propagate mark to
reply packets both from nf_reject and local stack, from Pau Espin Pedrol.
4) Fix incorrect handling of loopback traffic in rpfilter and nf_tables
fib expression, from Liping Zhang.
5) Fix oops on stateful objects netlink dump, when no filter is specified.
Also from Liping Zhang.
6) Fix a build error if proc is not available in ipt_CLUSTERIP, related
to fix that was applied in the previous batch for net. From Arnd Bergmann.
7) Fix lack of string validation in table, chain, set and stateful
object names in nf_tables, from Liping Zhang. Moreover, restrict
maximum log prefix length to 127 bytes, otherwise explicitly bail
out.
8) Two patches to fix spelling and typos in nf_tables uapi header file
and Kconfig, patches from Alexander Alemayhu and William Breathitt Gray.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In a bmapx call, bmv_count is the total size of the array, including the
zeroth element that userspace uses to supply the search key. The output
array starts at offset 1 so that we can set up the user for the next
invocation. Since we now can split an extent into multiple bmap records
due to shared/unshared status, we have to be careful that we don't
overflow the output array.
In the original patch f86f403794 ("xfs: teach get_bmapx about shared
extents and the CoW fork") I used cur_ext (the output index) to check
for overflows, albeit with an off-by-one error. Since nexleft no longer
describes the number of unfilled slots in the output, we can rip all
that out and use cur_ext for the overflow check directly.
Failure to do this causes heap corruption in bmapx callers such as
xfs_io and xfs_scrub. xfs/328 can reproduce this problem.
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
We perform the conversion between kernel jiffies and ms only when
exporting kernel value to user space.
We need to do the opposite operation when value is written by user.
Only matters when HZ != 1000
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull pin control fixes from Linus Walleij:
"A bunch of pin control fixes for v4.10 that didn't get sent off until
now, sorry for the delay.
It's only driver fixes:
- A bunch of fixes to the Intel drivers: broxton, baytrail. Bugs
related to register offsets, IRQ, debounce functionality.
- Fix a conflict amongst UART settings on the meson.
- Fix the ethernet setting on the Uniphier.
- A compilation warning squelched"
* tag 'pinctrl-v4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: uniphier: fix Ethernet (RMII) pin-mux setting for LD20
pinctrl: meson: fix uart_ao_b for GXBB and GXL/GXM
pinctrl: amd: avoid maybe-uninitalized warning
pinctrl: baytrail: Do not add all GPIOs to IRQ domain
pinctrl: baytrail: Rectify debounce support
pinctrl: intel: Set pin direction properly
pinctrl: broxton: Use correct PADCFGLOCK offset
Pull nvme target fixes from Sagi:
Given that its -rc6, I removed anything that is not
bug fix.
- nvmet-fc discard fix from Christoph
- queue disconnect fix from James
- nvmet-rdma dma sync fix from Parav
- Some more nvmet fixes
Pull drm revert from Dave Airlie:
"Revert one patch missing some prereqs.
One of the connector fixes was missing some prereqs, we have an
alternate driver fix that should work that I'll send tomorrow.
Today is a holiday here so quickly smashing this out"
Daniel Vetter explains:
"I pushed a locking change to fix a nouveau rpm issue to -fixes that
needed the connector_list rework. And that's only in -next, but I
missed that. Dave has the revert in a pull, and he'll follow-up with
the hack nouveau patch for 4.10, and then we'll reapply the proper fix
again for -next and revert the hacks. A bit a mess, but should be
sorted soon"
* tag 'drm-fixes-for-v4.10-rc6-revert-one' of git://people.freedesktop.org/~airlied/linux:
Revert "drm/probe-helpers: Drop locking from poll_enable"
Without this deallocate won't work properly due to the mismatch
of the bio/request size and the actual payload size.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
This patch performs dma sync operations on nvme_command
and nvme_completion.
nvme_command is synced
(a) on receiving of the recv queue completion for cpu access.
(b) before posting recv wqe back to rdma adapter for device access.
nvme_completion is synced
(a) on receiving of the recv queue completion of associated
nvme_command for cpu access.
(b) before posting send wqe to rdma adapter for device access.
This patch is generated for git://git.infradead.org/nvme-fabrics.git
Branch: nvmf-4.10
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
We only need to call delete_ctrl once, so given that both
keep-alive timeout and any other fatal error can trigger it,
just make sure we only call delete_ctrl once.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Make sure they are not running and we can free the controller
safely.
Signed-off-by: Roy Shterman <roys@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
No reason for them to be kept around if we are
deleting the subsystem, so instead of passively
wait for the host to disconnect, actively delete
the controllers.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Correct logic in disconnect queue LS handling.
Rework so that queue searching and error reporting is above the
section to send back a ls rjt
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
If we try to allocate memory pages to back an xfs_buf that we're trying
to read, it's possible that we'll be so short on memory that the page
allocation fails. For a blocking read we'll just wait, but for
readahead we simply dump all the pages we've collected so far.
Unfortunately, after dumping the pages we neglect to clear the
_XBF_PAGES state, which means that the subsequent call to xfs_buf_free
thinks that b_pages still points to pages we own. It then double-frees
the b_pages pages.
This results in screaming about negative page refcounts from the memory
manager, which xfs oughtn't be triggering. To reproduce this case,
mount a filesystem where the size of the inodes far outweighs the
availalble memory (a ~500M inode filesystem on a VM with 300MB memory
did the trick here) and run bulkstat in parallel with other memory
eating processes to put a huge load on the system. The "check summary"
phase of xfs_scrub also works for this purpose.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Simon Wunderlich says:
====================
Here is a batman-adv bugfix:
- fix reference count handling on fragmentation error, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 17bedab272 ("bpf: xdp: Allow head adjustment in XDP prog")
added a new XDP helper to prepend and remove data from a frame.
Make virtio_net reject programs making use of this helper until
proper support is added.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the small buffer case during driver unload we currently use
put_page instead of dev_kfree_skb. Resolve this by adding a check
for virtnet mode when checking XDP queue type. Also name the
function so that the code reads correctly to match the additional
check.
Fixes: bb91accf27 ("virtio-net: XDP support for small buffers")
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang says:
====================
r8152: fix scheduling napi
v3:
simply the argument for patch #3. Replace &tp->napi with napi.
v2:
Add smp_mb__after_atomic() for patch #1.
v1:
Scheduling the napi during the following periods would let it be ignored.
And the events wouldn't be handled until next napi_schedule() is called.
1. after napi_disable and before napi_enable().
2. after all actions of napi function is completed and before calling
napi_complete().
If no next napi_schedule() is called, tx or rx would stop working.
In order to avoid these situations, the followings solutions are applied.
1. prevent start_xmit() from calling napi_schedule() during runtime suspend
or after napi_disable().
2. re-schedule the napi for tx if it is necessary.
3. check if any rx is finished or not after napi_enable().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Schedule the napi after napi_enable() for rx, if it is necessary.
If the rx is completed when napi is disabled, the sheduling of napi
would be lost. Then, no one handles the rx packet until next napi
is scheduled.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Re-schedule napi after napi_complete() for tx, if it is necessay.
In r8152_poll(), if the tx is completed after tx_bottom() and before
napi_complete(), the scheduling of napi would be lost. Then, no
one handles the next tx until the next napi_schedule() is called.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stop the tx when the napi is disabled to prevent napi_schedule() is
called.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adjust the setting of the flag of SELECTIVE_SUSPEND to prevent start_xmit()
from calling napi_schedule() directly during runtime suspend.
After calling napi_disable() or clearing the flag of WORK_ENABLE,
scheduling the napi is useless.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When system enters VLLS mode, module power is turned off. As a result,
all registers are reset to HW default value. After exiting VLLS mode,
registers are still in default mode. As a result, the pinctrl settings
are incorrect, which will affect the module function.
The patch recovers the pinctrl setting when exit VLLS mode.
Signed-off-by: Gao Pan <pandy.gao@nxp.com>
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
[wsa: added missing include]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The cadence I2C driver calls cdns_i2c_writereg(..) to setup a workaround
in the controller, but did so after calling i2c_add_adapter() which starts
probing devices on the bus. Change the order so that the configuration is
completely finished before using the adapter.
Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
This reverts commit 3846fd9b86.
There were some precursor commits missing for this around connector
locking, we should probably merge Lyude's nouveau avoid the problem patch.
Commit 448b4482c6 ("net: dsa: Add lockdep class to tx queues to avoid
lockdep splat") removed the netif_device_detach() call done in
dsa_slave_suspend() which is necessary, and paired with a corresponding
netif_device_attach(), bring it back.
Fixes: 448b4482c6 ("net: dsa: Add lockdep class to tx queues to avoid lockdep splat")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven says:
====================
net: phy: leds: Fix truncated LED trigger names and crashes
I started seeing crashes during s2ram and poweroff on all my ARM boards,
like:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
...
[<c04116d4>] (__list_del_entry_valid) from [<c05e8948>] (led_trigger_unregister+0x34/0xcc)
[<c05e8948>] (led_trigger_unregister) from [<c05336c4>] (phy_led_triggers_unregister+0x28/0x34)
[<c05336c4>] (phy_led_triggers_unregister) from [<c0531d44>] (phy_detach+0x30/0x74)
[<c0531d44>] (phy_detach) from [<c0538bdc>] (sh_eth_close+0x64/0x9c)
[<c0538bdc>] (sh_eth_close) from [<c04d4ce0>] (dpm_run_callback+0x48/0xc8)
or:
list_del corruption. prev->next should be dede6540, but was 2e323931
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:52!
...
[<c02f6d70>] (__list_del_entry_valid) from [<c0425168>] (led_trigger_unregister+0x34/0xcc)
[<c0425168>] (led_trigger_unregister) from [<c03a05a0>] (phy_led_triggers_unregister+0x28/0x34)
[<c03a05a0>] (phy_led_triggers_unregister) from [<c039ec04>] (phy_detach+0x30/0x74)
[<c039ec04>] (phy_detach) from [<c03a4fc0>] (sh_eth_close+0x6c/0xa4)
[<c03a4fc0>] (sh_eth_close) from [<c0483234>] (__dev_close_many+0xac/0xd0)
As the only clue was a kernel message like
sh-eth ee700000.ethernet eth0: No phy led trigger registered for speed(100)
I had to bisected this, leading to commit 4567d686f5 ("phy:
increase size of MII_BUS_ID_SIZE and bus_id"). Reverting that commit
fixed the issue.
More investigation revealed the crashes are due to the combination of
two things:
- Truncated LED trigger names, leading to duplicate names, and
registration failures,
- Bad error handling in case of registration failures.
Both are fixed by this patch series.
Changes compared to v1:
- Add Reviewed-by,
- New patch "net: phy: leds: Break dependency of phy.h on
phy_led_triggers.h",
- Drop moving the include of <linux/phy_led_triggers.h>, as
<linux/phy.h> no longer includes it,
- #include <linux/phy.h> from <linux/phy_led_triggers.h>.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 4567d686f5 ("phy: increase size of MII_BUS_ID_SIZE and
bus_id") increased the size of MII bus IDs, but forgot to update the
private definition in <linux/phy_led_triggers.h>.
This may cause:
1. Truncation of LED trigger names,
2. Duplicate LED trigger names,
3. Failures registering LED triggers,
4. Crashes due to bad error handling in the LED trigger failure path.
To fix this, and prevent the definitions going out of sync again in the
future, let the PHY LED trigger code use the existing MII_BUS_ID_SIZE
definition.
Example:
- Before I had triggers "ee700000.etherne:01:100Mbps" and
"ee700000.etherne:01:10Mbps",
- After the increase of MII_BUS_ID_SIZE, both became
"ee700000.ethernet-ffffffff:01:" => FAIL,
- Now, the triggers are "ee700000.ethernet-ffffffff:01:100Mbps" and
"ee700000.ethernet-ffffffff:01:10Mbps", which are unique again.
Fixes: 4567d686f5 ("phy: increase size of MII_BUS_ID_SIZE and bus_id")
Fixes: 2e0bc452f4 ("net: phy: leds: add support for led triggers on phy link state change")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
<linux/phy.h> includes <linux/phy_led_triggers.h>, which is not really
needed. Drop the include from <linux/phy.h>, and add it to all users
that didn't include it explicitly.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
phy_attach_direct() ignores errors returned by
phy_led_triggers_register(). I think that's OK, as LED triggers can be
considered a non-critical feature.
However, this causes problems later:
- phy_led_trigger_change_speed() will access the array
phy_device.phy_led_triggers, which has been freed in the error path
of phy_led_triggers_register(), which may lead to a crash.
- phy_led_triggers_unregister() will access the same array, leading to
crashes during s2ram or poweroff, like:
Unable to handle kernel NULL pointer dereference at virtual address
00000000
...
[<c04116d4>] (__list_del_entry_valid) from [<c05e8948>] (led_trigger_unregister+0x34/0xcc)
[<c05e8948>] (led_trigger_unregister) from [<c05336c4>] (phy_led_triggers_unregister+0x28/0x34)
[<c05336c4>] (phy_led_triggers_unregister) from [<c0531d44>] (phy_detach+0x30/0x74)
[<c0531d44>] (phy_detach) from [<c0538bdc>] (sh_eth_close+0x64/0x9c)
[<c0538bdc>] (sh_eth_close) from [<c04d4ce0>] (dpm_run_callback+0x48/0xc8)
or:
list_del corruption. prev->next should be dede6540, but was 2e323931
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:52!
...
[<c02f6d70>] (__list_del_entry_valid) from [<c0425168>] (led_trigger_unregister+0x34/0xcc)
[<c0425168>] (led_trigger_unregister) from [<c03a05a0>] (phy_led_triggers_unregister+0x28/0x34)
[<c03a05a0>] (phy_led_triggers_unregister) from [<c039ec04>] (phy_detach+0x30/0x74)
[<c039ec04>] (phy_detach) from [<c03a4fc0>] (sh_eth_close+0x6c/0xa4)
[<c03a4fc0>] (sh_eth_close) from [<c0483234>] (__dev_close_many+0xac/0xd0)
To fix this, clear phy_device.phy_num_led_triggers in the error path of
phy_led_triggers_register() fails.
Note that the "No phy led trigger registered for speed" message will
still be printed on link speed changes, which is a good cue that
something went wrong with the LED triggers.
Fixes: 2e0bc452f4 ("net: phy: leds: add support for led triggers on phy link state change")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the binding was defined, I was not aware that mt2701 was an earlier
version of the SoC. For sake of consistency, the ethernet driver should
use mt2701 inside the compat string as this is the earliest SoC with the
ethernet core.
The ethernet driver is currently of no real use until we finish and
upstream the DSA driver. There are no users of this binding yet. It should
be safe to fix this now before it is too late and we need to provide
backward compatibility for the mt7623-eth compat string.
Reported-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the binding was defined, I was not aware that mt2701 was an earlier
version of the SoC. For sake of consistency, the ethernet driver should
use mt2701 inside the compat string as this is the earliest SoC with the
ethernet core.
The ethernet driver is currently of no real use until we finish and
upstream the DSA driver. There are no users of this binding yet. It should
be safe to fix this now before it is too late and we need to provide
backward compatibility for the mt7623-eth compat string.
Reported-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: John Crispin <john@phrozen.org>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Fix RTNL lock usage in bnxt_sp_task().
There are 2 function calls from bnxt_sp_task() that have buggy RTNL
usage. These 2 functions take RTNL lock under some conditions, but
some callers (such as open, ethtool) have already taken RTNL. These
3 patches fix the issue by making it clear that callers must take
RTNL. If the caller is bnxt_sp_task() which does not automatically
take RTNL, we add a common scheme for bnxt_sp_task() to call these
functions properly under RTNL.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_get_port_module_status() calls bnxt_update_link() which expects
RTNL to be held. In bnxt_sp_task() that does not hold RTNL, we need to
call it with a prior call to bnxt_rtnl_lock_sp() and the call needs to
be moved to the end of bnxt_sp_task().
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_update_link() is called from multiple code paths. Most callers,
such as open, ethtool, already hold RTNL. Only the caller bnxt_sp_task()
does not. So it is a bug to take RTNL inside bnxt_update_link().
Fix it by removing the RTNL inside bnxt_update_link(). The function
now expects the caller to always hold RTNL.
In bnxt_sp_task(), call bnxt_rtnl_lock_sp() before calling
bnxt_update_link(). We also need to move the call to the end of
bnxt_sp_task() since it will be clearing the BNXT_STATE_IN_SP_TASK bit.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In bnxt_sp_task(), we set a bit BNXT_STATE_IN_SP_TASK so that bnxt_close()
will synchronize and wait for bnxt_sp_task() to finish. Some functions
in bnxt_sp_task() require us to clear BNXT_STATE_IN_SP_TASK and then
acquire rtnl_lock() to prevent race conditions.
There are some bugs related to this logic. This patch refactors the code
to have common bnxt_rtnl_lock_sp() and bnxt_rtnl_unlock_sp() to handle
the RTNL and the clearing/setting of the bit. Multiple functions will
need the same logic. We also need to move bnxt_reset() to the end of
bnxt_sp_task(). Functions that clear BNXT_STATE_IN_SP_TASK must be the
last functions to be called in bnxt_sp_task(). The common scheme will
handle the condition properly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull virtio/vhost fixes from Michael Tsirkin:
- ARM DMA fixes
- vhost vsock bugfix
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vring: Force use of DMA API for ARM-based systems with legacy devices
virtio_mmio: Set DMA masks appropriately
vhost/vsock: handle vhost_vq_init_access() error
sock_reset_flag() maps to __clear_bit() not the atomic version clear_bit().
Thus, we need smp_mb(), smp_mb__after_atomic() is not sufficient.
Fixes: 3c7151275c ("tcp: add memory barriers to write space paths")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now sctp gso puts segments into skb's frag_list, then processes these
segments in skb_segment. But skb_segment handles them only when gs is
enabled, as it's in the same branch with skb's frags.
Although almost all the NICs support sg other than some old ones, but
since commit 1e16aa3ddf ("net: gso: use feature flag argument in all
protocol gso handlers"), features &= skb->dev->hw_enc_features, and
xfrm_output_gso call skb_segment with features = 0, which means sctp
gso would call skb_segment with sg = 0, and skb_segment would not work
as expected.
This patch is to fix it by setting features param with NETIF_F_SG when
calling skb_segment so that it can go the right branch to process the
skb's frag_list.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sctp_addr_id2transport is a function for sockopt to look up assoc by
address. As the address is from userspace, it can be a v4-mapped v6
address. But in sctp protocol stack, it always handles a v4-mapped
v6 address as a v4 address. So it's necessary to convert it to a v4
address before looking up assoc by address.
This patch is to fix it by calling sctp_verify_addr in which it can do
this conversion before calling sctp_endpoint_lookup_assoc, just like
what sctp_sendmsg and __sctp_connect do for the address from users.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With COW files they are the hotpath, just like for files with the
extent size hint attribute. We really shouldn't micro-manage anything
but failure cases with unlikely.
Additionally Arnd Bergmann recently reported that one of these two
unlikely annotations causes link failures together with an upcoming
kernel instrumentation patch, so let's get rid of it ASAP.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
xfs_attr_[get|remove]() have unlocked attribute fork checks to optimize
away a lock cycle in cases where the fork does not exist or is otherwise
empty. This check is not safe, however, because an attribute fork short
form to extent format conversion includes a transient state that causes
the xfs_inode_hasattr() check to fail. Specifically,
xfs_attr_shortform_to_leaf() creates an empty extent format attribute
fork and then adds the existing shortform attributes to it.
This means that lookup of an existing xattr can spuriously return
-ENOATTR when racing against a setxattr that causes the associated
format conversion. This was originally reproduced by an untar on a
particularly configured glusterfs volume, but can also be reproduced on
demand with properly crafted xattr requests.
The format conversion occurs under the exclusive ilock. xfs_attr_get()
and xfs_attr_remove() already have the proper locking and checks further
down in the functions to handle this situation correctly. Drop the
unlocked checks to avoid the spurious failure and rely on the existing
logic.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Currently we try to rely on the global reserved block pool for block
allocations for the free inode btree, but I have customer reports
(fairly complex workload, need to find an easier reproducer) where that
is not enough as the AG where we free an inode that requires a new
finobt block is entirely full. This causes us to cancel a dirty
transaction and thus a file system shutdown.
I think the right way to guard against this is to treat the finot the same
way as the refcount btree and have a per-AG reservations for the possible
worst case size of it, and the patch below implements that.
Note that this could increase mount times with large finobt trees. In
an ideal world we would have added a field for the number of finobt
fields to the AGI, similar to what we did for the refcount blocks.
We should do add it next time we rev the AGI or AGF format by adding
new fields.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Try to reserve the blocks first and only then update the fields in
or hanging off the mount structure. This way we can call __xfs_ag_resv_init
again after a previous failure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Revert commit 6276e53fa8 (ACPI / video: Add force_native quirk for
HP Pavilion dv6).
In the commit message for the quirk this revert removes I wrote:
"Note that there are quite a few HP Pavilion dv6 variants, some
woth ATI and some with NVIDIA hybrid gfx, both seem to need this
quirk to have working backlight control. There are also some versions
with only Intel integrated gfx, these may not need this quirk, but it
should not hurt there."
Unfortunately that seems wrong, I've already received 2 reports of
this commit causing regressions on some dv6 variants (at least one
of which actually has a nvidia GPU). So it seems that HP has made a
mess here by using the same model-name both in marketing and in the
DMI data for many different variants. Some of which need
acpi_backlight=native for functional backlight control (as the
quirk this commit reverts was doing), where as others are broken by
it. So lets get back to the old sitation so as to avoid regressing
on models which used to work without any kernel cmdline arguments
before.
Fixes: 6276e53fa8 (ACPI / video: Add force_native quirk for HP Pavilion dv6)
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
gvt-fixes-2017-01-25
- re-enable shadow batch buffer for security that was falsely turned off.
- kvmgt/mdev typo fix for correct ABI
- gvt mail list change
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
According to kmem_cache_sanity_check(), spaces are not allowed in the
name of a cache and results in a kernel oops with CONFIG_DEBUG_VM.
Convert to underscores.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Per the ABI specification[1], each mdev_supported_types entry should
have an available_instances, with an "s", not available_instance.
[1] Documentation/ABI/testing/sysfs-bus-vfio-mdev
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This reverts commit 7611fb6806 ("thermal: thermal_hwmon: Convert to
hwmon_device_register_with_info()").
Pavel Machek reported breakage in the Nokia N900 due to this commit.
We can revisit a proper fix for the warning later.
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Merge fixes from Andrew Morton:
"26 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (26 commits)
MAINTAINERS: add Dan Streetman to zbud maintainers
MAINTAINERS: add Dan Streetman to zswap maintainers
mm: do not export ioremap_page_range symbol for external module
mn10300: fix build error of missing fpu_save()
romfs: use different way to generate fsid for BLOCK or MTD
frv: add missing atomic64 operations
mm, page_alloc: fix premature OOM when racing with cpuset mems update
mm, page_alloc: move cpuset seqcount checking to slowpath
mm, page_alloc: fix fast-path race with cpuset update or removal
mm, page_alloc: fix check for NULL preferred_zone
kernel/panic.c: add missing \n
fbdev: color map copying bounds checking
frv: add atomic64_add_unless()
mm/mempolicy.c: do not put mempolicy before using its nodemask
radix-tree: fix private list warnings
Documentation/filesystems/proc.txt: add VmPin
mm, memcg: do not retry precharge charges
proc: add a schedule point in proc_pid_readdir()
mm: alloc_contig: re-allow CMA to compact FS pages
mm/slub.c: trace free objects at KERN_INFO
...
Recently, I've found cases in which ioremap_page_range was used
incorrectly, in external modules, leading to crashes. This can be
partly attributed to the fact that ioremap_page_range is lower-level,
with fewer protections, as compared to the other functions that an
external module would typically call. Those include:
ioremap_cache
ioremap_nocache
ioremap_prot
ioremap_uc
ioremap_wc
ioremap_wt
...each of which wraps __ioremap_caller, which in turn provides a safer
way to achieve the mapping.
Therefore, stop EXPORT-ing ioremap_page_range.
Link: http://lkml.kernel.org/r/1485173220-29010-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Suggested-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 8a59f5d252 ("fs/romfs: return f_fsid for statfs(2)") generates
a 64bit id from sb->s_bdev->bd_dev. This is only correct when romfs is
defined with CONFIG_ROMFS_ON_BLOCK. If romfs is only defined with
CONFIG_ROMFS_ON_MTD, sb->s_bdev is NULL, referencing sb->s_bdev->bd_dev
will triger an oops.
Richard Weinberger points out that when CONFIG_ROMFS_BACKED_BY_BOTH=y,
both CONFIG_ROMFS_ON_BLOCK and CONFIG_ROMFS_ON_MTD are defined.
Therefore when calling huge_encode_dev() to generate a 64bit id, I use
the follow order to choose parameter,
- CONFIG_ROMFS_ON_BLOCK defined
use sb->s_bdev->bd_dev
- CONFIG_ROMFS_ON_BLOCK undefined and CONFIG_ROMFS_ON_MTD defined
use sb->s_dev when,
- both CONFIG_ROMFS_ON_BLOCK and CONFIG_ROMFS_ON_MTD undefined
leave id as 0
When CONFIG_ROMFS_ON_MTD is defined and sb->s_mtd is not NULL, sb->s_dev
is set to a device ID generated by MTD_BLOCK_MAJOR and mtd index,
otherwise sb->s_dev is 0.
This is a try-best effort to generate a uniq file system ID, if all the
above conditions are not meet, f_fsid of this romfs instance will be 0.
Generally only one romfs can be built on single MTD block device, this
method is enough to identify multiple romfs instances in a computer.
Link: http://lkml.kernel.org/r/1482928596-115155-1-git-send-email-colyli@suse.de
Signed-off-by: Coly Li <colyli@suse.de>
Reported-by: Nong Li <nongli1031@gmail.com>
Tested-by: Nong Li <nongli1031@gmail.com>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ganapatrao Kulkarni reported that the LTP test cpuset01 in stress mode
triggers OOM killer in few seconds, despite lots of free memory. The
test attempts to repeatedly fault in memory in one process in a cpuset,
while changing allowed nodes of the cpuset between 0 and 1 in another
process.
The problem comes from insufficient protection against cpuset changes,
which can cause get_page_from_freelist() to consider all zones as
non-eligible due to nodemask and/or current->mems_allowed. This was
masked in the past by sufficient retries, but since commit 682a3385e7
("mm, page_alloc: inline the fast path of the zonelist iterator") we fix
the preferred_zoneref once, and don't iterate over the whole zonelist in
further attempts, thus the only eligible zones might be placed in the
zonelist before our starting point and we always miss them.
A previous patch fixed this problem for current->mems_allowed. However,
cpuset changes also update the task's mempolicy nodemask. The fix has
two parts. We have to repeat the preferred_zoneref search when we
detect cpuset update by way of seqcount, and we have to check the
seqcount before considering OOM.
[akpm@linux-foundation.org: fix typo in comment]
Link: http://lkml.kernel.org/r/20170120103843.24587-5-vbabka@suse.cz
Fixes: c33d6c06f6 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ganapatrao Kulkarni reported that the LTP test cpuset01 in stress mode
triggers OOM killer in few seconds, despite lots of free memory. The
test attempts to repeatedly fault in memory in one process in a cpuset,
while changing allowed nodes of the cpuset between 0 and 1 in another
process.
One possible cause is that in the fast path we find the preferred
zoneref according to current mems_allowed, so that it points to the
middle of the zonelist, skipping e.g. zones of node 1 completely. If
the mems_allowed is updated to contain only node 1, we never reach it in
the zonelist, and trigger OOM before checking the cpuset_mems_cookie.
This patch fixes the particular case by redoing the preferred zoneref
search if we switch back to the original nodemask. The condition is
also slightly changed so that when the last non-root cpuset is removed,
we don't miss it.
Note that this is not a full fix, and more patches will follow.
Link: http://lkml.kernel.org/r/20170120103843.24587-3-vbabka@suse.cz
Fixes: 682a3385e7 ("mm, page_alloc: inline the fast path of the zonelist iterator")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "fix premature OOM regression in 4.7+ due to cpuset races".
This is v2 of my attempt to fix the recent report based on LTP cpuset
stress test [1]. The intention is to go to stable 4.9 LTSS with this,
as triggering repeated OOMs is not nice. That's why the patches try to
be not too intrusive.
Unfortunately why investigating I found that modifying the testcase to
use per-VMA policies instead of per-task policies will bring the OOM's
back, but that seems to be much older and harder to fix problem. I have
posted a RFC [2] but I believe that fixing the recent regressions has a
higher priority.
Longer-term we might try to think how to fix the cpuset mess in a better
and less error prone way. I was for example very surprised to learn,
that cpuset updates change not only task->mems_allowed, but also
nodemask of mempolicies. Until now I expected the parameter to
alloc_pages_nodemask() to be stable. I wonder why do we then treat
cpusets specially in get_page_from_freelist() and distinguish HARDWALL
etc, when there's unconditional intersection between mempolicy and
cpuset. I would expect the nodemask adjustment for saving overhead in
g_p_f(), but that clearly doesn't happen in the current form. So we
have both crazy complexity and overhead, AFAICS.
[1] https://lkml.kernel.org/r/CAFpQJXUq-JuEP=QPidy4p_=FN0rkH5Z-kfB4qBvsf6jMS87Edg@mail.gmail.com
[2] https://lkml.kernel.org/r/7c459f26-13a6-a817-e508-b65b903a8378@suse.cz
This patch (of 4):
Since commit c33d6c06f6 ("mm, page_alloc: avoid looking up the first
zone in a zonelist twice") we have a wrong check for NULL preferred_zone,
which can theoretically happen due to concurrent cpuset modification. We
check the zoneref pointer which is never NULL and we should check the zone
pointer. Also document this in first_zones_zonelist() comment per Michal
Hocko.
Fixes: c33d6c06f6 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Link: http://lkml.kernel.org/r/20170120103843.24587-2-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since commit be97a41b29 ("mm/mempolicy.c: merge alloc_hugepage_vma to
alloc_pages_vma") alloc_pages_vma() can potentially free a mempolicy by
mpol_cond_put() before accessing the embedded nodemask by
__alloc_pages_nodemask(). The commit log says it's so "we can use a
single exit path within the function" but that's clearly wrong. We can
still do that when doing mpol_cond_put() after the allocation attempt.
Make sure the mempolicy is not freed prematurely, otherwise
__alloc_pages_nodemask() can end up using a bogus nodemask, which could
lead e.g. to premature OOM.
Fixes: be97a41b29 ("mm/mempolicy.c: merge alloc_hugepage_vma to alloc_pages_vma")
Link: http://lkml.kernel.org/r/20170118141124.8345-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently when trace is enabled (e.g. slub_debug=T,kmalloc-128 ) the
trace messages are mostly output at KERN_INFO. However the trace code
also calls print_section() to hexdump the head of a free object. This
is hard coded to use KERN_ERR, meaning the console is deluged with trace
messages even if we've asked for quiet.
Fix this the obvious way but adding a level parameter to
print_section(), allowing calls from the trace code to use the same
trace level as other trace messages.
Link: http://lkml.kernel.org/r/20170113154850.518-1-daniel.thompson@linaro.org
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With >=32 CPUs the userfaultfd selftest triggered a graceful but
unexpected SIGBUS because VM_FAULT_RETRY was returned by
handle_userfault() despite the UFFDIO_COPY wasn't completed.
This seems caused by rwsem waking the thread blocked in
handle_userfault() and we can't run up_read() before the wait_event
sequence is complete.
Keeping the wait_even sequence identical to the first one, would require
running userfaultfd_must_wait() again to know if the loop should be
repeated, and it would also require retaking the rwsem and revalidating
the whole vma status.
It seems simpler to wait the targeted wakeup so that if false wakeups
materialize we still wait for our specific wakeup event, unless of
course there are signals or the uffd was released.
Debug code collecting the stack trace of the wakeup showed this:
$ ./userfaultfd 100 99999
nr_pages: 25600, nr_pages_per_cpu: 800
bounces: 99998, mode: racing ver poll, userfaults: 32 35 90 232 30 138 69 82 34 30 139 40 40 31 20 19 43 13 15 28 27 38 21 43 56 22 1 17 31 8 4 2
bounces: 99997, mode: rnd ver poll, Bus error (core dumped)
save_stack_trace+0x2b/0x50
try_to_wake_up+0x2a6/0x580
wake_up_q+0x32/0x70
rwsem_wake+0xe0/0x120
call_rwsem_wake+0x1b/0x30
up_write+0x3b/0x40
vm_mmap_pgoff+0x9c/0xc0
SyS_mmap_pgoff+0x1a9/0x240
SyS_mmap+0x22/0x30
entry_SYSCALL_64_fastpath+0x1f/0xbd
0xffffffffffffffff
FAULT_FLAG_ALLOW_RETRY missing 70
CPU: 24 PID: 1054 Comm: userfaultfd Tainted: G W 4.8.0+ #30
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0xb8/0x112
handle_userfault+0x572/0x650
handle_mm_fault+0x12cb/0x1520
__do_page_fault+0x175/0x500
trace_do_page_fault+0x61/0x270
do_async_page_fault+0x19/0x90
async_page_fault+0x25/0x30
This always happens when the main userfault selftest thread is running
clone() while glibc runs either mprotect or mmap (both taking mmap_sem
down_write()) to allocate the thread stack of the background threads,
while locking/userfault threads already run at full throttle and are
susceptible to false wakeups that may cause handle_userfault() to return
before than expected (which results in graceful SIGBUS at the next
attempt).
This was reproduced only with >=32 CPUs because the loop to start the
thread where clone() is too quick with fewer CPUs, while with 32 CPUs
there's already significant activity on ~32 locking and userfault
threads when the last background threads are started with clone().
This >=32 CPUs SMP race condition is likely reproducible only with the
selftest because of the much heavier userfault load it generates if
compared to real apps.
We'll have to allow "one more" VM_FAULT_RETRY for the WP support and a
patch floating around that provides it also hidden this problem but in
reality only is successfully at hiding the problem.
False wakeups could still happen again the second time
handle_userfault() is invoked, even if it's a so rare race condition
that getting false wakeups twice in a row is impossible to reproduce.
This full fix is needed for correctness, the only alternative would be
to allow VM_FAULT_RETRY to be returned infinitely. With this fix the WP
support can stick to a strict "one more" VM_FAULT_RETRY logic (no need
of returning it infinite times to avoid the SIGBUS).
Link: http://lkml.kernel.org/r/20170111005535.13832-2-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Shubham Kumar Sharma <shubham.kumar.sharma@oracle.com>
Tested-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Michael Rapoport <RAPOPORT@il.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
gcc-7 produces a harmless false-postive warning about a possible NULL
pointer access:
drivers/memstick/core/memstick.c: In function 'h_memstick_read_dev_id':
drivers/memstick/core/memstick.c:309:3: error: argument 2 null where non-null expected [-Werror=nonnull]
memcpy(mrq->data, buf, mrq->data_len);
This can't happen because the caller sets the command to 'MS_TPC_READ_REG',
which causes the data direction to be 'READ' and the NULL pointer not
accessed.
As a simple workaround for the warning, we can pass a pointer to the
data that we actually want to read into. This is not needed here, but
also harmless, and lets the compiler know that the access is ok.
Link: http://lkml.kernel.org/r/20170111144143.548867-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On an overloaded system, it is possible that a change in the watchdog
threshold can be delayed long enough to trigger a false positive.
This can easily be achieved by having a cpu spinning indefinitely on a
task, while another cpu updates watchdog threshold.
What happens is while trying to park the watchdog threads, the hrtimers
on the other cpus trigger and reprogram themselves with the new slower
watchdog threshold. Meanwhile, the nmi watchdog is still programmed
with the old faster threshold.
Because the one cpu is blocked, it prevents the thread parking on the
other cpus from completing, which is needed to shutdown the nmi watchdog
and reprogram it correctly. As a result, a false positive from the nmi
watchdog is reported.
Fix this by setting a park_in_progress flag to block all lockups until
the parking is complete.
Fix provided by Ulrich Obergfell.
[akpm@linux-foundation.org: s/park_in_progress/watchdog_park_in_progress/]
Link: http://lkml.kernel.org/r/1481041033-192236-1-git-send-email-dzickus@redhat.com
Signed-off-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As reported by Arnd:
https://lkml.org/lkml/2017/1/10/756
Compiling with the following configuration:
# CONFIG_EXT2_FS is not set
# CONFIG_EXT4_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_FS_IOMAP depends on the above filesystems, as is not set
CONFIG_FS_DAX=y
generates build warnings about unused functions in fs/dax.c:
fs/dax.c:878:12: warning: `dax_insert_mapping' defined but not used [-Wunused-function]
static int dax_insert_mapping(struct address_space *mapping,
^~~~~~~~~~~~~~~~~~
fs/dax.c:572:12: warning: `copy_user_dax' defined but not used [-Wunused-function]
static int copy_user_dax(struct block_device *bdev, sector_t sector, size_t size,
^~~~~~~~~~~~~
fs/dax.c:542:12: warning: `dax_load_hole' defined but not used [-Wunused-function]
static int dax_load_hole(struct address_space *mapping, void **entry,
^~~~~~~~~~~~~
fs/dax.c:312:14: warning: `grab_mapping_entry' defined but not used [-Wunused-function]
static void *grab_mapping_entry(struct address_space *mapping, pgoff_t index,
^~~~~~~~~~~~~~~~~~
Now that the struct buffer_head based DAX fault paths and I/O path have
been removed we really depend on iomap support being present for DAX.
Make this explicit by selecting FS_IOMAP if we compile in DAX support.
This allows us to remove conditional selections of FS_IOMAP when FS_DAX
was present for ext2 and ext4, and to remove an #ifdef in fs/dax.c.
Link: http://lkml.kernel.org/r/1484087383-29478-1-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In commit 19be0eaffa ("mm: remove gup_flags FOLL_WRITE games from
__get_user_pages()"), the mm code was changed from unsetting FOLL_WRITE
after a COW was resolved to setting the (newly introduced) FOLL_COW
instead. Simultaneously, the check in gup.c was updated to still allow
writes with FOLL_FORCE set if FOLL_COW had also been set.
However, a similar check in huge_memory.c was forgotten. As a result,
remote memory writes to ro regions of memory backed by transparent huge
pages cause an infinite loop in the kernel (handle_mm_fault sets
FOLL_COW and returns 0 causing a retry, but follow_trans_huge_pmd bails
out immidiately because `(flags & FOLL_WRITE) && !pmd_write(*pmd)` is
true.
While in this state the process is stil SIGKILLable, but little else
works (e.g. no ptrace attach, no other signals). This is easily
reproduced with the following code (assuming thp are set to always):
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#define TEST_SIZE 5 * 1024 * 1024
int main(void) {
int status;
pid_t child;
int fd = open("/proc/self/mem", O_RDWR);
void *addr = mmap(NULL, TEST_SIZE, PROT_READ,
MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
assert(addr != MAP_FAILED);
pid_t parent_pid = getpid();
if ((child = fork()) == 0) {
void *addr2 = mmap(NULL, TEST_SIZE, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
assert(addr2 != MAP_FAILED);
memset(addr2, 'a', TEST_SIZE);
pwrite(fd, addr2, TEST_SIZE, (uintptr_t)addr);
return 0;
}
assert(child == waitpid(child, &status, 0));
assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);
return 0;
}
Fix this by updating follow_trans_huge_pmd in huge_memory.c analogously
to the update in gup.c in the original commit. The same pattern exists
in follow_devmap_pmd. However, we should not be able to reach that
check with FOLL_COW set, so add WARN_ONCE to make sure we notice if we
ever do.
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170106015025.GA38411@juliacomputing.com
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
online_{kernel|movable} is used to change the memory zone to
ZONE_{NORMAL|MOVABLE} and online the memory.
To check that memory zone can be changed, zone_can_shift() is used.
Currently the function returns minus integer value, plus integer
value and 0. When the function returns minus or plus integer value,
it means that the memory zone can be changed to ZONE_{NORNAL|MOVABLE}.
But when the function returns 0, there are two meanings.
One of the meanings is that the memory zone does not need to be changed.
For example, when memory is in ZONE_NORMAL and onlined by online_kernel
the memory zone does not need to be changed.
Another meaning is that the memory zone cannot be changed. When memory
is in ZONE_NORMAL and onlined by online_movable, the memory zone may
not be changed to ZONE_MOVALBE due to memory online limitation(see
Documentation/memory-hotplug.txt). In this case, memory must not be
onlined.
The patch changes the return type of zone_can_shift() so that memory
online operation fails when memory zone cannot be changed as follows:
Before applying patch:
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 7864320
managed 7864320
# echo online_movable > memory4097/state
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 8388608
managed 8388608
online_movable operation succeeded. But memory is onlined as
ZONE_NORMAL, not ZONE_MOVABLE.
After applying patch:
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 7864320
managed 7864320
# echo online_movable > memory4097/state
bash: echo: write error: Invalid argument
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 7864320
managed 7864320
online_movable operation failed because of failure of changing
the memory zone from ZONE_NORMAL to ZONE_MOVABLE
Fixes: df429ac039 ("memory-hotplug: more general validation of zone during online")
Link: http://lkml.kernel.org/r/2f9c3837-33d7-b6e5-59c0-6ca4372b2d84@gmail.com
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Booting Linux on an ARM fastmodel containing an SMMU emulation results
in an unexpected I/O page fault from the legacy virtio-blk PCI device:
[ 1.211721] arm-smmu-v3 2b400000.smmu: event 0x10 received:
[ 1.211800] arm-smmu-v3 2b400000.smmu: 0x00000000fffff010
[ 1.211880] arm-smmu-v3 2b400000.smmu: 0x0000020800000000
[ 1.211959] arm-smmu-v3 2b400000.smmu: 0x00000008fa081002
[ 1.212075] arm-smmu-v3 2b400000.smmu: 0x0000000000000000
[ 1.212155] arm-smmu-v3 2b400000.smmu: event 0x10 received:
[ 1.212234] arm-smmu-v3 2b400000.smmu: 0x00000000fffff010
[ 1.212314] arm-smmu-v3 2b400000.smmu: 0x0000020800000000
[ 1.212394] arm-smmu-v3 2b400000.smmu: 0x00000008fa081000
[ 1.212471] arm-smmu-v3 2b400000.smmu: 0x0000000000000000
<system hangs failing to read partition table>
This is because the legacy virtio-blk device is behind an SMMU, so we
have consequently swizzled its DMA ops and configured the SMMU to
translate accesses. This then requires the vring code to use the DMA API
to establish translations, otherwise all transactions will result in
fatal faults and termination.
Given that ARM-based systems only see an SMMU if one is really present
(the topology is all described by firmware tables such as device-tree or
IORT), then we can safely use the DMA API for all legacy virtio devices.
Modern devices can advertise the prescense of an IOMMU using the
VIRTIO_F_IOMMU_PLATFORM feature flag.
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: <stable@vger.kernel.org>
Fixes: 876945dbf6 ("arm64: Hook up IOMMU dma_ops")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Once DMA API usage is enabled, it becomes apparent that virtio-mmio is
inadvertently relying on the default 32-bit DMA mask, which leads to
problems like rapidly exhausting SWIOTLB bounce buffers.
Ensure that we set the appropriate 64-bit DMA mask whenever possible,
with the coherent mask suitably limited for the legacy vring as per
a0be1db430 ("virtio_pci: Limit DMA mask to 44 bits for legacy virtio
devices").
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Fixes: b42111382f ("virtio_mmio: Use the DMA API if enabled")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Propagate the error when vhost_vq_init_access() fails and set
vq->private_data to NULL.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is needed on HS38 cores, for setting up IO-Coherency aperture properly
The polling could perturb the caches and coherecy fabric which could be
wrong in the small window when Master is setting up IOC aperture etc
in arc_cache_init()
We do it only for ARCv2 based builds to not affect EZChip ARCompact
based platform.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Robert Shearman says:
====================
net: Fix oops on state free after lwt module unload
An oops is seen in lwtstate_free after an lwt ops module has been
unloaded. This patchset fixes this by preventing modules implementing
lwtunnel ops from being unloaded whilst there's state alive using
those ops. The first patch adds fills in a new owner field in all lwt
ops and the second patch makes use of this to reference count the
modules as state is built and destroyed using them.
Changes in v3:
- don't put module reference if try_module_get fails on building state
Changes in v2:
- specify module owner for all modules as suggested by DaveM
- reference count all modules building lwt state, not just those ops
implementing destroy_state, as also suggested by DaveM.
- rebased on top of David Ahern's lwtunnel changes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When attempting to free lwtunnel state after the module for the encap
has been unloaded an oops occurs:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: lwtstate_free+0x18/0x40
[..]
task: ffff88003e372380 task.stack: ffffc900001fc000
RIP: 0010:lwtstate_free+0x18/0x40
RSP: 0018:ffff88003fd83e88 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88002bbb3380 RCX: ffff88000c91a300
[..]
Call Trace:
<IRQ>
free_fib_info_rcu+0x195/0x1a0
? rt_fibinfo_free+0x50/0x50
rcu_process_callbacks+0x2d3/0x850
? rcu_process_callbacks+0x296/0x850
__do_softirq+0xe4/0x4cb
irq_exit+0xb0/0xc0
smp_apic_timer_interrupt+0x3d/0x50
apic_timer_interrupt+0x93/0xa0
[..]
Code: e8 6e c6 fc ff 89 d8 5b 5d c3 bb de ff ff ff eb f4 66 90 66 66 66 66 90 55 48 89 e5 53 0f b7 07 48 89 fb 48 8b 04 c5 00 81 d5 81 <48> 8b 40 08 48 85 c0 74 13 ff d0 48 8d 7b 20 be 20 00 00 00 e8
The problem is after the module for the encap can be unloaded the
corresponding ops is removed and is thus NULL here.
Modules implementing lwtunnel ops should not be allowed to unload
while there is state alive using those ops, so grab the module
reference for the ops on creating lwtunnel state and of course release
the reference when freeing the state.
Fixes: 1104d9ba44 ("lwtunnel: Add destroy state operation")
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modules implementing lwtunnel ops should not be allowed to unload
while there is state alive using those ops, so specify the owning
module for all lwtunnel ops.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On UD QP completer tasklet is scheduled for each packet sent.
If it is followed by a destroy_qp(), the kernel panic will
happen as the completer tries to operate on a destroyed QP.
Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The first argument of list_add_tail is the new item and the second
is the head of the list. Fix the code to pass arguments in the
right order, otherwise not all the rxe devices will be removed
during teardown.
Fixes: 8700e3e7c4 ('Soft RoCE driver')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Parthasarathy Bhuvaragan says:
====================
tipc: topology server fixes for nametable soft lockup
In this series, we revert the commit 333f796235 ("tipc: fix a
race condition leading to subscriber refcnt bug") and provide an
alternate solution to fix the race conditions in commits 2-4.
We have to do this as the above commit introduced a nametbl soft
lockup at module exit as described by patch#4.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In tipc_server_stop(), we iterate over the connections with limiting
factor as server's idr_in_use. We ignore the fact that this variable
is decremented in tipc_close_conn(), leading to premature exit.
In this commit, we iterate until the we have no connections left.
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Tested-by: John Thompson <thompa.atl@gmail.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In tipc_conn_sendmsg(), we first queue the request to the outqueue
followed by the connection state check. If the connection is not
connected, we should not queue this message.
In this commit, we reject the messages if the connection state is
not CF_CONNECTED.
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Tested-by: John Thompson <thompa.atl@gmail.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 333f796235 ("tipc: fix a race condition leading to
subscriber refcnt bug") reveals a soft lockup while acquiring
nametbl_lock.
Before commit 333f796235, we call tipc_conn_shutdown() from
tipc_close_conn() in the context of tipc_topsrv_stop(). In that
context, we are allowed to grab the nametbl_lock.
Commit 333f796235, moved tipc_conn_release (renamed from
tipc_conn_shutdown) to the connection refcount cleanup. This allows
either tipc_nametbl_withdraw() or tipc_topsrv_stop() to the cleanup.
Since tipc_exit_net() first calls tipc_topsrv_stop() and then
tipc_nametble_withdraw() increases the chances for the later to
perform the connection cleanup.
The soft lockup occurs in the call chain of tipc_nametbl_withdraw(),
when it performs the tipc_conn_kref_release() as it tries to grab
nametbl_lock again while holding it already.
tipc_nametbl_withdraw() grabs nametbl_lock
tipc_nametbl_remove_publ()
tipc_subscrp_report_overlap()
tipc_subscrp_send_event()
tipc_conn_sendmsg()
<< if (con->flags != CF_CONNECTED) we do conn_put(),
triggering the cleanup as refcount=0. >>
tipc_conn_kref_release
tipc_sock_release
tipc_conn_release
tipc_subscrb_delete
tipc_subscrp_delete
tipc_nametbl_unsubscribe << Soft Lockup >>
The previous changes in this series fixes the race conditions fixed
by commit 333f796235. Hence we can now revert the commit.
Fixes: 333f796235 ("tipc: fix a race condition leading to subscriber refcnt bug")
Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now, the generic server framework maintains the connection
id's per subscriber in server's conn_idr. At tipc_close_conn, we
remove the connection id from the server list, but the connection is
valid until we call the refcount cleanup. Hence we have a window
where the server allocates the same connection to an new subscriber
leading to inconsistent reference count. We have another refcount
warning we grab the refcount in tipc_conn_lookup() for connections
with flag with CF_CONNECTED not set. This usually occurs at shutdown
when the we stop the topology server and withdraw TIPC_CFG_SRV
publication thereby triggering a withdraw message to subscribers.
In this commit, we:
1. remove the connection from the server list at recount cleanup.
2. grab the refcount for a connection only if CF_CONNECTED is set.
Tested-by: John Thompson <thompa.atl@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now, the subscribers keep track of the subscriptions using
reference count at subscriber level. At subscription cancel or
subscriber delete, we delete the subscription only if the timer
was pending for the subscription. This approach is incorrect as:
1. del_timer() is not SMP safe, if on CPU0 the check for pending
timer returns true but CPU1 might schedule the timer callback
thereby deleting the subscription. Thus when CPU0 is scheduled,
it deletes an invalid subscription.
2. We export tipc_subscrp_report_overlap(), which accesses the
subscription pointer multiple times. Meanwhile the subscription
timer can expire thereby freeing the subscription and we might
continue to access the subscription pointer leading to memory
violations.
In this commit, we introduce subscription refcount to avoid deleting
an invalid subscription.
Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We trigger a soft lockup as we grab nametbl_lock twice if the node
has a pending node up/down or link up/down event while:
- we process an incoming named message in tipc_named_rcv() and
perform an tipc_update_nametbl().
- we have pending backlog items in the name distributor queue
during a nametable update using tipc_nametbl_publish() or
tipc_nametbl_withdraw().
The following are the call chain associated:
tipc_named_rcv() Grabs nametbl_lock
tipc_update_nametbl() (publish/withdraw)
tipc_node_subscribe()/unsubscribe()
tipc_node_write_unlock()
<< lockup occurs if an outstanding node/link event
exits, as we grabs nametbl_lock again >>
tipc_nametbl_withdraw() Grab nametbl_lock
tipc_named_process_backlog()
tipc_update_nametbl()
<< rest as above >>
The function tipc_node_write_unlock(), in addition to releasing the
lock processes the outstanding node/link up/down events. To do this,
we need to grab the nametbl_lock again leading to the lockup.
In this commit we fix the soft lockup by introducing a fast variant of
node_unlock(), where we just release the lock. We adapt the
node_subscribe()/node_unsubscribe() to use the fast variants.
Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing set->ndeact update on each deactivated element from the set
flush path. Otherwise, sets with fixed size break after flush since
accounting breaks.
# nft add set x y { type ipv4_addr\; size 2\; }
# nft add element x y { 1.1.1.1 }
# nft add element x y { 1.1.1.2 }
# nft flush set x y
# nft add element x y { 1.1.1.1 }
<cmdline>:1:1-28: Error: Could not process rule: Too many open files in system
Fixes: 8411b6442e ("netfilter: nf_tables: support for set flushing")
Reported-by: Elise Lennion <elise.lennion@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If the element exists and no NLM_F_EXCL is specified, do not bump
set->nelems, otherwise we leak one set element slot. This problem
amplifies if the set is full since the abort path always decrements the
counter for the -ENFILE case too, giving one spare extra slot.
Fix this by moving set->nelems update to nft_add_set_elem() after
successful element insertion. Moreover, remove the element if the set is
full so there is no need to rely on the abort path to undo things
anymore.
Fixes: c016c7e45d ("netfilter: nf_tables: honor NLM_F_EXCL flag in set element insertion")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
First, log prefix will be truncated to NF_LOG_PREFIXLEN-1, i.e. 127,
at nf_log_packet(), so the extra part is useless.
Second, after adding a log rule with a very very long prefix, we will
fail to dump the nft rules after this _special_ one, but acctually,
they do exist. For example:
# name_65000=$(printf "%0.sQ" {1..65000})
# nft add rule filter output log prefix "$name_65000"
# nft add rule filter output counter
# nft add rule filter output counter
# nft list chain filter output
table ip filter {
chain output {
type filter hook output priority 0; policy accept;
}
}
So now, restrict the log prefix length to NF_LOG_PREFIXLEN-1.
Fixes: 96518518cc ("netfilter: add nftables")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
1. Release pid before enter odp flow
2. Release pid when fail to allocate memory
Fixes: 87773dd56d ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get")
Fixes: 8ada2c1c0c ("IB/core: Add support for on demand paging regions")
Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Reviewed-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pull x86 platform-driver fixes from Andy Shevchenko:
"This is my first pull request since I become a co-maintainer of
Platform Drivers x86 subsystem. It's a bit bigger than usual due to
material collected for almost two weeks in a row.
MAINTAINERS:
- Add myself to X86 PLATFORM DRIVERS as a co-maintainer
ideapad-laptop:
- handle ACPI event 1
intel_mid_powerbtn:
- Set IRQ_ONESHOT
surface3-wmi:
- fix uninitialized symbol
- Shut up unused-function warning
mlx-platform:
- free first dev on error"
* tag 'platform-drivers-x86-v4.10-4' of git://git.infradead.org/linux-platform-drivers-x86:
MAINTAINERS: Add myself to X86 PLATFORM DRIVERS as a co-maintainer
platform/x86: ideapad-laptop: handle ACPI event 1
platform/x86: intel_mid_powerbtn: Set IRQ_ONESHOT
platform/x86: surface3-wmi: fix uninitialized symbol
platform/x86: surface3-wmi: Shut up unused-function warning
platform/x86: mlx-platform: free first dev on error
Relying on qede to trigger qedr on startup is problematic. When probing
both if qedr loads slowly then qede can assume qedr is missing and not
trigger it. This patch adds a triggering from qedr and protects against
a race via an atomic bit.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Free the PD if no internal resources were available. Move userspace
code under the relevant 'if'.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
mark qedr_get_state_from_ibqp(), __qedr_alloc_mr() and __qedr_post_send()
as static since they are only used in the same file.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Fail QP state transition from error to reset if SQ/RQ are not empty
and still in the process of flushing out the queued work entries.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
It is normal to flush CQEs if the QP is in error state. Hence there's no
use in printing a message per CQE to dmesg.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
There is only a single event queue that triggers the completion
events for the RDMA CM and it is being processed serially. This means
that inherently there can no parallelism of CQ completion handler
callbacks, hence the lock is redundant.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Return the maximum supported amount of inline data, not the qp's current
configured inline data size, when filling out the results of a query
qp call.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
If the user is requesting us to change the QP state to the same state
that it is already in, return success instead of failure.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
As the functionality to convert the MTU from a number to enum_ib_mtu
is ubiquitous, define a dedicated function and remove the duplicated
code.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Tobias Regnery says:
====================
alx: fix fallout from multi queue conversion
Here are 3 fixes for the multi queue conversion in v4.10.
The first patch fixes a wrong condition in an if statement.
Patches 2 and 3 fixes regressions in the corner case when requesting msi-x
interrupts fails and we fall back to msi or legacy interrupts.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If requesting msi-x interrupts fails in alx_request_irq we fall back to
a single tx queue and msi or legacy interrupts.
Currently the adapter stops working in this case and we get tx watchdog
timeouts. For reasons unknown the adapter gets confused when we load the
dma adresses to the chip in alx_init_ring_ptrs twice: the first time with
multiple queues and the second time in the fallback case with a single
queue.
To fix this move the the call to alx_reinit_rings (which calls
alx_init_ring_ptrs) after alx_request_irq. At this time it is clear how
much tx queues we have and which dma addresses we use.
Fixes: d768319cd4 ("alx: enable multiple tx queues")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If requesting msi-x interrupts fails we should fall back to msi or
legacy interrupts. However alx_realloc_ressources don't call
alx_init_intr, so we fail to set the right number of tx queues.
This results in watchdog timeouts and a nonfunctional adapter.
Fixes: d768319cd4 ("alx: enable multiple tx queues")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The condition to free the descriptor memory is wrong, we want to free the
memory if it is set and not if it is unset. Invert the test to fix this
issue.
Fixes: b0999223f224b ("alx: add ability to allocate and free alx_napi structures")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Another rebranded Novatel E371. qmi_wwan should drive this device, while
cdc_ether should ignore it. Even though the USB descriptors are plain
CDC-ETHER that USB interface is a QMI interface. Ref commit 7fdb7846c9
("qmi_wwan/cdc_ether: add device IDs for Dell 5804 (Novatel E371) WWAN
card")
Cc: Dan Williams <dcbw@redhat.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
sb_dirblklog is added to sb_blocklog to compute the directory block size
in bytes. Therefore, we must compare the sum of both those values
against XFS_MAX_BLOCKSIZE_LOG, not just dirblklog.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Pull namespace fix from Eric Biederman:
"This has a single brown bag fix.
The possible deadlock with dec_pid_namespaces that I had thought was
fixed earlier turned out only to have been moved. So instead of being
cleaver this change takes ucounts_lock with irqs disabled. So
dec_ucount can be used from any context without fear of deadlock.
The items accounted for dec_ucount and inc_ucount are all
comparatively heavy weight objects so I don't exepct this will have
any measurable performance impact"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
userns: Make ucounts lock irq-safe
When using the ibmveth driver in a KVM/QEMU based VM, it currently
always prints out a scary error message like this when it is started:
ibmveth 71000003 (unregistered net_device): unable to change
checksum offload settings. 1 rc=-2 ret_attr=71000003
This happens because the driver always tries to enable the checksum
offloading without checking for the availability of this feature first.
QEMU does not support checksum offloading for the spapr-vlan device,
thus we always get the error message here.
According to the LoPAPR specification, the "ibm,illan-options" property
of the corresponding device tree node should be checked first to see
whether the H_ILLAN_ATTRIUBTES hypercall and thus the checksum offloading
feature is available. Thus let's do this in the ibmveth driver, too, so
that the error message is really only limited to cases where something
goes wrong, and does not occur if the feature is just missing.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mac aging is applicable only for dynamically learnt remote mac
entries. Check for user configured static remote mac entries
and skip aging.
Signed-off-by: Balakrishnan Raman <ramanb@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch skips flushing static fdb entries in
ndo_stop, but flushes all fdb entries during vxlan
device delete. This is consistent with the bridge
driver fdb
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
ipv6: fix ip6_tnl_parse_tlv_enc_lim() issues
First patch fixes ip6_tnl_parse_tlv_enc_lim() callers,
bug added in linux-3.7
Second patch fixes ip6_tnl_parse_tlv_enc_lim() itself,
bug predates linux-2.6.12
Based on a report from Dmitry Vyukov, thanks to KASAN.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This function suffers from multiple issues.
First one is that pskb_may_pull() may reallocate skb->head,
so the 'raw' pointer needs either to be reloaded or not used at all.
Second issue is that NEXTHDR_DEST handling does not validate
that the options are present in skb->data, so we might read
garbage or access non existent memory.
With help from Willem de Bruijn.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since ip6_tnl_parse_tlv_enc_lim() can call pskb_may_pull(),
we must reload any pointer that was related to skb->head
(or skb->data), or risk use after free.
Fixes: c12b395a46 ("gre: Support GRE over IPv6")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
I don't have any guests with PAGE_SIZE > 64k but the
code seems to be clearly broken in that case
as PAGE_SIZE / MERGEABLE_BUFFER_ALIGN will need
more than 8 bit and so the code in mergeable_ctx_to_buf_address
does not give us the actual true size.
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry reported a deadlock scenario:
unix_bind() path:
u->bindlock ==> sb_writer
do_splice() path:
sb_writer ==> pipe->mutex ==> u->bindlock
In the unix_bind() code path, unix_mknod() does not have to
be done with u->bindlock held, since it is a pure fs operation,
so we can just move unix_mknod() out.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
write-back cache in degraded mode introduces corner cases to the array.
Although we try to cover all these corner cases, it is safer to just
disable write-back cache when the array is in degraded mode.
In this patch, we disable writeback cache for degraded mode:
1. On device failure, if the array enters degraded mode, raid5_error()
will submit async job r5c_disable_writeback_async to disable
writeback;
2. In r5c_journal_mode_store(), it is invalid to enable writeback in
degraded mode;
3. In r5c_try_caching_write(), stripes with s->failed>0 will be handled
in write-through mode.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Write back cache requires a complex RMW mechanism, where old data is
read into dev->orig_page for prexor, and then xor is done with
dev->page. This logic is already implemented in the write path.
However, current read path is not awared of this requirement. When
the array is optimal, the RMW is not required, as the data are
read from raid disks. However, when the target stripe is degraded,
complex RMW is required to generate right data.
To keep read path as clean as possible, we handle read path by
flushing degraded, in-journal stripes before processing reads to
missing dev.
Specifically, when there is read requests to a degraded stripe
with data in journal, handle_stripe_fill() calls
r5c_make_stripe_write_out() and exits. Then handle_stripe_dirtying()
will do the complex RMW and flush the stripe to RAID disks. After
that, read requests are handled.
There is one more corner case when there is non-overwrite bio for
the missing (or out of sync) dev. handle_stripe_dirtying() will not
be able to process the non-overwrite bios without constructing the
data in handle_stripe_fill(). This is fixed by delaying non-overwrite
bios in handle_stripe_dirtying(). So handle_stripe_fill() works on
these bios after the stripe is flushed to raid disks.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
For safer operation, all arrays start in write-through mode, which has been
better tested and is more mature. And actually the write-through/write-mode
isn't persistent after array restarted, so we always start array in
write-through mode. However, if recovery found data-only stripes before the
shutdown (from previous write-back mode), it is not safe to start the array in
write-through mode, as write-through mode can not handle stripes with data in
write-back cache. To solve this problem, we flush all data-only stripes in
r5l_recovery_log(). When r5l_recovery_log() returns, the array starts with
empty cache in write-through mode.
This logic is implemented in r5c_recovery_flush_data_only_stripes():
1. enable write back cache
2. flush all stripes
3. wake up conf->mddev->thread
4. wait for all stripes get flushed (reuse wait_for_quiescent)
5. disable write back cache
The wait in 4 will be waked up in release_inactive_stripe_list()
when conf->active_stripes reaches 0.
It is safe to wake up mddev->thread here because all the resource
required for the thread has been initialized.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
With write back cache, we use orig_page to do prexor. This patch
makes sure we read data into orig_page for it.
Flag R5_OrigPageUPTDODATE is added to show whether orig_page
has the latest data from raid disk.
We introduce a helper function uptodate_for_rmw() to simplify
the a couple conditions in handle_stripe_dirtying().
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
If the interrupt allocation failed we should start freeing the CQ rings
rather than unregistering the netdev notifier.
Fixes: 29c8d9eba5 ("IB: Add vmw_pvrdma driver")
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
For run-on-reset SMP configs, non master cores call a routine which
waits until Master gives it a "go" signal (currently using a shared
mem flag). The same routine then jumps off the well known entry point of
all non Master cores i.e. @first_lines_of_secondary
This patch moves out the last part into one single place in early boot
code.
This is better in terms of absraction (the wait API only waits) and
returns, leaving out the "jump off to" part.
In actual implementation this requires some restructuring of the early
boot code as well as Master now jumps to BSS setup explicitly,
vs. falling thru into it before.
Technically this patch doesn't cause any functional change, it just
moves the ugly #ifdef'ry from assembly code to "C"
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
commit 3c7c7a2fc8 ("ARC: Don't use "+l" inline asm constraint")
modified the inline assembly to setup LP_COUNT register manually and NOT
rely on gcc to do it (with the +l inline assembler contraint hint, now
being retired in the compiler)
However the fix was flawed as we didn't add LP_COUNT to asm clobber list,
meaning gcc doesn't know that LP_COUNT or zero-delay-loops are in action
in the inline asm.
This resulted in some fun - as nested ZOL loops were being generared
| mov lp_count,250000 ;16 # tmp235,
| lp .L__GCC__LP14 # <======= OUTER LOOP (gcc generated)
| .L14:
| ld r2, [r5] # MEM[(volatile u32 *)prephitmp_43], w
| dmb 1
| breq r2, -1, @.L21 #, w,,
| bbit0 r2,1,@.L13 # w,,
| ld r4,[r7] ;25 # loops_per_jiffy, loops_per_jiffy
| mpymu r3,r4,r6 #, loops_per_jiffy, tmp234
|
| mov lp_count, r3 # <====== INNER LOOP (from inline asm)
| lp 1f
| nop
| 1:
| nop_s
| .L__GCC__LP14: ; loop end, start is @.L14 #,
This caused issues with drivers relying on sane behaviour of udelay
friends.
With LP_COUNT added to clobber list, gcc doesn't generate the outer
loop in say above case.
Addresses STAR 9001146134
Reported-by: Joao Pinto <jpinto@synopsys.com>
Fixes: 3c7c7a2fc8 ("ARC: Don't use "+l" inline asm constraint")
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
mlxsw_sp_nexthop_group_mac_update() is called in one of two cases:
1) When the MAC of a nexthop needs to be updated
2) When the size of a nexthop group has changed
In the second case the adjacency entries for the nexthop group need to
be reallocated from the adjacency table. In this case we must write to
the entries the MAC addresses of all the nexthops that should be
offloaded and not only those whose MAC changed. Otherwise, these entries
would be filled with garbage data, resulting in packet loss.
Fixes: a7ff87acd9 ("mlxsw: spectrum_router: Implement next-hop routing")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Runtime suspend shouldn't be executed if the tx queue is not empty,
because the device is not idle.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The old description basically read like "ethernet-phy-idAAAA.BBBB" can
be specified when you know the actual PHY ID. However, specifying this
has a side-effect: it forces Linux to bind to a certain PHY driver (the
one that matches the ID given in the compatible string), ignoring the ID
which is reported by the actual PHY.
Whenever a device is shipped with (multiple) different PHYs during it's
production lifetime then explicitly specifying
"ethernet-phy-idAAAA.BBBB" could break certain revisions of that device.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ignore value of interrupt distribution mode for common interrupts in
IDU since setting of affinity using value from Device Tree is deprecated
in ARC. Originally it is done in idu_irq_xlate() function and it is
semantically wrong and does not guaranty that an affinity value will be
set properly. idu_irq_enable() function is better place for
initialization of common interrupts.
By default send all common interrupts to all available online CPUs.
The affinity of common interrupts in IDU must be set manually since
in some cases the kernel will not call irq_set_affinity() by itself:
1. When the kernel is not configured with support of SMP.
2. When the kernel is configured with support of SMP but upper
interrupt controllers does not support setting of the affinity
and cannot propagate it to IDU.
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Some nfsv4.0 servers may return a mode for the verifier following an open
with EXCLUSIVE4 createmode, but this does not mean the client should skip
setting the mode in the following SETATTR. It should only do that for
EXCLUSIVE4_1 or UNGAURDED createmode.
Fixes: 5334c5bdac ("NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Xuan Qi reports that the Linux NFSv4 client failed to lock a file
that was migrated. The steps he observed on the wire:
1. The client sent a LOCK request to the source server
2. The source server replied NFS4ERR_MOVED
3. The client switched to the destination server
4. The client sent the same LOCK request to the destination
server with a bumped lock sequence ID
5. The destination server rejected the LOCK request with
NFS4ERR_BAD_SEQID
RFC 3530 section 8.1.5 provides a list of NFS errors which do not
bump a lock sequence ID.
However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section
9.1.7, this list has been updated by the addition of NFS4ERR_MOVED.
Reported-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
For devices that can register page list that is bigger than
USHRT_MAX, we actually take the wrong value for sg_tablesize.
E.g: for CX4 max_fast_reg_page_list_len is 65536 (bigger than USHRT_MAX)
so we set sg_tablesize to 0 by mistake. Therefore, each IO that is
bigger than 4k splitted to "< 4k" chunks that cause performance degredation.
Remove wrong sg_tablesize assignment, and use the value that was set during
address resolution handler with the needed casting.
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
For last few months Darren and I are co-maintaining PDx86 subsystem.
Make this fact official by updating MAINTAINERS database.
Acked-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
After setting indirect_sg_entries module_param to huge value (e.g 500,000),
srp_alloc_req_data() fails to allocate indirect descriptors for the request
ring (kmalloc fails). This commit enforces the maximum value of
indirect_sg_entries to be SG_MAX_SEGMENTS as signified in module param
description.
Fixes: 65e8617fba (scsi: rename SCSI_MAX_{SG, SG_CHAIN}_SEGMENTS)
Fixes: c07d424d61 (IB/srp: add support for indirect tables that don't fit in SRP_CMD)
Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>--
Signed-off-by: Doug Ledford <dledford@redhat.com>
Johannes Berg says:
====================
A single fix, for a sleeping context problem found by LTP.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In my previous patch, I missed that rate_control_rate_init() is
called from some places that cannot sleep, so it cannot call
ieee80211_recalc_min_chandef(). Remove that call for now to fix
the context bug, we'll have to find a different way to fix the
minimum channel width issue.
Fixes: 96aa2e7cf1 ("mac80211: calculate min channel width correctly")
Reported-by: Xiaolong Ye (via lkp-robot) <xiaolong.ye@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
We can't dereference the dio structure after submitting the last bio for
this request, as I/O completion might have happened before the code is
run. Introduce a local is_sync variable instead.
Fixes: 542ff7bf ("block: new direct I/O implementation")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Matias Bjørling <m@bjorling.me>
Tested-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
convert_vx_to_fp() is adapted to handle only a specified number of
registers rather than unconditionally handling all of them: other
callers of this function are adapted appropriately.
Based on an initial patch by Dave Martin.
Cc: stable@vger.kernel.org
Reported-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We cannot call nfs4_handle_exception() without first ensuring that the
slot has been freed. If not, we end up deadlocking with the process
waiting for recovery to complete, and recovery waiting for the slot
table to drain.
Fixes: 2e80dbe7ac ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Currently, if the user add a stateful object with the name size exceed
NFT_OBJ_MAXNAMELEN - 1 (i.e. 31), we truncate it down to 31 silently.
This is not friendly, furthermore, this will cause duplicated stateful
objects when the first 31 characters of the name is same. So limit the
stateful object's name size to NFT_OBJ_MAXNAMELEN - 1.
After apply this patch, error message will be printed out like this:
# name_32=$(printf "%0.sQ" {1..32})
# nft add counter filter $name_32
<cmdline>:1:1-52: Error: Could not process rule: Numerical result out
of range
add counter filter QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Also this patch cleans up the codes which missing the name size limit
validation in nftables.
Fixes: e50092404c ("netfilter: nf_tables: add stateful objects")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull tile bugfix from Chris Metcalf:
"This avoids an issue with short userspace reads for regset via ptrace"
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile/ptrace: Preserve previous registers for short regset write
Return success when the ring is properly initialized, otherwise return
failure.
Tonga SRIOV VF doesn't have UVD and VCE engines, the initialization of
these IPs is bypassed. The system crashes if application submit IB to
their rings which are not ready to use. It could be a common issue if
IP having ring buffer is disabled for some reason on specific ASIC, so
it should check the ring being ready to use.
Bug: amdgpu_test crashes system on Tonga VF.
Signed-off-by: Ding Pixel <Pixel.Ding@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull GPIO fix from Linus Walleij:
"A single lockdep fix, nothing else going on. This makes lockdep
noiseless and work properly with threaded GPIO IRQchips.
Summary:
Fix a lockdep issue: the threaded irqchips also need their unique key,
and take this opportunity to get rid of the horrible macro and replace
it with a static inline"
* tag 'gpio-v4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: provide lockdep keys for nested/unnested irqchips
Pull drm fixes from Dave Airlie:
"drm fixes across the board.
Okay holidays and LCA kinda caught up with me, I thought I'd get some
of this dequeued last week, but Hobart was sunny and warm and not all
gloomy and rainy as usual.
This is a bit large, but not too much considering it's two weeks stuff
from AMD and Intel.
core:
- one locking fix that helps with dynamic suspend/resume races
i915:
- mostly GVT updates, GVT was a recent introduction so fixes for it
shouldn't cause any notable side effects.
amdgpu:
- a bunch of fixes for GPUs with a different memory controller design
that need different firmware.
exynos:
- decon regression fixes
msm:
- two regression fixes
etnaviv:
- a workaround for an mmu bug that needs a lot more work.
virtio:
- sparse fix, and a maintainers update"
* tag 'drm-fixes-for-v4.10-rc6' of git://people.freedesktop.org/~airlied/linux: (56 commits)
drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement
drm/exynos/decon5433: fix CMU programming
drm/exynos/decon5433: do not disable video after reset
drm/i915: Ignore bogus plane coordinates on SKL when the plane is not visible
drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
drm/amdgpu: add support for new hainan variants
drm/radeon: add support for new hainan variants
drm/amdgpu: change clock gating mode for uvd_v4.
drm/amdgpu: fix program vce instance logic error.
drm/amdgpu: fix bug set incorrect value to vce register
Revert "drm/amdgpu: Only update the CUR_SIZE register when necessary"
drm/msm: fix potential null ptr issue in non-iommu case
drm/msm/mdp5: rip out plane->pending tracking
drm/exynos/decon5433: set STANDALONE_UPDATE_F also if planes are disabled
drm/exynos/decon5433: update shadow registers iff there are active windows
drm/i915/gvt: rewrite gt reset handler using new function intel_gvt_reset_vgpu_locked
drm/i915/gvt: fix vGPU instance reuse issues by vGPU reset function
drm/i915/gvt: introduce intel_vgpu_reset_mmio() to reset mmio space
drm/i915/gvt: move mmio init/clean function to mmio.c
drm/i915/gvt: introduce intel_vgpu_reset_cfg_space to reset configuration space
...
We need to check the return value of phy_connect_direct() in
dsa_slave_phy_connect() otherwise we may be continuing the
initialization of a slave network device with a PHY that already
attached somewhere else and which will soon be in error because the PHY
device is in error.
The conditions for such an error to occur are that we have a port of our
switch that is not disabled, and has the same port number as a PHY
address (say both 5) that can be probed using the DSA slave MII bus. We
end-up having this slave network device find a PHY at the same address
as our port number, and we try to attach to it.
A slave network (e.g: port 0) has already attached to our PHY device,
and we try to re-attach it with a different network device, but since we
ignore the error we would end-up initializating incorrect device
references by the time the slave network interface is opened.
The code has been (re)organized several times, making it hard to provide
an exact Fixes tag, this is a bugfix nonetheless.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
phy_error() is called in the PHY state machine workqueue context, and
calls phy_trigger_machine() which does a cancel_delayed_work_sync() of
the workqueue we execute from, causing a deadlock situation.
Augment phy_trigger_machine() machine with a sync boolean indicating
whether we should use cancel_*_sync() or just cancel_*_work().
Fixes: 3c293f4e08 ("net: phy: Trigger state machine on state change and not polling.")
Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the way how xfs_iomap_write_allocate tries to convert the whole
found extents from delalloc to real space we can run into a race
condition with multiple threads doing writes to this same extent.
For the non-COW case that is harmless as the only thing that can happen
is that we call xfs_bmapi_write on an extent that has already been
converted to a real allocation. For COW writes where we move the extent
from the COW to the data fork after I/O completion the race is, however,
not quite as harmless. In the worst case we are now calling
xfs_bmapi_write on a region that contains hole in the COW work, which
will trip up an assert in debug builds or lead to file system corruption
in non-debug builds. This seems to be reproducible with workloads of
small O_DSYNC write, although so far I've not managed to come up with
a with an isolated reproducer.
The fix for the issue is relatively simple: tell xfs_bmapi_write
that we are only asked to convert delayed allocations and skip holes
in that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Making use of "max_indirect_segments=" has issues:
- blkfront_setup_indirect() may end up with zero psegs when PAGE_SIZE
is sufficiently much larger than XEN_PAGE_SIZE
- the variable driven by the command line option
(xen_blkif_max_segments) has a somewhat different purpose, and hence
should namely never end up being zero
- as long as the specified value is lower than the legacy default,
we better don't use indirect segments at all (or we'd in fact lower
throughput)
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Don't truncate the "feature-persistent" value read from xenstore: Any
non-zero value is supposed to enable the feature, just like is already
being done for feature_secdiscard.
Just like the other feature_* fields, feature_flush and feature_fua are
boolean flags, and hence fit well into a single bit.
Keep all bit fields together to limit gaps.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
MPLS multipath for LSR is broken -- always selecting the first nexthop
in the one label case. For example:
$ ip -f mpls ro ls
100
nexthop as to 200 via inet 172.16.2.2 dev virt12
nexthop as to 300 via inet 172.16.3.2 dev virt13
101
nexthop as to 201 via inet6 2000:2::2 dev virt12
nexthop as to 301 via inet6 2000:3::2 dev virt13
In this example incoming packets have a single MPLS labels which means
BOS bit is set. The BOS bit is passed from mpls_forward down to
mpls_multipath_hash which never processes the hash loop because BOS is 1.
Update mpls_multipath_hash to process the entire label stack. mpls_hdr_len
tracks the total mpls header length on each pass (on pass N mpls_hdr_len
is N * sizeof(mpls_shim_hdr)). When the label is found with the BOS set
it verifies the skb has sufficient header for ipv4 or ipv6, and find the
IPv4 and IPv6 header by using the last mpls_hdr pointer and adding 1 to
advance past it.
With these changes I have verified the code correctly sees the label,
BOS, IPv4 and IPv6 addresses in the network header and icmp/tcp/udp
traffic for ipv4 and ipv6 are distributed across the nexthops.
Fixes: 1c78efa831 ("mpls: flow-based multipath selection")
Acked-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's necessary to setup bus if any slots are present.
- update clock after ctrl reset
- if the host has genpd node, we can guarantee the clock is
available before starting request. Otherwies, the clock register
is reset once power off the pd, and host can't output the active
clock during communication.
Fixes: e9ed8835e9 ("mmc: dw_mmc: add runtime PM callback")
Fixes: df9bcc2bc0 ("mmc: dw_mmc: add missing codes for runtime resume")
cc: <stable@vger.kernel.org>
Reported-by: Randy Li <randy.li@rock-chips.com>
Reported-by: S. Gilles <sgilles@math.umd.edu>
Signed-off-by: Ziyuan Xu <xzy.xu@rock-chips.com>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
a single fix for a FE hang after IOVA rollover on GC3000. This isn't
pretty, but is the minimal fix for the issue. A larger rework of the
code, that will also fix this issue properly, is currently in the works,
but that needs to wait for at least the next feature pull.
* 'drm-etnaviv-fixes' of https://git.pengutronix.de/git/lst/linux:
drm/etnaviv: trick drm_mm into giving out a low IOVA
Just regression fixups to resolve page fault issue of DECON device.
* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement
drm/exynos/decon5433: fix CMU programming
drm/exynos/decon5433: do not disable video after reset
drm/exynos/decon5433: set STANDALONE_UPDATE_F also if planes are disabled
drm/exynos/decon5433: update shadow registers iff there are active windows
A little bigger than usual since it's two weeks worth. Highlights:
- Add support for new smc firmware on some new hainan variants
- add support for SI chips that require special mc firmware
- remove workarounds for issues fixed by new mc firmware
- fix a regression in cursor handling
- various VCE fixes
- fix for UVD clockgating
* 'drm-fixes-4.10' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: add support for new hainan variants
drm/radeon: add support for new hainan variants
drm/amdgpu: change clock gating mode for uvd_v4.
drm/amdgpu: fix program vce instance logic error.
drm/amdgpu: fix bug set incorrect value to vce register
Revert "drm/amdgpu: Only update the CUR_SIZE register when necessary"
drm/amd/powerplay: refine vce dpm update code on Cz.
drm/amdgpu: fix vm_fault_stop on gfx6
drm/amd/powerplay: fix vce cg logic error on CZ/St.
drm/radeon: drop the mclk quirk for hainan
drm/radeon: drop oland quirks
drm/amdgpu: drop the mclk quirk for hainan
drm/amdgpu: drop oland quirks
drm/amdgpu/si: load special ucode for certain MC configs
drm/radeon/si: load special ucode for certain MC configs
* 'msm-fixes-4.10-rc4' of git://people.freedesktop.org/~robclark/linux:
drm/msm: fix potential null ptr issue in non-iommu case
drm/msm/mdp5: rip out plane->pending tracking
A few more core fixes.
* tag 'drm-misc-fixes-2017-01-13' of git://anongit.freedesktop.org/git/drm-misc:
drm/probe-helpers: Drop locking from poll_enable
drm: Fix broken VT switch with video=1366x768 option
drm: Schedule the output_poll_work with 1s delay if we have delayed event
More GVT-g stuff than I'd like at this stage, but then again that's
pretty new and isolated so I'm not too worried.
* tag 'drm-intel-fixes-2017-01-19' of git://anongit.freedesktop.org/git/drm-intel: (26 commits)
drm/i915: Ignore bogus plane coordinates on SKL when the plane is not visible
drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
drm/i915/gvt: rewrite gt reset handler using new function intel_gvt_reset_vgpu_locked
drm/i915/gvt: fix vGPU instance reuse issues by vGPU reset function
drm/i915/gvt: introduce intel_vgpu_reset_mmio() to reset mmio space
drm/i915/gvt: move mmio init/clean function to mmio.c
drm/i915/gvt: introduce intel_vgpu_reset_cfg_space to reset configuration space
drm/i915/gvt: move cfg space inititation function to cfg_space.c
drm/i915/gvt: introuduce intel_vgpu_reset_gtt() to reset gtt
drm/i915/gvt: introudce intel_vgpu_reset_resource() to reset vgpu resource state
drm/i915: Fix phys pwrite for struct_mutex-less operation
drm/i915: Clear ret before unbinding in i915_gem_evict_something()
drm/i915/gvt: cleanup GFP flags
drm/i915/gvt/kvmgt: return meaningful error for vgpu creating failure
drm/i915/gvt: cleanup opregion memory allocation code
drm/i915/gvt: destroy the allocated idr on vgpu creating failures
drm/i915/gvt: init/destroy vgpu_idr properly
drm/i915/gvt: dec vgpu->running_workload_num after the workload is really done
drm/i915/gvt: fix use after free for workload
drm/i915/gvt: remove duplicated definition
...
Tom Lendacky says:
====================
amd-xgbe: AMD XGBE driver fixes 2017-01-20
This patch series addresses some issues in the AMD XGBE driver.
The following fixes are included in this driver update series:
- Add a fix for a version of the hardware that uses different register
offset values for a device with the same PCI device ID
- Add support to check the return code from the xgbe_init() function
This patch series is based on net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The xgbe_init() routine returns a return code indicating success or
failure, but the return code is not checked. Add code to xgbe_init()
to issue a message when failures are seen and add code to check the
xgbe_init() return code.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A newer version of the hardware is using the same PCI ids for the network
device but has altered register definitions for determining the window
settings for the indirect PCS access. Add support to check for this
hardware and if found use the new register values.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull x86 fix from Thomas Gleixner:
"Restore the retrigger callbacks in the IO APIC irq chips. That
addresses a long standing regression which got introduced with the
rewrite of the x86 irq subsystem two years ago and went unnoticed so
far"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ioapic: Restore IO-APIC irq_chip retrigger callback
Pull smp/hotplug fix from Thomas Gleixner:
"Remove an unused variable which is a leftover from the notifier
removal"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Remove unused but set variable in _cpu_down()
Pull virtio/vhost fixes from Michael Tsirkin:
"Random fixes and cleanups that accumulated over the time"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio/s390: virtio: constify virtio_config_ops structures
virtio/s390: add missing \n to end of dev_err message
virtio/s390: support READ_STATUS command for virtio-ccw
tools/virtio/ringtest: tweaks for s390
tools/virtio/ringtest: fix run-on-all.sh for offline cpus
virtio_console: fix a crash in config_work_handler
vhost/scsi: silence uninitialized variable warning
vhost: scsi: constify target_core_fabric_ops structures
Pull thermal management fixes from Zhang Rui:
- fix a regression that thermal zone dynamically allocated sysfs
attributes are freed before they're removed, which is introduced in
4.10-rc1 (Jacob von Chorus)
- fix a boot warning because deprecated hwmon API is used (Fabio
Estevam)
- a couple of fixes for rockchip thermal driver (Brian Norris, Caesar
Wang)
* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
thermal: rockchip: fixes the conversion table
thermal: core: move tz->device.groups cleanup to thermal_release
thermal: thermal_hwmon: Convert to hwmon_device_register_with_info()
thermal: rockchip: handle set_trips without the trip points
thermal: rockchip: optimize the conversion table
thermal: rockchip: fixes invalid temperature case
thermal: rockchip: don't pass table structs by value
thermal: rockchip: improve conversion error messages
On Ideapad laptops, ACPI event 1 is currently not handled. Many models
log "ideapad_laptop: Unknown event: 1" every 20 seconds or so while
running on battery power. Some convertible laptops receive this event
when switching in and out of tablet mode.
This adds and additional case for event 1 in ideapad_acpi_notify to call
ideapad_input_report(priv, vpc_bit), so that the event is reported to
userspace and we avoid unnecessary logging.
Fixes bug #107481 (https://bugzilla.kernel.org/show_bug.cgi?id=107481)
Fixes bug #65751 (https://bugzilla.kernel.org/show_bug.cgi?id=65751)
Signed-off-by: Zach Ploskey <zach@ploskey.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Pull USB fixes from Greg KH:
"Here are a few small USB fixes for 4.10-rc5.
Most of these are gadget/dwc2 fixes for reported issues, all of these
have been in linux-next for a while. The last one is a single xhci
WARN_ON removal to handle an issue that the dwc3 driver is hitting in
the 4.10-rc tree. The warning is harmless and needs to be removed, and
a "real" fix that is more complex will show up in 4.11-rc1 for this
device.
That last patch hasn't been in linux-next yet due to the weekend
timing, but it's a "simple" WARN_ON() removal so what could go wrong?
:)"
Famous last words.
* tag 'usb-4.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
xhci: remove WARN_ON if dma mask is not set for platform devices
usb: dwc2: host: fix Wmaybe-uninitialized warning
usb: dwc2: gadget: Fix GUSBCFG.USBTRDTIM value
usb: gadget: udc: atmel: remove memory leak
usb: dwc3: exynos fix axius clock error path to do cleanup
usb: dwc2: Avoid suspending if we're in gadget mode
usb: dwc2: use u32 for DT binding parameters
usb: gadget: f_fs: Fix iterations on endpoints.
usb: dwc2: gadget: Fix DMA memory freeing
usb: gadget: composite: Fix function used to free memory
Pull libnvdimm fixes from Dan Williams:
"Two fixes:
- a regression fix for the multiple-pmem-namespace-per-region support
added in 4.9. Even if an existing environment is not using that
feature the act of creating and a destroying a single namespace
with the ndctl utility will lead to the proliferation of extra
unwanted namespace devices.
- a fix for the error code returned from the pmem driver when the
memcpy_mcsafe() routine returns -EFAULT. Btrfs seems to be the only
block I/O consumer that tries to parse the meaning of the error
code when it is non-zero.
Neither of these fixes are critical, the namespace leak is awkward in
that it can cause device naming to change and complicates debugging
namespace initialization issues. The error code fix is included out of
caution for what other consumers might be expecting -EIO for block I/O
errors"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm, namespace: fix pmem namespace leak, delete when size set to zero
pmem: return EIO on read_pmem() failure
Pull clk fix from Stephen Boyd:
"One fix for Samsung Exynos524x SoCs where recent IOMMU patches have
caused some of these clocks to turn off when they were always left on
before"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk/samsung: exynos542x: mark some clocks as critical
Pull ARC fixes from Vineet Gupta:
- more intc updates [Yuriv]
- fix module build when unwinder is turned off
- IO Coherency Programming model updates
- other miscellaneous
* tag 'arc-4.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: Revert "ARC: mm: IOC: Don't enable IOC by default"
ARC: mm: split arc_cache_init to allow __init reaping of bulk
ARCv2: IOC: Use actual memory size to setup aperture size
ARCv2: IOC: Adhere to progamming model guidelines to avoid DMA corruption
ARCv2: IOC: refactor the IOC and SLC operations into own functions
ARC: module: Fix !CONFIG_ARC_DW2_UNWIND builds
ARCv2: save r30 on kernel entry as gcc uses it for code-gen
ARCv2: IRQ: Call entry/exit functions for chained handlers in MCIP
ARC: IRQ: Use hwirq instead of virq in mask/unmask
ARC: mmu: clarify the MMUv3 programming model
Pull powerpc fixes from Michael Ellerman:
"Two fixes for fallout from the hugetlb changes we merged this cycle.
Ten other fixes, four only affect Power9, and the rest are a bit of a
mixture though nothing terrible.
Thanks to: Aneesh Kumar K.V, Anton Blanchard, Benjamin Herrenschmidt,
Dave Martin, Gavin Shan, Madhavan Srinivasan, Nicholas Piggin, Reza
Arbab"
* tag 'powerpc-4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Ignore reserved field in DCSR and PVR reads and writes
powerpc/ptrace: Preserve previous TM fprs/vsrs on short regset write
powerpc/ptrace: Preserve previous fprs/vsrs on short regset write
powerpc/perf: Use MSR to report privilege level on P9 DD1
selftest/powerpc: Wrong PMC initialized in pmc56_overflow test
powerpc/eeh: Enable IO path on permanent error
powerpc/perf: Fix PM_BRU_CMPL event code for power9
powerpc/mm: Fix little-endian 4K hugetlb
powerpc/mm/hugetlb: Don't panic when we don't find the default huge page size
powerpc: Fix pgtable pmd cache init
powerpc/icp-opal: Fix missing KVM case and harden replay
powerpc/mm: Fix memory hotplug BUG() on radix
The commit 1c6c69525b ("genirq: Reject bogus threaded irq requests")
starts refusing misconfigured interrupt handlers. This makes
intel_mid_powerbtn not working anymore.
Add a mandatory flag to a threaded IRQ request in the driver.
Fixes: 1c6c69525b ("genirq: Reject bogus threaded irq requests")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
The patch 3dda3b3798: "platform/x86: Add custom surface3 platform
device for controlling LID" from Nov 25, 2016, leads to the following
static checker warning:
drivers/platform/x86/surface3-wmi.c:168 s3_wmi_check_platform_device()
error: uninitialized symbol 'ts_adev'.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
The newly added driver guards its "resume" callback with an
warning in some configurations:
drivers/platform/x86/surface3-wmi.c:248:12: error: 's3_wmi_resume' defined but not used [-Werror=unused-function]
Using a __maybe_unused annotation without an #ifdef avoids the mistake more
reliably.
Fixes: 3dda3b3798 ("platform/x86: Add custom surface3 platform device for controlling LID")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
There is an off-by-one error so we don't unregister priv->pdev_mux[0].
Also it's slightly simpler as a while loop instead of a for loop.
Fixes: 58cbbee239 ("x86/platform/mellanox: Introduce support for Mellanox systems platform")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Pull KVM fixes from Radim Krčmář:
"ARM:
- Fix for timer setup on VHE machines
- Drop spurious warning when the timer races against the vcpu running
again
- Prevent a vgic deadlock when the initialization fails (for stable)
s390:
- Fix a kernel memory exposure (for stable)
x86:
- Fix exception injection when hypercall instruction cannot be
patched"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: do not expose random data via facility bitmap
KVM: x86: fix fixing of hypercalls
KVM: arm/arm64: vgic: Fix deadlock on error handling
KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems
KVM: arm/arm64: Fix occasional warning from the timer work function
Pull SCSI target fixes from Bart Van Assche:
- two small fixes for the ibmvscsis driver
- ten patches with bug fixes for the target mode of the qla2xxx driver
- four patches that avoid that the "sparse" and "smatch" static
analyzer tools report false positives for the qla2xxx code base
* 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux:
qla2xxx: Disable out-of-order processing by default in firmware
qla2xxx: Fix erroneous invalid handle message
qla2xxx: Reduce exess wait during chip reset
qla2xxx: Terminate exchange if corrupted
qla2xxx: Fix crash due to null pointer access
qla2xxx: Collect additional information to debug fw dump
qla2xxx: Reset reserved field in firmware options to 0
qla2xxx: Set tcm_qla2xxx version to automatically track qla2xxx version
qla2xxx: Include ATIO queue in firmware dump when in target mode
qla2xxx: Fix wrong IOCB type assumption
qla2xxx: Avoid that building with W=1 triggers complaints about set-but-not-used variables
qla2xxx: Move two arrays from header files to .c files
qla2xxx: Declare an array with file scope static
qla2xxx: Fix indentation
ibmvscsis: Fix sleeping in interrupt context
ibmvscsis: Fix max transfer length
Pull block fixes from Jens Axboe:
"Just two small fixes for this -rc.
One is just killing an unused variable from Keith, but the other
fixes a performance regression for nbd in this series, where we
inadvertently flipped when we set MSG_MORE when outputting data"
* 'for-linus' of git://git.kernel.dk/linux-block:
nbd: only set MSG_MORE when we have more to send
blk-mq: Remove unused variable
Pull spi fixes from Mark Brown:
"The usual small smattering of driver specific fixes. A few bits that
stand out here:
- the R-Car patches adding fallbacks are just adding new compatible
strings to the driver so that device trees are written in a more
robustly future proof fashion, this isn't strictly a fix but it's
just new IDs and it's better to get it into mainline sooner to
improve the ABI
- the DesignWare "switch to new API part 2" patch is actually a
misleadingly titled fix for a bit that got missed in the original
conversion"
* tag 'spi-fix-v4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: davinci: use dma_mapping_error()
spi: spi-axi: Free resources on error path
spi: pxa2xx: add missed break
spi: dw-mid: switch to new dmaengine_terminate_* API (part 2)
spi: dw: Make debugfs name unique between instances
spi: sh-msiof: Do not use C++ style comment
spi: armada-3700: Set mode bits correctly
spi: armada-3700: fix unsigned compare than zero on irq
spi: sh-msiof: Add R-Car Gen 2 and 3 fallback bindings
spi: SPI_FSL_DSPI should depend on HAS_DMA
Pull ceph fixes from Ilya Dryomov:
"Three filesystem endianness fixes (one goes back to the 2.6 era, all
marked for stable) and two fixups for this merge window's patches"
* tag 'ceph-for-4.10-rc5' of git://github.com/ceph/ceph-client:
ceph: fix bad endianness handling in parse_reply_info_extra
ceph: fix endianness bug in frag_tree_split_cmp
ceph: fix endianness of getattr mask in ceph_d_revalidate
libceph: make sure ceph_aes_crypt() IV is aligned
ceph: fix ceph_get_caps() interruption
Any bridge options specified during link creation (e.g. ip link add)
are ignored as br_dev_newlink() does not process them.
Use br_changelink() to do it.
Fixes: 1332351617 ("bridge: implement rtnl_link_ops->changelink")
Signed-off-by: Ivan Vecera <cera@cera.cz>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull overlayfs fix from Miklos Szeredi:
"This fixes a regression introduced in this cycle"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fix possible use after free on redirect dir lookup
Pull fuse fixes from Miklos Szeredi:
"Fix two regressions, one introduced in 4.9 and a less recent one in
4.2"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: fix time_to_jiffies nsec sanity check
fuse: clear FR_PENDING flag when moving requests out of pending queue
Pull SCSI fixes from James Bottomley:
"This is a set of 12 fixes including the mpt3sas one that was causing
hangs on ATA passthrough.
The others are a couple of zoned block device fixes, a SAS device
detection bug which lead to SATA drives not being matched to bays, two
qla2xxx MSI fixes, a qla2xxx req for rsp confusion caused by cut and
paste, and a few other minor fixes"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpt3sas: fix hang on ata passthrough commands
scsi: lpfc: Set elsiocb contexts to NULL after freeing it
scsi: sd: Ignore zoned field for host-managed devices
scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type
scsi: bfa: fix wrongly initialized variable in bfad_im_bsg_els_ct_request()
scsi: ses: Fix SAS device detection in enclosure
scsi: libfc: Fix variable name in fc_set_wwpn
scsi: lpfc: avoid double free of resource identifiers
scsi: qla2xxx: remove irq_affinity_notifier
scsi: qla2xxx: fix MSI-X vector affinity
scsi: qla2xxx: Fix apparent cut-n-paste error.
scsi: qla2xxx: Get mutex lock before checking optrom_state
Pull arm64 fixes from Catalin Marinas:
- avoid potential stack information leak via the ptrace ABI caused by
uninitialised variables
- SWIOTLB DMA API fall-back allocation fix when the SWIOTLB buffer is
not initialised (all RAM is suitable for 32-bit DMA masks)
- fix the bad_mode function returning for unhandled exceptions coming
from user space
- fix name clash in __page_to_voff()
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: avoid returning from bad_mode
arm64/ptrace: Reject attempts to set incomplete hardware breakpoint fields
arm64/ptrace: Avoid uninitialised struct padding in fpr_set()
arm64/ptrace: Preserve previous registers for short regset write
arm64/ptrace: Preserve previous registers for short regset write
arm64/ptrace: Preserve previous registers for short regset write
arm64: mm: avoid name clash in __page_to_voff()
arm64: Fix swiotlb fallback allocation
During an OOM scenario, request slots could not be created as skb
allocation fails. So the netback cannot pass in packets and netfront
wrongly assumes that there is no more work to be done and it disables
polling. This causes Rx to stall.
The issue is with the retry logic which schedules the timer if the
created slots are less than NET_RX_SLOTS_MIN. The count of new request
slots to be pushed are calculated as a difference between new req_prod
and rsp_cons which could be more than the actual slots, if there are
unconsumed responses.
The fix is to calculate the count of newly created slots as the
difference between new req_prod and old req_prod.
Signed-off-by: Vineeth Remanan Pillai <vineethp@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A driver using dev_alloc_page() must not reuse a page allocated from
emergency memory reserve.
Otherwise all packets using this page will be immediately dropped,
unless for very specific sockets having SOCK_MEMALLOC bit set.
This issue might be hard to debug, because only a fraction of received
packets would be dropped.
Fixes: 4415a0319f ("net/mlx5e: Implement RX mapped page cache for page recycle")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix build errors for samples/bpf xdp_tx_iptunnel and tc_l2_redirect,
when dynamic debugging is enabled (CONFIG_DYNAMIC_DEBUG) by defining a
fake KBUILD_MODNAME.
Just like Daniel Borkmann fixed other samples/bpf in commit
96a8eb1eee ("bpf: fix samples to add fake KBUILD_MODNAME").
Fixes: 12d8bb64e3 ("bpf: xdp: Add XDP example for head adjustment")
Fixes: 90e02896f1 ("bpf: Add test for bpf_redirect to ipip/ip6tnl")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gcc-7 and probably earlier versions get confused by this function
and print a harmless warning:
drivers/net/ethernet/broadcom/bcm63xx_enet.c: In function 'bcm_enet_open':
drivers/net/ethernet/broadcom/bcm63xx_enet.c:1130:3: error: 'phydev' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This adds an initialization for the 'phydev' variable when it is unused
and changes the check to test for that NULL pointer to make it clear
that we always pass a valid pointer here.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct qed_ll2_info is rather large, so putting it on the stack
can cause an overflow, as this warning tries to tell us:
drivers/net/ethernet/qlogic/qed/qed_ll2.c: In function 'qed_ll2_start':
drivers/net/ethernet/qlogic/qed/qed_ll2.c:2159:1: error: the frame size of 1056 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
qed_ll2_start_ooo() already uses a dynamic allocation for the structure
to work around that problem, and we could do the same in qed_ll2_start()
as well as qed_roce_ll2_start(), but since the structure is only
used to pass a couple of initialization values here, it seems nicer
to replace it with a different structure.
Lacking any idea for better naming, I'm adding 'struct qed_ll2_conn',
which now contains all the initialization data, and this now simply
gets copied into struct qed_ll2_info rather than assigning all members
one by one.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The comparison on the timeout can lead to an array overrun
read on sctp_timer_tbl because of an off-by-one error. Fix
this by using < instead of <= and also compare to the array
size rather than SCTP_EVENT_TIMEOUT_MAX.
Fixes CoverityScan CID#1397639 ("Out-of-bounds read")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
seg6_genl_get_tunsrc() and set_tun_src() do not handle tun_src being
possibly NULL, so we must check kmemdup() return value and abort if
it is NULL
Fixes: 915d7e5e59 ("ipv6: sr: add code base for control plane support of SR-IPv6")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Lebrun <david.lebrun@uclouvain.be>
Acked-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rtl8152_post_reset() should sumbit rx urb and interrupt transfer,
otherwise the rx wouldn't work and the linking change couldn't be
detected.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 501db51139 ("virtio: don't set VIRTIO_NET_HDR_F_DATA_VALID on
xmit") in fact disables VIRTIO_HDR_F_DATA_VALID on receiving path too,
fixing this by adding a hint (has_data_valid) and set it only on the
receiving path.
Cc: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
kvm_s390_get_machine() populates the facility bitmap by copying bytes
from the host results that are stored in a 256 byte array in the prefix
page. The KVM code does use the size of the target buffer (2k), thus
copying and exposing unrelated kernel memory (mostly machine check
related logout data).
Let's use the size of the source buffer instead. This is ok, as the
target buffer will always be greater or equal than the source buffer as
the KVM internal buffers (and thus S390_ARCH_FAC_LIST_SIZE_BYTE) cover
the maximum possible size that is allowed by STFLE, which is 256
doublewords. All structures are zero allocated so we can leave bytes
256-2047 unchanged.
Add a similar fix for kvm_arch_init_vm().
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[found with smatch]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: stable@vger.kernel.org
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
The warn on is a bit too much, we will anyway set the dma mask if not set
previously.
The main reason for this fix is that 4.10-rc1 has a dwc3 change that
pass a parent sysdev dev pointer instead of setting the dma mask of
its xhci platform device. xhci platform driver can then get more
attributes from the sysdev than just the dma mask.
The usb core and xhci changes are not yet in 4.10, and a fix like
this was preferred instead of taking those big changes this late in
the rc-cycle.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In xen_swiotlb_map_page and xen_swiotlb_map_sg_attrs, if the original
page is not suitable, we swap it for another page from the swiotlb
pool.
In these cases, we don't update the previously calculated dma address
for the page before calling xen_dma_map_page. Thus, we end up calling
xen_dma_map_page passing the wrong dev_addr, resulting in
xen_dma_map_page mistakenly assuming that the page is foreign when it is
local.
Fix the bug by updating dev_addr appropriately.
This change has no effect on x86, because xen_dma_map_page is a stub
there.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Pooya Keshavarzi <Pooya.Keshavarzi@de.bosch.com>
Tested-by: Pooya Keshavarzi <Pooya.Keshavarzi@de.bosch.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
DECON_TV requires STANDALONE_UPDATE after output enabling, otherwise it does
not start. This change is neutral for DECON.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
DECON_CMU register has reserved bits which should not be zeroed, otherwise
IP can behave strangely and cause IOMMU faults.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
decon_commit is called just after reset so video is disabled anyway.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
IBM bit 31 (for the rest of us - bit 0) is a reserved field in the
instruction definition of mtspr and mfspr. Hardware is encouraged to
(and does) ignore it.
As a result, if userspace executes an mtspr DSCR with the reserved bit
set, we get a DSCR facility unavailable exception. The kernel fails to
match against the expected value/mask, and we silently return to
userspace to try and re-execute the same mtspr DSCR instruction. We
loop forever until the process is killed.
We should do something here, and it seems mirroring what hardware does
is the better option vs killing the process. While here, relax the
matching of mfspr PVR too.
Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET
to fill all the check pointed registers, the thread's old check pointed
registers are preserved.
Fixes: 9d3918f7c0 ("powerpc/ptrace: Enable support for NT_PPC_CVSX")
Fixes: 19cbcbf75a ("powerpc/ptrace: Enable support for NT_PPC_CFPR")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET
to fill all the registers, the thread's old registers are preserved.
Fixes: c6e6771b87 ("powerpc: Introduce VSX thread_struct and CONFIG_VSX")
Cc: stable@vger.kernel.org # v2.6.27+
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There is a hidden logic for acpi_tb_install_standard_table() as it can be
invoked from the boot stage and during runtime.
1. When it is invoked from the OS boot stage, the ACPICA mutex may not have
been initialized yet and so acpi_ut_acquire_mutex()/acpi_ut_release_mutex()
are not invoked in these code paths:
acpi_initialize_tables
acpi_tb_parse_root_table
acpi_tb_install_standard_table (4 invocations)
acpi_install_table
acpi_tb_install_standard_table
2. When it is invoked during the runtime, ACPICA mutex is used as
appropriate:
acpi_ex_load_op
acpi_tb_install_and_load_table
acpi_tb_install_standard_table
acpi_load_table
acpi_tb_install_and_load_table
acpi_tb_install_standard_table
The mutex is now used in acpi_tb_install_and_load_table(), while it actually
should be in acpi_tb_install_standard_table().
This introduces another problem in acpi_tb_install_standard_table() where
acpi_gbl_table_handler is invoked from and the lock contexts are thus not
consistent for the table handlers. This triggers a regression when
acpi_get_table()/acpi_put_table() start to hold table mutex during runtime.
The regression is noticed by LKP as new errors reported by ACPICA mutex
debugging facility.
[ 2.043693] ACPI Error: Mutex [ACPI_MTX_Tables] already acquired by this thread [497483776] (20160930/utmutex-254)
[ 2.054084] ACPI Error: Mutex [0x2] is not acquired, cannot release (20160930/utmutex-326)
And it triggers a deadlock:
[ 247.066214] INFO: task swapper/0:1 blocked for more than 120 seconds.
...
[ 247.091271] Call Trace:
...
[ 247.121523] down_timeout+0x47/0x50
[ 247.125065] acpi_os_wait_semaphore+0x47/0x62
[ 247.129475] acpi_ut_acquire_mutex+0x43/0x81
[ 247.133798] acpi_get_table+0x2d/0x84
[ 247.137513] acpi_table_attr_init+0xcd/0x100
[ 247.146590] acpi_sysfs_table_handler+0x5d/0xb8
[ 247.151174] acpi_bus_table_handler+0x23/0x2a
[ 247.155583] acpi_tb_install_standard_table+0xe0/0x213
[ 247.164489] acpi_tb_install_and_load_table+0x3a/0x82
[ 247.169592] acpi_ex_load_op+0x194/0x201
...
[ 247.200108] acpi_ns_evaluate+0x1bb/0x247
[ 247.204170] acpi_evaluate_object+0x178/0x274
[ 247.213249] acpi_processor_set_pdc+0x154/0x17b
...
The table mutex is held in acpi_tb_install_and_load_table() and is re-visited by
acpi_get_table().
Noticing that the early mutex requirement actually belongs to the OSL layer
and has already been handled in acpi_os_wait_semaphore()/acpi_os_signal_semaphore(),
the regression canbe fixed by removing this hidden logic from the ACPICA core
to the OS-specific code.
Fixes: 174cc7187e ("ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel")
Reported-and-tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reported-by: Ye Xiaolong <xiaolong.ye@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A side effect of keeping intel_pstate sysfs limits in sync with cpufreq
is that the now sysfs limits can't enforced under performance policy.
For example, if the max_perf_pct is changed from 100 to 80, this will call
intel_pstate_set_policy(), which will change the max_perf to 100 again for
performance policy. Same issue happens, when no_turbo is set.
This change calculates max and min frequency using sysfs performance
limits in intel_pstate_verify_policy() and adjusts policy limits by
calling cpufreq_verify_within_limits().
Also, it causes the setting of performance limits to be skipped if
no_turbo is set.
Fixes: 111b8b3fe4 (cpufreq: intel_pstate: Always keep all limits settings in sync)
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Revert commit 08b98d3291 (PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0
flag) as it caused system suspend (in the default configuration) to fail
on Dell XPS13 (9360) with the Kaby Lake processor.
Fixes: 08b98d3291 (PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag)
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull ARM SoC fixes from Olof Johansson:
"We've been sitting on fixes for a while, and they keep trickling in at
a low rate. Nothing in here comes across as particularly scary or
noteworthy, for the most part it's a large collection of small DT
tweaks"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (24 commits)
ARM: dts: da850-evm: fix read access to SPI flash
ARM: dts: omap3: Fix Card Detect and Write Protect on Logic PD SOM-LV
ARM64: dts: meson-gxbb-odroidc2: Disable SCPI DVFS
ARM: dts: OMAP5 / DRA7: indicate that SATA port 0 is available.
ARM: dts: NSP: Fix DT ranges error
ARM: multi_v7_defconfig: set bcm47xx watchdog
ARM: multi_v7_defconfig: fix config typo
ARM: dts: dra72-evm-revc: fix typo in ethernet-phy node
soc: ti: wkup_m3_ipc: Fix error return code in wkup_m3_ipc_probe()
ARM: ux500: fix prcmu_is_cpu_in_wfi() calculation
ARM: dts: sunxi: Change node name for pwrseq pin on Olinuxino-lime2-emmc
ARM: dts: sun8i: Support DTB build for NanoPi M1
ARM: dts: sun6i: hummingbird: Enable display engine again
ARM: dts: sun6i: Disable display pipeline by default
ARM, ARM64: dts: drop "arm,amba-bus" in favor of "simple-bus" part 3
ARM: dts: imx6qdl-nitrogen6_som2: fix sgtl5000 pinctrl init
ARM: dts: imx6qdl-nitrogen6_max: fix sgtl5000 pinctrl init
ARM: OMAP1: DMA: Correct the number of logical channels
ARM: dts: am335x-icev2: Remove the duplicated pinmux setting
ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate
...
Pull xfs fixes from Darrick Wong:
"I have a few more patches this week -- one to make the behavior of a
quota id ioctl consistent with the other filesystems, and the rest
improve validation of i_mode & i_size values coming into xfs so that
we don't read off the ends of arrays or crash when handed garbage disk
data.
Summary:
- inode i_mode sanitization
- prevent overflows in getnextquota
- minor build fixes"
* tag 'xfs-for-linux-4.10-rc5-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix xfs_mode_to_ftype() prototype
xfs: don't wrap ID in xfs_dq_get_next_id
xfs: sanity check inode di_mode
xfs: sanity check inode mode when creating new dentry
xfs: replace xfs_mode_to_ftype table with switch statement
xfs: add missing include dependencies to xfs_dir2.h
xfs: sanity check directory inode di_size
xfs: make the ASSERT() condition likely
Read access to the SPI flash are broken on da850-evm, i.e. the data
read is not what is actually programmed on the flash.
According to the datasheet for the M25P64 part present on the da850-evm,
if the SPI frequency is higher than 20MHz then the READ command is not
usable anymore and only the FAST_READ command can be used to read data.
This commit specifies in the DTS that we should use FAST_READ command
instead of the READ command.
Cc: stable@vger.kernel.org
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Fabien Parent <fparent@baylibre.com>
[nsekhar@ti.com: subject line adjustment]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Declare virtio_config_ops structure as const as it is only stored in the
config field of a virtio_device structure. This field is of type const, so
virtio_config_ops structures having this property can be declared const.
Done using Coccinelle:
@r1 disable optional_qualifier@
identifier i;
position p;
@@
static struct virtio_config_ops i@p={...};
@ok1@
identifier r1.i;
position p;
struct virtio_ccw_device x;
@@
x.vdev.config=&i@p
@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p
@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct virtio_config_ops i;
File size before and after applying the patch remains the same.
text data bss dec hex filename
9235 296 32928 42459 a5db drivers/s390/virtio/virtio_ccw.o
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Message-Id: <1484333336-13443-1-git-send-email-bhumirks@gmail.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
As virtio-1 introduced the possibility of the device manipulating the
status byte, revision 2 of the virtio-ccw transport introduced a means
of getting the status byte from the device via READ_STATUS. Let's wire
it up for revisions >= 2 and fall back to returning the stored status
byte if not supported.
Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Since ef1b144d ("tools/virtio/ringtest: fix run-on-all.sh to work
without /dev/cpu") run-on-all.sh uses seq 0 $HOST_AFFINITY as the list
of ids of the CPUs to run the command on (assuming ids of online CPUs
are consecutive and start from 0), where $HOST_AFFINITY is the highest
CPU id in the system previously determined using lscpu. This can fail
on systems with offline CPUs.
Instead let's use lscpu to determine the list of online CPUs.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Fixes: ef1b144d ("tools/virtio/ringtest: fix run-on-all.sh to work without
/dev/cpu")
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Using control_work instead of config_work as the 3rd argument to
container_of results in an invalid portdev pointer. Indeed, the work
structure is initialized as below:
INIT_WORK(&portdev->config_work, &config_work_handler);
It leads to a crash when portdev->vdev is dereferenced later. This
bug
is triggered when the guest uses a virtio-console without multiport
feature and receives a config_changed virtio interrupt.
Signed-off-by: G. Campana <gcampana@quarkslab.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is to silence an uninitialized variable warning in debug output.
The problem is this line:
pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n",
head, out, in);
If "head == vq->num" is true on the first iteration then "out" and "in"
aren't initialized. We handle that a few lines after the printk. I was
tempted to just delete the pr_debug() but I decided to just initialize
them to zero instead.
Also checkpatch.pl complains if variables are declared as just
"unsigned" without the "int".
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Declare target_core_fabric_ops strucrues as const as they are only
passed as an argument to the functions target_register_template and
target_unregister_template. The arguments are of type const struct
target_core_fabric_ops *, so target_core_fabric_ops structures having
this property can be declared const.
Done using Coccinelle:
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct target_core_fabric_ops i@p={...};
@ok@
position p;
identifier r.i;
@@
(
target_register_template(&i@p)
|
target_unregister_template(&i@p)
)
@bad@
position p!={r.p,ok.p};
identifier r.i;
@@
i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
+const
struct target_core_fabric_ops i;
File size before: drivers/vhost/scsi.o
text data bss dec hex filename
18063 2985 40 21088 5260 drivers/vhost/scsi.o
File size after: drivers/vhost/scsi.o
text data bss dec hex filename
18479 2601 40 21120 5280 drivers/vhost/scsi.o
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
A user noticed that write performance was horrible over loopback and we
traced it to an inversion of when we need to set MSG_MORE. It should be
set when we have more bvec's to send, not when we are on the last bvec.
This patch made the test go from 20 iops to 78k iops.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Fixes: 429a787be6 ("nbd: fix use-after-free of rq/bio in the xmit path")
Signed-off-by: Jens Axboe <axboe@fb.com>
Pull PCI fixes from Bjorn Helgaas:
- recognize that a PCI-to-PCIe bridge originates a PCIe hierarchy, so
we enumerate that hierarchy correctly
- X-Gene: fix a change merged for v4.10 that broke MSI
- Keystone: avoid reading undefined registers, which can cause
asynchronous external aborts
- Supermicro X8DTH-i/6/iF/6F: ignore broken _CRS that caused us to
change (and break) existing I/O port assignments
* tag 'pci-v4.10-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/MSI: pci-xgene-msi: Fix CPU hotplug registration handling
PCI: Enumerate switches below PCI-to-PCIe bridges
x86/PCI: Ignore _CRS on Supermicro X8DTH-i/6/iF/6F
PCI: designware: Check for iATU unroll only on platforms that use ATU
Pull two s390 bug fixes from Martin Schwidefsky:
"Two changes, the first is a fix to add a missing memory clobber to the
inline assembly to load control registers. This has not caused any
issues so far, but who knows what code gcc will generate in future
versions.
The second change is an update for the default configurations. This
includes CONFIG_BUG_ON_DATA_CORRUPTION=y, we want this to be enabled
for s390. The usual approach to debug problems on production systems
is to use crash on a system dump and for us avoiding data corruptions
is priority one"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: update defconfigs
s390/ctl_reg: make __ctl_load a full memory barrier
Pull xen fix from Juergen Gross:
"A fix for Xen running in nested virtualization environment"
* tag 'for-linus-4.10-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
partially revert "xen: Remove event channel notification through Xen PCI platform device"
For such a file mapping,
[0-4k][hole][8k-12k]
In NO_HOLES mode, we don't have the [hole] extent any more.
Commit c1aa45759e ("Btrfs: fix shrinking truncate when the no_holes feature is enabled")
fixed disk isize not being updated in NO_HOLES mode when data is not flushed.
However, even if data has been flushed, we can still have trouble
in updating disk isize since we updated disk isize to 'start' of
the last evicted extent.
Reviewed-by: Chris Mason <clm@fb.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The following deadlock is seen when executing generic/113 test,
---------------------------------------------------------+----------------------------------------------------
Direct I/O task Fast fsync task
---------------------------------------------------------+----------------------------------------------------
btrfs_direct_IO
__blockdev_direct_IO
do_blockdev_direct_IO
do_direct_IO
btrfs_get_blocks_direct
while (blocks needs to written)
get_more_blocks (first iteration)
btrfs_get_blocks_direct
btrfs_create_dio_extent
down_read(&BTRFS_I(inode) >dio_sem)
Create and add extent map and ordered extent
up_read(&BTRFS_I(inode) >dio_sem)
btrfs_sync_file
btrfs_log_dentry_safe
btrfs_log_inode_parent
btrfs_log_inode
btrfs_log_changed_extents
down_write(&BTRFS_I(inode) >dio_sem)
Collect new extent maps and ordered extents
wait for ordered extent completion
get_more_blocks (second iteration)
btrfs_get_blocks_direct
btrfs_create_dio_extent
down_read(&BTRFS_I(inode) >dio_sem)
--------------------------------------------------------------------------------------------------------------
In the above description, Btrfs direct I/O code path has not yet started
submitting bios for file range covered by the initial ordered
extent. Meanwhile, The fast fsync task obtains the write semaphore and
waits for I/O on the ordered extent to get completed. However, the
Direct I/O task is now blocked on obtaining the read semaphore.
To resolve the deadlock, this commit modifies the Direct I/O code path
to obtain the read semaphore before invoking
__blockdev_direct_IO(). The semaphore is then given up after
__blockdev_direct_IO() returns. This allows the Direct I/O code to
complete I/O on all the ordered extents it creates.
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Below test script can reveal this bug:
dd if=/dev/zero of=fs.img bs=$((1024*1024)) count=100
dev=$(losetup --show -f fs.img)
mkdir -p /mnt/mntpoint
mkfs.btrfs -f $dev
mount $dev /mnt/mntpoint
cd /mnt/mntpoint
echo "workdir is: /mnt/mntpoint"
blocksize=$((128 * 1024))
dd if=/dev/zero of=testfile bs=$blocksize count=1
sync
count=$((17*1024*1024*1024/blocksize))
echo "file size is:" $((count*blocksize))
for ((i = 1; i <= $count; i++)); do
dst_offset=$((blocksize * i))
xfs_io -f -c "reflink testfile 0 $dst_offset $blocksize"\
testfile > /dev/null
done
sync
truncate --size 0 testfile
The last truncate operation will fail for ENOSPC reason, but indeed
it should not fail.
In btrfs_truncate(), we use a temporary block_rsv to do truncate
operation. With every btrfs_truncate_inode_items() call, we migrate space
to this block_rsv, but forget to cleanup previous reservation, which
will make this block_rsv's reserved bytes keep growing, and this reserved
space will only be released in the end of btrfs_truncate(), this metadata
leak will impact other's metadata reservation. In this case, it's
"btrfs_start_transaction(root, 2);" fails for enospc error, which make
this truncate operation fail.
Call btrfs_block_rsv_release() to fix this bug.
Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
A driver using dev_alloc_page() must not reuse a page that had to
use emergency memory reserve.
Otherwise all packets using this page will be immediately dropped,
unless for very specific sockets having SOCK_MEMALLOC bit set.
This issue might be hard to debug, because only a fraction of the RX
ring buffer would suffer from drops.
Fixes: 75354148ce ("gianfar: Add paged allocation and Rx S/G")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Found that if we run LTP netstress test with large MSS (65K),
the first attempt from server to send data comparable to this
MSS on fastopen connection will be delayed by the probe timer.
Here is an example:
< S seq 0:0 win 43690 options [mss 65495 wscale 7 tfo cookie] length 32
> S. seq 0:0 ack 1 win 43690 options [mss 65495 wscale 7] length 0
< . ack 1 win 342 length 0
Inside tcp_sendmsg(), tcp_send_mss() returns max MSS in 'mss_now',
as well as in 'size_goal'. This results the segment not queued for
transmition until all the data copied from user buffer. Then, inside
__tcp_push_pending_frames(), it breaks on send window test and
continues with the check probe timer.
Fragmentation occurs in tcp_write_wakeup()...
+0.2 > P. seq 1:43777 ack 1 win 342 length 43776
< . ack 43777, win 1365 length 0
> P. seq 43777:65001 ack 1 win 342 options [...] length 21224
...
This also contradicts with the fact that we should bound to the half
of the window if it is large.
Fix this flaw by correctly initializing max_window. Before that, it
could have large values that affect further calculations of 'size_goal'.
Fixes: 168a8f5805 ("tcp: TCP Fast Open Server - main code path")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A cleanup removed the only user of this variable
mlx5/core/en_ethtool.c: In function 'mlx5e_set_channels':
mlx5/core/en_ethtool.c:546:6: error: unused variable 'ncv' [-Werror=unused-variable]
Let's remove the declaration as well.
Fixes: 639e9e9416 ("net/mlx5e: Remove unnecessary checks when setting num channels")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just like commit 4acd4945cd ("ipv6: addrconf: Avoid calling
netdevice notifiers with RCU read-side lock"), it is unnecessary
to make addrconf_disable_change() use RCU iteration over the
netdev list, since it already holds the RTNL lock, or we may meet
Illegal context switch in RCU read-side critical section.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Generally, taking an unexpected exception should be a fatal event, and
bad_mode is intended to cater for this. However, it should be possible
to contain unexpected synchronous exceptions from EL0 without bringing
the kernel down, by sending a SIGILL to the task.
We tried to apply this approach in commit 9955ac47f4 ("arm64:
don't kill the kernel on a bad esr from el0"), by sending a signal for
any bad_mode call resulting from an EL0 exception.
However, this also applies to other unexpected exceptions, such as
SError and FIQ. The entry paths for these exceptions branch to bad_mode
without configuring the link register, and have no kernel_exit. Thus, if
we take one of these exceptions from EL0, bad_mode will eventually
return to the original user link register value.
This patch fixes this by introducing a new bad_el0_sync handler to cater
for the recoverable case, and restoring bad_mode to its original state,
whereby it calls panic() and never returns. The recoverable case
branches to bad_el0_sync with a bl, and returns to userspace via the
usual ret_to_user mechanism.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9955ac47f4 ("arm64: don't kill the kernel on a bad esr from el0")
Reported-by: Mark Salter <msalter@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This further refines the changes made to conntrack gc_worker in
commit e0df8cae6c ("netfilter: conntrack: refine gc worker heuristics").
The main idea of that change was to reduce the scan interval when evictions
take place.
However, on the reporters' setup, there are 1-2 million conntrack entries
in total and roughly 8k new (and closing) connections per second.
In this case we'll always evict at least one entry per gc cycle and scan
interval is always at 1 jiffy because of this test:
} else if (expired_count) {
gc_work->next_gc_run /= 2U;
next_run = msecs_to_jiffies(1);
being true almost all the time.
Given we scan ~10k entries per run its clearly wrong to reduce interval
based on nonzero eviction count, it will only waste cpu cycles since a vast
majorities of conntracks are not timed out.
Thus only look at the ratio (scanned entries vs. evicted entries) to make
a decision on whether to reduce or not.
Because evictor is supposed to only kick in when system turns idle after
a busy period, pick a high ratio -- this makes it 50%. We thus keep
the idea of increasing scan rate when its likely that table contains many
expired entries.
In order to not let timed-out entries hang around for too long
(important when using event logging, in which case we want to timely
destroy events), we now scan the full table within at most
GC_MAX_SCAN_JIFFIES (16 seconds) even in worst-case scenario where all
timed-out entries sit in same slot.
I tested this with a vm under synflood (with
sysctl net.netfilter.nf_conntrack_tcp_timeout_syn_recv=3).
While flood is ongoing, interval now stays at its max rate
(GC_MAX_SCAN_JIFFIES / GC_MAX_BUCKETS_DIV -> 125ms).
With feedback from Nicolas Dichtel.
Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Fixes: b87a2f9199 ("netfilter: conntrack: add gc worker to remove timed-out entries")
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Instead of breaking loop and instant resched, don't bother checking
this in first place (the loop calls cond_resched for every bucket anyway).
Suggested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Commit 345857b ("HID: wacom: generic: Add support for sensor offsets") included
a change to the operation and location of the call to 'wacom_add_shared_data'
in 'wacom_parse_and_register'. The modifications included moving it higher up
so that it would occur before the call to 'wacom_retrieve_hid_descriptor'. This
was done to prevent a crash that would have occured when the report containing
tablet offsets was fed into the driver with 'wacom_hid_report_raw_event'
(specifically: the various 'wacom_wac_*_report' functions were written with the
assumption that they would only be called once tablet setup had completed;
'wacom_wac_pen_report' in particular dereferences 'shared' which wasn't yet
allocated).
Moving the call to 'wacom_add_shared_data' effectively prevented the crash but
also broke the sibiling detection code which assumes that the HID descriptor
has been read and the various device_type flags set.
To fix this situation, we restore the original 'wacom_add_shared_data'
operation and location and instead implement an alternative change that can
also prevent the crash. Specifically, we notice that the report functions
mentioned above expect to be called only for input reports. By adding a check,
we can prevent feature reports (such as the offset report) from
causing trouble.
Fixes: 345857bb49 ("HID: wacom: generic: Add support for sensor offsets")
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Tested-by: Ping Cheng <pingc@wacom.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The GXBB and GXL/GXM pinctrl drivers had a configuration which conflicts
with uart_ao_a. According to the GXBB ("S905") datasheet the AO UART
functions are:
- GPIOAO_0: Func1 = UART_TX_AO_A (bit 12), Func2 = UART_TX_AO_B (bit 26)
- GPIOAO_1: Func1 = UART_RX_AO_A (bit 11), Func2 = UART_RX_AO_B (bit 25)
- GPIOAO_4: Func2 = UART_TX_AO_B (bit 24)
- GPIOAO_5: Func2 = UART_RX_AO_B (bit 25)
The existing definition for uart_AO_A already uses GPIOAO_0 and GPIOAO_1.
The old definition of uart_AO_B however was broken, as it used GPIOAO_0
for TX (which would be fine) and two pins (GPIOAO_1 and GPIOAO_5) for RX
(which does not make any sense).
This fixes the uart_AO_B configuration by moving it to GPIOAO_4 and
GPIOAO_5 (it would be possible to use GPIOAO_0 and GPIOAO_1 in theory,
but all existing hardware uses uart_AO_A there).
The fix for GXBB and GXL/GXM is identical since it seems that these
specific pins are identical on both SoC variants.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The helper function for adding a GPIO chip compiles in a lockdep
key for debugging, the same key is needed for nested chips as
well.
The macro construction is unreadable, replace this with two
static inlines instead.
The _gpiochip_irqchip_add prefixed function is not helpful,
rename it with gpiochip_irqchip_add_key() that tell us what the
function is actually doing.
Fixes: d245b3f9bd ("gpio: simplify adding threaded interrupts")
Cc: Roger Quadros <rogerq@ti.com>
Reported-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Reported-by: Roger Quadros <rogerq@ti.com>
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The programming model has been fixed with prev patches so re-enable it
by default
This reverts commit 23cb1f6440.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
arc_cache_init() is called for each core so can't be tagged __init.
However bulk of it is only executed by master core and thus is candidate
for __init reaping.
So split it up to allow that.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Fixes for omaps for v4.10-rc cycle. Mostly a DMA regression fix for
omap1, and then a handful of trivial fixes for boards and devices to
work:
- Fixes TI wilink bluetooth strange platform data baud rate
- Remove duplicate pinmux line for am335x-icev2
- Fix omap1 dma regression
- Fix uninitialized return value for wkup_m3_ipc_probe()
- Fix Ethernet PHY binding typo for dra72-evm
- Fix init for omap5 and dra7 sata ports
- Fix mmc card detect pin for Logic PD SOM-LV
* tag 'omap-for-v4.10/fixes-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap3: Fix Card Detect and Write Protect on Logic PD SOM-LV
ARM: dts: OMAP5 / DRA7: indicate that SATA port 0 is available.
ARM: dts: dra72-evm-revc: fix typo in ethernet-phy node
soc: ti: wkup_m3_ipc: Fix error return code in wkup_m3_ipc_probe()
ARM: OMAP1: DMA: Correct the number of logical channels
ARM: dts: am335x-icev2: Remove the duplicated pinmux setting
ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate
Signed-off-by: Olof Johansson <olof@lixom.net>
vs. fixed 512M before.
But this still assumes that all of memory is under IOC which may not be
true for the SoC. Improve that later when this becomes a real issue, by
specifying this from DT.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
On AXS103 release bitfiles, DMA data corruptions were seen because IOC
setup was not following the recommended way in documentation.
Flipping IOC on when caches are enabled or coherency transactions are in
flight, might cause some of the memory operations to not observe
coherency as expected.
So strictly follow the programming model recommendations as documented
in comment header above arc_ioc_setup()
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This patch adds two helpers, bpf_map_area_alloc() and bpf_map_area_free(),
that are to be used for map allocations. Using kmalloc() for very large
allocations can cause excessive work within the page allocator, so i) fall
back earlier to vmalloc() when the attempt is considered costly anyway,
and even more importantly ii) don't trigger OOM killer with any of the
allocators.
Since this is based on a user space request, for example, when creating
maps with element pre-allocation, we really want such requests to fail
instead of killing other user space processes.
Also, don't spam the kernel log with warnings should any of the allocations
fail under pressure. Given that, we can make backend selection in
bpf_map_area_alloc() generic, and convert all maps over to use this API
for spots with potentially large allocation requests.
Note, replacing the one kmalloc_array() is fine as overflow checks happen
earlier in htab_map_alloc(), since it must also protect the multiplication
for vmalloc() should kmalloc_array() fail.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trying to add an mpls encap route when the MPLS modules are not loaded
hangs. For example:
CONFIG_MPLS=y
CONFIG_NET_MPLS_GSO=m
CONFIG_MPLS_ROUTING=m
CONFIG_MPLS_IPTUNNEL=m
$ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
The ip command hangs:
root 880 826 0 21:25 pts/0 00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
$ cat /proc/880/stack
[<ffffffff81065a9b>] call_usermodehelper_exec+0xd6/0x134
[<ffffffff81065efc>] __request_module+0x27b/0x30a
[<ffffffff814542f6>] lwtunnel_build_state+0xe4/0x178
[<ffffffff814aa1e4>] fib_create_info+0x47f/0xdd4
[<ffffffff814ae451>] fib_table_insert+0x90/0x41f
[<ffffffff814a8010>] inet_rtm_newroute+0x4b/0x52
...
modprobe is trying to load rtnl-lwt-MPLS:
root 881 5 0 21:25 ? 00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS
and it hangs after loading mpls_router:
$ cat /proc/881/stack
[<ffffffff81441537>] rtnl_lock+0x12/0x14
[<ffffffff8142ca2a>] register_netdevice_notifier+0x16/0x179
[<ffffffffa0033025>] mpls_init+0x25/0x1000 [mpls_router]
[<ffffffff81000471>] do_one_initcall+0x8e/0x13f
[<ffffffff81119961>] do_init_module+0x5a/0x1e5
[<ffffffff810bd070>] load_module+0x13bd/0x17d6
...
The problem is that lwtunnel_build_state is called with rtnl lock
held preventing mpls_init from registering.
Given the potential references held by the time lwtunnel_build_state it
can not drop the rtnl lock to the load module. So, extract the module
loading code from lwtunnel_build_state into a new function to validate
the encap type. The new function is called while converting the user
request into a fib_config which is well before any table, device or
fib entries are examined.
Fixes: 745041e2aa ("lwtunnel: autoload of lwt modules")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the TPA GRO code path, initialize the tcp_opt_len variable to 0 so
that it will be correct for packets without TCP timestamps. The bug
caused the SKB fields to be incorrectly set up for packets without
TCP timestamps, leading to these packets being rejected by the stack.
Reported-by: Andy Gospodarek <andrew.gospodarek@broadocm.com>
Acked-by: Andy Gospodarek <andrew.gospodarek@broadocm.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull UBIFS fixes from Richard Weinberger:
"This contains fixes for UBIFS:
- a long standing issue in UBIFS journal replay code
- fallout from the merge window"
* tag 'upstream-4.10-rc5' of git://git.infradead.org/linux-ubifs:
ubifs: Fix journal replay wrt. xattr nodes
ubifs: remove redundant checks for encryption key
ubifs: allow encryption ioctls in compat mode
ubifs: add CONFIG_BLOCK dependency for encryption
ubifs: fix unencrypted journal write
ubifs: ensure zero err is returned on successful return
Commit a1cba5613e ("net: phy: Add Broadcom phy library for common
interfaces") make the BCM63xx PHY driver utilize bcm_phy_config_intr()
which would appear to do the right thing, except that it does not write
to the MII_BCM63XX_IR register but to MII_BCM54XX_ECR which is
different.
This would be causing invalid link parameters and events from being
generated by the PHY interrupt.
Fixes: a1cba5613e ("net: phy: Add Broadcom phy library for common interfaces")
Signed-off-by: Daniel Gonzalez Cabanelas <dgcbueu@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A harmless warning just got introduced:
fs/xfs/libxfs/xfs_dir2.h:40:8: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]
Removing the 'const' modifier avoids the warning and has no
other effect.
Fixes: 1fc4d33fed ("xfs: replace xfs_mode_to_ftype table with switch statement")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Ashizuka reported a highmem oddity and sent a patch for freescale
fec driver.
But the problem root cause is that core networking stack
must ensure no skb with highmem fragment is ever sent through
a device that does not assert NETIF_F_HIGHDMA in its features.
We need to call illegal_highdma() from harmonize_features()
regardless of CSUM checks.
Fixes: ec5f061564 ("net: Kill link between CSUM and SG features.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Pravin Shelar <pshelar@ovn.org>
Reported-by: "Ashizuka, Yuusuke" <ashiduka@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Druzhinin says:
====================
xen-netback: fix memory leaks on XenBus disconnect
Just split the initial patch in two as proposed by Wei.
Since the approach for locking netdev statistics is inconsistent (tends not
to have any locking at all) accross the kernel we'd better to rely on our
internal lock for this purpose.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminate memory leaks introduced several years ago by cleaning the
queue resources which are allocated on XenBus connection event. Namely, queue
structure array and pages used for IO rings.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can't access c->pde if CONFIG_PROC_FS is disabled:
net/ipv4/netfilter/ipt_CLUSTERIP.c: In function 'clusterip_config_find_get':
net/ipv4/netfilter/ipt_CLUSTERIP.c:147:9: error: 'struct clusterip_config' has no member named 'pde'
This moves the check inside of another #ifdef.
Fixes: 6c5d5cfbe3 ("netfilter: ipt_CLUSTERIP: check duplicate config when initializing")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tariq Toukan says:
====================
ethtool fix
This patchset from Eran contains a fix to ethtool set_channels, where the call
to get_channels with an uninitialized parameter might result in garbage fields.
It also contains two followup changes in our mlx4/mlx5 Eth drivers.
Series generated against net commit:
0faa9cb5b3 net sched actions: fix refcnt when GETing of action after bind
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Boundaries checks for the number of RX and TX should be checked by the
caller and not in the driver.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Boundaries checks for the number of RX, TX, other and combined channels
should be checked by the caller and not in the driver.
In addition, remove wrong memset on get channels as it overrides the cmd
field in the requester struct.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ethtool channels respond struct was uninitialized when querying device
channel boundaries settings. As a result, unreported fields by the driver
hold garbage. This may cause sending unsupported params to driver.
Fixes: 8bf3686204 ('ethtool: ensure channel counts are within bounds ...')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
CC: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull ARM fixes from Russell King:
"A few ARM fixes:
- fix a crash while performing TLB maintanence on early ARM SMP cores
- blacklist Scorpion CPUs for hardware breakpoints
- ARMs asm/types.h has been included as part of the UAPI due to the
way the makefiles work, move it to uapi/asm/types.h to make it
official
- fix up ftrace syscall name matching"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8613/1: Fix the uaccess crash on PB11MPCore
MAINTAINERS: update rmk's entries
ARM: put types.h in uapi
ARM: 8634/1: hw_breakpoint: blacklist Scorpion CPUs
ARM: 8632/1: ftrace: fix syscall name matching
commit d65283f7b6 added mod->arch.secstr under
CONFIG_ARC_DW2_UNWIND, but used it unconditionally which broke builds
when the option was disabled. Fix that by adjusting the #ifdef guard.
And while at it add a missing guard (for unwinder) in module.c as well
Reported-by: Waldemar Brodkorb <wbx@openadk.org>
Cc: stable@vger.kernel.org #4.9
Fixes: d65283f7b6 ("ARC: module: elide loop to save reference to .eh_frame")
Tested-by: Anton Kolesov <akolesov@synopsys.com>
Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com>
[abrodkin: provided fixlet to Kconfig per failure in allnoconfig build]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Pull SMP hotplug update from Thomas Gleixner:
"This contains a trivial typo fix and an extension to the core code for
dynamically allocating states in the prepare stage.
The extension is necessary right now because we need a proper way to
unbreak LTTNG, which iscurrently non functional due to the removal of
the notifiers. Surely it's out of tree, but it's widely used by
distros.
The simple solution would have been to reserve a state for LTTNG, but
I'm not fond about unused crap in the kernel and the dynamic range,
which we admittedly should have done right away, allows us to remove
quite some of the hardcoded states, i.e. those which have no ordering
requirements. So doing the right thing now is better than having an
smaller intermediate solution which needs to be reworked anyway"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Provide dynamic range for prepare stage
perf/x86/amd/ibs: Fix typo after cleanup state names in cpu/hotplug
Pull timer fix from Ingo Molnar:
"Fix a crash in the ARM-Exynos clocksource driver, triggered by CPU
hotplug operations"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/exynos_mct: Clear interrupt when cpu is shut down
Pull RCU fixes from Ingo Molnar:
"This fixes sporadic ACPI related hangs in synchronize_rcu() that were
caused by the ACPI code mistakenly relying on an aspect of RCU that
was neither promised to work nor reliable but which happened to work -
until in v4.9 we changed the RCU implementation, which made the hangs
more prominent.
Since the mis-use of the RCU facility wasn't properly detected and
prevented either, these fixes make the RCU side work reliably instead
of working around the problem in the ACPI code.
Hence the slightly larger diffstat that goes beyond the normal scope
of RCU fixes in -rc kernels"
* 'rcu-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Narrow early boot window of illegal synchronous grace periods
rcu: Remove cond_resched() from Tiny synchronize_sched()
Pull perf fixes from Ingo Molnar:
"An Intel PMU driver hotplug fix and three 'perf probe' tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Handle exclusive threadid correctly on CPU hotplug
perf probe: Fix to probe on gcc generated functions in modules
perf probe: Add error checks to offline probe post-processing
perf probe: Fix to show correct locations for events on modules
We cannot preserve partial fields for hardware breakpoints, because
the values written by userspace to the hardware breakpoint
registers can't subsequently be recovered intact from the hardware.
So, just reject attempts to write incomplete fields with -EINVAL.
Cc: <stable@vger.kernel.org> # 3.7.x-
Fixes: 478fcb2cdb ("arm64: Debugging support")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds an explicit __reserved[] field to user_fpsimd_state
to replace what was previously unnamed padding.
This ensures that data in this region are propagated across
assignment rather than being left possibly uninitialised at the
destination.
Cc: <stable@vger.kernel.org> # 3.7.x-
Fixes: 60ffc30d56 ("arm64: Exception handling")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
Cc: <stable@vger.kernel.org> # 4.3.x-
Fixes: 5d220ff942 ("arm64: Better native ptrace support for compat tasks")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
Cc: <stable@vger.kernel.org> # 3.19.x-
Fixes: 766a85d7bc ("arm64: ptrace: add NT_ARM_SYSTEM_CALL regset")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
Cc: <stable@vger.kernel.org> # 3.7.x-
Fixes: 478fcb2cdb ("arm64: Debugging support")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
sparse says:
fs/ceph/mds_client.c:291:23: warning: restricted __le32 degrades to integer
fs/ceph/mds_client.c:293:28: warning: restricted __le32 degrades to integer
fs/ceph/mds_client.c:294:28: warning: restricted __le32 degrades to integer
fs/ceph/mds_client.c:296:28: warning: restricted __le32 degrades to integer
The op value is __le32, so we need to convert it before comparing it.
Cc: stable@vger.kernel.org # needs backporting for < 3.14
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
sparse says:
fs/ceph/inode.c:308:36: warning: incorrect type in argument 1 (different base types)
fs/ceph/inode.c:308:36: expected unsigned int [unsigned] [usertype] a
fs/ceph/inode.c:308:36: got restricted __le32 [usertype] frag
fs/ceph/inode.c:308:46: warning: incorrect type in argument 2 (different base types)
fs/ceph/inode.c:308:46: expected unsigned int [unsigned] [usertype] b
fs/ceph/inode.c:308:46: got restricted __le32 [usertype] frag
We need to convert these values to host-endian before calling the
comparator.
Fixes: a407846ef7 ("ceph: don't assume frag tree splits in mds reply are sorted")
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
... otherwise the crypto stack will align it for us with a GFP_ATOMIC
allocation and a memcpy() -- see skcipher_walk_first().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Commit 5c341ee328 ("ceph: fix scheduler warning due to nested
blocking") causes infinite loop when process is interrupted. Fix it.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Marc Kleine-Budde says:
====================
pull-request: can 2017-01-18
this is a pull request for net/master consisting of two patches.
In the first patch Einar Jón fixes a NULL-pointer-deref in the c_can_pci
driver. In the second patch Yegor Yefremov fixes the clock handling in the
ti_hecc driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The WaDisableLSQCROPERFforOCL workaround has the side effect of
disabling an L3SQ optimization that has huge performance implications
and is unlikely to be necessary for the correct functioning of usual
graphic workloads. Userspace is free to re-enable the workaround on
demand, and is generally in a better position to determine whether the
workaround is necessary than the DRM is (e.g. only during the
execution of compute kernels that rely on both L3 fences and HDC R/W
requests).
The same workaround seems to apply to BDW (at least to production
stepping G1) and SKL as well (the internal workaround database claims
that it does for all steppings, while the BSpec workaround table only
mentions pre-production steppings), but the DRM doesn't do anything
beyond whitelisting the L3SQCREG4 register so userspace can enable it
when it sees fit. Do the same on KBL platforms.
Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
This is followed by a regression of 35% and 10% respectively for the
same benchmarks and platform caused by my recent patch series
switching userspace to use the dataport constant cache instead of the
sampler to implement uniform pull constant loads, which caused us to
hit more heavily the L3 cache (and on platforms other than KBL had the
opposite effect of improving performance of the same two benchmarks).
The overall effect on KBL of this change combined with the recent
userspace change is respectively 4.6% and 2.6%. SynMark2 OglShMapPcf
was affected by the constant cache changes (though it improved as it
did on other platforms rather than regressing), but is not
significantly affected by this patch (with statistical significance of
5% and sample size 20).
v2: Drop some more code to avoid unused variable warning.
Fixes: 738fa1b312 ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: beignet@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.7+
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[Removed double Fixes tag]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484217894-20505-1-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit 8726f2faa3)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
ovl_lookup_layer() iterates on path elements of d->name.name
but also frees and allocates a new pointer for d->name.name.
For the case of lookup in upper layer, the initial d->name.name
pointer is stable (dentry->d_name), but for lower layers, the
initial d->name.name can be d->redirect, which can be freed during
iteration.
[SzM]
Keep the count of remaining characters in the redirect path and calculate
the current position from that. This works becuase only the prefix is
modified, the ending always stays the same.
Fixes: 02b69b284c ("ovl: lookup redirects")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
In order to make the driver work with the common clock framework, this
patch converts the clk_enable()/clk_disable() to
clk_prepare_enable()/clk_disable_unprepare().
Also add error checking for clk_prepare_enable().
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The priv->device pointer for c_can_pci is never set, but it is used
without a NULL check in c_can_start(). Setting it in c_can_pci_probe()
like c_can_plat_probe() prevents c_can_pci.ko from crashing, with and
without CONFIG_PM.
This might also cause the pm_runtime_*() functions in c_can.c to
actually be executed for c_can_pci devices - they are the only other
place where priv->device is used, but they all contain a null check.
Signed-off-by: Einar Jón <tolvupostur@gmail.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The arm64 __page_to_voff() macro takes a parameter called 'page', and
also refers to 'struct page'. Thus, if the value passed in is not
called 'page', we'll refer to the wrong struct name (which might not
exist).
Fixes: 3fa72fe9c6 ("arm64: mm: fix __page_to_voff definition")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Signed-off-by: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
After the recent removal of the hotplug notifiers the variable 'hasdied' in
_cpu_down() is set but no longer read, leading to the following GCC warning
when building with 'make W=1':
kernel/cpu.c:767:7: warning: variable ‘hasdied’ set but not used [-Wunused-but-set-variable]
Fix it by removing the variable.
Fixes: 530e9b76ae ("cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20170117143501.20893-1-tklauser@distanz.ch
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
SIER and SIAR are not updated correctly for some samples, so force the
use of MSR and regs->nip instead for misc_flag updates. This is done by
adding a new ppmu flag and updating the use_siar logic in
perf_read_regs() to use it, and dropping the PPMU_HAS_SIER flag.
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Rename flag to PPMU_NO_SIAR, and also drop PPMU_HAS_SIER]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Test uses PMC2 to count the event. But PMC1 is being initialized.
Patch to fix it.
Fixes: 3752e453f6 ('selftests/powerpc: Add tests of PMU EBBs')
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We give up recovery on permanent error, simply shutdown the affected
devices and remove them. If the devices can't be put into quiet state,
they spew more traffic that is likely to cause another unexpected EEH
error. This was observed on "p8dtu2u" machine:
0002:00:00.0 PCI bridge: IBM Device 03dc
0002:01:00.0 Ethernet controller: Intel Corporation \
Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
0002:01:00.1 Ethernet controller: Intel Corporation \
Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
0002:01:00.2 Ethernet controller: Intel Corporation \
Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
0002:01:00.3 Ethernet controller: Intel Corporation \
Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
On P8 PowerNV platform, the IO path is frozen when shutdowning the
devices, meaning the memory registers are inaccessible. It is why
the devices can't be put into quiet state before removing them.
This fixes the issue by enabling IO path prior to putting the devices
into quiet state.
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Use 0x10012 event code for PM_BRU_CMPL event in power9 event list
instead of current 0x40060.
Fixes: 34922527a2 ('powerpc/perf: Add power9 event list macros for generic and cache events')
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When we switched to big endian page table, we never updated the hugepd
format such that it can work for both big endian and little endian
config. This patch series update hugepd format such that it is looked at
as __be64 value in big endian page table config.
This patch also switch hugepd_t.pd from signed long to unsigned long.
I did update the FSL hugepd_ok check to check for the top bit instead
of checking > 0.
Fixes: 5dc1ef858c ("powerpc/mm: Use big endian Linux page tables for book3s 64")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The generic hugetlbfs code can handle not finding the default huge page
size correctly. With HPAGE_SHIFT = 0 we see in dmesg:
hugetlbfs: disabling because there are no supported hugepage sizes
bash-4.2# echo 30 > /proc/sys/vm/nr_hugepages
bash: echo: write error: Operation not supported
Fixes: 03bb2d6590 ("powerpc: get hugetlbpage handling more generic")
Reported-by: Chris Smart <chris@distroguy.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit 9b081e1080 ("powerpc: port 64 bits pgtable_cache to 32 bits")
mixed up PMD_INDEX_SIZE and PMD_CACHE_INDEX a couple of times. This
resulted in 64s/hash/4k configs to panic at boot with a false positive
error check.
Fix that and simplify error handling by moving the check to the caller.
Fixes: 9b081e1080 ("powerpc: port 64 bits pgtable_cache to 32 bits")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull modules fix from Jessica Yu:
- fix out-of-tree module breakage when it supplies its own definitions
of true and false
* tag 'modules-for-v4.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
taint/module: Fix problems when out-of-kernel driver defines true or false
This fixes commit ab8dd3aed0 ("ARM: DTS: Add minimal Support for
Logic PD DM3730 SOM-LV") where the Card Detect and Write Protect
pins were improperly configured.
Fixes: ab8dd3aed0 ("ARM: DTS: Add minimal Support for
Logic PD DM3730 SOM-LV")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
This PHY with fiber support is register compatible with DP83848,
so add support for it.
Signed-off-by: Alvaro Gamez Machado <alvaro.gamez@hazent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
test_lru_sanity5() fails when the number of online cpus
is fewer than the number of possible cpus. It can be
reproduced with qemu by using cmd args "--smp cpus=2,maxcpus=8".
The problem is the loop in test_lru_sanity5() is testing
'i' which is incorrect.
This patch:
1. Make sched_next_online() always return -1 if it cannot
find a next cpu to schedule the process.
2. In test_lru_sanity5(), the parent process does
sched_setaffinity() first (through sched_next_online())
and the forked process will inherit it according to
the 'man sched_setaffinity'.
Fixes: 5db58faf98 ("bpf: Add tests for the LRU bpf_htab")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
vxlan->cfg.dst_port is in network byte order, so an htons()
is needed here. Also reduced comment length to stay closer
to 80 column width (still slightly over, however).
Fixes: e1e5314de0 ("vxlan: implement GPE")
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current hardware is not able to run with all cores enabled at a
cluster frequency superior at 1536MHz.
But the currently shipped u-boot for the platform still reports an OPP
table with possible DVFS frequency up to 2GHz, and will not change since
the off-tree linux tree supports limiting the OPPs with a kernel parameter.
A recent u-boot change reports the boot-time DVFS around 100MHz and
the default performance cpufreq governor sets the maximum frequency.
Previous version of u-boot reported to be already at the max OPP and
left the OPP as is.
Nevertheless, other governors like ondemand could setup the max frequency
and make the system crash.
This patch disables the DVFS clock and disables cpufreq.
Fixes: 70db166a2b ("ARM64: dts: meson-gxbb: Add SCPI with cpufreq & sensors Nodes")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
The GETNEXTQOTA ioctl takes whatever ID is sent in,
and looks for the next active quota for an user
equal or higher to that ID.
But if we are at the maximum ID and then ask for the "next"
one, we may wrap back to zero. In this case, userspace
may loop forever, because it will start querying again
at zero.
We'll fix this in userspace as well, but for the kernel,
return -ENOENT if we ask for the next quota ID
past UINT_MAX so the caller knows to stop.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
The helper xfs_dentry_to_name() is used by 2 different
classes of callers: Callers that pass zero mode and don't care
about the returned name.type field and Callers that pass
non zero mode and do care about the name.type field.
Change xfs_dentry_to_name() to not take the mode argument and
change the call sites of the first class to not pass the mode
argument.
Create a new helper xfs_dentry_mode_to_name() which does pass
the mode argument and returns -EFSCORRUPTED if mode is invalid.
Callers that translate non zero mode to on-disk file type now
check the return value and will export the error to user instead
of staging an invalid file type to be written to directory entry.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
The size of the xfs_mode_to_ftype[] conversion table
was too small to handle an invalid value of mode=S_IFMT.
Instead of fixing the table size, replace the conversion table
with a conversion helper that uses a switch statement.
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
xfs_dir2.h dereferences some data types in inline functions
and fails to include those type definitions, e.g.:
xfs_dir2_data_aoff_t, struct xfs_da_geometry.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
This changes fixes an assertion hit when fuzzing on-disk
i_mode values.
The easy case to fix is when changing an empty file
i_mode to S_IFDIR. In this case, xfs_dinode_verify()
detects an illegal zero size for directory and fails
to load the inode structure from disk.
For the case of non empty file whose i_mode is changed
to S_IFDIR, the ASSERT() statement in xfs_dir2_isblock()
is replaced with return -EFSCORRUPTED, to avoid interacting
with corrupted jusk also when XFS_DEBUG is disabled.
Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
The ASSERT() condition is the normal case, not the exception,
so testing the condition should be likely(), not unlikely().
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
mpt3sas has a firmware failure where it can only handle one pass through
ATA command at a time. If another comes in, contrary to the SAT
standard, it will hang until the first one completes (causing long
commands like secure erase to timeout). The original fix was to block
the device when an ATA command came in, but this caused a regression
with
commit 669f044170
Author: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Tue Nov 22 16:17:13 2016 -0800
scsi: srp_transport: Move queuecommand() wait code to SCSI core
So fix the original fix of the secure erase timeout by properly
returning SAM_STAT_BUSY like the SAT recommends. The original patch
also had a concurrency problem since scsih_qcmd is lockless at that
point (this is fixed by using atomic bitops to set and test the flag).
[mkp: addressed feedback wrt. test_bit and fixed whitespace]
Fixes: 18f6084a98 (mpt3sas: Fix secure erase premature termination)
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Out of order(OOO) processing requires initiator, switch
and target to support OOO. In today's environment, none
of the switches support OOO. OOO requires extra buffer
space which affect performance. By turning ON this feature
in QLogic's FW, it delays error recovery because dropped
frame is treated as out of order frame. We're turning OFF
this option of speed up error recovery.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed spelling in patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Termination of Immediate Notify IOCB was using wrong
IOCB handle. IOCB completion code was unable to find
appropriate code path due to wrong handle.
Following message is seen in the logs.
"Error entry - invalid handle/queue (ffff)."
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed word order in patch title ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Soft reset and Risc reset should take 100uS to complete.
This change pad the timeout up to 400uS, which should be
plenty.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Corrupted ATIO is defined as length of fcp_header & fcp_cmd
payload is less than 0x38. It's the minimum size for a frame to
carry 8..16 bytes SCSI CDB. The exchange will be dropped or
terminated if corrupted.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed spelling in patch title ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
During code inspection, while investigating following stack trace
seen on one of the test setup, we found out there was possibility
of memory leak becuase driver was not unwinding the stack properly.
This issue has not been reproduced in a test environment or on a
customer setup.
Here's stack trace that was seen.
[1469877.797315] Call Trace:
[1469877.799940] [<ffffffffa03ab6e9>] qla2x00_mem_alloc+0xb09/0x10c0 [qla2xxx]
[1469877.806980] [<ffffffffa03ac50a>] qla2x00_probe_one+0x86a/0x1b50 [qla2xxx]
[1469877.814013] [<ffffffff813b6d01>] ? __pm_runtime_resume+0x51/0xa0
[1469877.820265] [<ffffffff8157c1f5>] ? _raw_spin_lock_irqsave+0x25/0x90
[1469877.826776] [<ffffffff8157cd2d>] ? _raw_spin_unlock_irqrestore+0x6d/0x80
[1469877.833720] [<ffffffff810741d1>] ? preempt_count_sub+0xb1/0x100
[1469877.839885] [<ffffffff8157cd0c>] ? _raw_spin_unlock_irqrestore+0x4c/0x80
[1469877.846830] [<ffffffff81319b9c>] local_pci_probe+0x4c/0xb0
[1469877.852562] [<ffffffff810741d1>] ? preempt_count_sub+0xb1/0x100
[1469877.858727] [<ffffffff81319c89>] pci_call_probe+0x89/0xb0
Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed spelling in patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
This patch avoids that building with W=1 triggers compiler
warnings similar to the following:
drivers/scsi/qla2xxx/qla_nx2.h:538:23: warning: ‘qla8044_reg_tbl’ defined but not used [-Wunused-const-variable=]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Christoph Hellwig <hch@lst.de>
The function stmmac_dt_phy provides several possibilities for initializing
plat->mdio_node, all of which have the effect of increasing the reference
count of the assigned value. This field is not updated elsewhere, so the
value is live until the end of the lifetime of plat (devm_allocated), just
after the end of stmmac_remove_config_dt. Thus, add an of_node_put on
plat->mdio_node in stmmac_remove_config_dt. It is possible that the field
mdio_node is never initialized, but of_node_put is NULL-safe, so it is also
safe to call of_node_put in that case.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch part reverts fd2a0437dc and e858fae2b0 which introduced a
subtle change in how the virtio_net flags are derived from the SKBs
ip_summed field.
With the above commits, the flags are set to VIRTIO_NET_HDR_F_DATA_VALID
when ip_summed == CHECKSUM_UNNECESSARY, thus treating it differently to
ip_summed == CHECKSUM_NONE, which should be the same.
Further, the virtio spec 1.0 / CS04 explicitly says that
VIRTIO_NET_HDR_F_DATA_VALID must not be set by the driver.
Fixes: fd2a0437dc ("virtio_net: introduce virtio_net_hdr_{from,to}_skb")
Fixes: e858fae2b0 (" virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no good match of the zoned field of the block device
characteristics page for host-managed devices. For these devices, the
zoning model is derived directly from the device type. So ignore the
zoned field for these drives.
[mkp: typo]
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Zoned block devices force the use of READ/WRITE(16) commands by setting
sdkp->use_16_for_rw and clearing sdkp->use_10_for_rw. This result in
DPOFUA always being disabled for these drives as the assumed use of
the deprecated READ/WRITE(6) commands only looks at sdkp->use_10_for_rw.
Strenghten the test by also checking that sdkp->use_16_for_rw is false.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 01e0e15c8b ("scsi: don't use fc_bsg_job::request and
fc_bsg_job::reply directly") introduced a typo, which causes that the
bsg_request variable in bfad_im_bsg_els_ct_request() is initialized to
itself instead of pointing to the bsg job's request.
Reported-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 7fd8329ba5 ("taint/module: Clean up global and module taint
flags handling") used the key words true and false as character members
of a new struct. These names cause problems when out-of-kernel modules
such as VirtualBox include their own definitions of true and false.
Fixes: 7fd8329ba5 ("taint/module: Clean up global and module taint flags handling")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Pull networking fixes from David Miller:
1) Handle multicast packets properly in fast-RX path of mac80211, from
Johannes Berg.
2) Because of a logic bug, the user can't actually force SW
checksumming on r8152 devices. This makes diagnosis of hw
checksumming bugs really annoying. Fix from Hayes Wang.
3) VXLAN route lookup does not take the source and destination ports
into account, which means IPSEC policies cannot be matched properly.
Fix from Martynas Pumputis.
4) Do proper RCU locking in netvsc callbacks, from Stephen Hemminger.
5) Fix SKB leaks in mlxsw driver, from Arkadi Sharshevsky.
6) If lwtunnel_fill_encap() fails, we do not abort the netlink message
construction properly in fib_dump_info(), from David Ahern.
7) Do not use kernel stack for DMA buffers in atusb driver, from Stefan
Schmidt.
8) Openvswitch conntack actions need to maintain a correct checksum,
fix from Lance Richardson.
9) ax25_disconnect() is missing a check for ax25->sk being NULL, in
fact it already checks this, but not in all of the necessary spots.
Fix from Basil Gunn.
10) Action GET operations in the packet scheduler can erroneously bump
the reference count of the entry, making it unreleasable. Fix from
Jamal Hadi Salim. Jamal gives a great set of example command lines
that trigger this in the commit message.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
net sched actions: fix refcnt when GETing of action after bind
net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV
net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions
net/mlx4_core: Fix racy CQ (Completion Queue) free
net: stmmac: don't use netdev_[dbg, info, ..] before net_device is registered
net/mlx5e: Fix a -Wmaybe-uninitialized warning
ax25: Fix segfault after sock connection timeout
bpf: rework prog_digest into prog_tag
tipc: allocate user memory with GFP_KERNEL flag
net: phy: dp83867: allow RGMII_TXID/RGMII_RXID interface types
ip6_tunnel: Account for tunnel header in tunnel MTU
mld: do not remove mld souce list info when set link down
be2net: fix MAC addr setting on privileged BE3 VFs
be2net: don't delete MAC on close on unprivileged BE3 VFs
be2net: fix status check in be_cmd_pmac_add()
cpmac: remove hopeless #warning
ravb: do not use zero-length alignment DMA descriptor
mlx4: do not call napi_schedule() without care
openvswitch: maintain correct checksum state in conntrack actions
tcp: fix tcp_fastopen unaligned access complaints on sparc
...
Pull swiotlb fix from Konrad Rzeszutek Wilk:
"A tiny fix to make sure that page-sized mappings are page-aligned (and
not say straddle two pages). This is important for some drivers (such
as NVME)"
* 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
swiotlb: ensure that page-sized mappings are page-aligned
Pull MMC fixes from Ulf Hansson:
"MMC core:
- fix regressions detecting HS/HS DDR eMMC cards related to CMD6
MMC host:
- mmc: mxs-mmc: Fix additional cycles after transmission stop
- sdhci-acpi: Only powered up enabled acpi child devices
- meson: avoid possible NULL dereference"
* tag 'mmc-v4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: core: Restore parts of the polling policy when switch to HS/HS DDR
mmc: mxs-mmc: Fix additional cycles after transmission stop
mmc: sdhci-acpi: Only powered up enabled acpi child devices
MMC: meson: avoid possible NULL dereference
Pull MTD fixes from Brian Norris:
"Just NAND updates from Boris:
- avoid compiling xway NAND controller driver as a module (which
didn't work)
- fix tango NAND DT binding and make sure the controller is in a
clean state at probe time
- add dependency on HAS_IOMEM to the oxnas NAND driver
- fix irq number validity check in the lpc32xx driver"
* tag 'for-linus-20170116' of git://git.infradead.org/linux-mtd:
mtd: nand: lpc32xx: fix invalid error handling of a requested irq
mtd: nand: tango: Reset pbus to raw mode in probe
mtd: nand: tango: Update DT binding description
mtd: nand: oxnas_nand: fix build errors on arch/um, require HAS_IOMEM
mtd: nand: xway: fix build because of module functions
mtd: nand: xway: disable module support
The conversion to the new hotplug state machine introduced a regression
where a successful hotplug registration would be treated as an error,
effectively disabling the MSI driver forever.
Fix it by doing the proper check on the return value.
Fixes: 9c248f8896 ("PCI/xgene-msi: Convert to hotplug state machine")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Duc Dang <dhdang@apm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: stable@vger.kernel.org
emulator_fix_hypercall() replaces hypercall with vmcall instruction,
but it does not handle GP exception properly when writes the new instruction.
It can return X86EMUL_PROPAGATE_FAULT without setting exception information.
This leads to incorrect emulation and triggers
WARN_ON(ctxt->exception.vector > 0x1f) in x86_emulate_insn()
as discovered by syzkaller fuzzer:
WARNING: CPU: 2 PID: 18646 at arch/x86/kvm/emulate.c:5558
Call Trace:
warn_slowpath_null+0x2c/0x40 kernel/panic.c:582
x86_emulate_insn+0x16a5/0x4090 arch/x86/kvm/emulate.c:5572
x86_emulate_instruction+0x403/0x1cc0 arch/x86/kvm/x86.c:5618
emulate_instruction arch/x86/include/asm/kvm_host.h:1127 [inline]
handle_exception+0x594/0xfd0 arch/x86/kvm/vmx.c:5762
vmx_handle_exit+0x2b7/0x38b0 arch/x86/kvm/vmx.c:8625
vcpu_enter_guest arch/x86/kvm/x86.c:6888 [inline]
vcpu_run arch/x86/kvm/x86.c:6947 [inline]
Set exception information when write in emulator_fix_hypercall() fails.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: kvm@vger.kernel.org
Cc: syzkaller@googlegroups.com
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
KVM/ARM updates for 4.10-rc4
- Fix for timer setup on VHE machines
- Drop spurious warning when the timer races against
the vcpu running again
- Prevent a vgic deadlock when the initialization fails
When replaying the journal it can happen that a journal entry points to
a garbage collected node.
This is the case when a power-cut occurred between a garbage collect run
and a commit. In such a case nodes have to be read using the failable
read functions to detect whether the found node matches what we expect.
One corner case was forgotten, when the journal contains an entry to
remove an inode all xattrs have to be removed too. UBIFS models xattr
like directory entries, so the TNC code iterates over
all xattrs of the inode and removes them too. This code re-uses the
functions for walking directories and calls ubifs_tnc_next_ent().
ubifs_tnc_next_ent() expects to be used only after the journal and
aborts when a node does not match the expected result. This behavior can
render an UBIFS volume unmountable after a power-cut when xattrs are
used.
Fix this issue by using failable read functions in ubifs_tnc_next_ent()
too when replaying the journal.
Cc: stable@vger.kernel.org
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Reported-by: Rock Lee <rockdotlee@gmail.com>
Reviewed-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
In several places, ubifs checked for an encryption key before creating a
file in an encrypted directory. This was redundant with
fscrypt_setup_filename() or ubifs_new_inode(), and in the case of
ubifs_link() it broke linking to special files. So remove the extra
checks.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
The ubifs encryption ioctls did not work when called by a 32-bit program
on a 64-bit kernel. Since 'struct fscrypt_policy' is not affected by
the word size, ubifs just needs to allow these ioctls through, like what
ext4 and f2fs do.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This came up during the v4.10 merge window:
warning: (UBIFS_FS_ENCRYPTION) selects FS_ENCRYPTION which has unmet direct dependencies (BLOCK)
fs/crypto/crypto.c: In function 'fscrypt_zeroout_range':
fs/crypto/crypto.c:355:9: error: implicit declaration of function 'bio_alloc';did you mean 'd_alloc'? [-Werror=implicit-function-declaration]
bio = bio_alloc(GFP_NOWAIT, 1);
The easiest way out is to limit UBIFS_FS_ENCRYPTION to configurations
that also enable BLOCK.
Fixes: d475a50745 ("ubifs: Add skeleton for fscrypto")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
err is no longer being set on a successful return path, causing
a garbage value being returned. Fix this by setting err to zero
for the successful return path.
Found with static analysis by CoverityScan, CID 1389473
Fixes: 7799953b34 ("ubifs: Implement encrypt/decrypt for all IO")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Commit b67a8b29df introduced logic to skip swiotlb allocation when all memory
is DMA accessible anyway.
While this is a great idea, __dma_alloc still calls swiotlb code unconditionally
to allocate memory when there is no CMA memory available. The swiotlb code is
called to ensure that we at least try get_free_pages().
Without initialization, swiotlb allocation code tries to access io_tlb_list
which is NULL. That results in a stack trace like this:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
[...]
[<ffff00000845b908>] swiotlb_tbl_map_single+0xd0/0x2b0
[<ffff00000845be94>] swiotlb_alloc_coherent+0x10c/0x198
[<ffff000008099dc0>] __dma_alloc+0x68/0x1a8
[<ffff000000a1b410>] drm_gem_cma_create+0x98/0x108 [drm]
[<ffff000000abcaac>] drm_fbdev_cma_create_with_funcs+0xbc/0x368 [drm_kms_helper]
[<ffff000000abcd84>] drm_fbdev_cma_create+0x2c/0x40 [drm_kms_helper]
[<ffff000000abc040>] drm_fb_helper_initial_config+0x238/0x410 [drm_kms_helper]
[<ffff000000abce88>] drm_fbdev_cma_init_with_funcs+0x98/0x160 [drm_kms_helper]
[<ffff000000abcf90>] drm_fbdev_cma_init+0x40/0x58 [drm_kms_helper]
[<ffff000000b47980>] vc4_kms_load+0x90/0xf0 [vc4]
[<ffff000000b46a94>] vc4_drm_bind+0xec/0x168 [vc4]
[...]
Thankfully swiotlb code just learned how to not do allocations with the FORCE_NO
option. This patch configures the swiotlb code to use that if we decide not to
initialize the swiotlb framework.
Fixes: b67a8b29df ("arm64: mm: only initialize swiotlb when necessary")
Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Jisheng Zhang <jszhang@marvell.com>
CC: Geert Uytterhoeven <geert+renesas@glider.be>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
By failing to set the errno, we'd continue on to trying to set up the
RCL, and then oops on trying to dereference the tile_bo that binning
validation should have set up.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: d5b1a78a77 ("drm/vc4: Add support for drawing 3D frames.")
We copy the unvalidated ioctl arguments from the user into kernel
temporary memory to run the validation from, to avoid a race where the
user updates the unvalidate contents in between validating them and
copying them into the validated BO.
However, in setting up the layout of the kernel side, we failed to
check one of the additions (the roundup() for shader_rec_offset)
against integer overflow, allowing a nearly MAX_UINT value of
bin_cl_size to cause us to under-allocate the temporary space that we
then copy_from_user into.
Reported-by: Murray McAllister <murray.mcallister@insomniasec.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: d5b1a78a77 ("drm/vc4: Add support for drawing 3D frames.")
We accidentally return success even if vc4_full_res_bounds_check() fails.
Fixes: d5b1a78a77 ("drm/vc4: Add support for drawing 3D frames.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
The underscores variant frees the pointers inside, while the
no-underscores variant calls underscores and then frees the struct.
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: d8dbf44f13 ("drm/vc4: Make the CRTCs cooperate on allocating display lists.")
Cc: stable@vger.kernel.org
Felipe writes:
usb: fixes for v4.10-rc5
One memory leak fix on the atmel UDC. Several fixes for dwc2. A fix on
composite.c to use usb_ep_free_request() when freeing struct
usb_request.
Shadow batch buffer is used to shadow the privileged batch
buffer which is submitted by vGPU's workload. This patch is
used to unmark this functionality.
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
A single PM fix from Arnd
* tag 'ux500-fix-for-armsoc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson:
ARM: ux500: fix prcmu_is_cpu_in_wfi() calculation
Signed-off-by: Olof Johansson <olof@lixom.net>
This pull request contains Broadcom ARM-based SoC Device Tree fixes for v4.10, please
pull the following:
- Jon fixes an invalid value for the "ranges" property of the bus nodes on NorthStar
Plus SoCs
* tag 'arm-soc/for-4.10/devicetree-fixes' of http://github.com/Broadcom/stblinux:
ARM: dts: NSP: Fix DT ranges error
Signed-off-by: Olof Johansson <olof@lixom.net>
This pull request contains fixes to multi_v7_defconfig for Broadcom ARM-based
SoCs, please pull the following changes:
- Valenting fixes two incorrect Kconfig symbols for BCM47xx: NVRAM and watchdog drivers
* tag 'arm-soc/for-4.10/defconfig-fixes' of http://github.com/Broadcom/stblinux:
ARM: multi_v7_defconfig: set bcm47xx watchdog
ARM: multi_v7_defconfig: fix config typo
Signed-off-by: Olof Johansson <olof@lixom.net>
Samsung fixes for v4.10:
1. Update maintainers entry with Patchwork address.
2. Fix invalid values for NF_CT_PROTO_* in s3c2410 defconfig (these options
cannot be modules anymore).
* tag 'samsung-fixes-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
ARM: s3c2410_defconfig: Fix invalid values for NF_CT_PROTO_*
MAINTAINERS: Add Patchwork URL to Samsung Exynos entry
Signed-off-by: Olof Johansson <olof@lixom.net>
Allwinner fixes for 4.10
A few fixes here and there to enable the build of some DT leftover, prevent
display issues or setup a proper muxing.
* tag 'sunxi-fixes-for-4.10' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
ARM: dts: sunxi: Change node name for pwrseq pin on Olinuxino-lime2-emmc
ARM: dts: sun8i: Support DTB build for NanoPi M1
ARM: dts: sun6i: hummingbird: Enable display engine again
ARM: dts: sun6i: Disable display pipeline by default
Signed-off-by: Olof Johansson <olof@lixom.net>
i.MX fixes for 4.10, 2nd round:
- A couple of Nitrogen6 device tree fixes for audio codec probe
failure, which is caused by that pinctrl setting for codec clock
was not in the correct device node.
* tag 'imx-fixes-4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6qdl-nitrogen6_som2: fix sgtl5000 pinctrl init
ARM: dts: imx6qdl-nitrogen6_max: fix sgtl5000 pinctrl init
Signed-off-by: Olof Johansson <olof@lixom.net>
As Ayaka reported the thermal was abormal on rk3288 at booting time.
thermal thermal_zone1: critical temperature reached(125 C),shutting down
thermal thermal_zone2: critical temperature reached(125 C),shutting down
thermal thermal_zone1: critical temperature reached(125 C),shutting down
thermal thermal_zone2: critical temperature reached(125 C),shutting down
...
The root caused by reading the invald analogic value, the value is zero
will convert the 125 degree to trigger the critical temperature.
Fixes it with insteading of the incorrect reading now.
Fixes commit cadf29dc2a
("thermal: rockchip: optimize the conversion table")
Reported-by: ayaka <ayaka@soulik.info>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The icp-opal call is missing the code from icp-native to recover
interrupts snatched by KVM. Without that, when running KVM, we can
get into a situation where an interrupt is lost and the CPU stuck
with an elevated CPPR.
Also harden replay by always checking the return from opal_int_eoi().
Fixes: d74361881f ("powerpc/xics: Add ICP OPAL backend")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Demonstrating the issue:
.. add a drop action
$sudo $TC actions add action drop index 10
.. retrieve it
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 0 installed 29 sec used 29 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
... bug 1 above: reference is two.
Reference is actually 1 but we forget to subtract 1.
... do a GET again and we see the same issue
try a few times and nothing changes
~$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 0 installed 31 sec used 31 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
... lets try to bind the action to a filter..
$ sudo $TC qdisc add dev lo ingress
$ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10
... and now a few GETs:
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 3 bind 1 installed 204 sec used 204 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 4 bind 1 installed 206 sec used 206 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 5 bind 1 installed 235 sec used 235 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
.... as can be observed the reference count keeps going up.
After the fix
$ sudo $TC actions add action drop index 10
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 1 bind 0 installed 4 sec used 4 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 1 bind 0 installed 6 sec used 6 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
$ sudo $TC qdisc add dev lo ingress
$ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 1 installed 32 sec used 32 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 1 installed 33 sec used 33 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Fixes: aecc5cefc3 ("net sched actions: fix GETing actions")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Memory hotplug is leading to hash page table calls, even on radix:
arch_add_memory
create_section_mapping
htab_bolt_mapping
BUG_ON(!ppc_md.hpte_insert);
To fix, refactor {create,remove}_section_mapping() into hash__ and
radix__ variants. Leave the radix versions stubbed for now.
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull NFS client bugfixes from Trond Myklebust:
- fix invalid fget()/fput() calls when doing file locking
- fix multiple directory cache invalidation issues due to the client
failing to recognise that the directory wasn't changed
- fix client recovery when server reboots multiple times
* tag 'nfs-for-4.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Fix client recovery when server reboots multiple times
NFSv4: update_changeattr should update the attribute timestamp
NFSv4: Don't call update_changeattr() unless the unlink is successful
NFSv4: Don't apply change_info4 twice on rename within a directory
NFSv4: Call update_changeattr() from _nfs4_proc_open only if a file was created
nfs: Don't take a reference on fl->fl_file for LOCK operation
Tariq Toukan says:
====================
mlx4 core fixes
This patchset contains bug fixes from Jack to the mlx4 Core driver.
Patch 1 solves a race in the flow of CQ free.
Patch 2 moves some qp context flags update to the correct qp transition.
Patch 3 eliminates warnings from the path of SRQ_LIMIT that flood the message log,
and keeps them only in the path of SRQ_CATAS_ERROR.
Series generated against net commit:
1a717fcf8b Merge tag 'mac80211-for-davem-2017-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When running SRIOV, warnings for SRQ LIMIT events flood the Hypervisor's
message log when (correct, normally operating) apps use SRQ LIMIT events
as a trigger to post WQEs to SRQs.
Add more information to the existing debug printout for SRQ_LIMIT, and
output the warning messages only for the SRQ CATAS ERROR event.
Fixes: acba2420f9 ("mlx4_core: Add wrapper functions and comm channel and slave event support to EQs")
Fixes: e0debf9cb5 ("mlx4_core: Reduce warning message for SRQ_LIMIT event to debug level")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Save the qp context flags byte containing the flag disabling vlan stripping
in the RESET to INIT qp transition, rather than in the INIT to RTR
transition. Per the firmware spec, the flags in this byte are active
in the RESET to INIT transition.
As a result of saving the flags in the incorrect qp transition, when
switching dynamically from VGT to VST and back to VGT, the vlan
remained stripped (as is required for VST) and did not return to
not-stripped (as is required for VGT).
Fixes: f0f829bf42 ("net/mlx4_core: Add immediate activate for VGT->VST->VGT")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In function mlx4_cq_completion() and mlx4_cq_event(), the
radix_tree_lookup requires a rcu_read_lock.
This is mandatory: if another core frees the CQ, it could
run the radix_tree_node_rcu_free() call_rcu() callback while
its being used by the radix tree lookup function.
Additionally, in function mlx4_cq_event(), since we are adding
the rcu lock around the radix-tree lookup, we no longer need to take
the spinlock. Also, the synchronize_irq() call for the async event
eliminates the need for incrementing the cq reference count in
mlx4_cq_event().
Other changes:
1. In function mlx4_cq_free(), replace spin_lock_irq with spin_lock:
we no longer take this spinlock in the interrupt context.
The spinlock here, therefore, simply protects against different
threads simultaneously invoking mlx4_cq_free() for different cq's.
2. In function mlx4_cq_free(), we move the radix tree delete to before
the synchronize_irq() calls. This guarantees that we will not
access this cq during any subsequent interrupts, and therefore can
safely free the CQ after the synchronize_irq calls. The rcu_read_lock
in the interrupt handlers only needs to protect against corrupting the
radix tree; the interrupt handlers may access the cq outside the
rcu_read_lock due to the synchronize_irq calls which protect against
premature freeing of the cq.
3. In function mlx4_cq_event(), we change the mlx_warn message to mlx4_dbg.
4. We leave the cq reference count mechanism in place, because it is
still needed for the cq completion tasklet mechanism.
Fixes: 6d90aa5cf1 ("net/mlx4_core: Make sure there are no pending async events when freeing CQ")
Fixes: 225c7b1fee ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't use netdev_info and friends before the net_device is registered.
This avoids ugly messages like
"meson8b-dwmac c9410000.ethernet (unnamed net_device) (uninitialized):
Enable RX Mitigation via HW Watchdog Timer"
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As found by Olof's build bot, we gain a harmless warning about a
potential uninitialized variable reference in mlx5:
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function 'parse_tc_fdb_actions':
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:769:13: warning: 'out_dev' may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:811:21: note: 'out_dev' was declared here
This was introduced through the addition of an 'IS_ERR/PTR_ERR' pair
that gcc is unfortunately unable to completely figure out.
The problem being gcc cannot tell that if(IS_ERR()) in
mlx5e_route_lookup_ipv4() is equivalent to checking if(err) later,
so it assumes that 'out_dev' is used after the 'return PTR_ERR(rt)'.
The PTR_ERR_OR_ZERO() case by comparison is fairly easy to detect
by gcc, so it can't get that wrong, so it no longer warns.
Hadar Hen Zion already attempted to fix the warning earlier by adding fake
initializations, but that ended up not fully addressing all warnings, so
I'm reverting it now that it is no longer needed.
Link: http://arm-soc.lixom.net/buildlogs/mainline/v4.10-rc3-98-gcff3b2c/
Fixes: a42485eb0e ("net/mlx5e: TC ipv4 tunnel encap offload error flow fixes")
Fixes: a757d108dc ("net/mlx5e: Fix kbuild warnings for uninitialized parameters")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ax.25 socket connection timed out & the sock struct has been
previously taken down ie. sock struct is now a NULL pointer. Checking
the sock_flag causes the segfault. Check if the socket struct pointer
is NULL before checking sock_flag. This segfault is seen in
timed out netrom connections.
Please submit to -stable.
Signed-off-by: Basil Gunn <basil@pacabunga.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7bd509e311 ("bpf: add prog_digest and expose it via
fdinfo/netlink") was recently discussed, partially due to
admittedly suboptimal name of "prog_digest" in combination
with sha1 hash usage, thus inevitably and rightfully concerns
about its security in terms of collision resistance were
raised with regards to use-cases.
The intended use cases are for debugging resp. introspection
only for providing a stable "tag" over the instruction sequence
that both kernel and user space can calculate independently.
It's not usable at all for making a security relevant decision.
So collisions where two different instruction sequences generate
the same tag can happen, but ideally at a rather low rate. The
"tag" will be dumped in hex and is short enough to introspect
in tracepoints or kallsyms output along with other data such
as stack trace, etc. Thus, this patch performs a rename into
prog_tag and truncates the tag to a short output (64 bits) to
make it obvious it's not collision-free.
Should in future a hash or facility be needed with a security
relevant focus, then we can think about requirements, constraints,
etc that would fit to that situation. For now, rework the exposed
parts for the current use cases as long as nothing has been
released yet. Tested on x86_64 and s390x.
Fixes: 7bd509e311 ("bpf: add prog_digest and expose it via fdinfo/netlink")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to probe on gcc generated functions on modules. Since
probing on a module is based on its symbol name, it should
be adjusted on actual symbols.
E.g. without this fix, perf probe shows probe definition
on non-exist symbol as below.
$ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -F in_range*
in_range.isra.12
$ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
p:probe/in_range nf_nat:in_range+0
With this fix, perf probe correctly shows a probe on
gcc-generated symbol.
$ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
p:probe/in_range nf_nat:in_range.isra.12+0
This also fixes same problem on online module as below.
$ perf probe -m i915 -D assert_plane
p:probe/assert_plane i915:assert_plane.constprop.134+0
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411450673.9978.14905987549651656075.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Until now, we allocate memory always with GFP_ATOMIC flag.
When the system is under memory pressure and a user tries to send,
the send fails due to low memory. However, the user application
can wait for free memory if we allocate it using GFP_KERNEL flag.
In this commit, we use allocate memory with GFP_KERNEL for all user
allocation.
Reported-by: Rune Torgersen <runet@innovsys.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently dp83867 driver returns error if phy interface type
PHY_INTERFACE_MODE_RGMII_RXID is used to set the rx only internal
delay. Similarly issue happens for PHY_INTERFACE_MODE_RGMII_TXID.
Fix this by checking also the interface type if a particular delay
value is missing in the phy dt bindings. Also update the DT document
accordingly.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With ip6gre we have a tunnel header which also makes the tunnel MTU
smaller. We need to reserve room for it. Previously we were using up
space reserved for the Tunnel Encapsulation Limit option
header (RFC 2473).
Also, after commit b05229f442 ("gre6: Cleanup GREv6 transmit path,
call common GRE functions") our contract with the caller has
changed. Now we check if the packet length exceeds the tunnel MTU after
the tunnel header has been pushed, unlike before.
This is reflected in the check where we look at the packet length minus
the size of the tunnel header, which is already accounted for in tunnel
MTU.
Fixes: b05229f442 ("gre6: Cleanup GREv6 transmit path, call common GRE functions")
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to show correct locations for events on modules by relocating given
address instead of retrying after failure.
This happens when the module text size is big enough, bigger than
sh_addr, because the original code retries with given address + sh_addr
if it failed to find CU DIE at the given address.
Any address smaller than sh_addr always fails and it retries with the
correct address, but addresses bigger than sh_addr will get a CU DIE
which is on the given address (not adjusted by sh_addr).
In my environment(x86-64), the sh_addr of ".text" section is 0x10030.
Since i915 is a huge kernel module, we can see this issue as below.
$ grep "[Tt] .*\[i915\]" /proc/kallsyms | sort | head -n1
ffffffffc0270000 t i915_switcheroo_can_switch [i915]
ffffffffc0270000 + 0x10030 = ffffffffc0280030, so we'll check
symbols cross this boundary.
$ grep "[Tt] .*\[i915\]" /proc/kallsyms | grep -B1 ^ffffffffc028\
| head -n 2
ffffffffc027ff80 t haswell_init_clock_gating [i915]
ffffffffc0280110 t valleyview_init_clock_gating [i915]
So setup probes on both function and see what happen.
$ sudo ./perf probe -m i915 -a haswell_init_clock_gating \
-a valleyview_init_clock_gating
Added new events:
probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)
You can now use it in all perf tools, such as:
perf record -e probe:valleyview_init_clock_gating -aR sleep 1
$ sudo ./perf probe -l
probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
probe:valleyview_init_clock_gating (on i915_vga_set_decode:4@gpu/drm/i915/i915_drv.c in i915)
As you can see, haswell_init_clock_gating is correctly shown,
but valleyview_init_clock_gating is not.
With this patch, both events are shown correctly.
$ sudo ./perf probe -l
probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
Committer notes:
In my case:
# perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
Added new events:
probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)
You can now use it in all perf tools, such as:
perf record -e probe:valleyview_init_clock_gating -aR sleep 1
# perf probe -l
probe:haswell_init_clock_gating (on i915_getparam+432@gpu/drm/i915/i915_drv.c in i915)
probe:valleyview_init_clock_gating (on __i915_printk+240@gpu/drm/i915/i915_drv.c in i915)
#
# readelf -SW /lib/modules/4.9.0+/build/vmlinux | egrep -w '.text|Name'
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 1] .text PROGBITS ffffffff81000000 200000 822fd3 00 AX 0 0 4096
#
So both are b0rked, now with the fix:
# perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
Added new events:
probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)
You can now use it in all perf tools, such as:
perf record -e probe:valleyview_init_clock_gating -aR sleep 1
# perf probe -l
probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
#
Both looks correct.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/148411436777.9978.1440275861947194930.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is an IPv6 version of commit 24803f38a5 ("igmp: do not remove igmp
souce list..."). In mld_del_delrec(), we will restore back all source filter
info instead of flush them.
Move mld_clear_delrec() from ipv6_mc_down() to ipv6_mc_destroy_dev() since
we should not remove source list info when set link down. Remove
igmp6_group_dropped() in ipv6_mc_destroy_dev() since we have called it in
ipv6_mc_down().
Also clear all source info after igmp6_group_dropped() instead of in it
because ipv6_mc_down() will call igmp6_group_dropped().
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull nfsd fixes from Bruce Fields:
"Miscellaneous nfsd bugfixes, one for a 4.10 regression, three for
older bugs"
* tag 'nfsd-4.10-1' of git://linux-nfs.org/~bfields/linux:
svcrdma: avoid duplicate dma unmapping during error recovery
sunrpc: don't call sleeping functions from the notifier block callbacks
svcrpc: don't leak contexts on PROC_DESTROY
nfsd: fix supported attributes for acl & labels
The following patch was sketched by Russell in response to my
crashes on the PB11MPCore after the patch for software-based
priviledged no access support for ARMv8.1. See this thread:
http://marc.info/?l=linux-arm-kernel&m=144051749807214&w=2
I am unsure what is going on, I suspect everyone involved in
the discussion is. I just want to repost this to get the
discussion restarted, as I still have to apply this patch
with every kernel iteration to get my PB11MPCore Realview
running.
Testing by Neil Armstrong on the Oxnas NAS has revealed that
this bug exist also on that widely deployed hardware, so
we are probably currently regressing all ARM11MPCore systems.
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: a5e090acbf ("ARM: software-based priviledged-no-access support")
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
During interface opening MAC address stored in netdev->dev_addr is
programmed in the HW with exception of BE3 VFs where the initial
MAC is programmed by parent PF. This is OK when MAC address is not
changed when an interfaces is down. In this case the requested MAC is
stored to netdev->dev_addr and later is stored into HW during opening.
But this is not done for all BE3 VFs so the NIC HW does not know
anything about this change and all traffic is filtered.
This is the case of bonding if fail_over_mac == 0 where the MACs of
the slaves are changed while they are down.
The be2net behavior is too restrictive because if a BE3 VF has
the FILTMGMT privilege then it is able to modify its MAC without
any restriction.
To solve the described problem the driver should take care about these
privileged BE3 VFs so the MAC is programmed during opening. And by
contrast unpriviled BE3 VFs should not be allowed to change its MAC
in any case.
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
The #warning was present 10 years ago when the driver first got merged.
As the platform is rather obsolete by now, it seems very unlikely that
the warning will cause anyone to fix the code properly.
kernelci.org reports the warning for every build in the meantime, so
I think it's better to just turn it into a code comment to reduce
noise.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update my entries in the MAINTAINERS file with the same email address
for kernel work, and, now that the git tree is hosted on more suitable
hardware, add git tree references where appropriate.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Disable BH around the call to napi_schedule() to avoid following warning
[ 52.095499] NOHZ: local_softirq_pending 08
[ 52.421291] NOHZ: local_softirq_pending 08
[ 52.608313] NOHZ: local_softirq_pending 08
Fixes: 8d59de8f7b ("net/mlx4_en: Process all completions in RX rings after port goes up")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Erez Shitrit <erezsh@mellanox.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth 2017-01-16
Here are a couple of important 802.15.4 driver fixes for the 4.10
kernel.
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Regressions for not being able to detect an eMMC HS DDR mode card has been
reported for the sdhci-esdhc-imx driver, but potentially other sdhci
variants may suffer from the similar problem.
The commit e173f8911f ("mmc: core: Update CMD13 polling policy when
switch to HS DDR mode"), is causing the problem. It seems that change moved
one step to far, regarding changing the host's timing before polling for a
busy card.
To fix this, let's move back to the behaviour when the host's timing is
updated after the polling, but before the switch status is fetched and
validated.
In cases when polling with CMD13, we keep validating the switch status at
each attempt. However, to align with the other card busy detections
mechanism, let's fetch and validate the switch status also after the host's
timing is updated.
Reported-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Reported-by: Gary Bisson <gary.bisson@boundarydevices.com>
Fixes: e173f8911f ("mmc: core: Update CMD13 polling policy when switch..")
Cc: Shawn Lin <shawn.lin@rock-chips.com>
Cc: Dong Aisheng <aisheng.dong@nxp.com>
Cc: Haibo Chen <haibo.chen@nxp.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Tested-by: Jagan Teki <jagan@amarulasolutions.com>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Tested-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
The NF_CONNTRACK Kconfig option description makes an incorrect reference
to the "meta" expression where the "ct" expression would be correct.This
patch fixes the respective typographical error.
Fixes: d497c63527 ("netfilter: add help information to new nf_tables Kconfig options")
Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When dumping nft stateful objects, if NFTA_OBJ_TABLE and NFTA_OBJ_TYPE
attributes are not specified either, filter will become NULL, so oops
will happen(actually nft utility will always set NFTA_OBJ_TABLE attr,
so I write a test program to make this happen):
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: nf_tables_dump_obj+0x17c/0x330 [nf_tables]
[...]
Call Trace:
? nf_tables_dump_obj+0x5/0x330 [nf_tables]
? __kmalloc_reserve.isra.35+0x31/0x90
? __alloc_skb+0x5b/0x1e0
netlink_dump+0x124/0x2a0
__netlink_dump_start+0x161/0x190
nf_tables_getobj+0xe8/0x280 [nf_tables]
Fixes: a9fea2a3c3 ("netfilter: nf_tables: allow to filter stateful object dumps by type")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently, we check the existing rtable in PREROUTING hook, if RTCF_LOCAL
is set, we assume that the packet is loopback.
But this assumption is incorrect, for example, a packet encapsulated
in ipsec transport mode was received and routed to local, after
decapsulation, it would be delivered to local again, and the rtable
was not dropped, so RTCF_LOCAL check would trigger. But actually, the
packet was not loopback.
So for these normal loopback packets, we can check whether the in device
is IFF_LOOPBACK or not. For these locally generated broadcast/multicast,
we can check whether the skb->pkt_type is PACKET_LOOPBACK or not.
Finally, there's a subtle difference between nft fib expr and xtables
rpfilter extension, user can add the following nft rule to do strict
rpfilter check:
# nft add rule x y meta iif eth0 fib saddr . iif oif != eth0 drop
So when the packet is loopback, it's better to store the in device
instead of the LOOPBACK_IFINDEX, otherwise, after adding the above
nft rule, locally generated broad/multicast packets will be dropped
incorrectly.
Fixes: f83a7ea207 ("netfilter: xt_rpfilter: skip locally generated broadcast/multicast, too")
Fixes: f6d0cbcf09 ("netfilter: nf_tables: add fib expression")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Mathieu reported that the LTTNG modules are broken as of 4.10-rc1 due to
the removal of the cpu hotplug notifiers.
Usually I don't care much about out of tree modules, but LTTNG is widely
used in distros. There are two ways to solve that:
1) Reserve a hotplug state for LTTNG
2) Add a dynamic range for the prepare states.
While #1 is the simplest solution, #2 is the proper one as we can convert
in tree users, which do not care about ordering, to the dynamic range as
well.
Add a dynamic range which allows LTTNG to request states in the prepare
stage.
Reported-and-tested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701101353010.3401@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
USBTrdTim must be programmed to 0x5 when phy has a UTMI+ 16-bit wide
interface or 0x9 when it has a 8-bit wide interface.
GUSBCFG reset value (Value After Reset: 0x1400) sets USBTrdTim to 0x5.
In case of 8-bit UTMI+, without clearing GUSBCFG.USBTRDTIM mask, USBTrdTim
results in 0xD (0x5 | 0x9).
That's why we need to clear GUSBCFG.USBTRDTIM mask before setting USBTrdTim
value, to ensure USBTrdTim is correctly set in case of 8-bit UTMI+.
Signed-off-by: Amelie Delaunay <amelie.delaunay@st.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Pull an urgent RCU fix from Paul E. McKenney:
"This series contains a pair of commits that permit RCU synchronous grace
periods (synchronize_rcu() and friends) to work correctly throughout boot.
This eliminates the current "dead time" starting when the scheduler spawns
its first taks and ending when the last of RCU's kthreads is spawned
(this last happens during early_initcall() time). Although RCU's
synchronous grace periods have long been documented as not working
during this time, prior to 4.9, the expedited grace periods worked by
accident, and some ACPI code came to rely on this unintentional behavior.
(Note that this unintentional behavior was -not- reliable. For example,
failures from ACPI could occur on !SMP systems and on systems booting
with the rcu_normal kernel boot parameter.)
Either way, there is a bug that needs fixing, and the 4.9 switch of RCU's
expedited grace periods to workqueues could be considered to have caused
a regression. This series therefore makes RCU's expedited grace periods
operate correctly throughout the boot process. This has been demonstrated
to fix the problems ACPI was encountering, and has the added longer-term
benefit of simplifying RCU's behavior."
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We have quite a lot of code that depends on the order of the
__ctl_load inline assemby and subsequent memory accesses, like
e.g. disabling lowcore protection and the writing to lowcore.
Since the __ctl_load macro does not have memory barrier semantics, nor
any other dependencies the compiler is, theoretically, free to shuffle
code around. Or in other words: storing to lowcore could happen before
lowcore protection is disabled.
In order to avoid this class of potential bugs simply add a full
memory barrier to the __ctl_load macro.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Johannes Berg says:
====================
We have a number of fixes, in part because I was late
in actually sending them out - will try to do better in
the future:
* handle VHT opmode properly when hostapd is controlling
full station state
* two fixes for minimum channel width in mac80211
* don't leave SMPS set to junk in HT capabilities
* fix headroom when forwarding mesh packets, recently
broken by another fix that failed to take into account
frame encryption
* fix the TID in null-data packets indicating EOSP (end
of service period) in U-APSD
* prevent attempting to use (and then failing which
results in crashes) TXQs on stations that aren't added
to the driver yet
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Some drivers do depend on page mappings to be page aligned.
Swiotlb already enforces such alignment for mappings greater than page,
extend that to page-sized mappings as well.
Without this fix, nvme hits BUG() in nvme_setup_prps(), because that routine
assumes page-aligned mappings.
Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
The current preemptible RCU implementation goes through three phases
during bootup. In the first phase, there is only one CPU that is running
with preemption disabled, so that a no-op is a synchronous grace period.
In the second mid-boot phase, the scheduler is running, but RCU has
not yet gotten its kthreads spawned (and, for expedited grace periods,
workqueues are not yet running. During this time, any attempt to do
a synchronous grace period will hang the system (or complain bitterly,
depending). In the third and final phase, RCU is fully operational and
everything works normally.
This has been OK for some time, but there has recently been some
synchronous grace periods showing up during the second mid-boot phase.
This code worked "by accident" for awhile, but started failing as soon
as expedited RCU grace periods switched over to workqueues in commit
8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue").
Note that the code was buggy even before this commit, as it was subject
to failure on real-time systems that forced all expedited grace periods
to run as normal grace periods (for example, using the rcu_normal ksysfs
parameter). The callchain from the failure case is as follows:
early_amd_iommu_init()
|-> acpi_put_table(ivrs_base);
|-> acpi_tb_put_table(table_desc);
|-> acpi_tb_invalidate_table(table_desc);
|-> acpi_tb_release_table(...)
|-> acpi_os_unmap_memory
|-> acpi_os_unmap_iomem
|-> acpi_os_map_cleanup
|-> synchronize_rcu_expedited
The kernel showing this callchain was built with CONFIG_PREEMPT_RCU=y,
which caused the code to try using workqueues before they were
initialized, which did not go well.
This commit therefore reworks RCU to permit synchronous grace periods
to proceed during this mid-boot phase. This commit is therefore a
fix to a regression introduced in v4.9, and is therefore being put
forward post-merge-window in v4.10.
This commit sets a flag from the existing rcu_scheduler_starting()
function which causes all synchronous grace periods to take the expedited
path. The expedited path now checks this flag, using the requesting task
to drive the expedited grace period forward during the mid-boot phase.
Finally, this flag is updated by a core_initcall() function named
rcu_exp_runtime_mode(), which causes the runtime codepaths to be used.
Note that this arrangement assumes that tasks are not sent POSIX signals
(or anything similar) from the time that the first task is spawned
through core_initcall() time.
Fixes: 8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue")
Reported-by: "Zheng, Lv" <lv.zheng@intel.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Stan Kain <stan.kain@gmail.com>
Tested-by: Ivan <waffolz@hotmail.com>
Tested-by: Emanuel Castelo <emanuel.castelo@gmail.com>
Tested-by: Bruno Pesavento <bpesavento@infinito.it>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Frederic Bezies <fredbezies@gmail.com>
Cc: <stable@vger.kernel.org> # 4.9.0-
It is now legal to invoke synchronize_sched() at early boot, which causes
Tiny RCU's synchronize_sched() to emit spurious splats. This commit
therefore removes the cond_resched() from Tiny RCU's synchronize_sched().
Fixes: 8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue")
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # 4.9.0-
If the server reboots multiple times, the client should rely on the
server to tell it that it cannot reclaim state as per section 9.6.3.4
in RFC7530 and section 8.4.2.1 in RFC5661.
Currently, the client is being to conservative, and is assuming that
if the server reboots while state recovery is in progress, then it must
ignore state that was not recovered before the reboot.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Commit 72a9b18629 ("xen: Remove event channel notification through Xen
PCI platform device") broke Linux when booting as Dom0 on Xen in a
nested Xen environment (Xen installed inside a Xen VM). In this
scenario, Linux is a PV guest, but at the same time it uses the
platform-pci driver to receive notifications from L0 Xen. vector
callbacks are not available because L1 Xen doesn't allow them.
Partially revert the offending commit, by restoring IRQ based
notifications for PV guests only. I restored only the code which is
strictly needed and replaced the xen_have_vector_callback checks within
it with xen_pv_domain() checks.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Commit 98a29c39dc ("libnvdimm, namespace: allow creation of multiple
pmem-namespaces per region") added support for establishing additional
pmem namespace beyond the seed device, similar to blk namespaces.
However, it neglected to delete the namespace when the size is set to
zero.
Fixes: 98a29c39dc ("libnvdimm, namespace: allow creation of multiple pmem-namespaces per region")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fix up a data alignment issue on sparc by swapping the order
of the cookie byte array field with the length field in
struct tcp_fastopen_cookie, and making it a proper union
to clean up the typecasting.
This addresses log complaints like these:
log_unaligned: 113 callbacks suppressed
Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360
Kernel unaligned access at TPC[9764ac] tcp_try_fastopen+0x2ec/0x360
Kernel unaligned access at TPC[9764c8] tcp_try_fastopen+0x308/0x360
Kernel unaligned access at TPC[9764e4] tcp_try_fastopen+0x324/0x360
Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The __bcm_sysport_tx_reclaim() function is used to reclaim transmit
resources in different places within the driver. Most of them should
not affect the state of the transit flow control.
Introduce bcm_sysport_tx_clean() which cleans the ring, but does not
re-enable flow control towards the networking stack, and make
bcm_sysport_tx_reclaim() do the actual transmit queue flow control.
Fixes: 80105befdb ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
AHCI provides the register PORTS_IMPL to let the software know which port
is supported. The register must be initialized by the bootloader. However
in some cases u-boot doesn't properly initialize this value (if it is not
compiled with SATA support for example or if the SATA initialization fails).
The DTS entry "ports-implemented" can be used to override the value in
PORTS_IMPL.
Without this patch the SATA will not work in the following two cases:
* if there has been a failure to initialize SATA in u-boot.
* if ahci_platform module has been removed and re-inserted. The reason is
that the content of PORTS_IMPL is lost after the module is removed.
I suspect that it's because the controller is reset by the hwmod.
Cc: <stable@vger.kernel.org> # v4.6+
Signed-off-by: Jean-Jacques Hiblot <jjhiblot@ti.com>
Acked-by: Roger Quadros <rogerq@ti.com>
[tony@atomide.com: updated comments with what goes wrong]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Due to the way kbuild works, this header was unintentionally exported
back in 2013 when it was created, despite it not being in a uapi/
directory. This is very non-intuitive behaviour by Kbuild.
However, we've had this include exported to userland for almost four
years, and searching google for "ARM types.h __UINTPTR_TYPE__" gives
no hint that anyone has complained about it. So, let's make it
officially exported in this state.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Commit bcb6f6d2b9 ("fuse: use timespec64") introduced clamped nsec values
in time_to_jiffies but used the max of nsec and NSEC_PER_SEC - 1 instead of
the min. Because of this, dentries would stay in the cache longer than
requested and go stale in scenarios that relied on their timely eviction.
Fixes: bcb6f6d2b9 ("fuse: use timespec64")
Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # 4.9
It would race between userspace thread and commit worker. Ie. vblank
irq would trigger event and userspace could begin the next atomic
update, before the commit worker had a chance to clear the pending
flag.
If we do end up needing something to prevent userspace from trying
another pageflip before getting vblank event, it should probably be
implemented as a pending_planes bitmask, similar to pending_crtcs. See
start_atomic() and end_atomic().
Signed-off-by: Rob Clark <robdclark@gmail.com>
Station structure is considered as not uploaded
(to driver) until drv_sta_state() finishes. This
call is however done after the structure is
attached to mac80211 internal lists and hashes.
This means mac80211 can lookup (and use) station
structure before it is uploaded to a driver.
If this happens (structure exists, but
sta->uploaded is false) fast_tx path can still be
taken. Deep in the fastpath call the sta->uploaded
is checked against to derive "pubsta" argument for
ieee80211_get_txq(). If sta->uploaded is false
(and sta is actually non-NULL) ieee80211_get_txq()
effectively downgraded to vif->txq.
At first glance this may look innocent but coerces
mac80211 into a state that is almost guaranteed
(codel may drop offending skb) to crash because a
station-oriented skb gets queued up on
vif-oriented txq. The ieee80211_tx_dequeue() ends
up looking at info->control.flags and tries to use
txq->sta which in the fail case is NULL.
It's probably pointless to pretend one can
downgrade skb from sta-txq to vif-txq.
Since downgrading unicast traffic to vif->txq must
not be done there's no txq to put a frame on if
sta->uploaded is false. Therefore the code is made
to fall back to regular tx() op path if the
described condition is hit.
Only drivers using wake_tx_queue were affected.
Example crash dump before fix:
Unable to handle kernel paging request at virtual address ffffe26c
PC is at ieee80211_tx_dequeue+0x204/0x690 [mac80211]
[<bf4252a4>] (ieee80211_tx_dequeue [mac80211]) from
[<bf4b1388>] (ath10k_mac_tx_push_txq+0x54/0x1c0 [ath10k_core])
[<bf4b1388>] (ath10k_mac_tx_push_txq [ath10k_core]) from
[<bf4bdfbc>] (ath10k_htt_txrx_compl_task+0xd78/0x11d0 [ath10k_core])
[<bf4bdfbc>] (ath10k_htt_txrx_compl_task [ath10k_core])
[<bf51c5a4>] (ath10k_pci_napi_poll+0x54/0xe8 [ath10k_pci])
[<bf51c5a4>] (ath10k_pci_napi_poll [ath10k_pci]) from
[<c0572e90>] (net_rx_action+0xac/0x160)
Reported-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Dmitry Vyukov reported that the syzkaller fuzzer triggered a
deadlock in the vgic setup code when an error was detected, as
the cleanup code tries to take a lock that is already held by
the setup code.
The fix is to avoid retaking the lock when cleaning up, by
telling the cleanup function that we already hold it.
Cc: stable@vger.kernel.org
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Current KVM world switch code is unintentionally setting wrong bits to
CNTHCTL_EL2 when E2H == 1, which may allow guest OS to access physical
timer. Bit positions of CNTHCTL_EL2 are changing depending on
HCR_EL2.E2H bit. EL1PCEN and EL1PCTEN are 1st and 0th bits when E2H is
not set, but they are 11th and 10th bits respectively when E2H is set.
In fact, on VHE we only need to set those bits once, not for every world
switch. This is because the host kernel runs in EL2 with HCR_EL2.TGE ==
1, which makes those bits have no effect for the host kernel execution.
So we just set those bits once for guests, and that's it.
Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
When a VCPU blocks (WFI) and has programmed the vtimer, we program a
soft timer to expire in the future to wake up the vcpu thread when
appropriate. Because such as wake up involves a vcpu kick, and the
timer expire function can get called from interrupt context, and the
kick may sleep, we have to schedule the kick in the work function.
The work function currently has a warning that gets raised if it turns
out that the timer shouldn't fire when it's run, which was added because
the idea was that in that case the work should never have been cancelled.
However, it turns out that this whole thing is racy and we can get
spurious warnings. The problem is that we clear the armed flag in the
work function, which may run in parallel with the
kvm_timer_unschedule->timer_disarm() call. This results in a possible
situation where the timer_disarm() call does not call
cancel_work_sync(), which effectively synchronizes the completion of the
work function with running the VCPU. As a result, the VCPU thread
proceeds before the work function completees, causing changes to the
timer state such that kvm_timer_should_fire(vcpu) returns false in the
work function.
All we do in the work function is to kick the VCPU, and an occasional
rare extra kick never harmed anyone. Since the race above is extremely
rare, we don't bother checking if the race happens but simply remove the
check and the clearing of the armed flag from the work function.
Reported-by: Matthias Brugger <mbrugger@suse.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
fuse_abort_conn() moves requests from pending list to a temporary list
before canceling them. This operation races with request_wait_answer()
which also tries to remove the request after it gets a fatal signal. It
checks FR_PENDING flag to determine whether the request is still in the
pending list.
Make fuse_abort_conn() clear FR_PENDING flag so that request_wait_answer()
does not remove the request from temporary list.
This bug causes an Oops when trying to delete an already deleted list entry
in end_requests().
Fixes: ee314a870e ("fuse: abort: no fc->lock needed for request ending")
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # 4.2+
Make sure to check for short control transfers in order to avoid parsing
uninitialised buffer data and leaking it to user space.
Note that the backlight and macro-mode buffer constraints are kept as
loose as possible in order to avoid any regressions should the current
buffer sizes be larger than necessary.
Fixes: 6f78193ee9 ("HID: corsair: Add Corsair Vengeance K90 driver")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Not all platforms support DMA to the stack, and specifically since v4.9
this is no longer supported on x86 with VMAP_STACK either.
Note that the macro-mode buffer was larger than necessary.
Fixes: 6f78193ee9 ("HID: corsair: Add Corsair Vengeance K90 driver")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
ibss and mesh modes copy the ht capabilites from the band without
overriding the SMPS state. Unfortunately the default value 0 for the
SMPS field means static SMPS instead of disabled.
This results in HT ibss and mesh setups using only single-stream rates,
even though SMPS is not supposed to be active.
Initialize SMPS to disabled for all bands on ieee80211_hw_register to
ensure that the value is sane where it is not overriden with the real
SMPS state.
Reported-by: Elektra Wagenrad <onelektra@gmx.net>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
[move VHT TODO comment to a better place]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
STANDALONE_UPDATE_F should be set if something changed in plane configurations,
including plane disable.
The patch fixes page-faults bugs, caused by decon still using framebuffers
of disabled planes.
v2: fixed clear-bit code (Thx Marek)
v3: use test_and_clear_bit (Thx Joonyoung)
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Tested-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Improper usage of DECON_UPDATE register leads to subtle errors.
If it set in decon_commit when there are no active windows it results
in slow registry updates - all subsequent shadow registry updates takes more
than full vblank. On the other side if it is not set when there are
active windows it results in garbage on the screen after suspend/resume of
FB console.
The patch hopefully fixes it.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
GT reset and FLR share some operations and they are both implemented in
our new function intel_gvt_reset_vgpu_locked(). This patch rewrite the
gt reset handler using this new function.
Besides, this new implementation fixed the old issue in GT reset. The
old implementation reset GGTT entries which is illegal. We only clear
GGTT entries at PCI level reset.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Our function tests found several issues related to reusing vGPU
instance. They are qemu reboot failure, guest tdr after reboot, host
hang when reboot guest. All these issues are caused by dirty status
inherited from last VM.
This patch fix all these issues by resetting a virtual GPU before VM
use it. The reset logical is put into a low level function
_intel_gvt_reset_vgpu(), which supports Device Model Level Reset, Full
GT Reset and Per-Engine Reset.
vGPU Device Model Level Reset (DMLR) simulates the PCI reset to reset
the whole vGPU to default state as when it is created, including GTT,
execlist, scratch pages, cfg space, mmio space, pvinfo page, scheduler
and fence registers. The ultimate goal of vGPU DMLR is that reuse a
vGPU instance by different virtual machines. When we reassign a vGPU
to a virtual machine we must issue such reset first.
Full GT Reset and Per-Engine GT Reset are soft reset flow for GPU engines
(Render, Blitter, Video, Video Enhancement). It is defined by GPU Spec.
Unlike the FLR, GT reset only reset particular resource of a vGPU per
the reset request. Guest driver can issue a GT reset by programming
the virtual GDRST register to reset specific virtual GPU engine or all
engines.
Since vGPU DMLR and GT reset can share some code so we implement both
these two into one single function intel_gvt_reset_vgpu_locked(). The
parameter dmlr is to identify if we will do FLR or GT reset. The
parameter engine_mask is to specific the engines that need to be
resetted. If value ALL_ENGINES is given for engine_mask, it means
the caller requests a full gt reset that we will reset all virtual
GPU engines.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Jike Song <jike.song@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch introduces a new function intel_vgpu_reset_mmio() to
reset vGPU MMIO space (virtual registers of the vGPU). The default
values are loaded as firmware during gvt inititiation.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Move the mmio space inititation function setup_vgpu_mmio()
and cleanup function clean_vgpu_mmio() in vgpu.c to dedicated
source file mmio.c, and rename them as intel_vgpu_init_mmio()
and intel_vgpu_clean_mmio() respectively.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch introduces a new function intel_vgpu_reset_cfg_space()
to reset vGPU configuration space. This function will unmap gttmmio
and aperture if they are mapped before. Then entire cfg space will
be restored to default values.
Currently we only do such reset when vGPU is not owned by any VM
so we simply restore entire cfg space to default value, not following
the PCIe FLR spec that some fields should remain unchanged.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Move the configuration space inititation function setup_vgpu_cfg_space()
in vgpu.c to dedicated source file cfg_space.c, and rename the function
as intel_vgpu_init_cfg_space().
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch introduces a new function intel_vgpu_reset_gtt() to reset
the all GTT related status, including GGTT, PPGTT, scratch page. This
function can free all shadowed PPGTT, clear all GGTT entry, and clear
scratch page to all zero. After this, we can ensure no gtt related
information can be leakaged from one VM to anothor one when assign
vgpu instance across different VMs (not simultaneously).
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
This patch introudces a new function intel_vgpu_reset_resource() to
reset allocated vgpu resources by intel_vgpu_alloc_resource(). So far
we only need clear the fence registers. The function _clear_vgpu_fence()
will reset both virtual and physical fence registers to 0.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The read_pmem() function uses memcpy_mcsafe() on x86 where an EFAULT
error code indicates a failed read. Block I/O should use EIO to
indicate failure. Other pmem code paths (like bad blocks) already use
EIO so let's be consistent.
This fixes compatibility with consumers like btrfs that try to parse the
specific error code rather than treat all errors the same.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The range size for axi is 0x2 bytes too small, as the QSPI needs
0x11c408 + 0x004 (which is 0x0011c40c, not 0x0011c40a). No errors have
been observed with this shortcoming, but fixing it for correctness.
Fixes: 329f98c197 ("ARM: dts: NSP: Add QSPI nodes to NSPI and bcm958625k DTSes")
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Program HardMin based on the vce_arbiter.ecclk
if ecclk is 0, disable ECLK DPM 0. Otherwise VCE
could hang if switching SCLK from DPM 0 to 6/7
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
can fix Bug 191281: vce ib test failed.
when vce idle, set vce clock gate, so the clock
in vce domain will be disabled.
when need to encode, disable vce clock gate,
enable the clocks to vce engine.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix a typo in impedance setting for ethernet-phy@3
Fixes: b76db38cd8 ("ARM: dts: dra72-evm-revc: add phy impedance settings")
Cc: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
While probing BGX we requesting appropriate QLM for it's configuration
and get LMAC count by that request. Then, while reading configured
MAC values from SSDT table we need to save them in proper mapping:
BGX[i]->lmac[j].mac = <MAC value>
to later provide for initialization stuff. In order to fill
such mapping properly we need to add lmac index to be used while
acpi initialization since at this moment bgx->lmac_count already contains
actual value.
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to return a negative error code from the kthread_run() error
handling case instead of 0, as done elsewhere in this function.
Fixes: cdd5de500b ("soc: ti: Add wkup_m3_ipc driver")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
In rdma_read_chunk_frmr() when ib_post_send() fails, the error code path
invokes ib_dma_unmap_sg() to unmap the sg list. It then invokes
svc_rdma_put_frmr() which in turn tries to unmap the same sg list through
ib_dma_unmap_sg() again. This second unmap is invalid and could lead to
problems when the iova being unmapped is subsequently reused. Remove
the call to unmap in rdma_read_chunk_frmr() and let svc_rdma_put_frmr()
handle it.
Fixes: 412a15c0fe ("svcrdma: Port to new memory registration API")
Cc: stable@vger.kernel.org
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
After the addition of the frame_retries callback we could run into cases where
a ATUSB device with an older firmware version would now longer be able to bring
the interface up.
We keep this functionality disabled now if the minimum firmware version for this
feature is not available.
Fixes: 5d82288b93 ("ieee802154: atusb: implement .set_frame_retries
ops callback")
Reported-by: Alexander Aring <aar@pengutronix.de>
Acked-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Driver code never touches "rstn" signal in atomic context, so there's
no need to implicitly put such restriction on it by using gpio_set_value
to manipulate it. Replace gpio_set_value to gpio_set_value_cansleep to
fix that.
As a an example of where such restriction might be inconvenient,
consider a hardware design where "rstn" is connected to a pin of I2C/SPI
GPIO expander chip.
Cc: Chris Healy <cphealy@gmail.com>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
From 4.9 we should really avoid using the stack here as this will not be DMA
able on various platforms. This changes a buffer that was introduced in the
4.10 merge window.
Fixes: 6cc33eba23 ("ieee802154: atusb: try to read permanent extended
address from device")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
In the unlikely case were the firmware is new enough but the actual USB command
still fails make sure we set a random address and return.
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
From 4.9 we should really avoid using the stack here as this will not be DMA
able on various platforms. This changes the buffers already being present in
time of 4.9 being released. This should go into stable as well.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The inet6addr_chain is an atomic notifier chain, so we can't call
anything that might sleep (like lock_sock)... instead of closing the
socket from svc_age_temp_xprts_now (which is called by the notifier
function), just have the rpc service threads do it instead.
Cc: stable@vger.kernel.org
Fixes: c3d4879e01 "sunrpc: Add a function to close..."
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Context expiry times are in units of seconds since boot, not unix time.
The use of get_seconds() here therefore sets the expiry time decades in
the future. This prevents timely freeing of contexts destroyed by
client RPC_GSS_PROC_DESTROY requests. We'd still free them eventually
(when the module is unloaded or the container shut down), but a lot of
contexts could pile up before then.
Cc: stable@vger.kernel.org
Fixes: c5b29f885a "sunrpc: use seconds since boot in expiry cache"
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Oops--in 916d2d844a I moved some constants into an array for
convenience, but here I'm accidentally writing to that array.
The effect is that if you ever encounter a filesystem lacking support
for ACLs or security labels, then all queries of supported attributes
will report that attribute as unsupported from then on.
Fixes: 916d2d844a "nfsd: clean up supported attribute handling"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
If a file is renamed, but stays in the same directory, we will still receive
2 change_info4 structures describing the change to that directory, but we
only want to apply it once.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
We don't want to invalidate the directory attribute and data cache unless we
know that a file was created, or the change attribute differs from the one
in our cache.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
rtm_table is an 8-bit field while table ids are allowed up to u32. Commit
709772e6e0 ("net: Fix routing tables with id > 255 for legacy software")
added the preference to set rtm_table in dumps to RT_TABLE_COMPAT if the
table id is > 255. The table id returned on get route requests should do
the same.
Fixes: c36ba6603a ("net: Allow user to get table id from route lookup")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6ffe1c4cd0 ("net: qcom/emac: fix of_node and phydev leaks")
fixed the problem with reference leaks on phydev, but the fix is
device-tree specific. When the driver unloads, the reference is
dropped only on DT systems.
Instead, it's cleaner if up grab an reference on ACPI systems.
When the driver unloads, we can drop the reference without having
to check whether we're on a DT system.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle failure in lwtunnel_fill_encap adding attributes to skb.
Fixes: 571e722676 ("ipv4: support for fib route lwtunnel encap attributes")
Fixes: 19e42e4515 ("ipv6: support for fib route lwtunnel encap attributes")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was only needed to protect the connector_list walking, see
commit 8c4ccc4ab6
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Jul 9 23:44:26 2015 +0200
drm/probe-helper: Grab mode_config.mutex in poll_init/enable
Unfortunately the commit message of that patch fails to mention that
the new locking check was for the connector_list.
But that requirement disappeared in
commit c36a3254f7
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Dec 15 16:58:43 2016 +0100
drm: Convert all helpers to drm_connector_list_iter
and so we can drop this again.
This fixes a locking inversion on nouveau, where the rpm code needs to
re-enable. But in other places the rpm_get() calls are nested within
the big modeset locks.
While at it, also improve the kerneldoc for these two functions a
notch.
v2: Update the kerneldoc even more to explain that these functions
can't be called concurrently, or bad things happen (Chris).
Cc: Dave Airlie <airlied@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Lyude <lyude@redhat.com>
Reviewed-by: Lyude <lyude@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111090117.5134-1-daniel.vetter@ffwll.ch
I have reports of a crash that look like __fput() was called twice for
a NFSv4.0 file. It seems possible that the state manager could try to
reclaim a lock and take a reference on the fl->fl_file at the same time the
file is being released if, during the close(), a signal interrupts the wait
for outstanding IO while removing locks which then skips the removal
of that lock.
Since 83bfff23e9 ("nfs4: have do_vfs_lock take an inode pointer") has
removed the need to traverse fl->fl_file->f_inode in nfs4_lock_done(),
taking that reference is no longer necessary.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
The correct error checking for dma_map_single() is to use
dma_mapping_error().
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Jiri Pirko says:
====================
mlxsw: Couple of fixes
Couple of simple fixes from Arkadi and Elad.
Please queue these up for stable. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The event_data starts from address 0x00-0x0C and not from 0x08-0x014. This
leads to duplication with other fields in the Event Queue Element such as
sub-type, cqn and owner.
Fixes: eda6500a98 ("mlxsw: Add PCI bus implementation")
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During transmission the skb is checked for headroom in order to
add vendor specific header. In case the skb needs to be re-allocated,
skb_realloc_headroom() is called to make a private copy of the original,
but doesn't release it. Current code assumes that the original skb is
released during reallocation and only releases it at the error path
which causes a memory leak.
Fix this by adding the original skb release to the main path.
Fixes: d003462a50 ("mlxsw: Simplify mlxsw_sx_port_xmit function")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During transmission the skb is checked for headroom in order to
add vendor specific header. In case the skb needs to be re-allocated,
skb_realloc_headroom() is called to make a private copy of the original,
but doesn't release it. Current code assumes that the original skb is
released during reallocation and only releases it at the error path
which causes a memory leak.
Fix this by adding the original skb release to the main path.
Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function clearly never worked and always returns true,
as pointed out by gcc-7:
arch/arm/mach-ux500/pm.c: In function 'prcmu_is_cpu_in_wfi':
arch/arm/mach-ux500/pm.c:137:212: error: ?:
using integer constants in boolean context, the expression
will always evaluate to 'true' [-Werror=int-in-bool-context]
With the added braces, the condition actually makes sense.
Fixes: 34fe6f107e ("mfd : Check if the other db8500 core is in WFI")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
According to the code the intention is to append 8 SCK cycles
instead of 4 at end of a MMC_STOP_TRANSMISSION command. But this
will never happened because it's an AC command not an ADTC command.
So fix this by moving the statement into the right function.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: e4243f13d1 (mmc: mxs-mmc: add mmc host driver for i.MX23/28)
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Commit e5bbf30733 ("mmc: sdhci-acpi: Ensure connected devices are
powered when probing") introduced code to powerup any acpi child
nodes listed in the dstd. But some dstd-s list all possible devices
used on some board variants, while reporting if the device is actually
present and enabled in the status field of the device.
So we end up calling the acpi _PS0 (power-on) method for devices which
are not actually present. This does not always end well, e.g. on my
cube iwork8 air tablet, this results in freezing the entire tablet as
soon as the r8723bs module is loaded.
This commit fixes this by checking the child device's status.present
and status.enabled bits and only call acpi_device_fix_up_power()
if both are set.
Fixes: e5bbf30733 ("mmc: sdhci-acpi: Ensure connected devices are powered when probing")
BugLink: https://github.com/hadess/rtl8723bs/issues/80
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Commit bbe097f092 ("usb: gadget: udc: atmel: fix endpoint name")
introduced a memory leak when unbinding the driver. The endpoint names
would not be freed. Solve that by including the name as a string in struct
usba_ep so it is freed when the endpoint is.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Axius clock error path returns without disabling clock and suspend clock.
Fix it to disable them before returning error.
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Commit 05ee799f20 ("usb: dwc2: Move gadget settings into core_params")
changes to type u16 for DT binding "g-rx-fifo-size" and
"g-np-tx-fifo-size" but use type u32 for "g-tx-fifo-size". Finally the
the first two parameters cannot be passed successfully with wrong data
format. This is found the data transferring broken on 96boards Hikey.
This patch is to change all parameters to u32 type, and verified on
Hikey board the DT parameters can pass successfully.
[johnyoun: minor rebase]
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: John Youn <johnyoun@synopsys.com>
Tested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
When zero endpoints are declared for a function, there is no endpoint
to disable, enable or free, so replace do...while loops with while loops.
Change pre-decrement to post-decrement to iterate the same number of times
when there are endpoints to process.
Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
'cdev->os_desc_req' has been allocated with 'usb_ep_alloc_request()' so
'usb_ep_free_request()' should be used to free it.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Since gpio_dev->hwbank_num is now a variable, the compiler cannot
figure out if pin_num is initialized at all:
drivers/pinctrl/pinctrl-amd.c: In function 'amd_gpio_dbg_show':
drivers/pinctrl/pinctrl-amd.c:210:3: warning: 'pin_num' may be used uninitialized in this function [-Wmaybe-uninitialized]
for (; i < pin_num; i++) {
^~~
drivers/pinctrl/pinctrl-amd.c:172:21: warning: 'i' may be used uninitialized in this function [-Wmaybe-uninitialized]
This adds a 'default' statement to make that case well-defined.
Fixes: 3bfd44306c ("pinctrl: amd: Add support for additional GPIO")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
When DIRECT_IRQ_EN is set, the pin is routed directly to the IO-APIC bypassing
the GPIO driver completely. However, the mask register is still used to
determine if the pin is supposed to generate IRQ or not.
So with commit 3ae02c14d9 the IRQ core masks all IRQs (because of
handle_bad_irq()) the pin connected to the touchscreen gets masked as well and
hence no interrupts.
To make this all work as expected we do not add those GPIOs to the IRQ domain
that can actually propagate interrupts.
Fixes: 3ae02c14d9 ("pinctrl: intel: set default handler to be handle_bad_irq()")
Reported-by: Robert R. Howell <rhowell@uwyo.edu>
Suggested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Set variables initialized in lpfc_sli4_alloc_resource_identifiers() to
NULL if an error occurred. Otherwise, lpfc_sli4_driver_resource_unset()
attempts to free the memory again.
Signed-off-by: Roberto Sassu <rsassu@suse.de>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Now that qla2xxx uses the IRQ layer affinity assignment, affinity won't
change over the life time of a device and the notifiers are useless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The first two or three vectors in qla2xxx adapter are global and not
associated with a specific queue. They should not have IRQ affinity
assigned.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The receive callback (in tasklet context) is using RCU to get reference
to associated VF network device but this is not safe. RCU read lock
needs to be held. Found by running with full lockdep debugging
enabled.
Fixes: f207c10d98 ("hv_netvsc: use RCU to protect vf_netdev")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise, a xfrm policy with sport/dport being set cannot be matched.
Signed-off-by: Martynas Pumputis <martynas@weave.works>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the hw rx checksum is always enabled, and the user couldn't switch
it to sw rx checksum.
Note that the RTL_VER_01 only support sw rx checksum only. Besides,
the hw rx checksum for RTL_VER_02 is disabled after
commit b9a321b48a ("r8152: Fix broken RX checksums."). Re-enable it.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I noticed that the VT switch doesn't work any longer with a Dell
laptop with 1366x768 eDP when the machine is connected with a DP
monitor. It behaves as if VT were switched, but the graphics remain
frozen. Actually the keyboard works, so I could switch back to VT7
again.
I tried to track down the problem, and encountered a long story until
we reach to this error:
- The machine is booted with video=1366x768 option (the distro
installer seems to add it as default).
- Recently, drm_helper_probe_single_connector_modes() deals with
cmdline modes, and it tries to create a new mode when no
matching mode is found.
- The drm_mode_create_from_cmdline_mode() creates a mode based on
either CVT of GFT according to the given cmdline mode; in our case,
it's 1366x768.
- Since both CVT and GFT can't express the width 1366 due to
alignment, the resultant mode becomes 1368x768, slightly larger than
the given size.
- Later on, the atomic commit is performed, and in
drm_atomic_check_only(), the size of each plane is checked.
- The size check of 1366x768 fails due to the above, and eventually
the whole VT switch fails.
Back in the history, we've had a manual fix-up of 1368x768 in various
places via c09dedb7a5 ("drm/edid: Add a workaround for 1366x768 HD
panel"), but they have been all in drm_edid.c at probing the modes
from EDID. For addressing the problem above, we need a similar hack
to the mode newly created from cmdline, manually adjusting the width
when the expected size is 1366 while we get 1368 instead.
Fixes: eaf99c749d ("drm: Perform cmdline mode parsing during...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109145614.29454-1-tiwai@suse.de
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
When an associated station changes its VHT operating mode this
can/will affect the bandwidth it's using, and consequently we
must recalculate the minimum bandwidth we need to use. Failure
to do so can lead to one of two scenarios:
1) we use a too high bandwidth, this is benign
2) we use a too narrow bandwidth, causing rate control and
actual PHY configuration to be out of sync, which can in
turn cause problems/crashes
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In the current minimum chandef code there's an issue in that the
recalculation can happen after rate control is initialized for a
station that has a wider bandwidth than the current chanctx, and
then rate control can immediately start using those higher rates
which could cause problems.
Observe that first of all that this problem is because we don't
take non-associated and non-uploaded stations into account. The
restriction to non-associated is quite pointless and is one of
the causes for the problem described above, since the rate init
will happen before the station is set to associated; no frames
could actually be sent until associated, but the rate table can
already contain higher rates and that might cause problems.
Also, rejecting non-uploaded stations is wrong, since the rate
control can select higher rates for those as well.
Secondly, it's then necessary to recalculate the minimal config
before initializing rate control, so that when rate control is
initialized, the higher rates are already available. This can be
done easily by adding the necessary function call in rate init.
Change-Id: Ib9bc02d34797078db55459d196993f39dcd43070
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Currently, this attribute is only fetched on station addition, but
not on station change. Since this info is only present in the assoc
request, with full station state support in the driver it cannot be
present when the station is added.
Thus, add support for changing the VHT opmode on station update if
done before (or while) the station is marked as associated. After
this, ignore it, since it used to be ignored.
Signed-off-by: Beni Lev <beni.lev@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In the commit below, I forgot to translate the mac80211's
AC to QoS IE order. Moreover, the condition in the if was
wrong. Fix both issues.
This bug would hit only with clients that didn't set all
the ACs as delivery enabled.
Fixes: f438ceb81d ("mac80211: uapsd_queues is in QoS IE order")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A PCI-to-PCIe bridge (a "reverse bridge") has a PCI or PCI-X primary
interface and a PCI Express secondary interface. The PCIe interface is a
Downstream Port that originates a Link. See the "PCI Express to PCI/PCI-X
Bridge Specification", rev 1.0, sections 1.2 and A.6.
The bug report below involves a PCI-to-PCIe bridge and a PCIe switch below
the bridge:
00:1e.0 Intel 82801 PCI Bridge to [bus 01-0a]
01:00.0 Pericom PI7C9X111SL PCIe-to-PCI Reversible Bridge to [bus 02-0a]
02:00.0 Pericom Device 8608 [PCIe Upstream Port] to [bus 03-0a]
03:01.0 Pericom Device 8608 [PCIe Downstream Port] to [bus 0a]
01:00.0 is configured as a PCI-to-PCIe bridge (despite the name printed by
lspci). As we traverse a PCIe hierarchy, device connections alternate
between PCIe Links and internal Switch logic. Previously we did not
recognize that 01:00.0 had a secondary link, so we thought the 02:00.0
Upstream Port *did* have a secondary link. In fact, it's the other way
around: 01:00.0 has a secondary link, and 02:00.0 has internal Switch logic
on its secondary side.
When we thought 02:00.0 had a secondary link, the pci_scan_slot() ->
only_one_child() path assumed 02:00.0 could have only one child, so 03:00.0
was the only possible downstream device. But 03:00.0 doesn't exist, so we
didn't look for any other devices on bus 03.
Booting with "pci=pcie_scan_all" is a workaround, but we don't want users
to have to do that.
Recognize that PCI-to-PCIe bridges originate links on their secondary
interfaces.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=189361
Fixes: d0751b98df ("PCI: Add dev->has_secondary_link to track downstream PCIe links")
Tested-by: Blake Moore <blake.moore@men.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.2+
Martin reported that the Supermicro X8DTH-i/6/iF/6F advertises incorrect
host bridge windows via _CRS:
pci_root PNP0A08:00: host bridge window [io 0xf000-0xffff]
pci_root PNP0A08:01: host bridge window [io 0xf000-0xffff]
Both bridges advertise the 0xf000-0xffff window, which cannot be correct.
Work around this by ignoring _CRS on this system. The downside is that we
may not assign resources correctly to hot-added PCI devices (if they are
possible on this system).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=42606
Reported-by: Martin Burnicki <martin.burnicki@meinberg.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org
This patch fix issue introduced by my previous commit that
tried to ensure enough headroom was present, and instead
broke it.
When forwarding mesh pkt, mac80211 may also add security header,
and it must therefore be taken into account in the needed headroom.
Fixes: d8da0b5d64 ("mac80211: Ensure enough headroom when forwarding mesh pkt")
Signed-off-by: Cedric Izoard <cedric.izoard@ceva-dsp.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The commit 658b476c74 ("pinctrl: baytrail: Add debounce configuration")
implements debounce for Baytrail pin control, but seems wasn't tested properly.
The register which keeps debounce value is separated from the configuration
one. Writing wrong values to the latter will guarantee wrong behaviour of the
driver and even might break something physically.
Besides above there is missed case how to disable it, which is actually done
through the bit in configuration register.
Rectify implementation here by using proper register for debounce value.
Fixes: 658b476c74 ("pinctrl: baytrail: Add debounce configuration")
Cc: Cristina Ciocan <cristina.ciocan@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
There are two bits in the PADCFG0 register to configure direction, one per
TX/RX buffers.
For now we wrongly assume that the GPIO is always requested before it is being
used, which is not true when the GPIO is used through irqchip. In this case the
GPIO is never requested and we never enable RX buffer for it.
Fix this by setting both bits accordingly.
Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
PADCFGLOCK (and PADCFGLOCK_TX) offset in Broxton actually starts at 0x060
and not 0x090 as used in the driver. Fix it to use the correct offset.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
After rollover of the IOVA space, we want to get a low IOVA address,
otherwise the the games we play by remembering the last IOVA are
pointless. When we search for a free hole with DRM_MM_SEARCH_DEFAULT,
drm_mm will pop the next entry from the free holes stack, which will
likely be a high IOVA. By using DRM_MM_SEARCH_BELOW we can trick
drm_mm into reversing the search and provide us with a low IOVA.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir van der Laan <laanwj@gmail.com>
On APQ8060, the kernel crashes in arch_hw_breakpoint_init, taking an
undefined instruction trap within write_wb_reg. This is because Scorpion
CPUs erroneously appear to set DBGPRSR.SPD when WFI is issued, even if
the core is not powered down. When DBGPRSR.SPD is set, breakpoint and
watchpoint registers are treated as undefined.
It's possible to trigger similar crashes later on from userspace, by
requesting the kernel to install a breakpoint or watchpoint, as we can
go idle at any point between the reset of the debug registers and their
later use. This has always been the case.
Given that this has always been broken, no-one has complained until now,
and there is no clear workaround, disable hardware breakpoints and
watchpoints on Scorpion to avoid these issues.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ARM has a few system calls (most notably mmap) for which the names of
the functions which are referenced in the syscall table do not match the
names of the syscall tracepoints. As a consequence of this, these
tracepoints are not made available. Implement
arch_syscall_match_sym_name to fix this and allow tracing even these
system calls.
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Function rx_data(), which handles ingress CPL_RX_DATA messages, was
always sending an RX_DATA_ACK with the goal of updating the credits.
However, if the RDMA connection is moved out of FPDU mode abruptly,
then it is possible for iw_cxgb4 to process queued RX_DATA CPLs after HW
has aborted the connection. These CPLs should not trigger RX_DATA_ACKS.
If they do, HW can see a READ after DELETE of the DB_LE hash entry for
the tid and post a LE_DB HashTblMemCrcError.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Commit ad61a4c7a9 ("iw_cxgb4: don't block in destroy_qp awaiting
the last deref") introduced a bug where the RDMA QP EQ queue memory
(and QIDs) are possibly freed before the underlying connection has been
fully shutdown. The result being a possible DMA read issued by HW after
the queue memory has been unmapped and freed. This results in possible
WR corruption in the worst case, system bus errors if an IOMMU is in use,
and SGE "bad WR" errors reported in the very least. The fix is to defer
unmap/free of queue memory and QID resources until the QP struct has
been fully dereferenced. To do this, the c4iw_ucontext must also be kept
around until the last QP that references it is fully freed. In addition,
since the last QP deref can happen in an IRQ disabled context, we need
a new workqueue thread to do the final unmap/free of the EQ queue memory.
Fixes: ad61a4c7a9 ("iw_cxgb4: don't block in destroy_qp awaiting the last deref")
Cc: stable@vger.kernel.org
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
With the addition of the IB/Core drain API, iw_cxgb4 supported drain
by watching the CQs when the QP was out of RTS and signalling "drain
complete" when the last CQE is polled. This, however, doesn't fully
support the drain semantics. Namely, the drain logic is supposed to signal
"drain complete" only when the application has _processed_ the last CQE,
not just removed them from the CQ. Thus a small timing hole exists that
can cause touch after free type bugs in applications using the drain API
(nvmf, iSER, for example). So iw_cxgb4 needs a better solution.
The iWARP Verbs spec mandates that "_at some point_ after the QP is
moved to ERROR", the iWARP driver MUST synchronously fail post_send and
post_recv calls. iw_cxgb4 was currently not allowing any posts once the
QP is in ERROR. This was in part due to the fact that the HW queues for
the QP in ERROR state are disabled at this point, so there wasn't much
else to do but fail the post operation synchronously. This restriction
is what drove the first drain implementation in iw_cxgb4 that has the
above mentioned flaw.
This patch changes iw_cxgb4 to allow post_send and post_recv WRs after
the QP is moved to ERROR state for kernel mode users, thus still adhering
to the Verbs spec for user mode users, but allowing flush WRs for kernel
users. Since the HW queues are disabled, we just synthesize a CQE for
this post, queue it to the SW CQ, and then call the CQ event handler.
This enables proper drain operations for the various storage applications.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The node name for the power seq pin is mmc2@0 like the mmc2_pins_a one.
This makes the original node (mmc2_pins_a) scrapped out of the dtb and
result in a unusable eMMC if U-Boot didn't configured the pins to the
correct functions.
Signed-off-by: Emmanuel Vadot <manu@bidouilliste.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Now that we disable the display engine by default, we need to re-enable
it for the Hummingbird A31, which already had its display pipeline
enabled.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
While we now support the internal display pipeline found on sun6i, it
is possible that we are unable to enable the display for some boards,
due to a lack of drivers for the panels or bridges found on them. If
the display pipeline is enabled, the driver will try to enable, and
possibly screw up the simple framebuffer U-boot had configured.
Disable the display pipeline by default.
Fixes: 6d0e5b70be ("ARM: dts: sun6i: Add device nodes for first
display pipeline")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Previously we checked for iATU unroll support by reading PCIE_ATU_VIEWPORT
even on platforms, e.g., Keystone, that do not have ATU ports. This can
cause bad behavior such as asynchronous external aborts:
OF: PCI: MEM 0x60000000..0x6fffffff -> 0x60000000
Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
pgd = c0003000
[00000000] *pgd=80000800004003, *pmd=00000000
Internal error: : 1211 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-00009-g6ff59d2-dirty #7
Hardware name: Keystone
task: eb878000 task.stack: eb866000
PC is at dw_pcie_setup_rc+0x24/0x380
LR is at ks_pcie_host_init+0x10/0x170
Move the dw_pcie_iatu_unroll_enabled() check so we only call it on
platforms that do not use the ATU. These platforms supply their own
->rd_other_conf() and ->wr_other_conf() methods.
[bhelgaas: changelog]
Fixes: a0601a4705 ("PCI: designware: Add iATU Unroll feature")
Fixes: 416379f9eb ("PCI: designware: Check for iATU unroll support after initializing host")
Tested-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
CC: stable@vger.kernel.org # v4.9+
Also update Kconfig help text, explaining things:
Cirrus is obsolete, the hardware was designed in the 90ies
and can't keep up with todays needs. More background:
https://www.kraxel.org/blog/2014/10/qemu-using-cirrus-considered-harmful/
Better alternatives are:
- stdvga (DRM_BOCHS, qemu -vga std, default in qemu 2.2+)
- qxl (DRM_QXL, qemu -vga qxl, works best with spice)
- virtio (VIRTIO_GPU), qemu -vga virtio)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Changes:
* add myself as maintainer, so patches land in my inbox.
* add virtualization@lists.linux-foundation.org mailing list.
* add drm-qemu git repo.
* flip bochs and qxl status to "Maintained".
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
virtio uses normal ram as backing storage for the framebuffer, so we
should assign the address to new screen_buffer (added by commit
17a7b0b4d9) instead of screen_base.
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
No actual segmentation faults were observed but the coding is
at least inconsistent.
irqreturn_t meson_mmc_irq():
We should not dereference host before checking it.
meson_mmc_irq_thread():
If cmd or mrq are NULL we should not dereference them after
writing a warning.
Fixes: 51c5d8447b MMC: meson: initial support for GX platforms
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Instead of scheduling the work to handle the initial delayed event, use 1s
delay.
This delay should not be needed, but Optimus/nouveau will fail in a
mysterious way if the delayed event is handled as soon as possible like it
is done in drm_helper_probe_single_connector_modes() in case the poll
was enabled before.
Reverting 339fd36238 would give back the 10 sec (!) delay to handle the
delayed event. Adding 1sec delay to the poll_work is enough to work around
the issue in Optimus setups and gives shorter response on handling the
initial delayed event.
Fixes: 339fd36238 ("drm: drm_probe_helper: Fix output_poll_work scheduling")
Cc: stable@vger.kernel.org # v4.9
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
[danvet: Add FIXME to the comment to make it stick out more.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109143158.21917-1-peter.ujfalusi@ti.com
GVT-g fixes from Zhenya, "Please pull GVT-g device model fixes for
rc4. This is based on rc3 with new vfio/mdev interface change."
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Commit 093df73771 ("scsi: qla2xxx: Fix Target mode handling with
Multiqueue changes.") introduces two bodies of code that look similar
but with s/req/rsp/ in the second instance. But in one case, it looks
like this conversion was missed.
Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Quinn Tran <Quinn.Tran@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a race condition with qla2xxx optrom functions where one thread
might modify optrom buffer, optrom_state while other thread is still
reading from it.
In couple of crashes, it was found that we had successfully passed the
following 'if' check where we confirm optrom_state to be
QLA_SREADING. But by the time we acquired mutex lock to proceed with
memory_read_from_buffer function, some other thread/process had already
modified that option rom buffer and optrom_state from QLA_SREADING to
QLA_SWAITING. Then we got ha->optrom_buffer 0x0 and crashed the system:
if (ha->optrom_state != QLA_SREADING)
return 0;
mutex_lock(&ha->optrom_mutex);
rval = memory_read_from_buffer(buf, count, &off, ha->optrom_buffer,
ha->optrom_region_size);
mutex_unlock(&ha->optrom_mutex);
With current optrom function we get following crash due to a race
condition:
[ 1479.466679] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1479.466707] IP: [<ffffffff81326756>] memcpy+0x6/0x110
[...]
[ 1479.473673] Call Trace:
[ 1479.474296] [<ffffffff81225cbc>] ? memory_read_from_buffer+0x3c/0x60
[ 1479.474941] [<ffffffffa01574dc>] qla2x00_sysfs_read_optrom+0x9c/0xc0 [qla2xxx]
[ 1479.475571] [<ffffffff8127e76b>] read+0xdb/0x1f0
[ 1479.476206] [<ffffffff811fdf9e>] vfs_read+0x9e/0x170
[ 1479.476839] [<ffffffff811feb6f>] SyS_read+0x7f/0xe0
[ 1479.477466] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
Below patch modifies qla2x00_sysfs_read_optrom,
qla2x00_sysfs_write_optrom functions to get the mutex_lock before
checking ha->optrom_state to avoid similar crashes.
The patch was applied and tested and same crashes were no longer
observed again.
Tested-by: Milan P. Gandhi <mgandhi@redhat.com>
Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Tree-wide replacement was done by commit 2ef7d5f342 ("ARM, ARM64:
dts: drop "arm,amba-bus" in favor of "simple-bus"), then the 2nd
round by commit 15b7cc78f0 ("arm64: dts: drop "arm,amba-bus" in
favor of "simple-bus" part 2").
Here, some new users have appeared for Linux v4.10-rc1. Eliminate
them now.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Since the codec is probed first, the pinctrl node should be
under the codec node.
The codec init was working for this board since U-Boot was
already setting GPIO_0 as CLKO1 but better fix it anyway.
Fixes: 3faa1bb2e8 ("ARM: dts: imx: add Boundary Devices Nitrogen6_SOM2 support")
Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
This patch fixes the following error:
sgtl5000 0-000a: Error reading chip id -6
imx-sgtl5000 sound: ASoC: CODEC DAI sgtl5000 not registered
imx-sgtl5000 sound: snd_soc_register_card failed (-517)
The problem was that the pinctrl group was linked to the sound driver
instead of the codec node. Since the codec is probed first, the sys_mclk
was missing and it would therefore fail to initialize.
Fixes: b32e700256 ("ARM: dts: imx: add Boundary Devices Nitrogen6_Max board")
Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Some parent clocks of the Exynos542x clock blocks, which have separate
power domains (like DISP, MFC, MSC, GSC, FSYS and G2D) must be always
enabled to access any register related to power management unit or devices
connected to it. For the time being, until a proper solution based on
runtime PM is applied, mark those clocks as critical (instead of ignore
unused or even no flags) to prevent disabling them.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com> [Exynos5800 Peach Pi Chromebook]
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Single drm bridge fix.
* tag 'drm-misc-fixes-2017-01-09' of git://anongit.freedesktop.org/git/drm-misc:
drm/bridge: analogix dp: Fix runtime PM state on driver bind
Otherwise, RST packets generated by the TCP stack for non-existing
sockets always have mark 0.
The mark from the original packet is assigned to the netns_ipv4/6
socket used to send the response so that it can get copied into the
response skb when the socket sends it.
Fixes: e110861f86 ("net: add a sysctl to reflect the fwmark on replies")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Pau Espin Pedrol <pau.espin@tessares.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Otherwise, RST packets generated by ipt_REJECT always have mark 0 when
the routing is checked later in the same code path.
Fixes: e110861f86 ("net: add a sysctl to reflect the fwmark on replies")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Pau Espin Pedrol <pau.espin@tessares.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We should go to 'err_put_master' here instead of returning directly.
Otherwise a call to 'spi_master_put' is missing.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
In AP (or VLAN) mode, when unicast 802.11 packets are received,
they might actually be multicast after conversion. In this case
the fast-RX path didn't handle them properly to send them back
to the wireless medium. Implement that by copying the SKB and
sending it back out.
The possible alternative would be to just punt the packet back
to the regular (slow) RX path, but since we have almost all of
the required code here already it's not so complicated to add
here. Punting it back would also mean acquiring the spinlock,
which would be bad for the stated purpose of the fast-RX path,
to enable well-performing parallel RX.
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In gvt, almost all memory allocations are in sleepable contexts. It's
fault-prone to use GFP_ATOMIC everywhere. Replace it with GFP_KERNEL
wherever possible.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Analogix_dp_bind() can be called from component framework, which doesn't
guarantee proper runtime PM state of the device during bind operation,
so ensure that device is runtime active before doing any register access.
This ensures that the power domain, to which DP module belongs, is turned
on. While at it, also fix the unbalanced call to phy_power_on() in
analogix_dp_bind() function.
This patch solves the following kernel oops on Samsung Exynos5250 Snow
board:
Unhandled fault: imprecise external abort (0x406) at 0x00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: : 406 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 75 Comm: kworker/0:2 Not tainted 4.9.0 #1046
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
Workqueue: events deferred_probe_work_func
task: ee272300 task.stack: ee312000
PC is at analogix_dp_enable_sw_function+0x18/0x2c
LR is at analogix_dp_init_dp+0x2c/0x50
...
[<c03fcb38>] (analogix_dp_enable_sw_function) from [<c03fa9c4>] (analogix_dp_init_dp+0x2c/0x50)
[<c03fa9c4>] (analogix_dp_init_dp) from [<c03fab6c>] (analogix_dp_bind+0x184/0x42c)
[<c03fab6c>] (analogix_dp_bind) from [<c03fdb84>] (component_bind_all+0xf0/0x218)
[<c03fdb84>] (component_bind_all) from [<c03ed64c>] (exynos_drm_load+0x134/0x200)
[<c03ed64c>] (exynos_drm_load) from [<c03d5058>] (drm_dev_register+0xa0/0xd0)
[<c03d5058>] (drm_dev_register) from [<c03d66b8>] (drm_platform_init+0x58/0xb0)
[<c03d66b8>] (drm_platform_init) from [<c03fe0c4>] (try_to_bring_up_master+0x14c/0x188)
[<c03fe0c4>] (try_to_bring_up_master) from [<c03fe188>] (component_add+0x88/0x138)
[<c03fe188>] (component_add) from [<c0403a38>] (platform_drv_probe+0x50/0xb0)
[<c0403a38>] (platform_drv_probe) from [<c0402470>] (driver_probe_device+0x1f0/0x2a8)
[<c0402470>] (driver_probe_device) from [<c0400a54>] (bus_for_each_drv+0x44/0x8c)
[<c0400a54>] (bus_for_each_drv) from [<c04021f8>] (__device_attach+0x9c/0x100)
[<c04021f8>] (__device_attach) from [<c04018e8>] (bus_probe_device+0x84/0x8c)
[<c04018e8>] (bus_probe_device) from [<c0401d1c>] (deferred_probe_work_func+0x60/0x8c)
[<c0401d1c>] (deferred_probe_work_func) from [<c012fc14>] (process_one_work+0x120/0x318)
[<c012fc14>] (process_one_work) from [<c012fe34>] (process_scheduled_works+0x28/0x38)
[<c012fe34>] (process_scheduled_works) from [<c0130048>] (worker_thread+0x204/0x4ac)
[<c0130048>] (worker_thread) from [<c01352c4>] (kthread+0xd8/0xf4)
[<c01352c4>] (kthread) from [<c0107978>] (ret_from_fork+0x14/0x3c)
Code: e59035f0 e5935018 f57ff04f e3c55001 (f57ff04e)
---[ end trace 3d1d0d87796de344 ]---
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1483091866-1088-1-git-send-email-m.szyprowski@samsung.com
The vgpu_create() routine we called returns meaningful errors to indicate
failures, so we'd better to pass it to our caller, the mdev framework,
whereby the sysfs is able to tell userspace what happened.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
According to the spec, ACPI OpRegion must be placed at a physical address
below 4G. That is, for a vGPU it must be associated with a GPA below 4G,
but on host side, it doesn't matter where the backing pages actually are.
So when allocating pages from host, the GFP_DMA32 flag is unnecessary.
Also the allocation is from a sleepable context, so GFP_ATOMIC is also
unnecessary.
This patch also removes INTEL_GVT_OPREGION_PORDER and use get_order()
instead.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Once idr_alloc gets called data is allocated within the idr list, if
any error occurs afterwards, we should undo that by idr_remove on the
error path.
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The vgpu->running_workload_num is used to determine whether a vgpu has
any workload running or not. So we should make sure the workload is
really done before we dec running_workload_num. Function
complete_current_workload is not the right place to do it, since this
function is still processing the workload. This patch move the dec op
afterward.
v2: move dec op before wake_up(&scheduler->workload_complete_wq) (Min He)
Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Min He <min.he@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
In the function workload_thread(), we invoke complete_current_workload()
to cleanup the just processed workload (workload will be freed there).
So we cannot access workload->req after that. This patch move
complete_current_workload() afterward.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Remove duplicated definition for resource size in aperture_gm.c
which are already defined in gvt.h. Need only one to take effect.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Previous high mem size initialized for vGPU type was too small which caused
failure for some VMs. This trys to take minimal value of 384MB for each VM and
enlarge default high mem size to make guest driver happy.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
In function intel_vgpu_emulate_mmio_read, the untracked mmio register is
dumped through kernel log, but the register value is not correct. This
patch fixes this issue.
V2: fix the fromat warning from checkpatch.pl.
Signed-off-by: Pei Zhang <pei.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The readq and writeq are already offered by drm_os_linux.h. So we can
use them directly whithout dectecting their presence. This patch removed
the duplicated code.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Return ealier for a invalid access, else it would false set
tlb flag for RCS.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The current prototype of new_mmio_info() uses void* for parameters read
and write, which are functions with precise calling conventions
(argument types and return type). Write down these conventions in
new_mmio_info() definition.
This has been reported by the following warnings when clang is used to
build the kernel:
drivers/gpu/drm/i915/gvt/handlers.c:124:21: error: pointer type
mismatch ('void *' and 'int (*)(struct intel_vgpu *, unsigned int,
void *, unsigned int)') [-Werror,-Wpointer-type-mismatch]
info->read = read ? read : intel_vgpu_default_mmio_read;
^ ~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/gvt/handlers.c:125:23: error: pointer type
mismatch ('void *' and 'int (*)(struct intel_vgpu *, unsigned int,
void *, unsigned int)') [-Werror,-Wpointer-type-mismatch]
info->write = write ? write : intel_vgpu_default_mmio_write;
^ ~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This allows the compiler to detect that sbi_ctl_mmio_write() returns a
"bool" value instead of an expected "int" one. Fix this.
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
Cc: stable@vger.kernel.org
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
OMAP1510, OMAP5910 and OMAP310 have only 9 logical channels.
OMAP1610, OMAP5912, OMAP1710, OMAP730, and OMAP850 have 16 logical channels
available.
The wired 17 for the lch_count must have been used to cover the 16 + 1
dedicated LCD channel, in reality we can only use 9 or 16 channels.
The d->chan_count is not used by the omap-dma stack, so we can skip the
setup. chan_count was configured to the number of logical channels and not
the actual number of physical channels anyways.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The device_unregister call in thermal_zone_device_unregister causes the
thermal_zone_device structure to be freed before the call to free the
dynamically allocated attribute groups. This leads to a kernel panic.
Furthermore, the 4 calls to free the trip point attribute structures
occur before the call to unregister the device, leading to a kernel
panic when sysfs attempts to access the attributes to remove them.
Here is an example of a kernel panic when the cpu thermal zones are
removed upon cpu offline:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: strlen+0x0/0x20
<snip>
Call Trace:
? kernfs_name_hash+0x17/0x80
kernfs_find_ns+0x3f/0xd0
kernfs_remove_by_name_ns+0x36/0xa0
remove_files.isra.1+0x36/0x70
sysfs_remove_group+0x44/0x90
sysfs_remove_groups+0x2e/0x50
device_remove_attrs+0x5e/0x90
device_del+0x1ea/0x350
device_unregister+0x1a/0x60
thermal_zone_device_unregister+0x1f2/0x210
pkg_thermal_cpu_offline+0x14f/0x1a0 [x86_pkg_temp_thermal]
? kzalloc.constprop.2+0x10/0x10 [x86_pkg_temp_thermal]
cpuhp_invoke_callback+0x8d/0x3f0
cpuhp_down_callbacks+0x42/0x80
cpuhp_thread_fun+0x8b/0xf0
smpboot_thread_fn+0x110/0x160
kthread+0x101/0x140
? sort_range+0x30/0x30
? kthread_park+0x90/0x90
ret_from_fork+0x25/0x30
This patch moves the kfree calls to clean up the dynamic attributes to
the thermal_class's thermal_zone_device release function.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Jacob von Chorus <jacobvonchorus@cwphoto.ca>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
From Boris Brezillon:
"""
Fixes for 4.10-rc3
- Forbid compiling xway NAND controller driver as a module
- Fix tango NAND DT binding and make sure the controller is in a clean
state at probe time
- Add dependency on HAS_IOMEM to the oxnas NAND driver
- Fix irq number validity check in the lpc32xx driver
"""
There is no mmc sd card detect on am335x-ice board. But the spi0_cs1
pin being configured as mmcsd_cd. Removing it fixes the below warning
during boot:
pinctrl-single 44e10800.pinmux: pin 44e10960.0 already
requested by 48030000.spi; cannot claim for 48060000.mmc
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
[tony@atomide.com: tidied up commit message]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Commit 485fa1261f ("ARM: OMAP2+: LogicPD Torpedo + Wireless: Add Bluetooth")
set the wrong baud rate for the UART. The Baud rate was 300,000 and it should
be 3,000,000 for WL1283.
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
It is necessary to call entry/exit functions for parent interrupt
controllers for proper masking/unmasking of interrupt lines.
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
It is necessary to use hwirq instead of virq when you communicate
with an interrupt controller since there is no guaranty that virq
numbers match hwirq numbers.
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Semantics of NR_IRQS is different on machines with SPARSE_IRQ option
disabled or enabled, in the latter case IRQs are allocated starting
at least from the value specified by NR_IRQS and going upwards, so
the check of (irq >= NR_IRQ) to decide about an error code returned by
platform_get_irq() is completely invalid, don't attempt to overrule
irq subsystem in the driver.
The change fixes LPC32xx NAND MLC driver initialization on boot.
Fixes: 8cb17b5ed0 ("irqchip: Add LPC32xx interrupt controller driver")
Cc: stable@kernel.org # v4.7+
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Sylvain Lemieux <slemieux.tyco@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The commit 7c7289a404 ("spi: pxa2xx: Default thresholds to PXA
configuration") while splitting up CE4100 code obviously missed a break
condition in one chunk. Add it here.
Looks like we have no active user of CE4100, though better to fix this later
than never.
Fixes: commit 7c7289a404 ("spi: pxa2xx: Default thresholds to PXA configuration")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The commit a3ff958236 ("spi: dw-mid: switch to new dmaengine_terminate_*
API") converted mid_spi_dma_exit() but missed mid_spi_dma_stop().
This is follow up to convert the rest.
Fixes: a3ff958236 ("spi: dw-mid: switch to new dmaengine_terminate_* API")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
An error before the hardif is found has to free the skb. But every error
after that has to free the skb + put the hard interface.
Fixes: 8def0be82d ("batman-adv: Consume skb in batadv_frag_send_packet")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Booting Linux on a mx6q based board leads to the following warning:
(NULL device *): hwmon_device_register() is deprecated. Please convert the
driver to use hwmon_device_register_with_info().
,so do as suggested.
Also, this results in the core taking care of creating the 'name'
attribute, so drop the code doing that from the thermal driver.
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Visually separate register ranges (address/size pairs) in reg prop.
Change DMA channel name, for consistency with other drivers.
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
NF_CT_PROTO_DCCP/SCTP/UDPLITE were switched from tristate to boolean so
defconfig needs to be adjusted to silence warnings:
warning: symbol value 'm' invalid for NF_CT_PROTO_DCCP
warning: symbol value 'm' invalid for NF_CT_PROTO_SCTP
warning: symbol value 'm' invalid for NF_CT_PROTO_UDPLITE
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
The patch removes these warnings reported by dtc 1.4:
Warning (unit_address_vs_reg): Node /amba_apu has a reg or ranges
property, but no unit name
Warning (unit_address_vs_reg): Node /memory has a reg or ranges
property, but no unit name
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Fix build errors on arch/um, which does not support HAS_IOMEM,
while the oxnas_nand.c driver uses interfaces that are
supplied by HAS_IOMEM.
(loadable module build:)
ERROR: "devm_ioremap_resource" [drivers/mtd/nand/oxnas_nand.ko] undefined!
or (built-in build:)
drivers/built-in.o: In function `oxnas_nand_probe':
drivers/mtd/nand/oxnas_nand.c:102: undefined reference to `devm_ioremap_resource'
Fixes: 6685924924 ("mtd: nand: Add OX820 NAND Support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Remove the usage of modules functions to make this driver compile
again. Otherwise an include of linux/modules.h would be needed.
Fixes: 024366750c ("mtd: nand: xway: convert to normal platform driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The xway_nand driver accesses the ltq_ebu_membase symbol which is not
exported. This also should not get exported and we should handle the
EBU interface in a better way later. This quick fix just deactivated
support for building as module.
Fixes: 99f2b10792 ("mtd: lantiq: Add NAND support on Lantiq XWAY SoC.")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Some system have multiple dw devices. Currently the driver uses a
fixed name for the debugfs dir. Append dev name to the debugfs dir
name to make it unique.
Signed-off-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: Mark Brown <broonie@kernel.org>
I use Patchwork for handling incoming patches. Put its address here so
submitters could know what is in the queue.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Kukjin Kim <kgene@kernel.org>
* patchwork:
[media] s5k4ecgx: select CRC32 helper
[media] dvb: avoid warning in dvb_net
[media] v4l: tvp5150: Don't override output pinmuxing at stream on/off time
[media] v4l: tvp5150: Fix comment regarding output pin muxing
[media] v4l: tvp5150: Reset device at probe time, not in get/set format handlers
[media] pctv452e: move buffer to heap, no mutex
[media] media/cobalt: use pci_irq_allocate_vectors
[media] cec: fix race between configuring and unconfiguring
[media] cec: move cec_report_phys_addr into cec_config_thread_func
[media] cec: replace cec_report_features by cec_fill_msg_report_features
[media] cec: update log_addr[] before finishing configuration
[media] cec: CEC_MSG_GIVE_FEATURES should abort for CEC version < 2
[media] cec: when canceling a message, don't overwrite old status info
[media] cec: fix report_current_latency
[media] smiapp: Make suspend and resume functions __maybe_unused
[media] smiapp: Implement power-on and power-off sequences without runtime PM
A rare randconfig build failure shows up in this driver when
the CRC32 helper is not there:
drivers/media/built-in.o: In function `s5k4ecgx_s_power':
s5k4ecgx.c:(.text+0x9eb4): undefined reference to `crc32_le'
This adds the 'select' that all other users of this function have.
Fixes: 8b99312b72 ("[media] Add v4l2 subdev driver for S5K4ECGX sensor")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
With gcc-5 or higher on x86, we can get a bogus warning in the
dvb-net code:
drivers/media/dvb-core/dvb_net.c: In function 'dvb_net_ule':
arch/x86/include/asm/string_32.h:78:22: error: '*((void *)&dest_addr+4)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
The problem here is that gcc doesn't track all of the conditions
to prove it can't end up copying uninitialized data.
This changes the logic around so we zero out the destination
address earlier when we determine that it is not set here.
This allows the compiler to figure it out.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The s_stream() handler incorrectly writes the whole MISC_CTL register to
enable or disable the outputs, overriding the output pinmuxing
configuration. Fix it to only touch the output enable bits.
The CONF_SHARED_PIN register is also written by the same function,
resulting in muxing the INTREQ signal instead of the VBLK/GPCL signal on
the INTREQ/GPCL/VBLK pin. As the driver doesn't support interrupts this
is obviously incorrect, and breaks operation on other devices. Fix it by
removing the write.
Cc: stable@vger.kernel.org # For Kernel 4.5 and upper
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The FID/GLCO/VLK/HVLK and INTREQ/GPCL/VBLK pins are muxed differently
depending on whether the input is an S-Video or composite signal. The
comment that explains the logic doesn't reflect the code. It appears
that the comment is incorrect, as disabling the output data bus in
composite mode makes no sense. Update the comment to match the code.
While at it define macros for the MISC_CTL register bits, the code is
too confusing with numerical values.
Cc: stable@vger.kernel.org # For Kernel 4.5 and upper
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The tvp5150 doesn't support format setting through the subdev pad API
and thus implements the set format handler as a get format operation.
The single handler, tvp5150_fill_fmt(), resets the device by calling
tvp5150_reset(). This causes malfunction as the device can be reset at
will, possibly from userspace when the subdev userspace API is enabled.
The reset call was added in commit ec2c4f3f93 ("[media] media:
tvp5150: Add mbus_fmt callbacks"), probably as an attempt to set the
device to a known state before detecting the current TV standard.
However, the get format handler doesn't access the hardware to get the
TV standard since commit 963ddc63e2 ("[media] media: tvp5150: Add
cropping support"). There is thus no need to reset the device when
getting the format.
However, removing the tvp5150_reset() from the get/set format handlers
results in the function not being called at all if the bridge driver
doesn't use the .reset() operation. The operation is nowadays abused and
shouldn't be used, so shouldn't expect bridge drivers to call it. To
make sure the device is properly initialize, move the reset call from
the format handlers to the probe function.
Cc: stable@vger.kernel.org # For Kernel 4.5 and upper
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
commit 73d5c5c864 ("[media] pctv452e: don't do DMA on stack") caused
a NULL pointer dereference which occurs when dvb_usb_init()
calls dvb_usb_device_power_ctrl() for the first time, before the
frontend has been attached. It also caused a recursive deadlock because
tt3650_ci_msg_locked() has already locked the mutex.
So, partially revert it, but move the buffer to the heap
(DMA capable), not to the stack (may not be DMA capable).
Instead of sharing one buffer which needs mutex protection,
do a new heap allocation for each call.
Fixes: commit 73d5c5c864 ("[media] pctv452e: don't do DMA on stack")
Cc: stable@vger.kernel.org # For Kernel 4.9
Signed-off-by: Max Kellermann <max.kellermann@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Simply the interrupt setup by using the new PCI layer helpers.
Despite using pci_enable_msi_range, this driver was only requesting a
single MSI vector anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This race was discovered by running cec-compliance -A with the cec module debug
parameter set to 2: suddenly the test would fail.
It turns out that this happens when the test configures the adapter in
non-blocking mode, then it waits for the CEC_EVENT_STATE_CHANGE event and once
the event is received it unconfigures the adapter.
What happened was that the unconfigure was executed while the configure was
still transmitting the Report Features and Report Physical Address messages.
This messed up the internal state of the cec_adapter.
The fix is to transmit those messages with the adap->lock mutex held (this will
just queue them up in the internal transmit queue, and not actually transmit
anything yet). Only unlock the mutex once everything is done. The main thread
will dequeue the messages from the internal transmit queue and transmit them
one by one, unless an unconfigure was done, and in that case any messages are
just dropped.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
It's only a small function and this makes it easier to switch to
transmitting the message with adap->lock held in the next patch.
Signed-off-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The fill function just fills in the cec_msg struct, it doesn't transmit
the message. This is now done explicitly.
This makes it possible to switch to transmitting this message with adap->lock
held.
Signed-off-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The loop that sets the unused logical addresses to INVALID should be
done before 'configured' is set to true. This ensures that cec_log_addrs
is consistent before it will be used.
Signed-off-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This is a 2.0 only message, so it should return Feature Abort if the
adapter is configured for CEC version 1.4.
Right now it does nothing, which means that the sender will time out.
Signed-off-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
When a pending message was canceled (e.g. due to a timeout), then the
old tx_status info was overwritten instead of ORed. The same happened
with the tx_error_cnt field. So just modify them instead of overwriting
them.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
In the (very) small print of the REPORT_CURRENT_LATENCY message there is a
line that says that the last byte of the message (audio out delay) is only
present if the 'audio out compensated' value is 3.
I missed this, and so if this message was sent with a total length of 6 (i.e.
without the audio out delay byte), then it was rejected by the framework
since a minimum length of 7 was expected.
Fix this minimum length check and update the wrappers in cec-funcs.h to do
the right thing based on the message length.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The smiapp_suspend() and smiapp_resume() functions will end up being unused
if CONFIG_PM is enabled but CONFIG_PM_SLEEP is disabled, causing a
compiler warning from both of the function definitions. Fix this by
marking the functions with __maybe_unused.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
4286db8456 ("spi: sh-msiof: Add R-Car Gen 2 and 3 fallback bindings")
added a C++ style comment. This is not in keeping with the style used
for comments elsewhere in this fine. Update it accordingly.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Mark Brown <broonie@kernel.org>
spi->irq is an unsigned integer hence the check if status is less than
zero has no effect. Fix this by replacing spi->irq with an int irq
so the less than zero compare will correctly detect errors.
Issue found with static analysis with CoverityScan, CID1388567
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Romain Perier <romain.perier@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
In the case of Renesas R-Car hardware we know that there are generations of
SoCs, e.g. Gen 2 and Gen 3. But beyond that it's not clear what the
relationship between IP blocks might be. For example, I believe that
r8a7790 is older than r8a7791 but that doesn't imply that the latter is a
descendant of the former or vice versa.
We can, however, by examining the documentation and behaviour of the
hardware at run-time observe that the current driver implementation appears
to be compatible with the IP blocks on SoCs within a given generation.
For the above reasons and convenience when enabling new SoCs a
per-generation fallback compatibility string scheme is being adopted for
drivers for Renesas SoCs.
Also:
* Deprecate renesas,sh-msiof. It seems poorly named as it is only
compatible with SH-Mobile. It also appears unused in mainline.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Mark Brown <broonie@kernel.org>
If NO_DMA=y:
ERROR: "bad_dma_ops" [drivers/spi/spi-fsl-dspi.ko] undefined!
Add a dependency on HAS_DMA to fix this.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
In some cases, some sensors didn't need the trip points, the
set_trips will pass {-INT_MAX, INT_MAX} to trigger tsadc alarm in the end,
ignore this case and disable the high temperature interrupt.
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
In order to support the valid temperature can conver to analog value.
The rockchip thermal driver has not supported the all valid temperature
to convert the analog value. (e.g.: 61C, 62C, 63C....)
For example:
In some cases, we need adjust the trip point.
$cd /sys/class/thermal/thermal_zone*
$echo 68000 > trip_point_0_temp
That will return the max analogic value indicates the invalid before
posting this patch.
So, this patch will optimize the conversion table to support the other
cases.
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The temp_to_code function will return 0 when we set the temperature to a
invalid value (e.g. 61C, 62C, 63C....), that's unpractical. This patch
will prevent this case happening. That will return the max analog value to
indicate the temperature is invalid or over table temperature range.
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
This driver passes struct chip_tsadc_table by value throughout; this is
inefficient, and AFAICT, there is no reason for it. Let's pass pointers
instead.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
These error messages don't give much information about what went wrong.
It would be nice, for one, to see what invalid temperature was being
requested when conversion fails. It's also good to return an error when
we can't handle a conversion properly.
While we're at it, fix the grammar too.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-12-13 20:31:54 -08:00
574 changed files with 5334 additions and 3074 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.