Pull nouveau and radeon fixes from Dave Airlie:
"Just some nouveau and radeon/amdgpu fixes.
The nouveau fixes look large as the firmware context files are
regenerated, but the actual change is quite small"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: make some dpm errors debug only
drm/nouveau/volt/pwm/gk104: fix an off-by-one resulting in the voltage not being set
drm/nouveau/nvif: allow userspace access to its own client object
drm/nouveau/gr/gf100-: fix oops when calling zbc methods
drm/nouveau/gr/gf117-: assume no PPC if NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK is zero
drm/nouveau/gr/gf117-: read NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK from correct GPC
drm/nouveau/gr/gf100-: split out per-gpc address calculation macro
drm/nouveau/bios: return actual size of the buffer retrieved via _ROM
drm/nouveau/instmem: protect instobj list with a spinlock
drm/nouveau/pci: enable c800 magic for some unknown Samsung laptop
drm/nouveau/pci: enable c800 magic for Clevo P157SM
drm/radeon: make rv770_set_sw_state failures non-fatal
drm/amdgpu: move dependency handling out of atomic section v2
drm/amdgpu: optimize scheduler fence handling
drm/amdgpu: remove vm->mutex
drm/amdgpu: add mutex for ba_va->valids/invalids
drm/amdgpu: adapt vce session create interface changes
drm/amdgpu: vce use multiple cache surface starting from stoney
drm/amdgpu: reset vce trap interrupt flag
Pull RTC fixes from Alexandre Belloni:
"Two fixes for the ds1307 alarm and wakeup"
* tag 'rtc-4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: ds1307: fix alarm reading at probe time
rtc: ds1307: fix kernel splat due to wakeup irq handling
Pull MIPS fix from Ralf Baechle:
"Just a fix for empty loops that may be removed by non-antique GCC"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Fix delay loops which may be removed by GCC.
Pull m68k fixes from Geert Uytterhoeven:
"Summary:
- Add missing initialization of max_pfn, which is needed to make
selftests/vm/mlock2-tests succeed,
- Wire up new mlock2 syscall"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Wire up mlock2
m68knommu: Add missing initialization of max_pfn and {min,max}_low_pfn
m68k/mm: sun3 - Add missing initialization of max_pfn and {min,max}_low_pfn
m68k/mm: m54xx - Add missing initialization of max_pfn
m68k/mm: motorola - Add missing initialization of max_pfn
Pull ARM fixes from Russell King:
"Just two changes this time around:
- wire up the new mlock2 syscall added during the last merge window
- fix a build problem with certain configurations provoked by making
CONFIG_OF user selectable"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8454/1: OF implies OF_FLATTREE
ARM: wire up mlock2 syscall
Pull SCSI target fixes from Nicholas Bellinger:
- fix tcm-user backend driver expired cmd time processing (agrover)
- eliminate kref_put_spinlock_irqsave() for I/O completion (bart)
- fix iscsi login kthread failure case hung task regression (nab)
- fix COMPARE_AND_WRITE completion use-after-free race (nab)
- fix COMPARE_AND_WRITE with SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC non zero
SGL offset data corruption. (Jan + Doug)
- fix >= v4.4-rc1 regression for tcm_qla2xxx enable configfs attribute
(Himanshu + HCH)
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target/stat: print full t10_wwn.model buffer
target: fix COMPARE_AND_WRITE non zero SGL offset data corruption
qla2xxx: Fix regression introduced by target configFS changes
kref: Remove kref_put_spinlock_irqsave()
target: Invoke release_cmd() callback without holding a spinlock
target: Fix race for SCF_COMPARE_AND_WRITE_POST checking
iscsi-target: Fix rx_login_comp hang after login failure
iscsi-target: return -ENOMEM instead of -1 in case of failed kmalloc()
target/user: Do not set unused fields in tcmu_ops
target/user: Fix time calc in expired cmd processing
Pull thermal management fixes from Zhang Rui:
"Specifics:
- several fixes and cleanups on Rockchip thermal drivers.
- add the missing support of RK3368 SoCs in Rockchip driver.
- small fixes on of-thermal, power_allocator, rcar driver, IMX, and
QCOM drivers, and also compilation fixes, on thermal.h, when thermal
is not selected"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
imx: thermal: use CPU temperature grade info for thresholds
thermal: fix thermal_zone_bind_cooling_device prototype
Revert "thermal: qcom_spmi: allow compile test"
thermal: rcar_thermal: remove redundant operation
thermal: of-thermal: Reduce log level for message when can't fine thermal zone
thermal: power_allocator: Use temperature reading from tz
thermal: rockchip: Support the RK3368 SoCs in thermal driver
thermal: rockchip: consistently use int for temperatures
thermal: rockchip: Add the sort mode for adc value increment or decrement
thermal: rockchip: improve the conversion function
thermal: rockchip: trivial: fix typo in commit
thermal: rockchip: better to compatible the driver for different SoCs
dt-bindings: rockchip-thermal: Support the RK3368 SoCs compatible
A cut 'n' paste error meant that only sizeof(t10_wwn.vendor) characters were processed.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
target_core_sbc's compare_and_write functionality takes data from
the wrong memory location when writing a CAW request to disk
if an SGL offset is non-zero.
This can happen with loopback and vhost-scsi fabric drivers when
SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC is used to map existing user-space
SGL memory into COMPARE_AND_WRITE READ/WRITE payload buffers.
Given the following sample LIO subtopology,
% targetcli ls /loopback/
o- loopback ................................. [1 Target]
o- naa.6001405ebb8df14a ....... [naa.60014059143ed2b3]
o- luns ................................... [2 LUNs]
o- lun0 ................ [iblock/ram0 (/dev/ram0)]
o- lun1 ................ [iblock/ram1 (/dev/ram1)]
% lsscsi -g
[3:0:1:0] disk LIO-ORG IBLOCK 4.0 /dev/sdc /dev/sg3
[3:0:1:1] disk LIO-ORG IBLOCK 4.0 /dev/sdd /dev/sg4
the following bug can be observed in Linux 4.3 and 4.4~rc1:
% perl -e 'print chr$_ for 0..255,reverse 0..255' >rand
% perl -e 'print "\0" x 512' >zero
% cat rand >/dev/sdd
% sg_compare_and_write -i rand -D zero --lba 0 /dev/sdd
% sg_compare_and_write -i zero -D rand --lba 0 /dev/sdd
Miscompare reported
% hexdump -Cn 512 /dev/sdd
00000000 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
00000200
Rather than writing all-zeroes as instructed with the -D file, it
corrupts the data in the sector by splicing some of the original
bytes in. The page of the first entry of cmd->t_data_sg includes the
CDB, and sg->offset is set to a position past the CDB. I presume that
sg->offset is also the right choice to use for subsequent sglist
members.
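The effect of ignoring a per-entry offset can be sketched in userspace. This is only an illustration, not the kernel code: `struct sg_entry` and `gather_payload()` are hypothetical stand-ins for a scatterlist entry and the copy step, and the assumption is simply that the payload of each entry begins at `page + offset`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a scatterlist entry: payload lives at page + offset. */
struct sg_entry {
    const unsigned char *page;
    size_t offset;
    size_t length;
};

/* Copy the payload described by n entries into dst; returns the bytes copied. */
size_t gather_payload(unsigned char *dst, size_t dst_len,
                      const struct sg_entry *sg, size_t n)
{
    size_t done = 0;

    for (size_t i = 0; i < n && done < dst_len; i++) {
        size_t chunk = sg[i].length;

        if (chunk > dst_len - done)
            chunk = dst_len - done;
        /*
         * Honour the per-entry offset. Starting at sg[i].page instead
         * would splice in whatever precedes the payload in that page.
         */
        memcpy(dst + done, sg[i].page + sg[i].offset, chunk);
        done += chunk;
    }
    return done;
}
```

With a page whose first bytes hold unrelated data and whose payload starts at a non-zero offset, only the payload bytes reach `dst`.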
Signed-off-by: Jan Engelhardt <jengelh@netitwork.de>
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes the following regression:
# targetcli
[Errno 13] Permission denied: '/sys/kernel/config/target/qla2xxx/21:00:00:0e:1e:08:c7:20/tpgt_1/enable'
Fixes: 2eafd72939 ("target: use per-attribute show and store methods")
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch addresses a race + use after free where the first
stage of COMPARE_AND_WRITE in compare_and_write_callback()
is rescheduled after the backend sends the secondary WRITE,
resulting in second stage compare_and_write_post() callback
completing in target_complete_ok_work() before the first
can return.
Because current code depends on checking se_cmd->se_cmd_flags
after return from se_cmd->transport_complete_callback(),
this results in first stage having SCF_COMPARE_AND_WRITE_POST
set, which incorrectly falls through into second stage CAW
processing code, eventually triggering a NULL pointer
dereference due to use after free.
To address this bug, pass in a new *post_ret parameter into
se_cmd->transport_complete_callback(), and depend upon this
value instead of ->se_cmd_flags to determine when to return
or fall through into ->queue_status() code for CAW.
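A minimal sketch of the idea, assuming an out-parameter whose non-zero value means "the post stage ran, continue to status queuing". All names here (`struct cmd`, `CMD_FLAG_POST`, `complete_work()`, the two callbacks) are hypothetical and only model the control flow described above; the exact meaning of `*post_ret` in the real code is an assumption.

```c
#include <stdbool.h>

/* Hypothetical command object with a flags word shared between contexts. */
struct cmd {
    unsigned int flags;
};
#define CMD_FLAG_POST 0x1u

enum next_step { STEP_RETURN, STEP_QUEUE_STATUS };

typedef int (*complete_cb)(struct cmd *cmd, int *post_ret);

/* First stage: by the time it returns, the second stage may already have run. */
int first_stage_cb(struct cmd *cmd, int *post_ret)
{
    cmd->flags |= CMD_FLAG_POST;   /* models the racing second stage */
    *post_ret = 0;                 /* this invocation was NOT the post stage */
    return 0;
}

/* Second stage: reports through the out-parameter that it was the post stage. */
int second_stage_cb(struct cmd *cmd, int *post_ret)
{
    (void)cmd;
    *post_ret = 1;
    return 0;
}

enum next_step complete_work(struct cmd *cmd, complete_cb cb)
{
    int post_ret = 0;
    int rc = cb(cmd, &post_ret);

    /*
     * Decide from the callback's own report rather than from cmd->flags,
     * which another context may have changed while the callback ran.
     */
    if (rc == 0 && !post_ret)
        return STEP_RETURN;
    return STEP_QUEUE_STATUS;
}
```

Even though the first-stage callback leaves the shared flag set, the caller returns early because the decision no longer depends on that flag.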
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch addresses a case where iscsi_target_do_tx_login_io()
fails sending the last login response PDU, after the RX/TX
threads have already been started.
The case centers around iscsi_target_rx_thread() not invoking
allow_signal(SIGINT) before the send_sig(SIGINT, ...) occurs
from the failure path, resulting in RX thread hanging
indefinitely on iscsi_conn->rx_login_comp.
Note this bug is a regression introduced by:
commit e54198657b
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Wed Jul 22 23:14:19 2015 -0700
iscsi-target: Fix iscsit_start_kthreads failure OOPs
To address this bug, complete ->rx_login_comp for good
measure in the failure path, and immediately return from
RX thread context if connection state did not actually reach
full feature phase (TARG_CONN_STATE_LOGGED_IN).
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Smatch complains about returning hard coded error codes, silence this
warning.
drivers/target/iscsi/iscsi_target_parameters.c:211
iscsi_create_default_params() warn: returning -1 instead of -ENOMEM is sloppy
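The pattern the warning asks for can be sketched as follows. This is a userspace illustration only: `struct param_list`, `create_params()` and the injectable `alloc_fn` are hypothetical, with the allocator passed in purely so the failure path can be exercised.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical parameter list; the real structure is not shown here. */
struct param_list {
    int count;
};

typedef void *(*alloc_fn)(size_t size);

/* Allocator that always fails, used to exercise the error path. */
void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

/* Returns 0 on success or a negative errno value, never a bare -1. */
int create_params(struct param_list **out, alloc_fn alloc)
{
    struct param_list *pl = alloc(sizeof(*pl));

    if (!pl)
        return -ENOMEM;   /* tells the caller *why* the call failed */

    pl->count = 0;
    *out = pl;
    return 0;
}
```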
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
TCMU sets TRANSPORT_FLAG_PASSTHROUGH, so INQUIRY commands will not be
emulated by LIO but passed up to userspace. Therefore TCMU should not
set these, just like pscsi doesn't.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Reversed arguments meant that we were doing nothing for cmds whose deadline
had passed.
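How swapped arguments silently invert an expiry check can be shown with a small sketch. The macro below is modeled on the kernel's wrap-safe `time_after(a, b)` ("a is later than b"); `tick_after`, `struct pending_cmd` and both functions are hypothetical names, and the exact expression used in the driver is not shown in this message.

```c
#include <stdbool.h>

/* Wrap-safe "a is later than b", modeled on the kernel's time_after(). */
#define tick_after(a, b) ((long)((unsigned long)(b) - (unsigned long)(a)) < 0)

/* Hypothetical pending command carrying an absolute deadline in ticks. */
struct pending_cmd {
    unsigned long deadline;
};

/* Intended check: the command has expired once "now" is past the deadline. */
bool cmd_expired(const struct pending_cmd *cmd, unsigned long now)
{
    return tick_after(now, cmd->deadline);
}

/* Arguments swapped: true only while the deadline is still in the future. */
bool cmd_expired_reversed(const struct pending_cmd *cmd, unsigned long now)
{
    return tick_after(cmd->deadline, now);
}
```

With a deadline of 100 ticks, the intended check fires at tick 150 and the reversed one does not, so expired commands would never be handled.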
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
On the ARM architecture, individual platforms select CONFIG_USE_OF if they
need it, but all device tree code is keyed off CONFIG_OF. When building
a platform without DT support and manually enabling CONFIG_OF, we now
get a number of build errors, e.g.
arch/arm/kernel/devtree.c: In function 'setup_machine_fdt':
arch/arm/kernel/devtree.c:215:19: error: implicit declaration of function 'early_init_dt_verify' [-Werror=implicit-function-declaration]
We could now try to separate the use case of booting from DT vs. the
case of using the dynamic implementation, but that seems more complicated
than it can gain us.
This simply changes the ARM Kconfig file to always enable OF_RESERVED_MEM
and OF_EARLY_FLATTREE when CONFIG_OF is enabled. These options add a little
extra code when we just want the dynamic OF implementation, but that seems
like a rather obscure case, and this version solves all CONFIG_OF related
randconfig regressions.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 0166dc11be ("of: make CONFIG_OF user selectable")
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull PCI fixes from Bjorn Helgaas:
"Here are a few fixes I'd like to have in v4.4: a generic one for sysfs
and three for HiSilicon and DesignWare host controllers.
Summary:
NUMA:
- Prevent out of bounds access in numa_node override (Mathias Krause)
HiSilicon host bridge driver:
- Fix deferred probing (Arnd Bergmann)
Synopsys DesignWare host bridge driver:
- Remove incorrect io_base assignment (Stanimir Varbanov)
- Move align_resource function pointer to pci_host_bridge structure
(Gabriele Paoloni)"
* tag 'pci-v4.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
ARM/PCI: Move align_resource function pointer to pci_host_bridge structure
PCI: hisi: Fix deferred probing
PCI: designware: Remove incorrect io_base assignment
PCI: Prevent out of bounds access in numa_node override
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
Stable patches:
- Fix a NFSv4 callback identifier leak that was also causing client
crashes
- Fix NFSv4 callback decoding issues when incoming requests are
truncated
- Don't declare the attribute cache valid when we call
nfs_update_inode with an empty attribute structure.
- Resend LAYOUTGET when there is a race that changes the seqid
Bugfixes:
- Fix a number of issues with the NFSv4.2 CLONE ioctl()
- Properly set NFS v4.2 NFSDBG_FACILITY
- NFSv4 referrals are broken; Cleanup FATTR4_WORD0_FS_LOCATIONS after
decoding success
- Use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
- Ensure that attrcache is revalidated after a SETATTR"
* tag 'nfs-for-4.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs4: resend LAYOUTGET when there is a race that changes the seqid
nfs: if we have no valid attrs, then don't declare the attribute cache valid
nfs: ensure that attrcache is revalidated after a SETATTR
nfs4: limit callback decoding to received bytes
nfs4: start callback_ident at idr 1
nfs: use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
NFS4: Cleanup FATTR4_WORD0_FS_LOCATIONS after decoding success
NFS: Properly set NFS v4.2 NFSDBG_FACILITY
nfs: reduce the amount of ifdefs for v4.2 in nfs4file.c
nfs: use btrfs ioctl defintions for clone
nfs: allow intra-file CLONE
nfs: offer native ioctls even if CONFIG_COMPAT is set
nfs: pass on count for CLONE operations
Pull watchdog fixes from Wim Van Sebroeck:
- a null pointer dereference fix for omap_wdt
- some clock related fixes for pnx4008
- an underflow fix in wdt_set_timeout() for w83977f_wdt
- restart fix for tegra wdt
- Kconfig change to support Freescale Layerscape platforms
- fix for stopping the mtk_wdt watchdog
* git://www.linux-watchdog.org/linux-watchdog:
watchdog: mtk_wdt: Use MODE_KEY when stopping the watchdog
watchdog: Add support for Freescale Layerscape platforms
watchdog: tegra: Stop watchdog first if restarting
watchdog: w83977f_wdt: underflow in wdt_set_timeout()
watchdog: pnx4008: make global wdt_clk static
watchdog: pnx4008: fix warnings caused by enabling unprepared clock
watchdog: omap_wdt: fix null pointer dereference
Pull btrfs fixes from Chris Mason:
"This has Mark Fasheh's patches to fix quota accounting during subvol
deletion, which we've been working on for a while now. The patch is
pretty small but it's a key fix.
Otherwise it's a random assortment"
* 'for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: fix balance range usage filters in 4.4-rc
btrfs: qgroup: account shared subtree during snapshot delete
Btrfs: use btrfs_get_fs_root in resolve_indirect_ref
btrfs: qgroup: fix quota disable during rescan
Btrfs: fix race between cleaner kthread and space cache writeout
Btrfs: fix scrub preventing unused block groups from being deleted
Btrfs: fix race between scrub and block group deletion
btrfs: fix rcu warning during device replace
btrfs: Continue replace when set_block_ro failed
btrfs: fix clashing number of the enhanced balance usage filter
Btrfs: fix the number of transaction units needed to remove a block group
Btrfs: use global reserve when deleting unused block group after ENOSPC
Btrfs: tests: checking for NULL instead of IS_ERR()
btrfs: fix signed overflows in btrfs_sync_file
Pull security layer fixes from James Morris:
"A fix for SELinux policy processing (regression introduced by
commit fa1aa143ac: "selinux: extended permissions for ioctls"), as
well as a fix for the user-triggerable oops in the Keys code"
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
KEYS: Fix handling of stored error in a negatively instantiated user key
selinux: fix bug in conditional rules handling
Pull ARM SoC fixes from Arnd Bergmann:
"There is a small backlog of at91 patches here, the most significant is
the addition of some sama5d2 Xplained nodes that were waiting on an
MFD include file to get merged through another tree.
We normally try to sort those out before the merge window opens, but
the maintainer wasn't aware of that here and I decided to merge the
changes this time as an exception.
On OMAP a series of audio changes for dra7 missed the merge window but
turned out to be necessary to fix a boot time imprecise external abort
error and to get audio working.
The other changes are the usual simple changes, here is a list sorted
by platform:
at91:
removal of a useless defconfig option
removal of some legacy DT pieces
use of the proper watchdog compatible string
update of the MAINTAINERS entries for some Atmel drivers
drivers/scpi:
hide get_scpi_ops in module from built-in code
imx:
add missing .irq_set_type for i.MX GPC irq_chip.
fix the wrong spi-num-chipselects settings for Vybrid DSPI devices.
fix a merge error in Vybrid dts regarding to ADC device property
keystone:
fix the optional PDSP firmware loading
fix linking RAM setup for QMs
fix crash with clk_ignore_unused
mediatek:
Enable SCPSYS power domain driver by default
mvebu:
fix QNAP TS219 power-off in dts
fix legacy get_irqnr_and_base for dove and orion5x
omap:
fix l4 related boot time errors for dm81xx
use lockless clkdm/pwrdm api in omap4_boot_secondary
remove t410 abort handler to avoid hiding other critical errors
mark cpuidle tracepoints as _rcuidle
fix module alias for omap-ocp2scp
pxa:
palm: Fix typos in PWM lookup table code
renesas:
missing __initconst annotation for r8a7793_boards_compat_dt
rockchip:
disable mmc-tuning on the veyron-minnie board
adding the init state for the over-temperature-protection
zx:
only build power domain code when CONFIG_PM=y"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits)
ARM: OMAP4+: SMP: use lockless clkdm/pwrdm api in omap4_boot_secondary
arm: omap2+: add missing HWMOD_NO_IDLEST in 81xx hwmod data
ARM: orion5x: Fix legacy get_irqnr_and_base
ARM: dove: Fix legacy get_irqnr_and_base
soc: Mediatek: Enable SCPSYS power domain driver by default
ARM: dts: vfxxx: Fix dspi[01] spi-num-chipselects.
ARM: dts: keystone: k2l: fix kernel crash when clk_ignore_unused is not in bootargs
soc: ti: knav_qmss_queue: Fix linking RAM setup for queue managers
soc: ti: use request_firmware_direct() as acc firmware is optional
ARM: imx: add platform irq type setting in gpc
ARM: dts: vfxxx: Fix erroneous property in esdhc0 node
ARM: shmobile: r8a7793: proper constness with __initconst
scpi: hide get_scpi_ops in module from built-in code
ARM: zx: only build power domain code when CONFIG_PM=y
ARM: pxa: palm: Fix typos in PWM lookup table code
ARM: dts: Kirkwood: Fix QNAP TS219 power-off
ARM: dts: rockchip: Add OTP gpio pinctrl to rk3288 tsadc node
ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie
MAINTAINERS: Atmel drivers: change NAND and ISI entries
ARM: at91/dt: sama5d2 Xplained: add several devices
...
Pull more power management and ACPI fixes from Rafael Wysocki:
"These fix one recent regression (cpufreq core), fix up two features
added recently (ACPI CPPC support, SCPI support in the arm_big_little
cpufreq driver) and fix three older bugs in the intel_pstate driver.
Specifics:
- Fix a recent regression in the cpufreq core causing it to fail to
clean up sysfs directories properly on cpufreq driver removal
(Viresh Kumar).
- Fix a build problem in the SCPI support code recently added to the
arm_big_little cpufreq driver (Punit Agrawal).
- Fix up the recently added CPPC cpufreq frontend to process the CPU
coordination information provided by the platform firmware
correctly (Ashwin Chaugule).
- Fix the intel_pstate driver to behave as intended when switched
over to the "performance" mode via sysfs if hardware-driven P-state
selection (HWP) is enabled (Alexandra Yates).
- Fix two rounding errors in the intel_pstate driver that sometimes
cause it to use lower P-states than requested (Prarit Bhargava)"
* tag 'pm+acpi-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
intel_pstate: Fix "performance" mode behavior with HWP enabled
cpufreq: SCPI: Depend on SCPI clk driver
cpufreq: intel_pstate: Fix limits->max_perf rounding error
cpufreq: intel_pstate: Fix limits->max_policy_pct rounding error
cpufreq: Always remove sysfs cpuX/cpufreq link on ->remove_dev()
cpufreq: CPPC: Initialize and check CPUFreq CPU co-ord type correctly
Ben Skeggs wrote:
A couple of regression fixes, some more boards whitelisted for a hw bug
workaround, gr/ucode fixes for hangs a user is seeing.
The changes look larger than they actually are due to the ucode binaries
(*.fucN.h) being regenerated.
* 'linux-4.4' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau/volt/pwm/gk104: fix an off-by-one resulting in the voltage not being set
drm/nouveau/nvif: allow userspace access to its own client object
drm/nouveau/gr/gf100-: fix oops when calling zbc methods
drm/nouveau/gr/gf117-: assume no PPC if NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK is zero
drm/nouveau/gr/gf117-: read NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK from correct GPC
drm/nouveau/gr/gf100-: split out per-gpc address calculation macro
drm/nouveau/bios: return actual size of the buffer retrieved via _ROM
drm/nouveau/instmem: protect instobj list with a spinlock
drm/nouveau/pci: enable c800 magic for some unknown Samsung laptop
drm/nouveau/pci: enable c800 magic for Clevo P157SM
Pull sound fixes from Takashi Iwai:
"Here are no big surprises but just all small fixes, mostly
device-specific quirks for HD-audio and USB-audio:
- Fix for detection of FireWire DICE Loud devices
- Intel Broxton HDMI/DP PCI IDs and relevant quirks
- Noise fixes: Dell XPS13 2015 model, Dell Latitude E6440, Gigabyte
Z170X mobo
- Fix the headphone mixer assignment on HP laptops for PulseAudio
- USB-MIDI fixes for Medeli DD305 and CH345
- Apply fixup for Acer Aspire One Cloudbook 14"
* tag 'sound-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix noise on Gigabyte Z170X mobo
ALSA: hda - Fix headphone noise after Dell XPS 13 resume back from S3
ALSA: hda - Apply HP headphone fixups more generically
ALSA: hda - Add fixup for Acer Aspire One Cloudbook 14
ALSA: hda - apply SKL display power request/release patch to BXT
ALSA: hda - add PCI IDs for Intel Broxton
ALSA: usb-audio: work around CH345 input SysEx corruption
ALSA: usb-audio: prevent CH345 multiport output SysEx corruption
ALSA: usb-audio: add packet size quirk for the Medeli DD305
ALSA: dice: fix detection of Loud devices
ALSA: hda - Fix noise on Dell Latitude E6440
Pull arm64 fixes from Catalin Marinas:
- Build fix when !CONFIG_UID16 (the patch is touching generic files but
it only affects arm64 builds; submitted by Arnd Bergmann)
- EFI fixes to deal with early_memremap() returning NULL and correctly
mapping run-time regions
- Fix CPUID register extraction of unsigned fields (not to be
sign-extended)
- ASID allocator fix to deal with long-running tasks over multiple
generation roll-overs
- Revert support for marking page ranges as contiguous PTEs (it leads
to TLB conflicts and requires additional non-trivial kernel changes)
- Proper early_alloc() failure check
- Disable KASan for 48-bit VA and 16KB page configuration (the pgd is
larger than the KASan shadow memory)
- Update the fault_info table (original descriptions based on early
engineering spec)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: efi: fix initcall return values
arm64: efi: deal with NULL return value of early_memremap()
arm64: debug: Treat the BRPs/WRPs as unsigned
arm64: cpufeature: Track unsigned fields
arm64: cpufeature: Add helpers for extracting unsigned values
Revert "arm64: Mark kernel page ranges contiguous"
arm64: mm: keep reserved ASIDs in sync with mm after multiple rollovers
arm64: KASAN depends on !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
arm64: efi: correctly map runtime regions
arm64: mm: fix fault_info table xFSC decoding
arm64: fix building without CONFIG_UID16
arm64: early_alloc: Fix check for allocation failure
GCC 4.1 and newer remove empty loops. This becomes a problem when delay
loops get removed. Fixed by rewriting to use the proper Linux interface
for such delays.
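The difference can be sketched in userspace. This is an analogue rather than the kernel change itself: `fragile_delay()` and `delay_us()` are hypothetical names, and POSIX `nanosleep()` stands in for whichever kernel delay helper the patch actually uses, which this message does not name.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * Fragile: the loop has no observable effect, so an optimizing compiler
 * is free to remove it and the "delay" disappears.
 */
void fragile_delay(unsigned long loops)
{
    for (unsigned long i = 0; i < loops; i++)
        ;
}

/*
 * Robust: ask a real timing interface to wait. Returns 0 on success or
 * a negative errno value.
 */
int delay_us(long usec)
{
    struct timespec ts = {
        .tv_sec = usec / 1000000L,
        .tv_nsec = (usec % 1000000L) * 1000L,
    };

    /* nanosleep() writes the remaining time back on EINTR, so just retry. */
    while (nanosleep(&ts, &ts) != 0) {
        if (errno != EINTR)
            return -errno;
    }
    return 0;
}
```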
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: John Crispin <blogic@openwrt.org>
Pull ARC fixes from Vineet Gupta:
- Fix for perf callgraph unwinding causing RCU stalls
- Fix to enable Linux to run on non-default Interrupt priority 0
- Removal of pointless SYNC from __switch_to()
* tag 'arc-4.4-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: dw2 unwind: Remove falllback linear search thru FDE entries
ARC: remove SYNC from __switch_to()
ARCv2: Use the default irq priority for idle sleep
ARC: Abstract out ISA specific SLEEP args
ARC: comments update
ARC: switch to arc-linux- CROSS_COMPILE prefix across all configs
Merge "ARM: rockchip: devicetree fixes for 4.4" from Heiko Stuebner:
Two fixes to Rockchip devicetree files, disabling the mmc-tuning
on the veyron-minnie board for now and adding the init state for
the over-temperature-protection to prevent glitches making the
system reboot sometimes.
* tag 'v4.4-rockchip-dts32-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: Add OTP gpio pinctrl to rk3288 tsadc node
ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie
Merge "Renesas ARM Based SoC Fixes for v4.4" from Simon Horman:
* r8a7793 SoC: Annotate r8a7793_boards_compat_dt with __initconst
Aside from being correct this fixes builds that otherwise
fail with section mismatch errors.
* tag 'renesas-fixes-for-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: r8a7793: proper constness with __initconst
Pull xen bug fixes from David Vrabel:
- Fix gntdev and numa balancing.
- Fix x86 boot crash due to unallocated legacy irq descs.
- Fix overflow in evtchn device when > 1024 event channels.
* tag 'for-linus-4.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/evtchn: dynamically grow pending event channel ring
xen/events: Always allocate legacy interrupts on PV guests
xen/gntdev: Grant maps should not be subject to NUMA balancing
Pull powerpc fixes from Michael Ellerman:
- tm: Block signal return from setting invalid MSR state from Michael
Neuling
- tm: Check for already reclaimed tasks from Michael Neuling
* tag 'powerpc-4.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/tm: Check for already reclaimed tasks
powerpc/tm: Block signal return setting invalid MSR state
If more than 1024 event channels are bound to an evtchn device then it
is possible (even with well behaved applications) for the ring to
overflow and events to be lost (reported as an -EFBIG error).
Dynamically increase the size of the ring so there is always enough
space for all bound events. Well behaved applications that only unmask
events after draining them from the ring can thus no longer lose
events.
However, an application could unmask an event before draining it,
allowing multiple entries per port to accumulate in the ring, and an
overflow could still occur. So the overflow detection and reporting
is retained.
The ring size is initially only 64 entries so the common use case of
an application only binding a few events will use less memory than
before. The ring size may grow to 512 KiB (enough for all 2^17
possible channels). This order 7 kmalloc() may fail due to memory
fragmentation, so we fall back to trying vmalloc().
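The growth policy can be sketched in userspace as follows. The sizes come from the description above (64 initial entries, 2^17 entries maximum, and 4-byte entries implied by 512 KiB / 2^17); everything else, including `struct evt_ring`, the function names and the doubling strategy, is an assumption for illustration, and `calloc()` stands in for the kmalloc()/vmalloc() pair.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define RING_MIN_ENTRIES 64u
#define RING_MAX_ENTRIES (1u << 17)

/* Hypothetical ring of pending ports; size is always a power of two. */
struct evt_ring {
    uint32_t *slots;
    unsigned int size;
    unsigned int cons, prod;   /* free-running consumer/producer indices */
};

int ring_init(struct evt_ring *r)
{
    r->slots = calloc(RING_MIN_ENTRIES, sizeof(*r->slots));
    if (!r->slots)
        return -ENOMEM;
    r->size = RING_MIN_ENTRIES;
    r->cons = r->prod = 0;
    return 0;
}

/* Grow the ring so that nr_bound ports can each be queued once. */
int ring_reserve(struct evt_ring *r, unsigned int nr_bound)
{
    unsigned int new_size = r->size;
    uint32_t *new_slots;

    if (nr_bound > RING_MAX_ENTRIES)
        return -EINVAL;
    while (new_size < nr_bound)
        new_size *= 2;
    if (new_size == r->size)
        return 0;

    new_slots = calloc(new_size, sizeof(*new_slots));
    if (!new_slots)
        return -ENOMEM;

    /* Re-place pending entries at their index modulo the new size. */
    for (unsigned int i = r->cons; i != r->prod; i++)
        new_slots[i & (new_size - 1)] = r->slots[i & (r->size - 1)];

    free(r->slots);
    r->slots = new_slots;
    r->size = new_size;
    return 0;
}

/* Overflow detection is kept: a full ring reports -EFBIG. */
int ring_push(struct evt_ring *r, uint32_t port)
{
    if (r->prod - r->cons >= r->size)
        return -EFBIG;
    r->slots[r->prod++ & (r->size - 1)] = port;
    return 0;
}

int ring_pop(struct evt_ring *r, uint32_t *port)
{
    if (r->cons == r->prod)
        return -EAGAIN;
    *port = r->slots[r->cons++ & (r->size - 1)];
    return 0;
}
```

Reserving room for 1025 bound ports grows a 64-entry ring to 2048 entries while keeping queued entries in order, and the maximum of 2^17 four-byte entries works out to exactly 512 KiB.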
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Even though initcall return values are typically ignored, the
prototype is to return 0 on success or a negative errno value on
error. So fix the arm_enable_runtime_services() implementation to
return 0 on conditions that are not in fact errors, and return a
meaningful error code otherwise.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add NULL return value checks to two invocations of early_memremap()
in the UEFI init code. For the UEFI configuration tables, we just
warn since we have a better chance of being able to report the issue
in a way that can actually be noticed by a human operator if we don't
abort right away. For the UEFI memory map, however, all we can do is
panic() since we cannot proceed without a description of memory.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Some of the feature bits have unsigned values and need
to be treated accordingly to avoid errors. Add the property
to the feature bits and use the appropriate field extract helpers.
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
After commit 8c058b0b9c ("x86/irq: Probe for PIC presence before
allocating descs for legacy IRQs") early_irq_init() will no longer
preallocate descriptors for legacy interrupts if PIC does not
exist, which is the case for Xen PV guests.
Therefore we may need to allocate those descriptors ourselves.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
The cpuid_feature_extract_field() helper extracts the feature value
as a signed integer. This could be problematic for features
whose values are unsigned, e.g. ID_AA64DFR0_EL1:BRPs. Add
an unsigned variant for the unsigned fields.
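A minimal sketch of the difference, assuming 4-bit fields in a 64-bit register value. The function names are hypothetical and the bit position used in the example is chosen only for illustration; the signed variant relies on arithmetic right shift of negative values, which GCC and Clang provide but the C standard leaves implementation-defined.

```c
#include <stdint.h>

/*
 * Sign-extending extract of the 4-bit field starting at bit `field`:
 * shift the field to the top of the word, then shift back arithmetically.
 */
int extract_field_signed(uint64_t features, int field)
{
    return (int)((int64_t)(features << (64 - 4 - field)) >> (64 - 4));
}

/* Zero-extending extract of the same 4-bit field. */
unsigned int extract_field_unsigned(uint64_t features, int field)
{
    return (unsigned int)((features >> field) & 0xf);
}
```

A field holding 0xf reads back as -1 through the signed helper but as 15 through the unsigned one, which is the kind of misreading a count-like field cannot tolerate.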
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Subjecting grant maps to NUMA balancing will cause the grant to be
unmapped and then, during fault handling, the fault to be mistakenly
treated as a NUMA hint fault.
In addition, even if those maps could participate in NUMA
balancing, it wouldn't provide any benefit since we are unable
to determine physical page's node (even if/when VNUMA is
implemented).
Marking grant maps' VMAs as VM_IO will exclude them from being
part of NUMA balancing.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
With the current code, read_alarm() always returns -EINVAL when called
during the RTC device registration. This prevents an alarm that is
already configured in hardware from being retrieved.
This patch fixes the issue by moving the HAS_ALARM bit configuration
(if supported by the hardware) above the rtc_device_register() call.
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
This reverts commit 348a65cdcb.
Incorrect page table manipulation that does not respect the ARM ARM
recommended break-before-make sequence may lead to TLB conflicts. The
contiguous PTE patch makes the system even more susceptible to such
errors by changing the mapping from a single page to a contiguous range
of pages. An additional TLB invalidation would reduce the risk window,
however, the correct fix is to switch to a temporary swapper_pg_dir.
Once the correct workaround is done, the reverted commit will be
re-applied.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Under some unusual context-switching patterns, it is possible to end up
with multiple threads from the same mm running concurrently with
different ASIDs:
1. CPU x schedules task t with mm p containing ASID a and generation g
This task doesn't block and the CPU doesn't context switch.
So:
* per_cpu(active_asid, x) = {g,a}
* p->context.id = {g,a}
2. Some other CPU generates an ASID rollover. The global generation is
now (g + 1). CPU x is still running t, with no context switch and
so per_cpu(reserved_asid, x) = {g,a}
3. CPU y schedules task t', which shares mm p with t. The generation
mismatches, so we take the slowpath and hit the reserved ASID from
CPU x. p is then updated so that p->context.id = {g + 1,a}
4. CPU y schedules some other task u, which has an mm != p.
5. Some other CPU generates *another* ASID rollover. The global
generation is now (g + 2). CPU x is still running t, with no context
switch and so per_cpu(reserved_asid, x) = {g,a}.
6. CPU y once again schedules task t', but now *fails* to hit the
reserved ASID from CPU x because of the generation mismatch. This
results in a new ASID being allocated, despite the fact that t is
still running on CPU x with the same mm.
Consequently, TLBIs (e.g. as a result of CoW) will not be synchronised
between the two threads.
This patch fixes the problem by updating all of the matching reserved
ASIDs when we hit on the slowpath (i.e. in step 3 above). This keeps
the reserved ASIDs in-sync with the mm and avoids the problem.
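The scenario above can be replayed with a small model. This is only a sketch under simplifying assumptions: ASIDs are (generation, asid) tuples, there are two CPUs (x and y), and the helper names are hypothetical rather than taken from the patch.

```python
def check_update_reserved_asid(reserved_asids, asid, newasid):
    """Return True if `asid` is reserved on any CPU.

    Every matching per-CPU entry is rewritten to `newasid`, so the
    reserved copies keep tracking the mm across later rollovers.
    """
    hit = False
    for cpu, reserved in enumerate(reserved_asids):
        if reserved == asid:
            hit = True
            reserved_asids[cpu] = newasid
    return hit


def replay_scenario(update_reserved):
    """Replay steps 1-6; return True if step 6 still hits the reserved ASID."""
    g, a = 1, 42
    reserved_asids = [(g, a), None]  # index 0 is CPU x, index 1 is CPU y
    context_id = (g, a)              # models p->context.id

    # Step 3: generation is now g + 1 and CPU y takes the slowpath.
    newasid = (g + 1, a)
    if update_reserved:
        # Fixed behaviour: rewrite the matching reserved entries.
        assert check_update_reserved_asid(reserved_asids, context_id, newasid)
    else:
        # Old behaviour: the hit is detected but reserved entries stay stale.
        assert context_id in reserved_asids
    context_id = newasid

    # Step 6: generation is now g + 2; CPU x never context switched.
    return check_update_reserved_asid(reserved_asids, context_id, (g + 2, a))


hit_with_fix = replay_scenario(update_reserved=True)      # True
hit_without_fix = replay_scenario(update_reserved=False)  # False
```

Without the update, the stale {g,a} entry no longer matches {g + 1,a} in step 6, which is exactly the miss that leads to a second ASID being allocated for the same mm.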
Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
On KASAN + 16K_PAGES + 48BIT_VA
arch/arm64/mm/kasan_init.c: In function ‘kasan_early_init’:
include/linux/compiler.h:484:38: error: call to ‘__compiletime_assert_95’ declared with attribute error: BUILD_BUG_ON failed: !IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE)
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
Currently KASAN does not work with 16K_PAGES and 48BIT_VA, so
forbid this configuration to avoid the build failure above.
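The failing alignment check can be reproduced with plain arithmetic. The sketch below is a model, not kernel code; it assumes that KASAN_SHADOW_START is the all-ones value shifted left by VA_BITS, that KASAN_SHADOW_END lies 1 << (VA_BITS - 3) bytes above it, and that PGDIR_SHIFT is (PAGE_SHIFT - 3) * levels + 3. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def pgdir_size(page_shift, levels):
    """Size covered by one top-level entry (8-byte table entries assumed)."""
    return 1 << ((page_shift - 3) * levels + 3)


def kasan_shadow_end(va_bits):
    """Assumed layout: the shadow covers 1/8 of the kernel VA space."""
    shadow_start = (MASK64 << va_bits) & MASK64
    return shadow_start + (1 << (va_bits - 3))


def shadow_end_is_pgdir_aligned(page_shift, levels, va_bits):
    """Model of IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE)."""
    return kasan_shadow_end(va_bits) % pgdir_size(page_shift, levels) == 0


# 16K pages, 4 levels, 48-bit VA: PGDIR_SIZE is 1 << 47, shadow end is not aligned.
aligned_16k = shadow_end_is_pgdir_aligned(14, 4, 48)  # False
# 4K pages, 4 levels, 48-bit VA: PGDIR_SIZE is 1 << 39, shadow end is aligned.
aligned_4k = shadow_end_is_pgdir_aligned(12, 4, 48)   # True
```

Under these assumptions the shadow end sits at an offset of 1 << 45, which cannot be a multiple of a 1 << 47 PGDIR_SIZE, hence the BUILD_BUG_ON.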
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Suzuki K. Poulose <Suzuki.Poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull vfs fixes from Al Viro:
"A couple of fixes for sendfile lockups caught by Dmitry + a fix for
ancient sysvfs symlink breakage"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: Avoid softlockups with sendfile(2)
vfs: Make sendfile(2) killable even better
fix sysvfs symlinks
Merge "Fixes for omaps for v4.4-rc cycle" from Tony Lindgren:
- A series of audio changes for dra7 that missed the merge window but turned
out to be necessary to fix a boot time imprecise external abort error and to
get audio working
- Fix l4 related boot time errors for dm81xx
- Use lockless clkdm/pwrdm api in omap4_boot_secondary
- Remove t410 custom abort handler that is no longer needed and may
hide other critical errors
- Mark cpuidle tracepoints as _rcuidle
- Fix module alias for omap-ocp2scp
* tag 'omap-for-v4.4/fixes-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP4+: SMP: use lockless clkdm/pwrdm api in omap4_boot_secondary
arm: omap2+: add missing HWMOD_NO_IDLEST in 81xx hwmod data
ARM: OMAP2+: remove custom abort handler for t410
ARM: OMAP: DRA7: hwmod: Add data for McASP3
ARM: OMAP2+: hwmod: Add hwmod flag for HWMOD_OPT_CLKS_NEEDED
ARM: dts: dra7: Fix McASP3 node regarding to clocks
bus: omap-ocp2scp: Fix module alias
ARM: OMAP2+: PM: Denote the cpuidle tracepoints as _rcuidle()
Merge "Few Keystone fixes for 4.4-rcx" from Santosh Shilimkar:
- Fix the optional PDSP firmware loading
- Fix linking RAM setup for QMs
- Fix crash with clk_ignore_unused
* tag 'keystone-fixes-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
ARM: dts: keystone: k2l: fix kernel crash when clk_ignore_unused is not in bootargs
soc: ti: knav_qmss_queue: Fix linking RAM setup for queue managers
soc: ti: use request_firmware_direct() as acc firmware is optional
Merge "The i.MX fixes for 4.4" from Shawn Guo:
- Add missing .irq_set_type for i.MX GPC irq_chip. It fixes an issue
that device IRQ type setting doesn't match the one specified in device
tree, since stacked IRQ domain is adopted in GPC driver.
- Fix the wrong spi-num-chipselects settings for Vybrid DSPI devices.
- Fix a merge error in Vybrid dts regarding the ADC device property
fsl,adck-max-frequency
* tag 'imx-fixes-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: vfxxx: Fix dspi[01] spi-num-chipselects.
ARM: imx: add platform irq type setting in gpc
ARM: dts: vfxxx: Fix erroneous property in esdhc0 node
If hardware-driven P-state selection (HWP) is enabled, the
"performance" mode of intel_pstate should only allow the processor
to use the highest-performance P-state available. That is not
the case currently, so make it actually happen.
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com>
[ rjw: Subject and changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
pnfs_layout_process will check the returned layout stateid against what
the kernel has in-core. If it turns out that the stateid we received is
older, then we should resend the LAYOUTGET instead of falling back to
MDS I/O.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If we pass in an empty nfs_fattr struct to nfs_update_inode, it will
(correctly) not update any of the attributes, but it then clears the
NFS_INO_INVALID_ATTR flag, which indicates that the attributes are
up to date. Don't clear the flag if the fattr struct has no valid
attrs to apply.
Reviewed-by: Steve French <steve.french@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If we get no post-op attributes back from a SETATTR operation, then no
attributes will of course be updated during the call to
nfs_update_inode.
We know however that the attributes are invalid at that point, since we
just changed some of them. At the very least, the ctime will be bogus.
If we get no post-op attributes back on the call, mark the attrcache
invalid to reflect that fact.
Reviewed-by: Steve French <steve.french@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Commit b3a72384fe ("ARM/PCI: Replace pci_sys_data->align_resource with
global function pointer") introduced an ARM-specific align_resource()
function pointer. This is not portable to other arches and doesn't work
for platforms with two different PCIe host bridge controllers.
Move the function pointer to the pci_host_bridge structure so each host
bridge driver can specify its own align_resource() function.
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Pull more block layer fixes from Jens Axboe:
"I wasn't going to send off a new pull before next week, but the blk
flush fix from Jan from the other day introduced a regression. It's
rare enough not to have hit during testing, since it requires both a
device that rejects the first flush, and bad timing while it does
that. But since someone did hit it, let's get the revert into 4.4-rc3
so we don't have a released rc with that known issue.
Apart from that revert, three other fixes:
- From Christoph, a fix for a missing unmap in NVMe request
preparation.
- An NVMe fix from Nishanth that fixes data corruption on powerpc.
- Also from Christoph, fix a list_del() attempt on blk-mq that didn't
have a matching list_add() at timer start"
* 'for-linus' of git://git.kernel.dk/linux-block:
Revert "blk-flush: Queue through IO scheduler when flush not required"
block: fix blk_abort_request for blk-mq drivers
nvme: add missing unmaps in nvme_queue_rq
NVMe: default to 4k device page size
OMAP CPU hotplug uses cpu1's clocks and power domains to wake CPU1 up
from low power states (or to turn CPU1 on). This code is also
part of system suspend (disable_nonboot_cpus()).
On the other hand, cpu1's clocks and power domains are used by CPUIdle. All of
this functionality is mutually exclusive and, therefore, the lockless clkdm/pwrdm
api can be used in omap4_boot_secondary().
This fixes the back-trace below on -RT, which is triggered by
pwrdm_lock/unlock():
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 0, pid: 118, name: sh
9 locks held by sh/118:
#0: (sb_writers#4){.+.+.+}, at: [<c0144a6c>] vfs_write+0x13c/0x164
#1: (&of->mutex){+.+.+.}, at: [<c01b4c70>] kernfs_fop_write+0x48/0x19c
#2: (s_active#24){.+.+.+}, at: [<c01b4c78>] kernfs_fop_write+0x50/0x19c
#3: (device_hotplug_lock){+.+.+.}, at: [<c03cbff0>] lock_device_hotplug_sysfs+0xc/0x4c
#4: (&dev->mutex){......}, at: [<c03cd284>] device_online+0x14/0x88
#5: (cpu_add_remove_lock){+.+.+.}, at: [<c003af90>] cpu_up+0x50/0x1a0
#6: (cpu_hotplug.lock){++++++}, at: [<c003ae48>] cpu_hotplug_begin+0x0/0xc4
#7: (cpu_hotplug.lock#2){+.+.+.}, at: [<c003aec0>] cpu_hotplug_begin+0x78/0xc4
#8: (boot_lock){+.+...}, at: [<c002b254>] omap4_boot_secondary+0x1c/0x178
Preemption disabled at:[< (null)>] (null)
CPU: 0 PID: 118 Comm: sh Not tainted 4.1.12-rt11-01998-gb4a62c3-dirty #137
Hardware name: Generic DRA74X (Flattened Device Tree)
[<c0017574>] (unwind_backtrace) from [<c0013be8>] (show_stack+0x10/0x14)
[<c0013be8>] (show_stack) from [<c05a8670>] (dump_stack+0x80/0x94)
[<c05a8670>] (dump_stack) from [<c05ad158>] (rt_spin_lock+0x24/0x54)
[<c05ad158>] (rt_spin_lock) from [<c0030dac>] (clkdm_wakeup+0x10/0x2c)
[<c0030dac>] (clkdm_wakeup) from [<c002b2c0>] (omap4_boot_secondary+0x88/0x178)
[<c002b2c0>] (omap4_boot_secondary) from [<c0015d00>] (__cpu_up+0xc4/0x164)
[<c0015d00>] (__cpu_up) from [<c003b09c>] (cpu_up+0x15c/0x1a0)
[<c003b09c>] (cpu_up) from [<c03cd2d4>] (device_online+0x64/0x88)
[<c03cd2d4>] (device_online) from [<c03cd360>] (online_store+0x68/0x74)
[<c03cd360>] (online_store) from [<c01b4ce0>] (kernfs_fop_write+0xb8/0x19c)
[<c01b4ce0>] (kernfs_fop_write) from [<c0144124>] (__vfs_write+0x20/0xd8)
[<c0144124>] (__vfs_write) from [<c01449c0>] (vfs_write+0x90/0x164)
[<c01449c0>] (vfs_write) from [<c01451e4>] (SyS_write+0x44/0x9c)
[<c01451e4>] (SyS_write) from [<c0010240>] (ret_fast_syscall+0x0/0x54)
CPU1: smp_ops.cpu_die() returned, trying to resuscitate
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Add missing HWMOD_NO_IDLEST hwmod flag for entries not
having omap4 clkctrl values.
The emac0 hwmod flag fixes the davinci_emac driver probe
since the return of pm_resume() call is now checked.
This solves the following boot errors:
[ 0.121429] omap_hwmod: l4_ls: _wait_target_ready failed: -16
[ 0.121441] omap_hwmod: l4_ls: cannot be enabled for reset (3)
[ 0.124342] omap_hwmod: l4_hs: _wait_target_ready failed: -16
[ 0.124352] omap_hwmod: l4_hs: cannot be enabled for reset (3)
[ 1.967228] omap_hwmod: emac0: _wait_target_ready failed: -16
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
This reverts commit 1b2ff19e6a.
Jan writes:
--
Thanks for report! After some investigation I found out we allocate
elevator specific data in __get_request() only for non-flush requests. And
this is actually required since the flush machinery uses the space in
struct request for something else. Doh. So my patch is just wrong and not
easy to fix since at the time __get_request() is called we are not sure
whether the flush machinery will be used in the end. Jens, please revert
1b2ff19e6a. Thanks!
I'm somewhat surprised that you can reliably hit the race where flushing
gets disabled for the device just while the request is in flight. But I
guess during boot it makes some sense.
--
So let's just revert it; we can fix the queue run manually after the
fact. This race is rare enough that it didn't trigger in testing, since it
requires the specific disable-while-in-flight scenario.
Pull KVM fixes from Paolo Bonzini:
"Bug fixes for all architectures. Nothing really stands out"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
KVM: nVMX: remove incorrect vpid check in nested invvpid emulation
arm64: kvm: report original PAR_EL1 upon panic
arm64: kvm: avoid %p in __kvm_hyp_panic
KVM: arm/arm64: vgic: Trust the LR state for HW IRQs
KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active
KVM: arm/arm64: Fix preemptible timer active state crazyness
arm64: KVM: Add workaround for Cortex-A57 erratum 834220
arm64: KVM: Fix AArch32 to AArch64 register mapping
ARM/arm64: KVM: test properly for a PTE's uncachedness
KVM: s390: fix wrong lookup of VCPUs by array index
KVM: s390: avoid memory overwrites on emergency signal injection
KVM: Provide function for VCPU lookup by id
KVM: s390: fix pfmf intercept handler
KVM: s390: enable SIMD only when no VCPUs were created
KVM: x86: request interrupt window when IRQ chip is split
KVM: x86: set KVM_REQ_EVENT on local interrupt request from user space
KVM: x86: split kvm_vcpu_ready_for_interrupt_injection out of dm_request_for_irq_injection
KVM: x86: fix interrupt window handling in split IRQ chip case
MIPS: KVM: Uninit VCPU in vcpu_create error path
MIPS: KVM: Fix CACHE immediate offset sign extension
...
The kernel may use a page granularity of 4K, 16K, or 64K depending on
configuration.
When mapping EFI runtime regions, we use memrange_efi_to_native to round
the physical base address of a region down to a kernel page boundary,
and round the size up to a kernel page boundary, adding the residue left
over from rounding down the physical base address. We do not round down
the virtual base address.
In __create_mapping we account for the offset of the virtual base from a
granule boundary, adding the residue to the size before rounding the
base down to said granule boundary.
Thus we account for the residue twice, and when the residue is non-zero
will cause __create_mapping to map an additional page at the end of the
region. Depending on the memory map, this page may be in a region we are
not intended/permitted to map, or may clash with a different region that
we wish to map. In typical cases, mapping the next item in the memory
map will overwrite the erroneously created entry, as we sort the memory
map in the stub.
As __create_mapping can cope with base addresses which are not page
aligned, we can instead rely on it to map the region appropriately, and
simplify efi_virtmap_init by removing the unnecessary code.
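The double accounting is easy to see with numbers. The following is a rough model only: the two functions mimic the rounding described above for memrange_efi_to_native and __create_mapping, the 64K page size is one of the possible configurations, and the example addresses are made up.

```python
PAGE_SIZE = 0x10000  # 64K granule, one of the configurations mentioned above


def page_align_up(value):
    return (value + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def efi_to_native(phys, size):
    """Model of the rounding done before the fix: base down, size up + residue."""
    residue = phys & (PAGE_SIZE - 1)
    return phys - residue, page_align_up(size + residue)


def create_mapping_pages(virt, size):
    """Model of the mapping code: it adds the virtual offset residue itself."""
    residue = virt & (PAGE_SIZE - 1)
    return page_align_up(size + residue) // PAGE_SIZE


# A 64K region that starts 4K into a 64K page (virt == phys for simplicity).
base, size = 0x1000, 0x10000

# Residue accounted once: the region touches two 64K pages.
pages_expected = create_mapping_pages(base, size)       # 2

# Residue accounted twice: size is pre-rounded, then rounded again.
_, rounded_size = efi_to_native(base, size)
pages_mapped = create_mapping_pages(base, rounded_size)  # 3
```

The third page is the erroneous extra mapping at the end of the region.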
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We are missing descriptions for some valid xFSC values in the fault info
table (e.g. "TLB conflict abort"), and have erroneous descriptions for
reserved values (e.g. "asynchronous external abort", "debug event").
This patch adds the missing xFSC values, and removes erroneous decoding
of values reserved by the architecture, as described in ARM DDI 0487A.h.
At the same time, fix the unbalanced brackets for the synchronous
parity error strings in the table.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As reported by Michal Simek, building an ARM64 kernel with CONFIG_UID16
disabled currently fails because the system call table still needs to
reference the individual function entry points that are provided by
kernel/sys_ni.c in this case, and the declarations are hidden inside
of #ifdef CONFIG_UID16:
arch/arm64/include/asm/unistd32.h:57:8: error: 'sys_lchown16' undeclared here (not in a function)
__SYSCALL(__NR_lchown, sys_lchown16)
I believe this problem only exists on ARM64, because older architectures
tend to not need declarations when their system call table is built
in assembly code, while newer architectures tend to not need UID16
support. ARM64 only uses these system calls for compatibility with
32-bit ARM binaries.
This changes the CONFIG_UID16 check into CONFIG_HAVE_UID16, which is
set unconditionally on ARM64 with CONFIG_COMPAT, so we see the
declarations whenever we need them, but otherwise the behavior is
unchanged.
Fixes: af1839eb4b ("Kconfig: clean up the long arch list for the UID16 config option")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit 5be9fc23cd ("ARM: orion5x: fix legacy orion5x IRQ numbers") shifted
IRQ numbers by one but didn't update the get_irqnr_and_base macro
accordingly. This macro is involved when CONFIG_MULTI_IRQ_HANDLER
is not defined.
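As a rough illustration of the off-by-one, assume the macro derives the IRQ number from the highest pending bit of a 32-bit cause register (an assumption for this sketch; the function names are hypothetical and this is not the assembly macro itself):

```python
def irqnr_zero_based(irqstat):
    """Old numbering: cause bit n maps to IRQ n."""
    return irqstat.bit_length() - 1  # index of the highest set bit


def irqnr_shifted_by_one(irqstat):
    """Numbering after the shift: cause bit n maps to IRQ n + 1."""
    return irqstat.bit_length()


pending = 0b1000  # cause bit 3 is pending
old_irq = irqnr_zero_based(pending)      # 3
new_irq = irqnr_shifted_by_one(pending)  # 4
```

If the decode path keeps returning the zero-based value while the rest of the platform uses the shifted numbers, every interrupt is dispatched to its neighbour's handler.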
[jac: 5d6bed2a9c went in to v4.2, but was backported to v3.18]
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Fixes: 5be9fc23cd ("ARM: orion5x: fix legacy orion5x IRQ numbers")
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Commit 5d6bed2a9c ("ARM: dove: fix legacy dove IRQ numbers") shifted
IRQ numbers by one but didn't update the get_irqnr_and_base macro
accordingly. This macro is involved when CONFIG_MULTI_IRQ_HANDLER
is not defined.
[jac: 5d6bed2a9c went in to v4.2, but was backported to v3.18]
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Fixes: 5d6bed2a9c ("ARM: dove: fix legacy dove IRQ numbers")
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
This patch removes the vpid check when emulating nested invvpid
instruction of type all-contexts invalidation. The existing code is
incorrect because:
(1) According to Intel SDM Vol 3, Section "INVVPID - Invalidate
Translations Based on VPID", invvpid instruction does not check
vpid in the invvpid descriptor when its type is all-contexts
invalidation.
(2) According to the same document, invvpid of type all-contexts
invalidation does not require an active VMCS, so
get_vmcs12() in the existing code may result in a NULL-pointer
dereference. In practice, it can crash both KVM itself and L1
hypervisors that use invvpid (e.g. Xen).
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There's a regression in 4.4-rc since commit bc3094673f
(btrfs: extend balance filter usage to take minimum and maximum) in that
existing (non-ranged) balance with -dusage=x no longer works; all chunks
are skipped.
After staring at the code for a while and wondering why a non-ranged
balance would even need min and max thresholds (..which then were not
set correctly, leading to the bug) I realized that the only problem
was the fact that the filter functions were named wrong, thanks to
patching copypasta. Simply renaming both functions lets the existing
btrfs-progs call balance with -dusage=x and now the non-ranged filter
function is invoked, properly using only a single chunk limit.
Signed-off-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Fixes: bc3094673f ("btrfs: extend balance filter usage to take minimum and maximum")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Commit 0ed4792 ('btrfs: qgroup: Switch to new extent-oriented qgroup
mechanism.') removed our qgroup accounting during
btrfs_drop_snapshot(). Predictably, this results in qgroup numbers
going bad shortly after a snapshot is removed.
Fix this by adding a dirty extent record when we encounter extents during
our shared subtree walk. This effectively restores the functionality we had
with the original shared subtree walking code in 1152651 (btrfs: qgroup:
account shared subtrees during snapshot delete).
The idea with the original patch (and this one) is that shared subtrees can
get skipped during drop_snapshot. The shared subtree walk then allows us a
chance to visit those extents and add them to the qgroup work for later
processing. This ultimately makes the accounting for drop snapshot work.
The new qgroup code nicely handles all the other extents during the tree
walk via the ref dec/inc functions so we don't have to add actions beyond
what we had originally.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Chris Mason <clm@fb.com>
The backref code will look up the fs_root we're trying to resolve our indirect
refs for. Unfortunately we use btrfs_read_fs_root_no_name, which returns -ENOENT
if the ref is 0. This isn't helpful for the qgroup stuff with snapshot delete
as it won't be able to search down the snapshot we are deleting, which will
cause us to miss roots. So use btrfs_get_fs_root and send false for check_ref
so we can always get the root we're looking for. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Chris Mason <clm@fb.com>
There's a race condition that leads to a NULL pointer dereference if you
disable quotas while a quota rescan is running. To fix this, we just need
to wait for the quota rescan worker to actually exit before tearing down
the quota structures.
Signed-off-by: Justin Maggard <jmaggard@netgear.com>
Signed-off-by: Chris Mason <clm@fb.com>
When a block group becomes unused and the cleaner kthread is currently
running, we can end up getting the current transaction aborted with error
-ENOENT when we try to commit the transaction, leading to the following
trace:
[59779.258768] WARNING: CPU: 3 PID: 5990 at fs/btrfs/extent-tree.c:3740 btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]()
[59779.272594] BTRFS: Transaction aborted (error -2)
(...)
[59779.291137] Call Trace:
[59779.291621] [<ffffffff812566f4>] dump_stack+0x4e/0x79
[59779.292543] [<ffffffff8104d0a6>] warn_slowpath_common+0x9f/0xb8
[59779.293435] [<ffffffffa04cb81f>] ? btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]
[59779.295000] [<ffffffff8104d107>] warn_slowpath_fmt+0x48/0x50
[59779.296138] [<ffffffffa04c2721>] ? write_one_cache_group.isra.32+0x77/0x82 [btrfs]
[59779.297663] [<ffffffffa04cb81f>] btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]
[59779.299141] [<ffffffffa0549b0d>] commit_cowonly_roots+0x1de/0x261 [btrfs]
[59779.300359] [<ffffffffa04dd5b6>] btrfs_commit_transaction+0x4c4/0x99c [btrfs]
[59779.301805] [<ffffffffa04b5df4>] btrfs_sync_fs+0x145/0x1ad [btrfs]
[59779.302893] [<ffffffff81196634>] sync_filesystem+0x7f/0x93
(...)
[59779.318186] ---[ end trace 577e2daff90da33a ]---
The following diagram illustrates a sequence of steps leading to this
problem:
CPU 1 CPU 2
<at transaction N>
adds bg A to list
fs_info->unused_bgs
adds bg B to list
fs_info->unused_bgs
<transaction kthread
commits transaction N
and wakes up the
cleaner kthread>
cleaner kthread
delete_unused_bgs()
sees bg A in list
fs_info->unused_bgs
btrfs_start_transaction()
<transaction N + 1 starts>
deletes bg A
update_block_group(bg C)
--> adds bg C to list
fs_info->unused_bgs
deletes bg B
sees bg C in the list
fs_info->unused_bgs
btrfs_remove_chunk(bg C)
btrfs_remove_block_group(bg C)
--> checks if the block group
is in a dirty list, and
because it isn't now, it
does nothing
--> the block group item
is deleted from the
extent tree
--> adds bg C to list
transaction->dirty_bgs
some task calls
btrfs_commit_transaction(t N + 1)
commit_cowonly_roots()
btrfs_write_dirty_block_groups()
--> sees bg C in cur_trans->dirty_bgs
--> calls write_one_cache_group()
which returns -ENOENT because
it did not find the block group
item in the extent tree
--> transaction aborted with -ENOENT
because write_one_cache_group()
returned that error
So fix this by adding a block group to the list of dirty block groups
before adding it to the list of unused block groups.
This happened on a stress test using fsstress plus concurrent calls to
fallocate 20G and truncate (releasing part of the space allocated with
fallocate).
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Currently scrub can race with the cleaner kthread when the latter attempts
to delete an unused block group, and the result is preventing the cleaner
kthread from ever deleting the block group later - unless the block group
becomes used and unused again. The following diagram illustrates that
race:
CPU 1 CPU 2
cleaner kthread
btrfs_delete_unused_bgs()
gets block group X from
fs_info->unused_bgs and
removes it from that list
scrub_enumerate_chunks()
searches device tree using
its commit root
finds device extent for
block group X
gets block group X from the tree
fs_info->block_group_cache_tree
(via btrfs_lookup_block_group())
sets bg X to RO
sees the block group is
already RO and therefore
doesn't delete it nor adds
it back to unused list
So fix this by making scrub add the block group again to the list of
unused block groups if the block group is still unused when it finished
scrubbing it and it hasn't been removed already.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Scrub can race with the cleaner kthread deleting block groups that are
unused (and with relocation too) leading to a failure with error -EINVAL
that gets returned to user space.
The following diagram illustrates how it happens:
CPU 1 CPU 2
cleaner kthread
btrfs_delete_unused_bgs()
gets block group X from
fs_info->unused_bgs
sets block group to RO
btrfs_remove_chunk(bg X)
deletes device extents
scrub_enumerate_chunks()
searches device tree using
its commit root
finds device extent for
block group X
gets block group X from the tree
fs_info->block_group_cache_tree
(via btrfs_lookup_block_group())
sets bg X to RO (again)
btrfs_remove_block_group(bg X)
deletes block group from
fs_info->block_group_cache_tree
removes extent map from
fs_info->mapping_tree
scrub_chunk(offset X)
searches fs_info->mapping_tree
for extent map starting at
offset X
--> doesn't find any such
extent map
--> returns -EINVAL and scrub
errors out to userspace
with -EINVAL
Fix this by dealing with an extent map lookup failure as an indicator of
block group deletion.
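A deliberately simplified model of that idea follows. The mapping tree is a plain dict, the function name and its flag are hypothetical, and the real code has more state to consult; the point is only the change in how a failed lookup is interpreted.

```python
import errno


def scrub_chunk(mapping_tree, chunk_offset, missing_means_deleted):
    """Return 0 on success, or a negative errno value on failure."""
    extent_map = mapping_tree.get(chunk_offset)
    if extent_map is None:
        if missing_means_deleted:
            # The block group was removed concurrently: nothing to scrub.
            return 0
        # Old interpretation: a missing extent map is a hard error.
        return -errno.EINVAL
    # ... scrubbing of the stripes described by extent_map would go here ...
    return 0


mapping_tree = {0x100000: "extent map for bg Y"}  # bg X at 0x0 already removed

old_result = scrub_chunk(mapping_tree, 0x0, missing_means_deleted=False)
new_result = scrub_chunk(mapping_tree, 0x0, missing_means_deleted=True)
```

Here `old_result` is `-EINVAL`, which is what used to reach user space, while `new_result` is 0 and scrub simply moves on to the next chunk.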
Issue reproduced with fstest btrfs/071.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
The test btrfs/011 triggers an RCU warning
Reviewed-by: Anand Jain <anand.jain@oracle.com>
===============================
[ INFO: suspicious RCU usage. ]
4.4.0-rc1-default+ #286 Tainted: G W
-------------------------------
fs/btrfs/volumes.c:1977 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
4 locks held by btrfs/28786:
0: (&fs_info->dev_replace.lock_finishing_cancel_unmount){+.+...}, at: [<ffffffffa00bc785>] btrfs_dev_replace_finishing+0x45/0xa00 [btrfs]
1: (uuid_mutex){+.+.+.}, at: [<ffffffffa00bc84f>] btrfs_dev_replace_finishing+0x10f/0xa00 [btrfs]
2: (&fs_devs->device_list_mutex){+.+.+.}, at: [<ffffffffa00bc868>] btrfs_dev_replace_finishing+0x128/0xa00 [btrfs]
3: (&fs_info->chunk_mutex){+.+...}, at: [<ffffffffa00bc87d>] btrfs_dev_replace_finishing+0x13d/0xa00 [btrfs]
stack backtrace:
CPU: 0 PID: 28786 Comm: btrfs Tainted: G W 4.4.0-rc1-default+ #286
Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., BIOS ASNBCPT1.86C.0031.B00.1006301607 06/30/2010
0000000000000001 ffff8800a07dfb48 ffffffff8141d47b 0000000000000001
0000000000000001 0000000000000000 ffff8801464a4f00 ffff8800a07dfb78
ffffffff810cd883 ffff880146eb9400 ffff8800a3698600 ffff8800a33fe220
Call Trace:
[<ffffffff8141d47b>] dump_stack+0x4f/0x74
[<ffffffff810cd883>] lockdep_rcu_suspicious+0x103/0x140
[<ffffffffa0071261>] btrfs_rm_dev_replace_remove_srcdev+0x111/0x130 [btrfs]
[<ffffffff810d354d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff81449536>] ? __percpu_counter_sum+0x66/0x80
[<ffffffffa00bcc15>] btrfs_dev_replace_finishing+0x4d5/0xa00 [btrfs]
[<ffffffffa00bc96e>] ? btrfs_dev_replace_finishing+0x22e/0xa00 [btrfs]
[<ffffffffa00a8795>] ? btrfs_scrub_dev+0x415/0x6d0 [btrfs]
[<ffffffffa003ea69>] ? btrfs_start_transaction+0x9/0x20 [btrfs]
[<ffffffffa00bda79>] btrfs_dev_replace_start+0x339/0x590 [btrfs]
[<ffffffff81196aa5>] ? __might_fault+0x95/0xa0
[<ffffffffa0078638>] btrfs_ioctl_dev_replace+0x118/0x160 [btrfs]
[<ffffffff811409c6>] ? stack_trace_call+0x46/0x70
[<ffffffffa007c914>] ? btrfs_ioctl+0x24/0x1770 [btrfs]
[<ffffffffa007ce43>] btrfs_ioctl+0x553/0x1770 [btrfs]
[<ffffffff811409c6>] ? stack_trace_call+0x46/0x70
[<ffffffff811d6eb1>] ? do_vfs_ioctl+0x21/0x5a0
[<ffffffff811d6f1c>] do_vfs_ioctl+0x8c/0x5a0
[<ffffffff811e3336>] ? __fget_light+0x86/0xb0
[<ffffffff811e3369>] ? __fdget+0x9/0x20
[<ffffffff811d7451>] ? SyS_ioctl+0x21/0x80
[<ffffffff811d7483>] SyS_ioctl+0x53/0x80
[<ffffffff81b1efd7>] entry_SYSCALL_64_fastpath+0x12/0x6f
This is because of unprotected use of rcu_dereference in
btrfs_scratch_superblocks. We can't add rcu locks around the whole
function because we read the superblock.
The fix will use the rcu string buffer directly without the rcu locking.
This is safe as the device will not go away in the meantime. We're
holding the device list mutexes.
Restructuring the code to narrow down the rcu section turned out to be
impossible: we need to call filp_open (through update_dev_time) on the
buffer and this could call kmalloc/__might_sleep. We could call kstrdup
with GFP_ATOMIC but it's not absolutely necessary.
Fixes: 12b1c2637b (Btrfs: enhance btrfs_scratch_superblock to scratch all superblocks)
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
xfstests/011 failed on a node with a small filesystem.
It can be reproduced with the following script:
# DEV_LIST must be a bash array so that "${DEV_LIST[@]}" expands to one
# argument per device. SCRATCH_MNT is expected to be set by the caller.
DEV_LIST=(/dev/vdd /dev/vde)
DEV_REPLACE="/dev/vdf"
do_test()
{
    local mkfs_opt="$1"
    local size="$2"
    dmesg -c >/dev/null
    umount $SCRATCH_MNT &>/dev/null
    echo mkfs.btrfs -f $mkfs_opt "${DEV_LIST[*]}"
    mkfs.btrfs -f $mkfs_opt "${DEV_LIST[@]}" || return 1
    mount "${DEV_LIST[0]}" $SCRATCH_MNT
    echo -n "Writing big files"
    dd if=/dev/urandom of=$SCRATCH_MNT/t0 bs=1M count=1 >/dev/null 2>&1
    # Fill the fs with copies of the 1M file until it is nearly full
    for ((i = 1; i <= size; i++)); do
        echo -n .
        /bin/cp $SCRATCH_MNT/t0 $SCRATCH_MNT/t$i || return 1
    done
    echo
    echo Start replace
    btrfs replace start -Bf "${DEV_LIST[0]}" "$DEV_REPLACE" $SCRATCH_MNT || {
        dmesg
        return 1
    }
    return 0
}
# Set size to value near fs size
# for example, 1897 can trigger this bug in 2.6G device.
#
do_test "-d raid1 -m raid1" 1897
The system reports that the replace failed, with the following warning in dmesg:
[ 134.710853] BTRFS: dev_replace from /dev/vdd (devid 1) to /dev/vdf started
[ 135.542390] BTRFS: btrfs_scrub_dev(/dev/vdd, 1, /dev/vdf) failed -28
[ 135.543505] ------------[ cut here ]------------
[ 135.544127] WARNING: CPU: 0 PID: 4080 at fs/btrfs/dev-replace.c:428 btrfs_dev_replace_start+0x398/0x440()
[ 135.545276] Modules linked in:
[ 135.545681] CPU: 0 PID: 4080 Comm: btrfs Not tainted 4.3.0 #256
[ 135.546439] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 135.547798] ffffffff81c5bfcf ffff88003cbb3d28 ffffffff817fe7b5 0000000000000000
[ 135.548774] ffff88003cbb3d60 ffffffff810a88f1 ffff88002b030000 00000000ffffffe4
[ 135.549774] ffff88003c080000 ffff88003c082588 ffff88003c28ab60 ffff88003cbb3d70
[ 135.550758] Call Trace:
[ 135.551086] [<ffffffff817fe7b5>] dump_stack+0x44/0x55
[ 135.551737] [<ffffffff810a88f1>] warn_slowpath_common+0x81/0xc0
[ 135.552487] [<ffffffff810a89e5>] warn_slowpath_null+0x15/0x20
[ 135.553211] [<ffffffff81448c88>] btrfs_dev_replace_start+0x398/0x440
[ 135.554051] [<ffffffff81412c3e>] btrfs_ioctl+0x1d2e/0x25c0
[ 135.554722] [<ffffffff8114c7ba>] ? __audit_syscall_entry+0xaa/0xf0
[ 135.555506] [<ffffffff8111ab36>] ? current_kernel_time64+0x56/0xa0
[ 135.556304] [<ffffffff81201e3d>] do_vfs_ioctl+0x30d/0x580
[ 135.557009] [<ffffffff8114c7ba>] ? __audit_syscall_entry+0xaa/0xf0
[ 135.557855] [<ffffffff810011d1>] ? do_audit_syscall_entry+0x61/0x70
[ 135.558669] [<ffffffff8120d1c1>] ? __fget_light+0x61/0x90
[ 135.559374] [<ffffffff81202124>] SyS_ioctl+0x74/0x80
[ 135.559987] [<ffffffff81809857>] entry_SYSCALL_64_fastpath+0x12/0x6f
[ 135.560842] ---[ end trace 2a5c1fc3205abbdd ]---
Reason:
When a large amount of data is written to the fs, all of the free space
is allocated to data chunks.
Operations such as scrub need to call set_block_ro(), and when there is
only one metadata chunk in the system (or the other metadata chunks
are all full), that function tries to allocate a new chunk
and fails because there is no space left on the device.
Fix:
When set_block_ro() fails for a metadata chunk, it is not a problem
because scrub_lock pauses commit_transaction at the same time, and
metadata is always COWed, so in-flight writepages will not
write data into the same place as scrub/replace.
Letting replace continue in this case is safe.
Tested with the above script and xfstests/011, plus 100 runs of xfstests/070.
Changelog v1->v2:
1: Add detailed comments in the source and the commit message.
2: Add dmesg details to the commit message.
3: Only let a return value of -ENOSPC pass.
All suggested by: Filipe Manana <fdmanana@gmail.com>
Suggested-by: Filipe Manana <fdmanana@gmail.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
I've accidentally picked an already used number for the enhanced usage
filter represented by BTRFS_BALANCE_ARGS_USAGE_RANGE, clashing with
BTRFS_BALANCE_ARGS_CONVERT. Introduced during the development phase,
no backward compatibility issues.
Reported-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: bc3094673f ("btrfs: extend balance filter usage to take minimum and maximum")
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
We were using only 1 transaction unit when attempting to delete an unused
block group but in reality we need 3 + N units, where N corresponds to the
number of stripes. We were accounting only for the addition of the orphan
item (for the block group's free space cache inode) but we were not
accounting that we need to delete one block group item from the extent
tree, one free space item from the tree of tree roots and N device extent
items from the device tree.
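The accounting described above can be written out as plain arithmetic. This is a minimal sketch; the function name removal_units() is hypothetical and is not taken from the btrfs source.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a btrfs function): number of transaction
 * units needed to remove one unused block group, following the
 * breakdown in the text above.
 */
unsigned int removal_units(unsigned int num_stripes)
{
	unsigned int units = 0;

	assert(num_stripes > 0);  /* a block group has at least one stripe */

	units += 1;            /* add the orphan item (free space cache inode) */
	units += 1;            /* delete the block group item (extent tree) */
	units += 1;            /* delete the free space item (tree of tree roots) */
	units += num_stripes;  /* delete one device extent item per stripe */
	return units;
}
```

For a two-stripe block group this gives 5 units, compared with the single unit that was reserved before.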
While one unit is not enough, it worked most of the time because for each
single unit we are too pessimistic and assume an entire tree path, with
the highest possible height (8), needs to be COWed with eventual node
splits at every possible level in the tree, so there was usually enough
reserved space for removing all the items and adding the orphan item.
However, after adding the orphan item, writepages() can be called by the VM
subsystem against the btree inode when we are under memory pressure, which
causes writeback to start for the nodes we COWed before. This forces the
operation to remove the free space item to COW again some (or all of) the
same nodes (in the tree of tree roots). Even without writepages() being
called, we could fail with ENOSPC because these items are located in
multiple trees and one of them might have a greater height and require
node/leaf splits at many levels, exhausting all the reserved space before
removing all the items and adding the orphan.
In the kernel 4.0 release, commit 3d84be7991 ("Btrfs: fix BUG_ON in
btrfs_orphan_add() when delete unused block group"), we attempted to fix
a BUG_ON due to ENOSPC when trying to add the orphan item by making the
cleaner kthread reserve one transaction unit before attempting to remove
the block group, but this was not enough. We had a couple of user reports
still hitting the same BUG_ON after 4.0, like Stefan Priebe's report on
a 4.2-rc6 kernel for example:
http://www.spinics.net/lists/linux-btrfs/msg46070.html
So fix this by reserving all the necessary units of metadata.
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Fixes: 3d84be7991 ("Btrfs: fix BUG_ON in btrfs_orphan_add() when delete unused block group")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
It's possible to reach a state where the cleaner kthread isn't able to
start a transaction to delete an unused block group due to lack of enough
free metadata space and due to lack of unallocated device space to allocate
a new metadata block group as well. If this happens try to use space from
the global block group reserve just like we do for unlink operations, so
that we don't reach a permanent state where starting a transaction for
filesystem operations (file creation, renames, etc) keeps failing with
-ENOSPC. Such an unfortunate state was observed on a machine where over
a dozen unused data block groups existed and the cleaner kthread was
failing to delete them due to ENOSPC error when attempting to start a
transaction, and even running balance with a -dusage=0 filter failed with
ENOSPC as well. Also unmounting and mounting again the filesystem didn't
help. Allowing the cleaner kthread to use the global block reserve to
delete the unused data block groups fixed the problem.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
btrfs_alloc_dummy_root() returns an error pointer on failure; it never
returns NULL.
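To illustrate why a NULL check is the wrong test for such a function, here is a userspace sketch of the error-pointer convention. The helpers err_ptr(), is_err() and ptr_err() are stand-ins modelled on the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); alloc_dummy() is a hypothetical allocator, not the btrfs function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True if ptr lies in the top MAX_ERRNO addresses, i.e. encodes an error. */
int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical allocator that reports failure with an error pointer. */
void *alloc_dummy(int fail)
{
	static int object;

	return fail ? err_ptr(-ENOMEM) : (void *)&object;
}
```

A caller that only tests the result against NULL would treat the failure case as success, because the error pointer is never NULL; the check has to be is_err().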
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
The calculation of range length in btrfs_sync_file leads to signed
overflow. This was caught by PaX gcc SIZE_OVERFLOW plugin.
https://forums.grsecurity.net/viewtopic.php?f=1&t=4284
The fsync call passes 0 and LLONG_MAX; the range length does not fit in
loff_t and overflows, but the value is converted to u64 so it silently
works as expected.
The minimal fix is a typecast to u64; switching functions to take
(start, end) instead of (start, len) would be more intrusive.
A Coccinelle script found that there's one more opencoded calculation of
the length.
<smpl>
@@
loff_t start, end;
@@
* end - start
</smpl>
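The overflow and the cast-based fix can be sketched in userspace C as follows. The function name range_len() is an assumption made for illustration; long long stands in for loff_t and unsigned long long for u64.

```c
#include <assert.h>

/*
 * Length of the inclusive byte range [start, end].
 * Evaluating end - start + 1 in a signed 64-bit type overflows for
 * (0, LLONG_MAX); converting the operands to an unsigned type first
 * keeps the arithmetic well defined.
 */
unsigned long long range_len(long long start, long long end)
{
	assert(start >= 0 && end >= start);
	return (unsigned long long)end - (unsigned long long)start + 1;
}
```

For the fsync case (0, LLONG_MAX) the result is 2^63, which cannot be represented in a signed 64-bit integer but fits in an unsigned one.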
CC: stable@vger.kernel.org
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
In early_alloc we check if the memblock_alloc failed by checking
the virtual address of the result, which will never fail. This patch
fixes it to check the actual result for failure.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Since commit 3fffd12839 ("i2c: allow specifying
separate wakeup interrupt in device tree") we have
automatic wakeup irq support for i2c devices. That
commit missed the fact that rtc-ds1307 had its own
wakeup irq handling and ended up introducing a
kernel splat for at least Beagle x15 boards.
Fix that by reverting original commit _and_ passing
correct interrupt names on DTS so i2c-core can
choose correct IRQ as wakeup.
Now that we have automatic wakeirq support, we can
revert the original commit which did it manually.
Fixes the following warning:
[ 10.346582] WARNING: CPU: 1 PID: 263 at linux/drivers/base/power/wakeirq.c:43 dev_pm_attach_wake_irq+0xbc/0xd4()
[ 10.359244] rtc-ds1307 2-006f: wake irq already initialized
Cc: Tony Lindgren <tony@atomide.com>
Cc: Nishanth Menon <nm@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Each GPCCS unit was reading the mask from GPC0, which causes problems on
boards where some GPCs are missing PPCs.
Part of the fix for fdo#92761.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There are a few places where we need to access a GPC register from ucode,
but outside of the falcon's io address space. To do this we need to
calculate the offset based on which GPC we're executing on.
This used to be done manually, but we've since found a "base" offset
that can be added by the hardware. To use this, an extra bit needs to
be set in the register address, which is what this macro achieves.
There should be no functional change from this commit.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Fixes detection of a failed attempt at fetching the entire ROM image
in one-shot (a violation of the spec, that works a lot of the time).
Tested on a HP Zbook 15 G2.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
No locking is required for the traversal of this list, as it only
happens during suspend/resume where nothing else can be executing.
Fixes some of the issues noticed during parallel piglit runs.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
If a user key gets negatively instantiated, an error code is cached in the
payload area. A negatively instantiated key may then be positively
instantiated by updating it with valid data. However, the ->update key
type method must be aware that the error code may be there.
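The rule for an update method can be sketched with a simplified model. Everything here is hypothetical (struct toy_key and toy_key_update() are not the kernel's types or functions); the sketch only shows that the old payload must not be treated as a pointer when the key is negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical model of a key; not the kernel's struct key. */
struct toy_key {
	int negative;   /* set when the key was negatively instantiated */
	void *payload;  /* valid data, or a cached error code if negative */
};

/*
 * Install new_payload and return the old payload that the caller may
 * release. For a negative key the payload slot holds an error code
 * rather than a pointer, so there is nothing to release.
 */
void *toy_key_update(struct toy_key *key, void *new_payload)
{
	void *to_free = key->negative ? NULL : key->payload;

	assert(new_payload != NULL);
	key->payload = new_payload;
	key->negative = 0;
	return to_free;
}
```

Without the check on the negative flag, the cached error code would be handed to the release path as if it were a pointer, which matches the faulting address seen in the oops below.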
The following may be used to trigger the bug in the user key type:
keyctl request2 user user "" @u
keyctl add user user "a" @u
which manifests itself as:
BUG: unable to handle kernel paging request at 00000000ffffff8a
IP: [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
PGD 7cc30067 PUD 0
Oops: 0002 [#1] SMP
Modules linked in:
CPU: 3 PID: 2644 Comm: a.out Not tainted 4.3.0+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88003ddea700 ti: ffff88003dd88000 task.ti: ffff88003dd88000
RIP: 0010:[<ffffffff810a376f>] [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280
[<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
RSP: 0018:ffff88003dd8bdb0 EFLAGS: 00010246
RAX: 00000000ffffff82 RBX: 0000000000000000 RCX: 0000000000000001
RDX: ffffffff81e3fe40 RSI: 0000000000000000 RDI: 00000000ffffff82
RBP: ffff88003dd8bde0 R08: ffff88007d2d2da0 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88003e8073c0 R12: 00000000ffffff82
R13: ffff88003dd8be68 R14: ffff88007d027600 R15: ffff88003ddea700
FS: 0000000000b92880(0063) GS:ffff88007fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000ffffff8a CR3: 000000007cc5f000 CR4: 00000000000006e0
Stack:
ffff88003dd8bdf0 ffffffff81160a8a 0000000000000000 00000000ffffff82
ffff88003dd8be68 ffff88007d027600 ffff88003dd8bdf0 ffffffff810a39e5
ffff88003dd8be20 ffffffff812a31ab ffff88007d027600 ffff88007d027620
Call Trace:
[<ffffffff810a39e5>] kfree_call_rcu+0x15/0x20 kernel/rcu/tree.c:3136
[<ffffffff812a31ab>] user_update+0x8b/0xb0 security/keys/user_defined.c:129
[< inline >] __key_update security/keys/key.c:730
[<ffffffff8129e5c1>] key_create_or_update+0x291/0x440 security/keys/key.c:908
[< inline >] SYSC_add_key security/keys/keyctl.c:125
[<ffffffff8129fc21>] SyS_add_key+0x101/0x1e0 security/keys/keyctl.c:60
[<ffffffff8185f617>] entry_SYSCALL_64_fastpath+0x12/0x6a arch/x86/entry/entry_64.S:185
Note the error code (-ENOKEY) in EDI.
A similar bug can be tripped by:
keyctl request2 trusted user "" @u
keyctl add trusted user "a" @u
This should also affect encrypted keys - but that has to be correctly
parameterised or it will fail with EINVAL before getting to the bit that
crashes.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
We only added the request to the request list for the !blk-mq case,
so we should only delete it in that case as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
When we fail various metadata related operations in nvme_queue_rq we
need to unmap the data SGL.
Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
We received a bug report recently when DDW (64-bit direct DMA on Power)
is not enabled for NVMe devices. In that case, we fall back to 32-bit
DMA via the IOMMU, which is always done via 4K TCEs (Translation Control
Entries).
The NVMe device driver, though, assumes that the DMA alignment for the
PRP entries will match the device's page size, and that the DMA alignment
matches the kernel's page alignment. On Power, the IOMMU page size,
as mentioned above, can be 4K, while the device can have a page size of
8K, while the kernel has a page size of 64K. This eventually trips the
BUG_ON in nvme_setup_prps(), as we have a 'dma_len' that is a multiple
of 4K but not 8K (e.g., 0xF000).
In this particular case of page sizes, we clearly want to use the
IOMMU's page size in the driver. And generally, the NVMe driver in this
function should be using the IOMMU's page size for the default device
page size, rather than the kernel's page size. There is not currently an
API to obtain the IOMMU's page size across all architectures and in the
interest of a stop-gap fix to this functional issue, default the NVMe
device page size to 4K, with the intent of adding such an API and
implementation across all architectures in the next merge window.
With the functionally equivalent v3 of this patch, our hardware test
exerciser survives when using 32-bit DMA; without the patch, the kernel
will BUG within a few minutes.
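The alignment mismatch behind the BUG_ON can be checked with a few lines of arithmetic. This is only an illustration of the 0xF000 example above; is_page_multiple() is a hypothetical helper, not part of the NVMe driver.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if len is a whole number of pages of the given power-of-two size. */
int is_page_multiple(uint64_t len, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (len & (page_size - 1)) == 0;
}
```

A dma_len of 0xF000 is 15 IOMMU pages of 4K, but 7.5 device pages of 8K, so is_page_multiple(0xF000, 4096) holds while is_page_multiple(0xF000, 8192) does not.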
Signed-off-by: Nishanth Aravamudan <nacc at linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The hisi_pcie_probe() function is incorrectly marked as __init, as Kconfig
tells us:
WARNING: drivers/pci/host/built-in.o(.data+0x7780): Section mismatch in reference from the variable hisi_pcie_driver to the function .init.text:hisi_pcie_probe()
If the probe for this device gets deferred past the point where __init
functions are removed, or the device is unbound and then reattached to the
driver, we branch into uninitialized memory, which is bad.
Remove the __init annotation from hisi_pcie_probe() and
hisi_add_pcie_port().
Fixes: 500a1d9a43 ("PCI: hisi: Add HiSilicon SoC Hip05 PCIe driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Zhou Wang <wangzhou1@hisilicon.com>
Pull device mapper fixes from Mike Snitzer:
"Two fixes for 4.4-rc1's DM ioctl changes that introduced the potential
for infinite recursion on ioctl (with DM multipath).
And four stable fixes:
- A DM thin-provisioning fix to restore 'error_if_no_space' setting
when a thin-pool is made writable again (after having been out of
space).
- A DM thin-provisioning fix to properly advertise discard support
for thin volumes that are stacked on a thin-pool whose underlying
data device doesn't support discards.
- A DM ioctl fix to allow ctrl-c to break out of an ioctl retry loop
when DM multipath is configured to 'queue_if_no_path'.
- A DM crypt fix for a possible hang on dm-crypt device removal"
* tag 'dm-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm thin: fix regression in advertised discard limits
dm crypt: fix a possible hang due to race condition on exit
dm mpath: fix infinite recursion in ioctl when no paths and !queue_if_no_path
dm: do not reuse dm_blk_ioctl block_device input as local variable
dm: fix ioctl retry termination with signal
dm thin: restore requested 'error_if_no_space' setting on OODS to WRITE transition
"pp->io" is an I/O resource, e.g., "[io 0x0000-0xffff]"; "pp->io_base" is
the CPU physical address of a region where the host bridge converts CPU
memory accesses into PCI I/O transactions.
Corrupting pp->io_base by assigning pp->io->start to it breaks access to
the PCI I/O space, as reported by Kishon.
Remove the invalid assignment.
[bhelgaas: changelog]
Fixes: 0021d22b73 ("PCI: designware: Use of_pci_get_host_bridge_resources() to parse DT")
Reported-and-tested-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
I got a crash during a "perf top" session that was caused by a race in
__task_pid_nr_ns() :
pid_nr_ns() was inlined, but apparently the compiler chose to read
task->pids[type].pid twice, and the pid->level dereference crashed
because we got a NULL pointer at the second read :
if (pid && ns->level <= pid->level) { // CRASH
Just use RCU API properly to solve this race, and not worry about "perf
top" crashing hosts :(
get_task_pid() can benefit from same fix.
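The double read can be avoided by loading the shared pointer exactly once into a local variable. The sketch below uses C11 atomics in userspace as a stand-in; it is not the kernel fix, which uses the RCU accessors, and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_pid {
	int level;
};

/* Slot that another thread may set to NULL at any time. */
struct toy_task {
	_Atomic(struct toy_pid *) pid;
};

/*
 * Load the shared pointer once, then test and dereference only the
 * local copy. Returns -1 if the pointer was NULL. This illustrates the
 * single load only; it does not model the RCU grace period that keeps
 * the object alive in the kernel.
 */
int toy_pid_level(struct toy_task *task)
{
	struct toy_pid *pid;

	assert(task != NULL);
	pid = atomic_load_explicit(&task->pid, memory_order_acquire);
	return pid ? pid->level : -1;
}
```

Because the NULL test and the dereference both use the local pid, a concurrent store of NULL to the slot cannot slip in between them.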
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit fa1aa143ac ("selinux: extended permissions for ioctls")
introduced a bug into the handling of conditional rules, skipping the
processing entirely when the caller does not provide an extended
permissions (xperms) structure. Access checks from userspace using
/sys/fs/selinux/access do not include such a structure since that
interface does not presently expose extended permission information.
As a result, conditional rules were being ignored entirely on userspace
access requests, producing denials when access was allowed by
conditional rules in the policy. Fix the bug by only skipping
computation of extended permissions in this situation, not the entire
conditional rules processing.
Reported-by: Laurent Bigonville <bigon@debian.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: fixed long lines in patch description]
Cc: stable@vger.kernel.org # 4.3
Signed-off-by: Paul Moore <pmoore@redhat.com>
Commit 1266963170 ("PCI: Prevent out of bounds access in numa_node
override") missed that the user-provided node could also be negative.
Handle this case as well to avoid out-of-bounds accesses to the
node_states[] array. However, allow the special value -1, i.e.
NUMA_NO_NODE, to be able to set the 'no specific node' configuration.
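The range check described above can be sketched as follows. The names toy_node_ok() and TOY_NUMA_NO_NODE are hypothetical, and the sketch covers only the bounds check; any further checks in the kernel, such as whether the node is online, are not modelled.

```c
#include <assert.h>

#define TOY_NUMA_NO_NODE (-1)  /* same value as NUMA_NO_NODE in the text */

/*
 * Returns 1 if node can safely index an array of max_nodes entries or
 * is the special "no specific node" value, and 0 otherwise.
 */
int toy_node_ok(int node, int max_nodes)
{
	assert(max_nodes > 0);
	if (node == TOY_NUMA_NO_NODE)
		return 1;
	return node >= 0 && node < max_nodes;
}
```

With this check, -1 is accepted, while -2 and values at or above max_nodes are rejected before they can be used as an array index.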
Fixes: 1266963170 ("PCI: Prevent out of bounds access in numa_node override")
Fixes: 63692df103 ("PCI: Allow numa_node override via sysfs")
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Sasha Levin <sasha.levin@oracle.com>
CC: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org # v3.19+
Pull block layer fixes from Jens Axboe:
"A round of fixes/updates for the current series.
This looks a little bigger than it is, but that's mainly because we
pushed the lightnvm enabled null_blk change out of the merge window so
it could be updated a bit. The rest of the volume is also mostly
lightnvm. In particular:
- Lightnvm. Various fixes, additions, updates from Matias and
Javier, as well as from Wenwei Tao.
- NVMe:
- Fix for potential arithmetic overflow from Keith.
- Also from Keith, ensure that we reap pending completions from
a completion queue before deleting it. Fixes kernel crashes
when resetting a device with IO pending.
- Various little lightnvm related tweaks from Matias.
- Fixup flushes to go through the IO scheduler, for the cases where a
flush is not required. Fixes a case in CFQ where we would be
idling and not see this request, hence not break the idling. From
Jan Kara.
- Use list_{first,prev,next} in elevator.c for cleaner code. From
Geliang Tang.
- Fix for a warning trigger on btrfs and raid on single queue blk-mq
devices, where we would flush plug callbacks with preemption
disabled. From me.
- A mac partition validation fix from Kees Cook.
- Two merge fixes from Ming, marked stable. A third part is adding a
new warning so we'll notice this quicker in the future, if we screw
up the accounting.
- Cleanup of thread name/creation in mtip32xx from Rasmus Villemoes"
* 'for-linus' of git://git.kernel.dk/linux-block: (32 commits)
blk-merge: warn if figured out segment number is bigger than nr_phys_segments
blk-merge: fix blk_bio_segment_split
block: fix segment split
blk-mq: fix calling unplug callbacks with preempt disabled
mac: validate mac_partition is within sector
mtip32xx: use formatting capability of kthread_create_on_node
NVMe: reap completion entries when deleting queue
lightnvm: add free and bad lun info to show luns
lightnvm: keep track of block counts
nvme: lightnvm: use admin queues for admin cmds
lightnvm: missing free on init error
lightnvm: wrong return value and redundant free
null_blk: do not del gendisk with lightnvm
null_blk: use device addressing mode
null_blk: use ppa_cache pool
NVMe: Fix possible arithmetic overflow for max segments
blk-flush: Queue through IO scheduler when flush not required
null_blk: register as a LightNVM device
elevator: use list_{first,prev,next}_entry
lightnvm: cleanup queue before target removal
...
If the Mediatek 8173 SoC is enabled, the power domain driver should also be
enabled. Otherwise, accesses to clk subsystem registers will fail.
Signed-off-by: Eddie Huang <eddie.huang@mediatek.com>
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
If we call __kvm_hyp_panic while a guest context is active, we call
__restore_sysregs before acquiring the system register values for the
panic, in the process throwing away the PAR_EL1 value at the point of
the panic.
This patch modifies __kvm_hyp_panic to stash the PAR_EL1 value prior to
restoring host register values, enabling us to report the original
values at the point of the panic.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Currently __kvm_hyp_panic uses %p for values which are not pointers,
such as the ESR value. This can confusingly lead to "(null)" being
printed for the value.
Use %x instead, and only use %p for host pointers.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
We were probing the physical distributor state for the active state of a
HW virtual IRQ, because we had seen evidence that the LR state was not
cleared when the guest deactivated a virtual interrupt.
However, this issue turned out to be a software bug in the GIC, which
was solved by: 84aab5e68c2a5e1e18d81ae8308c3ce25d501b29
(KVM: arm/arm64: arch_timer: Preserve physical dist. active
state on LR.active, 2015-11-24)
Therefore, get rid of the complexities and just look at the LR.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
We were incorrectly removing the active state from the physical
distributor on the timer interrupt when the timer output level was
deasserted. We shouldn't be doing this without considering the virtual
interrupt's active state, because the architecture requires that when an
LR has the HW bit set and the pending or active bits set, then the
physical interrupt must also have the corresponding bits set.
This addresses an issue where we have been observing an inconsistency
between the LR state and the physical distributor state where the LR
state was active and the physical distributor was not active, which
shouldn't happen.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
We were setting the physical active state on the GIC distributor in a
preemptible section, which could cause us to set the active state on a
different physical CPU from the one we were actually going to run on;
havoc ensues.
Since we are no longer descheduling/scheduling soft timers in the
flush/sync timer functions, simply move the timer flush into a
non-preemptible section.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Cortex-A57 parts up to r1p2 can misreport Stage 2 translation faults
when a Stage 1 permission fault or device alignment fault should
have been reported.
This patch implements the workaround (which is to validate that the
Stage-1 translation actually succeeds) by using code patching.
Cc: stable@vger.kernel.org
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
When running a 32bit guest under a 64bit hypervisor, the ARMv8
architecture defines a mapping of the 32bit registers in the 64bit
space. This includes banked registers that are being demultiplexed
over the 64bit ones.
On exceptions caused by an operation involving a 32bit register, the
HW exposes the register number in the ESR_EL2 register. It was so
far understood that SW had to distinguish between AArch32 and AArch64
accesses (based on the current AArch32 mode and register number).
It turns out that I misinterpreted the ARM ARM, and the clue is in
D1.20.1: "For some exceptions, the exception syndrome given in the
ESR_ELx identifies one or more register numbers from the issued
instruction that generated the exception. Where the exception is
taken from an Exception level using AArch32 these register numbers
give the AArch64 view of the register."
Which means that the HW is already giving us the translated version,
and that we shouldn't try to interpret it at all (for example, doing
an MMIO operation from the IRQ mode using the LR register leads to
very unexpected behaviours).
The fix is thus not to perform a call to vcpu_reg32() at all from
vcpu_reg(), and use whatever register number is supplied directly.
The only case we need to find out about the mapping is when we
actively generate a register access, which only occurs when injecting
a fault in a guest.
Cc: stable@vger.kernel.org
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
The open coded tests for checking whether a PTE maps a page as
uncached use a flawed '(pte_val(xxx) & CONST) != CONST' pattern,
which is not guaranteed to work since the type of a mapping is
not a set of mutually exclusive bits.
For HYP mappings, the type is an index into the MAIR table (i.e, the
index itself does not contain any information whatsoever about the
type of the mapping), and for stage-2 mappings it is a bit field where
normal memory and device types are defined as follows:
#define MT_S2_NORMAL 0xf
#define MT_S2_DEVICE_nGnRE 0x1
I.e., masking *and* comparing with the latter matches on the former,
and we have been getting lucky merely because the S2 device mappings
also have the PTE_UXN bit set, or we would misidentify memory mappings
as device mappings.
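The false match is easy to reproduce on the raw attribute values. In this sketch the attribute is treated as an unshifted 4-bit field; TOY_S2_ATTR_MASK and both function names are assumptions made for illustration. The whole-field comparison is shown only for contrast; it is not the fix applied by this patch, which checks whether the mapping is backed by host-managed memory.

```c
#include <assert.h>

#define MT_S2_NORMAL        0xfUL
#define MT_S2_DEVICE_nGnRE  0x1UL
#define TOY_S2_ATTR_MASK    0xfUL  /* assumed field width, illustration only */

/* Flawed: treats the type as a set of independent bits. */
int flawed_is_device(unsigned long attr)
{
	assert((attr & ~TOY_S2_ATTR_MASK) == 0);
	return (attr & MT_S2_DEVICE_nGnRE) == MT_S2_DEVICE_nGnRE;
}

/* Compares the whole field, so each type matches only itself. */
int field_is_device(unsigned long attr)
{
	assert((attr & ~TOY_S2_ATTR_MASK) == 0);
	return (attr & TOY_S2_ATTR_MASK) == MT_S2_DEVICE_nGnRE;
}
```

Since 0xf & 0x1 == 0x1, flawed_is_device(MT_S2_NORMAL) reports normal memory as a device mapping, while field_is_device(MT_S2_NORMAL) does not.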
Since the unmap_range() code path (which contains one instance of the
flawed test) is used both for HYP mappings and stage-2 mappings, and
considering the difference between the two, it is non-trivial to fix
this by rewriting the tests in place, as it would involve passing
down the type of mapping through all the functions.
However, since HYP mappings and stage-2 mappings both deal with host
physical addresses, we can simply check whether the mapping is backed
by memory that is managed by the host kernel, and only perform the
D-cache maintenance if this is the case.
Cc: stable@vger.kernel.org
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
We have seen lots of reports of this kind of issue, so add a
warning in blk-merge; then it can be triggered easily without
depending on a warning/bug from drivers.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Commit bdced438acd83a ("block: setup bi_phys_segments after
splitting") introduced computing bio->bi_phys_segments
during bio splitting.
Unfortunately, neither bio->bi_seg_front_size nor bio->bi_seg_back_size
is computed, so too many physical segments may be obtained
for one request, since both are used to check whether a segment
can span two bios.
This patch fixes the issue by computing the two variables in
blk_bio_segment_split().
Fixes: bdced438acd83a(block: setup bi_phys_segments after splitting)
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Inside blk_bio_segment_split(), the previous bvec pointer (bvprvp)
always points to the iterator local variable, which is obviously
wrong, so fix it by pointing to the local variable 'bvprv'.
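The aliasing problem can be shown with a small loop that compares each element with its predecessor. This is a userspace sketch with hypothetical names (struct seg, count_gaps()); it is not the block layer code.

```c
#include <assert.h>
#include <stddef.h>

struct seg {
	unsigned int start;
	unsigned int len;
};

/* Count segments that do not start right where their predecessor ended. */
unsigned int count_gaps(const struct seg *segs, size_t n)
{
	struct seg cur, prev;
	const struct seg *prevp = NULL;
	unsigned int gaps = 0;
	size_t i;

	assert(segs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		cur = segs[i];
		if (prevp && prevp->start + prevp->len != cur.start)
			gaps++;
		/*
		 * Copy first, then point at the copy. Setting prevp = &cur
		 * instead would make it alias the iterator, so "previous"
		 * would always be the current element.
		 */
		prev = cur;
		prevp = &prev;
	}
	return gaps;
}
```

With prevp pointing at a separate copy, the comparison on the next iteration really sees the previous element.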
Fixes: 5014c311baa2b(block: fix bogus compiler warnings in blk-merge.c)
Cc: stable@kernel.org #4.3
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
A truncated cb_compound request will cause the client to decode null or
data from a previous callback for nfs4.1 backchannel case, or uninitialized
data for the nfs4.0 case. This is because the path through
svc_process_common() advances the request's iov_base and decrements iov_len
without adjusting the overall xdr_buf's len field. That causes
xdr_init_decode() to set up the xdr_stream with an incorrect length in
nfs4_callback_compound().
Fixing this for the nfs4.1 backchannel case first requires setting the
correct iov_len and page_len based on the length of received data in the
same manner as the nfs4.0 case.
Then the request's xdr_buf length can be adjusted for both cases based upon
the remaining iov_len and page_len.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If clp->cl_cb_ident is zero, then nfs_cb_idr_remove_locked() skips removing
it when the nfs_client is freed. A decoding or server bug can then find
and try to put that first nfs_client which would lead to a crash.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: d687031265 ("nfs4client: convert to idr_alloc()")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
When LAYOUTGET gets NFS4ERR_DELAY, we currently will wait 15s before
retrying the call. That is a _very_ long time, so add a timeout value to
struct nfs4_layoutget and pass nfs4_async_handle_error a pointer to it.
This allows the RPC engine to use a sliding delay window, instead of a
15s delay.
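A sliding delay window of this kind can be sketched as a doubling delay with a lower and an upper bound. The constants and the name toy_next_delay() are assumptions made for illustration: 15000 ms matches the fixed 15s delay mentioned above, and the 100 ms minimum is an assumed value.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RETRY_MIN_MS 100L    /* assumed minimum, illustration only */
#define TOY_RETRY_MAX_MS 15000L  /* the fixed 15 s delay from the text */

/*
 * Return the delay to use now and double *timeout for the next retry,
 * clamped to [TOY_RETRY_MIN_MS, TOY_RETRY_MAX_MS]. A caller that
 * passes no timeout pointer gets the fixed maximum delay.
 */
long toy_next_delay(long *timeout)
{
	long delay;

	if (timeout == NULL)
		return TOY_RETRY_MAX_MS;
	if (*timeout <= 0)
		*timeout = TOY_RETRY_MIN_MS;
	if (*timeout > TOY_RETRY_MAX_MS)
		*timeout = TOY_RETRY_MAX_MS;
	delay = *timeout;
	*timeout *= 2;
	assert(delay >= TOY_RETRY_MIN_MS && delay <= TOY_RETRY_MAX_MS);
	return delay;
}
```

Starting from a zeroed timeout, successive calls yield 100, 200, 400 ms and so on until the 15 s cap is reached, so a short-lived NFS4ERR_DELAY condition is retried quickly.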
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
NFS v4.2 operations can work outside of pNFS, so dprintk() output
shouldn't be placed under NFSDBG_PNFS.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
The NFS CLONE_RANGE definition was wrong and thus never worked. Fix this
by simply using the btrfs ioctl definition.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Originally CLONE didn't allow for intra-file clones, but we recently
updated the spec to support this feature which is also supported by
local Linux file systems.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Without this for example 64-bit binaries on typical amd64 distributions
would not be able to use ioctls on NFS. For now this only affects clones.
Additionally ->compat_ioctl is defined even for non-compat builds, so
get rid of the pointless ifdef.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Currently we pass uninitialized stack garbage in the count parameter.
The value is usually large enough to clone whole files and thus let
simple tests pass, but it makes the tests for range clones very unhappy.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
The following test program from Dmitry can cause softlockups or RCU
stalls as it copies 1GB from tmpfs into eventfd and we don't have any
scheduling point at that path in sendfile(2) implementation:
	int r1 = eventfd(0, 0);
	int r2 = memfd_create("", 0);
	unsigned long n = 1<<30;
	fallocate(r2, 0, 0, n);
	sendfile(r1, r2, 0, n);
Add cond_resched() into __splice_from_pipe() to fix the problem.
CC: Dmitry Vyukov <dvyukov@google.com>
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Commit 296291cdd1 (mm: make sendfile(2) killable) fixed an issue where
sendfile(2) was doing a lot of tiny writes into a filesystem and thus
was unkillable for a long time. However sendfile(2) can be (mis)used to
issue lots of writes into arbitrary file descriptors such as eventfd or
similar special file descriptors which never hit the standard filesystem
write path and thus are still unkillable. E.g. the following example
from Dmitry burns CPU for ~16s on my test system without possibility to
be killed:
	int r1 = eventfd(0, 0);
	int r2 = memfd_create("", 0);
	unsigned long n = 1<<30;
	fallocate(r2, 0, 0, n);
	sendfile(r1, r2, 0, n);
There are actually quite a few tests for pending signals in sendfile
code; however, because data to write is always available, none of them
seems to trigger. So fix the problem by adding a test for pending signal into
splice_from_pipe_next() also before the loop waiting for pipe buffers to
be available. This should fix all the lockup issues with sendfile of the
do-ton-of-tiny-writes nature.
CC: stable@vger.kernel.org
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The thing got broken back in 2002 - sysvfs does *not* have inline
symlinks; even short ones have bodies stored in the first block
of file. sysv_symlink() handles that correctly; unfortunately,
attempting to look an existing symlink up will end up mistaking
it for an inline symlink and interpreting the block number containing
the body as the body itself.
Nobody has noticed until now, which says something about the level
of testing sysvfs gets ;-/
Cc: stable@vger.kernel.org # all of them, not that anyone cared
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The IMX6Q/IMX6DL SoC's have a 2-bit temperature grade stored in OTP which
is valid for all IMX6 SoC's (despite the fact that the IMXSDLRM and
IMXSXRM do not document this - this has been proven via tests as well as
verified by Freescale FAE).
Instead of assuming a fixed 85C for passive cooling threshold and 105C for
critical use the thermal grade for these configurations.
We will set the critical to maxT - 5C and passive to maxT - 10C.
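The trip-point arithmetic above can be sketched as follows. The struct and function names are hypothetical, temperatures are assumed to be in millidegrees Celsius, and the maximum temperature for each grade is left to the caller, since the text does not list the values.

```c
/* All temperatures are in millidegrees Celsius (an assumption). */
struct trip_points {
	int passive;
	int critical;
};

/*
 * Derive the default trip points from the maximum temperature of the
 * part's grade: critical = max - 5C, passive = max - 10C.
 */
struct trip_points default_trips(int temp_max)
{
	struct trip_points t;

	t.critical = temp_max - 5000;
	t.passive = temp_max - 10000;
	return t;
}

/*
 * Update the passive trip point on request, refusing any value above
 * the critical trip point. Returns 0 on success, -1 if rejected.
 */
int set_passive(struct trip_points *t, int temp)
{
	if (temp > t->critical)
		return -1;
	t->passive = temp;
	return 0;
}
```

For a part with a 105C maximum this gives a critical trip at 100C and a passive trip at 95C.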
Cc: Anson Huang <b20788@freescale.com>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Jon Nettleton <jon@solid-run.com>
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
----
v3:
- rebase against linux-soc-thermal.git
- added ack's from Shawn and Jon
v2:
- remove check for IMX6Q and update comments: The OTP values have been tested
on IMX6SOLO, IMX6DUALLITE, and IMX6SX and Freescale FAE has shared data with
me that the OTP settings are the same and that the reference manuals will
reflect this in their next updates.
- set critical to max - 5C
- set passive to max - 10C
- display max temp in info
- do not allow passive to be set above critical
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
When the prototype for thermal_zone_bind_cooling_device
changed, the static inline wrapper function was left alone,
which in theory can cause build warnings:
I have seen this error in the past:
drivers/thermal/db8500_thermal.c: In function 'db8500_cdev_bind':
drivers/thermal/db8500_thermal.c:78:9: error: too many arguments to function 'thermal_zone_bind_cooling_device'
ret = thermal_zone_bind_cooling_device(thermal, i, cdev,
while this one no longer shows up, there is no doubt that
the prototype is still wrong, so let's just fix it anyway.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 6cd9e9f629 ("thermal: of: fix cooling device weights in device tree")
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
This just caused build errors:
warning: (QCOM_SPMI_TEMP_ALARM) selects REGMAP_SPMI which has unmet direct dependencies (SPMI)
drivers/built-in.o: In function `regmap_spmi_ext_gather_write':
:(.text+0x609b0): undefined reference to `spmi_ext_register_write'
:(.text+0x609f0): undefined reference to `spmi_ext_register_writel'
While it's generally a good idea to allow compile testing, in this
case, it just doesn't work, so reverting the patch that
introduced the compile-test variant seems the most appropriate
solution.
Note that SPMI also has a 'depends on ARCH_QCOM || COMPILE_TEST'
statement, so we should be able to enable SPMI on all architectures
for compile testing already.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: cb7fb4d342 ("thermal: qcom_spmi: allow compile test")
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The SCPI clk driver registers the virtual cpufreq device that kicks off
initialisation of the SCPI cpufreq driver. Also, clk_get() will fail for
the cpufreq driver if the SCPI clk driver is missing.
Fix this by making the SCPI cpufreq driver explicitly depend on the SCPI
clk driver.
Fixes: 8def31034d (cpufreq: arm_big_little: add SCPI interface driver)
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A rounding error was found in the calculation of limits->max_perf
in intel_pstate_set_policy(), which is used to calculate the max and min
pstate values in intel_pstate_get_min_max(). In that code,
limits->max_perf is truncated to 2 hex digits such that, for example,
0x169 was incorrectly calculated to 0x16 instead of 0x17. This resulted in
the pstate being set one level too low. This patch rounds the value of
limits->max_perf up instead of down so that the correct max pstate can
be reached.
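The 0x169 example can be reproduced with a small sketch. This only illustrates the arithmetic of dropping the lowest hex digit by truncation versus rounding up; the function names are hypothetical and this is not the driver's fixed-point code.

```c
/* Drop the lowest hex digit by truncation: 0x169 becomes 0x16. */
unsigned int drop_digit_truncate(unsigned int v)
{
	return v >> 4;
}

/* Drop the lowest hex digit, rounding up: 0x169 becomes 0x17. */
unsigned int drop_digit_round_up(unsigned int v)
{
	return (v + 0xf) >> 4;
}
```

A value that is already exact, such as 0x160, yields 0x16 with either helper, so rounding up only changes results that truncation would have pushed one level too low.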
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
I have an Intel (6,63) processor with a "marketing" frequency (from
/proc/cpuinfo) of 2100MHz, and a max turbo frequency of 2600MHz. I
can execute
cpupower frequency-set -g powersave --min 1200MHz --max 2100MHz
and the max_freq_pct is set to 80. When adding load to the system I noticed
that the cpu frequency only reached 2000MHz and not 2100MHz as expected.
This is because limits->max_policy_pct is calculated as 2100 * 100 / 2600 = 80.7
and is rounded down to 80 when it should be rounded up to 81. This patch
adds a DIV_ROUND_UP() which will return the correct value.
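The difference between the two roundings can be checked with a short userspace sketch. The macro below matches the usual definition of the kernel's DIV_ROUND_UP(); the two function names are hypothetical and frequencies are assumed to be in MHz.

```c
/* Integer division that rounds the quotient up instead of down. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Percentage of max_freq, truncated: 2100 of 2600 gives 80. */
unsigned int pct_truncated(unsigned int freq, unsigned int max_freq)
{
	return freq * 100 / max_freq;
}

/* Percentage of max_freq, rounded up: 2100 of 2600 gives 81. */
unsigned int pct_rounded_up(unsigned int freq, unsigned int max_freq)
{
	return DIV_ROUND_UP(freq * 100, max_freq);
}
```

With 80 percent the limit works out to 2080MHz, below the requested 2100MHz, while 81 percent gives 2106MHz, which covers it.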
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subsys interface's ->remove_dev() is called when the cpufreq driver is
unregistering or the CPU is getting physically removed. We keep removing
the cpuX/cpufreq link for all CPUs except the last one, which is a
mistake as all CPUs contain a link now.
Because of this, one CPU from each policy will still contain a link (to
an already removed policyX directory), after the cpufreq driver is
unregistered.
Fix that by removing the link first and then only see if the policy is
required to be freed. That will make sure that no links are left out.
Fixes: 96bdda61f5 ("cpufreq: create cpu/cpufreq/policyX directories")
Reported-and-tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The CPU policy struct indicates the co-ordination type
for all CPUs of a common freq domain. Initialize it
correctly using the CPU specific data gathered from
CPPC ACPI lib via acpi_get_psd_map().
The PSD object is optional, so the cpu->shared_type
can also be 0. So instead of assuming any value other
than SW_ANY(0xFD) is unsupported, explicitly check
if shared_type is SW_ALL and then bail.
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull kselftest fixes from Shuah Khan:
"This update consists of one minor documentation fix and a fix to an
existing test"
* tag 'linux-kselftest-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/seccomp: Get page size from sysconf
tools:testing/selftests: fix typo in futex/README
When establishing a thin device's discard limits we cannot rely on the
underlying thin-pool device's discard capabilities (which are inherited
from the thin-pool's underlying data device) given that DM thin devices
must provide discard support even when the thin-pool's underlying data
device doesn't support discards.
Users were exposed to this thin device discard limits regression if
their thin-pool's underlying data device does _not_ support discards.
This regression caused all upper-layers that called the
blkdev_issue_discard() interface to not be able to issue discards to
thin devices (because discard_granularity was 0). This regression
wasn't caught earlier because the device-mapper-test-suite's extensive
'thin-provisioning' discard tests are only ever performed against
thin-pool's with data devices that support discards.
Fix is to have thin_io_hints() test the pool's 'discard_enabled' feature
rather than inferring whether or not a thin device's discard support
should be enabled by looking at the thin-pool's discard_granularity.
Fixes: 216076705 ("dm thin: disable discard support for thin devices if pool's is disabled")
Reported-by: Mike Gerber <mike@sprachgewalt.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 4.1+
Currently the kernel crashes randomly when K2L EVM is booted without
clk_ignore_unused in the bootargs. This workaround is not needed
on other K2 devices such as K2HK and K2E and with this fix, we can
remove the workaround altogether. netcp driver on K2L uses linked
ram on OSR (On chip Static RAM) and requires the clock to this peripheral
enabled for proper functioning. This is the reason for the kernel crash.
So add the clock node to fix this issue.
While at it, remove the workaround documentation as well.
With the fix applied, clk_summary dump shows the clock to OSR enabled.
cat /sys/kernel/debug/clk/clk_summary
------cut--------------
tcp3d-1 0 0 399360000 0 0
tcp3d-0 0 0 399360000 0 0
osr 1 1 399360000 0 0
fftc-0 0 0 399360000 0 0
-----cut----------------
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Configure linking RAM for both queue managers also in case
when only linking RAM 0 is specified in device tree.
Currently hwqueue driver configures linking RAM(s) to be used
cooperatively by the QMs (shared mode). Therefore if both
queue managers are used then both must be configured with
exactly the same linking RAM info (base address and size)
independent of the number of linking RAM(s) specified in the
device tree.
For proper operation only one linking RAM is required and in most
cases this can be internal one as long as it is able to handle
the number of descriptors used in the system.
Current driver code however skips configuration of second
queue manager if second linking RAM is not specified.
If the configuration for the QM2 is missing there will be
a crash when it tries to push/pop descriptors from its queues.
Signed-off-by: Michal Morawiec <michal.1.morawiec.ext@nokia.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
When the firmware image for the PDSP is absent from the file system,
the kernel boot with ramfs/nfs is stuck for 60 seconds, which is the
default timeout. request_firmware_direct() is meant to take care of
such optional firmware loading, so replace the call in the
driver with this API.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
This way the driver isn't limited in the dependency handling callback.
v2: remove extra check in amd_sched_entity_pop_job()
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
We only need to wait for jobs to be scheduled when
the dependency is from the same scheduler.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls"
| perf record -g -c 15000 -e cycles /sbin/hackbench
|
| INFO: rcu_preempt self-detected stall on CPU
| 1: (1 GPs behind) idle=609/140000000000002/0 softirq=2914/2915 fqs=603
| Task dump for CPU 1:
The in-kernel dwarf unwinder has a fast binary lookup and a fallback linear
search (which iterates through each of ~11K entries) and thus takes 2 orders
of magnitude longer (~3 million cycles vs. 2000). Routines written in hand
assembler lack dwarf info (as we don't support assembler CFI pseudo-ops
yet), so they fail the unwinder binary lookup, hit the linear search, and
fail nevertheless in the end.
However, the linear search is pointless, as the binary lookup tables are
created from it in the first place. It is impossible for the binary lookup
to fail while the linear search succeeds. It is a pure waste of cycles and
is thus removed by this patch.
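The equivalence of the two lookups can be sketched in userspace C. The entry layout and function names are hypothetical, and the table is assumed to be sorted by start address with non-overlapping ranges. Under that assumption both functions find the same entry, or both return NULL; they differ only in how many entries they visit.

```c
#include <stddef.h>

/* Hypothetical unwind table entry covering [start, start + len). */
struct entry {
	unsigned long start;
	unsigned long len;
};

/* Linear scan: examines entries one by one until one covers pc. */
const struct entry *lookup_linear(const struct entry *tbl, size_t n,
				  unsigned long pc)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (pc >= tbl[i].start && pc - tbl[i].start < tbl[i].len)
			return &tbl[i];
	return NULL;
}

/* Binary search over the same entries, sorted by start address. */
const struct entry *lookup_binary(const struct entry *tbl, size_t n,
				  unsigned long pc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pc < tbl[mid].start)
			hi = mid;
		else if (pc - tbl[mid].start >= tbl[mid].len)
			lo = mid + 1;
		else
			return &tbl[mid];
	}
	return NULL;
}
```

An address that falls in a gap between entries, as code without dwarf info would, misses in both lookups, so falling back to the linear scan adds cost without changing the result.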
This manifested as RCU stalls / NMI watchdog splat when running
hackbench under perf with callgraph profiling. The triggering condition
was the perf counter overflowing in a routine lacking dwarf info (like
memset), leading to the pathetic 3 million cycle unwinder slow path. By the
time it returned, new interrupts were already pending (Timer, IPI) and
taken right away. The original memset didn't make forward progress, and the
system kept accruing more interrupts and more unwinder delays in a vicious
feedback loop, ultimately triggering the NMI diagnostic.
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Currently we can hit a scenario where we'll tm_reclaim() twice. This
results in a TM bad thing exception because the second reclaim occurs
when not in suspend mode.
The scenario in which this can happen is the following. We attempt to
deliver a signal to userspace. To do this we need to obtain the stack
pointer to write the signal context. To get this stack pointer we
must tm_reclaim() in case we need to use the checkpointed stack
pointer (see get_tm_stackpointer()). Normally we'd then return
directly to userspace to deliver the signal without going through
__switch_to().
Unfortunately, if at this point we get an error (such as a bad
userspace stack pointer), we need to exit the process. The exit will
result in a __switch_to(). __switch_to() will attempt to save the
process state which results in another tm_reclaim(). This
tm_reclaim() now causes a TM Bad Thing exception as this state has
already been saved and the processor is no longer in TM suspend mode.
Whee!
This patch checks the state of the MSR to ensure we are TM suspended
before we attempt the tm_reclaim(). If we've already saved the state
away, we should no longer be in TM suspend mode. This has the
additional advantage of checking for a potential TM Bad Thing
exception.
Found using syscall fuzzer.
Fixes: fb09692e71 ("powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently we allow both the MSR T and S bits to be set by userspace on
a signal return. Unfortunately this is a reserved configuration and
will cause a TM Bad Thing exception if attempted (via rfid).
This patch checks for this case in both the 32 and 64 bit signals
code. If both T and S are set, we mark the context as invalid.
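The check described above amounts to a mask comparison. The sketch below is a userspace illustration; the bit positions for the T and S bits and the function name are assumptions made for the example, so consult the architecture headers for the authoritative definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed positions of the TM state bits in the 64-bit MSR. */
#define MSR_TS_S	(UINT64_C(1) << 33)	/* transaction suspended */
#define MSR_TS_T	(UINT64_C(1) << 34)	/* transactional */
#define MSR_TS_MASK	(MSR_TS_S | MSR_TS_T)

/* True when both T and S are set, which is the reserved combination. */
bool msr_tm_reserved(uint64_t msr)
{
	return (msr & MSR_TS_MASK) == MSR_TS_MASK;
}
```

A signal return path would reject the user-supplied context when this returns true, instead of loading the value into the MSR.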
Found using a syscall fuzzer.
Fixes: 2b0a576d15 ("powerpc: Add new transactional memory state to the signal context")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The WDT_MODE value needs to be OR-ed with MODE_KEY when setting
watchdog mode. Add it to mtk_wdt_stop function, so that the
watchdog can be stopped (e.g. during suspend).
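A minimal sketch of the register value computation follows. The enable bit and key constants are assumptions for illustration, and the helper name is hypothetical; the point is only that the key is OR-ed into the value written back, since otherwise the hardware ignores the write.

```c
#include <stdint.h>

/* Assumed register layout, for illustration only. */
#define WDT_MODE_EN	(UINT32_C(1) << 0)
#define WDT_MODE_KEY	UINT32_C(0x22000000)

/*
 * Compute the value to write to WDT_MODE in order to stop the
 * watchdog: clear the enable bit and OR in the unlock key.
 */
uint32_t wdt_mode_stop_value(uint32_t mode)
{
	mode &= ~WDT_MODE_EN;
	mode |= WDT_MODE_KEY;
	return mode;
}
```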
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Acked-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
If we need to restart the watchdog due to someone changing the timeout
interval, stop the watchdog before restarting it. Otherwise, the new
timeout doesn't seem to take.
Signed-off-by: Andrew Chew <achew@nvidia.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
"t" is controlled by the user. If "t" is a very large integer then it
could lead to a negative "tmrval". We cap the upper bound of "tmrval"
but, in the current code, we allow negatives. This is a bug and it
causes a static checker warning. Let's make "tmrval" unsigned to avoid
this problem.
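The effect of the type change can be shown with two small helpers. The names and the 0xffff cap are hypothetical; the sketch only demonstrates that an upper-bound check alone lets a negative signed value through, while the same bit pattern treated as unsigned is capped.

```c
/* Hypothetical upper bound for the timer value. */
#define TMRVAL_MAX 0xffff

/* Signed variant: only the upper bound is checked, negatives pass. */
int cap_tmrval_signed(int tmrval)
{
	if (tmrval > TMRVAL_MAX)
		tmrval = TMRVAL_MAX;
	return tmrval;
}

/* Unsigned variant: a "negative" value is a huge one and gets capped. */
unsigned int cap_tmrval_unsigned(unsigned int tmrval)
{
	if (tmrval > TMRVAL_MAX)
		tmrval = TMRVAL_MAX;
	return tmrval;
}
```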
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Silences sparse warning:
drivers/watchdog/pnx4008_wdt.c:83:25:
warning: symbol 'wdt_clk' was not declared. Should it be static?
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
If common clock framework is configured, the driver generates a warning,
which is fixed by this change:
WARNING: CPU: 0 PID: 1 at drivers/clk/clk.c:727 clk_core_enable+0x2c/0xa4()
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.3.0-rc2+ #171
Hardware name: LPC32XX SoC (Flattened Device Tree)
Backtrace:
[<>] (dump_backtrace) from [<>] (show_stack+0x18/0x1c)
[<>] (show_stack) from [<>] (dump_stack+0x20/0x28)
[<>] (dump_stack) from [<>] (warn_slowpath_common+0x90/0xb8)
[<>] (warn_slowpath_common) from [<>] (warn_slowpath_null+0x24/0x2c)
[<>] (warn_slowpath_null) from [<>] (clk_core_enable+0x2c/0xa4)
[<>] (clk_core_enable) from [<>] (clk_enable+0x24/0x38)
[<>] (clk_enable) from [<>] (pnx4008_wdt_probe+0x78/0x11c)
[<>] (pnx4008_wdt_probe) from [<>] (platform_drv_probe+0x50/0xa0)
[<>] (platform_drv_probe) from [<>] (driver_probe_device+0x18c/0x408)
[<>] (driver_probe_device) from [<>] (__driver_attach+0x70/0x94)
[<>] (__driver_attach) from [<>] (bus_for_each_dev+0x74/0x98)
[<>] (bus_for_each_dev) from [<>] (driver_attach+0x20/0x28)
[<>] (driver_attach) from [<>] (bus_add_driver+0x11c/0x248)
[<>] (bus_add_driver) from [<>] (driver_register+0xa4/0xe8)
[<>] (driver_register) from [<>] (__platform_driver_register+0x50/0x64)
[<>] (__platform_driver_register) from [<>] (platform_wdt_driver_init+0x18/0x20)
[<>] (platform_wdt_driver_init) from [<>] (do_one_initcall+0x11c/0x1dc)
[<>] (do_one_initcall) from [<>] (kernel_init_freeable+0x10c/0x1d4)
[<>] (kernel_init_freeable) from [<>] (kernel_init+0x10/0xec)
[<>] (kernel_init) from [<>] (ret_from_fork+0x14/0x24)
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
The GPC irq domain is a child domain of the GIC, and all platform irqs
are now inside the GPC domain. When the modules are populated, every
device irq should get the correct type setting in the GIC. However, there
is no .irq_set_type callback in the GPC, so irq_set_type is skipped and
all irq types in /proc/interrupts show up as "edge", which does not match
the irq type settings in the dtb file. Since the GPC has no irq type
setting of its own, just tell the kernel to use irq_chip_set_type_parent.
Signed-off-by: Anson Huang <Anson.Huang@freescale.com>
Cc: <stable@vger.kernel.org> # 4.1+
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Something seems to have gone wrong during the merging of the device
tree changes with the following patch
"ARM: dts: add property for maximum ADC clock frequencies"
The property "fsl,adck-max-frequency" instead of being applied for
the ADC1 node got applied to the esdhc0 node. This patch fixes it.
Signed-off-by: Sanchayan Maity <maitysanchayan@gmail.com>
Fixes: def0641e2f ("ARM: dts: add property for maximum ADC clock frequencies")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Both the pointer array and the pointed data have to be const when using
__initconst to be correct. This also fixes LTO builds that otherwise
fail with section mismatch errors.
Fixes: ec60d95b4f ("ARM: shmobile: Basic r8a7793 SoC support")
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
If max_pfn is not initialized, the block layer may use wrong DMA masks.
Replace open-coded shifts by PFN_DOWN(), and drop the "0 on coldfire"
comment, as it is not even true on all Coldfires, let alone all
m68knommu platforms.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Tested-By: Greg Ungerer <gerg@uclinux.org>
If max_pfn is not initialized, the various /proc/kpage* files are empty,
and selftests/vm/mlock2-tests will fail. max_pfn is also used by the
block layer to calculate DMA masks.
Switch from init_bootmem_node() to init_bootmem(), as there's only one
memory node on Sun-3. This will initialize min_low_pfn and max_low_pfn,
which was also not done before.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
If max_pfn is not initialized, the various /proc/kpage* files are empty,
and selftests/vm/mlock2-tests will fail. max_pfn is also used by the
block layer to calculate DMA masks.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Tested-by: Greg Ungerer <gerg@uclinux.org>
If max_pfn is not initialized, the various /proc/kpage* files are empty,
and selftests/vm/mlock2-tests will fail. max_pfn is also used by the
block layer to calculate DMA masks.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
It turned out that many HP laptops suffer from the same problem as
fixed in commit [c932b98c1e: ALSA: hda - Apply pin fixup for HP
ProBook 6550b]. But it's tiresome to list all such PCI SSIDs, as
there are really lots of HP machines.
Instead, we do something a bit more clever: check the supposed dock and
built-in headphone pins, and apply the fixup when both seem valid.
This rule can be applied generically to all models using the same
quirk, so we'll fix them all in one shot.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107491
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
If md->signature == MAC_DRIVER_MAGIC and md->block_size == 1023, a single
512 byte sector would be read (secsize / 512). However the partition
structure would be located past the end of the buffer (secsize % 512).
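The arithmetic behind the overrun can be sketched as below. The helper names are hypothetical and the size of the partition structure is passed in as a parameter rather than assumed.

```c
#include <stdbool.h>
#include <stddef.h>

#define SECTOR_SIZE 512u

/* Index of the 512-byte sector that holds byte offset secsize. */
unsigned int sector_index(unsigned int secsize)
{
	return secsize / SECTOR_SIZE;
}

/* Offset of that byte within the sector buffer. */
unsigned int sector_offset(unsigned int secsize)
{
	return secsize % SECTOR_SIZE;
}

/*
 * True if a structure of part_size bytes placed at that offset stays
 * inside a single 512-byte sector buffer.
 */
bool fits_in_sector(unsigned int secsize, size_t part_size)
{
	return sector_offset(secsize) + part_size <= SECTOR_SIZE;
}
```

With a block size of 1023 the offset is 511, so any structure larger than one byte read from there extends beyond the buffer.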
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
kthread_create_on_node takes format+args, so there's no need to do the
pretty-printing in advance. Moreover, "mtip_svc_thd_99" (including its
'\0') only just fits in 16 bytes, so if index could ever go above 99
we'd have a stack buffer overflow.
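The buffer-size arithmetic can be checked with snprintf(), which reports the length a formatted string needs. The "%02d" format string and the helper names are assumptions for illustration.

```c
#include <stdio.h>

/* Length, excluding the trailing NUL, of the name for this index. */
int thread_name_len(int index)
{
	return snprintf(NULL, 0, "mtip_svc_thd_%02d", index);
}

/* Nonzero if the name plus its NUL fits in a buffer of bufsize bytes. */
int thread_name_fits(int index, size_t bufsize)
{
	int len = thread_name_len(index);

	return len >= 0 && (size_t)len + 1 <= bufsize;
}
```

Index 99 needs exactly 16 bytes including the terminator, and index 100 needs 17, which is why passing the format and arguments straight to kthread_create_on_node() is the safer choice.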
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Make sure that there are no unprocessed entries on a completion
queue before deleting it, and check for validity of the CQ
doorbell before writing completions to it.
This fixes problems with doing a sysfs reset of the device while
it's handling IO.
Tested-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Add free block, used block, and bad block information to the show debug
interface. This information is used to debug how targets track blocks.
Also, change debug function name to make it more generic.
Signed-off-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Maintain the number of in-use blocks, free blocks, and bad blocks on a
per-lun basis. This allows the upper layers to get information about the
state of each lun.
Also, account for blocks reserved to the device in the free block count.
nr_free_blocks now matches the actual number of blocks on the free list
when the device is booted.
Signed-off-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
According to the Open-Channel SSD Specification, the NVMe-NVM admin
commands use vendor specific opcodes of NVMe, so use the NVMe admin
queue to dispatch these commands.
Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Updated by me to include set bad block table as well and also use
the admin queue for l2p len calculation.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
If either max_phys_sect is out of bound, the nvm_dev structure is not
freed.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The return value should be non-zero under error conditions.
Remove nvme_free(dev) to avoid freeing dev more than once.
Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The gendisk structure has not been initialized when using lightnvm.
Make sure to not delete it upon exit. Also make sure that we use the
appropriate disk_name at unregistration.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The linear addressing mode was removed in 7386af2. Make null_blk instead
expose the ppa format geometry and support the generic addressing mode.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Instead of using a page pool, we can save memory by only allocating room
for 64 entries for the ppa command. Introduce a ppa_cache to allocate only
the required memory for the ppa list.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
A kernel thread executes __set_current_state(TASK_INTERRUPTIBLE),
__add_wait_queue, spin_unlock_irq and then tests kthread_should_stop().
It is possible that the processor reorders memory accesses so that
kthread_should_stop() is executed before __set_current_state(). If such
reordering happens, there is a possible race on thread termination:
CPU 0:
	calls kthread_should_stop()
	it tests KTHREAD_SHOULD_STOP bit, returns false
CPU 1:
	calls kthread_stop(cc->write_thread)
	sets the KTHREAD_SHOULD_STOP bit
	calls wake_up_process on the kernel thread, that sets the thread
	state to TASK_RUNNING
CPU 0:
	sets __set_current_state(TASK_INTERRUPTIBLE)
	spin_unlock_irq(&cc->write_thread_wait.lock)
	schedule() - and the process is stuck and never terminates, because the
	state is TASK_INTERRUPTIBLE and wake_up_process on CPU 1 already
	terminated
Fix this race condition by using a new flag DM_CRYPT_EXIT_THREAD to
signal that the kernel thread should exit. The flag is set and tested
while holding cc->write_thread_wait.lock, so there is no possibility of
racy access to the flag.
Also, remove the unnecessary set_task_state(current, TASK_RUNNING)
following the schedule() call. When the process was woken up, its state
was already set to TASK_RUNNING. Other kernel code also doesn't set the
state to TASK_RUNNING following schedule() (for example,
do_wait_for_common in completion.c doesn't do it).
Fixes: dc2676210c ("dm crypt: offload writes to thread")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
KVM: s390: Fixes for 4.4
1. Disallow changing the SIMD mode when CPUs have been created;
it allowed userspace to corrupt kernel memory
2. Fix vCPU lookup. Until now the vCPU number equals the vCPU id. Some
places in the kernel code relied on that. This might
a: cause guest failures
b: allow userspace to corrupt kernel memory
3. Fencing of the PFMF instruction should use the guest facilities
and not the host facilities.
To make the speakers on the Acer Aspire One Cloudbook 14 work, we
need the same quirk as for another Chromebook. This patch adds the
corresponding fixup entry.
Reported-by: Patrick <epictetus@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
For SKL, only the HDMI codec is in the display power well while the
HD-A controller isn't. So the codec flag 'link_power_control' is
set to request/release the display power via bus link_power ops.
For BXT, the power well design is the same as SKL, so the patch
should be applied to BXT too.
Signed-off-by: Lu, Han <han.lu@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Add HD Audio Device PCI ID for the Intel Broxton platform.
It is an HDA Intel PCH controller.
Signed-off-by: Lu, Han <han.lu@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The scpi_clock driver can be built-in when CONFIG_COMPILE_TEST
is set even when ARM_SCPI_PROTOCOL is a loadable module, and
that results in a link error:
drivers/built-in.o: In function `scpi_clocks_probe':
(.text+0x14453c): undefined reference to `get_scpi_ops'
Using #if IS_REACHABLE() around the get_scpi_ops() declaration
makes it build successfully in this case for compile-testing,
but the effect is the same as when ARM_SCPI_PROTOCOL is
disabled, as the code will not be used.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Merge "First fixes for 4.4" from Nicolas Ferre:
- removal of a useless defconfig option
- removal of some legacy DT pieces
- use of the proper watchdog compatible string
- addition of some sama5d2 Xplained nodes now that the MFD include is in place
- update of the MAINTAINERS entries for some Atmel drivers
* tag 'at91-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91:
MAINTAINERS: Atmel drivers: change NAND and ISI entries
ARM: at91/dt: sama5d2 Xplained: add several devices
ARM: at91/dt: remove bootargs
ARM: at91/dt: remove leftovers clock definition
ARM: at91/dt: replace gpio-key,wakeup with wakeup-source property
ARM: at91/dt: sama5d4: change watchdog compatible
ARM: at91/defconfig: remove CONFIG_SSB from Atmel defconfigs
The newly added zx power domain code causes build errors in
some configurations:
warning: (PM_RMOBILE && SOC_ZX296702) selects PM_GENERIC_DOMAINS which has unmet direct dependencies (PM)
warning: (ARCH_EXYNOS) selects EXYNOS_THERMAL which has unmet direct dependencies (THERMAL && (ARCH_EXYNOS || COMPILE_TEST) && THERMAL_OF)
power/domain.c: In function 'genpd_queue_power_off_work':
power/domain.c:192:13: error: 'pm_wq' undeclared (first use in this function)
queue_work(pm_wq, &genpd->power_off_work);
^
power/domain.c:192:13: note: each undeclared identifier is reported only once for each function it appears in
This ensures we don't try to enable it when CONFIG_PM is
disabled, mirroring what we do on most other platforms.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: f15107f412 ("ARM: zx: Add power domains for ZX296702")
Reviewed-by: Jun Nie <jun.nie@linaro.org>
The patches that were applied to add PWM lookup tables for legacy boards
were from v1 of the series instead of the revised v2 where the resulting
build errors had already been fixed.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Until now, VCPUs were always created sequentially with incrementing
VCPU ids. Therefore, the index in the VCPUs array matched the id.
As sequential creation might change with cpu hotplug, let's use
the correct lookup function to find a VCPU by id, not array index.
Let's also use kvm_lookup_vcpu() for validation of the sending VCPU
on external call injection.
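As an illustration only, the difference between indexing the array and looking up by id can be sketched in user-space C. The struct layout, MAX_VCPUS and lookup_vcpu_by_id() below are hypothetical stand-ins, not the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VCPUS 8 /* hypothetical limit for this sketch */

struct vcpu {
	int vcpu_id;	/* id chosen at creation time */
};

struct vm {
	struct vcpu *vcpus[MAX_VCPUS];	/* filled in creation order */
	int online_vcpus;
};

/*
 * Return the VCPU whose id matches, or NULL if there is none.
 * The array index is never assumed to equal the id.
 */
static struct vcpu *lookup_vcpu_by_id(struct vm *vm, int id)
{
	assert(vm != NULL);
	for (int i = 0; i < vm->online_vcpus; i++)
		if (vm->vcpus[i] && vm->vcpus[i]->vcpu_id == id)
			return vm->vcpus[i];
	return NULL;
}
```

With VCPUs created in the order id 5, id 2, vcpus[2] is empty while the lookup for id 2 returns the second entry.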
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # db27a7a KVM: Provide function for VCPU lookup by id
Commit 383d0b0501 ("KVM: s390: handle pending local interrupts via
bitmap") introduced a possible memory overwrite from user space.
User space could pass an invalid emergency signal code (sending VCPU)
and therefore exceed the bitmap. Let's take care of this case and
check that the id is in the valid range.
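A minimal user-space sketch of such a range check, assuming a hypothetical MAX_VCPUS limit and a simplified pending bitmap (the names are illustrative, not the kernel's):

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

#define MAX_VCPUS 64 /* hypothetical limit for this sketch */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_VCPUS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct local_irqs {
	unsigned long emerg_pending[BITMAP_WORDS]; /* one bit per sending VCPU */
};

/* Reject codes outside the bitmap before touching any memory. */
static int inject_emergency(struct local_irqs *li, unsigned int code)
{
	assert(li != NULL);
	if (code >= MAX_VCPUS)
		return -EINVAL;
	li->emerg_pending[code / BITS_PER_WORD] |= 1UL << (code % BITS_PER_WORD);
	return 0;
}
```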
Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v3.19+ db27a7a KVM: Provide function for VCPU lookup by id
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The "reg" entry in the "poweroff" section of "kirkwood-ts219.dtsi"
addressed the wrong uart (0 = console). This patch changes the address
to select uart 1, which is the uart connected to the pic
microcontroller, which can switch the device off.
Signed-off-by: Helmut Klein <hgkr.klein@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 4350a47bba ("ARM: Kirkwood: Make use of the QNAP Power off driver.")
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
The pfmf intercept handler should check if the EDAT 1 facility
is installed in the guest, not if it is installed in the host.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We should never allow enabling/disabling any facilities for the guest
when other VCPUs were already created.
kvm_arch_vcpu_(load|put) relies on SIMD not changing during runtime.
If somebody created and ran VCPUs and then decided to enable
SIMD, undefined behaviour would be possible (e.g. vector save area
not being set up).
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # 4.1+
Add the "init" and "sleep" pinctrl as the OTP gpio state.
We need the OTP pin to be in gpio state before resetting the TSADC controller,
since the tshut polarity will generate a high signal.
"init" pinctrl property is defined by Doug's Patch[0].
Patch[0]:
https://patchwork.kernel.org/patch/7454311/
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
The eMMC of the minnie Chromebook doesn't like our current method of
tuning and while there are solutions on the horizon, they still need
investigating. Other Chromebooks tune just fine with the emmc, so
simply disable tuning on Minnie for now.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Commit fd88d16c58 ("selftests/seccomp: Be more precise with
syscall arguments.") uses PAGE_SIZE directly, which leads to a build
failure on arm64.
Replace it with the generic interface (sysconf(_SC_PAGESIZE)) to fix this
failure.
Build and test successful on x86_64 and arm64.
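For reference, a small sketch of querying the page size portably from user space. The helper name page_size() is made up for this example; sysconf(_SC_PAGESIZE) is the POSIX interface named above:

```c
#include <assert.h>
#include <unistd.h>

/* Query the page size at run time instead of relying on a PAGE_SIZE macro. */
static long page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	assert(sz > 0);	/* sysconf() returns -1 on error */
	return sz;
}
```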
Signed-off-by: Bamvor Jian Zhang <bamvor.zhangjian@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Before this patch, we incorrectly enter the guest without requesting an
interrupt window if the IRQ chip is split between user space and the
kernel.
Because lapic_in_kernel no longer implies the PIC is in the kernel, this
patch tests pic_in_kernel to determine whether an interrupt window
should be requested when entering the guest.
If the APIC is in the kernel and we request an interrupt window the
guest will return immediately. If the APIC is masked the guest will not
make forward progress and unmask it, leading to a loop when KVM
reenters and requests again. This patch adds a check to ensure the APIC
is ready to accept an interrupt before requesting a window.
Reviewed-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Matt Gingell <gingell@google.com>
[Use the other newly introduced functions. - Paolo]
Fixes: 1c1a9ce973
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Set KVM_REQ_EVENT when a PIC in user space injects a local interrupt.
Currently a request is only made when neither the PIC nor the APIC is in
the kernel, which is not sufficient in the split IRQ chip case.
This addresses a problem in QEMU where interrupts are delayed until
another path invokes the event loop.
Reviewed-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Matt Gingell <gingell@google.com>
Fixes: 1c1a9ce973
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch breaks out a new function kvm_vcpu_ready_for_interrupt_injection.
This routine encapsulates the logic required to determine whether a vcpu
is ready to accept an interrupt injection, which is now required on
multiple paths.
Reviewed-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Matt Gingell <gingell@google.com>
Fixes: 1c1a9ce973
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch ensures that dm_request_for_irq_injection and
post_kvm_run_save are in sync, avoiding an endless ping-pong
between userspace (which correctly notices that IF=0) and
the kernel (which insists that userspace handle its request
for the interrupt window).
To synchronize them, it also adds checks for kvm_arch_interrupt_allowed
and !kvm_event_needs_reinjection. These are always needed, not
just for in-kernel LAPIC.
Signed-off-by: Matt Gingell <gingell@google.com>
[A collage of two patches from Matt. - Paolo]
Fixes: 1c1a9ce973
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ASID restoration on guest resume should determine the guest execution
mode based on the guest Status register rather than bit 30 of the guest
PC.
Fix the two places in locore.S that do this, loading the guest status
from the cop0 area. Note, this assembly is specific to the trap &
emulate implementation of KVM, so it doesn't need to check the
supervisor bit as that mode is not implemented in the guest.
Fixes: b680f70fc1 ("KVM/MIPS32: Entry point for trampolining to...")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.10.x-
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In multipath_prepare_ioctl(),
- pgpath is a path selected from available paths
- m->queue_io is true if we cannot send a request immediately to
paths, either because:
* there is no available path
* the path group needs activation (pg_init)
- pg_init is not started
- pg_init is still running
- m->queue_if_no_path is true if the device is configured to queue
I/O if there are no available paths
If !pgpath && !m->queue_if_no_path, the handler should return -EIO.
However, in the course of refactoring, the condition check was broken
and now returns success in that case. Since bdev points to the dm device
itself, dm_blk_ioctl() calls __blk_dev_driver_ioctl() for itself and
recurses until it crashes.
You could reproduce the problem like this:
# dmsetup create mp --table '0 1024 multipath 0 0 0 0'
# sg_inq /dev/mapper/mp
<crash>
[ 172.648615] BUG: unable to handle kernel paging request at fffffffc81b10268
[ 172.662843] PGD 19dd067 PUD 0
[ 172.666269] Thread overran stack, or stack corrupted
[ 172.671808] Oops: 0000 [#1] SMP
...
Fix the condition check with some clarifications.
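The intended decision can be sketched as a small pure function. This is a simplified user-space model: struct path_state and prepare_ioctl_result() are hypothetical, and using -ENOTCONN as the "not ready yet, retry" result is an assumption made for this sketch:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state described above. */
struct path_state {
	const void *pgpath;	/* selected path, NULL if none is available */
	bool queue_io;		/* pg_init not started or still running */
	bool queue_if_no_path;	/* configured to queue I/O without paths */
};

static int prepare_ioctl_result(const struct path_state *m)
{
	assert(m != NULL);
	if (m->pgpath) {
		if (!m->queue_io)
			return 0;	/* path is usable right away */
		return -ENOTCONN;	/* assumption: caller retries later */
	}
	if (m->queue_if_no_path)
		return -ENOTCONN;	/* no path, but queueing was requested */
	return -EIO;			/* no path and no queueing: fail */
}
```

The case that was broken is the last one: no path and no queueing must never report success.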
Fixes: e56f81e0b0 ("dm: refactor ioctl handling")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
(Ab)using the @bdev passed to dm_blk_ioctl() opens the potential for
targets' .prepare_ioctl to fail if they go on to check the bdev for
!NULL.
Fixes: e56f81e0b0 ("dm: refactor ioctl handling")
Reported-by: Junichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
dm-mpath retries ioctl when no path is readily available and the device
is configured to queue I/O in such a case. If you want to stop the retry
before multipathd decides to turn off queueing mode, you can send a
signal to the process to make it exit from the loop.
However, the check for a fatal signal was not carried along when commit
6c182cd88d ("dm mpath: fix ioctl deadlock when no paths") moved the
loop from dm-mpath to dm core. As a result, we can't terminate such
a process in the retry loop.
Easy reproducer of the situation is:
# dmsetup create mp --table '0 1024 multipath 0 0 0 0'
# dmsetup message mp 0 'queue_if_no_path'
# sg_inq /dev/mapper/mp
then you should be able to terminate sg_inq by pressing Ctrl+C.
Fixes: 6c182cd88d ("dm mpath: fix ioctl deadlock when no paths")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
SYNC in __switch_to() is a historic relic and not needed at all.
- In UP context it is obviously useless, why would we want to stall
the core for all updates to stack memory of t0 to complete before
loading kernel mode callee registers from t1 stack's memory.
- In SMP, there could be a potential race in which the outgoing task could
be concurrently picked for running on a different core, thus writes
to stack here need to be visible before the reads from stack on
other core. Peter confirmed that the generic scheduler already has the needed
barriers (by way of rq lock) so there is no need for additional arch
barrier.
This came up when Noam was trying to replace this SYNC with EZChip
specific hardware thread scheduling instruction for their platform
support.
Link: http://lkml.kernel.org/r/20151102092654.GM17308@twins.programming.kicks-ass.net
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Cc: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Currently blk_insert_flush() just adds flush request to q->queue_head
when flush is not required. That completely bypasses IO scheduler so
e.g. CFQ can be idling waiting for new request to arrive and will idle
through the whole window unnecessarily. Luckily this only happens in
rare cases as usually checks in generic_make_request_checks() clear
FLUSH and FUA flags early if they are not needed.
When no flushing is actually required, we can easily fix the problem by
properly queueing the request through the IO scheduler. Ideally IO
scheduler should be also made aware of requests queued via
blk_flush_queue_rq(). However inserting flush request through IO
scheduler can have unwanted side-effects since due to flush batching
delaying the flush request in IO scheduler will delay all flush requests
possibly coming from other processes. So we keep adding the request
directly to q->queue_head.
Signed-off-by: Jan Kara <jack@suse.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Add support for registering as a LightNVM device. This allows us to
evaluate the performance of the LightNVM subsystem.
In /drivers/Makefile, LightNVM is moved above block device drivers
to make sure that the LightNVM media managers have been initialized
before drivers under /drivers/block are initialized.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Fix by Jens Axboe to remove unneeded slab cache and the following
memory leak.
Signed-off-by: Jens Axboe <axboe@fb.com>
To make the intention clearer, use list_{first,prev,next}_entry
instead of list_entry.
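To show what the helper buys, here is a user-space re-creation of the relevant macros. These are simplified copies written for this example, not the kernel headers, and struct item is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Simplified user-space versions of the kernel macros. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)
#define list_first_entry(head, type, member) \
	list_entry((head)->next, type, member)

struct item {
	int value;
	struct list_head node;
};

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Reads as "first entry" instead of spelling out head->next by hand. */
static int first_value(struct list_head *head)
{
	assert(head->next != head);	/* list must not be empty */
	return list_first_entry(head, struct item, node)->value;
}
```

Both spellings resolve to the same element; the named helper only makes the intent visible at the call site.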
Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This prevents outstanding IOs from being sent for completion to the target
after the target has been removed. The flow is now: stop new IOs > cleanup
queue > remove target.
Signed-off-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The specification was updated to remove the double word just after
number of configuration groups and capabilities. Update the identify
structure to reflect it.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The ppa format was not copied from the NVMe specific ppa format to the
lightnvm specific ppa format. This led to the ppa format not being
communicated to the layers above.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The linear and device specific address modes can be replaced with a
simple offset and bit length conversion that is generic across all
devices.
This both simplifies the specification and removes the special case for
qemu nvme, that previously relied on the linear address mapping.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Both nvm_register and nvm_init do a kfree(dev) on error. Make sure
to only free it once.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
We register with nvm_devices when the registration can still fail.
Move the final registration to the end of the nvm_register function
to make sure we are fully registered when added to the nvm_devices list.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Only NAND flash with SLC and MLC is supported. Make sure to not try to
initialize TLC memory or other non-volatile memory types.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The nvm_id, nvm_id_group and nvm_addr_format data structures contain
reserved attributes. They are unused by media managers and targets.
Remove them.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The mccap field is required for I/O command option support. It defines the
following flash access modes:
* SLC mode
* Erase/Program Suspension
* Scramble On/Off
* Encryption
It is slotted in between mpos and cpar, changing the offset for
cpar as well.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
A single 8 bit and 16 bit reserve field were inserted in the
specification to align fields appropriately. Reflect this in the
identify group structure.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The specification was changed to reflect a multi-value bad block table.
Instead of bit-based bad block table, the bad block table now allows
eight bad block categories. Currently four are defined:
* Factory bad blocks
* Grown bad blocks
* Device-side reserved blocks
* Host-side reserved blocks
The factory and grown bad blocks are the regular bad blocks. The
reserved blocks are either for internal use or external use. In
particular, the device-side reserved blocks allow the host to
bootstrap from a limited number of flash blocks, reducing the flash
blocks to scan upon super block initialization.
Support for both get bad block table and set bad block table is added.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The max_phys_sect variable is defined as a char. We do a boundary check
to maximally allow 256 physical page descriptors per command. As we are
not indexing from zero, this expression is always false. Bump the
max_phys_sect to an unsigned int to support the range check.
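The effect can be reproduced in a few lines of user-space C. The function names and MAX_PHYS_SECT are hypothetical, and unsigned char stands in for the original char so the behaviour does not depend on char signedness:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PHYS_SECT 256 /* limit taken from the description above */

/* An 8-bit value can never exceed 256, so this check is always false. */
static bool exceeds_limit_narrow(unsigned char max_phys_sect)
{
	return max_phys_sect > MAX_PHYS_SECT;
}

/* With a wider type the same comparison becomes meaningful. */
static bool exceeds_limit_wide(unsigned int max_phys_sect)
{
	return max_phys_sect > MAX_PHYS_SECT;
}

static void range_check_demo(void)
{
	/* 300 is truncated to 44 when stored in 8 bits, hiding the overflow. */
	assert(!exceeds_limit_narrow((unsigned char)300));
	assert(exceeds_limit_wide(300));
}
```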
Signed-off-by: Matias Bjørling <m@bjorling.me>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Some systems register thermal zones by themselves and don't need to
have a thermal zones node in DT. Therefore reduce the log level from
ERROR to DEBUG when the thermal zones node can't be found in
of_thermal_destroy_zones().
Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Passing earlyprintk in the bootargs may crash the board as it depends on
having a sane DEBUG_UART_PHYS configured which is not always the case.
Also remove ignore_loglevel.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
The clocks group properties and the clock@0 node are useless; remove them
to avoid copy pasting in future device trees.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Though the keyboard driver for GPIO buttons(gpio-keys) will continue to
check for/support the legacy "gpio-key,wakeup" boolean property to
enable gpio buttons as wakeup source, "wakeup-source" is the new
standard binding.
This patch replaces the legacy "gpio-key,wakeup" with the unified
"wakeup-source" property in order to avoid any further copy-paste
duplication.
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Change the watchdog compatible to "atmel,sama5d4-wdt" to support
SAMA5D4 watchdog driver.
Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
A thin-pool that is in out-of-data-space (OODS) mode may transition back
to write mode -- without the admin adding more space to the thin-pool --
if/when blocks are released (either by deleting thin devices or
discarding provisioned blocks).
But as part of the thin-pool's earlier transition to out-of-data-space
mode the thin-pool may have set the 'error_if_no_space' flag to true if
the no_space_timeout expires without more space having been made
available. That implementation detail, of changing the pool's
error_if_no_space setting, needs to be reset back to the default that
the user specified when the thin-pool's table was loaded.
Otherwise we'll drop the user requested behaviour on the floor when this
out-of-data-space to write mode transition occurs.
Reported-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Fixes: 2c43fd26e4 ("dm thin: fix missing out-of-data-space to write mode transition if blocks are released")
Cc: stable@vger.kernel.org
Although kernel doesn't support the multiple IRQ priority levels provided
by HS38x core intc yet, ensure that the default prio value is used
anyways by relevant code.
SLEEP insn needs to be provided the IRQ priority level which can
interrupt it. This needs to be the default level, which may not
necessarily be 0 as assumed by current code.
This change allows a kernel with ARCV2_IRQ_DEF_PRIO = 1 to boot fine.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
One of the many faults of the QinHeng CH345 USB MIDI interface chip is
that it does not handle received SysEx messages correctly -- every second
event packet has a wrong code index number, which is the one from the last
seen message, instead of 4. For example, the two messages "FE F0 01 02 03
04 05 06 07 08 09 0A 0B 0C 0D 0E F7" result in the following event
packets:
correct: CH345:
0F FE 00 00 0F FE 00 00
04 F0 01 02 04 F0 01 02
04 03 04 05 0F 03 04 05
04 06 07 08 04 06 07 08
04 09 0A 0B 0F 09 0A 0B
04 0C 0D 0E 04 0C 0D 0E
05 F7 00 00 05 F7 00 00
A class-compliant driver must interpret an event packet with CIN 15 as
having a single data byte, so the other two bytes would be ignored. The
message received by the host would then be missing two bytes out of six;
in this example, "F0 01 02 03 06 07 08 09 0C 0D 0E F7".
These corrupted SysEx event packets contain only data bytes, while the
CH345 uses event packets with a correct CIN value only for messages with
a status byte, so it is possible to distinguish between these two cases by
checking for the presence of this status byte.
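A simplified sketch of that check, not the driver's actual code: ch345_effective_cin(), its parameters, and the choice to inspect only the first data byte are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the CIN that should be used for a 4-byte event packet.
 * While inside a SysEx message, a packet that repeats the CIN of the
 * last seen message but carries no status byte (bit 7 clear in its
 * first data byte) is treated as a SysEx continuation (CIN 4).
 */
static uint8_t ch345_effective_cin(const uint8_t packet[4],
				   uint8_t last_cin, bool in_sysex)
{
	uint8_t cin;

	assert(packet != NULL);
	cin = packet[0] & 0x0f;
	if (in_sysex && cin == last_cin && (packet[1] & 0x80) == 0)
		cin = 0x04;
	return cin;
}
```

Applied to the table above, "0F 03 04 05" is mapped back to CIN 4, while "0F FE 00 00" keeps CIN 15 because FE is a status byte.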
(Other bugs in the CH345's input handling, such as the corruption resulting
from running status, cannot be worked around.)
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Cc: stable@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The CH345 USB MIDI chip has two output ports. However, they are
multiplexed through one pin, and the number of ports cannot be reduced
even for hardware that implements only one connector, so for those
devices, data sent to either port ends up on the same hardware output.
This becomes a problem when both ports are used at the same time, as
longer MIDI commands (such as SysEx messages) are likely to be
interrupted by messages from the other port, and thus to get lost.
It would not be possible for the driver to detect how many ports the
device actually has, except that in practice, _all_ devices built with
the CH345 have only one port. So we can just ignore the device's
descriptors, and hardcode one output port.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Cc: stable@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When building the kernel with a buildroot-built toolchain, CROSS_COMPILE
currently needs adjustment, even if minor. This is because the defconfigs
prefer the "arc-linux-uclibc-" prefix from a hand-built (non-buildroot)
toolchain, while buildroot provides "arc-buildroot-linux-uclibc-".
To avoid this, use the common "arc-linux-" prefix, which is provided by
buildroot and has also been in hand-built tools for quite some time.
Signed-off-by: sujayraaj <sujayraaj@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: updated changelog]
Commit a471fcde8c ("ALSA: dice: fix detection of Weiss devices") adds
a quirk for Weiss models. According to users' reports, Loud models also
have a similar quirk. They have 0x10 in the category field.
This commit adds support for Mackie Onyx Blackbird and Onyx-i series.
As far as I know, Dice-based models produced by
Focusrite/Alesis/PreSonus/M-Audio/TC Electronic have the default value (0x04)
in their category field, thus it may be reasonable to add a condition
statement for Loud models, instead of removing the check of category value.
Reported-by: Rouge Etienne <erouge.externe@m6.fr>
Reported-by: Etilem <contact@etilem.net>
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The RK3368 SoCs support a 2-channel TS-ADC; the temperature criteria
of each channel are configurable.
The system has two Temperature Sensors, channel 0 is for CPU,
and channel 1 is for GPU.
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Temperature is currently represented as int, not long, in the thermal
framework. Use int instead of unsigned long/long to represent
temperature, to avoid bogus overheat detection when a negative
temperature is reported.
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The conversion table holds the adc value and temperature.
In fact, the adc values in the conversion table are only ever in
increment or decrement order.
For now, we can add the sort mode to better support *code_to_temp*
for different SoCs.
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
We should pass the conversion table in as a parameter, since different
SoCs have different conversion tables.
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
This is not needed anymore. Handling a potentially pending imprecise external
abort left behind by the bootloader is now done in a slightly safer way inside
the common ARM startup code.
With the recent changes to abort handling, this issue got fixed by 57df538085
("ARM: OMAP2+: Fix imprecise external abort caused by bogus SRAM init").
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
[tony@atomide.com: updated comments to describe what fixed the issue]
Signed-off-by: Tony Lindgren <tony@atomide.com>
The current driver registers two thermal sensors by default in probe,
while some SoCs may have only one thermal sensor.
In some cases, channel 0 is not always the cpu or gpu sensor.
So make the channel configurable for each sensor.
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
This patchset adds a new compatible for the thermal sensor found
on RK3368 SoCs.
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Some modules need more than one functional clock in order to be accessible,
like the McASPs found in the DRA7xx family.
This flag will indicate that the opt_clks need to be handled at the same
time as the main_clk for the given hwmod, ensuring that all needed clocks
are enabled before we try to access the module's address space.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The McASP node needs to list all mandatory clocks: gfclk and ahclkx.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Remove extra space between platform prefix and driver name in MODULE_ALIAS.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The cpuidle tracepoints are called within a rcu_idle_exit() section, and
must be denoted with the _rcuidle() version of the tracepoint.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2015-10-12 16:09:57 -07:00
245 changed files with 3348 additions and 2344 deletions