Compare commits

...

437 Commits

Author SHA1 Message Date
Linus Torvalds
cc4a41fe55 Linux 4.13-rc7 2017-08-27 17:20:40 -07:00
Linus Torvalds
2c25833c42 Merge tag 'iommu-fixes-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fix from Joerg Roedel:
 "Another fix, this time in common IOMMU sysfs code.

  In the conversion from the old iommu sysfs-code to the
  iommu_device_register interface, I missed to update the release path
  for the struct device associated with an IOMMU. It freed the 'struct
  device', which was a pointer before, but is now embedded in another
  struct.

  Freeing from the middle of allocated memory had all kinds of nasty
  side effects when an IOMMU was unplugged. Unfortunatly nobody
  unplugged and IOMMU until now, so this was not discovered earlier. The
  fix is to make the 'struct device' a pointer again"

* tag 'iommu-fixes-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu: Fix wrong freeing of iommu_device->dev
2017-08-27 17:10:34 -07:00
Linus Torvalds
80f73b2da0 Merge tag 'char-misc-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fix from Greg KH:
 "Here is a single misc driver fix for 4.13-rc7. It resolves a reported
  problem in the Android binder driver due to previous patches in
  4.13-rc.

  It's been in linux-next with no reported issues"

* tag 'char-misc-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  ANDROID: binder: fix proc->tsk check.
2017-08-27 17:08:37 -07:00
Linus Torvalds
c3c162635f Merge tag 'staging-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/iio fixes from Greg KH:
 "Here are few small staging driver fixes, and some more IIO driver
  fixes for 4.13-rc7. Nothing major, just resolutions for some reported
  problems.

  All of these have been in linux-next with no reported problems"

* tag 'staging-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio: magnetometer: st_magn: remove ihl property for LSM303AGR
  iio: magnetometer: st_magn: fix status register address for LSM303AGR
  iio: hid-sensor-trigger: Fix the race with user space powering up sensors
  iio: trigger: stm32-timer: fix get trigger mode
  iio: imu: adis16480: Fix acceleration scale factor for adis16480
  PATCH] iio: Fix some documentation warnings
  staging: rtl8188eu: add RNX-N150NUB support
  Revert "staging: fsl-mc: be consistent when checking strcmp() return"
  iio: adc: stm32: fix common clock rate
  iio: adc: ina219: Avoid underflow for sleeping time
  iio: trigger: stm32-timer: add enable attribute
  iio: trigger: stm32-timer: fix get/set down count direction
  iio: trigger: stm32-timer: fix write_raw return value
  iio: trigger: stm32-timer: fix quadrature mode get routine
  iio: bmp280: properly initialize device for humidity reading
2017-08-27 17:03:33 -07:00
Linus Torvalds
fff4e7a0e6 Merge tag 'ntb-4.13-bugfixes' of git://github.com/jonmason/ntb
Pull NTB fixes from Jon Mason:
 "NTB bug fixes to address an incorrect ntb_mw_count reference in the
  NTB transport, improperly bringing down the link if SPADs are
  corrupted, and an out-of-order issue regarding link negotiation and
  data passing"

* tag 'ntb-4.13-bugfixes' of git://github.com/jonmason/ntb:
  ntb: ntb_test: ensure the link is up before trying to configure the mws
  ntb: transport shouldn't disable link due to bogus values in SPADs
  ntb: use correct mw_count function in ntb_tool and ntb_transport
2017-08-27 17:01:54 -07:00
Linus Torvalds
a8b169afbf Avoid page waitqueue race leaving possible page locker waiting
The "lock_page_killable()" function waits for exclusive access to the
page lock bit using the WQ_FLAG_EXCLUSIVE bit in the waitqueue entry
set.

That means that if it gets woken up, other waiters may have been
skipped.

That, in turn, means that if it sees the page being unlocked, it *must*
take that lock and return success, even if a lethal signal is also
pending.

So instead of checking for lethal signals first, we need to check for
them after we've checked the actual bit that we were waiting for.  Even
if that might then delay the killing of the process.

This matches the order of the old "wait_on_bit_lock()" infrastructure
that the page locking used to use (and is still used in a few other
areas).

Note that if we still return an error after having unsuccessfully tried
to acquire the page lock, that is ok: that means that some other thread
was able to get ahead of us and lock the page, and when that other
thread then unlocks the page, the wakeup event will be repeated.  So any
other pending waiters will now get properly woken up.

Fixes: 6290602709 ("mm: add PageWaiters indicating tasks are waiting for a page bit")
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jan Kara <jack@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-27 16:25:09 -07:00
Linus Torvalds
3510ca20ec Minor page waitqueue cleanups
Tim Chen and Kan Liang have been battling a customer load that shows
extremely long page wakeup lists.  The cause seems to be constant NUMA
migration of a hot page that is shared across a lot of threads, but the
actual root cause for the exact behavior has not been found.

Tim has a patch that batches the wait list traversal at wakeup time, so
that we at least don't get long uninterruptible cases where we traverse
and wake up thousands of processes and get nasty latency spikes.  That
is likely 4.14 material, but we're still discussing the page waitqueue
specific parts of it.

In the meantime, I've tried to look at making the page wait queues less
expensive, and failing miserably.  If you have thousands of threads
waiting for the same page, it will be painful.  We'll need to try to
figure out the NUMA balancing issue some day, in addition to avoiding
the excessive spinlock hold times.

That said, having tried to rewrite the page wait queues, I can at least
fix up some of the braindamage in the current situation. In particular:

 (a) we don't want to continue walking the page wait list if the bit
     we're waiting for already got set again (which seems to be one of
     the patterns of the bad load).  That makes no progress and just
     causes pointless cache pollution chasing the pointers.

 (b) we don't want to put the non-locking waiters always on the front of
     the queue, and the locking waiters always on the back.  Not only is
     that unfair, it means that we wake up thousands of reading threads
     that will just end up being blocked by the writer later anyway.

Also add a comment about the layout of 'struct wait_page_key' - there is
an external user of it in the cachefiles code that means that it has to
match the layout of 'struct wait_bit_key' in the two first members.  It
so happens to match, because 'struct page *' and 'unsigned long *' end
up having the same values simply because the page flags are the first
member in struct page.

Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Christopher Lameter <cl@linux.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-27 13:55:12 -07:00
Linus Torvalds
0cc3b0ec23 Clarify (and fix) MAX_LFS_FILESIZE macros
We have a MAX_LFS_FILESIZE macro that is meant to be filled in by
filesystems (and other IO targets) that know they are 64-bit clean and
don't have any 32-bit limits in their IO path.

It turns out that our 32-bit value for that limit was bogus.  On 32-bit,
the VM layer is limited by the page cache to only 32-bit index values,
but our logic for that was confusing and actually wrong.  We used to
define that value to

	(((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1)

which is actually odd in several ways: it limits the index to 31 bits,
and then it limits files so that they can't have data in that last byte
of a page that has the highest 31-bit index (ie page index 0x7fffffff).

Neither of those limitations make sense.  The index is actually the full
32 bit unsigned value, and we can use that whole full page.  So the
maximum size of the file would logically be "PAGE_SIZE << BITS_PER_LONG".

However, we do wan tto avoid the maximum index, because we have code
that iterates over the page indexes, and we don't want that code to
overflow.  So the maximum size of a file on a 32-bit host should
actually be one page less than the full 32-bit index.

So the actual limit is ULONG_MAX << PAGE_SHIFT.  That means that we will
not actually be using the page of that last index (ULONG_MAX), but we
can grow a file up to that limit.

The wrong value of MAX_LFS_FILESIZE actually caused problems for Doug
Nazar, who was still using a 32-bit host, but with a 9.7TB 2 x RAID5
volume.  It turns out that our old MAX_LFS_FILESIZE was 8TiB (well, one
byte less), but the actual true VM limit is one page less than 16TiB.

This was invisible until commit c2a9737f45 ("vfs,mm: fix a dead loop
in truncate_inode_pages_range()"), which started applying that
MAX_LFS_FILESIZE limit to block devices too.

NOTE! On 64-bit, the page index isn't a limiter at all, and the limit is
actually just the offset type itself (loff_t), which is signed.  But for
clarity, on 64-bit, just use the maximum signed value, and don't make
people have to count the number of 'f' characters in the hex constant.

So just use LLONG_MAX for the 64-bit case.  That was what the value had
been before too, just written out as a hex constant.

Fixes: c2a9737f45 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()")
Reported-and-tested-by: Doug Nazar <nazard@nazar.ca>
Cc: Andreas Dilger <adilger@dilger.ca>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-27 12:12:25 -07:00
Linus Torvalds
bab9752480 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:

 - a tweak to the IBM Trackpoint driver that helps recognizing
   trackpoints on never Lenovo Carbons

 - a fix to the ALPS driver solving scroll issues on some Dells

 - yet another ACPI ID has been added to Elan I2C toucpad driver

 - quieted diagnostic message in soc_button_array driver

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad
  Input: soc_button_array - silence -ENOENT error on Dell XPS13 9365
  Input: trackpoint - add new trackpoint firmware ID
  Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
2017-08-26 12:48:29 -07:00
Linus Torvalds
9716bdb23e Merge tag 'pci-v4.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
 "Remove needlessly alarming MSI affinity warning (this is not actually
  a bug fix, but the warning prompts unnecessary bug reports)"

* tag 'pci-v4.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI/MSI: Don't warn when irq_create_affinity_masks() returns NULL
2017-08-26 12:46:14 -07:00
Linus Torvalds
c153e62105 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Two fixes: one for an ldt_struct handling bug and a cherry-picked
  objtool fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix use-after-free of ldt_struct
  objtool: Fix '-mtune=atom' decoding support in objtool 2.0
2017-08-26 09:06:28 -07:00
Linus Torvalds
0adb8f3d31 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
 "Fix a timer granularity handling race+bug, which would manifest itself
  by spuriously increasing timeouts of some timers (from 1 jiffy to ~500
  jiffies in the worst case measured) in certain nohz states"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timers: Fix excessive granularity of new timers after a nohz idle
2017-08-26 09:02:18 -07:00
Linus Torvalds
53ede64de3 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Ingo Molnar:
 "A single fix to not allow nonsensical event groups that result in
  kernel warnings"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix group {cpu,task} validation
2017-08-26 08:59:50 -07:00
Linus Torvalds
b3242dba9f Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "6 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/memblock.c: reversed logic in memblock_discard()
  fork: fix incorrect fput of ->exe_file causing use-after-free
  mm/madvise.c: fix freeing of locked page with MADV_FREE
  dax: fix deadlock due to misaligned PMD faults
  mm, shmem: fix handling /sys/kernel/mm/transparent_hugepage/shmem_enabled
  PM/hibernate: touch NMI watchdog when creating snapshot
2017-08-25 18:02:27 -07:00
Linus Torvalds
67a3b5cb33 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull Paolo Bonzini:
 "Bugfixes for x86, PPC and s390"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()
  KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.state
  KVM: x86: simplify handling of PKRU
  KVM: x86: block guest protection keys unless the host has them enabled
  KVM: PPC: Book3S HV: Add missing barriers to XIVE code and document them
  KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss
  KVM: PPC: Book3S HV: Use msgsync with hypervisor doorbells on POWER9
  KVM: s390: sthyi: fix specification exception detection
  KVM: s390: sthyi: fix sthyi inline assembly
2017-08-25 17:46:23 -07:00
Linus Torvalds
17e34c4fd0 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
 "Fixes two obvious bugs in virtio pci"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_pci: fix cpu affinity support
  virtio_blk: fix incorrect message when disk is resized
2017-08-25 17:40:03 -07:00
Linus Torvalds
42e6d5e5ee Merge tag 'powerpc-4.13-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
 "Just one fix, to add a barrier in the switch_mm() code to make sure
  the mm cpumask update is ordered vs the MMU starting to load
  translations. As far as we know no one's actually hit the bug, but
  that's just luck.

  Thanks to Benjamin Herrenschmidt, Nicholas Piggin"

* tag 'powerpc-4.13-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Ensure cpumask update is ordered
2017-08-25 17:32:35 -07:00
Linus Torvalds
105065c3f7 Merge tag 'nfsd-4.13-2' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
 "Two nfsd bugfixes, neither 4.13 regressions, but both potentially
  serious"

* tag 'nfsd-4.13-2' of git://linux-nfs.org/~bfields/linux:
  net: sunrpc: svcsock: fix NULL-pointer exception
  nfsd: Limit end of page list when decoding NFSv4 WRITE
2017-08-25 17:27:26 -07:00
Linus Torvalds
8c7932a32e Merge tag 'cifs-fixes-for-4.13-rc6-and-stable' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "Some bug fixes for stable for cifs"

* tag 'cifs-fixes-for-4.13-rc6-and-stable' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
  cifs: Fix df output for users with quota limits
2017-08-25 17:22:33 -07:00
Linus Torvalds
d580e80c7f Merge tag 'for-linus-20170825' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
 "Two fixes - one for a 4.13 regression, and the other for an older one:

   - Atmel NAND: since we started utilizing ONFI timings, we found that
     we were being too restrict at rejecting them, partly due to
     discrepancies in ONFI 4.0 and earlier versions. Relax the
     restriction to keep these platforms booting. This is a 4.13-rc1
     regression.

   - nandsim: repeated probe/removal may not work after a failed init,
     because we didn't free up our debugfs files properly on the failure
     path. This has been around since 3.8, but it's nice to get this
     fixed now in a nice easy patch that can target -stable, since
     there's already refactoring work (that also fixes the issue)
     targeted for the next merge window"

* tag 'for-linus-20170825' of git://git.infradead.org/linux-mtd:
  mtd: nand: atmel: Relax tADL_min constraint
  mtd: nandsim: remove debugfs entries in error path
2017-08-25 17:09:19 -07:00
Linus Torvalds
0b31c3ec1b Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A small batch of fixes that should be included for the 4.13 release.
  This contains:

   - Revert of the 4k loop blocksize support. Even with a recent batch
     of 4 fixes, we're still not really happy with it. Rather than be
     stuck with an API issue, let's revert it and get it right for 4.14.

   - Trivial patch from Bart, adding a few flags to the blk-mq debugfs
     exports that were added in this release, but not to the debugfs
     parts.

   - Regression fix for bsg, fixing a potential kernel panic. From
     Benjamin.

   - Tweak for the blk throttling, improving how we account discards.
     From Shaohua"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq-debugfs: Add names for recently added flags
  bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer
  Revert "loop: support 4k physical blocksize"
  blk-throttle: cap discard request size
2017-08-25 17:02:59 -07:00
Linus Torvalds
1f5de42da4 Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "I2C has some bugfixes for you: mainly Jarkko fixed up a few things in
  the designware driver regarding the new slave mode. But Ulf also fixed
  a long-standing and now agreed suspend problem. Plus, some simple
  stuff which nonetheless needs fixing"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: designware: Fix runtime PM for I2C slave mode
  i2c: designware: Remove needless pm_runtime_put_noidle() call
  i2c: aspeed: fixed potential null pointer dereference
  i2c: simtec: use release_mem_region instead of release_resource
  i2c: core: Make comment about I2C table requirement to reflect the code
  i2c: designware: Fix standard mode speed when configuring the slave mode
  i2c: designware: Fix oops from i2c_dw_irq_handler_slave
  i2c: designware: Fix system suspend
2017-08-25 16:59:38 -07:00
Christoph Hellwig
8e1101d251 PCI/MSI: Don't warn when irq_create_affinity_masks() returns NULL
irq_create_affinity_masks() can return NULL on non-SMP systems, when there
are not enough "free" vectors available to spread, or if memory allocation
for the CPU masks fails.  Only the allocation failure is of interest, and
even then the system will work just fine except for non-optimally spread
vectors.  Thus remove the warnings.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
2017-08-25 18:58:42 -05:00
Linus Torvalds
299c460876 Merge tag 'mmc-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fix from Ulf Hansson:
 "MMC core: don't return error code R1_OUT_OF_RANGE for open-ending mode"

* tag 'mmc-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: block: prevent propagating R1_OUT_OF_RANGE for open-ending mode
2017-08-25 16:57:53 -07:00
Linus Torvalds
8efeb3500c Merge tag 'sound-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "We're keeping in a good shape, this batch contains just a few small
  fixes (a regression fix for ASoC rt5677 codec, NULL dereference and
  error-path fixes in firewire, and a corner-case ioctl error fix for
  user TLV), as well as usual quirks for USB-audio and HD-audio"

* tag 'sound-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: rt5677: Reintroduce I2C device IDs
  ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)
  ALSA: core: Fix unexpected error at replacing user TLV
  ALSA: usb-audio: Add delay quirk for H650e/Jabra 550a USB headsets
  ALSA: firewire-motu: destroy stream data surely at failure of card initialization
  ALSA: firewire: fix NULL pointer dereference when releasing uninitialized data of iso-resource
2017-08-25 16:56:04 -07:00
Linus Torvalds
985e775573 Merge tag 'dmaengine-fix-4.13-rc7' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fix from Vinod Koul:
 "A single fix for tegra210-adma driver to check of_irq_get() error"

* tag 'dmaengine-fix-4.13-rc7' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: tegra210-adma: fix of_irq_get() error check
2017-08-25 16:43:08 -07:00
Linus Torvalds
9e15400180 Merge tag 'drm-fixes-for-v4.13-rc7' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Fixes for rc7, nothing too crazy, some core, i915, and sunxi fixes,
  Intel CI has been responsible for some of these fixes being required"

* tag 'drm-fixes-for-v4.13-rc7' of git://people.freedesktop.org/~airlied/linux:
  drm/i915/gvt: Fix the kernel null pointer error
  drm: Release driver tracking before making the object available again
  drm/i915: Clear lost context-switch interrupts across reset
  drm/i915/bxt: use NULL for GPIO connection ID
  drm/i915/cnl: Fix LSPCON support.
  drm/i915/vbt: ignore extraneous child devices for a port
  drm/i915: Initialize 'data' in intel_dsi_dcs_backlight.c
  drm/atomic: If the atomic check fails, return its value first
  drm/atomic: Handle -EDEADLK with out-fences correctly
  drm: Fix framebuffer leak
  drm/imx: ipuv3-plane: fix YUV framebuffer scanout on the base plane
  gpu: ipu-v3: add DRM dependency
  drm/rockchip: Fix suspend crash when drm is not bound
  drm/sun4i: Implement drm_driver lastclose to restore fbdev console
2017-08-25 16:39:51 -07:00
Pavel Tatashin
91b540f988 mm/memblock.c: reversed logic in memblock_discard()
In recently introduced memblock_discard() there is a reversed logic bug.
Memory is freed of static array instead of dynamically allocated one.

Link: http://lkml.kernel.org/r/1503511441-95478-2-git-send-email-pasha.tatashin@oracle.com
Fixes: 3010f87650 ("mm: discard memblock data later")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reported-by: Woody Suwalski <terraluna977@gmail.com>
Tested-by: Woody Suwalski <terraluna977@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-25 16:12:46 -07:00
Eric Biggers
2b7e8665b4 fork: fix incorrect fput of ->exe_file causing use-after-free
Commit 7c05126793 ("mm, fork: make dup_mmap wait for mmap_sem for
write killable") made it possible to kill a forking task while it is
waiting to acquire its ->mmap_sem for write, in dup_mmap().

However, it was overlooked that this introduced an new error path before
a reference is taken on the mm_struct's ->exe_file.  Since the
->exe_file of the new mm_struct was already set to the old ->exe_file by
the memcpy() in dup_mm(), it was possible for the mmput() in the error
path of dup_mm() to drop a reference to ->exe_file which was never
taken.

This caused the struct file to later be freed prematurely.

Fix it by updating mm_init() to NULL out the ->exe_file, in the same
place it clears other things like the list of mmaps.

This bug was found by syzkaller.  It can be reproduced using the
following C program:

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/syscall.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static void *mmap_thread(void *_arg)
    {
        for (;;) {
            mmap(NULL, 0x1000000, PROT_READ,
                 MAP_POPULATE|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
        }
    }

    static void *fork_thread(void *_arg)
    {
        usleep(rand() % 10000);
        fork();
    }

    int main(void)
    {
        fork();
        fork();
        fork();
        for (;;) {
            if (fork() == 0) {
                pthread_t t;

                pthread_create(&t, NULL, mmap_thread, NULL);
                pthread_create(&t, NULL, fork_thread, NULL);
                usleep(rand() % 10000);
                syscall(__NR_exit_group, 0);
            }
            wait(NULL);
        }
    }

No special kernel config options are needed.  It usually causes a NULL
pointer dereference in __remove_shared_vm_struct() during exit, or in
dup_mmap() (which is usually inlined into copy_process()) during fork.
Both are due to a vm_area_struct's ->vm_file being used after it's
already been freed.

Google Bug Id: 64772007

Link: http://lkml.kernel.org/r/20170823211408.31198-1-ebiggers3@gmail.com
Fixes: 7c05126793 ("mm, fork: make dup_mmap wait for mmap_sem for write killable")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>	[v4.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-25 16:12:46 -07:00
Eric Biggers
263630e8d1 mm/madvise.c: fix freeing of locked page with MADV_FREE
If madvise(..., MADV_FREE) split a transparent hugepage, it called
put_page() before unlock_page().

This was wrong because put_page() can free the page, e.g. if a
concurrent madvise(..., MADV_DONTNEED) has removed it from the memory
mapping. put_page() then rightfully complained about freeing a locked
page.

Fix this by moving the unlock_page() before put_page().

This bug was found by syzkaller, which encountered the following splat:

    BUG: Bad page state in process syzkaller412798  pfn:1bd800
    page:ffffea0006f60000 count:0 mapcount:0 mapping:          (null) index:0x20a00
    flags: 0x200000000040019(locked|uptodate|dirty|swapbacked)
    raw: 0200000000040019 0000000000000000 0000000000020a00 00000000ffffffff
    raw: ffffea0006f60020 ffffea0006f60020 0000000000000000 0000000000000000
    page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
    bad because of flags: 0x1(locked)
    Modules linked in:
    CPU: 1 PID: 3037 Comm: syzkaller412798 Not tainted 4.13.0-rc5+ #35
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:16 [inline]
     dump_stack+0x194/0x257 lib/dump_stack.c:52
     bad_page+0x230/0x2b0 mm/page_alloc.c:565
     free_pages_check_bad+0x1f0/0x2e0 mm/page_alloc.c:943
     free_pages_check mm/page_alloc.c:952 [inline]
     free_pages_prepare mm/page_alloc.c:1043 [inline]
     free_pcp_prepare mm/page_alloc.c:1068 [inline]
     free_hot_cold_page+0x8cf/0x12b0 mm/page_alloc.c:2584
     __put_single_page mm/swap.c:79 [inline]
     __put_page+0xfb/0x160 mm/swap.c:113
     put_page include/linux/mm.h:814 [inline]
     madvise_free_pte_range+0x137a/0x1ec0 mm/madvise.c:371
     walk_pmd_range mm/pagewalk.c:50 [inline]
     walk_pud_range mm/pagewalk.c:108 [inline]
     walk_p4d_range mm/pagewalk.c:134 [inline]
     walk_pgd_range mm/pagewalk.c:160 [inline]
     __walk_page_range+0xc3a/0x1450 mm/pagewalk.c:249
     walk_page_range+0x200/0x470 mm/pagewalk.c:326
     madvise_free_page_range.isra.9+0x17d/0x230 mm/madvise.c:444
     madvise_free_single_vma+0x353/0x580 mm/madvise.c:471
     madvise_dontneed_free mm/madvise.c:555 [inline]
     madvise_vma mm/madvise.c:664 [inline]
     SYSC_madvise mm/madvise.c:832 [inline]
     SyS_madvise+0x7d3/0x13c0 mm/madvise.c:760
     entry_SYSCALL_64_fastpath+0x1f/0xbe

Here is a C reproducer:

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define MADV_FREE	8
    #define PAGE_SIZE	4096

    static void *mapping;
    static const size_t mapping_size = 0x1000000;

    static void *madvise_thrproc(void *arg)
    {
        madvise(mapping, mapping_size, (long)arg);
    }

    int main(void)
    {
        pthread_t t[2];

        for (;;) {
            mapping = mmap(NULL, mapping_size, PROT_WRITE,
                           MAP_POPULATE|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);

            munmap(mapping + mapping_size / 2, PAGE_SIZE);

            pthread_create(&t[0], 0, madvise_thrproc, (void*)MADV_DONTNEED);
            pthread_create(&t[1], 0, madvise_thrproc, (void*)MADV_FREE);
            pthread_join(t[0], NULL);
            pthread_join(t[1], NULL);
            munmap(mapping, mapping_size);
        }
    }

Note: to see the splat, CONFIG_TRANSPARENT_HUGEPAGE=y and
CONFIG_DEBUG_VM=y are needed.

Google Bug Id: 64696096

Link: http://lkml.kernel.org/r/20170823205235.132061-1-ebiggers3@gmail.com
Fixes: 854e9ed09d ("mm: support madvise(MADV_FREE)")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[v4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-25 16:12:46 -07:00
Ross Zwisler
fffa281b48 dax: fix deadlock due to misaligned PMD faults
In DAX there are two separate places where the 2MiB range of a PMD is
defined.

The first is in the page tables, where a PMD mapping inserted for a
given address spans from (vmf->address & PMD_MASK) to ((vmf->address &
PMD_MASK) + PMD_SIZE - 1).  That is, from the 2MiB boundary below the
address to the 2MiB boundary above the address.

So, for example, a fault at address 3MiB (0x30 0000) falls within the
PMD that ranges from 2MiB (0x20 0000) to 4MiB (0x40 0000).

The second PMD range is in the mapping->page_tree, where a given file
offset is covered by a radix tree entry that spans from one 2MiB aligned
file offset to another 2MiB aligned file offset.

So, for example, the file offset for 3MiB (pgoff 768) falls within the
PMD range for the order 9 radix tree entry that ranges from 2MiB (pgoff
512) to 4MiB (pgoff 1024).

This system works so long as the addresses and file offsets for a given
mapping both have the same offsets relative to the start of each PMD.

Consider the case where the starting address for a given file isn't 2MiB
aligned - say our faulting address is 3 MiB (0x30 0000), but that
corresponds to the beginning of our file (pgoff 0).  Now all the PMDs in
the mapping are misaligned so that the 2MiB range defined in the page
tables never matches up with the 2MiB range defined in the radix tree.

The current code notices this case for DAX faults to storage with the
following test in dax_pmd_insert_mapping():

	if (pfn_t_to_pfn(pfn) & PG_PMD_COLOUR)
		goto unlock_fallback;

This test makes sure that the pfn we get from the driver is 2MiB
aligned, and relies on the assumption that the 2MiB alignment of the pfn
we get back from the driver matches the 2MiB alignment of the faulting
address.

However, faults to holes were not checked and we could hit the problem
described above.

This was reported in response to the NVML nvml/src/test/pmempool_sync
TEST5:

	$ cd nvml/src/test/pmempool_sync
	$ make TEST5

You can grab NVML here:

	https://github.com/pmem/nvml/

The dmesg warning you see when you hit this error is:

  WARNING: CPU: 13 PID: 2900 at fs/dax.c:641 dax_insert_mapping_entry+0x2df/0x310

Where we notice in dax_insert_mapping_entry() that the radix tree entry
we are about to replace doesn't match the locked entry that we had
previously inserted into the tree.  This happens because the initial
insertion was done in grab_mapping_entry() using a pgoff calculated from
the faulting address (vmf->address), and the replacement in
dax_pmd_load_hole() => dax_insert_mapping_entry() is done using
vmf->pgoff.

In our failure case those two page offsets (one calculated from
vmf->address, one using vmf->pgoff) point to different order 9 radix
tree entries.

This failure case can result in a deadlock because the radix tree unlock
also happens on the pgoff calculated from vmf->address.  This means that
the locked radix tree entry that we swapped in to the tree in
dax_insert_mapping_entry() using vmf->pgoff is never unlocked, so all
future faults to that 2MiB range will block forever.

Fix this by validating that the faulting address's PMD offset matches
the PMD offset from the start of the file.  This check is done at the
very beginning of the fault and covers faults that would have mapped to
storage as well as faults to holes.  I left the COLOUR check in
dax_pmd_insert_mapping() in place in case we ever hit the insanity
condition where the alignment of the pfn we get from the driver doesn't
match the alignment of the userspace address.

Link: http://lkml.kernel.org/r/20170822222436.18926-1-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: "Slusarz, Marcin" <marcin.slusarz@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-25 16:12:46 -07:00
Kirill A. Shutemov
435c0b87d6 mm, shmem: fix handling /sys/kernel/mm/transparent_hugepage/shmem_enabled
/sys/kernel/mm/transparent_hugepage/shmem_enabled controls if we want
to allocate huge pages when allocate pages for private in-kernel shmem
mount.

Unfortunately, as Dan noticed, I've screwed it up and the only way to
make kernel allocate huge page for the mount is to use "force" there.
All other values will be effectively ignored.

Link: http://lkml.kernel.org/r/20170822144254.66431-1-kirill.shutemov@linux.intel.com
Fixes: 5a6e75f811 ("shmem: prepare huge= mount option and sysfs knob")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-25 16:12:46 -07:00
Chen Yu
556b969a1c PM/hibernate: touch NMI watchdog when creating snapshot
There is a problem that when counting the pages for creating the
hibernation snapshot will take significant amount of time, especially on
system with large memory.  Since the counting job is performed with irq
disabled, this might lead to NMI lockup.  The following warning were
found on a system with 1.5TB DRAM:

  Freezing user space processes ... (elapsed 0.002 seconds) done.
  OOM killer disabled.
  PM: Preallocating image memory...
  NMI watchdog: Watchdog detected hard LOCKUP on cpu 27
  CPU: 27 PID: 3128 Comm: systemd-sleep Not tainted 4.13.0-0.rc2.git0.1.fc27.x86_64 #1
  task: ffff9f01971ac000 task.stack: ffffb1a3f325c000
  RIP: 0010:memory_bm_find_bit+0xf4/0x100
  Call Trace:
   swsusp_set_page_free+0x2b/0x30
   mark_free_pages+0x147/0x1c0
   count_data_pages+0x41/0xa0
   hibernate_preallocate_memory+0x80/0x450
   hibernation_snapshot+0x58/0x410
   hibernate+0x17c/0x310
   state_store+0xdf/0xf0
   kobj_attr_store+0xf/0x20
   sysfs_kf_write+0x37/0x40
   kernfs_fop_write+0x11c/0x1a0
   __vfs_write+0x37/0x170
   vfs_write+0xb1/0x1a0
   SyS_write+0x55/0xc0
   entry_SYSCALL_64_fastpath+0x1a/0xa5
  ...
  done (allocated 6590003 pages)
  PM: Allocated 26360012 kbytes in 19.89 seconds (1325.28 MB/s)

It has taken nearly 20 seconds(2.10GHz CPU) thus the NMI lockup was
triggered.  In case the timeout of the NMI watch dog has been set to 1
second, a safe interval should be 6590003/20 = 320k pages in theory.
However there might also be some platforms running at a lower frequency,
so feed the watchdog every 100k pages.

[yu.c.chen@intel.com: simplification]
  Link: http://lkml.kernel.org/r/1503460079-29721-1-git-send-email-yu.c.chen@intel.com
[yu.c.chen@intel.com: use interval of 128k instead of 100k to avoid modulus]
Link: http://lkml.kernel.org/r/1503328098-5120-1-git-send-email-yu.c.chen@intel.com
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reported-by: Jan Filipcewicz <jan.filipcewicz@intel.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-25 16:12:46 -07:00
Christoph Hellwig
ba74b6f7fc virtio_pci: fix cpu affinity support
Commit 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for
virtqueues"") removed the adjustment of the pre_vectors for the virtio
MSI-X vector allocation which was added in commit fb5e31d9 ("virtio:
allow drivers to request IRQ affinity when creating VQs"). This will
lead to an incorrect assignment of MSI-X vectors, and potential
deadlocks when offlining cpus.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Fixes: 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for virtqueues")
Reported-by: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-08-25 21:38:26 +03:00
Stefan Hajnoczi
1046d30490 virtio_blk: fix incorrect message when disk is resized
The message printed on disk resize is incorrect.  The following is
printed when resizing to 2 GiB:

  $ truncate -s 1G test.img
  $ qemu -device virtio-blk-pci,logical_block_size=4096,...
  (qemu) block_resize drive1 2G

  virtio_blk virtio0: new size: 4194304 4096-byte logical blocks (17.2 GB/16.0 GiB)

The virtio_blk capacity config field is in 512-byte sector units
regardless of logical_block_size as per the VIRTIO specification.
Therefore the message should read:

  virtio_blk virtio0: new size: 524288 4096-byte logical blocks (2.15 GB/2.0 GiB)

Note that this only affects the printed message.  Thankfully the actual
block device has the correct size because the block layer expects
capacity in sectors.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-08-25 21:37:15 +03:00
Bart Van Assche
22d538213e blk-mq-debugfs: Add names for recently added flags
The symbolic constants QUEUE_FLAG_SCSI_PASSTHROUGH, QUEUE_FLAG_QUIESCED
and REQ_NOWAIT are missing from blk-mq-debugfs.c. Add these to
blk-mq-debugfs.c such that these appear as names in debugfs instead of
as numbers.

Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25 08:07:44 -06:00
Paul Mackerras
47c5310a8d KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()
Nixiaoming pointed out that there is a memory leak in
kvm_vm_ioctl_create_spapr_tce() if the call to anon_inode_getfd()
fails; the memory allocated for the kvmppc_spapr_tce_table struct
is not freed, and nor are the pages allocated for the iommu
tables.  In addition, we have already incremented the process's
count of locked memory pages, and this doesn't get restored on
error.

David Hildenbrand pointed out that there is a race in that the
function checks early on that there is not already an entry in the
stt->iommu_tables list with the same LIOBN, but an entry with the
same LIOBN could get added between then and when the new entry is
added to the list.

This fixes all three problems.  To simplify things, we now call
anon_inode_getfd() before placing the new entry in the list.  The
check for an existing entry is done while holding the kvm->lock
mutex, immediately before adding the new entry to the list.
Finally, on failure we now call kvmppc_account_memlimit to
decrement the process's count of locked memory pages.

Reported-by: Nixiaoming <nixiaoming@huawei.com>
Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-25 11:08:57 +02:00
Mark Rutland
64aee2a965 perf/core: Fix group {cpu,task} validation
Regardless of which events form a group, it does not make sense for the
events to target different tasks and/or CPUs, as this leaves the group
inconsistent and impossible to schedule. The core perf code assumes that
these are consistent across (successfully intialised) groups.

Core perf code only verifies this when moving SW events into a HW
context. Thus, we can violate this requirement for pure SW groups and
pure HW groups, unless the relevant PMU driver happens to perform this
verification itself. These mismatched groups subsequently wreak havoc
elsewhere.

For example, we handle watchpoints as SW events, and reserve watchpoint
HW on a per-CPU basis at pmu::event_init() time to ensure that any event
that is initialised is guaranteed to have a slot at pmu::add() time.
However, the core code only checks the group leader's cpu filter (via
event_filter_match()), and can thus install follower events onto CPUs
violating thier (mismatched) CPU filters, potentially installing them
into a CPU without sufficient reserved slots.

This can be triggered with the below test case, resulting in warnings
from arch backends.

  #define _GNU_SOURCE
  #include <linux/hw_breakpoint.h>
  #include <linux/perf_event.h>
  #include <sched.h>
  #include <stdio.h>
  #include <sys/prctl.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
			   int group_fd, unsigned long flags)
  {
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
  }

  char watched_char;

  struct perf_event_attr wp_attr = {
	.type = PERF_TYPE_BREAKPOINT,
	.bp_type = HW_BREAKPOINT_RW,
	.bp_addr = (unsigned long)&watched_char,
	.bp_len = 1,
	.size = sizeof(wp_attr),
  };

  int main(int argc, char *argv[])
  {
	int leader, ret;
	cpu_set_t cpus;

	/*
	 * Force use of CPU0 to ensure our CPU0-bound events get scheduled.
	 */
	CPU_ZERO(&cpus);
	CPU_SET(0, &cpus);
	ret = sched_setaffinity(0, sizeof(cpus), &cpus);
	if (ret) {
		printf("Unable to set cpu affinity\n");
		return 1;
	}

	/* open leader event, bound to this task, CPU0 only */
	leader = perf_event_open(&wp_attr, 0, 0, -1, 0);
	if (leader < 0) {
		printf("Couldn't open leader: %d\n", leader);
		return 1;
	}

	/*
	 * Open a follower event that is bound to the same task, but a
	 * different CPU. This means that the group should never be possible to
	 * schedule.
	 */
	ret = perf_event_open(&wp_attr, 0, 1, leader, 0);
	if (ret < 0) {
		printf("Couldn't open mismatched follower: %d\n", ret);
		return 1;
	} else {
		printf("Opened leader/follower with mismastched CPUs\n");
	}

	/*
	 * Open as many independent events as we can, all bound to the same
	 * task, CPU0 only.
	 */
	do {
		ret = perf_event_open(&wp_attr, 0, 0, -1, 0);
	} while (ret >= 0);

	/*
	 * Force enable/disble all events to trigger the erronoeous
	 * installation of the follower event.
	 */
	printf("Opened all events. Toggling..\n");
	for (;;) {
		prctl(PR_TASK_PERF_EVENTS_DISABLE, 0, 0, 0, 0);
		prctl(PR_TASK_PERF_EVENTS_ENABLE, 0, 0, 0, 0);
	}

	return 0;
  }

Fix this by validating this requirement regardless of whether we're
moving events.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhou Chengming <zhouchengming1@huawei.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1498142498-15758-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25 11:00:34 +02:00
Eric Biggers
ccd5b32351 x86/mm: Fix use-after-free of ldt_struct
The following commit:

  39a0526fb3 ("x86/mm: Factor out LDT init from context init")

renamed init_new_context() to init_new_context_ldt() and added a new
init_new_context() which calls init_new_context_ldt().  However, the
error code of init_new_context_ldt() was ignored.  Consequently, if a
memory allocation in alloc_ldt_struct() failed during a fork(), the
->context.ldt of the new task remained the same as that of the old task
(due to the memcpy() in dup_mm()).  ldt_struct's are not intended to be
shared, so a use-after-free occurred after one task exited.

Fix the bug by making init_new_context() pass through the error code of
init_new_context_ldt().

This bug was found by syzkaller, which encountered the following splat:

    BUG: KASAN: use-after-free in free_ldt_struct.part.2+0x10a/0x150 arch/x86/kernel/ldt.c:116
    Read of size 4 at addr ffff88006d2cb7c8 by task kworker/u9:0/3710

    CPU: 1 PID: 3710 Comm: kworker/u9:0 Not tainted 4.13.0-rc4-next-20170811 #2
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:16 [inline]
     dump_stack+0x194/0x257 lib/dump_stack.c:52
     print_address_description+0x73/0x250 mm/kasan/report.c:252
     kasan_report_error mm/kasan/report.c:351 [inline]
     kasan_report+0x24e/0x340 mm/kasan/report.c:409
     __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429
     free_ldt_struct.part.2+0x10a/0x150 arch/x86/kernel/ldt.c:116
     free_ldt_struct arch/x86/kernel/ldt.c:173 [inline]
     destroy_context_ldt+0x60/0x80 arch/x86/kernel/ldt.c:171
     destroy_context arch/x86/include/asm/mmu_context.h:157 [inline]
     __mmdrop+0xe9/0x530 kernel/fork.c:889
     mmdrop include/linux/sched/mm.h:42 [inline]
     exec_mmap fs/exec.c:1061 [inline]
     flush_old_exec+0x173c/0x1ff0 fs/exec.c:1291
     load_elf_binary+0x81f/0x4ba0 fs/binfmt_elf.c:855
     search_binary_handler+0x142/0x6b0 fs/exec.c:1652
     exec_binprm fs/exec.c:1694 [inline]
     do_execveat_common.isra.33+0x1746/0x22e0 fs/exec.c:1816
     do_execve+0x31/0x40 fs/exec.c:1860
     call_usermodehelper_exec_async+0x457/0x8f0 kernel/umh.c:100
     ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431

    Allocated by task 3700:
     save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
     save_stack+0x43/0xd0 mm/kasan/kasan.c:447
     set_track mm/kasan/kasan.c:459 [inline]
     kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
     kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3627
     kmalloc include/linux/slab.h:493 [inline]
     alloc_ldt_struct+0x52/0x140 arch/x86/kernel/ldt.c:67
     write_ldt+0x7b7/0xab0 arch/x86/kernel/ldt.c:277
     sys_modify_ldt+0x1ef/0x240 arch/x86/kernel/ldt.c:307
     entry_SYSCALL_64_fastpath+0x1f/0xbe

    Freed by task 3700:
     save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
     save_stack+0x43/0xd0 mm/kasan/kasan.c:447
     set_track mm/kasan/kasan.c:459 [inline]
     kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
     __cache_free mm/slab.c:3503 [inline]
     kfree+0xca/0x250 mm/slab.c:3820
     free_ldt_struct.part.2+0xdd/0x150 arch/x86/kernel/ldt.c:121
     free_ldt_struct arch/x86/kernel/ldt.c:173 [inline]
     destroy_context_ldt+0x60/0x80 arch/x86/kernel/ldt.c:171
     destroy_context arch/x86/include/asm/mmu_context.h:157 [inline]
     __mmdrop+0xe9/0x530 kernel/fork.c:889
     mmdrop include/linux/sched/mm.h:42 [inline]
     __mmput kernel/fork.c:916 [inline]
     mmput+0x541/0x6e0 kernel/fork.c:927
     copy_process.part.36+0x22e1/0x4af0 kernel/fork.c:1931
     copy_process kernel/fork.c:1546 [inline]
     _do_fork+0x1ef/0xfb0 kernel/fork.c:2025
     SYSC_clone kernel/fork.c:2135 [inline]
     SyS_clone+0x37/0x50 kernel/fork.c:2129
     do_syscall_64+0x26c/0x8c0 arch/x86/entry/common.c:287
     return_from_SYSCALL_64+0x0/0x7a

Here is a C reproducer:

    #include <asm/ldt.h>
    #include <pthread.h>
    #include <signal.h>
    #include <stdlib.h>
    #include <sys/syscall.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static void *fork_thread(void *_arg)
    {
        fork();
    }

    int main(void)
    {
        struct user_desc desc = { .entry_number = 8191 };

        syscall(__NR_modify_ldt, 1, &desc, sizeof(desc));

        for (;;) {
            if (fork() == 0) {
                pthread_t t;

                srand(getpid());
                pthread_create(&t, NULL, fork_thread, NULL);
                usleep(rand() % 10000);
                syscall(__NR_exit_group, 0);
            }
            wait(NULL);
        }
    }

Note: the reproducer takes advantage of the fact that alloc_ldt_struct()
may use vmalloc() to allocate a large ->entries array, and after
commit:

  5d17a73a2e ("vmalloc: back off when the current task is killed")

it is possible for userspace to fail a task's vmalloc() by
sending a fatal signal, e.g. via exit_group().  It would be more
difficult to reproduce this bug on kernels without that commit.

This bug only affected kernels with CONFIG_MODIFY_LDT_SYSCALL=y.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org> [v4.6+]
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Fixes: 39a0526fb3 ("x86/mm: Factor out LDT init from context init")
Link: http://lkml.kernel.org/r/20170824175029.76040-1-ebiggers3@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25 09:55:52 +02:00
Paolo Bonzini
38cfd5e3df KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.state
The host pkru is restored right after vcpu exit (commit 1be0e61), so
KVM_GET_XSAVE will return the host PKRU value instead.  Fix this by
using the guest PKRU explicitly in fill_xsave and load_xsave.  This
part is based on a patch by Junkang Fu.

The host PKRU data may also not match the value in vcpu->arch.guest_fpu.state,
because it could have been changed by userspace since the last time
it was saved, so skip loading it in kvm_load_guest_fpu.

Reported-by: Junkang Fu <junkang.fjk@alibaba-inc.com>
Cc: Yang Zhang <zy107165@alibaba-inc.com>
Fixes: 1be0e61c1f
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-25 09:28:37 +02:00
Paolo Bonzini
b9dd21e104 KVM: x86: simplify handling of PKRU
Move it to struct kvm_arch_vcpu, replacing guest_pkru_valid with a
simple comparison against the host value of the register.  The write of
PKRU in addition can be skipped if the guest has not enabled the feature.
Once we do this, we need not test OSPKE in the host anymore, because
guest_CR4.PKE=1 implies host_CR4.PKE=1.

The static PKU test is kept to elide the code on older CPUs.

Suggested-by: Yang Zhang <zy107165@alibaba-inc.com>
Fixes: 1be0e61c1f
Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-25 09:28:28 +02:00
Paolo Bonzini
c469268cd5 KVM: x86: block guest protection keys unless the host has them enabled
If the host has protection keys disabled, we cannot read and write the
guest PKRU---RDPKRU and WRPKRU fail with #GP(0) if CR4.PKE=0.  Block
the PKU cpuid bit in that case.

This ensures that guest_CR4.PKE=1 implies host_CR4.PKE=1.

Fixes: 1be0e61c1f
Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-25 09:28:02 +02:00
Boris Brezillon
be3e83e347 mtd: nand: atmel: Relax tADL_min constraint
Version 4 of the ONFI spec mandates that tADL be at least 400 nanoseconds,
but, depending on the master clock rate, 400 ns may not fit in the tADL
field of the SMC reg. We need to relax the check and accept the -ERANGE
return code.

Note that previous versions of the ONFI spec had a lower tADL_min (100 or
200 ns). It's not clear why this timing constraint got increased but it
seems most NANDs are fine with values lower than 400ns, so we should be
safe.

Fixes: f9ce2eddf1 ("mtd: nand: atmel: Add ->setup_data_interface() hooks")
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Tested-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2017-08-24 20:59:50 -07:00
Uwe Kleine-König
b974696da1 mtd: nandsim: remove debugfs entries in error path
The debugfs entries must be removed before an error is returned in the
probe function. Otherwise another try to load the module fails and when
the debugfs files are accessed without the module loaded, the kernel
still tries to call a function in that module.

Fixes: 5346c27c5f ("mtd: nandsim: Introduce debugfs infrastructure")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2017-08-24 20:59:43 -07:00
Dave Airlie
da61197977 Merge tag 'drm-misc-fixes-2017-08-24' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
Core Changes:
- Release driver tracking before making the object available again (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>

* tag 'drm-misc-fixes-2017-08-24' of git://anongit.freedesktop.org/git/drm-misc:
  drm: Release driver tracking before making the object available again
2017-08-25 09:29:38 +10:00
Dave Airlie
4b5587c8c3 Merge tag 'drm-intel-fixes-2017-08-24' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
drm/i915 fixes for v4.13-rc7

* tag 'drm-intel-fixes-2017-08-24' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915/gvt: Fix the kernel null pointer error
  drm/i915: Clear lost context-switch interrupts across reset
  drm/i915/bxt: use NULL for GPIO connection ID
  drm/i915/cnl: Fix LSPCON support.
  drm/i915/vbt: ignore extraneous child devices for a port
  drm/i915: Initialize 'data' in intel_dsi_dcs_backlight.c
2017-08-25 09:29:06 +10:00
Masaki Ota
4a646580f7 Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad
Fixed the issue that two finger scroll does not work correctly
on V8 protocol. The cause is that V8 protocol X-coordinate decode
is wrong at SS4 PLUS device. I added SS4 PLUS X decode definition.

Mote notes:
the problem manifests itself by the commit e7348396c6 ("Input: ALPS
- fix V8+ protocol handling (73 03 28)"), where a fix for the V8+
protocol was applied.  Although the culprit must have been present
beforehand, the two-finger scroll worked casually even with the
wrongly reported values by some reason.  It got broken by the commit
above just because it changed x_max value, and this made libinput
correctly figuring the MT events.  Since the X coord is reported as
falsely doubled, the events on the right-half side go outside the
boundary, thus they are no longer handled.  This resulted as a broken
two-finger scroll.

One finger event is decoded differently, and it didn't suffer from
this problem.  The problem was only about MT events. --tiwai

Fixes: e7348396c6 ("Input: ALPS - fix V8+ protocol handling (73 03 28)")
Signed-off-by: Masaki Ota <masaki.ota@jp.alps.com>
Tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Paul Donohue <linux-kernel@PaulSD.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-08-24 15:53:46 -07:00
Linus Torvalds
90a6cd5039 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull more rdma fixes from Doug Ledford:
 "Well, I thought we were going to be done for this -rc cycle. I should
  have known better than to say so though.

  We have four additional items that trickled in.

  One was a simple mistake on my part. I took a patch into my for-next
  thinking that the issue was less severe than it was. I was then
  notified that it needed to be in my -rc area instead.

  The other three were just found late in testing.

  Summary:

   - One core fix accidentally applied first to for-next and then cherry
     picked back because it needed to be in the -rc cycles instead

   - Another core fix

   - Two mlx5 fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/mlx5: Always return success for RoCE modify port
  IB/mlx5: Fix Raw Packet QP event handler assignment
  IB/core: Avoid accessing non-allocated memory when inferring port type
  RDMA/uverbs: Initialize cq_context appropriately
2017-08-24 15:48:38 -07:00
Vadim Lomovtsev
eebe53e87f net: sunrpc: svcsock: fix NULL-pointer exception
While running nfs/connectathon tests kernel NULL-pointer exception
has been observed due to races in svcsock.c.

Race is appear when kernel accepts connection by kernel_accept
(which creates new socket) and start queuing ingress packets
to new socket. This happens in ksoftirq context which could run
concurrently on a different core while new socket setup is not done yet.

The fix is to re-order socket user data init sequence and add
write/read barrier calls to be sure that we got proper values
for callback pointers before actually calling them.

Test results: nfs/connectathon reports '0' failed tests for about 200+ iterations.

Crash log:
---<-snip->---
[ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 6708.647093] pgd = ffff0000094e0000
[ 6708.650497] [00000000] *pgd=0000010ffff90003, *pud=0000010ffff90003, *pmd=0000010ffff80003, *pte=0000000000000000
[ 6708.660761] Internal error: Oops: 86000005 [#1] SMP
[ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
[ 6708.736446]  ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_3c300909c5b3f46dcacd49aab3334af_87021]
[ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: G        W  OE   4.11.0-4.el7.aarch64 #1
[ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 2017
[ 6708.767910] task: ffff810006842e80 task.stack: ffff81000689c000
[ 6708.773822] PC is at 0x0
[ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc]
[ 6708.781866] pc : [<0000000000000000>] lr : [<ffff0000029d7378>] pstate: 60000145
[ 6708.789248] sp : ffff810ffbad3900
[ 6708.792551] x29: ffff810ffbad3900 x28: ffff000008c73d58
[ 6708.797853] x27: 0000000000000000 x26: ffff81000bbe1e00
[ 6708.803156] x25: 0000000000000020 x24: ffff800f7410bf28
[ 6708.808458] x23: ffff000008c63000 x22: ffff000008c63000
[ 6708.813760] x21: ffff800f7410bf28 x20: ffff81000bbe1e00
[ 6708.819063] x19: ffff810012412400 x18: 00000000d82a9df2
[ 6708.824365] x17: 0000000000000000 x16: 0000000000000000
[ 6708.829667] x15: 0000000000000000 x14: 0000000000000001
[ 6708.834969] x13: 0000000000000000 x12: 722e736f622e676e
[ 6708.840271] x11: 00000000f814dd99 x10: 0000000000000000
[ 6708.845573] x9 : 7374687225000000 x8 : 0000000000000000
[ 6708.850875] x7 : 0000000000000000 x6 : 0000000000000000
[ 6708.856177] x5 : 0000000000000028 x4 : 0000000000000000
[ 6708.861479] x3 : 0000000000000000 x2 : 00000000e5000000
[ 6708.866781] x1 : 0000000000000000 x0 : ffff81000bbe1e00
[ 6708.872084]
[ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0xffff81000689c000)
[ 6708.880341] Stack: (0xffff810ffbad3900 to 0xffff8100068a0000)
[ 6708.886075] Call trace:
[ 6708.888513] Exception stack(0xffff810ffbad3710 to 0xffff810ffbad3840)
[ 6708.894942] 3700:                                   ffff810012412400 0001000000000000
[ 6708.902759] 3720: ffff810ffbad3900 0000000000000000 0000000060000145 ffff800f79300000
[ 6708.910577] 3740: ffff000009274d00 00000000000003ea 0000000000000015 ffff000008c63000
[ 6708.918395] 3760: ffff810ffbad3830 ffff800f79300000 000000000000004d 0000000000000000
[ 6708.926212] 3780: ffff810ffbad3890 ffff0000080f88dc ffff800f79300000 000000000000004d
[ 6708.934030] 37a0: ffff800f7930093c ffff000008c63000 0000000000000000 0000000000000140
[ 6708.941848] 37c0: ffff000008c2c000 0000000000040b00 ffff81000bbe1e00 0000000000000000
[ 6708.949665] 37e0: 00000000e5000000 0000000000000000 0000000000000000 0000000000000028
[ 6708.957483] 3800: 0000000000000000 0000000000000000 0000000000000000 7374687225000000
[ 6708.965300] 3820: 0000000000000000 00000000f814dd99 722e736f622e676e 0000000000000000
[ 6708.973117] [<          (null)>]           (null)
[ 6708.977824] [<ffff0000086f9fa4>] tcp_data_queue+0x754/0xc5c
[ 6708.983386] [<ffff0000086fa64c>] tcp_rcv_established+0x1a0/0x67c
[ 6708.989384] [<ffff000008704120>] tcp_v4_do_rcv+0x15c/0x22c
[ 6708.994858] [<ffff000008707418>] tcp_v4_rcv+0xaf0/0xb58
[ 6709.000077] [<ffff0000086df784>] ip_local_deliver_finish+0x10c/0x254
[ 6709.006419] [<ffff0000086dfea4>] ip_local_deliver+0xf0/0xfc
[ 6709.011980] [<ffff0000086dfad4>] ip_rcv_finish+0x208/0x3a4
[ 6709.017454] [<ffff0000086e018c>] ip_rcv+0x2dc/0x3c8
[ 6709.022328] [<ffff000008692fc8>] __netif_receive_skb_core+0x2f8/0xa0c
[ 6709.028758] [<ffff000008696068>] __netif_receive_skb+0x38/0x84
[ 6709.034580] [<ffff00000869611c>] netif_receive_skb_internal+0x68/0xdc
[ 6709.041010] [<ffff000008696bc0>] napi_gro_receive+0xcc/0x1a8
[ 6709.046690] [<ffff0000014b0fc4>] nicvf_cq_intr_handler+0x59c/0x730 [nicvf]
[ 6709.053559] [<ffff0000014b1380>] nicvf_poll+0x38/0xb8 [nicvf]
[ 6709.059295] [<ffff000008697a6c>] net_rx_action+0x2f8/0x464
[ 6709.064771] [<ffff000008081824>] __do_softirq+0x11c/0x308
[ 6709.070164] [<ffff0000080d14e4>] irq_exit+0x12c/0x174
[ 6709.075206] [<ffff00000813101c>] __handle_domain_irq+0x78/0xc4
[ 6709.081027] [<ffff000008081608>] gic_handle_irq+0x94/0x190
[ 6709.086501] Exception stack(0xffff81000689fdf0 to 0xffff81000689ff20)
[ 6709.092929] fde0:                                   0000810ff2ec0000 ffff000008c10000
[ 6709.100747] fe00: ffff000008c70ef4 0000000000000001 0000000000000000 ffff810ffbad9b18
[ 6709.108565] fe20: ffff810ffbad9c70 ffff8100169d3800 ffff810006843ab0 ffff81000689fe80
[ 6709.116382] fe40: 0000000000000bd0 0000ffffdf979cd0 183f5913da192500 0000ffff8a254ce4
[ 6709.124200] fe60: 0000ffff8a254b78 0000aaab10339808 0000000000000000 0000ffff8a0c2a50
[ 6709.132018] fe80: 0000ffffdf979b10 ffff000008d6d450 ffff000008c10000 ffff000008d6d000
[ 6709.139836] fea0: 0000000000000054 ffff000008cd3dbc 0000000000000000 0000000000000000
[ 6709.147653] fec0: 0000000000000000 0000000000000000 0000000000000000 ffff81000689ff20
[ 6709.155471] fee0: ffff000008085240 ffff81000689ff20 ffff000008085244 0000000060000145
[ 6709.163289] ff00: ffff81000689ff10 ffff00000813f1e4 ffffffffffffffff ffff00000813f238
[ 6709.171107] [<ffff000008082eb4>] el1_irq+0xb4/0x140
[ 6709.175976] [<ffff000008085244>] arch_cpu_idle+0x44/0x11c
[ 6709.181368] [<ffff0000087bf3b8>] default_idle_call+0x20/0x30
[ 6709.187020] [<ffff000008116d50>] do_idle+0x158/0x1e4
[ 6709.191973] [<ffff000008116ff4>] cpu_startup_entry+0x2c/0x30
[ 6709.197624] [<ffff00000808e7cc>] secondary_start_kernel+0x13c/0x160
[ 6709.203878] [<0000000001bc71c4>] 0x1bc71c4
[ 6709.207967] Code: bad PC value
[ 6709.211061] SMP: stopping secondary CPUs
[ 6709.218830] Starting crashdump kernel...
[ 6709.222749] Bye!
---<-snip>---

Signed-off-by: Vadim Lomovtsev <vlomovts@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-08-24 18:11:28 -04:00
Chuck Lever
fc788f64f1 nfsd: Limit end of page list when decoding NFSv4 WRITE
When processing an NFSv4 WRITE operation, argp->end should never
point past the end of the data in the final page of the page list.
Otherwise, nfsd4_decode_compound can walk into uninitialized memory.

More critical, nfsd4_decode_write is failing to increment argp->pagelen
when it increments argp->pagelist.  This can cause later xdr decoders
to assume more data is available than really is, which can cause server
crashes on malformed requests.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-08-24 18:05:30 -04:00
Linus Torvalds
4898b99c26 Merge tag 'acpi-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "These fix two recent regressions (in ACPICA and in the ACPI EC driver)
  and one bug in code introduced during the 4.12 cycle (ACPI device
  properties library routine).

  Specifics:

   - Fix a regression in the ACPI EC driver causing a kernel to crash
     during initialization on some systems due to a code ordering issue
     exposed by a recent change (Lv Zheng).

   - Fix a recent regression in ACPICA due to a change of the behavior
     of a library function in a way that is not backwards compatible
     with some existing callers of it (Rafael Wysocki).

   - Fix a coding mistake in a library function related to the handling
     of ACPI device properties introduced during the 4.12 cycle (Sakari
     Ailus)"

* tag 'acpi-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: device property: Fix node lookup in acpi_graph_get_child_prop_value()
  ACPICA: Fix acpi_evaluate_object_typed()
  ACPI: EC: Fix regression related to wrong ECDT initialization order
2017-08-24 14:56:20 -07:00
Linus Torvalds
f7bbf0754b Merge tag 'kbuild-fixes-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - fix linker script regression caused by dead code elimination support

 - fix typos and outdated comments

 - specify kselftest-clean as a PHONY target

 - fix "make dtbs_install" when $(srctree) includes shell special
   characters like '~'

 - Move -fshort-wchar to the global option list because defining it
   partially emits warnings

* tag 'kbuild-fixes-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: update comments of Makefile.asm-generic
  kbuild: Do not use hyphen in exported variable name
  Makefile: add kselftest-clean to PHONY target list
  Kbuild: use -fshort-wchar globally
  fixdep: trivial: typo fix and correction
  kbuild: trivial cleanups on the comments
  kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured
2017-08-24 14:22:27 -07:00
Linus Torvalds
b71a5e3fe8 Merge branch 'for-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
 "We have one more fixup that stems from the blk_status_t conversion
  that did not quite cover everything.

  The normal cases were not affected because the code is 0, but any
  error and retries could mix up new and old values"

* 'for-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Btrfs: fix blk_status_t/errno confusion
2017-08-24 14:10:31 -07:00
Linus Torvalds
415be6c256 Merge tag 'trace-v4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
 "Various bug fixes:

   - Two small memory leaks in error paths.

   - A missed return error code on an error path.

   - A fix to check the tracing ring buffer CPU when it doesn't exist
     (caused by setting maxcpus on the command line that is less than
     the actual number of CPUs, and then onlining them manually).

   - A fix to have the reset of boot tracers called by lateinit_sync()
     instead of just lateinit(). As some of the tracers register via
     lateinit(), and if the clear happens before the tracer is
     registered, it will never start even though it was told to via the
     kernel command line"

* tag 'trace-v4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix freeing of filter in create_filter() when set_str is false
  tracing: Fix kmemleak in tracing_map_array_free()
  ftrace: Check for null ret_stack on profile function graph entry function
  ring-buffer: Have ring_buffer_alloc_read_page() return error on offline CPU
  tracing: Missing error code in tracer_alloc_buffers()
  tracing: Call clear_boot_tracer() at lateinit_sync
2017-08-24 14:08:22 -07:00
Linus Torvalds
1cffe5955f Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
 "A small number of bugfixes, again nothing serious.

   - Alexander Dahl found multiple bugs in the Atmel memory interface
     driver

   - A randconfig build fix for at91 was incomplete, the second attempt
     fixes the remaining corner case

   - One fix for the TI Keystone queue handler

   - The Odroid XU4 HDMI port (added in 4.13) needs a small DT fix"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: dts: exynos: add needs-hpd for Odroid-XU3/4
  ARM: at91: don't select CONFIG_ARM_CPU_SUSPEND for old platforms
  soc: ti: knav: Add a NULL pointer check for kdev in knav_pool_create
  memory: atmel-ebi: Fix smc cycle xlate converter
  memory: atmel-ebi: Allow t_DF timings of zero ns
  memory: atmel-ebi: Fix smc timing return value evaluation
2017-08-24 14:01:18 -07:00
Eric W. Biederman
311fc65c9f pty: Repair TIOCGPTPEER
The implementation of TIOCGPTPEER has two issues.

When /dev/ptmx (as opposed to /dev/pts/ptmx) is opened the wrong
vfsmount is passed to dentry_open.  Which results in the kernel displaying
the wrong pathname for the peer.

The second is simply by caching the vfsmount and dentry of the peer it leaves
them open, in a way they were not previously Which because of the inreased
reference counts can cause unnecessary behaviour differences resulting in
regressions.

To fix these move the ioctl into tty_io.c at a generic level allowing
the ioctl to have access to the struct file on which the ioctl is
being called.  This allows the path of the slave to be derived when
opening the slave through TIOCGPTPEER instead of requiring the path to
the slave be cached.  Thus removing the need for caching the path.

A new function devpts_ptmx_path is factored out of devpts_acquire and
used to implement a function devpts_mntget.   The new function devpts_mntget
takes a filp to perform the lookup on and fsi so that it can confirm
that the superblock that is found by devpts_ptmx_path is the proper superblock.

v2: Lots of fixes to make the code actually work
v3: Suggestions by Linus
    - Removed the unnecessary initialization of filp in ptm_open_peer
    - Simplified devpts_ptmx_path as gotos are no longer required

[ This is the fix for the issue that was reverted in commit
  143c97cc65, but this time without breaking 'pbuilder' due to
  increased reference counts   - Linus ]

Fixes: 54ebbfb160 ("tty: add TIOCGPTPEER ioctl")
Reported-by: Christian Brauner <christian.brauner@canonical.com>
Reported-and-tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-24 13:23:03 -07:00
Rafael J. Wysocki
d5d6c1ddb9 Merge branches 'acpica-fix', 'acpi-ec-fix' and 'acpi-properties-fix'
* acpica-fix:
  ACPICA: Fix acpi_evaluate_object_typed()

* acpi-ec-fix:
  ACPI: EC: Fix regression related to wrong ECDT initialization order

* acpi-properties-fix:
  ACPI: device property: Fix node lookup in acpi_graph_get_child_prop_value()
2017-08-24 22:22:53 +02:00
Majd Dibbiny
ec2558796d IB/mlx5: Always return success for RoCE modify port
CM layer calls ib_modify_port() regardless of the link layer.

For the Ethernet ports, qkey violation and Port capabilities
are meaningless. Therefore, always return success for ib_modify_port
calls on the Ethernet ports.

Cc: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-24 15:33:33 -04:00
Majd Dibbiny
1d31e9c09f IB/mlx5: Fix Raw Packet QP event handler assignment
In case we have SQ and RQ for Raw Packet QP, the SQ's event handler
wasn't assigned.

Fixing this by assigning event handler for each WQ after creation.

[ 1877.145243] Call Trace:
[ 1877.148644] <IRQ>
[ 1877.150580] [<ffffffffa07987c5>] ? mlx5_rsc_event+0x105/0x210 [mlx5_core]
[ 1877.159581] [<ffffffffa0795bd7>] ? mlx5_cq_event+0x57/0xd0 [mlx5_core]
[ 1877.167137] [<ffffffffa079208e>] mlx5_eq_int+0x53e/0x6c0 [mlx5_core]
[ 1877.174526] [<ffffffff8101a679>] ? sched_clock+0x9/0x10
[ 1877.180753] [<ffffffff810f717e>] handle_irq_event_percpu+0x3e/0x1e0
[ 1877.188014] [<ffffffff810f735d>] handle_irq_event+0x3d/0x60
[ 1877.194567] [<ffffffff810f9fe7>] handle_edge_irq+0x77/0x130
[ 1877.201129] [<ffffffff81014c3f>] handle_irq+0xbf/0x150
[ 1877.207244] [<ffffffff815ed78a>] ? atomic_notifier_call_chain+0x1a/0x20
[ 1877.214829] [<ffffffff815f434f>] do_IRQ+0x4f/0xf0
[ 1877.220498] [<ffffffff815e94ad>] common_interrupt+0x6d/0x6d
[ 1877.227025] <EOI>
[ 1877.228967] [<ffffffff814834e2>] ? cpuidle_enter_state+0x52/0xc0
[ 1877.236990] [<ffffffff81483615>] cpuidle_idle_call+0xc5/0x200
[ 1877.243676] [<ffffffff8101bc7e>] arch_cpu_idle+0xe/0x30
[ 1877.249831] [<ffffffff810b4725>] cpu_startup_entry+0xf5/0x290
[ 1877.256513] [<ffffffff815cfee1>] start_secondary+0x265/0x27b
[ 1877.263111] Code: Bad RIP value.
[ 1877.267296] RIP [< (null)>] (null)
[ 1877.273264] RSP <ffff88046fd63df8>
[ 1877.277531] CR2: 0000000000000000

Fixes: 19098df2da ("IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-24 15:33:33 -04:00
Noa Osherovich
498ca3c82a IB/core: Avoid accessing non-allocated memory when inferring port type
Commit 44c58487d5 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
introduced the concept of type in ah_attr:
 * During ib_register_device, each port is checked for its type which
   is stored in ib_device's port_immutable array.
 * During uverbs' modify_qp, the type is inferred using the port number
   in ib_uverbs_qp_dest struct (address vector) by accessing the
   relevant port_immutable array and the type is passed on to
   providers.

IB spec (version 1.3) enforces a valid port value only in Reset to
Init. During Init to RTR, the address vector must be valid but port
number is not mentioned as a field in the address vector, so its
value is not validated, which leads to accesses to a non-allocated
memory when inferring the port type.

Save the real port number in ib_qp during modify to Init (when the
comp_mask indicates that the port number is valid) and use this value
to infer the port type.

Avoid copying the address vector fields if the matching bit is not set
in the attr_mask. Address vector can't be modified before the port, so
no valid flow is affected.

Fixes: 44c58487d5 ('IB/core: Define 'ib' and 'roce' rdma_ah_attr types')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-24 15:33:33 -04:00
Tom Rini
9ce76511b6 ASoC: rt5677: Reintroduce I2C device IDs
Not all devices with ACPI and this combination of sound devices will
have the required information provided via ACPI.  Reintroduce the I2C
device ID to restore sound functionality on on the Chromebook 'Samus'
model.

[ More background note:
 the commit a36afb0ab6 ("ASoC: rt5677: Introduce proper table...")
 moved the i2c ID probed via ACPI ("RT5677CE:00") to a proper
 acpi_device_id table.  Although the action itself is correct per se,
 the overseen issue is the reference id->driver_data at
 rt5677_i2c_probe() for retrieving the corresponding chip model for
 the given id.  Since id=NULL is passed for ACPI matching case, we get
 an Oops now.

 We already have queued more fixes for 4.14 and they already address
 the issue, but they are bigger changes that aren't preferable for the
 late 4.13-rc stage.  So, this patch just papers over the bug as a
 once-off quick fix for a particular ACPI matching.  -- tiwai ]

Fixes: a36afb0ab6 ("ASoC: rt5677: Introduce proper table for ACPI enumeration")
Signed-off-by: Tom Rini <trini@konsulko.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-24 18:04:29 +02:00
Omar Sandoval
58efbc9f54 Btrfs: fix blk_status_t/errno confusion
This fixes several instances of blk_status_t and bare errno ints being
mixed up, some of which are real bugs.

In the normal case, 0 matches BLK_STS_OK, so we don't observe any
effects of the missing conversion, but in case of errors or passes
through the repair/retry paths, the errors get mixed up.

The changes were identified using 'sparse', we don't have reports of the
buggy behaviour.

Fixes: 4e4cbee93d ("block: switch bios to blk_status_t")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-08-24 17:19:02 +02:00
Cao jin
64236e3159 kbuild: update comments of Makefile.asm-generic
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-08-25 00:10:05 +09:00
Benjamin Block
50b4d48552 bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer
Since we split the scsi_request out of struct request bsg fails to
provide a reply-buffer for the drivers. This was done via the pointer
for sense-data, that is not preallocated anymore.

Failing to allocate/assign it results in illegal dereferences because
LLDs use this pointer unquestioned.

An example panic on s390x, using the zFCP driver, looks like this (I had
debugging on, otherwise NULL-pointer dereferences wouldn't even panic on
s390x):

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6403
Fault in home space mode while using kernel ASCE.
AS:0000000001590007 R3:0000000000000024
Oops: 0038 ilc:2 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: <Long List>
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-bsg-regression+ #3
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
task: 0000000065cb0100 task.stack: 0000000065cb4000
Krnl PSW : 0704e00180000000 000003ff801e4156 (zfcp_fc_ct_els_job_handler+0x16/0x58 [zfcp])
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000001 000000005fa9d0d0 000000005fa9d078 0000000000e16866
           000003ff00000290 6b6b6b6b6b6b6b6b 0000000059f78f00 000000000000000f
           00000000593a0958 00000000593a0958 0000000060d88800 000000005ddd4c38
           0000000058b50100 07000000659cba08 000003ff801e8556 00000000659cb9a8
Krnl Code: 000003ff801e4146: e31020500004        lg      %r1,80(%r2)
           000003ff801e414c: 58402040           l       %r4,64(%r2)
          #000003ff801e4150: e35020200004       lg      %r5,32(%r2)
          >000003ff801e4156: 50405004           st      %r4,4(%r5)
           000003ff801e415a: e54c50080000       mvhi    8(%r5),0
           000003ff801e4160: e33010280012       lt      %r3,40(%r1)
           000003ff801e4166: a718fffb           lhi     %r1,-5
           000003ff801e416a: 1803               lr      %r0,%r3
Call Trace:
([<000003ff801e8556>] zfcp_fsf_req_complete+0x726/0x768 [zfcp])
 [<000003ff801ea82a>] zfcp_fsf_reqid_check+0x102/0x180 [zfcp]
 [<000003ff801eb980>] zfcp_qdio_int_resp+0x230/0x278 [zfcp]
 [<00000000009b91b6>] qdio_kick_handler+0x2ae/0x2c8
 [<00000000009b9e3e>] __tiqdio_inbound_processing+0x406/0xc10
 [<00000000001684c2>] tasklet_action+0x15a/0x1d8
 [<0000000000bd28ec>] __do_softirq+0x3ec/0x848
 [<00000000001675a4>] irq_exit+0x74/0xf8
 [<000000000010dd6a>] do_IRQ+0xba/0xf0
 [<0000000000bd19e8>] io_int_handler+0x104/0x2d4
 [<00000000001033b6>] enabled_wait+0xb6/0x188
([<000000000010339e>] enabled_wait+0x9e/0x188)
 [<000000000010396a>] arch_cpu_idle+0x32/0x50
 [<0000000000bd0112>] default_idle_call+0x52/0x68
 [<00000000001cd0fa>] do_idle+0x102/0x188
 [<00000000001cd41e>] cpu_startup_entry+0x3e/0x48
 [<0000000000118c64>] smp_start_secondary+0x11c/0x130
 [<0000000000bd2016>] restart_int_handler+0x62/0x78
 [<0000000000000000>]           (null)
INFO: lockdep is turned off.
Last Breaking-Event-Address:
 [<000003ff801e41d6>] zfcp_fc_ct_job_handler+0x3e/0x48 [zfcp]

Kernel panic - not syncing: Fatal exception in interrupt

This patch moves bsg-lib to allocate and setup struct bsg_job ahead of
time, including the allocation of a buffer for the reply-data.

This means, struct bsg_job is not allocated separately anymore, but as part
of struct request allocation - similar to struct scsi_cmd. Reflect this in
the function names that used to handle creation/destruction of struct
bsg_job.

Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Fixes: 82ed4db499 ("block: split scsi_request out of struct request")
Cc: <stable@vger.kernel.org> #4.11+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-24 08:22:10 -06:00
Steven Rostedt (VMware)
8b0db1a5bd tracing: Fix freeing of filter in create_filter() when set_str is false
Performing the following task with kmemleak enabled:

 # cd /sys/kernel/tracing/events/irq/irq_handler_entry/
 # echo 'enable_event:kmem:kmalloc:3 if irq >' > trigger
 # echo 'enable_event:kmem:kmalloc:3 if irq > 31' > trigger
 # echo scan > /sys/kernel/debug/kmemleak
 # cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8800b9290308 (size 32):
  comm "bash", pid 1114, jiffies 4294848451 (age 141.139s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff81cef5aa>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff81357938>] kmem_cache_alloc_trace+0x158/0x290
    [<ffffffff81261c09>] create_filter_start.constprop.28+0x99/0x940
    [<ffffffff812639c9>] create_filter+0xa9/0x160
    [<ffffffff81263bdc>] create_event_filter+0xc/0x10
    [<ffffffff812655e5>] set_trigger_filter+0xe5/0x210
    [<ffffffff812660c4>] event_enable_trigger_func+0x324/0x490
    [<ffffffff812652e2>] event_trigger_write+0x1a2/0x260
    [<ffffffff8138cf87>] __vfs_write+0xd7/0x380
    [<ffffffff8138f421>] vfs_write+0x101/0x260
    [<ffffffff8139187b>] SyS_write+0xab/0x130
    [<ffffffff81cfd501>] entry_SYSCALL_64_fastpath+0x1f/0xbe
    [<ffffffffffffffff>] 0xffffffffffffffff

The function create_filter() is passed a 'filterp' pointer that gets
allocated, and if "set_str" is true, it is up to the caller to free it, even
on error. The problem is that the pointer is not freed by create_filter()
when set_str is false. This is a bug, and it is not up to the caller to free
the filter on error if it doesn't care about the string.

Link: http://lkml.kernel.org/r/1502705898-27571-2-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 38b78eb85 ("tracing: Factorize filter creation")
Reported-by: Chunyu Hu <chuhu@redhat.com>
Tested-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-24 10:07:38 -04:00
Chunyu Hu
475bb3c69a tracing: Fix kmemleak in tracing_map_array_free()
kmemleak reported the below leak when I was doing clear of the hist
trigger. With this patch, the kmeamleak is gone.

unreferenced object 0xffff94322b63d760 (size 32):
  comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
  hex dump (first 32 bytes):
    00 01 00 00 04 00 00 00 08 00 00 00 ff 00 00 00  ................
    10 00 00 00 00 00 00 00 80 a8 7a f2 31 94 ff ff  ..........z.1...
  backtrace:
    [<ffffffff9e96c27a>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff9e424cba>] kmem_cache_alloc_trace+0xca/0x1d0
    [<ffffffff9e377736>] tracing_map_array_alloc+0x26/0x140
    [<ffffffff9e261be0>] kretprobe_trampoline+0x0/0x50
    [<ffffffff9e38b935>] create_hist_data+0x535/0x750
    [<ffffffff9e38bd47>] event_hist_trigger_func+0x1f7/0x420
    [<ffffffff9e38893d>] event_trigger_write+0xfd/0x1a0
    [<ffffffff9e44dfc7>] __vfs_write+0x37/0x170
    [<ffffffff9e44f552>] vfs_write+0xb2/0x1b0
    [<ffffffff9e450b85>] SyS_write+0x55/0xc0
    [<ffffffff9e203857>] do_syscall_64+0x67/0x150
    [<ffffffff9e977ce7>] return_from_SYSCALL_64+0x0/0x6a
    [<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff9431f27aa880 (size 128):
  comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
  hex dump (first 32 bytes):
    00 00 8c 2a 32 94 ff ff 00 f0 8b 2a 32 94 ff ff  ...*2......*2...
    00 e0 8b 2a 32 94 ff ff 00 d0 8b 2a 32 94 ff ff  ...*2......*2...
  backtrace:
    [<ffffffff9e96c27a>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff9e425348>] __kmalloc+0xe8/0x220
    [<ffffffff9e3777c1>] tracing_map_array_alloc+0xb1/0x140
    [<ffffffff9e261be0>] kretprobe_trampoline+0x0/0x50
    [<ffffffff9e38b935>] create_hist_data+0x535/0x750
    [<ffffffff9e38bd47>] event_hist_trigger_func+0x1f7/0x420
    [<ffffffff9e38893d>] event_trigger_write+0xfd/0x1a0
    [<ffffffff9e44dfc7>] __vfs_write+0x37/0x170
    [<ffffffff9e44f552>] vfs_write+0xb2/0x1b0
    [<ffffffff9e450b85>] SyS_write+0x55/0xc0
    [<ffffffff9e203857>] do_syscall_64+0x67/0x150
    [<ffffffff9e977ce7>] return_from_SYSCALL_64+0x0/0x6a
    [<ffffffffffffffff>] 0xffffffffffffffff

Link: http://lkml.kernel.org/r/1502705898-27571-1-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 08d43a5fa0 ("tracing: Add lock-free tracing_map")
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-24 10:05:51 -04:00
Steven Rostedt (VMware)
a8f0f9e499 ftrace: Check for null ret_stack on profile function graph entry function
There's a small race when function graph shutsdown and the calling of the
registered function graph entry callback. The callback must not reference
the task's ret_stack without first checking that it is not NULL. Note, when
a ret_stack is allocated for a task, it stays allocated until the task exits.
The problem here, is that function_graph is shutdown, and a new task was
created, which doesn't have its ret_stack allocated. But since some of the
functions are still being traced, the callbacks can still be called.

The normal function_graph code handles this, but starting with commit
8861dd303c ("ftrace: Access ret_stack->subtime only in the function
profiler") the profiler code references the ret_stack on function entry, but
doesn't check if it is NULL first.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196611

Cc: stable@vger.kernel.org
Fixes: 8861dd303c ("ftrace: Access ret_stack->subtime only in the function profiler")
Reported-by: lilydjwg@gmail.com
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-24 10:04:01 -04:00
Benjamin Herrenschmidt
bb9b52bd51 KVM: PPC: Book3S HV: Add missing barriers to XIVE code and document them
This adds missing memory barriers to order updates/tests of
the virtual CPPR and MFRR, thus fixing a lost IPI problem.

While at it also document all barriers in this file.

This fixes a bug causing guest IPIs to occasionally get lost.  The
symptom then is hangs or stalls in the guest.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-08-24 20:02:01 +10:00
Benjamin Herrenschmidt
2c4fb78f78 KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss
This adds a workaround for a bug in POWER9 DD1 chips where changing
the CPPR (Current Processor Priority Register) can cause bits in the
IPB (Interrupt Pending Buffer) to get lost.  Thankfully it only
happens when manually manipulating CPPR which is quite rare.  When it
does happen it can cause interrupts to be delayed or lost.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-08-24 20:01:39 +10:00
Nicholas Piggin
bd0fdb191c KVM: PPC: Book3S HV: Use msgsync with hypervisor doorbells on POWER9
When msgsnd is used for IPIs to other cores, msgsync must be executed by
the target to order stores performed on the source before its msgsnd
(provided the source executes the appropriate sync).

Fixes: 1704a81cce ("KVM: PPC: Book3S HV: Use msgsnd for IPIs to other cores on POWER9")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-08-24 20:01:39 +10:00
Nicholas Piggin
2fe59f507a timers: Fix excessive granularity of new timers after a nohz idle
When a timer base is idle, it is forwarded when a new timer is added
to ensure that granularity does not become excessive. When not idle,
the timer tick is expected to increment the base.

However there are several problems:

- If an existing timer is modified, the base is forwarded only after
  the index is calculated.

- The base is not forwarded by add_timer_on.

- There is a window after a timer is restarted from a nohz idle, after
  it is marked not-idle and before the timer tick on this CPU, where a
  timer may be added but the ancient base does not get forwarded.

These result in excessive granularity (a 1 jiffy timeout can blow out
to 100s of jiffies), which cause the rcu lockup detector to trigger,
among other things.

Fix this by keeping track of whether the timer base has been idle
since it was last run or forwarded, and if so then forward it before
adding a new timer.

There is still a case where mod_timer optimises the case of a pending
timer mod with the same expiry time, where the timer can see excessive
granularity relative to the new, shorter interval. A comment is added,
but it's not changed because it is an important fastpath for
networking.

This has been tested and found to fix the RCU softlockup messages.

Testing was also done with tracing to measure requested versus
achieved wakeup latencies for all non-deferrable timers in an idle
system (with no lockup watchdogs running). Wakeup latency relative to
absolute latency is calculated (note this suffers from round-up skew
at low absolute times) and analysed:

             max     avg      std
upstream   506.0    1.20     4.68
patched      2.0    1.08     0.15

The bug was noticed due to the lockup detector Kconfig changes
dropping it out of people's .configs and resulting in larger base
clk skew When the lockup detectors are enabled, no CPU can go idle for
longer than 4 seconds, which limits the granularity errors.
Sub-optimal timer behaviour is observable on a smaller scale in that
case:

	     max     avg      std
upstream     9.0    1.05     0.19
patched      2.0    1.04     0.11

Fixes: Fixes: a683f390b9 ("timers: Forward the wheel clock whenever possible")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: David Miller <davem@davemloft.net>
Cc: dzickus@redhat.com
Cc: sfr@canb.auug.org.au
Cc: mpe@ellerman.id.au
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: linuxarm@huawei.com
Cc: abdhalee@linux.vnet.ibm.com
Cc: John Stultz <john.stultz@linaro.org>
Cc: akpm@linux-foundation.org
Cc: paulmck@linux.vnet.ibm.com
Cc: torvalds@linux-foundation.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170822084348.21436-1-npiggin@gmail.com
2017-08-24 11:40:18 +02:00
Linus Torvalds
143c97cc65 Revert "pty: fix the cached path of the pty slave file descriptor in the master"
This reverts commit c8c03f1858.

It turns out that while fixing the ptmx file descriptor to have the
correct 'struct path' to the associated slave pty is a really good
thing, it breaks some user space tools for a very annoying reason.

The problem is that /dev/ptmx and its associated slave pty (/dev/pts/X)
are on different mounts.  That was what caused us to have the wrong path
in the first place (we would mix up the vfsmount of the 'ptmx' node,
with the dentry of the pty slave node), but it also means that now while
we use the right vfsmount, having the pty master open also keeps the pts
mount busy.

And it turn sout that that makes 'pbuilder' very unhappy, as noted by
Stefan Lippers-Hollmann:

 "This patch introduces a regression for me when using pbuilder
  0.228.7[2] (a helper to build Debian packages in a chroot and to
  create and update its chroots) when trying to umount /dev/ptmx (inside
  the chroot) on Debian/ unstable (full log and pbuilder configuration
  file[3] attached).

  [...]
  Setting up build-essential (12.3) ...
  Processing triggers for libc-bin (2.24-15) ...
  I: unmounting dev/ptmx filesystem
  W: Could not unmount dev/ptmx: umount: /var/cache/pbuilder/build/1340/dev/ptmx: target is busy
          (In some cases useful info about processes that
           use the device is found by lsof(8) or fuser(1).)"

apparently pbuilder tries to unmount the /dev/pts filesystem while still
holding at least one master node open, which is arguably not very nice,
but we don't break user space even when fixing other bugs.

So this commit has to be reverted.

I'll try to figure out a way to avoid caching the path to the slave pty
in the master pty.  The only thing that actually wants that slave pty
path is the "TIOCGPTPEER" ioctl, and I think we could just recreate the
path at that time.

Reported-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: Eric W Biederman <ebiederm@xmission.com>
Cc: Christian Brauner <christian.brauner@canonical.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-23 18:16:11 -07:00
Omar Sandoval
1e6ec9ea89 Revert "loop: support 4k physical blocksize"
There's some stuff still up in the air, let's not get stuck with a
subpar ABI. I'll follow up with something better for 4.14.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23 15:57:55 -06:00
Shaohua Li
ea0ea2bc6d blk-throttle: cap discard request size
discard request usually is very big and easily use all bandwidth budget
of a cgroup. discard request size doesn't really mean the size of data
written, so it doesn't make sense to account it into bandwidth budget.
Jens pointed out treating the size 0 doesn't make sense too, because
discard request does have cost. But it's not easy to find the actual
cost. This patch simply makes the size one sector.

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23 15:56:33 -06:00
Hans Verkuil
93a4c8355e ARM: dts: exynos: add needs-hpd for Odroid-XU3/4
CEC support was added for Exynos5 in 4.13, but for the Odroids we need to set
'needs-hpd' as well since CEC is disabled when there is no HDMI hotplug signal,
just as for the exynos4 Odroid-U3.

This is due to the level-shifter that is disabled when there is no HPD, thus
blocking the CEC signal as well. Same close-but-no-cigar board design as the
Odroid-U3.

Tested with my Odroid XU4.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2017-08-23 21:43:29 +02:00
Linus Torvalds
2acf097f16 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "Late arm64 fixes.

  They fix very early boot failures with KASLR where the early mapping
  of the kernel is incorrect, so the failure mode looks like a hang with
  no output. There's also a signal-handling fix when a uaccess routine
  faults with a fatal signal pending, which could be used to create
  unkillable user tasks using userfaultfd and finally a state leak fix
  for the floating pointer registers across a call to exec().

  We're still seeing some random issues crop up (inode memory corruption
  and spinlock recursion) but we've not managed to reproduce things
  reliably enough to debug or bisect them yet.

  Summary:

   - Fix very early boot failures with KASLR enabled

   - Fix fatal signal handling on userspace access from kernel

   - Fix leakage of floating point register state across exec()"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kaslr: Adjust the offset to avoid Image across alignment boundary
  arm64: kaslr: ignore modulo offset when validating virtual displacement
  arm64: mm: abort uaccess retries upon fatal signal
  arm64: fpsimd: Prevent registers leaking across exec
2017-08-23 12:05:46 -07:00
Linus Torvalds
a67ca1e9bd Merge tag 'gpio-v4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
 "Here are the (hopefully) last GPIO fixes for v4.13:

   - an important core fix to reject invalid GPIOs *before* trying to
     obtain a GPIO descriptor for it.

   - a driver fix for the mvebu driver IRQ handling"

* tag 'gpio-v4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: mvebu: Fix cause computation in irq handler
  gpio: reject invalid gpio before getting gpio_desc
2017-08-23 11:43:38 -07:00
Ronnie Sahlberg
d3edede29f cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
Add checking for the path component length and verify it is <= the maximum
that the server advertizes via FileFsAttributeInformation.

With this patch cifs.ko will now return ENAMETOOLONG instead of ENOENT
when users to access an overlong path.

To test this, try to cd into a (non-existing) directory on a CIFS share
that has a too long name:
cd /mnt/aaaaaaaaaaaaaaa...

and it now should show a good error message from the shell:
bash: cd: /mnt/aaaaaaaaaaaaaaaa...aaaaaa: File name too long

rh bz 1153996

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Cc: <stable@vger.kernel.org>
2017-08-23 13:34:52 -05:00
Linus Torvalds
55652400fd Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Six minor and error leg fixes, plus one major change: the reversion of
  scsi-mq as the default.

  We're doing the latter temporarily (with a backport to stable) to give
  us time to fix all the issues that turned up with this default before
  trying again"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: cxgb4i: call neigh_event_send() to update MAC address
  Revert "scsi: default to scsi-mq"
  scsi: sd_zbc: Write unlock zone from sd_uninit_cmnd()
  scsi: aacraid: Fix out of bounds in aac_get_name_resp
  scsi: csiostor: fail probe if fw does not support FCoE
  scsi: megaraid_sas: fix error handle in megasas_probe_one
2017-08-23 11:34:40 -07:00
Sachin Prabhu
42bec214d8 cifs: Fix df output for users with quota limits
The df for a SMB2 share triggers a GetInfo call for
FS_FULL_SIZE_INFORMATION. The values returned are used to populate
struct statfs.

The problem is that none of the information returned by the call
contains the total blocks available on the filesystem. Instead we use
the blocks available to the user ie. quota limitation when filling out
statfs.f_blocks. The information returned does contain Actual free units
on the filesystem and is used to populate statfs.f_bfree. For users with
quota enabled, it can lead to situations where the total free space
reported is more than the total blocks on the system ending up with df
reports like the following

 # df -h /mnt/a
Filesystem         Size  Used Avail Use% Mounted on
//192.168.22.10/a  2.5G -2.3G  2.5G    - /mnt/a

To fix this problem, we instead populate both statfs.f_bfree with the
same value as statfs.f_bavail ie. CallerAvailableAllocationUnits. This
is similar to what is done already in the code for cifs and df now
reports the quota information for the user used to mount the share.

 # df --si /mnt/a
Filesystem         Size  Used Avail Use% Mounted on
//192.168.22.10/a  2.7G  101M  2.6G   4% /mnt/a

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Pierguido Lambri <plambri@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Cc: <stable@vger.kernel.org>
2017-08-23 13:33:21 -05:00
Arnd Bergmann
dbeb0c8e84 ARM: at91: don't select CONFIG_ARM_CPU_SUSPEND for old platforms
My previous patch fixed a link error for all at91 platforms when
CONFIG_ARM_CPU_SUSPEND was not set, however this caused another
problem on a configuration that enabled CONFIG_ARCH_AT91 but none
of the individual SoCs, and that also enabled CPU_ARM720 as
the only CPU:

warning: (ARCH_AT91 && SOC_IMX23 && SOC_IMX28 && ARCH_PXA && MACH_MVEBU_V7 && SOC_IMX6 && ARCH_OMAP3 && ARCH_OMAP4 && SOC_OMAP5 && SOC_AM33XX && SOC_DRA7XX && ARCH_EXYNOS3 && ARCH_EXYNOS4 && EXYNOS5420_MCPM && EXYNOS_CPU_SUSPEND && ARCH_VEXPRESS_TC2_PM && ARM_BIG_LITTLE_CPUIDLE && ARM_HIGHBANK_CPUIDLE && QCOM_PM) selects ARM_CPU_SUSPEND which has unmet direct dependencies (ARCH_SUSPEND_POSSIBLE)
arch/arm/kernel/sleep.o: In function `cpu_resume':
(.text+0xf0): undefined reference to `cpu_arm720_suspend_size'
arch/arm/kernel/suspend.o: In function `__cpu_suspend_save':
suspend.c:(.text+0x134): undefined reference to `cpu_arm720_do_suspend'

This improves the hack some more by only selecting ARM_CPU_SUSPEND
for the part that requires it, and changing pm.c to drop the
contents of unused init functions so we no longer refer to
cpu_resume on at91 platforms that don't need it.

Fixes: cc7a938f5f ("ARM: at91: select CONFIG_ARM_CPU_SUSPEND")
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2017-08-23 17:31:39 +02:00
Jani Nikula
e7c50e1156 Merge tag 'gvt-fixes-2017-08-23' of https://github.com/01org/gvt-linux into drm-intel-fixes
gvt-fixes-2017-08-23

- Fix possible null ptr reference in error path (Fred)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170823075352.nlo7hp3bplnb5ilx@zhen-hp.sh.intel.com
2017-08-23 11:48:05 +03:00
Takashi Iwai
bbba6f9d3d ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)
Lenovo G50-70 (17aa:3978) with Conexant codec chip requires the
similar workaround for the inverted stereo dmic like other Lenovo
models.

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1020657
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-23 09:30:17 +02:00
fred gao
ffeaf9aaf9 drm/i915/gvt: Fix the kernel null pointer error
once error happens in shadow_indirect_ctx function, the variable
wa_ctx->indirect_ctx.obj is not initialized but accessed, so the
kernel null point panic occurs.

Fixes: 894cf7d156 ("drm/i915/gvt: i915_gem_object_create() returns an error pointer")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-08-23 14:08:57 +08:00
Linus Torvalds
98b9f8a454 Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
 "Fix a clang build regression and an potential xattr corruption bug"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: add missing xattr hash update
  ext4: fix clang build regression
2017-08-22 21:30:52 -07:00
Martijn Coenen
b2a6d1b999 ANDROID: binder: fix proc->tsk check.
Commit c4ea41ba19 ("binder: use group leader instead of open thread")'
was incomplete and didn't update a check in binder_mmap(), causing all
mmap() calls into the binder driver to fail.

Signed-off-by: Martijn Coenen <maco@android.com>
Tested-by: John Stultz <john.stultz@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-22 18:42:57 -07:00
Sakari Ailus
b5212f57da ACPI: device property: Fix node lookup in acpi_graph_get_child_prop_value()
acpi_graph_get_child_prop_value() is intended to find a child node with a
certain property value pair. The check

	if (!fwnode_property_read_u32(fwnode, prop_name, &nr))
		continue;

is faulty: fwnode_property_read_u32() returns zero on success, not on
failure, leading to comparing values only if the searched property was not
found.

Moreover, the check is made against the parent device node instead of
the child one as it should be.

Fixes: 79389a83bc (ACPI / property: Add support for remote endpoints)
Reported-by: Hyungwoo Yang <hyungwoo.yang@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: 4.12+ <stable@vger.kernel.org> # 4.12+
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-22 22:58:38 +02:00
Rafael J. Wysocki
9b40eebcd3 ACPICA: Fix acpi_evaluate_object_typed()
Commit 2d2a954375 (ACPICA: Update two error messages to emit
control method name) causes acpi_evaluate_object_typed() to fail
if its pathname argument is NULL, but some callers of that function
in the kernel, particularly acpi_nondev_subnode_data_ok(), pass
NULL as pathname to it and expect it to work.

For this reason, make acpi_evaluate_object_typed() check if its
pathname argument is NULL and fall back to using the pathname of
its handle argument if that is the case.

Reported-by: Sakari Ailus <sakari.ailus@intel.com>
Tested-by: Yang, Hyungwoo <hyungwoo.yang@intel.com>
Fixes: 2d2a954375 (ACPICA: Update two error messages to emit control method name)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-22 22:44:13 +02:00
Bharat Potnuri
65159c051c RDMA/uverbs: Initialize cq_context appropriately
Initializing cq_context with ev_queue in create_cq(), leads to NULL pointer
dereference in ib_uverbs_comp_handler(), if application doesnot use completion
channel. This patch fixes the cq_context initialization.

Fixes: 1e7710f3f6 ("IB/core: Change completion channel to use the reworked")
Cc: stable@vger.kernel.org # 4.12
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit 699a2d5b1b)
2017-08-22 13:55:47 -04:00
Linus Torvalds
eec3b80fb1 Merge tag 'mfd-fixes-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD fix from Lee Jones:
 "Revert duplicate commit in da9062-core"

* tag 'mfd-fixes-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  Revert "mfd: da9061: Fix to remove BBAT_CONT register from chip model"
2017-08-22 10:21:05 -07:00
Catalin Marinas
a067d94d37 arm64: kaslr: Adjust the offset to avoid Image across alignment boundary
With 16KB pages and a kernel Image larger than 16MB, the current
kaslr_early_init() logic for avoiding mappings across swapper table
boundaries fails since increasing the offset by kimg_sz just moves the
problem to the next boundary.

This patch rounds the offset down to (1 << SWAPPER_TABLE_SHIFT) if the
Image crosses a PMD_SIZE boundary.

Fixes: afd0e5a876 ("arm64: kaslr: Fix up the kernel image alignment")
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-22 18:15:42 +01:00
Ard Biesheuvel
4a23e56ad6 arm64: kaslr: ignore modulo offset when validating virtual displacement
In the KASLR setup routine, we ensure that the early virtual mapping
of the kernel image does not cover more than a single table entry at
the level above the swapper block level, so that the assembler routines
involved in setting up this mapping can remain simple.

In this calculation we add the proposed KASLR offset to the values of
the _text and _end markers, and reject it if they would end up falling
in different swapper table sized windows.

However, when taking the addresses of _text and _end, the modulo offset
(the physical displacement modulo 2 MB) is already accounted for, and
so adding it again results in incorrect results. So disregard the modulo
offset from the calculation.

Fixes: 08cdac619c ("arm64: relocatable: deal with physically misaligned ...")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-22 18:15:42 +01:00
Mark Rutland
289d07a2dc arm64: mm: abort uaccess retries upon fatal signal
When there's a fatal signal pending, arm64's do_page_fault()
implementation returns 0. The intent is that we'll return to the
faulting userspace instruction, delivering the signal on the way.

However, if we take a fatal signal during fixing up a uaccess, this
results in a return to the faulting kernel instruction, which will be
instantly retried, resulting in the same fault being taken forever. As
the task never reaches userspace, the signal is not delivered, and the
task is left unkillable. While the task is stuck in this state, it can
inhibit the forward progress of the system.

To avoid this, we must ensure that when a fatal signal is pending, we
apply any necessary fixup for a faulting kernel instruction. Thus we
will return to an error path, and it is up to that code to make forward
progress towards delivering the fatal signal.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-22 18:15:42 +01:00
Dave Martin
096622104e arm64: fpsimd: Prevent registers leaking across exec
There are some tricky dependencies between the different stages of
flushing the FPSIMD register state during exec, and these can race
with context switch in ways that can cause the old task's regs to
leak across.  In particular, a context switch during the memset() can
cause some of the task's old FPSIMD registers to reappear.

Disabling preemption for this small window would be no big deal for
performance: preemption is already disabled for similar scenarios
like updating the FPSIMD registers in sigreturn.

So, instead of rearranging things in ways that might swap existing
subtle bugs for new ones, this patch just disables preemption
around the FPSIMD state flushing so that races of this type can't
occur here.  This brings fpsimd_flush_thread() into line with other
code paths.

Cc: stable@vger.kernel.org
Fixes: 674c242c93 ("arm64: flush FP/SIMD state correctly after execve()")
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-22 18:15:42 +01:00
Takashi Iwai
88c54cdf61 ALSA: core: Fix unexpected error at replacing user TLV
When user tries to replace the user-defined control TLV, the kernel
checks the change of its content via memcmp().  The problem is that
the kernel passes the return value from memcmp() as is.  memcmp()
gives a non-zero negative value depending on the comparison result,
and this shall be recognized as an error code.

The patch covers that corner-case, return 1 properly for the changed
TLV.

Fixes: 8aa9b586e4 ("[ALSA] Control API - more robust TLV implementation")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-22 15:43:40 +02:00
Arnd Bergmann
324c654dfc Merge tag 'at91-ab-4.13-drivers-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux into fixes
Pull "Driver fixes for 4.13" from Alexandre Belloni:

 - Multiple EBI/SMC timing setting/calculation fixes

* tag 'at91-ab-4.13-drivers-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  memory: atmel-ebi: Fix smc cycle xlate converter
  memory: atmel-ebi: Allow t_DF timings of zero ns
  memory: atmel-ebi: Fix smc timing return value evaluation
2017-08-22 15:27:37 +02:00
Chris Wilson
fe4600a548 drm: Release driver tracking before making the object available again
This is the same bug as we fixed in commit f6cd7daecf ("drm: Release
driver references to handle before making it available again"), but now
the exposure is via the PRIME lookup tables. If we remove the
object/handle from the PRIME lut, then a new request for the same
object/fd will generate a new handle, thus for a short window that
object is known to userspace by two different handles. Fix this by
releasing the driver tracking before PRIME.

Fixes: 0ff926c7d4 ("drm/prime: add exported buffers to current fprivs
imported buffer list (v2)")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Thierry Reding <treding@nvidia.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170819120558.6465-1-chris@chris-wilson.co.uk
2017-08-22 16:03:43 +03:00
Joakim Tjernlund
07b3b5e9ed ALSA: usb-audio: Add delay quirk for H650e/Jabra 550a USB headsets
These headsets reports a lot of: cannot set freq 44100 to ep 0x81
and need a small delay between sample rate settings, just like
Zoom R16/24. Add both headsets to the Zoom R16/24 quirk for
a 1 ms delay between control msgs.

Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-22 11:15:15 +02:00
Lee Jones
0f0fc5c090 Revert "mfd: da9061: Fix to remove BBAT_CONT register from chip model"
This patch was applied to the MFD twice, causing unwanted behavour.

This reverts commit b77eb79acc.

Fixes: b77eb79acc ("mfd: da9061: Fix to remove BBAT_CONT register from chip model")
Reported-by: Steve Twiss <stwiss.opensource@diasemi.com>
Reviewed-by: Steve Twiss <stwiss.opensource@diasemi.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2017-08-22 09:03:00 +01:00
Dave Airlie
b313f780de Merge tag 'drm-misc-fixes-2017-08-18' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
Core Changes:
- Fix framebuffer leak in setplane error condition (Nikil)
- Prevent BUG in atomic_ioctl by properly resetting state on EDEADLK (Maarten)
- Add missing return in atomic_check_only if atomic_check fails (Maarten)

Driver Changes:
- rockchip: Don't try to suspend if device not initialized (Jeffy)

Cc: Jeffy Chen <jeffy.chen@rock-chips.com>
Cc: Nikhil Mahale <nmahale@nvidia.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

* tag 'drm-misc-fixes-2017-08-18' of git://anongit.freedesktop.org/git/drm-misc:
  drm/atomic: If the atomic check fails, return its value first
  drm/atomic: Handle -EDEADLK with out-fences correctly
  drm: Fix framebuffer leak
  drm/rockchip: Fix suspend crash when drm is not bound
2017-08-22 16:53:32 +10:00
Dave Airlie
4a9f153dbd Merge tag 'imx-drm-fixes-2017-08-18' of git://git.pengutronix.de/git/pza/linux into drm-fixes
drm/imx: fix YUV primary plane and IPUv3 build corner case

- Enable color space conversion on the primary plane when the framebuffer
  format is a YUV format.
- The IPUv3 base driver now uses drm_format_info in the PRE/PRG code. The
  PRE/PRG parts are already disabled if DRM is not available. Enforce that
  if DRM is built as a module, IPUv3 must be built as a module, too.

* tag 'imx-drm-fixes-2017-08-18' of git://git.pengutronix.de/git/pza/linux:
  drm/imx: ipuv3-plane: fix YUV framebuffer scanout on the base plane
  gpu: ipu-v3: add DRM dependency
2017-08-22 16:52:36 +10:00
Dave Airlie
98f1a17285 Merge tag 'sunxi-drm-fixes-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into drm-fixes
Allwinner DRM fixes for 4.13

A single commit to restore the framebuffer console when there's no DRM
users left.

* tag 'sunxi-drm-fixes-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
  drm/sun4i: Implement drm_driver lastclose to restore fbdev console
2017-08-22 16:50:07 +10:00
Linus Torvalds
6470812e22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Just a couple small fixes, two of which have to do with gcc-7:

   1) Don't clobber kernel fixed registers in __multi4 libgcc helper.

   2) Fix a new uninitialized variable warning on sparc32 with gcc-7,
      from Thomas Petazzoni.

   3) Adjust pmd_t initializer on sparc32 to make gcc happy.

   4) If ATU isn't available, don't bark in the logs. From Tushar Dave"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()
  sparc64: remove unnecessary log message
  sparc64: Don't clibber fixed registers in __multi4.
  mm: add pmd_t initializer __pmd() to work around a GCC bug.
2017-08-21 14:07:48 -07:00
Thomas Petazzoni
2dc77533f1 sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()
When building the kernel for Sparc using gcc 7.x, the build fails
with:

arch/sparc/kernel/pcic.c: In function ‘pcibios_fixup_bus’:
arch/sparc/kernel/pcic.c:647:8: error: ‘cmd’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    cmd |= PCI_COMMAND_IO;
        ^~

The simplified code looks like this:

unsigned int cmd;
[...]
pcic_read_config(dev->bus, dev->devfn, PCI_COMMAND, 2, &cmd);
[...]
cmd |= PCI_COMMAND_IO;

I.e, the code assumes that pcic_read_config() will always initialize
cmd. But it's not the case. Looking at pcic_read_config(), if
bus->number is != 0 or if the size is not one of 1, 2 or 4, *val will
not be initialized.

As a simple fix, we initialize cmd to zero at the beginning of
pcibios_fixup_bus.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-21 13:57:22 -07:00
Linus Torvalds
05ab303b4f Merge tag 'arc-4.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:

 - PAE40 related updates

 - SLC errata for region ops

 - intc line masking by default

* tag 'arc-4.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  arc: Mask individual IRQ lines during core INTC init
  ARCv2: PAE40: set MSB even if !CONFIG_ARC_HAS_PAE40 but PAE exists in SoC
  ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
  ARC: dma: implement dma_unmap_page and sg variant
  ARCv2: SLC: Make sure busy bit is set properly for region ops
  ARC: [plat-sim] Include this platform unconditionally
  ARC: [plat-axs10x]: prepare dts files for enabling PAE40 on axs103
  ARC: defconfig: Cleanup from old Kconfig options
2017-08-21 13:30:36 -07:00
Linus Torvalds
0b3baec853 Merge tag 'rtc-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC fix from Alexandre Belloni:
 "Fix regmap configuration for ds1307"

* tag 'rtc-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: ds1307: fix regmap config
2017-08-21 13:27:51 -07:00
Linus Torvalds
e3181f2c0c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix IGMP handling wrt VRF, from David Ahern.

 2) Fix timer access to freed object in dccp, from Eric Dumazet.

 3) Use kmalloc_array() in ptr_ring to avoid overflow cases which are
    triggerable by userspace. Also from Eric Dumazet.

 4) Fix infinite loop in unmapping cleanup of nfp driver, from Colin Ian
    King.

 5) Correct datagram peek handling of empty SKBs, from Matthew Dawson.

 6) Fix use after free in TIPC, from Eric Dumazet.

 7) When replacing a route in ipv6 we need to reset the round robin
    pointer, from Wei Wang.

 8) Fix bug in pci_find_pcie_root_port() which was unearthed by the
    relaxed ordering changes, from Thierry Redding. I made sure to get
    an explicit ACK from Bjorn this time around :-)

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
  ipv6: repair fib6 tree in failure case
  net_sched: fix order of queue length updates in qdisc_replace()
  tools lib bpf: improve warning
  switchdev: documentation: minor typo fixes
  bpf, doc: also add s390x as arch to sysctl description
  net: sched: fix NULL pointer dereference when action calls some targets
  rxrpc: Fix oops when discarding a preallocated service call
  irda: do not leak initialized list.dev to userspace
  net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
  PCI: Allow PCI express root ports to find themselves
  tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
  net: check and errout if res->fi is NULL when RTM_F_FIB_MATCH is set
  ipv6: reset fn->rr_ptr when replacing route
  sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
  tipc: fix use-after-free
  tun: handle register_netdevice() failures properly
  datagram: When peeking datagrams with offset < 0 don't skip empty skbs
  bpf, doc: improve sysctl knob description
  netxen: fix incorrect loop counter decrement
  nfp: fix infinite loop on umapping cleanup
  ...
2017-08-21 13:16:27 -07:00
Oleg Nesterov
dd1c1f2f20 pids: make task_tgid_nr_ns() safe
This was reported many times, and this was even mentioned in commit
52ee2dfdd4 ("pids: refactor vnr/nr_ns helpers to make them safe") but
somehow nobody bothered to fix the obvious problem: task_tgid_nr_ns() is
not safe because task->group_leader points to nowhere after the exiting
task passes exit_notify(), rcu_read_lock() can not help.

We really need to change __unhash_process() to nullify group_leader,
parent, and real_parent, but this needs some cleanups.  Until then we
can turn task_tgid_nr_ns() into another user of __task_pid_nr_ns() and
fix the problem.

Reported-by: Troy Kensinger <tkensinger@google.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-21 12:47:31 -07:00
Radim Krčmář
f2d8421b88 Merge tag 'kvm-s390-master-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
KVM: s390: two fixes for sthyi emulation

- missing inline assembly constraint
- wrong exception handling
2017-08-21 18:21:30 +02:00
Josh Poimboeuf
deecd4d71b objtool: Fix '-mtune=atom' decoding support in objtool 2.0
With '-mtune=atom', which is enabled with CONFIG_MATOM=y, GCC uses some
unusual instructions for setting up the stack.

Instead of:

  mov %rsp, %rbp

it does:

  lea (%rsp), %rbp

And instead of:

  add imm, %rsp

it does:

  lea disp(%rsp), %rsp

Add support for these instructions to the objtool decoder.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: baa41469a7 ("objtool: Implement stack validation 2.0")
Link: http://lkml.kernel.org/r/4ea1db896e821226efe1f8e09f270771bde47e65.1501188854.git.jpoimboe@redhat.com
[ This is a cherry-picked version of upcoming commit 5b8de48e82. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-21 16:07:27 +02:00
Heiko Carstens
857b8de967 KVM: s390: sthyi: fix specification exception detection
sthyi should only generate a specification exception if the function
code is zero and the response buffer is not on a 4k boundary.

The current code would also test for unknown function codes if the
response buffer, that is currently only defined for function code 0,
is not on a 4k boundary and incorrectly inject a specification
exception instead of returning with condition code 3 and return code 4
(unsupported function code).

Fix this by moving the boundary check.

Fixes: 95ca2cb579 ("KVM: s390: Add sthyi emulation")
Cc: <stable@vger.kernel.org> # 4.8+
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-21 15:34:54 +02:00
Heiko Carstens
4a4eefcd0e KVM: s390: sthyi: fix sthyi inline assembly
The sthyi inline assembly misses register r3 within the clobber
list. The sthyi instruction will always write a return code to
register "R2+1", which in this case would be r3. Due to that we may
have register corruption and see host crashes or data corruption
depending on how gcc decided to allocate and use registers during
compile time.

Fixes: 95ca2cb579 ("KVM: s390: Add sthyi emulation")
Cc: <stable@vger.kernel.org> # 4.8+
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-21 15:34:36 +02:00
Shawn Lin
d83c2dbaa9 mmc: block: prevent propagating R1_OUT_OF_RANGE for open-ending mode
We to some extent should tolerate R1_OUT_OF_RANGE for open-ending
mode as it is expected behaviour and most of the backup partition
tables should be located near some of the last blocks which will
always make open-ending read exceed the capacity of cards.

Fixes: 9820a5b111 ("mmc: core: for data errors, take response of stop cmd into account")
Fixes: a04e6bae9e ("mmc: core: check also R1 response for stop commands")
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-08-21 13:32:31 +02:00
Heiner Kallweit
03619844d8 rtc: ds1307: fix regmap config
Current max_register setting breaks reading nvram on certain chips and
also reading the standard registers on RX8130 where register map starts
at 0x10.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Fixes: 11e5890b53 "rtc: ds1307: convert driver to regmap"
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-08-21 11:08:03 +02:00
Chris Wilson
d41a3c2be1 drm/i915: Clear lost context-switch interrupts across reset
During a global reset, we disable the irq. As we disable the irq, the
hardware may be raising a GT interrupt that we then ignore, leaving it
pending in the GTIIR. After the reset, we then re-enable the irq,
triggering the pending interrupt. However, that interrupt was for the
stale state from before the reset, and the contents of the CSB buffer
are now invalid.

v2: Add a comment to make it clear that the double clear is purely my
paranoia.

Reported-by: "Dong, Chuanxiao" <chuanxiao.dong@intel.com>
Fixes: 821ed7df6e ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Dong, Chuanxiao" <chuanxiao.dong@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170807121919.30165-1-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20170818090509.5363-1-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
(cherry picked from commit 64f09f00ca)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-21 11:52:35 +03:00
Andy Shevchenko
a93c115275 drm/i915/bxt: use NULL for GPIO connection ID
The commit 213e08ad60
	("drm/i915/bxt: add bxt dsi gpio element support")
enables GPIO support for Broxton based platforms.

While using that API we might get into troubles in the future, because
we can't rely on label name in the driver since vendor firmware might
provide any GPIO pin there, e.g. "reset", and even mark it in _DSD (in
which case the request will fail).

To avoid inconsistency and potential issues we have two options:
a) generate GPIO ACPI mapping table and supply it via
   acpi_dev_add_driver_gpios(), or
b) just pass NULL as connection ID.

The b) approach is much simpler and would work since the driver relies
on GPIO indices only. Moreover, the _CRS fallback mechanism, when
requesting GPIO, has been made stricter, and supplying non-NULL
connection ID when neither _DSD, nor GPIO ACPI mapping is present, is
making request fail.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101921
Fixes: f10e4bf663 ("gpio: acpi: Even more tighten up ACPI GPIO lookups")
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Tested-by: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817105541.63914-1-andriy.shevchenko@linux.intel.com
(cherry picked from commit cd55a1fbd2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-21 11:52:29 +03:00
Keerthy
4459398b6d soc: ti: knav: Add a NULL pointer check for kdev in knav_pool_create
knav_pool_create is an exported function. In the event of a call
before knav_queue_probe, we encounter a NULL pointer dereference
in the following line. Hence return -EPROBE_DEFER to the caller till
the kdev pointer is non-NULL.

Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2017-08-21 09:19:50 +02:00
Wei Wang
348a400272 ipv6: repair fib6 tree in failure case
In fib6_add(), it is possible that fib6_add_1() picks an intermediate
node and sets the node's fn->leaf to NULL in order to add this new
route. However, if fib6_add_rt2node() fails to add the new
route for some reason, fn->leaf will be left as NULL and could
potentially cause crash when fn->leaf is accessed in fib6_locate().
This patch makes sure fib6_repair_tree() is called to properly repair
fn->leaf in the above failure case.

Here is the syzkaller reported general protection fault in fib6_locate:
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 40937 Comm: syz-executor3 Not tainted
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801d7d64100 ti: ffff8801d01a0000 task.ti: ffff8801d01a0000
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] __ipv6_prefix_equal64_half include/net/ipv6.h:475 [inline]
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] ipv6_prefix_equal include/net/ipv6.h:492 [inline]
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] fib6_locate_1 net/ipv6/ip6_fib.c:1210 [inline]
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] fib6_locate+0x281/0x3c0 net/ipv6/ip6_fib.c:1233
RSP: 0018:ffff8801d01a36a8  EFLAGS: 00010202
RAX: 0000000000000020 RBX: ffff8801bc790e00 RCX: ffffc90002983000
RDX: 0000000000001219 RSI: ffff8801d01a37a0 RDI: 0000000000000100
RBP: ffff8801d01a36f0 R08: 00000000000000ff R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001
R13: dffffc0000000000 R14: ffff8801d01a37a0 R15: 0000000000000000
FS:  00007f6afd68c700(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004c6340 CR3: 00000000ba41f000 CR4: 00000000001426f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 ffff8801d01a37a8 ffff8801d01a3780 ffffed003a0346f5 0000000c82a23ea0
 ffff8800b7bd7700 ffff8801d01a3780 ffff8800b6a1c940 ffffffff82a23ea0
 ffff8801d01a3920 ffff8801d01a3748 ffffffff82a223d6 ffff8801d7d64988
Call Trace:
 [<ffffffff82a223d6>] ip6_route_del+0x106/0x570 net/ipv6/route.c:2109
 [<ffffffff82a23f9d>] inet6_rtm_delroute+0xfd/0x100 net/ipv6/route.c:3075
 [<ffffffff82621359>] rtnetlink_rcv_msg+0x549/0x7a0 net/core/rtnetlink.c:3450
 [<ffffffff8274c1d1>] netlink_rcv_skb+0x141/0x370 net/netlink/af_netlink.c:2281
 [<ffffffff82613ddf>] rtnetlink_rcv+0x2f/0x40 net/core/rtnetlink.c:3456
 [<ffffffff8274ad38>] netlink_unicast_kernel net/netlink/af_netlink.c:1206 [inline]
 [<ffffffff8274ad38>] netlink_unicast+0x518/0x750 net/netlink/af_netlink.c:1232
 [<ffffffff8274b83e>] netlink_sendmsg+0x8ce/0xc30 net/netlink/af_netlink.c:1778
 [<ffffffff82564aff>] sock_sendmsg_nosec net/socket.c:609 [inline]
 [<ffffffff82564aff>] sock_sendmsg+0xcf/0x110 net/socket.c:619
 [<ffffffff82564d62>] sock_write_iter+0x222/0x3a0 net/socket.c:834
 [<ffffffff8178523d>] new_sync_write+0x1dd/0x2b0 fs/read_write.c:478
 [<ffffffff817853f4>] __vfs_write+0xe4/0x110 fs/read_write.c:491
 [<ffffffff81786c38>] vfs_write+0x178/0x4b0 fs/read_write.c:538
 [<ffffffff817892a9>] SYSC_write fs/read_write.c:585 [inline]
 [<ffffffff817892a9>] SyS_write+0xd9/0x1b0 fs/read_write.c:577
 [<ffffffff82c71e32>] entry_SYSCALL_64_fastpath+0x12/0x17

Note: there is no "Fixes" tag as this seems to be a bug introduced
very early.

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 20:06:56 -07:00
Konstantin Khlebnikov
68a66d149a net_sched: fix order of queue length updates in qdisc_replace()
This important to call qdisc_tree_reduce_backlog() after changing queue
length. Parent qdisc should deactivate class in ->qlen_notify() called from
qdisc_tree_reduce_backlog() but this happens only if qdisc->q.qlen in zero.

Missed class deactivations leads to crashes/warnings at picking packets
from empty qdisc and corrupting state at reactivating this class in future.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Fixes: 86a7996cc8 ("net_sched: introduce qdisc_replace() helper")
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 20:02:00 -07:00
Eric Leblond
49bf4b36fd tools lib bpf: improve warning
Signed-off-by: Eric Leblond <eric@regit.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 19:49:51 -07:00
Chris Packham
5a78449810 switchdev: documentation: minor typo fixes
Two typos in switchdev.txt

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 19:49:10 -07:00
Daniel Borkmann
d4dd2d75a2 bpf, doc: also add s390x as arch to sysctl description
Looks like this was accidentally missed, so still add s390x
as supported eBPF JIT arch to bpf_jit_enable.

Fixes: 014cd0a368 ("bpf: Update sysctl documentation to list all supported architectures")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 19:45:09 -07:00
Ben Hutchings
2bfbe7881e kbuild: Do not use hyphen in exported variable name
This definition in Makefile.dtbinst:

    export dtbinst-root ?= $(obj)

should define and export dtbinst-root when handling the root dts
directory, and do nothing in the subdirectories.  However some shells,
including dash, will not pass through environment variables whose name
includes a hyphen.  Usually GNU make does not use a shell to recurse,
but if e.g. $(srctree) contains '~' it will use a shell here.

Rename the variable to dtbinst_root.

References: https://bugs.debian.org/833561
Fixes: 323a028d39cdi ("dts, kbuild: Implement support for dtb vendor subdirs")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-08-21 09:06:00 +09:00
Shuah Khan
801d2e9f1c Makefile: add kselftest-clean to PHONY target list
kselftest-clean isn't in the PHONY target list. Add it.

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-08-21 09:05:59 +09:00
Arnd Bergmann
8c97023cf0 Kbuild: use -fshort-wchar globally
Commit 971a69db7d ("Xen: don't warn about 2-byte wchar_t in efi")
added the --no-wchar-size-warning to the Makefile to avoid this
harmless warning:

arm-linux-gnueabi-ld: warning: drivers/xen/efi.o uses 2-byte wchar_t yet the output is to use 4-byte wchar_t; use of wchar_t values across objects may fail

Changing kbuild to use thin archives instead of recursive linking
unfortunately brings the same warning back during the final link.

The kernel does not use wchar_t string literals at this point, and
xen does not use wchar_t at all (only efi_char16_t), so the flag
has no effect, but as pointed out by Jan Beulich, adding a wchar_t
string literal would be bad here.

Since wchar_t is always defined as u16, independent of the toolchain
default, always passing -fshort-wchar is correct and lets us
remove the Xen specific hack along with fixing the warning.

Link: https://patchwork.kernel.org/patch/9275217/
Fixes: 971a69db7d ("Xen: don't warn about 2-byte wchar_t in efi")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-08-21 09:05:59 +09:00
Linus Torvalds
14ccee78fc Linux 4.13-rc6 2017-08-20 14:13:52 -07:00
Linus Torvalds
197e7e5213 Sanitize 'move_pages()' permission checks
The 'move_paghes()' system call was introduced long long ago with the
same permission checks as for sending a signal (except using
CAP_SYS_NICE instead of CAP_SYS_KILL for the overriding capability).

That turns out to not be a great choice - while the system call really
only moves physical page allocations around (and you need other
capabilities to do a lot of it), you can check the return value to map
out some the virtual address choices and defeat ASLR of a binary that
still shares your uid.

So change the access checks to the more common 'ptrace_may_access()'
model instead.

This tightens the access checks for the uid, and also effectively
changes the CAP_SYS_NICE check to CAP_SYS_PTRACE, but it's unlikely that
anybody really _uses_ this legacy system call any more (we hav ebetter
NUMA placement models these days), so I expect nobody to notice.

Famous last words.

Reported-by: Otto Ebeling <otto.ebeling@iki.fi>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-20 13:26:27 -07:00
Greg Kroah-Hartman
2c68888f1d Merge tag 'fixes-for-4.13b' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:

Second set of IIO fixes for the 4.13 cycle.

Given the late stage of this series, some more involved fixes have been
held back for the upcoming merge window.

The hid-sensor issue has been causing problems for a long time so it
is great to have that one finally fixed!  No more bug reports for the
userspace guys (well about that anyway).

* documentation
  - some warning fixes due to missing colons in kernel-doc.
* adis16480
  - fix accel scale factor.
* bmp280
  - properly initialize the device for humidity readings - without this
  the humidity readings may be skipped and a magic value of 0x8000 returned.
* hid-sensor-strigger
  - fix a race with user space when powering up the sensor.
* ina291
  - Avoid an underflow for the sleeping time as a result of supporting the
  fastest rates.
* st-magnetometer
  - Fix the status register address for hte LSM303AGR,
  - Remove the ihl property for LSM303AGR as the sensor doesn't support
  active low for the dataready line.
* stm32-adc
  - Fix use of a common clock rate.
* stm32-timer
  - fix the quadrature mode get routine to account for the magic 0 value.
  set on boot.
  - fix the return value of write_raw,
  - fix the get/set down count direction as the enum value was not being
  converted to the relevant bit field,
  - add an enable attribute to actually turn it on when in encoder mode,
  - missing mask when reading the trigger mode.
2017-08-20 10:45:54 -07:00
Linus Torvalds
7f680d7ec3 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "Another pile of small fixes and updates for x86:

   - Plug a hole in the SMAP implementation which misses to clear AC on
     NMI entry

   - Fix the norandmaps/ADDR_NO_RANDOMIZE logic so the command line
     parameter works correctly again

   - Use the proper accessor in the startup64 code for next_early_pgt to
     prevent accessing of invalid addresses and faulting in the early
     boot code.

   - Prevent CPU hotplug lock recursion in the MTRR code

   - Unbreak CPU0 hotplugging

   - Rename overly long CPUID bits which got introduced in this cycle

   - Two commits which mark data 'const' and restrict the scope of data
     and functions to file scope by making them 'static'"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Constify attribute_group structures
  x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt'
  x86/elf: Remove the unnecessary ADDR_NO_RANDOMIZE checks
  x86: Fix norandmaps/ADDR_NO_RANDOMIZE
  x86/mtrr: Prevent CPU hotplug lock recursion
  x86: Mark various structures and functions as 'static'
  x86/cpufeature, kvm/svm: Rename (shorten) the new "virtualized VMSAVE/VMLOAD" CPUID flag
  x86/smpboot: Unbreak CPU0 hotplug
  x86/asm/64: Clear AC on NMI entries
2017-08-20 09:36:52 -07:00
Linus Torvalds
2615a38f14 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "A few small fixes for timer drivers:

   - Prevent infinite recursion in the arm architected timer driver with
     ftrace

   - Propagate error codes to the caller in case of failure in EM STI
     driver

   - Adjust a bogus loop iteration in the arm architected timer driver

   - Add a missing Kconfig dependency to the pistachio clocksource to
     prevent build failures

   - Correctly check for IS_ERR() instead of NULL in the shared timer-of
     code"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is enabled
  clocksource/drivers/Kconfig: Fix CLKSRC_PISTACHIO dependencies
  clocksource/drivers/timer-of: Checking for IS_ERR() instead of NULL
  clocksource/drivers/em_sti: Fix error return codes in em_sti_probe()
  clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
2017-08-20 09:34:24 -07:00
Hans de Goede
d912366a59 Input: soc_button_array - silence -ENOENT error on Dell XPS13 9365
The Dell XPS13 9365 has an INT33D2 ACPI node with no GPIOs, causing
the following error in dmesg:

[    7.172275] soc_button_array: probe of INT33D2:00 failed with error -2

This commit silences this, by returning -ENODEV when there are no GPIOs.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=196679
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-08-20 09:30:23 -07:00
Linus Torvalds
e46db8d2ef Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "Two fixes for the perf subsystem:

   - Fix an inconsistency of RDPMC mm struct tagging across exec() which
     causes RDPMC to fault.

   - Correct the timestamp mechanics across IOC_DISABLE/ENABLE which
     causes incorrect timestamps and total time calculations"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix time on IOC_ENABLE
  perf/x86: Fix RDPMC vs. mm_struct tracking
2017-08-20 09:20:57 -07:00
Linus Torvalds
9dae41a238 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A pile of smallish changes all over the place:

   - Add a missing ISB in the GIC V1 driver

   - Remove an ACPI version check in the GIC V3 ITS driver

   - Add the missing irq_pm_shutdown function for BRCMSTB-L2 to avoid
     spurious wakeups

   - Remove the artifical limitation of ITS instances to the number of
     NUMA nodes which prevents utilizing the ITS hardware correctly

   - Prevent a infinite parsing loop in the GIC-V3 ITS/MSI code

   - Honour the force affinity argument in the GIC-V3 driver which is
     required to make perf work correctly

   - Correctly report allocation failures in GIC-V2/V3 to avoid using
     half allocated and initialized interrupts.

   - Fixup checks against nr_cpu_ids in the generic IPI code"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/ipi: Fixup checks against nr_cpu_ids
  genirq: Restore trigger settings in irq_modify_status()
  MAINTAINERS: Remove Jason Cooper's irqchip git tree
  irqchip/gic-v3-its-platform-msi: Fix msi-parent parsing loop
  irqchip/gic-v3-its: Allow GIC ITS number more than MAX_NUMNODES
  irqchip: brcmstb-l2: Define an irq_pm_shutdown function
  irqchip/gic: Ensure we have an ISB between ack and ->handle_irq
  irqchip/gic-v3-its: Remove ACPICA version check for ACPI NUMA
  irqchip/gic-v3: Honor forced affinity setting
  irqchip/gic-v3: Report failures in gic_irq_domain_alloc
  irqchip/gic-v2: Report failures in gic_irq_domain_alloc
  irqchip/atmel-aic: Remove root argument from ->fixup() prototype
  irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup()
  irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup()
2017-08-20 09:07:56 -07:00
Linus Torvalds
e18a5ebc2d Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull watchdog fix from Thomas Gleixner:
 "A fix for the hardlockup watchdog to prevent false positives with
  extreme Turbo-Modes which make the perf/NMI watchdog fire faster than
  the hrtimer which is used to verify.

  Slightly larger than the minimal fix, which just would increase the
  hrtimer frequency, but comes with extra overhead of more watchdog
  timer interrupts and thread wakeups for all users.

  With this change we restrict the overhead to the extreme Turbo-Mode
  systems"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kernel/watchdog: Prevent false positives with turbo modes
2017-08-20 08:54:30 -07:00
Lorenzo Bianconi
8b35a5f87a iio: magnetometer: st_magn: remove ihl property for LSM303AGR
Remove IRQ active low support for LSM303AGR since the sensor does not
support that capability for data-ready line

Fixes: a9fd053b56 (iio: st_sensors: support active-low interrupts)
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@st.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-08-20 15:21:49 +01:00
Lorenzo Bianconi
541ee9b24f iio: magnetometer: st_magn: fix status register address for LSM303AGR
Fixes: 97865fe413 (iio: st_sensors: verify interrupt event to status)
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@st.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-08-20 15:21:48 +01:00
Srinivas Pandruvada
f1664eaace iio: hid-sensor-trigger: Fix the race with user space powering up sensors
It has been reported for a while that with iio-sensor-proxy service the
rotation only works after one suspend/resume cycle. This required a wait
in the systemd unit file to avoid race. I found a Yoga 900 where I could
reproduce this.

The problem scenerio is:
- During sensor driver init, enable run time PM and also set a
  auto-suspend for 3 seconds.
	This result in one runtime resume. But there is a check to avoid
a powerup in this sequence, but rpm is active
- User space iio-sensor-proxy tries to power up the sensor. Since rpm is
  active it will simply return. But sensors were not actually
powered up in the prior sequence, so actaully the sensors will not work
- After 3 seconds the auto suspend kicks

If we add a wait in systemd service file to fire iio-sensor-proxy after
3 seconds, then now everything will work as the runtime resume will
actually powerup the sensor as this is a user request.

To avoid this:
- Remove the check to match user requested state, this will cause a
  brief powerup, but if the iio-sensor-proxy starts immediately it will
still work as the sensors are ON.
- Also move the autosuspend delay to place when user requested turn off
  of sensors, like after user finished raw read or buffer disable

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Bastien Nocera <hadess@hadess.net>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-08-20 15:21:48 +01:00
Fabrice Gasnier
a359bb2a55 iio: trigger: stm32-timer: fix get trigger mode
Fix reading trigger mode, when other bit-fields are set. SMCR register
value must be masked to read SMS (slave mode selection) only.

Fixes: 9eba381 ("iio: make stm32 trigger driver use
INDIO_HARDWARE_TRIGGERED mode")

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-08-20 15:21:47 +01:00
Dragos Bogdan
fdd0d32eb9 iio: imu: adis16480: Fix acceleration scale factor for adis16480
According to the datasheet, the range of the acceleration is [-10 g, + 10 g],
so the scale factor should be 10 instead of 5.

Signed-off-by: Dragos Bogdan <dragos.bogdan@analog.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-08-20 15:21:47 +01:00
Jonathan Corbet
f00fd7ae4f PATCH] iio: Fix some documentation warnings
The kerneldoc description for the trig_readonly field of struct iio_dev
lacked a colon, leading to this doc build warning:

  ./include/linux/iio/iio.h:603: warning: No description found for parameter 'trig_readonly'

A similar issue for iio_trigger_set_immutable() in trigger.h yielded:

  ./include/linux/iio/trigger.h:151: warning: No description found for parameter 'indio_dev'
  ./include/linux/iio/trigger.h:151: warning: No description found for parameter 'trig'

Fix the formatting and silence the warnings.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-08-20 15:21:46 +01:00
Alexey Dobriyan
8fbbe2d7cc genirq/ipi: Fixup checks against nr_cpu_ids
Valid CPU ids are [0, nr_cpu_ids-1] inclusive.

Fixes: 3b8e29a82d ("genirq: Implement ipi_send_mask/single()")
Fixes: f9bce791ae ("genirq: Add a new function to get IPI reverse mapping")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170819095751.GB27864@avx2
2017-08-20 10:49:05 +02:00
Takashi Sakamoto
dbd7396b4f ALSA: firewire-motu: destroy stream data surely at failure of card initialization
When failing sound card registration after initializing stream data, this
module leaves allocated data in stream data. This commit fixes the bug.

Fixes: 9b2bb4f2f4 ('ALSA: firewire-motu: add stream management functionality')
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-20 09:40:45 +02:00
Takashi Sakamoto
0c264af7be ALSA: firewire: fix NULL pointer dereference when releasing uninitialized data of iso-resource
When calling 'iso_resource_free()' for uninitialized data, this function
causes NULL pointer dereference due to its 'unit' member. This occurs when
unplugging audio and music units on IEEE 1394 bus at failure of card
registration.

This commit fixes the bug. The bug exists since kernel v4.5.

Fixes: 324540c4e0 ('ALSA: fireface: postpone sound card registration') at v4.12
Fixes: 8865a31e0f ('ALSA: firewire-motu: postpone sound card registration') at v4.12
Fixes: b610386c8a ('ALSA: firewire-tascam: deleyed registration of sound card') at v4.7
Fixes: 86c8dd7f4d ('ALSA: firewire-digi00x: delayed registration of sound card') at v4.7
Fixes: 6c29230e2a ('ALSA: oxfw: delayed registration of sound card') at v4.7
Fixes: 7d3c1d5901 ('ALSA: fireworks: delayed registration of sound card') at v4.7
Fixes: 04a2c73c97 ('ALSA: bebob: delayed registration of sound card') at v4.7
Fixes: b59fb1900b ('ALSA: dice: postpone card registration') at v4.5
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-20 09:40:32 +02:00
Aaron Ma
ec667683c5 Input: trackpoint - add new trackpoint firmware ID
Synaptics add new TP firmware ID: 0x2 and 0x3, for now both lower 2 bits
are indicated as TP. Change the constant to bitwise values.

This makes trackpoint to be recognized on Lenovo Carbon X1 Gen5 instead
of it being identified as "PS/2 Generic Mouse".

Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-08-18 17:04:11 -07:00
KT Liao
1d2226e450 Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
Add ELAN0602 to the list of known ACPI IDs to enable support for ELAN
touchpads found in Lenovo Yoga310.

Signed-off-by: KT Liao <kt.liao@emc.com.tw>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-08-18 17:03:57 -07:00
Xin Long
4f8a881acc net: sched: fix NULL pointer dereference when action calls some targets
As we know in some target's checkentry it may dereference par.entryinfo
to check entry stuff inside. But when sched action calls xt_check_target,
par.entryinfo is set with NULL. It would cause kernel panic when calling
some targets.

It can be reproduce with:
  # tc qd add dev eth1 ingress handle ffff:
  # tc filter add dev eth1 parent ffff: u32 match u32 0 0 action xt \
    -j ECN --ecn-tcp-remove

It could also crash kernel when using target CLUSTERIP or TPROXY.

By now there's no proper value for par.entryinfo in ipt_init_target,
but it can not be set with NULL. This patch is to void all these
panics by setting it with an ipt_entry obj with all members = 0.

Note that this issue has been there since the very beginning.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:25:49 -07:00
David Howells
9a19bad70c rxrpc: Fix oops when discarding a preallocated service call
rxrpc_service_prealloc_one() doesn't set the socket pointer on any new call
it preallocates, but does add it to the rxrpc net namespace call list.
This, however, causes rxrpc_put_call() to oops when the call is discarded
when the socket is closed.  rxrpc_put_call() needs the socket to be able to
reach the namespace so that it can use a lock held therein.

Fix this by setting a call's socket pointer immediately before discarding
it.

This can be triggered by unloading the kafs module, resulting in an oops
like the following:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
IP: rxrpc_put_call+0x1e2/0x32d
PGD 0
P4D 0
Oops: 0000 [#1] SMP
Modules linked in: kafs(E-)
CPU: 3 PID: 3037 Comm: rmmod Tainted: G            E   4.12.0-fscache+ #213
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
task: ffff8803fc92e2c0 task.stack: ffff8803fef74000
RIP: 0010:rxrpc_put_call+0x1e2/0x32d
RSP: 0018:ffff8803fef77e08 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8803fab99ac0 RCX: 000000000000000f
RDX: ffffffff81c50a40 RSI: 000000000000000c RDI: ffff8803fc92ea88
RBP: ffff8803fef77e30 R08: ffff8803fc87b941 R09: ffffffff82946d20
R10: ffff8803fef77d10 R11: 00000000000076fc R12: 0000000000000005
R13: ffff8803fab99c20 R14: 0000000000000001 R15: ffffffff816c6aee
FS:  00007f915a059700(0000) GS:ffff88041fb80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000030 CR3: 00000003fef39000 CR4: 00000000001406e0
Call Trace:
 rxrpc_discard_prealloc+0x325/0x341
 rxrpc_listen+0xf9/0x146
 kernel_listen+0xb/0xd
 afs_close_socket+0x3e/0x173 [kafs]
 afs_exit+0x1f/0x57 [kafs]
 SyS_delete_module+0x10f/0x19a
 do_syscall_64+0x8a/0x149
 entry_SYSCALL64_slow_path+0x25/0x25

Fixes: 2baec2c3f8 ("rxrpc: Support network namespacing")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:23:23 -07:00
Colin Ian King
b024d949a3 irda: do not leak initialized list.dev to userspace
list.dev has not been initialized and so the copy_to_user is copying
data from the stack back to user space which is a potential
information leak. Fix this ensuring all of list is initialized to
zero.

Detected by CoverityScan, CID#1357894 ("Uninitialized scalar variable")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:21:51 -07:00
Huy Nguyen
ca3d89a3eb net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
enable_4k_uar module parameter was added in patch cited below to
address the backward compatibility issue in SRIOV when the VM has
system's PAGE_SIZE uar implementation and the Hypervisor has 4k uar
implementation.

The above compatibility issue does not exist in the non SRIOV case.
In this patch, we always enable 4k uar implementation if SRIOV
is not enabled on mlx4's supported cards.

Fixes: 76e39ccf9c ("net/mlx4_core: Fix backward compatibility on VFs")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:15:37 -07:00
Thierry Reding
b6f6d56c91 PCI: Allow PCI express root ports to find themselves
If the pci_find_pcie_root_port() function is called on a root port
itself, return the root port rather than NULL.

This effectively reverts commit 0e40523287 ("PCI: fix oops when
try to find Root Port for a PCI device") which added an extra check
that would now be redundant.

Fixes: a99b646afa ("PCI: Disable PCIe Relaxed Ordering if unsupported")
Fixes: c56d4450eb ("PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shawn Lin <shawn.lin@rock-chips.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:14:37 -07:00
Neal Cardwell
cdbeb633ca tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
In some situations tcp_send_loss_probe() can realize that it's unable
to send a loss probe (TLP), and falls back to calling tcp_rearm_rto()
to schedule an RTO timer. In such cases, sometimes tcp_rearm_rto()
realizes that the RTO was eligible to fire immediately or at some
point in the past (delta_us <= 0). Previously in such cases
tcp_rearm_rto() was scheduling such "overdue" RTOs to happen at now +
icsk_rto, which caused needless delays of hundreds of milliseconds
(and non-linear behavior that made reproducible testing
difficult). This commit changes the logic to schedule "overdue" RTOs
ASAP, rather than at now + icsk_rto.

Fixes: 6ba8a3b19e ("tcp: Tail loss probe (TLP)")
Suggested-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:07:43 -07:00
Linus Torvalds
58d4e450a4 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "14 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes
  mm/vmalloc.c: don't unconditonally use __GFP_HIGHMEM
  mm/mempolicy: fix use after free when calling get_mempolicy
  mm/cma_debug.c: fix stack corruption due to sprintf usage
  signal: don't remove SIGNAL_UNKILLABLE for traced tasks.
  mm, oom: fix potential data corruption when oom_reaper races with writer
  mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS
  slub: fix per memcg cache leak on css offline
  mm: discard memblock data later
  test_kmod: fix description for -s -and -c parameters
  kmod: fix wait on recursive loop
  wait: add wait_event_killable_timeout()
  kernel/watchdog: fix Kconfig constraints for perf hardlockup watchdog
  mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback()
2017-08-18 16:06:33 -07:00
Roopa Prabhu
bc3aae2bba net: check and errout if res->fi is NULL when RTM_F_FIB_MATCH is set
Syzkaller hit 'general protection fault in fib_dump_info' bug on
commit 4.13-rc5..

Guilty file: net/ipv4/fib_semantics.c

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 2808 Comm: syz-executor0 Not tainted 4.13.0-rc5 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff880078562700 task.stack: ffff880078110000
RIP: 0010:fib_dump_info+0x388/0x1170 net/ipv4/fib_semantics.c:1314
RSP: 0018:ffff880078117010 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 00000000000000fe RCX: 0000000000000002
RDX: 0000000000000006 RSI: ffff880078117084 RDI: 0000000000000030
RBP: ffff880078117268 R08: 000000000000000c R09: ffff8800780d80c8
R10: 0000000058d629b4 R11: 0000000067fce681 R12: 0000000000000000
R13: ffff8800784bd540 R14: ffff8800780d80b5 R15: ffff8800780d80a4
FS:  00000000022fa940(0000) GS:ffff88007fc00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004387d0 CR3: 0000000079135000 CR4: 00000000000006f0
Call Trace:
  inet_rtm_getroute+0xc89/0x1f50 net/ipv4/route.c:2766
  rtnetlink_rcv_msg+0x288/0x680 net/core/rtnetlink.c:4217
  netlink_rcv_skb+0x340/0x470 net/netlink/af_netlink.c:2397
  rtnetlink_rcv+0x28/0x30 net/core/rtnetlink.c:4223
  netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
  netlink_unicast+0x4c4/0x6e0 net/netlink/af_netlink.c:1291
  netlink_sendmsg+0x8c4/0xca0 net/netlink/af_netlink.c:1854
  sock_sendmsg_nosec net/socket.c:633 [inline]
  sock_sendmsg+0xca/0x110 net/socket.c:643
  ___sys_sendmsg+0x779/0x8d0 net/socket.c:2035
  __sys_sendmsg+0xd1/0x170 net/socket.c:2069
  SYSC_sendmsg net/socket.c:2080 [inline]
  SyS_sendmsg+0x2d/0x50 net/socket.c:2076
  entry_SYSCALL_64_fastpath+0x1a/0xa5
  RIP: 0033:0x4512e9
  RSP: 002b:00007ffc75584cc8 EFLAGS: 00000216 ORIG_RAX:
  000000000000002e
  RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00000000004512e9
  RDX: 0000000000000000 RSI: 0000000020f2cfc8 RDI: 0000000000000003
  RBP: 000000000000000e R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000216 R12: fffffffffffffffe
  R13: 0000000000718000 R14: 0000000020c44ff0 R15: 0000000000000000
  Code: 00 0f b6 8d ec fd ff ff 48 8b 85 f0 fd ff ff 88 48 17 48 8b 45
  28 48 8d 78 30 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03
  <0f>
  b6 04 02 84 c0 74 08 3c 03 0f 8e cb 0c 00 00 48 8b 45 28 44
  RIP: fib_dump_info+0x388/0x1170 net/ipv4/fib_semantics.c:1314 RSP:
  ffff880078117010
---[ end trace 254a7af28348f88b ]---

This patch adds a res->fi NULL check.

example run:
$ip route get 0.0.0.0 iif virt1-0
broadcast 0.0.0.0 dev lo
    cache <local,brd> iif virt1-0

$ip route get 0.0.0.0 iif virt1-0 fibmatch
RTNETLINK answers: No route to host

Reported-by: idaifish <idaifish@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: b61798130f ("net: ipv4: RTM_GETROUTE: return matched fib result when requested")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:05:46 -07:00
Wei Wang
383143f31d ipv6: reset fn->rr_ptr when replacing route
syzcaller reported the following use-after-free issue in rt6_select():
BUG: KASAN: use-after-free in rt6_select net/ipv6/route.c:755 [inline] at addr ffff8800bc6994e8
BUG: KASAN: use-after-free in ip6_pol_route.isra.46+0x1429/0x1470 net/ipv6/route.c:1084 at addr ffff8800bc6994e8
Read of size 4 by task syz-executor1/439628
CPU: 0 PID: 439628 Comm: syz-executor1 Not tainted 4.3.5+ #8
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
 0000000000000000 ffff88018fe435b0 ffffffff81ca384d ffff8801d3588c00
 ffff8800bc699380 ffff8800bc699500 dffffc0000000000 ffff8801d40a47c0
 ffff88018fe435d8 ffffffff81735751 ffff88018fe43660 ffff8800bc699380
Call Trace:
 [<ffffffff81ca384d>] __dump_stack lib/dump_stack.c:15 [inline]
 [<ffffffff81ca384d>] dump_stack+0xc1/0x124 lib/dump_stack.c:51
sctp: [Deprecated]: syz-executor0 (pid 439615) Use of struct sctp_assoc_value in delayed_ack socket option.
Use struct sctp_sack_info instead
 [<ffffffff81735751>] kasan_object_err+0x21/0x70 mm/kasan/report.c:158
 [<ffffffff817359c4>] print_address_description mm/kasan/report.c:196 [inline]
 [<ffffffff817359c4>] kasan_report_error+0x1b4/0x4a0 mm/kasan/report.c:285
 [<ffffffff81735d93>] kasan_report mm/kasan/report.c:305 [inline]
 [<ffffffff81735d93>] __asan_report_load4_noabort+0x43/0x50 mm/kasan/report.c:325
 [<ffffffff82a28e39>] rt6_select net/ipv6/route.c:755 [inline]
 [<ffffffff82a28e39>] ip6_pol_route.isra.46+0x1429/0x1470 net/ipv6/route.c:1084
 [<ffffffff82a28fb1>] ip6_pol_route_output+0x81/0xb0 net/ipv6/route.c:1203
 [<ffffffff82ab0a50>] fib6_rule_action+0x1f0/0x680 net/ipv6/fib6_rules.c:95
 [<ffffffff8265cbb6>] fib_rules_lookup+0x2a6/0x7a0 net/core/fib_rules.c:223
 [<ffffffff82ab1430>] fib6_rule_lookup+0xd0/0x250 net/ipv6/fib6_rules.c:41
 [<ffffffff82a22006>] ip6_route_output+0x1d6/0x2c0 net/ipv6/route.c:1224
 [<ffffffff829e83d2>] ip6_dst_lookup_tail+0x4d2/0x890 net/ipv6/ip6_output.c:943
 [<ffffffff829e889a>] ip6_dst_lookup_flow+0x9a/0x250 net/ipv6/ip6_output.c:1079
 [<ffffffff82a9f7d8>] ip6_datagram_dst_update+0x538/0xd40 net/ipv6/datagram.c:91
 [<ffffffff82aa0978>] __ip6_datagram_connect net/ipv6/datagram.c:251 [inline]
 [<ffffffff82aa0978>] ip6_datagram_connect+0x518/0xe50 net/ipv6/datagram.c:272
 [<ffffffff82aa1313>] ip6_datagram_connect_v6_only+0x63/0x90 net/ipv6/datagram.c:284
 [<ffffffff8292f790>] inet_dgram_connect+0x170/0x1f0 net/ipv4/af_inet.c:564
 [<ffffffff82565547>] SYSC_connect+0x1a7/0x2f0 net/socket.c:1582
 [<ffffffff8256a649>] SyS_connect+0x29/0x30 net/socket.c:1563
 [<ffffffff82c72032>] entry_SYSCALL_64_fastpath+0x12/0x17
Object at ffff8800bc699380, in cache ip6_dst_cache size: 384

The root cause of it is that in fib6_add_rt2node(), when it replaces an
existing route with the new one, it does not update fn->rr_ptr.
This commit resets fn->rr_ptr to NULL when it points to a route which is
replaced in fib6_add_rt2node().

Fixes: 2759647247 ("ipv6: fix ECMP route replacement")
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:02:22 -07:00
Charles Milette
f299aec6eb staging: rtl8188eu: add RNX-N150NUB support
Add support for USB Device Rosewill RNX-N150NUB.
VendorID: 0x0bda, ProductID: 0xffef

Signed-off-by: Charles Milette <charles.milette@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-18 16:01:46 -07:00
Alexander Potapenko
15339e441e sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
KMSAN reported use of uninitialized sctp_addr->v4.sin_addr.s_addr and
sctp_addr->v6.sin6_scope_id in sctp_v6_cmp_addr() (see below).
Make sure all fields of an IPv6 address are initialized, which
guarantees that the IPv4 fields are also initialized.

==================================================================
 BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
 net/sctp/ipv6.c:517
 CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
 01/01/2011
 Call Trace:
  dump_stack+0x172/0x1c0 lib/dump_stack.c:42
  is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
  kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
  native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
  arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
  arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
  __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
  sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
  sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
  sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
  sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
  inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
  sock_sendmsg_nosec net/socket.c:633 [inline]
  sock_sendmsg net/socket.c:643 [inline]
  SYSC_sendto+0x608/0x710 net/socket.c:1696
  SyS_sendto+0x8a/0xb0 net/socket.c:1664
  entry_SYSCALL_64_fastpath+0x13/0x94
 RIP: 0033:0x44b479
 RSP: 002b:00007f6213f21c08 EFLAGS: 00000286 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 000000000044b479
 RDX: 0000000000000041 RSI: 0000000020edd000 RDI: 0000000000000006
 RBP: 00000000007080a8 R08: 0000000020b85fe4 R09: 000000000000001c
 R10: 0000000000040005 R11: 0000000000000286 R12: 00000000ffffffff
 R13: 0000000000003760 R14: 00000000006e5820 R15: 0000000000ff8000
 origin description: ----dst_saddr@sctp_v6_get_dst
 local variable created at:
  sk_fullsock include/net/sock.h:2321 [inline]
  inet6_sk include/linux/ipv6.h:309 [inline]
  sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
==================================================================
 BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
 net/sctp/ipv6.c:517
 CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
 01/01/2011
 Call Trace:
  dump_stack+0x172/0x1c0 lib/dump_stack.c:42
  is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
  kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
  native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
  arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
  arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
  __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
  sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
  sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
  sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
  sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
  inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
  sock_sendmsg_nosec net/socket.c:633 [inline]
  sock_sendmsg net/socket.c:643 [inline]
  SYSC_sendto+0x608/0x710 net/socket.c:1696
  SyS_sendto+0x8a/0xb0 net/socket.c:1664
  entry_SYSCALL_64_fastpath+0x13/0x94
 RIP: 0033:0x44b479
 RSP: 002b:00007f6213f21c08 EFLAGS: 00000286 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 000000000044b479
 RDX: 0000000000000041 RSI: 0000000020edd000 RDI: 0000000000000006
 RBP: 00000000007080a8 R08: 0000000020b85fe4 R09: 000000000000001c
 R10: 0000000000040005 R11: 0000000000000286 R12: 00000000ffffffff
 R13: 0000000000003760 R14: 00000000006e5820 R15: 0000000000ff8000
 origin description: ----dst_saddr@sctp_v6_get_dst
 local variable created at:
  sk_fullsock include/net/sock.h:2321 [inline]
  inet6_sk include/linux/ipv6.h:309 [inline]
  sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
  sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
==================================================================

Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:00:47 -07:00
Eric Dumazet
5bfd37b4de tipc: fix use-after-free
syszkaller reported use-after-free in tipc [1]

When msg->rep skb is freed, set the pointer to NULL,
so that caller does not free it again.

[1]

==================================================================
BUG: KASAN: use-after-free in skb_push+0xd4/0xe0 net/core/skbuff.c:1466
Read of size 8 at addr ffff8801c6e71e90 by task syz-executor5/4115

CPU: 1 PID: 4115 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #32
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 print_address_description+0x73/0x250 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x24e/0x340 mm/kasan/report.c:409
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
 skb_push+0xd4/0xe0 net/core/skbuff.c:1466
 tipc_nl_compat_recv+0x833/0x18f0 net/tipc/netlink_compat.c:1209
 genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
 genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
 netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
 netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
 netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
 netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 sock_write_iter+0x31a/0x5d0 net/socket.c:898
 call_write_iter include/linux/fs.h:1743 [inline]
 new_sync_write fs/read_write.c:457 [inline]
 __vfs_write+0x684/0x970 fs/read_write.c:470
 vfs_write+0x189/0x510 fs/read_write.c:518
 SYSC_write fs/read_write.c:565 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:557
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4512e9
RSP: 002b:00007f3bc8184c08 EFLAGS: 00000216 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 00000000004512e9
RDX: 0000000000000020 RSI: 0000000020fdb000 RDI: 0000000000000006
RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004b5e76
R13: 00007f3bc8184b48 R14: 00000000004b5e86 R15: 0000000000000000

Allocated by task 4115:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
 kmem_cache_alloc_node+0x13d/0x750 mm/slab.c:3651
 __alloc_skb+0xf1/0x740 net/core/skbuff.c:219
 alloc_skb include/linux/skbuff.h:903 [inline]
 tipc_tlv_alloc+0x26/0xb0 net/tipc/netlink_compat.c:148
 tipc_nl_compat_dumpit+0xf2/0x3c0 net/tipc/netlink_compat.c:248
 tipc_nl_compat_handle net/tipc/netlink_compat.c:1130 [inline]
 tipc_nl_compat_recv+0x756/0x18f0 net/tipc/netlink_compat.c:1199
 genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
 genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
 netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
 netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
 netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
 netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 sock_write_iter+0x31a/0x5d0 net/socket.c:898
 call_write_iter include/linux/fs.h:1743 [inline]
 new_sync_write fs/read_write.c:457 [inline]
 __vfs_write+0x684/0x970 fs/read_write.c:470
 vfs_write+0x189/0x510 fs/read_write.c:518
 SYSC_write fs/read_write.c:565 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:557
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 4115:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
 __cache_free mm/slab.c:3503 [inline]
 kmem_cache_free+0x77/0x280 mm/slab.c:3763
 kfree_skbmem+0x1a1/0x1d0 net/core/skbuff.c:622
 __kfree_skb net/core/skbuff.c:682 [inline]
 kfree_skb+0x165/0x4c0 net/core/skbuff.c:699
 tipc_nl_compat_dumpit+0x36a/0x3c0 net/tipc/netlink_compat.c:260
 tipc_nl_compat_handle net/tipc/netlink_compat.c:1130 [inline]
 tipc_nl_compat_recv+0x756/0x18f0 net/tipc/netlink_compat.c:1199
 genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
 genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
 netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
 netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
 netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
 netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 sock_write_iter+0x31a/0x5d0 net/socket.c:898
 call_write_iter include/linux/fs.h:1743 [inline]
 new_sync_write fs/read_write.c:457 [inline]
 __vfs_write+0x684/0x970 fs/read_write.c:470
 vfs_write+0x189/0x510 fs/read_write.c:518
 SYSC_write fs/read_write.c:565 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:557
 entry_SYSCALL_64_fastpath+0x1f/0xbe

The buggy address belongs to the object at ffff8801c6e71dc0
 which belongs to the cache skbuff_head_cache of size 224
The buggy address is located 208 bytes inside of
 224-byte region [ffff8801c6e71dc0, ffff8801c6e71ea0)
The buggy address belongs to the page:
page:ffffea00071b9c40 count:1 mapcount:0 mapping:ffff8801c6e71000 index:0x0
flags: 0x200000000000100(slab)
raw: 0200000000000100 ffff8801c6e71000 0000000000000000 000000010000000c
raw: ffffea0007224a20 ffff8801d98caf48 ffff8801d9e79040 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8801c6e71d80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
 ffff8801c6e71e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8801c6e71e80: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
                         ^
 ffff8801c6e71f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8801c6e71f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov  <dvyukov@google.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 15:59:29 -07:00
Eric Dumazet
ff244c6b29 tun: handle register_netdevice() failures properly
syzkaller reported a double free [1], caused by the fact
that tun driver was not updated properly when priv_destructor
was added.

When/if register_netdevice() fails, priv_destructor() must have been
called already.

[1]
BUG: KASAN: double-free or invalid-free in selinux_tun_dev_free_security+0x15/0x20 security/selinux/hooks.c:5023

CPU: 0 PID: 2919 Comm: syzkaller227220 Not tainted 4.13.0-rc4+ #23
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 print_address_description+0x7f/0x260 mm/kasan/report.c:252
 kasan_report_double_free+0x55/0x80 mm/kasan/report.c:333
 kasan_slab_free+0xa0/0xc0 mm/kasan/kasan.c:514
 __cache_free mm/slab.c:3503 [inline]
 kfree+0xd3/0x260 mm/slab.c:3820
 selinux_tun_dev_free_security+0x15/0x20 security/selinux/hooks.c:5023
 security_tun_dev_free_security+0x48/0x80 security/security.c:1512
 tun_set_iff drivers/net/tun.c:1884 [inline]
 __tun_chr_ioctl+0x2ce6/0x3d50 drivers/net/tun.c:2064
 tun_chr_ioctl+0x2a/0x40 drivers/net/tun.c:2309
 vfs_ioctl fs/ioctl.c:45 [inline]
 do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
 SYSC_ioctl fs/ioctl.c:700 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x443ff9
RSP: 002b:00007ffc34271f68 EFLAGS: 00000217 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000004002e0 RCX: 0000000000443ff9
RDX: 0000000020533000 RSI: 00000000400454ca RDI: 0000000000000003
RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000401ce0
R13: 0000000000401d70 R14: 0000000000000000 R15: 0000000000000000

Allocated by task 2919:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:551
 kmem_cache_alloc_trace+0x101/0x6f0 mm/slab.c:3627
 kmalloc include/linux/slab.h:493 [inline]
 kzalloc include/linux/slab.h:666 [inline]
 selinux_tun_dev_alloc_security+0x49/0x170 security/selinux/hooks.c:5012
 security_tun_dev_alloc_security+0x6d/0xa0 security/security.c:1506
 tun_set_iff drivers/net/tun.c:1839 [inline]
 __tun_chr_ioctl+0x1730/0x3d50 drivers/net/tun.c:2064
 tun_chr_ioctl+0x2a/0x40 drivers/net/tun.c:2309
 vfs_ioctl fs/ioctl.c:45 [inline]
 do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
 SYSC_ioctl fs/ioctl.c:700 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 2919:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_slab_free+0x6e/0xc0 mm/kasan/kasan.c:524
 __cache_free mm/slab.c:3503 [inline]
 kfree+0xd3/0x260 mm/slab.c:3820
 selinux_tun_dev_free_security+0x15/0x20 security/selinux/hooks.c:5023
 security_tun_dev_free_security+0x48/0x80 security/security.c:1512
 tun_free_netdev+0x13b/0x1b0 drivers/net/tun.c:1563
 register_netdevice+0x8d0/0xee0 net/core/dev.c:7605
 tun_set_iff drivers/net/tun.c:1859 [inline]
 __tun_chr_ioctl+0x1caf/0x3d50 drivers/net/tun.c:2064
 tun_chr_ioctl+0x2a/0x40 drivers/net/tun.c:2309
 vfs_ioctl fs/ioctl.c:45 [inline]
 do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
 SYSC_ioctl fs/ioctl.c:700 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
 entry_SYSCALL_64_fastpath+0x1f/0xbe

The buggy address belongs to the object at ffff8801d2843b40
 which belongs to the cache kmalloc-32 of size 32
The buggy address is located 0 bytes inside of
 32-byte region [ffff8801d2843b40, ffff8801d2843b60)
The buggy address belongs to the page:
page:ffffea000660cea8 count:1 mapcount:0 mapping:ffff8801d2843000 index:0xffff8801d2843fc1
flags: 0x200000000000100(slab)
raw: 0200000000000100 ffff8801d2843000 ffff8801d2843fc1 000000010000003f
raw: ffffea0006626a40 ffffea00066141a0 ffff8801dbc00100
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8801d2843a00: fb fb fb fb fc fc fc fc fb fb fb fb fc fc fc fc
 ffff8801d2843a80: 00 00 00 fc fc fc fc fc fb fb fb fb fc fc fc fc
>ffff8801d2843b00: 00 00 00 00 fc fc fc fc fb fb fb fb fc fc fc fc
                                           ^
 ffff8801d2843b80: fb fb fb fb fc fc fc fc fb fb fb fb fc fc fc fc
 ffff8801d2843c00: fb fb fb fb fc fc fc fc fb fb fb fb fc fc fc fc

==================================================================

Fixes: cf124db566 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 15:55:35 -07:00
Kees Cook
c715b72c1b mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes
Moving the x86_64 and arm64 PIE base from 0x555555554000 to 0x000100000000
broke AddressSanitizer.  This is a partial revert of:

  eab09532d4 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
  02445990a9 ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB")

The AddressSanitizer tool has hard-coded expectations about where
executable mappings are loaded.

The motivation for changing the PIE base in the above commits was to
avoid the Stack-Clash CVEs that allowed executable mappings to get too
close to heap and stack.  This was mainly a problem on 32-bit, but the
64-bit bases were moved too, in an effort to proactively protect those
systems (proofs of concept do exist that show 64-bit collisions, but
other recent changes to fix stack accounting and setuid behaviors will
minimize the impact).

The new 32-bit PIE base is fine for ASan (since it matches the ET_EXEC
base), so only the 64-bit PIE base needs to be reverted to let x86 and
arm64 ASan binaries run again.  Future changes to the 64-bit PIE base on
these architectures can be made optional once a more dynamic method for
dealing with AddressSanitizer is found.  (e.g.  always loading PIE into
the mmap region for marked binaries.)

Link: http://lkml.kernel.org/r/20170807201542.GA21271@beast
Fixes: eab09532d4 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
Fixes: 02445990a9 ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Kostya Serebryany <kcc@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:02 -07:00
Laura Abbott
704b862f9e mm/vmalloc.c: don't unconditonally use __GFP_HIGHMEM
Commit 19809c2da2 ("mm, vmalloc: use __GFP_HIGHMEM implicitly") added
use of __GFP_HIGHMEM for allocations.  vmalloc_32 may use
GFP_DMA/GFP_DMA32 which does not play nice with __GFP_HIGHMEM and will
trigger a BUG in gfp_zone.

Only add __GFP_HIGHMEM if we aren't using GFP_DMA/GFP_DMA32.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1482249
Link: http://lkml.kernel.org/r/20170816220705.31374-1-labbott@redhat.com
Fixes: 19809c2da2 ("mm, vmalloc: use __GFP_HIGHMEM implicitly")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:02 -07:00
zhong jiang
73223e4e2e mm/mempolicy: fix use after free when calling get_mempolicy
I hit a use after free issue when executing trinity and repoduced it
with KASAN enabled.  The related call trace is as follows.

  BUG: KASan: use after free in SyS_get_mempolicy+0x3c8/0x960 at addr ffff8801f582d766
  Read of size 2 by task syz-executor1/798

  INFO: Allocated in mpol_new.part.2+0x74/0x160 age=3 cpu=1 pid=799
     __slab_alloc+0x768/0x970
     kmem_cache_alloc+0x2e7/0x450
     mpol_new.part.2+0x74/0x160
     mpol_new+0x66/0x80
     SyS_mbind+0x267/0x9f0
     system_call_fastpath+0x16/0x1b
  INFO: Freed in __mpol_put+0x2b/0x40 age=4 cpu=1 pid=799
     __slab_free+0x495/0x8e0
     kmem_cache_free+0x2f3/0x4c0
     __mpol_put+0x2b/0x40
     SyS_mbind+0x383/0x9f0
     system_call_fastpath+0x16/0x1b
  INFO: Slab 0xffffea0009cb8dc0 objects=23 used=8 fp=0xffff8801f582de40 flags=0x200000000004080
  INFO: Object 0xffff8801f582d760 @offset=5984 fp=0xffff8801f582d600

  Bytes b4 ffff8801f582d750: ae 01 ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
  Object ffff8801f582d760: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object ffff8801f582d770: 6b 6b 6b 6b 6b 6b 6b a5                          kkkkkkk.
  Redzone ffff8801f582d778: bb bb bb bb bb bb bb bb                          ........
  Padding ffff8801f582d8b8: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
  Memory state around the buggy address:
  ffff8801f582d600: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff8801f582d680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  >ffff8801f582d700: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fc

!shared memory policy is not protected against parallel removal by other
thread which is normally protected by the mmap_sem.  do_get_mempolicy,
however, drops the lock midway while we can still access it later.

Early premature up_read is a historical artifact from times when
put_user was called in this path see https://lwn.net/Articles/124754/
but that is gone since 8bccd85ffb ("[PATCH] Implement sys_* do_*
layering in the memory policy layer.").  but when we have the the
current mempolicy ref count model.  The issue was introduced
accordingly.

Fix the issue by removing the premature release.

Link: http://lkml.kernel.org/r/1502950924-27521-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org>	[2.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:02 -07:00
Prakash Gupta
da094e4284 mm/cma_debug.c: fix stack corruption due to sprintf usage
name[] in cma_debugfs_add_one() can only accommodate 16 chars including
NULL to store sprintf output.  It's common for cma device name to be
larger than 15 chars.  This can cause stack corrpution.  If the gcc
stack protector is turned on, this can cause a panic due to stack
corruption.

Below is one example trace:

  Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in:
  ffffff8e69a75730
  Call trace:
     dump_backtrace+0x0/0x2c4
     show_stack+0x20/0x28
     dump_stack+0xb8/0xf4
     panic+0x154/0x2b0
     print_tainted+0x0/0xc0
     cma_debugfs_init+0x274/0x290
     do_one_initcall+0x5c/0x168
     kernel_init_freeable+0x1c8/0x280

Fix the short sprintf buffer in cma_debugfs_add_one() by using
scnprintf() instead of sprintf().

Link: http://lkml.kernel.org/r/1502446217-21840-1-git-send-email-guptap@codeaurora.org
Fixes: f318dd083c ("cma: Store a name in the cma structure")
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
Acked-by: Laura Abbott <labbott@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:02 -07:00
Jamie Iles
eb61b5911b signal: don't remove SIGNAL_UNKILLABLE for traced tasks.
When forcing a signal, SIGNAL_UNKILLABLE is removed to prevent recursive
faults, but this is undesirable when tracing.  For example, debugging an
init process (whether global or namespace), hitting a breakpoint and
SIGTRAP will force SIGTRAP and then remove SIGNAL_UNKILLABLE.
Everything continues fine, but then once debugging has finished, the
init process is left killable which is unlikely what the user expects,
resulting in either an accidentally killed init or an init that stops
reaping zombies.

Link: http://lkml.kernel.org/r/20170815112806.10728-1-jamie.iles@oracle.com
Signed-off-by: Jamie Iles <jamie.iles@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:02 -07:00
Michal Hocko
6b31d5955c mm, oom: fix potential data corruption when oom_reaper races with writer
Wenwei Tao has noticed that our current assumption that the oom victim
is dying and never doing any visible changes after it dies, and so the
oom_reaper can tear it down, is not entirely true.

__task_will_free_mem consider a task dying when SIGNAL_GROUP_EXIT is set
but do_group_exit sends SIGKILL to all threads _after_ the flag is set.
So there is a race window when some threads won't have
fatal_signal_pending while the oom_reaper could start unmapping the
address space.  Moreover some paths might not check for fatal signals
before each PF/g-u-p/copy_from_user.

We already have a protection for oom_reaper vs.  PF races by checking
MMF_UNSTABLE.  This has been, however, checked only for kernel threads
(use_mm users) which can outlive the oom victim.  A simple fix would be
to extend the current check in handle_mm_fault for all tasks but that
wouldn't be sufficient because the current check assumes that a kernel
thread would bail out after EFAULT from get_user*/copy_from_user and
never re-read the same address which would succeed because the PF path
has established page tables already.  This seems to be the case for the
only existing use_mm user currently (virtio driver) but it is rather
fragile in general.

This is even more fragile in general for more complex paths such as
generic_perform_write which can re-read the same address more times
(e.g.  iov_iter_copy_from_user_atomic to fail and then
iov_iter_fault_in_readable on retry).

Therefore we have to implement MMF_UNSTABLE protection in a robust way
and never make a potentially corrupted content visible.  That requires
to hook deeper into the PF path and check for the flag _every time_
before a pte for anonymous memory is established (that means all
!VM_SHARED mappings).

The corruption can be triggered artificially
(http://lkml.kernel.org/r/201708040646.v746kkhC024636@www262.sakura.ne.jp)
but there doesn't seem to be any real life bug report.  The race window
should be quite tight to trigger most of the time.

Link: http://lkml.kernel.org/r/20170807113839.16695-3-mhocko@kernel.org
Fixes: aac4536355 ("mm, oom: introduce oom reaper")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Wenwei Tao <wenwei.tww@alibaba-inc.com>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andrea Argangeli <andrea@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:01 -07:00
Michal Hocko
5b53a6ea88 mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS
Tetsuo Handa has noticed that MMF_UNSTABLE SIGBUS path in
handle_mm_fault causes a lockdep splat

  Out of memory: Kill process 1056 (a.out) score 603 or sacrifice child
  Killed process 1056 (a.out) total-vm:4268108kB, anon-rss:2246048kB, file-rss:0kB, shmem-rss:0kB
  a.out (1169) used greatest stack depth: 11664 bytes left
  DEBUG_LOCKS_WARN_ON(depth <= 0)
  ------------[ cut here ]------------
  WARNING: CPU: 6 PID: 1339 at kernel/locking/lockdep.c:3617 lock_release+0x172/0x1e0
  CPU: 6 PID: 1339 Comm: a.out Not tainted 4.13.0-rc3-next-20170803+ #142
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
  RIP: 0010:lock_release+0x172/0x1e0
  Call Trace:
     up_read+0x1a/0x40
     __do_page_fault+0x28e/0x4c0
     do_page_fault+0x30/0x80
     page_fault+0x28/0x30

The reason is that the page fault path might have dropped the mmap_sem
and returned with VM_FAULT_RETRY.  MMF_UNSTABLE check however rewrites
the error path to VM_FAULT_SIGBUS and we always expect mmap_sem taken in
that path.  Fix this by taking mmap_sem when VM_FAULT_RETRY is held in
the MMF_UNSTABLE path.

We cannot simply add VM_FAULT_SIGBUS to the existing error code because
all arch specific page fault handlers and g-u-p would have to learn a
new error code combination.

Link: http://lkml.kernel.org/r/20170807113839.16695-2-mhocko@kernel.org
Fixes: 3f70dc38ce ("mm: make sure that kthreads will not refault oom reaped memory")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Andrea Argangeli <andrea@kernel.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Wenwei Tao <wenwei.tww@alibaba-inc.com>
Cc: <stable@vger.kernel.org>	[4.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:01 -07:00
Vladimir Davydov
f6ba488073 slub: fix per memcg cache leak on css offline
To avoid a possible deadlock, sysfs_slab_remove() schedules an
asynchronous work to delete sysfs entries corresponding to the kmem
cache.  To ensure the cache isn't freed before the work function is
called, it takes a reference to the cache kobject.  The reference is
supposed to be released by the work function.

However, the work function (sysfs_slab_remove_workfn()) does nothing in
case the cache sysfs entry has already been deleted, leaking the kobject
and the corresponding cache.

This may happen on a per memcg cache destruction, because sysfs entries
of a per memcg cache are deleted on memcg offline if the cache is empty
(see __kmemcg_cache_deactivate()).

The kmemleak report looks like this:

  unreferenced object 0xffff9f798a79f540 (size 32):
    comm "kworker/1:4", pid 15416, jiffies 4307432429 (age 28687.554s)
    hex dump (first 32 bytes):
      6b 6d 61 6c 6c 6f 63 2d 31 36 28 31 35 39 39 3a  kmalloc-16(1599:
      6e 65 77 72 6f 6f 74 29 00 23 6b c0 ff ff ff ff  newroot).#k.....
    backtrace:
       kmemleak_alloc+0x4a/0xa0
       __kmalloc_track_caller+0x148/0x2c0
       kvasprintf+0x66/0xd0
       kasprintf+0x49/0x70
       memcg_create_kmem_cache+0xe6/0x160
       memcg_kmem_cache_create_func+0x20/0x110
       process_one_work+0x205/0x5d0
       worker_thread+0x4e/0x3a0
       kthread+0x109/0x140
       ret_from_fork+0x2a/0x40
  unreferenced object 0xffff9f79b6136840 (size 416):
    comm "kworker/1:4", pid 15416, jiffies 4307432429 (age 28687.573s)
    hex dump (first 32 bytes):
      40 fb 80 c2 3e 33 00 00 00 00 00 40 00 00 00 00  @...>3.....@....
      00 00 00 00 00 00 00 00 10 00 00 00 10 00 00 00  ................
    backtrace:
       kmemleak_alloc+0x4a/0xa0
       kmem_cache_alloc+0x128/0x280
       create_cache+0x3b/0x1e0
       memcg_create_kmem_cache+0x118/0x160
       memcg_kmem_cache_create_func+0x20/0x110
       process_one_work+0x205/0x5d0
       worker_thread+0x4e/0x3a0
       kthread+0x109/0x140
       ret_from_fork+0x2a/0x40

Fix the leak by adding the missing call to kobject_put() to
sysfs_slab_remove_workfn().

Link: http://lkml.kernel.org/r/20170812181134.25027-1-vdavydov.dev@gmail.com
Fixes: 3b7b314053 ("slub: make sysfs file removal asynchronous")
Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Reported-by: Andrei Vagin <avagin@gmail.com>
Tested-by: Andrei Vagin <avagin@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>	[4.12.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:01 -07:00
Pavel Tatashin
3010f87650 mm: discard memblock data later
There is existing use after free bug when deferred struct pages are
enabled:

The memblock_add() allocates memory for the memory array if more than
128 entries are needed.  See comment in e820__memblock_setup():

  * The bootstrap memblock region count maximum is 128 entries
  * (INIT_MEMBLOCK_REGIONS), but EFI might pass us more E820 entries
  * than that - so allow memblock resizing.

This memblock memory is freed here:
        free_low_memory_core_early()

We access the freed memblock.memory later in boot when deferred pages
are initialized in this path:

        deferred_init_memmap()
                for_each_mem_pfn_range()
                  __next_mem_pfn_range()
                    type = &memblock.memory;

One possible explanation for why this use-after-free hasn't been hit
before is that the limit of INIT_MEMBLOCK_REGIONS has never been
exceeded at least on systems where deferred struct pages were enabled.

Tested by reducing INIT_MEMBLOCK_REGIONS down to 4 from the current 128,
and verifying in qemu that this code is getting excuted and that the
freed pages are sane.

Link: http://lkml.kernel.org/r/1502485554-318703-2-git-send-email-pasha.tatashin@oracle.com
Fixes: 7e18adb4f8 ("mm: meminit: initialise remaining struct pages in parallel with kswapd")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:01 -07:00
Luis R. Rodriguez
768dc4e484 test_kmod: fix description for -s -and -c parameters
The descriptions were reversed, correct this.

Link: http://lkml.kernel.org/r/20170809234635.13443-4-mcgrof@kernel.org
Fixes: 64b671204a ("test_sysctl: add generic script to expand on tests")
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Reported-by: Daniel Mentz <danielmentz@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matt Redfearn <matt.redfearn@imgetc.com>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Michal Marek <mmarek@suse.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:01 -07:00
Luis R. Rodriguez
2ba293c9e7 kmod: fix wait on recursive loop
Recursive loops with module loading were previously handled in kmod by
restricting the number of modprobe calls to 50 and if that limit was
breached request_module() would return an error and a user would see the
following on their kernel dmesg:

  request_module: runaway loop modprobe binfmt-464c
  Starting init:/sbin/init exists but couldn't execute it (error -8)

This issue could happen for instance when a 64-bit kernel boots a 32-bit
userspace on some architectures and has no 32-bit binary format
hanlders.  This is visible, for instance, when a CONFIG_MODULES enabled
64-bit MIPS kernel boots a into o32 root filesystem and the binfmt
handler for o32 binaries is not built-in.

After commit 6d7964a722 ("kmod: throttle kmod thread limit") we now
don't have any visible signs of an error and the kernel just waits for
the loop to end somehow.

Although this *particular* recursive loop could also be addressed by
doing a sanity check on search_binary_handler() and disallowing a
modular binfmt to be required for modprobe, a generic solution for any
recursive kernel kmod issues is still needed.

This should catch these loops.  We can investigate each loop and address
each one separately as they come in, this however puts a stop gap for
them as before.

Link: http://lkml.kernel.org/r/20170809234635.13443-3-mcgrof@kernel.org
Fixes: 6d7964a722 ("kmod: throttle kmod thread limit")
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Reported-by: Matt Redfearn <matt.redfearn@imgtec.com>
Tested-by: Matt Redfearn <matt.redfearn@imgetc.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:01 -07:00
Luis R. Rodriguez
8ada92799e wait: add wait_event_killable_timeout()
These are the few pending fixes I have queued up for v4.13-final.  One
is a a generic regression fix for recursive loops on kmod and the other
one is a trivial print out correction.

During the v4.13 development we assumed that recursive kmod loops were
no longer possible.  Clearly that is not true.  The regression fix makes
use of a new killable wait.  We use a killable wait to be paranoid in
how signals might be sent to modprobe and only accept a proper SIGKILL.
The signal will only be available to userspace to issue *iff* a thread
has already entered a wait state, and that happens only if we've already
throttled after 50 kmod threads have been hit.

Note that although it may seem excessive to trigger a failure afer 5
seconds if all kmod thread remain busy, prior to the series of changes
that went into v4.13 we would actually *always* fatally fail any request
which came in if the limit was already reached.  The new waiting
implemented in v4.13 actually gives us *more* breathing room -- the wait
for 5 seconds is a wait for *any* kmod thread to finish.  We give up and
fail *iff* no kmod thread has finished and they're *all* running
straight for 5 consecutive seconds.  If 50 kmod threads are running
consecutively for 5 seconds something else must be really bad.

Recursive loops with kmod are bad but they're also hard to implement
properly as a selftest without currently fooling current userspace tools
like kmod [1].  For instance kmod will complain when you run depmod if
it finds a recursive loop with symbol dependency between modules as such
this type of recursive loop cannot go upstream as the modules_install
target will fail after running depmod.

These tests already exist on userspace kmod upstream though (refer to
the testsuite/module-playground/mod-loop-*.c files).  The same is not
true if request_module() is used though, or worst if aliases are used.

Likewise the issue with 64-bit kernels booting 32-bit userspace without
a binfmt handler built-in is also currently not detected and proactively
avoided by userspace kmod tools, or kconfig for all architectures.
Although we could complain in the kernel when some of these individual
recursive issues creep up, proactively avoiding these situations in
userspace at build time is what we should keep striving for.

Lastly, since recursive loops could happen with kmod it may mean
recursive loops may also be possible with other kernel usermode helpers,
this should be investigated and long term if we can come up with a more
sensible generic solution even better!

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20170809-kmod-for-v4.13-final
[1] https://git.kernel.org/pub/scm/utils/kernel/kmod/kmod.git

This patch (of 3):

This wait is similar to wait_event_interruptible_timeout() but only
accepts SIGKILL interrupt signal.  Other signals are ignored.

Link: http://lkml.kernel.org/r/20170809234635.13443-2-mcgrof@kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michal Marek <mmarek@suse.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Matt Redfearn <matt.redfearn@imgetc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:01 -07:00
Nicholas Piggin
92e5aae457 kernel/watchdog: fix Kconfig constraints for perf hardlockup watchdog
Commit 05a4a95279 ("kernel/watchdog: split up config options") lost
the perf-based hardlockup detector's dependency on PERF_EVENTS, which
can result in broken builds with some powerpc configurations.

Restore the dependency.  Add it in for x86 too, despite x86 always
selecting PERF_EVENTS it seems reasonable to make the dependency
explicit.

Link: http://lkml.kernel.org/r/20170810114452.6673-1-npiggin@gmail.com
Fixes: 05a4a95279 ("kernel/watchdog: split up config options")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:01 -07:00
Johannes Weiner
739f79fc9d mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback()
Jaegeuk and Brad report a NULL pointer crash when writeback ending tries
to update the memcg stats:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000003b0
    IP: test_clear_page_writeback+0x12e/0x2c0
    [...]
    RIP: 0010:test_clear_page_writeback+0x12e/0x2c0
    Call Trace:
     <IRQ>
     end_page_writeback+0x47/0x70
     f2fs_write_end_io+0x76/0x180 [f2fs]
     bio_endio+0x9f/0x120
     blk_update_request+0xa8/0x2f0
     scsi_end_request+0x39/0x1d0
     scsi_io_completion+0x211/0x690
     scsi_finish_command+0xd9/0x120
     scsi_softirq_done+0x127/0x150
     __blk_mq_complete_request_remote+0x13/0x20
     flush_smp_call_function_queue+0x56/0x110
     generic_smp_call_function_single_interrupt+0x13/0x30
     smp_call_function_single_interrupt+0x27/0x40
     call_function_single_interrupt+0x89/0x90
    RIP: 0010:native_safe_halt+0x6/0x10

    (gdb) l *(test_clear_page_writeback+0x12e)
    0xffffffff811bae3e is in test_clear_page_writeback (./include/linux/memcontrol.h:619).
    614		mod_node_page_state(page_pgdat(page), idx, val);
    615		if (mem_cgroup_disabled() || !page->mem_cgroup)
    616			return;
    617		mod_memcg_state(page->mem_cgroup, idx, val);
    618		pn = page->mem_cgroup->nodeinfo[page_to_nid(page)];
    619		this_cpu_add(pn->lruvec_stat->count[idx], val);
    620	}
    621
    622	unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
    623							gfp_t gfp_mask,

The issue is that writeback doesn't hold a page reference and the page
might get freed after PG_writeback is cleared (and the mapping is
unlocked) in test_clear_page_writeback().  The stat functions looking up
the page's node or zone are safe, as those attributes are static across
allocation and free cycles.  But page->mem_cgroup is not, and it will
get cleared if we race with truncation or migration.

It appears this race window has been around for a while, but less likely
to trigger when the memcg stats were updated first thing after
PG_writeback is cleared.  Recent changes reshuffled this code to update
the global node stats before the memcg ones, though, stretching the race
window out to an extent where people can reproduce the problem.

Update test_clear_page_writeback() to look up and pin page->mem_cgroup
before clearing PG_writeback, then not use that pointer afterward.  It
is a partial revert of 62cccb8c8e ("mm: simplify lock_page_memcg()")
but leaves the pageref-holding callsites that aren't affected alone.

Link: http://lkml.kernel.org/r/20170809183825.GA26387@cmpxchg.org
Fixes: 62cccb8c8e ("mm: simplify lock_page_memcg()")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Jaegeuk Kim <jaegeuk@kernel.org>
Tested-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reported-by: Bradley Bolen <bradleybolen@gmail.com>
Tested-by: Brad Bolen <bradleybolen@gmail.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>	[4.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-18 15:32:01 -07:00
Matthew Dawson
a0917e0bc6 datagram: When peeking datagrams with offset < 0 don't skip empty skbs
Due to commit e6afc8ace6 ("udp: remove
headers from UDP packets before queueing"), when udp packets are being
peeked the requested extra offset is always 0 as there is no need to skip
the udp header.  However, when the offset is 0 and the next skb is
of length 0, it is only returned once.  The behaviour can be seen with
the following python script:

from socket import *;
f=socket(AF_INET6, SOCK_DGRAM | SOCK_NONBLOCK, 0);
g=socket(AF_INET6, SOCK_DGRAM | SOCK_NONBLOCK, 0);
f.bind(('::', 0));
addr=('::1', f.getsockname()[1]);
g.sendto(b'', addr)
g.sendto(b'b', addr)
print(f.recvfrom(10, MSG_PEEK));
print(f.recvfrom(10, MSG_PEEK));

Where the expected output should be the empty string twice.

Instead, make sk_peek_offset return negative values, and pass those values
to __skb_try_recv_datagram/__skb_try_recv_from_queue.  If the passed offset
to __skb_try_recv_from_queue is negative, the checked skb is never skipped.
__skb_try_recv_from_queue will then ensure the offset is reset back to 0
if a peek is requested without an offset, unless no packets are found.

Also simplify the if condition in __skb_try_recv_from_queue.  If _off is
greater then 0, and off is greater then or equal to skb->len, then
(_off || skb->len) must always be true assuming skb->len >= 0 is always
true.

Also remove a redundant check around a call to sk_peek_offset in af_unix.c,
as it double checked if MSG_PEEK was set in the flags.

V2:
 - Moved the negative fixup into __skb_try_recv_from_queue, and remove now
redundant checks
 - Fix peeking in udp{,v6}_recvmsg to report the right value when the
offset is 0

V3:
 - Marked new branch in __skb_try_recv_from_queue as unlikely.

Signed-off-by: Matthew Dawson <matthew@mjdsystems.ca>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 15:12:54 -07:00
Linus Torvalds
cc28fcdc01 Merge tag 'xfs-4.13-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
 "A handful more bug fixes for you today.

  Changes since last time:

   - Don't leak resources when mount fails

   - Don't accidentally clobber variables when looking for free inodes"

* tag 'xfs-4.13-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: don't leak quotacheck dquots when cow recovery
  xfs: clear MS_ACTIVE after finishing log recovery
  iomap: fix integer truncation issues in the zeroing and dirtying helpers
  xfs: fix inobt inode allocation search optimization
2017-08-18 14:25:50 -07:00
Linus Torvalds
70bfc741f8 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A small set of fixes that should go into this release. This contains:

   - An NVMe pull request from Christoph, with a few select fixes.

     One of them fix a polling regression in this series, in which it's
     trivial to cause the kernel to disable most of the hardware queue
     interrupts.

   - Fixup for a blk-mq queue usage imbalance on request allocation,
     from Keith.

   - A xen block pull request from Konrad, fixing two issues with
     xen/xen-blkfront"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL
  nvme-pci: set cqe_seen on polled completions
  nvme-fabrics: fix reporting of unrecognized options
  nvmet-fc: eliminate incorrect static markers on local variables
  nvmet-fc: correct use after free on list teardown
  nvmet: don't overwrite identify sn/fr with 0-bytes
  xen-blkfront: use a right index when checking requests
  xen: fix bio vec merging
  blk-mq: Fix queue usage on failed request allocation
2017-08-18 14:12:39 -07:00
Linus Torvalds
edb20a1b4a Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
 "Fourth set of -rc fixes for 4.13 cycle. This is all of the -rc fixes
  that we know of. I suspect this will be the last rc pull request, but
  you never know, I could be wrong.

  Nothing major here. There are the i40iw patches I mentioned in my last
  pull request minus one that I pulled out because it wasn't a fix and
  not appropriate for the rc cycle. Then a few other items trickled in
  and were added to the pull request. It's fairly small aside from those
  five i40iw patches

   - Set of five i40iw fixes (the first of these is rather large by line
     count consideration, but I decided to send it because if fixes a
     legitimate issue and the line count is because it does so by
     creating a new function and using it where needed instead of just
     patching up a few lines...a smaller fix could probably be done, but
     the larger fix is the better code solution)

   - One vmw_pvrdma fix

   - One hns_roce fix (this silences a checker warning, but can't
     actually happen, I expect a patch to remove this from all drivers
     that share this same check in for-next)

   - One iw_cxgb4 fix

   - Two IB core fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/uverbs: Fix NULL pointer dereference during device removal
  IB/core: Protect sysfs entry on ib_unregister_device
  iw_cxgb4: fix misuse of integer variable
  IB/hns: fix memory leak on ah on error return path
  i40iw: Fix potential fcn_id_array out of bounds
  i40iw: Use correct alignment for CQ0 memory
  i40iw: Fix typecast of tcp_seq_num
  i40iw: Correct variable names
  i40iw: Fix parsing of query/commit FPM buffers
  RDMA/vmw_pvrdma: Report CQ missed events
2017-08-18 12:35:22 -07:00
Linus Torvalds
039a8e3847 Merge tag 'powerpc-4.13-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "A bug in the VSX register saving that could cause userspace FP/VMX
  register corruption.

  Never seen to happen (that we know of), was found by code inspection,
  but still tagged for stable given the consequences"

* tag 'powerpc-4.13-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC
2017-08-18 11:11:03 -07:00
Linus Torvalds
4283346802 Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
 "A small number of bugfixes, nothing serious this time. Here is a full
  list.

  4.13 regression fix:

   - imx7d-sdb pinctrl support regressed in 4.13 due to an incomplete
     patch

  DT fixes for recently added devices:

   - badly copied DT entries on imx6qdl-nitrogen6_som broke PCI reset

   - sama5d2 memory controller had the wrong ID and registers

   - imx7 power domains did not work correctly with deferred probing
     (driver added in 4.12)

   - Allwinner H5 pinctrl (added in 4.12) did not work right with GPIO
     interrupts

  Fixes for older bugs that just got noticed:

   - i.MX25 ADC support (added in 4.6) apparently never worked right due
     to a missing 'ranges' property in DT.

   - Renesas Salvador Audio support (added in v4.5) was broken for
     device repeated bind/unbind due to a naming conflict.

   - Various allwinner boards are missing an 'ethernet' alias in DT,
     leading to unstable device naming.

  Preventive bugfix:

   - TI Keystone needs a fix to prevent a NULL pointer dereference with
     an upcoming PM change"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  soc: ti: ti_sci_pm_domains: Populate name for genpd
  ARM: dts: imx6qdl-nitrogen6_som2: fix PCIe reset
  arm64: allwinner: h5: fix pinctrl IRQs
  arm64: allwinner: a64: sopine: add missing ethernet0 alias
  arm64: allwinner: a64: pine64: add missing ethernet0 alias
  arm64: allwinner: a64: bananapi-m64: add missing ethernet0 alias
  arm64: renesas: salvator-common: avoid audio_clkout naming conflict
  ARM: dts: i.MX25: add ranges to tscadc
  soc: imx: gpcv2: fix regulator deferred probe
  ARM: dts: at91: sama5d2: fix EBI/NAND controllers declaration
  ARM: dts: at91: sama5d2: use sama5d2 compatible string for SMC
  ARM: dts: imx7d-sdb: Put pinctrl_spi4 in the correct location
2017-08-18 11:08:48 -07:00
Linus Torvalds
cb247857f3 Merge tag 'sound-4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A collection of small fixes, mostly for regression fixes (sequencer
  kconfig and emu10k1 probe) and device-specific quirks (three for USB
  and one for HD-audio).

  One significant change is a fix for races in ALSA sequencer core,
  which covers over the previous incomplete fix"

* tag 'sound-4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: emu10k1: Fix forgotten user-copy conversion in init code
  ALSA: usb-audio: add DSD support for new Amanero PID
  ALSA: usb-audio: Add mute TLV for playback volumes on C-Media devices
  ALSA: usb-audio: Apply sample rate quirk to Sennheiser headset
  ALSA: seq: 2nd attempt at fixing race creating a queue
  ALSA: hda/realtek - Fix pincfg for Dell XPS 13 9370
  ALSA: seq: Fix CONFIG_SND_SEQ_MIDI dependency
2017-08-18 11:02:49 -07:00
Daniel Borkmann
2110ba5830 bpf, doc: improve sysctl knob description
Current context speaking of tcpdump filters is out of date these
days, so lets improve the sysctl description for the BPF knobs
a bit.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 11:00:41 -07:00
Colin Ian King
a120d9ab65 netxen: fix incorrect loop counter decrement
The loop counter k is currently being decremented from zero which
is incorrect. Fix this by incrementing k instead

Detected by CoverityScan, CID#401847 ("Infinite loop")

Fixes: 83f18a557c ("netxen_nic: fw dump support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 10:58:33 -07:00
Linus Torvalds
4478976a43 Merge tag 'dma-mapping-4.13-3' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
 "Another dma-mapping regression fix"

* tag 'dma-mapping-4.13-3' of git://git.infradead.org/users/hch/dma-mapping:
  of: fix DMA mask generation
2017-08-18 10:51:30 -07:00
Colin Ian King
eac2c68d66 nfp: fix infinite loop on umapping cleanup
The while loop that performs the dma page unmapping never decrements
index counter f and hence loops forever. Fix this with a pre-decrement
on f.

Detected by CoverityScan, CID#1357309 ("Infinite loop")

Fixes: 4c3523623d ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 10:28:06 -07:00
Jiri Pirko
acc8b31665 net: sched: fix p_filter_chain check in tcf_chain_flush
The dereference before check is wrong and leads to an oops when
p_filter_chain is NULL. The check needs to be done on the pointer to
prevent NULL dereference.

Fixes: f93e1cdcf4 ("net/sched: fix filter flushing")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 10:19:11 -07:00
Christoph Hellwig
c005390374 blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL
While pci_irq_get_affinity should never fail for SMP kernel that
implement the affinity mapping, it will always return NULL in the
UP case, so provide a fallback mapping of all queues to CPU 0 in
that case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-18 08:08:14 -06:00
Jens Axboe
6caa0503c4 Merge branch 'nvme-4.13' of git://git.infradead.org/nvme into for-linus
Pull NVMe changes from Christoph:

"The fixes are getting really small now - two for FC, one for PCI, one
 for the fabrics layer and one for the target."
2017-08-18 08:04:15 -06:00
Thomas Gleixner
7edaeb6841 kernel/watchdog: Prevent false positives with turbo modes
The hardlockup detector on x86 uses a performance counter based on unhalted
CPU cycles and a periodic hrtimer. The hrtimer period is about 2/5 of the
performance counter period, so the hrtimer should fire 2-3 times before the
performance counter NMI fires. The NMI code checks whether the hrtimer
fired since the last invocation. If not, it assumess a hard lockup.

The calculation of those periods is based on the nominal CPU
frequency. Turbo modes increase the CPU clock frequency and therefore
shorten the period of the perf/NMI watchdog. With extreme Turbo-modes (3x
nominal frequency) the perf/NMI period is shorter than the hrtimer period
which leads to false positives.

A simple fix would be to shorten the hrtimer period, but that comes with
the side effect of more frequent hrtimer and softlockup thread wakeups,
which is not desired.

Implement a low pass filter, which checks the perf/NMI period against
kernel time. If the perf/NMI fires before 4/5 of the watchdog period has
elapsed then the event is ignored and postponed to the next perf/NMI.

That solves the problem and avoids the overhead of shorter hrtimer periods
and more frequent softlockup thread wakeups.

Fixes: 58687acba5 ("lockup_detector: Combine nmi_watchdog and softlockup detector")
Reported-and-tested-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: dzickus@redhat.com
Cc: prarit@redhat.com
Cc: ak@linux.intel.com
Cc: babu.moger@oracle.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: acme@redhat.com
Cc: stable@vger.kernel.org
Cc: atomlin@redhat.com
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1708150931310.1886@nanos
2017-08-18 12:35:02 +02:00
Marc Zyngier
e8f241893d genirq: Restore trigger settings in irq_modify_status()
irq_modify_status starts by clearing the trigger settings from
irq_data before applying the new settings, but doesn't restore them,
leaving them to IRQ_TYPE_NONE.

That's pretty confusing to the potential request_irq() that could
follow. Instead, snapshot the settings before clearing them, and restore
them if the irq_modify_status() invocation was not changing the trigger.

Fixes: 1e2a7d7849 ("irqdomain: Don't set type when mapping an IRQ")
Reported-and-tested-by: jeffy <jeffy.chen@rock-chips.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jon Hunter <jonathanh@nvidia.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170818095345.12378-1-marc.zyngier@arm.com
2017-08-18 12:04:14 +02:00
Dave Gerlach
4dd6a9973b soc: ti: ti_sci_pm_domains: Populate name for genpd
Commit b6a1d093f9 ("PM / Domains: Extend generic power domain
debugfs") now creates a debugfs directory for each genpd based on the
name of the genpd. Currently no name is given to the genpd created by
ti_sci_pm_domains driver so because of this we see a NULL pointer
dereferences when it is accessed on boot when the debugfs entry creation
is attempted.

Give the genpd a name before registering it to avoid this.

Fixes: 52835d59fc ("soc: ti: Add ti_sci_pm_domains driver")
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2017-08-18 11:59:53 +02:00
Arnd Bergmann
93112486f4 Merge tag 'imx-fixes-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
Pull "i.MX fixes for 4.13, round 3" from Shawn Guo:

 - Fix PCIe reset GPIO of imx6qdl-nitrogen6_som2 board, which was
   a bad copy from nitrogen6_max device tree.

* tag 'imx-fixes-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imx6qdl-nitrogen6_som2: fix PCIe reset
2017-08-18 11:58:38 +02:00
Arnd Bergmann
552c497c40 Merge tag 'sunxi-fixes-for-4.13-2' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
Pull "Allwinner fixes for 4.13, round 2" from Chen-Yu Tsai:

Three fixes adding a missing alias for the Ethernet controller on A64
boards. One adding a missing interrupt for the pin controller.

* tag 'sunxi-fixes-for-4.13-2' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  arm64: allwinner: h5: fix pinctrl IRQs
  arm64: allwinner: a64: sopine: add missing ethernet0 alias
  arm64: allwinner: a64: pine64: add missing ethernet0 alias
  arm64: allwinner: a64: bananapi-m64: add missing ethernet0 alias
2017-08-18 11:55:44 +02:00
Arvind Yadav
45bd07ad82 x86: Constify attribute_group structures
attribute_groups are not supposed to change at runtime and none of the
groups is modified.

Mark the non-const structs as const.

[ tglx: Folded into one big patch ]

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: tony.luck@intel.com
Cc: bp@alien8.de
Link: http://lkml.kernel.org/r/1500550238-15655-2-git-send-email-arvind.yadav.cs@gmail.com
2017-08-18 11:30:35 +02:00
Florian Fainelli
7374bfb82e MAINTAINERS: Remove Jason Cooper's irqchip git tree
Jason's irqchip tree does not seem to have been updated for many months
now, remove it from the list of trees to avoid any possible confusion.

Jason says:

  "Unfortunately, when I have time for irqchip, I don't always have the
   time to properly follow up with pull-requests. So, for the time being,
   I'll stick to reviewing as I can."

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Cc: marc.zyngier@arm.com
Link: http://lkml.kernel.org/r/20170727224733.8288-1-f.fainelli@gmail.com
2017-08-18 11:06:35 +02:00
Takashi Iwai
0b36f2bd28 ALSA: emu10k1: Fix forgotten user-copy conversion in init code
The commit d42fe63d58 ("ALSA: emu10k1: Get rid of set_fs() usage")
converted the user-space copy hack with set_fs() to the direct
memcpy(), but one place was forgotten.  This resulted in the error
from snd_emu10k1_init_efx(), eventually failed to load the driver.
Fix the missing piece.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196687
Fixes: d42fe63d58 ("ALSA: emu10k1: Get rid of set_fs() usage")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-18 10:59:02 +02:00
Jussi Laako
ed993c6fdf ALSA: usb-audio: add DSD support for new Amanero PID
Add DSD support for new Amanero Combo384 firmware version with a new
PID. This firmware uses DSD_U32_BE.

Fixes: 3eff682d76 ("ALSA: usb-audio: Support both DSD LE/BE Amanero firmware versions")
Signed-off-by: Jussi Laako <jussi@sonarnerd.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-18 10:13:21 +02:00
Keith Busch
e9d8a0fdea nvme-pci: set cqe_seen on polled completions
Fixes: 920d13a884 ("nvme-pci: factor out the cqe reading mechanics from __nvme_process_cq")
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-18 09:19:39 +02:00
Benjamin Herrenschmidt
1a92a80ad3 powerpc/mm: Ensure cpumask update is ordered
There is no guarantee that the various isync's involved with
the context switch will order the update of the CPU mask with
the first TLB entry for the new context being loaded by the HW.

Be safe here and add a memory barrier to order any subsequent
load/store which may bring entries into the TLB.

The corresponding barrier on the other side already exists as
pte updates use pte_xchg() which uses __cmpxchg_u64 which has
a sync after the atomic operation.

Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add comments in the code]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-18 13:07:16 +10:00
Gary Bisson
c40bc54fdf ARM: dts: imx6qdl-nitrogen6_som2: fix PCIe reset
Previous value was a bad copy of nitrogen6_max device tree.

Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com>
Fixes: 3faa1bb2e8 ("ARM: dts: imx: add Boundary Devices Nitrogen6_SOM2 support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2017-08-18 09:40:40 +08:00
Linus Torvalds
04d49f3638 Merge tag 'drm-fixes-for-v4.13-rc6' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Seems to be slowing down nicely, just one amdgpu fix, and a bunch of
  i915 fixes"

* tag 'drm-fixes-for-v4.13-rc6' of git://people.freedesktop.org/~airlied/linux:
  drm/amdgpu: save list length when fence is signaled
  drm/i915: Avoid the gpu reset vs. modeset deadlock
  drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
  drm/i915: Return correct EDP voltage swing table for 0.85V
  drm/i915/cnl: Add slice and subslice information to debugfs.
  drm/i915: Perform an invalidate prior to executing golden renderstate
  drm/i915: remove unused function declaration
2017-08-17 16:48:29 -07:00
Linus Torvalds
d33a2a9143 Merge tag 'pm-4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix two issues related to exposing the current CPU frequency to
  user space on x86.

  Specifics:

   - Disable interrupts around reading IA32_APERF and IA32_MPERF in
     aperfmperf_snapshot_khz() (introduced recently) to avoid excessive
     delays between the reads that may result from interrupt handling
     (Doug Smythies).

   - Fix the computation of the CPU frequency to be reported through the
     pstate_sample tracepoint in intel_pstate (Doug Smythies)"

* tag 'pm-4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: x86: Disable interrupts during MSRs reading
  cpufreq: intel_pstate: report correct CPU frequencies during trace
2017-08-17 14:21:18 -07:00
Linus Torvalds
440105d3c9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: elan_i2c - Add antoher Lenovo ACPI ID for upcoming Lenovo NB
  Input: elan_i2c - add ELAN0608 to the ACPI table
  Input: trackpoint - assume 3 buttons when buttons detection fails
2017-08-17 13:45:44 -07:00
Dave Airlie
28eb462879 Merge branch 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
single amdgpu fix.

* 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: save list length when fence is signaled
2017-08-18 05:45:03 +10:00
Dave Airlie
41d31b5fd2 Merge tag 'drm-intel-fixes-2017-08-16' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
drm/i915 fixes for v4.13-rc6

"Chris' "drm/i915: Perform an invalidate prior to executing golden renderstate" and Daniel's
"drm/i915: Avoid the gpu reset vs. modeset deadlock" seem like the most important ones.

* tag 'drm-intel-fixes-2017-08-16' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915: Avoid the gpu reset vs. modeset deadlock
  drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
  drm/i915: Return correct EDP voltage swing table for 0.85V
  drm/i915/cnl: Add slice and subslice information to debugfs.
  drm/i915: Perform an invalidate prior to executing golden renderstate
  drm/i915: remove unused function declaration
2017-08-18 05:43:10 +10:00
Darrick J. Wong
77aff8c764 xfs: don't leak quotacheck dquots when cow recovery
If we fail a mount on account of cow recovery errors, it's possible that
a previous quotacheck left some dquots in memory.  The bailout clause of
xfs_mountfs forgets to purge these, and so we leak them.  Fix that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2017-08-17 12:40:33 -07:00
Darrick J. Wong
8204f8ddaa xfs: clear MS_ACTIVE after finishing log recovery
Way back when we established inode block-map redo log items, it was
discovered that we needed to prevent the VFS from evicting inodes during
log recovery because any given inode might be have bmap redo items to
replay even if the inode has no link count and is ultimately deleted,
and any eviction of an unlinked inode causes the inode to be truncated
and freed too early.

To make this possible, we set MS_ACTIVE so that inodes would not be torn
down immediately upon release.  Unfortunately, this also results in the
quota inodes not being released at all if a later part of the mount
process should fail, because we never reclaim the inodes.  So, set
MS_ACTIVE right before we do the last part of log recovery and clear it
immediately after we finish the log recovery so that everything
will be torn down properly if we abort the mount.

Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2017-08-17 12:40:33 -07:00
Rafael J. Wysocki
8179962b84 Merge branches 'intel_pstate-fix' and 'cpufreq-x86-fix'
* intel_pstate-fix:
  cpufreq: intel_pstate: report correct CPU frequencies during trace

* cpufreq-x86-fix:
  cpufreq: x86: Disable interrupts during MSRs reading
2017-08-17 21:00:30 +02:00
Lv Zheng
98529b9272 ACPI: EC: Fix regression related to wrong ECDT initialization order
Commit 2a5708409e (ACPI / EC: Fix a gap that ECDT EC cannot handle
EC events) introduced acpi_ec_ecdt_start(), but that function is
invoked before acpi_ec_query_init(), which is too early.  This causes
the kernel to crash if an EC event occurs after boot, when ec_query_wq
is not valid:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000102
 ...
 Workqueue: events acpi_ec_event_handler
 task: ffff9f539790dac0 task.stack: ffffb437c0e10000
 RIP: 0010:__queue_work+0x32/0x430

Normally, the DSDT EC should always be valid, so acpi_ec_ecdt_start()
is actually a no-op in the majority of cases.  However, commit
c712bb58d8 (ACPI / EC: Add support to skip boot stage DSDT probe)
caused the probing of the DSDT EC as the "boot EC" to be skipped when
the ECDT EC is valid and uncovered the bug.

Fix this issue by invoking acpi_ec_ecdt_start() after acpi_ec_query_init()
in acpi_ec_init().

Link: https://jira01.devtools.intel.com/browse/LCK-4348
Fixes: 2a5708409e (ACPI / EC: Fix a gap that ECDT EC cannot handle EC events)
Fixes: c712bb58d8 (ACPI / EC: Add support to skip boot stage DSDT probe)
Reported-by: Wang Wendy <wendy.wang@intel.com>
Tested-by: Feng Chenzhou <chenzhoux.feng@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-17 20:52:19 +02:00
Linus Torvalds
3bc6c906ea Merge branch 'parisc-4.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:

 - Fix PCI memory bar assignments with 64-bit kernels on machines with
   Dino/Cujo PCI chipsets. This makes PCI graphic cards work on such
   machines (from Thomas Bogendoerfer).

 - Fix documentation to be more clear about the difference between %pF
   and %pS printk format usage. There are still many places in the
   kernel which have it wrong (from Petr Mladek, Sergey Senozhatsky &
   me).

* 'parisc-4.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  printk-formats.txt: Better describe the difference between %pS and %pF
  parisc: pci memory bar assignment fails with 64bit kernels on dino/cujo
2017-08-17 11:39:54 -07:00
Michael Ellerman
014cd0a368 bpf: Update sysctl documentation to list all supported architectures
The sysctl documentation states that the JIT is only available on
x86_64, which is no longer correct.

Update the list, and break it out to indicate which architectures
support the cBPF JIT (via HAVE_CBPF_JIT) or the eBPF JIT
(HAVE_EBPF_JIT).

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-17 10:09:28 -07:00
Christoph Hellwig
81a0b8d74e nvme-fabrics: fix reporting of unrecognized options
Only print the specified options that are not recognized, instead
of the whole list of options.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
2017-08-17 18:48:54 +02:00
Linus Torvalds
99f781b1bf Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota fix from Jan Kara:
 "A fix of a check for quota limit"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: correct space limit check
2017-08-17 09:26:10 -07:00
Linus Torvalds
c8c03f1858 pty: fix the cached path of the pty slave file descriptor in the master
Christian Brauner reported that if you use the TIOCGPTPEER ioctl() to
get a slave pty file descriptor, the resulting file descriptor doesn't
look right in /proc/<pid>/fd/<fd>.  In particular, he wanted to use
readlink() on /proc/self/fd/<fd> to get the pathname of the slave pty
(basically implementing "ptsname{_r}()").

The reason for that was that we had generated the wrong 'struct path'
when we create the pty in ptmx_open().

In particular, the dentry was correct, but the vfsmount pointed to the
mount of the ptmx node. That _can_ be correct - in case you use
"/dev/pts/ptmx" to open the master - but usually is not.  The normal
case is to use /dev/ptmx, which then looks up the pts/ directory, and
then the vfsmount of the ptmx node is obviously the /dev directory, not
the /dev/pts/ directory.

We actually did have the right vfsmount available, but in the wrong
place (it gets looked up in 'devpts_acquire()' when we get a reference
to the pts filesystem), and so ptmx_open() used the wrong mnt pointer.

The end result of this confusion was that the pty worked fine, but when
if you did TIOCGPTPEER to get the slave side of the pty, end end result
would also work, but have that dodgy 'struct path'.

And then when doing "d_path()" on to get the pathname, the vfsmount
would not match the root of the pts directory, and d_path() would return
an empty pathname thinking that the entry had escaped a bind mount into
another mount.

This fixes the problem by making devpts_acquire() return the vfsmount
for the pts filesystem, allowing ptmx_open() to trivially just use the
right mount for the pts dentry, and create the proper 'struct path'.

Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-17 09:10:48 -07:00
Jarkko Nikula
2a86cdd2e7 i2c: designware: Fix runtime PM for I2C slave mode
I2C slave controller must be powered and active all the time when I2C
slave backend is registered in order to let master address and
communicate with us.

Now if the controller is runtime PM capable it will be suspended after
probe and cannot ever respond to the master or generate interrupts.

Fix this by resuming the controller when I2C slave backend is registered
and let it suspend after unregistering.

Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-08-17 17:55:29 +02:00
Jarkko Nikula
733f656397 i2c: designware: Remove needless pm_runtime_put_noidle() call
I guess pm_runtime_put_noidle() call in i2c_dw_probe_slave() was copied
by accident from similar master mode adapter registration code. It is
unbalanced due missing pm_runtime_get_noresume() but harmless since it
doesn't decrease dev->power.usage_count below zero.

In theory we can hit similar needless runtime suspend/resume cycle
during slave mode adapter registration that was happening when
registering the master mode adapter. See commit cd998ded5c ("i2c:
designware: Prevent runtime suspend during adapter registration").

However, since we are slave, we can consider it as a wrong configuration
if we have other slaves attached under this adapter and can omit the
pm_runtime_get_noresume()/pm_runtime_put_noidle() calls for simplicity.

Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-08-17 17:55:20 +02:00
Takashi Iwai
0f174b3525 ALSA: usb-audio: Add mute TLV for playback volumes on C-Media devices
C-Media devices (at least some models) mute the playback stream when
volumes are set to the minimum value.  But this isn't informed via TLV
and the user-space, typically PulseAudio, gets confused as if it's
still played in a low volume.

This patch adds the new flag, min_mute, to struct usb_mixer_elem_info
for indicating that the mixer element is with the minimum-mute volume.
This flag is set for known C-Media devices in
snd_usb_mixer_fu_apply_quirk() in turn.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196669
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-17 17:52:16 +02:00
Arnd Bergmann
872784bffb Merge tag 'renesas-fixes4-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Pull "Fourth Round of Renesas ARM Based SoC Fixes for v4.13" from Simon Horman:

* Avoid audio_clkout naming conflict for salvator boards using
  Renesas R-Car Gen 3 SoCs

  Morimoto-san says "The clock name of "audio_clkout" is used by the
  Renesas sound driver.  This duplicated naming breaks its clock
  registering/unregistering.  Especially when unbind/bind it can't handle
  clkout correctly.  This patch renames "audio_clkout" to "audio-clkout" to
  avoid the naming conflict."

* tag 'renesas-fixes4-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  arm64: renesas: salvator-common: avoid audio_clkout naming conflict
2017-08-17 11:00:26 +02:00
Robin Murphy
ee7b1f3120 of: fix DMA mask generation
Historically, DMA masks have suffered some ambiguity between whether
they represent the range of physical memory a device can access, or the
address bits a device is capable of driving, particularly since on many
platforms the two are equivalent. Whilst there are some stragglers left
(dma_max_pfn(), I'm looking at you...), the majority of DMA code has
been cleaned up to follow the latter definition, not least since it is
the only one which makes sense once IOMMUs are involved.

In this respect, of_dma_configure() has always done the wrong thing in
how it generates initial masks based on "dma-ranges". Although rounding
down did not affect the TI Keystone platform where dma_addr + size is
already a power of two, in any other case it results in a mask which is
at best unnecessarily constrained and at worst unusable.

BCM2837 illustrates the problem nicely, where we have a DMA base of 3GB
and a size of 1GB - 16MB, giving dma_addr + size = 0xff000000 and a
resultant mask of 0x7fffffff, which is then insufficient to even cover
the necessary offset, effectively making all DMA addresses out-of-range.
This has been hidden until now (mostly because we don't yet prevent
drivers from simply overwriting this initial mask later upon probe), but
due to recent changes elsewhere now shows up as USB being broken on
Raspberry Pi 3.

Make it right by rounding up instead of down, such that the mask
correctly correctly describes all possisble bits the device needs to
emit.

Fixes: 9a6d7298b0 ("of: Calculate device DMA masks based on DT dma-range size")
Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
Reported-by: Andreas Färber <afaerber@suse.de>
Reported-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-17 10:23:45 +02:00
Alexander Potapenko
187e91fe5e x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt'
__startup_64() is normally using fixup_pointer() to access globals in a
position-independent fashion. However 'next_early_pgt' was accessed
directly, which wasn't guaranteed to work.

Luckily GCC was generating a R_X86_64_PC32 PC-relative relocation for
'next_early_pgt', but Clang emitted a R_X86_64_32S, which led to
accessing invalid memory and rebooting the kernel.

Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Davidson <md@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: c88d71508e ("x86/boot/64: Rewrite startup_64() in C")
Link: http://lkml.kernel.org/r/20170816190808.131748-1-glider@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-17 09:53:00 +02:00
James Smart
369157b41c nvmet-fc: eliminate incorrect static markers on local variables
There were 2 statics introduced that were bogus. Removed the static
designations.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-17 09:35:08 +02:00
Linus Torvalds
ac9a40905a Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "A couple of minor fixes (st, ses) and some bigger driver fixes for
  qla2xxx (crash triggered by fw dump) and ipr (lockdep problems with
  mq)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ses: Fix wrong page error
  scsi: ipr: Fix scsi-mq lockdep issue
  scsi: st: fix blk_get_queue usage
  scsi: qla2xxx: Fix system crash while triggering FW dump
2017-08-16 17:21:20 -07:00
Varun Prakash
71eb2ac58a scsi: cxgb4i: call neigh_event_send() to update MAC address
If nud_state is not valid then call neigh_event_send() to update MAC
address.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-16 20:12:31 -04:00
Christoph Hellwig
cbe7dfa26e Revert "scsi: default to scsi-mq"
Defaulting to scsi-mq in 4.13-rc has shown various regressions
on setups that we didn't previously consider.  Fixes for them are
in progress, but too invasive to make it in this cycle.  So for
now revert the commit that defaults to blk-mq for SCSI.  For 4.14
we'll plan to try again with these fixes.

This reverts commit 5c279bd9e4.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-16 20:01:32 -04:00
Damien Le Moal
70e42fd02c scsi: sd_zbc: Write unlock zone from sd_uninit_cmnd()
Releasing a zone write lock only when the write commnand that acquired
the lock completes can cause deadlocks due to potential command
reordering if the lock owning request is requeued and not executed. This
problem exists only with the scsi-mq path as, unlike the legacy path,
requests are moved out of the dispatch queue before being prepared and
so before locking a zone for a write command.

Since sd_uninit_cmnd() is now always called when a request is requeued,
call sd_zbc_write_unlock_zone() from that function for write requests
that acquired a zone lock instead of from sd_done(). Acquisition of a zone
lock by a write command is indicated using the new command
flag SCMD_ZONE_WRITE_LOCK.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-16 20:01:31 -04:00
Raghava Aditya Renukunta
c802673249 scsi: aacraid: Fix out of bounds in aac_get_name_resp
We terminate the aac_get_name_resp on a byte that is outside the bounds
of the structure. Extend the return response by one byte to remove the
out of bounds reference.

Fixes: b836439faf ("aacraid: 4KB sector support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Carroll <david.carroll@microsemi.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-16 20:01:31 -04:00
Varun Prakash
82f0fd06d4 scsi: csiostor: fail probe if fw does not support FCoE
Fail probe if FCoE capability is not enabled in the firmware.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-16 20:01:30 -04:00
weiping zhang
61f0c3c7a0 scsi: megaraid_sas: fix error handle in megasas_probe_one
megasas_mgmt_info.max_index has increased by 1 before megasas_io_attach,
if megasas_io_attach return error, then goto fail_io_attach,
megasas_mgmt_info.instance has a wrong index here. So first reduce
max_index and then set that instance to NULL.

Signed-off-by: weiping zhang <zhangweiping@didichuxing.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-16 20:01:30 -04:00
Linus Torvalds
422ce075f9 Merge tag 'audit-pr-20170816' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit fixes from Paul Moore:
 "Two small fixes to the audit code, both explained well in the
  respective patch descriptions, but the quick summary is one
  use-after-free fix, and one silly fanotify notification flag fix"

* tag 'audit-pr-20170816' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: Receive unmount event
  audit: Fix use after free in audit_remove_watch_rule()
2017-08-16 16:48:34 -07:00
Eric Dumazet
c780a049f9 ipv4: better IP_MAX_MTU enforcement
While working on yet another syzkaller report, I found
that our IP_MAX_MTU enforcements were not properly done.

gcc seems to reload dev->mtu for min(dev->mtu, IP_MAX_MTU), and
final result can be bigger than IP_MAX_MTU :/

This is a problem because device mtu can be changed on other cpus or
threads.

While this patch does not fix the issue I am working on, it is
probably worth addressing it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16 16:28:47 -07:00
Eric Dumazet
81fbfe8ada ptr_ring: use kmalloc_array()
As found by syzkaller, malicious users can set whatever tx_queue_len
on a tun device and eventually crash the kernel.

Lets remove the ALIGN(XXX, SMP_CACHE_BYTES) thing since a small
ring buffer is not fast anyway.

Fixes: 2e0ab8ca83 ("ptr_ring: array based FIFO for pointers")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16 16:28:47 -07:00
Eric Dumazet
120e9dabaf dccp: defer ccid_hc_tx_delete() at dismantle time
syszkaller team reported another problem in DCCP [1]

Problem here is that the structure holding RTO timer
(ccid2_hc_tx_rto_expire() handler) is freed too soon.

We can not use del_timer_sync() to cancel the timer
since this timer wants to grab socket lock (that would risk a dead lock)

Solution is to defer the freeing of memory when all references to
the socket were released. Socket timers do own a reference, so this
should fix the issue.

[1]

==================================================================
BUG: KASAN: use-after-free in ccid2_hc_tx_rto_expire+0x51c/0x5c0 net/dccp/ccids/ccid2.c:144
Read of size 4 at addr ffff8801d2660540 by task kworker/u4:7/3365

CPU: 1 PID: 3365 Comm: kworker/u4:7 Not tainted 4.13.0-rc4+ #3
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_unbound call_usermodehelper_exec_work
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 print_address_description+0x73/0x250 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x24e/0x340 mm/kasan/report.c:409
 __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429
 ccid2_hc_tx_rto_expire+0x51c/0x5c0 net/dccp/ccids/ccid2.c:144
 call_timer_fn+0x233/0x830 kernel/time/timer.c:1268
 expire_timers kernel/time/timer.c:1307 [inline]
 __run_timers+0x7fd/0xb90 kernel/time/timer.c:1601
 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
 __do_softirq+0x2f5/0xba3 kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364 [inline]
 irq_exit+0x1cc/0x200 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:638 [inline]
 smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:1044
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:702
RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:824 [inline]
RIP: 0010:__raw_write_unlock_irq include/linux/rwlock_api_smp.h:267 [inline]
RIP: 0010:_raw_write_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:343
RSP: 0018:ffff8801cd50eaa8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
RAX: dffffc0000000000 RBX: ffffffff85a090c0 RCX: 0000000000000006
RDX: 1ffffffff0b595f3 RSI: 1ffff1003962f989 RDI: ffffffff85acaf98
RBP: ffff8801cd50eab0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801cc96ea60
R13: dffffc0000000000 R14: ffff8801cc96e4c0 R15: ffff8801cc96e4c0
 </IRQ>
 release_task+0xe9e/0x1a40 kernel/exit.c:220
 wait_task_zombie kernel/exit.c:1162 [inline]
 wait_consider_task+0x29b8/0x33c0 kernel/exit.c:1389
 do_wait_thread kernel/exit.c:1452 [inline]
 do_wait+0x441/0xa90 kernel/exit.c:1523
 kernel_wait4+0x1f5/0x370 kernel/exit.c:1665
 SYSC_wait4+0x134/0x140 kernel/exit.c:1677
 SyS_wait4+0x2c/0x40 kernel/exit.c:1673
 call_usermodehelper_exec_sync kernel/kmod.c:286 [inline]
 call_usermodehelper_exec_work+0x1a0/0x2c0 kernel/kmod.c:323
 process_one_work+0xbf3/0x1bc0 kernel/workqueue.c:2097
 worker_thread+0x223/0x1860 kernel/workqueue.c:2231
 kthread+0x35e/0x430 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:425

Allocated by task 21267:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
 kmem_cache_alloc+0x127/0x750 mm/slab.c:3561
 ccid_new+0x20e/0x390 net/dccp/ccid.c:151
 dccp_hdlr_ccid+0x27/0x140 net/dccp/feat.c:44
 __dccp_feat_activate+0x142/0x2a0 net/dccp/feat.c:344
 dccp_feat_activate_values+0x34e/0xa90 net/dccp/feat.c:1538
 dccp_rcv_request_sent_state_process net/dccp/input.c:472 [inline]
 dccp_rcv_state_process+0xed1/0x1620 net/dccp/input.c:677
 dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:679
 sk_backlog_rcv include/net/sock.h:911 [inline]
 __release_sock+0x124/0x360 net/core/sock.c:2269
 release_sock+0xa4/0x2a0 net/core/sock.c:2784
 inet_wait_for_connect net/ipv4/af_inet.c:557 [inline]
 __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643
 inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682
 SYSC_connect+0x204/0x470 net/socket.c:1642
 SyS_connect+0x24/0x30 net/socket.c:1623
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 3049:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:447
 set_track mm/kasan/kasan.c:459 [inline]
 kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
 __cache_free mm/slab.c:3503 [inline]
 kmem_cache_free+0x77/0x280 mm/slab.c:3763
 ccid_hc_tx_delete+0xc5/0x100 net/dccp/ccid.c:190
 dccp_destroy_sock+0x1d1/0x2b0 net/dccp/proto.c:225
 inet_csk_destroy_sock+0x166/0x3f0 net/ipv4/inet_connection_sock.c:833
 dccp_done+0xb7/0xd0 net/dccp/proto.c:145
 dccp_time_wait+0x13d/0x300 net/dccp/minisocks.c:72
 dccp_rcv_reset+0x1d1/0x5b0 net/dccp/input.c:160
 dccp_rcv_state_process+0x8fc/0x1620 net/dccp/input.c:663
 dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:679
 sk_backlog_rcv include/net/sock.h:911 [inline]
 __sk_receive_skb+0x33e/0xc00 net/core/sock.c:521
 dccp_v4_rcv+0xef1/0x1c00 net/dccp/ipv4.c:871
 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
 NF_HOOK include/linux/netfilter.h:248 [inline]
 ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257
 dst_input include/net/dst.h:477 [inline]
 ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397
 NF_HOOK include/linux/netfilter.h:248 [inline]
 ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488
 __netif_receive_skb_core+0x19af/0x33d0 net/core/dev.c:4417
 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4455
 process_backlog+0x203/0x740 net/core/dev.c:5130
 napi_poll net/core/dev.c:5527 [inline]
 net_rx_action+0x792/0x1910 net/core/dev.c:5593
 __do_softirq+0x2f5/0xba3 kernel/softirq.c:284

The buggy address belongs to the object at ffff8801d2660100
 which belongs to the cache ccid2_hc_tx_sock of size 1240
The buggy address is located 1088 bytes inside of
 1240-byte region [ffff8801d2660100, ffff8801d26605d8)
The buggy address belongs to the page:
page:ffffea0007499800 count:1 mapcount:0 mapping:ffff8801d2660100 index:0x0 compound_mapcount: 0
flags: 0x200000000008100(slab|head)
raw: 0200000000008100 ffff8801d2660100 0000000000000000 0000000100000005
raw: ffffea00075271a0 ffffea0007538820 ffff8801d3aef9c0 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8801d2660400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8801d2660480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8801d2660500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                           ^
 ffff8801d2660580: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
 ffff8801d2660600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16 14:26:26 -07:00
Liping Zhang
494bea39f3 openvswitch: fix skb_panic due to the incorrect actions attrlen
For sw_flow_actions, the actions_len only represents the kernel part's
size, and when we dump the actions to the userspace, we will do the
convertions, so it's true size may become bigger than the actions_len.

But unfortunately, for OVS_PACKET_ATTR_ACTIONS, we use the actions_len
to alloc the skbuff, so the user_skb's size may become insufficient and
oops will happen like this:
  skbuff: skb_over_panic: text:ffffffff8148fabf len:1749 put:157 head:
  ffff881300f39000 data:ffff881300f39000 tail:0x6d5 end:0x6c0 dev:<NULL>
  ------------[ cut here ]------------
  kernel BUG at net/core/skbuff.c:129!
  [...]
  Call Trace:
   <IRQ>
   [<ffffffff8148be82>] skb_put+0x43/0x44
   [<ffffffff8148fabf>] skb_zerocopy+0x6c/0x1f4
   [<ffffffffa0290d36>] queue_userspace_packet+0x3a3/0x448 [openvswitch]
   [<ffffffffa0292023>] ovs_dp_upcall+0x30/0x5c [openvswitch]
   [<ffffffffa028d435>] output_userspace+0x132/0x158 [openvswitch]
   [<ffffffffa01e6890>] ? ip6_rcv_finish+0x74/0x77 [ipv6]
   [<ffffffffa028e277>] do_execute_actions+0xcc1/0xdc8 [openvswitch]
   [<ffffffffa028e3f2>] ovs_execute_actions+0x74/0x106 [openvswitch]
   [<ffffffffa0292130>] ovs_dp_process_packet+0xe1/0xfd [openvswitch]
   [<ffffffffa0292b77>] ? key_extract+0x63c/0x8d5 [openvswitch]
   [<ffffffffa029848b>] ovs_vport_receive+0xa1/0xc3 [openvswitch]
  [...]

Also we can find that the actions_len is much little than the orig_len:
  crash> struct sw_flow_actions 0xffff8812f539d000
  struct sw_flow_actions {
    rcu = {
      next = 0xffff8812f5398800,
      func = 0xffffe3b00035db32
    },
    orig_len = 1384,
    actions_len = 592,
    actions = 0xffff8812f539d01c
  }

So as a quick fix, use the orig_len instead of the actions_len to alloc
the user_skb.

Last, this oops happened on our system running a relative old kernel, but
the same risk still exists on the mainline, since we use the wrong
actions_len from the beginning.

Fixes: ccea74457b ("openvswitch: include datapath actions with sampled-packet upcall to userspace")
Cc: Neil McKee <neil.mckee@inmon.com>
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16 14:12:37 -07:00
Helge Deller
d6957f3396 printk-formats.txt: Better describe the difference between %pS and %pF
Sometimes people seems unclear when to use the %pS or %pF printk format.
For example, see commit 51d96dc2e2 ("random: fix warning message on ia64
and parisc") which fixed such a wrong format string.

The documentation should be more clear about the difference.

Signed-off-by: Helge Deller <deller@gmx.de>
[pmladek@suse.com: Restructure the entire section]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2017-08-16 21:09:45 +02:00
Oleg Nesterov
01578e3616 x86/elf: Remove the unnecessary ADDR_NO_RANDOMIZE checks
The ADDR_NO_RANDOMIZE checks in stack_maxrandom_size() and
randomize_stack_top() are not required.

PF_RANDOMIZE is set by load_elf_binary() only if ADDR_NO_RANDOMIZE is not
set, no need to re-check after that.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: http://lkml.kernel.org/r/20170815154011.GB1076@redhat.com
2017-08-16 20:32:02 +02:00
Oleg Nesterov
47ac5484fd x86: Fix norandmaps/ADDR_NO_RANDOMIZE
Documentation/admin-guide/kernel-parameters.txt says:

    norandmaps  Don't use address space randomization. Equivalent
                to echo 0 > /proc/sys/kernel/randomize_va_space

but it doesn't work because arch_rnd() which is used to randomize
mm->mmap_base returns a random value unconditionally. And as Kirill
pointed out, ADDR_NO_RANDOMIZE is broken by the same reason.

Just shift the PF_RANDOMIZE check from arch_mmap_rnd() to arch_rnd().

Fixes: 1b028f784e ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20170815153952.GA1076@redhat.com
2017-08-16 20:32:01 +02:00
Tushar Dave
6170a50689 sparc64: remove unnecessary log message
There is no need to log message if ATU hvapi couldn't get register.
Unlike PCI hvapi, ATU hvapi registration failure is not hard error.
Even if ATU hvapi registration fails (on system with ATU or without
ATU) system continues with legacy IOMMU. So only log message when
ATU hvapi successfully get registered.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16 11:29:56 -07:00
Colin Ian King
47f078339b Revert "staging: fsl-mc: be consistent when checking strcmp() return"
The previous fix removed the equal to zero comparisons by the strcmps and
now the function always returns true. Revert this change to restore the
original correctly functioning code.

Detected by CoverityScan, CID#1452267 ("Constant expression result")

This reverts commit b93ad9a067.

Fixes: b93ad9a067 ("staging: fsl-mc: be consistent when checking strcmp() return")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-16 11:12:06 -07:00
David Ahern
c7b725be84 net: igmp: Use ingress interface rather than vrf device
Anuradha reported that statically added groups for interfaces enslaved
to a VRF device were not persisting. The problem is that igmp queries
and reports need to use the data in the in_dev for the real ingress
device rather than the VRF device. Update igmp_rcv accordingly.

Fixes: e58e415968 ("net: Enable support for VRF with ipv4 multicast")
Reported-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16 11:08:55 -07:00
David S. Miller
79db795833 sparc64: Don't clibber fixed registers in __multi4.
%g4 and %g5 are fixed registers used by the kernel for the thread
pointer and the per-cpu offset.  Use %o4 and %g7 instead.

Diagnosis by Anthony Yznaga.

Fixes: 1b4af13ff2 ("sparc64: Add __multi3 for gcc 7.x and later.")
Reported-by: Anatoly Pugachev <matorola@gmail.com>
Tested-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-16 10:59:54 -07:00
Maor Gottlieb
870201f95f IB/uverbs: Fix NULL pointer dereference during device removal
As part of ib_uverbs_remove_one which might be triggered upon
reset flow, we trigger IB_EVENT_DEVICE_FATAL event to userspace
application.
If device was removed after uverbs fd was opened but before
ib_uverbs_get_context was called, the event file will be accessed
before it was allocated, result in NULL pointer dereference:

[ 72.325873] BUG: unable to handle kernel NULL pointer dereference at (null)
...
[ 72.325984] IP: _raw_spin_lock_irqsave+0x22/0x40
[ 72.327123] Call Trace:
[ 72.327168] ib_uverbs_async_handler.isra.8+0x2e/0x160 [ib_uverbs]
[ 72.327216] ? synchronize_srcu_expedited+0x27/0x30
[ 72.327269] ib_uverbs_remove_one+0x120/0x2c0 [ib_uverbs]
[ 72.327330] ib_unregister_device+0xd0/0x180 [ib_core]
[ 72.327373] mlx5_ib_remove+0x74/0x140 [mlx5_ib]
[ 72.327422] mlx5_remove_device+0xfb/0x110 [mlx5_core]
[ 72.327466] mlx5_unregister_interface+0x3c/0xa0 [mlx5_core]
[ 72.327509] mlx5_ib_cleanup+0x10/0x962 [mlx5_ib]
[ 72.327546] SyS_delete_module+0x155/0x230
[ 72.328472] ? exit_to_usermode_loop+0x70/0xa6
[ 72.329370] do_syscall_64+0x54/0xc0
[ 72.330262] entry_SYSCALL64_slow_path+0x25/0x25

Fix it by checking that user context was allocated before
trigger the event.

Fixes: 036b106357 ('IB/uverbs: Enable device removal when there are active user space applications')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-16 12:53:15 -04:00
Jens Axboe
3e09fc802d Merge branch 'stable/for-jens-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus
Pull xen block changes from Konrad:

Two fixes, both of them spotted by Amazon:

 1) Fix in Xen-blkfront caused by the re-write in 4.8 time-frame.
 2) Fix in the xen_biovec_phys_mergeable which allowed guest
    requests when using NVMe - to slurp up more data than allowed
    leading to an XSA (which has been made public today).
2017-08-16 09:56:34 -06:00
Shiraz Saleem
06f8174a97 IB/core: Protect sysfs entry on ib_unregister_device
ib_unregister_device is not protecting removal of sysfs entries.
A call to ib_register_device in that window can result in
duplicate sysfs entry warning. Move mutex_unlock to after
ib_device_unregister_sysfs to protect against sysfs entry creation.

This issue is exposed during driver load/unload stress test.

WARNING: CPU: 5 PID: 4445 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x5f/0x70
sysfs: cannot create duplicate filename '/class/infiniband/i40iw0'
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H
BIOS F7 01/17/2014
Workqueue: i40e i40e_service_task [i40e]
Call Trace:
dump_stack+0x67/0x98
__warn+0xcc/0xf0
warn_slowpath_fmt+0x4a/0x50
? kernfs_path_from_node+0x4b/0x60
sysfs_warn_dup+0x5f/0x70
sysfs_do_create_link_sd.isra.2+0xb7/0xc0
sysfs_create_link+0x20/0x40
device_add+0x28c/0x600
ib_device_register_sysfs+0x58/0x170 [ib_core]
ib_register_device+0x325/0x570 [ib_core]
? i40iw_register_rdma_device+0x1f4/0x400 [i40iw]
? kmem_cache_alloc_trace+0x143/0x330
? __raw_spin_lock_init+0x2d/0x50
i40iw_register_rdma_device+0x2dc/0x400 [i40iw]
i40iw_open+0x10a6/0x1950 [i40iw]
? i40iw_open+0xeab/0x1950 [i40iw]
? i40iw_make_cm_node+0x9c0/0x9c0 [i40iw]
i40e_client_subtask+0xa4/0x110 [i40e]
i40e_service_task+0xc2d/0x1320 [i40e]
process_one_work+0x203/0x710
? process_one_work+0x16f/0x710
worker_thread+0x126/0x4a0
? trace_hardirqs_on+0xd/0x10
kthread+0x112/0x150
? process_one_work+0x710/0x710
? kthread_create_on_node+0x40/0x40
ret_from_fork+0x2e/0x40
---[ end trace fd11b69e21ea7653 ]---
Couldn't register device i40iw0 with driver model

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-16 11:47:55 -04:00
Steve Wise
d4ba61d218 iw_cxgb4: fix misuse of integer variable
Fixes: ee30f7d507 ("iw_cxgb4: Max fastreg depth depends on DSGL support")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-16 11:43:21 -04:00
Colin Ian King
5b59a3969e IB/hns: fix memory leak on ah on error return path
When dmac is NULL, ah is not being freed on the error return path. Fix
this by kfree'ing it.

Detected by CoverityScan, CID#1452636 ("Resource Leak")

Fixes: d8966fcd4c ("IB/core: Use rdma_ah_attr accessor functions")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-16 11:30:33 -04:00
Christopher N Bednarz
aa939c12ab i40iw: Fix potential fcn_id_array out of bounds
Avoid out of bounds error by utilizing I40IW_MAX_STATS_COUNT
instead of I40IW_INVALID_FCN_ID.

Signed-off-by: Christopher N Bednarz <christoper.n.bednarz@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-16 11:27:44 -04:00
Christopher N Bednarz
a28f047e5f i40iw: Use correct alignment for CQ0 memory
Utilize correct alignment variable when allocating
DMA memory for CQ0.

Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-16 11:27:44 -04:00
Mustafa Ismail
29c2415a66 i40iw: Fix typecast of tcp_seq_num
The typecast of tcp_seq_num incorrectly uses u8. Fix by
casting to u32.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-16 11:27:44 -04:00
Mustafa Ismail
8129331f01 i40iw: Correct variable names
Fix incorrect naming of status code and struct. Use inline
instead of immediate.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-16 11:27:44 -04:00
Chien Tin Tung
f67ace2d88 i40iw: Fix parsing of query/commit FPM buffers
Parsing of commit/query Host Memory Cache Function Private Memory
is not skipping over reserved fields and incorrectly assigning
those values into object's base/cnt/max_cnt fields. Skip over
reserved fields and set correct values. Also correct memory
alignment requirement for commit/query FPM buffers.

Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-16 11:27:44 -04:00
Rodrigo Vivi
31c1a7b8f9 drm/i915/cnl: Fix LSPCON support.
When LSPCON support was extended to CNL
one part was missed on lspcon_init.

So, instead of adding check per platform on lspcon_init
let's use HAS_LSPCON that is already there for that
purpose.

Fixes: ff15947e0f ("drm/i915/cnl: LSPCON support is gen9+")
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170816030403.11368-1-rodrigo.vivi@intel.com
(cherry picked from commit acf58d4e96)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-16 18:02:13 +03:00
Jani Nikula
7c648bde21 drm/i915/vbt: ignore extraneous child devices for a port
Ever since we've parsed VBT child devices, starting from 6acab15a7b
("drm/i915: use the HDMI DDI buffer translations from VBT"), we've
ignored the child device information if more than one child device
references the same port. The rationale for this seems lost in time.

Since commit 311a20949f ("drm/i915: don't init DP or HDMI when not
supported by DDI port") we started using this information more to skip
HDMI/DP init if the port wasn't there per VBT child devices. However, at
the same time it added port defaults without further explanation.

Thus, if the child device info was skipped due to multiple child devices
referencing the same port, the device info would be retrieved from the
somewhat arbitrary defaults.

Finally, when commit bb1d132935 ("drm/i915/vbt: split out defaults
that are set when there is no VBT") stopped initializing the defaults
whenever VBT is present, thus trusting the VBT more, we stopped
initializing ports which were referenced by more than one child device.

Apparently at least Asus UX305UA, UX305U, and UX306U laptops have VBT
child device blocks which cause this behaviour. Arguably they were
shipped with a broken VBT.

Relax the rules for multiple references to the same port, and use the
first child device info to reference a port. Retain the logic to debug
log about this, though.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101745
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196233
Fixes: bb1d132935 ("drm/i915/vbt: split out defaults that are set when there is no VBT")
Tested-by: Oliver Weißbarth <mail@oweissbarth.de>
Reported-by: Oliver Weißbarth <mail@oweissbarth.de>
Reported-by: Didier G <didierg-divers@orange.fr>
Reported-by: Giles Anderson <agander@gmail.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: <stable@vger.kernel.org> # v4.12+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170811113907.6716-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit b5273d7275)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-16 18:02:08 +03:00
Balasubramaniam, Hari Chand
211b7aac54 drm/i915: Initialize 'data' in intel_dsi_dcs_backlight.c
variable 'data' may be used uninitialized in this function. thus,
'function dcs_get_backlight' will return unwanted value/fail.

Thus, adding NULL initialized to 'data' variable will solve the return
failure happening.

v2: Change commit message to reflect upstream with proper message

Fixes: 90198355b8 ("drm/i915/dsi: Add DCS control for Panel PWM")
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Yetunde Adebisi <yetundex.adebisi@intel.com>
Cc: Deepak M <m.deepak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Balasubramaniam, Hari Chand <hari.chand.balasubramaniam@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1502762746-191826-1-git-send-email-hari.chand.balasubramaniam@intel.com
(cherry picked from commit d59814a5b4)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-16 18:02:00 +03:00
Bryan Tan
a7d2e03928 RDMA/vmw_pvrdma: Report CQ missed events
There is a chance of a race between arming the CQ and receiving
completions. By reporting CQ missed events any ULPs should poll
again to get the completions.

Fixes: 29c8d9eba5 ("IB: Add vmw_pvrdma driver")
Acked-by: Aditya Sarwade <asarwade@vmware.com>
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-16 10:56:04 -04:00
Benjamin Herrenschmidt
5a69aec945 powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC
VSX uses a combination of the old vector registers, the old FP
registers and new "second halves" of the FP registers.

Thus when we need to see the VSX state in the thread struct
(flush_vsx_to_thread()) or when we'll use the VSX in the kernel
(enable_kernel_vsx()) we need to ensure they are all flushed into
the thread struct if either of them is individually enabled.

Unfortunately we only tested if the whole VSX was enabled, not if they
were individually enabled.

Fixes: 72cd7b44bc ("powerpc: Uncomment and make enable_kernel_vsx() routine available")
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-16 19:35:54 +10:00
James Smart
16a5a480f0 nvmet-fc: correct use after free on list teardown
Use list_for_each_entry_safe to prevent list handling from referencing
next pointers directly after list_del's

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-16 10:06:18 +02:00
Martin Wilck
42819eb7a0 nvmet: don't overwrite identify sn/fr with 0-bytes
The merged version of my patch "nvmet: don't report 0-bytes in serial
number" fails to remove two lines which should have been replaced,
so that the space-padded strings are overwritten again with 0-bytes.
Fix it.

Fixes: 42de82a8b5 nvmet: don't report 0-bytes in serial number
Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimbeg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-16 10:06:10 +02:00
Thomas Bogendoerfer
4098116039 parisc: pci memory bar assignment fails with 64bit kernels on dino/cujo
For 64bit kernels the lmmio_space_offset of the host bridge window
isn't set correctly on systems with dino/cujo PCI host bridges.
This leads to not assigned memory bars and failing drivers, which
need to use these bars.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: <stable@vger.kernel.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
2017-08-16 09:50:39 +02:00
Linus Torvalds
510c8a899c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix TCP checksum offload handling in iwlwifi driver, from Emmanuel
    Grumbach.

 2) In ksz DSA tagging code, free SKB if skb_put_padto() fails. From
    Vivien Didelot.

 3) Fix two regressions with bonding on wireless, from Andreas Born.

 4) Fix build when busypoll is disabled, from Daniel Borkmann.

 5) Fix copy_linear_skb() wrt. SO_PEEK_OFF, from Eric Dumazet.

 6) Set SKB cached route properly in inet_rtm_getroute(), from Florian
    Westphal.

 7) Fix PCI-E relaxed ordering handling in cxgb4 driver, from Ding
    Tianhong.

 8) Fix module refcnt leak in ULP code, from Sabrina Dubroca.

 9) Fix use of GFP_KERNEL in atomic contexts in AF_KEY code, from Eric
    Dumazet.

10) Need to purge socket write queue in dccp_destroy_sock(), also from
    Eric Dumazet.

11) Make bpf_trace_printk() work properly on 32-bit architectures, from
    Daniel Borkmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  bpf: fix bpf_trace_printk on 32 bit archs
  PCI: fix oops when try to find Root Port for a PCI device
  sfc: don't try and read ef10 data on non-ef10 NIC
  net_sched: remove warning from qdisc_hash_add
  net_sched/sfq: update hierarchical backlog when drop packet
  net_sched: reset pointers to tcf blocks in classful qdiscs' destructors
  ipv4: fix NULL dereference in free_fib_info_rcu()
  net: Fix a typo in comment about sock flags.
  ipv6: fix NULL dereference in ip6_route_dev_notify()
  tcp: fix possible deadlock in TCP stack vs BPF filter
  dccp: purge write queue in dccp_destroy_sock()
  udp: fix linear skb reception with PEEK_OFF
  ipv6: release rt6->rt6i_idev properly during ifdown
  af_key: do not use GFP_KERNEL in atomic contexts
  tcp: ulp: avoid module refcnt leak in tcp_set_ulp
  net/cxgb4vf: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
  net/cxgb4: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
  PCI: Disable Relaxed Ordering Attributes for AMD A1100
  PCI: Disable Relaxed Ordering for some Intel processors
  PCI: Disable PCIe Relaxed Ordering if unsupported
  ...
2017-08-15 18:52:28 -07:00
Daniel Borkmann
88a5c690b6 bpf: fix bpf_trace_printk on 32 bit archs
James reported that on MIPS32 bpf_trace_printk() is currently
broken while MIPS64 works fine:

  bpf_trace_printk() uses conditional operators to attempt to
  pass different types to __trace_printk() depending on the
  format operators. This doesn't work as intended on 32-bit
  architectures where u32 and long are passed differently to
  u64, since the result of C conditional operators follows the
  "usual arithmetic conversions" rules, such that the values
  passed to __trace_printk() will always be u64 [causing issues
  later in the va_list handling for vscnprintf()].

  For example the samples/bpf/tracex5 test printed lines like
  below on MIPS32, where the fd and buf have come from the u64
  fd argument, and the size from the buf argument:

    [...] 1180.941542: 0x00000001: write(fd=1, buf=  (null), size=6258688)

  Instead of this:

    [...] 1625.616026: 0x00000001: write(fd=1, buf=009e4000, size=512)

One way to get it working is to expand various combinations
of argument types into 8 different combinations for 32 bit
and 64 bit kernels. Fix tested by James on MIPS32 and MIPS64
as well that it resolves the issue.

Fixes: 9c959c863f ("tracing: Allow BPF programs to call bpf_trace_printk()")
Reported-by: James Hogan <james.hogan@imgtec.com>
Tested-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:32:15 -07:00
dingtianhong
0e40523287 PCI: fix oops when try to find Root Port for a PCI device
Eric report a oops when booting the system after applying
the commit a99b646afa ("PCI: Disable PCIe Relaxed..."):

[    4.241029] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[    4.247001] IP: pci_find_pcie_root_port+0x62/0x80
[    4.253011] PGD 0
[    4.253011] P4D 0
[    4.253011]
[    4.258013] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[    4.262015] Modules linked in:
[    4.265005] CPU: 31 PID: 1 Comm: swapper/0 Not tainted 4.13.0-dbx-DEV #316
[    4.271002] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
[    4.279002] task: ffffa2ee38cfa040 task.stack: ffffa51ec0004000
[    4.285001] RIP: 0010:pci_find_pcie_root_port+0x62/0x80
[    4.290012] RSP: 0000:ffffa51ec0007ab8 EFLAGS: 00010246
[    4.295003] RAX: 0000000000000000 RBX: ffffa2ee36bae000 RCX: 0000000000000006
[    4.303002] RDX: 000000000000081c RSI: ffffa2ee38cfa8c8 RDI: ffffa2ee36bae000
[    4.310013] RBP: ffffa51ec0007b58 R08: 0000000000000001 R09: 0000000000000000
[    4.317001] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa51ec0007ad0
[    4.324005] R13: ffffa2ee36bae098 R14: 0000000000000002 R15: ffffa2ee37204818
[    4.331002] FS:  0000000000000000(0000) GS:ffffa2ee3fcc0000(0000) knlGS:0000000000000000
[    4.339002] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.345001] CR2: 0000000000000050 CR3: 000000401000f000 CR4: 00000000001406e0
[    4.351002] Call Trace:
[    4.354012]  ? pci_configure_device+0x19f/0x570
[    4.359002]  ? pci_conf1_read+0xb8/0xf0
[    4.363002]  ? raw_pci_read+0x23/0x40
[    4.366011]  ? pci_read+0x2c/0x30
[    4.370014]  ? pci_read_config_word+0x67/0x70
[    4.374012]  pci_device_add+0x28/0x230
[    4.378012]  ? pci_vpd_f0_read+0x50/0x80
[    4.382014]  pci_scan_single_device+0x96/0xc0
[    4.386012]  pci_scan_slot+0x79/0xf0
[    4.389001]  pci_scan_child_bus+0x31/0x180
[    4.394014]  acpi_pci_root_create+0x1c6/0x240
[    4.398013]  pci_acpi_scan_root+0x15f/0x1b0
[    4.402012]  acpi_pci_root_add+0x2e6/0x400
[    4.406012]  ? acpi_evaluate_integer+0x37/0x60
[    4.411002]  acpi_bus_attach+0xdf/0x200
[    4.415002]  acpi_bus_attach+0x6a/0x200
[    4.418014]  acpi_bus_attach+0x6a/0x200
[    4.422013]  acpi_bus_scan+0x38/0x70
[    4.426011]  acpi_scan_init+0x10c/0x271
[    4.429001]  acpi_init+0x2fa/0x348
[    4.433004]  ? acpi_sleep_proc_init+0x2d/0x2d
[    4.437001]  do_one_initcall+0x43/0x169
[    4.441001]  kernel_init_freeable+0x1d0/0x258
[    4.445003]  ? rest_init+0xe0/0xe0
[    4.449001]  kernel_init+0xe/0x150

====================== cut here =============================

It looks like the pci_find_pcie_root_port() was trying to
find the Root Port for the PCI device which is the Root
Port already, it will return NULL and trigger the problem,
so check the highest_pcie_bridge to fix thie problem.

Fixes: a99b646afa ("PCI: Disable PCIe Relaxed Ordering if unsupported")
Fixes: c56d4450eb ("PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:25:16 -07:00
Bert Kenward
61deee9628 sfc: don't try and read ef10 data on non-ef10 NIC
The MAC stats command takes a port ID, which doesn't exist on
pre-ef10 NICs (5000- and 6000- series). This is extracted from the
NIC specific data; we misinterpret this as the ef10 data structure,
causing us to read potentially unallocated data. With a KASAN kernel
this can cause errors with:
   BUG: KASAN: slab-out-of-bounds in efx_mcdi_mac_stats

Fixes: 0a2ab4d988 ("sfc: set the port-id when calling MC_CMD_MAC_STATS")
Reported-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:19:34 -07:00
Konstantin Khlebnikov
c90e95147c net_sched: remove warning from qdisc_hash_add
It was added in commit e57a784d8c ("pkt_sched: set root qdisc
before change() in attach_default_qdiscs()") to hide duplicates
from "tc qdisc show" for incative deivices.

After 59cc1f61f ("net: sched: convert qdisc linked list to hashtable")
it triggered when classful qdisc is added to inactive device because
default qdiscs are added before switching root qdisc.

Anyway after commit ea32746953 ("net: sched: avoid duplicates in
qdisc dump") duplicates are filtered right in dumper.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:16:39 -07:00
Konstantin Khlebnikov
325d5dc3f7 net_sched/sfq: update hierarchical backlog when drop packet
When sfq_enqueue() drops head packet or packet from another queue it
have to update backlog at upper qdiscs too.

Fixes: 2ccccf5fb4 ("net_sched: update hierarchical backlog too")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:16:39 -07:00
Konstantin Khlebnikov
898904226b net_sched: reset pointers to tcf blocks in classful qdiscs' destructors
Traffic filters could keep direct pointers to classes in classful qdisc,
thus qdisc destruction first removes all filters before freeing classes.
Class destruction methods also tries to free attached filters but now
this isn't safe because tcf_block_put() unlike to tcf_destroy_chain()
cannot be called second time.

This patch set class->block to NULL after first tcf_block_put() and
turn second call into no-op.

Fixes: 6529eaba33 ("net: sched: introduce tcf block infractructure")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:16:39 -07:00
Eric Dumazet
187e5b3ac8 ipv4: fix NULL dereference in free_fib_info_rcu()
If fi->fib_metrics could not be allocated in fib_create_info()
we attempt to dereference a NULL pointer in free_fib_info_rcu() :

    m = fi->fib_metrics;
    if (m != &dst_default_metrics && atomic_dec_and_test(&m->refcnt))
            kfree(m);

Before my recent patch, we used to call kfree(NULL) and nothing wrong
happened.

Instead of using RCU to defer freeing while we are under memory stress,
it seems better to take immediate action.

This was reported by syzkaller team.

Fixes: 3fb07daff8 ("ipv4: add reference counting to metrics")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:07:52 -07:00
Tonghao Zhang
b3dc8f772f net: Fix a typo in comment about sock flags.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:07:17 -07:00
Eric Dumazet
12d94a8049 ipv6: fix NULL dereference in ip6_route_dev_notify()
Based on a syzkaller report [1], I found that a per cpu allocation
failure in snmp6_alloc_dev() would then lead to NULL dereference in
ip6_route_dev_notify().

It seems this is a very old bug, thus no Fixes tag in this submission.

Let's add in6_dev_put_clear() helper, as we will probably use
it elsewhere (once available/present in net-next)

[1]
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 17294 Comm: syz-executor6 Not tainted 4.13.0-rc2+ #10
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff88019f456680 task.stack: ffff8801c6e58000
RIP: 0010:__read_once_size include/linux/compiler.h:250 [inline]
RIP: 0010:atomic_read arch/x86/include/asm/atomic.h:26 [inline]
RIP: 0010:refcount_sub_and_test+0x7d/0x1b0 lib/refcount.c:178
RSP: 0018:ffff8801c6e5f1b0 EFLAGS: 00010202
RAX: 0000000000000037 RBX: dffffc0000000000 RCX: ffffc90005d25000
RDX: ffff8801c6e5f218 RSI: ffffffff82342bbf RDI: 0000000000000001
RBP: ffff8801c6e5f240 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff10038dcbe37
R13: 0000000000000006 R14: 0000000000000001 R15: 00000000000001b8
FS:  00007f21e0429700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001ddbc22000 CR3: 00000001d632b000 CR4: 00000000001426e0
DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Call Trace:
 refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211
 in6_dev_put include/net/addrconf.h:335 [inline]
 ip6_route_dev_notify+0x1c9/0x4a0 net/ipv6/route.c:3732
 notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
 __raw_notifier_call_chain kernel/notifier.c:394 [inline]
 raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
 call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1678
 call_netdevice_notifiers net/core/dev.c:1694 [inline]
 rollback_registered_many+0x91c/0xe80 net/core/dev.c:7107
 rollback_registered+0x1be/0x3c0 net/core/dev.c:7149
 register_netdevice+0xbcd/0xee0 net/core/dev.c:7587
 register_netdev+0x1a/0x30 net/core/dev.c:7669
 loopback_net_init+0x76/0x160 drivers/net/loopback.c:214
 ops_init+0x10a/0x570 net/core/net_namespace.c:118
 setup_net+0x313/0x710 net/core/net_namespace.c:294
 copy_net_ns+0x27c/0x580 net/core/net_namespace.c:418
 create_new_namespaces+0x425/0x880 kernel/nsproxy.c:107
 unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:206
 SYSC_unshare kernel/fork.c:2347 [inline]
 SyS_unshare+0x653/0xfa0 kernel/fork.c:2297
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4512c9
RSP: 002b:00007f21e0428c08 EFLAGS: 00000216 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 0000000000718150 RCX: 00000000004512c9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000062020200
RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004b973d
R13: 00000000ffffffff R14: 000000002001d000 R15: 00000000000002dd
Code: 50 2b 34 82 c7 00 f1 f1 f1 f1 c7 40 04 04 f2 f2 f2 c7 40 08 f3 f3
f3 f3 e8 a1 43 39 ff 4c 89 f8 48 8b 95 70 ff ff ff 48 c1 e8 03 <0f> b6
0c 18 4c 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85
RIP: __read_once_size include/linux/compiler.h:250 [inline] RSP:
ffff8801c6e5f1b0
RIP: atomic_read arch/x86/include/asm/atomic.h:26 [inline] RSP:
ffff8801c6e5f1b0
RIP: refcount_sub_and_test+0x7d/0x1b0 lib/refcount.c:178 RSP:
ffff8801c6e5f1b0
---[ end trace e441d046c6410d31 ]---

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:06:34 -07:00
Jan Kara
b5fed474b9 audit: Receive unmount event
Although audit_watch_handle_event() can handle FS_UNMOUNT event, it is
not part of AUDIT_FS_WATCH mask and thus such event never gets to
audit_watch_handle_event(). Thus fsnotify marks are deleted by fsnotify
subsystem on unmount without audit being notified about that which leads
to a strange state of existing audit rules with dead fsnotify marks.

Add FS_UNMOUNT to the mask of events to be received so that audit can
clean up its state accordingly.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-08-15 16:03:00 -04:00
Jan Kara
d76036ab47 audit: Fix use after free in audit_remove_watch_rule()
audit_remove_watch_rule() drops watch's reference to parent but then
continues to work with it. That is not safe as parent can get freed once
we drop our reference. The following is a trivial reproducer:

mount -o loop image /mnt
touch /mnt/file
auditctl -w /mnt/file -p wax
umount /mnt
auditctl -D
<crash in fsnotify_destroy_mark()>

Grab our own reference in audit_remove_watch_rule() earlier to make sure
mark does not get freed under us.

CC: stable@vger.kernel.org
Reported-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Tested-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-08-15 15:58:17 -04:00
Linus Torvalds
40c6d1b9e2 Merge tag 'linux-kselftest-4.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
 "This update consists of important compile and run-time error fixes to
  timers/freq-step, kmod, and sysctl tests"

* tag 'linux-kselftest-4.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: timers: freq-step: fix compile error
  selftests: futex: fix run_tests target
  test_sysctl: fix sysctl.sh by making it executable
  test_kmod: fix kmod.sh by making it executable
2017-08-15 12:49:43 -07:00
Chunming Zhou
7a7c286d07 drm/amdgpu: save list length when fence is signaled
update the list first to avoid redundant checks.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2017-08-15 14:10:01 -04:00
David S. Miller
0a6f04184d Merge tag 'wireless-drivers-for-davem-2017-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:

====================
wireless-drivers fixes for 4.13

This time quite a few fixes for iwlwifi and one major regression fix
for brcmfmac. For the iwlwifi aggregation bug a small change was
needed for mac80211, but as Johannes is still away the mac80211 patch
is taken via wireless-drivers tree.

brcmfmac

* fix firmware crash (a recent regression in bcm4343{0,1,8}

iwlwifi

* Some simple PCI HW ID fix-ups and additions for family 9000

* Remove a bogus warning message with new FWs (bug #196915)

* Don't allow illegal channel options to be used (bug #195299)

* A fix for checksum offload in family 9000

* A fix serious throughput degradation in 11ac with multiple streams

* An old bug in SMPS where the firmware was not aware of SMPS changes

* Fix a memory leak in the SAR code

* Fix a stuck queue case in AP mode;

* Convert a WARN to a simple debug in a legitimate race case (from
  which we can recover)

* Fix a severe throughput aggregation on 9000-family devices due to
  aggregation issues, needed a small change in mac80211
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 10:19:14 -07:00
Arnd Bergmann
44506752fa Merge tag 'at91-ab-4.13-dt-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux into fixes
Pull "DT fixes for 4.13" from Alexandre Belloni:

 - Fix NAND flash support for sama5d2

* tag 'at91-ab-4.13-dt-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  ARM: dts: at91: sama5d2: fix EBI/NAND controllers declaration
  ARM: dts: at91: sama5d2: use sama5d2 compatible string for SMC
2017-08-15 17:37:10 +02:00
Arnd Bergmann
64c2c372db Merge tag 'imx-fixes-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
Pull "i.MX fixes for 4.13, round 2" from Shawn Guo:

 - Add missing 'ranges' property for i.MX25 device tree TSCADC node, so
   that it's child nodes ADC and TSC device can be probed by kernel.
 - Fix i.MX GPCv2 power domain driver to request regulator after power
   domain initialization, since regulator could defer probing and
   therefore cause power domain initialized twice.

* tag 'imx-fixes-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: i.MX25: add ranges to tscadc
  soc: imx: gpcv2: fix regulator deferred probe
2017-08-15 17:34:52 +02:00
Arnd Bergmann
69a80d8c2c Merge tag 'imx-fixes-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
Pull "i.MX fixes for 4.13" from Shawn Guo:

 - A fix for imx7d-sdb board to move pinctrl_spi4 pins from low power
   iomux controller to normal iomuxc node, as the pins belong to normal
   iomuxc rather than low power one.

* tag 'imx-fixes-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imx7d-sdb: Put pinctrl_spi4 in the correct location
2017-08-15 17:34:04 +02:00
Munehisa Kamata
b15bd8cb37 xen-blkfront: use a right index when checking requests
Since commit d05d7f4079 ("Merge branch 'for-4.8/core' of
git://git.kernel.dk/linux-block") and 3fc9d69093 ("Merge branch
'for-4.8/drivers' of git://git.kernel.dk/linux-block"), blkfront_resume()
has been using an index for iterating ring_info to check request when
iterating blk_shadow in an inner loop. This seems to have been
accidentally introduced during the massive rewrite of the block layer
macros in the commits.

This may cause crash like this:

[11798.057074] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[11798.058832] IP: [<ffffffff814411fa>] blkfront_resume+0x10a/0x610
....
[11798.061063] Call Trace:
[11798.061063]  [<ffffffff8139ce93>] xenbus_dev_resume+0x53/0x140
[11798.061063]  [<ffffffff8139ce40>] ? xenbus_dev_probe+0x150/0x150
[11798.061063]  [<ffffffff813f359e>] dpm_run_callback+0x3e/0x110
[11798.061063]  [<ffffffff813f3a08>] device_resume+0x88/0x190
[11798.061063]  [<ffffffff813f4cc0>] dpm_resume+0x100/0x2d0
[11798.061063]  [<ffffffff813f5221>] dpm_resume_end+0x11/0x20
[11798.061063]  [<ffffffff813950a8>] do_suspend+0xe8/0x1a0
[11798.061063]  [<ffffffff813954bd>] shutdown_handler+0xfd/0x130
[11798.061063]  [<ffffffff8139aba0>] ? split+0x110/0x110
[11798.061063]  [<ffffffff8139ac26>] xenwatch_thread+0x86/0x120
[11798.061063]  [<ffffffff810b4570>] ? prepare_to_wait_event+0x110/0x110
[11798.061063]  [<ffffffff8108fe57>] kthread+0xd7/0xf0
[11798.061063]  [<ffffffff811da811>] ? kfree+0x121/0x170
[11798.061063]  [<ffffffff8108fd80>] ? kthread_park+0x60/0x60
[11798.061063]  [<ffffffff810863b0>] ?  call_usermodehelper_exec_work+0xb0/0xb0
[11798.061063]  [<ffffffff810864ea>] ?  call_usermodehelper_exec_async+0x13a/0x140
[11798.061063]  [<ffffffff81534a45>] ret_from_fork+0x25/0x30

Use the right index in the inner loop.

Fixes: d05d7f4079 ("Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-block")
Fixes: 3fc9d69093 ("Merge branch 'for-4.8/drivers' of git://git.kernel.dk/linux-block")
Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
Reviewed-by: Thomas Friebel <friebelt@amazon.de>
Reviewed-by: Eduardo Valentin <eduval@amazon.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Roger Pau Monne <roger.pau@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2017-08-15 10:34:04 -04:00
Roger Pau Monne
462cdace79 xen: fix bio vec merging
The current test for bio vec merging is not fully accurate and can be
tricked into merging bios when certain grant combinations are used.
The result of these malicious bio merges is a bio that extends past
the memory page used by any of the originating bios.

Take into account the following scenario, where a guest creates two
grant references that point to the same mfn, ie: grant 1 -> mfn A,
grant 2 -> mfn A.

These references are then used in a PV block request, and mapped by
the backend domain, thus obtaining two different pfns that point to
the same mfn, pfn B -> mfn A, pfn C -> mfn A.

If those grants happen to be used in two consecutive sectors of a disk
IO operation becoming two different bios in the backend domain, the
checks in xen_biovec_phys_mergeable will succeed, because bfn1 == bfn2
(they both point to the same mfn). However due to the bio merging,
the backend domain will end up with a bio that expands past mfn A into
mfn A + 1.

Fix this by making sure the check in xen_biovec_phys_mergeable takes
into account the offset and the length of the bio, this basically
replicates whats done in __BIOVEC_PHYS_MERGEABLE using mfns (bus
addresses). While there also remove the usage of
__BIOVEC_PHYS_MERGEABLE, since that's already checked by the callers
of xen_biovec_phys_mergeable.

CC: stable@vger.kernel.org
Reported-by: "Jan H. Schönherr" <jschoenh@amazon.de>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2017-08-15 10:32:49 -04:00
Keith Busch
3280d66a63 blk-mq: Fix queue usage on failed request allocation
blk_mq_get_request() does not release the callers queue usage counter
when allocation fails. The caller still needs to account for its own
queue usage when it is unable to allocate a request.

Fixes: 1ad43c0078 ("blk-mq: don't leak preempt counter/q_usage_counter when allocating rq failed")

Reported-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-15 07:22:34 -06:00
Joerg Roedel
2926a2aa5c iommu: Fix wrong freeing of iommu_device->dev
The struct iommu_device has a 'struct device' embedded into
it, not as a pointer, but the whole struct. In the
conversion of the iommu drivers to use struct iommu_device
it was forgotten that the relase function for that struct
device simply calls kfree() on the pointer.

This frees memory that was never allocated and causes memory
corruption.

To fix this issue, use a pointer to struct device instead of
embedding the whole struct. This needs some updates in the
iommu sysfs code as well as the Intel VT-d and AMD IOMMU
driver.

Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: 39ab9555c2 ('iommu: Add sysfs bindings for struct iommu_device')
Cc: stable@vger.kernel.org # >= v4.11
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 13:58:48 +02:00
Thomas Gleixner
84393817db x86/mtrr: Prevent CPU hotplug lock recursion
Larry reported a CPU hotplug lock recursion in the MTRR code.

============================================
WARNING: possible recursive locking detected

systemd-udevd/153 is trying to acquire lock:
 (cpu_hotplug_lock.rw_sem){.+.+.+}, at: [<c030fc26>] stop_machine+0x16/0x30
 
 but task is already holding lock:
  (cpu_hotplug_lock.rw_sem){.+.+.+}, at: [<c0234353>] mtrr_add_page+0x83/0x470

....

 cpus_read_lock+0x48/0x90
 stop_machine+0x16/0x30
 mtrr_add_page+0x18b/0x470
 mtrr_add+0x3e/0x70

mtrr_add_page() holds the hotplug rwsem already and calls stop_machine()
which acquires it again.

Call stop_machine_cpuslocked() instead.

Reported-and-tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1708140920250.1865@nanos
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@suse.de>
2017-08-15 13:03:47 +02:00
Maarten Lankhorst
a0ffc51e20 drm/atomic: If the atomic check fails, return its value first
The last part of drm_atomic_check_only is testing whether we need to
fail with -EINVAL when modeset is not allowed, but forgets to return
the value when atomic_check() fails first.

This results in -EDEADLK being replaced by -EINVAL, and the sanity
check in drm_modeset_drop_locks kicks in:

[  308.531734] ------------[ cut here ]------------
[  308.531791] WARNING: CPU: 0 PID: 1886 at drivers/gpu/drm/drm_modeset_lock.c:217 drm_modeset_drop_locks+0x33/0xc0 [drm]
[  308.531828] Modules linked in:
[  308.532050] CPU: 0 PID: 1886 Comm: kms_atomic Tainted: G     U  W 4.13.0-rc5-patser+ #5225
[  308.532082] Hardware name: NUC5i7RYB, BIOS RYBDWi35.86A.0246.2015.0309.1355 03/09/2015
[  308.532124] task: ffff8800cd9dae00 task.stack: ffff8800ca3b8000
[  308.532168] RIP: 0010:drm_modeset_drop_locks+0x33/0xc0 [drm]
[  308.532189] RSP: 0018:ffff8800ca3bf980 EFLAGS: 00010282
[  308.532211] RAX: dffffc0000000000 RBX: ffff8800ca3bfaf8 RCX: 0000000013a171e6
[  308.532235] RDX: 1ffff10019477f69 RSI: ffffffffa8ba4fa0 RDI: ffff8800ca3bfb48
[  308.532258] RBP: ffff8800ca3bf998 R08: 0000000000000000 R09: 0000000000000003
[  308.532281] R10: 0000000079dbe066 R11: 00000000f760b34b R12: 0000000000000001
[  308.532304] R13: dffffc0000000000 R14: 00000000ffffffea R15: ffff880096889680
[  308.532328] FS:  00007ff00959cec0(0000) GS:ffff8800d4e00000(0000) knlGS:0000000000000000
[  308.532359] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  308.532380] CR2: 0000000000000008 CR3: 00000000ca2e3000 CR4: 00000000003406f0
[  308.532402] Call Trace:
[  308.532440]  drm_mode_atomic_ioctl+0x19fa/0x1c00 [drm]
[  308.532488]  ? drm_atomic_set_property+0x1220/0x1220 [drm]
[  308.532565]  ? avc_has_extended_perms+0xc39/0xff0
[  308.532593]  ? lock_downgrade+0x610/0x610
[  308.532640]  ? drm_atomic_set_property+0x1220/0x1220 [drm]
[  308.532680]  drm_ioctl_kernel+0x154/0x1a0 [drm]
[  308.532755]  drm_ioctl+0x624/0x8f0 [drm]
[  308.532858]  ? drm_atomic_set_property+0x1220/0x1220 [drm]
[  308.532976]  ? drm_getunique+0x210/0x210 [drm]
[  308.533061]  do_vfs_ioctl+0xd92/0xe40
[  308.533121]  ? ioctl_preallocate+0x1b0/0x1b0
[  308.533160]  ? selinux_capable+0x20/0x20
[  308.533191]  ? do_fcntl+0x1b1/0xbf0
[  308.533219]  ? kasan_slab_free+0xa2/0xb0
[  308.533249]  ? f_getown+0x4b/0xa0
[  308.533278]  ? putname+0xcf/0xe0
[  308.533309]  ? security_file_ioctl+0x57/0x90
[  308.533342]  SyS_ioctl+0x4e/0x80
[  308.533374]  entry_SYSCALL_64_fastpath+0x18/0xad
[  308.533405] RIP: 0033:0x7ff00779e4d7
[  308.533431] RSP: 002b:00007fff66a043d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  308.533481] RAX: ffffffffffffffda RBX: 000000e7c7ca5910 RCX: 00007ff00779e4d7
[  308.533560] RDX: 00007fff66a04430 RSI: 00000000c03864bc RDI: 0000000000000003
[  308.533608] RBP: 00007ff007a5fb00 R08: 000000e7c7ca4620 R09: 000000e7c7ca5e60
[  308.533647] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000070
[  308.533685] R13: 0000000000000000 R14: 0000000000000000 R15: 000000e7c7ca5930
[  308.533770] Code: ff df 55 48 89 e5 41 55 41 54 53 48 89 fb 48 83 c7
50 48 89 fa 48 c1 ea 03 80 3c 02 00 74 05 e8 94 d4 16 e7 48 83 7b 50 00
74 02 <0f> ff 4c 8d 6b 58 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1
[  308.534086] ---[ end trace 77f11e53b1df44ad ]---

Solve this by adding the missing return.

This is also a bugfix because we could end up rejecting updates with
-EINVAL because of a early -EDEADLK, while if atomic_check ran to
completion it might have downgraded the modeset to a fastset.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Testcase: kms_atomic
Link: https://patchwork.freedesktop.org/patch/msgid/20170815095706.23624-1-maarten.lankhorst@linux.intel.com
Fixes: d34f20d6e2 ("drm: Atomic modeset ioctl")
Cc: <stable@vger.kernel.org> # v4.0+
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-08-15 12:38:05 +02:00
Takashi Iwai
a8e800fe0f ALSA: usb-audio: Apply sample rate quirk to Sennheiser headset
A Senheisser headset requires the typical sample-rate quirk for
avoiding spurious errors from inquiring the current sample rate like:
 usb 1-1: 2:1: cannot get freq at ep 0x4
 usb 1-1: 3:1: cannot get freq at ep 0x83

The USB ID 1395:740a has to be added to the entries in
snd_usb_get_sample_rate_quirk().

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1052580
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-15 09:11:23 +02:00
Daniel Mentz
7e1d90f60a ALSA: seq: 2nd attempt at fixing race creating a queue
commit 4842e98f26 ("ALSA: seq: Fix race at
creating a queue") attempted to fix a race reported by syzkaller. That
fix has been described as follows:

"
When a sequencer queue is created in snd_seq_queue_alloc(),it adds the
new queue element to the public list before referencing it.  Thus the
queue might be deleted before the call of snd_seq_queue_use(), and it
results in the use-after-free error, as spotted by syzkaller.

The fix is to reference the queue object at the right time.
"

Even with that fix in place, syzkaller reported a use-after-free error.
It specifically pointed to the last instruction "return q->queue" in
snd_seq_queue_alloc(). The pointer q is being used after kfree() has
been called on it.

It turned out that there is still a small window where a race can
happen. The window opens at
snd_seq_ioctl_create_queue()->snd_seq_queue_alloc()->queue_list_add()
and closes at
snd_seq_ioctl_create_queue()->queueptr()->snd_use_lock_use(). Between
these two calls, a different thread could delete the queue and possibly
re-create a different queue in the same location in queue_list.

This change prevents this situation by calling snd_use_lock_use() from
snd_seq_queue_alloc() prior to calling queue_list_add(). It is then the
caller's responsibility to call snd_use_lock_free(&q->use_lock).

Fixes: 4842e98f26 ("ALSA: seq: Fix race at creating a queue")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-15 08:02:35 +02:00
Eric Dumazet
d624d276d1 tcp: fix possible deadlock in TCP stack vs BPF filter
Filtering the ACK packet was not put at the right place.

At this place, we already allocated a child and put it
into accept queue.

We absolutely need to call tcp_child_process() to release
its spinlock, or we will deadlock at accept() or close() time.

Found by syzkaller team (Thanks a lot !)

Fixes: 8fac365f63 ("tcp: Add a tcp_filter hook before handle ack packet")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Chenbo Feng <fengc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:31:27 -07:00
Eric Dumazet
7749d4ff88 dccp: purge write queue in dccp_destroy_sock()
syzkaller reported that DCCP could have a non empty
write queue at dismantle time.

WARNING: CPU: 1 PID: 2953 at net/core/stream.c:199 sk_stream_kill_queues+0x3ce/0x520 net/core/stream.c:199
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 2953 Comm: syz-executor0 Not tainted 4.13.0-rc4+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 panic+0x1e4/0x417 kernel/panic.c:180
 __warn+0x1c4/0x1d9 kernel/panic.c:541
 report_bug+0x211/0x2d0 lib/bug.c:183
 fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
 do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
 invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:846
RIP: 0010:sk_stream_kill_queues+0x3ce/0x520 net/core/stream.c:199
RSP: 0018:ffff8801d182f108 EFLAGS: 00010297
RAX: ffff8801d1144140 RBX: ffff8801d13cb280 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff85137b00 RDI: ffff8801d13cb280
RBP: ffff8801d182f148 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801d13cb4d0
R13: ffff8801d13cb3b8 R14: ffff8801d13cb300 R15: ffff8801d13cb3b8
 inet_csk_destroy_sock+0x175/0x3f0 net/ipv4/inet_connection_sock.c:835
 dccp_close+0x84d/0xc10 net/dccp/proto.c:1067
 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
 sock_release+0x8d/0x1e0 net/socket.c:597
 sock_close+0x16/0x20 net/socket.c:1126
 __fput+0x327/0x7e0 fs/file_table.c:210
 ____fput+0x15/0x20 fs/file_table.c:246
 task_work_run+0x18a/0x260 kernel/task_work.c:116
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0xa32/0x1b10 kernel/exit.c:865
 do_group_exit+0x149/0x400 kernel/exit.c:969
 get_signal+0x7e8/0x17e0 kernel/signal.c:2330
 do_signal+0x94/0x1ee0 arch/x86/kernel/signal.c:808
 exit_to_usermode_loop+0x21c/0x2d0 arch/x86/entry/common.c:157
 prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
 syscall_return_slowpath+0x3a7/0x450 arch/x86/entry/common.c:263

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:28:18 -07:00
Al Viro
42b7305905 udp: fix linear skb reception with PEEK_OFF
copy_linear_skb() is broken; both of its callers actually
expect 'len' to be the amount we are trying to copy,
not the offset of the end.
Fix it keeping the meanings of arguments in sync with what the
callers (both of them) expect.
Also restore a saner behavior on EFAULT (i.e. preserving
the iov_iter position in case of failure):

The commit fd851ba9ca ("udp: harden copy_linear_skb()")
avoids the more destructive effect of the buggy
copy_linear_skb(), e.g. no more invalid memory access, but
said function still behaves incorrectly: when peeking with
offset it can fail with EINVAL instead of copying the
appropriate amount of memory.

Reported-by: Sasha Levin <alexander.levin@verizon.com>
Fixes: b65ac44674 ("udp: try to avoid 2 cache miss on dequeue")
Fixes: fd851ba9ca ("udp: harden copy_linear_skb()")
Signed-off-by: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Sasha Levin <alexander.levin@verizon.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:26:51 -07:00
Wei Wang
e5645f51ba ipv6: release rt6->rt6i_idev properly during ifdown
When a dst is created by addrconf_dst_alloc() for a host route or an
anycast route, dst->dev points to loopback dev while rt6->rt6i_idev
points to a real device.
When the real device goes down, the current cleanup code only checks for
dst->dev and assumes rt6->rt6i_idev->dev is the same. This causes the
refcount leak on the real device in the above situation.
This patch makes sure to always release the refcount taken on
rt6->rt6i_idev during dst_dev_put().

Fixes: 587fea7411 ("ipv6: mark DST_NOGC and remove the operation of
dst_free()")
Reported-by: John Stultz <john.stultz@linaro.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Tested-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:18:48 -07:00
Eric Dumazet
36f41f8fc6 af_key: do not use GFP_KERNEL in atomic contexts
pfkey_broadcast() might be called from non process contexts,
we can not use GFP_KERNEL in these cases [1].

This patch partially reverts commit ba51b6be38 ("net: Fix RCU splat in
af_key"), only keeping the GFP_ATOMIC forcing under rcu_read_lock()
section.

[1] : syzkaller reported :

in_atomic(): 1, irqs_disabled(): 0, pid: 2932, name: syzkaller183439
3 locks held by syzkaller183439/2932:
 #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff83b43888>] pfkey_sendmsg+0x4c8/0x9f0 net/key/af_key.c:3649
 #1:  (&pfk->dump_lock){+.+.+.}, at: [<ffffffff83b467f6>] pfkey_do_dump+0x76/0x3f0 net/key/af_key.c:293
 #2:  (&(&net->xfrm.xfrm_policy_lock)->rlock){+...+.}, at: [<ffffffff83957632>] spin_lock_bh include/linux/spinlock.h:304 [inline]
 #2:  (&(&net->xfrm.xfrm_policy_lock)->rlock){+...+.}, at: [<ffffffff83957632>] xfrm_policy_walk+0x192/0xa30 net/xfrm/xfrm_policy.c:1028
CPU: 0 PID: 2932 Comm: syzkaller183439 Not tainted 4.13.0-rc4+ #24
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 ___might_sleep+0x2b2/0x470 kernel/sched/core.c:5994
 __might_sleep+0x95/0x190 kernel/sched/core.c:5947
 slab_pre_alloc_hook mm/slab.h:416 [inline]
 slab_alloc mm/slab.c:3383 [inline]
 kmem_cache_alloc+0x24b/0x6e0 mm/slab.c:3559
 skb_clone+0x1a0/0x400 net/core/skbuff.c:1037
 pfkey_broadcast_one+0x4b2/0x6f0 net/key/af_key.c:207
 pfkey_broadcast+0x4ba/0x770 net/key/af_key.c:281
 dump_sp+0x3d6/0x500 net/key/af_key.c:2685
 xfrm_policy_walk+0x2f1/0xa30 net/xfrm/xfrm_policy.c:1042
 pfkey_dump_sp+0x42/0x50 net/key/af_key.c:2695
 pfkey_do_dump+0xaa/0x3f0 net/key/af_key.c:299
 pfkey_spddump+0x1a0/0x210 net/key/af_key.c:2722
 pfkey_process+0x606/0x710 net/key/af_key.c:2814
 pfkey_sendmsg+0x4d6/0x9f0 net/key/af_key.c:3650
sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 ___sys_sendmsg+0x755/0x890 net/socket.c:2035
 __sys_sendmsg+0xe5/0x210 net/socket.c:2069
 SYSC_sendmsg net/socket.c:2080 [inline]
 SyS_sendmsg+0x2d/0x50 net/socket.c:2076
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x445d79
RSP: 002b:00007f32447c1dc8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000445d79
RDX: 0000000000000000 RSI: 000000002023dfc8 RDI: 0000000000000008
RBP: 0000000000000086 R08: 00007f32447c2700 R09: 00007f32447c2700
R10: 00007f32447c2700 R11: 0000000000000202 R12: 0000000000000000
R13: 00007ffe33edec4f R14: 00007f32447c29c0 R15: 0000000000000000

Fixes: ba51b6be38 ("net: Fix RCU splat in af_key")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: David Ahern <dsa@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:18:12 -07:00
Sabrina Dubroca
539a06baed tcp: ulp: avoid module refcnt leak in tcp_set_ulp
__tcp_ulp_find_autoload returns tcp_ulp_ops after taking a reference on
the module. Then, if ->init fails, tcp_set_ulp propagates the error but
nothing releases that reference.

Fixes: 734942cc4e ("tcp: ULP infrastructure")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:17:05 -07:00
David S. Miller
bae514a688 Merge branch 'Add-new-PCI_DEV_FLAGS_NO_RELAXED_ORDERING-flag'
Ding Tianhong says:

====================
Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag

Some devices have problems with Transaction Layer Packets with the Relaxed
Ordering Attribute set.  This patch set adds a new PCIe Device Flag,
PCI_DEV_FLAGS_NO_RELAXED_ORDERING, a set of PCI Quirks to catch some known
devices with Relaxed Ordering issues, and a use of this new flag by the
cxgb4 driver to avoid using Relaxed Ordering with problematic Root Complex
Ports.

It's been years since I've submitted kernel.org patches, I appolgise for the
almost certain submission errors.

v2: Alexander point out that the v1 was only a part of the whole solution,
    some platform which has some issues could use the new flag to indicate
    that it is not safe to enable relaxed ordering attribute, then we need
    to clear the relaxed ordering enable bits in the PCI configuration when
    initializing the device. So add a new second patch to modify the PCI
    initialization code to clear the relaxed ordering enable bit in the
    event that the root complex doesn't want relaxed ordering enabled.

    The third patch was base on the v1's second patch and only be changed
    to query the relaxed ordering enable bit in the PCI configuration space
    to allow the Chelsio NIC to send TLPs with the relaxed ordering attributes
    set.

    This version didn't plan to drop the defines for Intel Drivers to use the
    new checking way to enable relaxed ordering because it is not the hardest
    part of the moment, we could fix it in next patchset when this patches
    reach the goal.

v3: Redesigned the logic for pci_configure_relaxed_ordering when configuration,
    If a PCIe device didn't enable the relaxed ordering attribute default,
    we should not do anything in the PCIe configuration, otherwise we
    should check if any of the devices above us do not support relaxed
    ordering by the PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag, then base on
    the result if we get a return that indicate that the relaxed ordering
    is not supported we should update our device to disable relaxed ordering
    in configuration space. If the device above us doesn't exist or isn't
    the PCIe device, we shouldn't do anything and skip updating relaxed ordering
    because we are probably running in a guest.

v4: Rename the functions pcie_get_relaxed_ordering and pcie_disable_relaxed_ordering
    according John's suggestion, and modify the description, use the true/false
    as the return value.

    We shouldn't enable relaxed ordering attribute by the setting in the root
    complex configuration space for PCIe device, so fix it for cxgb4.

    Fix some format issues.

v5: Removed the unnecessary code for some function which only return the bool
    value, and add the check for VF device.

    Make this patch set base on 4.12-rc5.

v6: Fix the logic error in the need to enable the relaxed ordering attribute for cxgb4.

v7: The cxgb4 drivers will enable the PCIe Capability Device Control[Relaxed
    Ordering Enable] in PCI Probe() routine, this will break our current
    solution for some platform which has problematic when enable the relaxed
    ordering attribute. According to the latest recommendations, remove the
    enable_pcie_relaxed_ordering(), although it could not cover the Peer-to-Peer
    scene, but we agree to leave this problem until we really trigger it.

    Make this patch set base on 4.12 release version.

v8: Change the second patch title and description to make it more reasonable,
    add the acked-by from Alex and Ashok.

    Add a new patch to enable the Relaxed Ordering Attribute for cxgb4vf driver.

    Make this patch set base on 4.13-rc2.

v9: The document (https://software.intel.com/sites/default/files/managed/9e/
    bc/64-ia-32-architectures-optimization-manual.pdf) indicate that the Xeon
    processors based on Broadwell/Haswell microarchitecture has the problem
    with Relaxed Ordering Attribute enabled, so add the whole list Device ID
    from Intel to the patch.

v10: Significant rework based on Bjorn's feedback, reorganize the first 2 patches,
     now the Intel and AMD erratum soc has been divided to the different patches,
     rename the pcie_relaxed_ordering_supported() to pcie_relaxed_ordering_enabled(),
     and no need to check every intervening switch except the root ports, update
     some commits.

v11: We shouldn't let the Intel engineer to acked the AMD's erratum patch, fix the
     funny mistake.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:14:51 -07:00
Casey Leedom
b629276df7 net/cxgb4vf: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
cxgb4vf Ethernet driver now queries PCIe configuration space to
determine if it can send TLPs to it with the Relaxed Ordering
Attribute set, just like the pf did.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Reviewed-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:14:51 -07:00
Casey Leedom
b0ba9d5fde net/cxgb4: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
cxgb4 Ethernet driver now queries PCIe configuration space to determine
if it can send TLPs to it with the Relaxed Ordering Attribute set.

Remove the enable_pcie_relaxed_ordering() to avoid enable PCIe Capability
Device Control[Relaxed Ordering Enable] at probe routine, to make sure
the driver will not send the Relaxed Ordering TLPs to the Root Complex which
could not deal the Relaxed Ordering TLPs.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Reviewed-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:14:51 -07:00
dingtianhong
077fa19c5d PCI: Disable Relaxed Ordering Attributes for AMD A1100
Casey reported that the AMD ARM A1100 SoC has a bug in its PCIe
Root Port where Upstream Transaction Layer Packets with the Relaxed
Ordering Attribute clear are allowed to bypass earlier TLPs with
Relaxed Ordering set, it would cause Data Corruption, so we need
to disable Relaxed Ordering Attribute when Upstream TLPs to the
Root Port.

Reported-and-suggested-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:14:50 -07:00
dingtianhong
87e09cdec4 PCI: Disable Relaxed Ordering for some Intel processors
According to the Intel spec section 3.9.1 said:

    3.9.1 Optimizing PCIe Performance for Accesses Toward Coherent Memory
          and Toward MMIO Regions (P2P)

    In order to maximize performance for PCIe devices in the processors
    listed in Table 3-6 below, the soft- ware should determine whether the
    accesses are toward coherent memory (system memory) or toward MMIO
    regions (P2P access to other devices). If the access is toward MMIO
    region, then software can command HW to set the RO bit in the TLP
    header, as this would allow hardware to achieve maximum throughput for
    these types of accesses. For accesses toward coherent memory, software
    can command HW to clear the RO bit in the TLP header (no RO), as this
    would allow hardware to achieve maximum throughput for these types of
    accesses.

    Table 3-6. Intel Processor CPU RP Device IDs for Processors Optimizing
               PCIe Performance

    Processor                            CPU RP Device IDs

    Intel Xeon processors based on       6F01H-6F0EH
    Broadwell microarchitecture

    Intel Xeon processors based on       2F01H-2F0EH
    Haswell microarchitecture

It means some Intel processors has performance issue when use the Relaxed
Ordering Attribute, so disable Relaxed Ordering for these root port.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:14:50 -07:00
dingtianhong
a99b646afa PCI: Disable PCIe Relaxed Ordering if unsupported
When bit4 is set in the PCIe Device Control register, it indicates
whether the device is permitted to use relaxed ordering.
On some platforms using relaxed ordering can have performance issues or
due to erratum can cause data-corruption. In such cases devices must avoid
using relaxed ordering.

The patch adds a new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING to indicate that
Relaxed Ordering (RO) attribute should not be used for Transaction Layer
Packets (TLP) targeted towards these affected root complexes.

This patch checks if there is any node in the hierarchy that indicates that
using relaxed ordering is not safe. In such cases the patch turns off the
relaxed ordering by clearing the capability for this device.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 22:14:50 -07:00
KT Liao
7698869040 Input: elan_i2c - Add antoher Lenovo ACPI ID for upcoming Lenovo NB
Add 2 new IDs (ELAN0609 and ELAN060B) to the list of ACPI IDs that should
be handled by the driver.

Signed-off-by: KT Liao <kt.liao@emc.com.tw>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-08-14 20:22:11 -07:00
Kai-Heng Feng
1874064eed Input: elan_i2c - add ELAN0608 to the ACPI table
Similar to commit 722c5ac708 ("Input: elan_i2c - add ELAN0605 to the
ACPI table"), ELAN0608 should be handled by elan_i2c.

This touchpad can be found in Lenovo ideapad 320-14IKB.

BugLink: https://bugs.launchpad.net/bugs/1708852

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-08-14 20:22:02 -07:00
Linus Torvalds
fcd0735000 Merge tag 'md/4.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Pull MD fixes from Shaohua Li:
 "Fix several bugs:

   - fix a rcu stall issue introduced in 4.12 (Neil Brown)

   - fix two raid5 cache race conditions (Song Liu)"

* tag 'md/4.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  MD: not clear ->safemode for external metadata array
  md/r5cache: fix io_unit handling in r5l_log_endio()
  md/r5cache: call mddev_lock/unlock() in r5c_journal_mode_set
  md: fix test in md_write_start()
  md: always clear ->safemode when md_check_recovery gets the mddev lock.
2017-08-14 13:09:59 -07:00
Brendan Higgins
f1c0b7e448 i2c: aspeed: fixed potential null pointer dereference
Before I skipped null checks when the master is in the STOP state; this
fixes that.

Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Acked-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Fixes: f327c686d3 ("i2c: aspeed: added driver for Aspeed I2C")
2017-08-14 22:08:13 +02:00
Anton Vasilyev
42543aeb48 i2c: simtec: use release_mem_region instead of release_resource
Use api pair of request_mem_region and release_mem_region
instead of release_resource.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-08-14 21:39:21 +02:00
Javier Martinez Canillas
f4b17a14fa i2c: core: Make comment about I2C table requirement to reflect the code
I2C drivers were required to have an I2C device ID table even if were for
devices that would only be registered using a specific firmware interface
(e.g: OF or ACPI).

But commit da10c06a04 ("i2c: Make I2C ID tables non-mandatory for DT'ed
devices") changed the I2C core to relax the requirement and allow drivers
to avoid defining this table.

Unfortunately it only took into account drivers for OF-only devices and
forgot about ACPI-only ones, and this was fixed by commit c64ffff7a9
("i2c: core: Allow empty id_table in ACPI case as well").

But the latter didn't update the original comment, so it doesn't reflect
what the code does now.

Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-08-14 21:33:46 +02:00
Jarkko Nikula
4e2d93de07 i2c: designware: Fix standard mode speed when configuring the slave mode
Code sets bit DW_IC_CON_SPEED_FAST (0x4) always when configuring the slave
mode. This results incorrect register value DW_IC_CON_SPEED_HIGH (0x6)
when OR'ed together with DW_IC_CON_SPEED_STD (0x2).

Remove this and let the code set the speed mode bits according to clock
frequency or default to fast mode.

Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-08-14 21:13:29 +02:00
Jarkko Nikula
984277a041 i2c: designware: Fix oops from i2c_dw_irq_handler_slave
When i2c-designware is initialized in slave mode the
i2c-designware-slave.c: i2c_dw_irq_handler_slave() can hit a NULL
pointer dereference when I2C slave backend is not registered but code is
accessing the struct dw_i2c_dev.slave without testing is it NULL.

We might get spurious interrupts from other devices or from IRQ core
during unloading the driver when CONFIG_DEBUG_SHIRQ is set. Existing
check for enable and IRQ status is not enough since device can be power
gated and those bits may read 1.

Fix this by handling the interrupt only when also struct dw_i2c_dev.slave
is set.

Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-08-14 21:11:19 +02:00
Ulf Hansson
a23318feef i2c: designware: Fix system suspend
The commit 8503ff1665 ("i2c: designware: Avoid unnecessary resuming
during system suspend"), may suggest to the PM core to try out the so
called direct_complete path for system sleep. In this path, the PM core
treats a runtime suspended device as it's already in a proper low power
state for system sleep, which makes it skip calling the system sleep
callbacks for the device, except for the ->prepare() and the ->complete()
callbacks.

However, the PM core may unset the direct_complete flag for a parent
device, in case its child device are being system suspended before. In this
scenario, the PM core invokes the system sleep callbacks, no matter if the
device is runtime suspended or not.

Particularly in cases of an existing i2c slave device, the above path is
triggered, which breaks the assumption that the i2c device is always
runtime resumed whenever the dw_i2c_plat_suspend() is being called.

More precisely, dw_i2c_plat_suspend() calls clk_core_disable() and
clk_core_unprepare(), for an already disabled/unprepared clock, leading to
a splat in the log about clocks calls being wrongly balanced and breaking
system sleep.

To still allow the direct_complete path in cases when it's possible, but
also to keep the fix simple, let's runtime resume the i2c device in the
->suspend() callback, before continuing to put the device into low power
state.

Note, in cases when the i2c device is attached to the ACPI PM domain, this
problem doesn't occur, because ACPI's ->suspend() callback, assigned to
acpi_subsys_suspend(), already calls pm_runtime_resume() for the device.

It should also be noted that this change does not fix commit 8503ff1665
("i2c: designware: Avoid unnecessary resuming during system suspend").
Because for the non-ACPI case, the system sleep support was already broken
prior that point.

Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: John Stultz <john.stultz@linaro.org>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-08-14 21:02:42 +02:00
Linus Torvalds
6b9d1c24e0 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "Fix an error path bug in ixp4xx as well as a read overrun in
 sha1-avx2"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: x86/sha1 - Fix reads beyond the number of blocks passed
  crypto: ixp4xx - Fix error handling path in 'aead_perform()'
2017-08-14 11:35:56 -07:00
Jon Paul Maloy
59a361bc6f tipc: avoid inheriting msg_non_seq flag when message is returned
In the function msg_reverse(), we reverse the header while trying to
reuse the original buffer whenever possible. Those rejected/returned
messages are always transmitted as unicast, but the msg_non_seq field
is not explicitly set to zero as it should be.

We have seen cases where multicast senders set the message type to
"NOT dest_droppable", meaning that a multicast message shorter than
one MTU will be returned, e.g., during receive buffer overflow, by
reusing the original buffer. This has the effect that even the
'msg_non_seq' field is inadvertently inherited by the rejected message,
although it is now sent as a unicast message. This again leads the
receiving unicast link endpoint to steer the packet toward the broadcast
link receive function, where it is dropped. The affected unicast link is
thereafter (after 100 failed retransmissions) declared 'stale' and
reset.

We fix this by unconditionally setting the 'msg_non_seq' flag to zero
for all rejected/returned messages.

Reported-by: Canh Duc Luu <canh.d.luu@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 11:20:36 -07:00
Jon Paul Maloy
fed5f5718c tipc: accept PACKET_MULTICAST packets
On L2 bearers, the TIPC broadcast function is sending out packets using
the corresponding L2 broadcast address. At reception, we filter such
packets under the assumption that they will also be delivered as
broadcast packets.

This assumption doesn't always hold true. Under high load, we have seen
that a switch may convert the destination address and deliver the packet
as a PACKET_MULTICAST, something leading to inadvertently dropped
packets and a stale and reset broadcast link.

We fix this by extending the reception filtering to accept packets of
type PACKET_MULTICAST.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 11:19:25 -07:00
Florian Westphal
2c87d63ac8 ipv4: route: fix inet_rtm_getroute induced crash
"ip route get $daddr iif eth0 from $saddr" causes:
 BUG: KASAN: use-after-free in ip_route_input_rcu+0x1535/0x1b50
 Call Trace:
  ip_route_input_rcu+0x1535/0x1b50
  ip_route_input_noref+0xf9/0x190
  tcp_v4_early_demux+0x1a4/0x2b0
  ip_rcv+0xbcb/0xc05
  __netif_receive_skb+0x9c/0xd0
  netif_receive_skb_internal+0x5a8/0x890

Problem is that inet_rtm_getroute calls either ip_route_input_rcu (if an
iif was provided) or ip_route_output_key_hash_rcu.

But ip_route_input_rcu, unlike ip_route_output_key_hash_rcu, already
associates the dst_entry with the skb.  This clears the SKB_DST_NOREF
bit (i.e. skb_dst_drop will release/free the entry while it should not).

Thus only set the dst if we called ip_route_output_key_hash_rcu().

I tested this patch by running:
 while true;do ip r get 10.0.1.2;done > /dev/null &
 while true;do ip r get 10.0.1.2 iif eth0  from 10.0.1.1;done > /dev/null &
... and saw no crash or memory leak.

Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: David Ahern <dsahern@gmail.com>
Fixes: ba52d61e0f ("ipv4: route: restore skb_dst_set in inet_rtm_getroute")
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-14 11:08:29 -07:00
Sean Paul
1724c7c0c9 Merge origin/master into drm-misc-fixes
Backmerge 4.13-rc5 into drm-misc-fixes, it was getting a
little stale.

Signed-off-by: Sean Paul <seanpaul@chromium.org>
2017-08-14 13:14:36 -04:00
Daniel Vetter
781cc76e0c drm/i915: Avoid the gpu reset vs. modeset deadlock
... using the biggest hammer we have. This is essentially a weaponized
version of the timeout-based wedging Chris added in

commit 36703e79a9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Jun 22 11:56:25 2017 +0100

    drm/i915: Break modeset deadlocks on reset

Because defense-in-depth is good it's good to still have both. Also
note that with the locking change we can now restrict this a lot (old
gpus and special testing only), so this doesn't kill the TDR benefits
on at least anything remotely modern.

And futuremore with a few tricks it should be possible to make a much
more educated guess about whether an atomic commit is stuck waiting on
the gpu (atomic_t counting the pending i915_sw_fence used by the
atomic modeset code should do it), so we can improve this.

But for now just start with something that is guaranteed to recover
faster, for much better CI througput.

This defacto reverts TDR on these platforms, but there's not really a
single commit to specify as the sole offender.

v2: Add a debug message to explain what's going on. We can't DRM_ERROR
because that spams CI. And the timeout based fallback still prints a
DRM_ERROR, in case something goes wrong.

v3: Fix comment layout (Michel)

Fixes: 4680816be3 ("drm/i915: Wait first for submission, before waiting for request completion")
Fixes: 221fe79945 ("drm/i915: Perform a direct reset of the GPU from the waiter")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v2)
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v2)
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170808080828.23650-1-daniel.vetter@ffwll.ch
(cherry picked from commit 97154ec242)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-14 19:28:46 +03:00
Chris Wilson
430ffaf46c drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
When switching between contexts using the aliasing_ppgtt, the VM is
shared. We don't need to reload the PD registers unless they are dirty.

Martin Peres reported an issue that looks like corruption between
Haswell context switches, bisecting to commit f9326be5f1 ("drm/i915:
Rearrange switch_context to load the aliasing ppgtt on first use").
Switching between the same mm (the aliasing_ppgtt is used for all
contexts in this case) should be a nop, but appears to trigger some
side-effects in the context switch. However, as we know the switch
is redundant in this case, we can skip it and continue to ignore the
issue until somebody feels strong enough to investigate full-ppgtt on
gen7 again!

Except.. Martin was using full-ppgtt which is not supported as it
doesn't work correctly yet. So whilst the bisect did yield valuable
information about the failures, the fix should not have any user impact
under default settings, with the exception of a slightly lower
throughput on xcs as the VM would always be reloaded.

v2: Also remember to set the legacy_active_context following the switch
on xcs (commit e8a9c58fcd ("drm/i915: Unify active context tracking
between legacy/execlists/guc"))

Fixes: f9326be5f1 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use")
Fixes: e8a9c58fcd ("drm/i915: Unify active context tracking between legacy/execlists/guc")
Reported-by: Martin Peres <martin.peres@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170812152724.6883-1-chris@chris-wilson.co.uk
(cherry picked from commit 12124bea5b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-14 19:28:37 +03:00
Matthias Kaehlcke
7eceb9d049 drm/i915: Return correct EDP voltage swing table for 0.85V
For 0.85V cnl_get_buf_trans_edp() returns the DP table, instead of EDP.
Use the correct table.

The error was pointed out by this clang warning:

drivers/gpu/drm/i915/intel_ddi.c:392:39: warning: variable
  'cnl_ddi_translations_edp_0_85V' is not needed and will not be emitted
  [-Wunneeded-internal-declaration]
    static const struct cnl_ddi_buf_trans cnl_ddi_translations_edp_0_85V[] = {

Fixes: cf54ca8bc5 ("drm/i915/cnl: Implement voltage swing sequence.")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170717195854.192139-1-mka@chromium.org
(cherry picked from commit 50946c8985)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-14 19:28:28 +03:00
Rodrigo Vivi
1dd7a3e7af drm/i915/cnl: Add slice and subslice information to debugfs.
A missing part to EU slice power gating is the
debugfs interface. This patch actually should have been
squashed to the initial EU slice power gating one.

v2: Initial patch was merged without this part.

Fixes: c7ae7e9ab2 ("drm/i915/cnl: Configure EU slice power gating.")
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170809200702.11236-1-rodrigo.vivi@intel.com
(cherry picked from commit 7ea1adf30f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-14 19:28:22 +03:00
Chris Wilson
a0125a932e drm/i915: Perform an invalidate prior to executing golden renderstate
As we may have just bound the renderstate into the GGTT for execution, we
need to ensure that the GTT TLB are also flushed.

On snb-gt2, this would cause a random GPU hang at the start of a new
context (e.g. boot) and on snb-gt1, it was causing the renderstate batch
to take ~10s. It was the GPU hang that revealed the truth, as the CS
gleefully executed beyond the end of the golden renderstate batch, a good
indicator for a GTT TLB miss.

Fixes: 20fe17aa52 ("drm/i915: Remove redundant TLB invalidate on switching contexts")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20170808131904.1385-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.12-rc1+
(cherry picked from commit 802673d66f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-14 19:28:06 +03:00
Lionel Landwerlin
26a72e8a8d drm/i915: remove unused function declaration
This function is not part of the driver anymore.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 90f4fcd56b ("drm/i915: Remove forced stop ring on suspend/unload")
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20170804140348.24971-1-lionel.g.landwerlin@intel.com
(cherry picked from commit fe29133df3)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-08-14 19:27:59 +03:00
Maarten Lankhorst
7f5d6dac54 drm/atomic: Handle -EDEADLK with out-fences correctly
complete_crtc_signaling is freeing fence_state, but when retrying
num_fences and fence_state are not zero'd. This caused duplicate
fd's in the fence_state array, followed by a BUG_ON in fs/file.c
because we reallocate freed memory, and installing over an existing
fd, or potential other fun.

Zero fence_state and num_fences correctly in the retry loop, which
allows kms_atomic_transition to pass.

Fixes: beaf5af480 ("drm/fence: add out-fences support")
Cc: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Cc: Brian Starkey <brian.starkey@arm.com> (v10)
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.10+
Testcase: kms_atomic_transitions.plane-all-modeset-transition-fencing
(with CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y)
Link: https://patchwork.freedesktop.org/patch/msgid/20170814100721.13340-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> #intel-gfx on irc
2017-08-14 15:47:57 +02:00
Nikhil Mahale
491ab4700d drm: Fix framebuffer leak
Do not leak framebuffer if client provided crtc id found invalid.

Signed-off-by: Nikhil Mahale <nmahale@nvidia.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1502250781-5779-1-git-send-email-nmahale@nvidia.com
2017-08-14 15:30:33 +02:00
Gregory CLEMENT
3f13b6a24f gpio: mvebu: Fix cause computation in irq handler
When switching to regmap, the way to compute the irq cause was
reorganized. However while doing it, a typo was introduced: a 'xor'
replaced a 'and'.

This lead to wrong behavior in the interrupt handler ans one of the
symptom was wrong irq handler called on the Armada 388 GP:
"->handle_irq():  c016303c,
handle_bad_irq+0x0/0x278
->irq_data.chip(): c0b0ec0c,
0xc0b0ec0c
->action():   (null)
   IRQ_NOPROBE set
 IRQ_NOREQUEST set
unexpected IRQ trap at vector 00
irq 0, desc: ee804800, depth: 1, count: 0, unhandled: 0"

Fixes: 2233bf7a92 ("gpio: mvebu: switch to regmap for register access")
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-08-14 15:00:43 +02:00
Masami Hiramatsu
90b05b0598 gpio: reject invalid gpio before getting gpio_desc
Check user-given gpio number and reject it before
calling gpio_to_desc() because gpio_to_desc() is
for kernel driver and it expects given gpio number
is valid (means 0 to 511).
If given number is invalid, gpio_to_desc() calls
WARN() and dump registers and stack for debug.
This means user can easily kick WARN() just by
writing invalid gpio number (e.g. 512) to
/sys/class/gpio/export.

Fixes: 0e9a5edf5d ("gpio: fix deferred probe detection for legacy API")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-08-14 15:00:43 +02:00
Tahsin Erdogan
32aaf19420 ext4: add missing xattr hash update
When updating an extended attribute, if the padded value sizes are the
same, a shortcut is taken to avoid the bulk of the work. This was fine
until the xattr hash update was moved inside ext4_xattr_set_entry().
With that change, the hash update got missed in the shortcut case.

Thanks to ZhangYi (yizhang089@gmail.com) for root causing the problem.

Fixes: daf8328172 ("ext4: eliminate xattr entry e_hash recalculation for removes")

Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-14 08:30:06 -04:00
Theodore Ts'o
b80b32b6d5 ext4: fix clang build regression
Arnd Bergmann <arnd@arndb.de>

As Stefan pointed out, I misremembered what clang can do specifically,
and it turns out that the variable-length array at the end of the
structure did not work (a flexible array would have worked here
but not solved the problem):

fs/ext4/mballoc.c:2303:17: error: fields must have a constant size:
'variable length array in structure' extension will never be supported
                ext4_grpblk_t counters[blocksize_bits + 2];

This reverts part of my previous patch, using a fixed-size array
again, but keeping the check for the array overflow.

Fixes: 2df2c3402f ("ext4: fix warning about stack corruption")
Reported-by: Stefan Agner <stefan@agner.ch>
Tested-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-14 08:29:18 -04:00
Shih-Yuan Lee (FourDollars)
8df4b00310 ALSA: hda/realtek - Fix pincfg for Dell XPS 13 9370
The initial pin configs for Dell headset mode of ALC3271 has changed.

/sys/class/sound/hwC0D0/init_pin_configs: (BIOS 0.1.4)
0x12 0xb7a60130
0x13 0xb8a61140
0x14 0x40000000
0x16 0x411111f0
0x17 0x90170110
0x18 0x411111f0
0x19 0x411111f0
0x1a 0x411111f0
0x1b 0x411111f0
0x1d 0x4087992d
0x1e 0x411111f0
0x21 0x04211020

has changed to ...

/sys/class/sound/hwC0D0/init_pin_configs: (BIOS 0.2.0)
0x12 0xb7a60130
0x13 0x40000000
0x14 0x411111f0
0x16 0x411111f0
0x17 0x90170110
0x18 0x411111f0
0x19 0x411111f0
0x1a 0x411111f0
0x1b 0x411111f0
0x1d 0x4067992d
0x1e 0x411111f0
0x21 0x04211020

Fixes: b4576de872 ("ALSA: hda/realtek - Fix typo of pincfg for Dell quirk")
Signed-off-by: Shih-Yuan Lee (FourDollars) <sylee@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-14 12:12:42 +02:00
Arend Van Spriel
e9bf53ab1e brcmfmac: feature check for multi-scheduled scan fails on bcm4343x devices
The firmware feature check introduced for multi-scheduled scan turned out
to be failing for bcm4343{0,1,8} devices resulting in a firmware crash.
The reason for this crash has not yet been root cause so this patch avoids
the feature check for those device as a short-term fix.

Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
Reported-by: Ian Molton <ian@mnementh.co.uk>
Fixes: 9fe929aaac ("brcmfmac: add firmware feature detection for gscan feature")
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-08-14 11:09:30 +03:00
Thomas Gleixner
9c9947f893 Merge tag 'irqchip-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
Pull irqchip fixes for 4.13 from Marc Zyngier

Mostly GIC related, again:
- GICv3 ITS NUMA handling fixes
- GICv3 force affinity handling
- Barrier adjustment in both GIC interrupt handling
- Error reporting when the DT presents an incompatible interrupt
- GICv3 platform MSI DT parsing bug fix
- Broadcom L2 PM fix
- Atmel AIC cleanups
2017-08-14 09:34:10 +02:00
Icenowy Zheng
d86e63e1f0 arm64: allwinner: h5: fix pinctrl IRQs
The pin controller of H5 has three IRQs at the chip's GIC, which
represents three banks of pinctrl IRQs. However, the device tree used to
miss the third IRQ of the pin controller, which makes the PG bank IRQ
not usable.

Add the missing IRQ to the pinctrl node.

Fixes: 4e36de179f ("arm64: allwinner: h5: add Allwinner H5 .dtsi")
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2017-08-14 14:18:21 +08:00
Andreas Born
11e9d7829d bonding: ratelimit failed speed/duplex update warning
bond_miimon_commit() handles the UP transition for each slave of a bond
in the case of MII. It is triggered 10 times per second for the default
MII Polling interval of 100ms. For device drivers that do not implement
__ethtool_get_link_ksettings() the call to bond_update_speed_duplex()
fails persistently while the MII status could remain UP. That is, in
this and other cases where the speed/duplex update keeps failing over a
longer period of time while the MII state is UP, a warning is printed
every MII polling interval.

To address these excessive warnings net_ratelimit() should be used.
Printing a warning once would not be sufficient since the call to
bond_update_speed_duplex() could recover to succeed and fail again
later. In that case there would be no new indication what went wrong.

Fixes: b5bf0f5b16 (bonding: correctly update link status during mii-commit phase)
Signed-off-by: Andreas Born <futur.andy@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-13 20:01:38 -07:00
Ingo Molnar
b60bf53abc Merge branch 'clockevents/4.13-fixes' of http://git.linaro.org/people/daniel.lezcano/linux into timers/urgent
Pull clockevents fixes from Daniel Lezcano:

" - Fix error check against IS_ERR() instead of NULL for the timer-of code (Dan Carpenter)
  - Fix infinite recusion with ftrace for the ARM architected timer (Ding Tianhong)
  - Fix the error code return in the em_sti's probe function (Gustavo A. R.  Silva)
  - Fix Kconfig dependency for the pistachio driver (Matt Redfearn)
  - Fix mem frame loop initialization for the ARM architected timer (Matthias Kaehlcke)"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-13 11:39:45 +02:00
Shaohua Li
afc1f55ca4 MD: not clear ->safemode for external metadata array
->safemode should be triggered by mdadm for external metadaa array, otherwise
array's state confuses mdadm.

Fixes: 33182d15c6bf(md: always clear ->safemode when md_check_recovery gets the mddev lock.)
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-11 20:42:06 -07:00
Christoph Hellwig
e28ae8e428 iomap: fix integer truncation issues in the zeroing and dirtying helpers
Fix the min_t calls in the zeroing and dirtying helpers to perform the
comparisms on 64-bit types, which prevents them from incorrectly
being truncated, and larger zeroing operations being stuck in a never
ending loop.

Special thanks to Markus Stockhausen for spotting the bug.

Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-08-11 16:56:33 -07:00
Omar Sandoval
c44245b3d5 xfs: fix inobt inode allocation search optimization
When we try to allocate a free inode by searching the inobt, we try to
find the inode nearest the parent inode by searching chunks both left
and right of the chunk containing the parent. As an optimization, we
cache the leftmost and rightmost records that we previously searched; if
we do another allocation with the same parent inode, we'll pick up the
search where it last left off.

There's a bug in the case where we found a free inode to the left of the
parent's chunk: we need to update the cached left and right records, but
because we already reassigned the right record to point to the left, we
end up assigning the left record to both the cached left and right
records.

This isn't a correctness problem strictly, but it can result in the next
allocation rechecking chunks unnecessarily or allocating inodes further
away from the parent than it needs to. Fix it by swapping the record
pointer after we update the cached left and right records.

Fixes: bd16956599 ("xfs: speed up free inode search")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-08-11 16:56:33 -07:00
Eric Dumazet
fd851ba9ca udp: harden copy_linear_skb()
syzkaller got crashes with CONFIG_HARDENED_USERCOPY=y configs.

Issue here is that recvfrom() can be used with user buffer of Z bytes,
and SO_PEEK_OFF of X bytes, from a skb with Y bytes, and following
condition :

Z < X < Y

kernel BUG at mm/usercopy.c:72!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 2917 Comm: syzkaller842281 Not tainted 4.13.0-rc3+ #16
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
task: ffff8801d2fa40c0 task.stack: ffff8801d1fe8000
RIP: 0010:report_usercopy mm/usercopy.c:64 [inline]
RIP: 0010:__check_object_size+0x3ad/0x500 mm/usercopy.c:264
RSP: 0018:ffff8801d1fef8a8 EFLAGS: 00010286
RAX: 0000000000000078 RBX: ffffffff847102c0 RCX: 0000000000000000
RDX: 0000000000000078 RSI: 1ffff1003a3fded5 RDI: ffffed003a3fdf09
RBP: ffff8801d1fef998 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801d1ea480e
R13: fffffffffffffffa R14: ffffffff84710280 R15: dffffc0000000000
FS:  0000000001360880(0000) GS:ffff8801dc000000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000202ecfe4 CR3: 00000001d1ff8000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 check_object_size include/linux/thread_info.h:108 [inline]
 check_copy_size include/linux/thread_info.h:139 [inline]
 copy_to_iter include/linux/uio.h:105 [inline]
 copy_linear_skb include/net/udp.h:371 [inline]
 udpv6_recvmsg+0x1040/0x1af0 net/ipv6/udp.c:395
 inet_recvmsg+0x14c/0x5f0 net/ipv4/af_inet.c:793
 sock_recvmsg_nosec net/socket.c:792 [inline]
 sock_recvmsg+0xc9/0x110 net/socket.c:799
 SYSC_recvfrom+0x2d6/0x570 net/socket.c:1788
 SyS_recvfrom+0x40/0x50 net/socket.c:1760
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Fixes: b65ac44674 ("udp: try to avoid 2 cache miss on dequeue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11 15:00:45 -07:00
David S. Miller
6401f37c6e Merge branch 'bpf-Minor-fix-in-bpf_convert_ctx_access'
Daniel Borkmann says:

====================
bpf: Minor fix in bpf_convert_ctx_access

First one was found while trying to compile the kernel
with !CONFIG_NET_RX_BUSY_POLL.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11 14:59:24 -07:00
Daniel Borkmann
2ed46ce45e bpf: fix two missing target_size settings in bpf_convert_ctx_access
When CONFIG_NET_SCHED or CONFIG_NET_RX_BUSY_POLL is /not/ set and
we try a narrow __sk_buff load of tc_index or napi_id, respectively,
then verifier rightfully complains that it's misconfigured, because
we need to set target_size in each of the two cases. The rewrite
for the ctx access is just a dummy op, but needs to pass, so fix
this up.

Fixes: f96da09473 ("bpf: simplify narrower ctx access")
Reported-by: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11 14:59:24 -07:00
Daniel Borkmann
e4dde41273 net: fix compilation when busy poll is not enabled
MIN_NAPI_ID is used in various places outside of
CONFIG_NET_RX_BUSY_POLL wrapping, so when it's not set
we run into build errors such as:

  net/core/dev.c: In function 'dev_get_by_napi_id':
  net/core/dev.c:886:16: error: ‘MIN_NAPI_ID’ undeclared (first use in this function)
    if (napi_id < MIN_NAPI_ID)
                  ^~~~~~~~~~~

Thus, have MIN_NAPI_ID always defined to fix these errors.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11 14:59:24 -07:00
Anton Vasilyev
54a6a043fb mISDN: Fix null pointer dereference at mISDN_FsmNew
If mISDN_FsmNew() fails to allocate memory for jumpmatrix
then null pointer dereference will occur on any write to
jumpmatrix.

The patch adds check on successful allocation and
corresponding error handling.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11 14:56:23 -07:00
Simon Horman
bb3afda4fc nfp: do not update MTU from BH in flower app
The Flower app may receive a request to update the MTU of a representor
netdev upon receipt of a control message from the firmware. This requires
the RTNL lock which needs to be taken outside of the packet processing
path.

As a handling of this correctly seems a little to invasive for a fix simply
skip setting the MTU for now.

Relevant backtrace:
 [ 1496.288489] BUG: scheduling while atomic: kworker/0:3/373/0x00000100
 [ 1496.294911]  dca syscopyarea sysfillrect sysimgblt fb_sys_fops ptp drm mxm_wmi ahci pps_core libahci i2c_algo_bit wmi [last unloaded: nfp]
 [ 1496.294918] CPU: 0 PID: 373 Comm: kworker/0:3 Tainted: G           OE   4.13.0-rc3+ #3
 [ 1496.294919] Hardware name: Supermicro X10DRi/X10DRi, BIOS 2.0 12/28/2015
 [ 1496.294923] Workqueue: events work_for_cpu_fn
 [ 1496.294924] Call Trace:
 [ 1496.294927]  <IRQ>
 [ 1496.294931]  dump_stack+0x63/0x82
 [ 1496.294935]  __schedule_bug+0x54/0x70
 [ 1496.294937]  __schedule+0x62f/0x890
 [ 1496.294941]  ? intel_unmap_sg+0x90/0x90
 [ 1496.294942]  schedule+0x36/0x80
 [ 1496.294943]  schedule_preempt_disabled+0xe/0x10
 [ 1496.294945]  __mutex_lock.isra.2+0x445/0x4a0
 [ 1496.294947]  ? device_is_rmrr_locked+0x12/0x50
 [ 1496.294950]  ? kfree+0x162/0x170
 [ 1496.294952]  ? device_is_rmrr_locked+0x12/0x50
 [ 1496.294953]  ? iommu_should_identity_map+0x50/0xe0
 [ 1496.294954]  __mutex_lock_slowpath+0x13/0x20
 [ 1496.294955]  ? iommu_no_mapping+0x48/0xd0
 [ 1496.294956]  ? __mutex_lock_slowpath+0x13/0x20
 [ 1496.294957]  mutex_lock+0x2f/0x40
 [ 1496.294960]  rtnl_lock+0x15/0x20
 [ 1496.294979]  nfp_flower_cmsg_rx+0xc8/0x150 [nfp]
 [ 1496.294986]  nfp_ctrl_poll+0x286/0x350 [nfp]
 [ 1496.294989]  tasklet_action+0xf6/0x110
 [ 1496.294992]  __do_softirq+0xed/0x278
 [ 1496.294993]  irq_exit+0xb6/0xc0
 [ 1496.294994]  do_IRQ+0x4f/0xd0
 [ 1496.294996]  common_interrupt+0x89/0x89

Fixes: 948faa46c0 ("nfp: add support for control messages for flower app")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11 14:50:09 -07:00
Romain Perier
fbca164776 net: stmmac: Use the right logging function in stmmac_mdio_register
Currently, the function stmmac_mdio_register() is only used by
stmmac_dvr_probe() from stmmac_main.c, in order to register the MDIO bus
and probe information about the PHY. As this function is called before
calling register_netdev(), all messages logged from stmmac_mdio_register
are prefixed by "(unnamed net_device)". The goal of netdev_info or
netdev_err is to dump useful infos about a net_device, when this data
structure is partially initialized, there is no point for using these
functions.

This commit fixes the issue by replacing all netdev_*() by the
corresponding dev_*() function for logging. The last netdev_info is
replaced by phy_attached_info(), as a valid phydev can be used at this
point.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11 14:38:55 -07:00
Konstantin Khlebnikov
8d55373875 net/sched/hfsc: allocate tcf block for hfsc root class
Without this filters cannot be attached.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Fixes: 6529eaba33 ("net: sched: introduce tcf block infractructure")
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11 14:30:30 -07:00
Andreas Born
ad729bc9ac bonding: require speed/duplex only for 802.3ad, alb and tlb
The patch c4adfc822b ("bonding: make speed, duplex setting consistent
with link state") puts the link state to down if
bond_update_speed_duplex() cannot retrieve speed and duplex settings.
Assumably the patch was written with 802.3ad mode in mind which relies
on link speed/duplex settings. For other modes like active-backup these
settings are not required. Thus, only for these other modes, this patch
reintroduces support for slaves that do not support reporting speed or
duplex such as wireless devices. This fixes the regression reported in
bug 196547 (https://bugzilla.kernel.org/show_bug.cgi?id=196547).

Fixes: c4adfc822b ("bonding: make speed, duplex setting consistent
with link state")
Signed-off-by: Andreas Born <futur.andy@googlemail.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11 14:21:42 -07:00
Vivien Didelot
e71cb9e009 net: dsa: ksz: fix skb freeing
The DSA layer frees the original skb when an xmit function returns NULL,
meaning an error occurred. But if the tagging code copied the original
skb, it is responsible of freeing the copy if an error occurs.

The ksz tagging code currently has two issues: if skb_put_padto fails,
the skb copy is not freed, and the original skb will be freed twice.

To fix that, move skb_put_padto inside both branches of the skb_tailroom
condition, before freeing the original skb, and free the copy on error.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11 13:57:08 -07:00
Shuah Khan
622b2fbe62 selftests: timers: freq-step: fix compile error
Fix compile error due to ksft_exit_skip() update to take var_args.

freq-step.c: In function ‘init_test’:
freq-step.c:234:3: error: too few arguments to function ‘ksft_exit_skip’
   ksft_exit_skip();
   ^~~~~~~~~~~~~~
In file included from freq-step.c:26:0:
../kselftest.h:167:19: note: declared here
 static inline int ksft_exit_skip(const char *msg, ...)
                   ^~~~~~~~~~~~~~
<builtin>: recipe for target 'freq-step' failed

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-08-11 09:28:37 -06:00
Ding Tianhong
adb4f11e0a clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is enabled
On platforms with an arch timer erratum workaround, it's possible for
arch_timer_reg_read_stable() to recurse into itself when certain
tracing options are enabled, leading to stack overflows and related
problems.

For example, when PREEMPT_TRACER and FUNCTION_GRAPH_TRACER are
selected, it's possible to trigger this with:

$ mount -t debugfs nodev /sys/kernel/debug/
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer

The problem is that in such cases, preempt_disable() instrumentation
attempts to acquire a timestamp via trace_clock(), resulting in a call
back to arch_timer_reg_read_stable(), and hence recursion.

This patch changes arch_timer_reg_read_stable() to use
preempt_{disable,enable}_notrace(), which avoids this.

This problem is similar to the fixed by upstream commit 96b3d28bf4
("sched/clock: Prevent tracing recursion in sched_clock_cpu()").

Fixes: 6acc71ccac ("arm64: arch_timer: Allows a CPU-specific erratum to only affect a subset of CPUs")
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-11 16:01:43 +02:00
Colin Ian King
b45e4c45b1 x86: Mark various structures and functions as 'static'
Mark a couple of structures and functions as 'static', pointed out by Sparse:

  warning: symbol 'bts_pmu' was not declared. Should it be static?
  warning: symbol 'p4_event_aliases' was not declared. Should it be static?
  warning: symbol 'rapl_attr_groups' was not declared. Should it be static?
  symbol 'process_uv2_message' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Andrew Banman <abanman@hpe.com> # for the UV change
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170810155709.7094-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-11 14:49:43 +02:00
Borislav Petkov
5442c26995 x86/cpufeature, kvm/svm: Rename (shorten) the new "virtualized VMSAVE/VMLOAD" CPUID flag
"virtual_vmload_vmsave" is what is going to land in /proc/cpuinfo now
as per v4.13-rc4, for a single feature bit which is clearly too long.

So rename it to what it is called in the processor manual.
"v_vmsave_vmload" is a bit shorter, after all.

We could go more aggressively here but having it the same as in the
processor manual is advantageous.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Radim Krčmář <rkrcmar@redhat.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Jörg Rödel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm-ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170801185552.GA3743@nazgul.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-11 13:42:28 +02:00
Matt Redfearn
599dc457c7 clocksource/drivers/Kconfig: Fix CLKSRC_PISTACHIO dependencies
In v4.13, CLKSRC_PISTACHIO can select TIMER_OF on architectures without
GENERIC_CLOCKEVENTS, resulting in a struct clock_event_device missing
some required features and build breakage compiling timer_of.c. One of
the symbols selecting TIMER_OF is CLKSRC_PISTACHIO, so add the
dependency on GENERIC_CLOCKEVENTS.

Thanks to kbuild test robot for finding this error
(https://lkml.org/lkml/2017/7/16/249)

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Suggested-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-11 12:53:29 +02:00
Dan Carpenter
9e80dbd872 clocksource/drivers/timer-of: Checking for IS_ERR() instead of NULL
The current code checks the return value of the of_io_request_and_map()
function as it was returning a NULL pointer in case of error.

However, it returns an error code encoded in the pointer return value, not a
NULL value. Fix this by checking the returned pointer against IS_ERR() and
return the error with PTR_ERR().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-11 12:53:21 +02:00
Philipp Zabel
5be5dd38d4 drm/imx: ipuv3-plane: fix YUV framebuffer scanout on the base plane
Historically, only RGB framebuffers could be assigned to the primary
plane. This changed with universal plane support. Since no colorspace
conversion was set up for the IPUv3 full plane, assigning YUV frame
buffers to the primary plane caused incorrect output.
Fix this by enabling color space conversion also for the primary plane.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-08-11 10:31:13 +02:00
Arnd Bergmann
2406b296a3 gpu: ipu-v3: add DRM dependency
The new PRE/PRG driver code causes a link failure when IPUv3 is built-in,
but DRM is built as a module:

drivers/gpu/ipu-v3/ipu-pre.o: In function `ipu_pre_configure':
ipu-pre.c:(.text.ipu_pre_configure+0x18): undefined reference to `drm_format_info'
drivers/gpu/ipu-v3/ipu-prg.o: In function `ipu_prg_format_supported':
ipu-prg.c:(.text.ipu_prg_format_supported+0x8): undefined reference to `drm_format_info'

Adding a Kconfig dependency on DRM means we don't run into this problem
any more. If DRM is disabled altogether, the IPUv3 driver is built
without PRE/PRG support.

Fixes: ea9c260514 ("gpu: ipu-v3: add driver for Prefetch Resolve Gasket")
Link: https://patchwork.kernel.org/patch/9636665/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[p.zabel@pengutronix.de: changed the dependency from DRM to DRM || !DRM,
 since the link failure only happens when DRM=m and IPUV3_CORE=y.
 Modified the commit message to reflect this.]
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-08-11 10:31:13 +02:00
Takashi Iwai
4d3a869333 ALSA: seq: Fix CONFIG_SND_SEQ_MIDI dependency
The commit 0181307abc ("ALSA: seq: Reorganize kconfig and build")
rewrote the dependency of each sequencer module in a standard way, but
there was one change applied mistakenly: CONFIG_SND_SEQ_MIDI isn't
enabled properly by CONFIG_SND_RAWMIDI.  I seem to have changed the
wrong one instead, CONFIG_SND_SEQ_MIDI_EMUL, which is eventually
reverse-selected by CONFIG_SND_SEQ_MIDI itself.  This ended up the
lack of snd-seq-midi module as reported below.

The fix is to put def_tristate properly to CONFIG_SND_SEQ_MIDI instead
of *_MIDI_EMUL entry.

Fixes: 0181307abc ("ALSA: seq: Reorganize kconfig and build")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196633
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-08-11 09:51:41 +02:00
Alexey Brodkin
a8ec3ee861 arc: Mask individual IRQ lines during core INTC init
ARC cores on reset have all interrupt lines of built-in INTC enabled.
Which means once we globally enable interrupts (very early on boot)
faulty hardware blocks may trigger an interrupt that Linux kernel
cannot handle yet as corresponding handler is not yet installed.

In that case system falls in "interrupt storm" and basically never
does anything useful except entering and exiting generic IRQ handling
code.

One real example of that kind of problematic hardware is DW GMAC which
also has interrupts enabled on reset and if Ethernet PHY informs GMAC
about link state, GMAC immediately reports that upstream to ARC core
and here we are.

Now with that change we mask all individual IRQ lines making entire
system more fool-proof.

[This patch was motivated by Adaptrum platform support]

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Eugeniy Paltsev <paltsev@synopsys.com>
Tested-by: Alexandru Gagniuc <alex.g@adaptrum.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-11 06:38:31 +05:30
Doug Smythies
8e2f3bce05 cpufreq: x86: Disable interrupts during MSRs reading
According to Intel 64 and IA-32 Architectures SDM, Volume 3,
Chapter 14.2, "Software needs to exercise care to avoid delays
between the two RDMSRs (for example interrupts)".

So, disable interrupts during reading MSRs IA32_APERF and IA32_MPERF.

See also: commit 4ab60c3f32 (cpufreq: intel_pstate: Disable
interrupts during MSRs reading).

Signed-off-by: Doug Smythies <dsmythies@telus.net>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-11 01:27:41 +02:00
Doug Smythies
c587c79f90 cpufreq: intel_pstate: report correct CPU frequencies during trace
The intel_pstate CPU frequency scaling driver has always
calculated CPU frequency incorrectly.  Recent changes have
eliminted most of the issues, however the frequency reported
in the trace buffer, if used, is incorrect.

It remains desireable that cpu->pstate.scaling still be a nice
round number for things such as when setting max and min frequencies.
So the proposal is to just fix the reported frequency in the trace data.

Fixes what remains of [1].

Link: https://bugzilla.kernel.org/show_bug.cgi?id=96521 # [1]
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-11 01:25:53 +02:00
Zi Yan
9157259d16 mm: add pmd_t initializer __pmd() to work around a GCC bug.
THP migration is added but only supports x86_64 at the moment. For all
other architectures, swp_entry_to_pmd() only returns a zero pmd_t.

Due to a GCC zero initializer bug #53119, the standard (pmd_t){0}
initializer is not accepted by all GCC versions. __pmd() is a feasible
workaround. In addition, sparc32's pmd_t is an array instead of a single
value, so we need (pmd_t){ {0}, } instead of (pmd_t){0}. Thus,
a different __pmd() definition is needed in sparc32.

Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-10 15:03:57 -07:00
Lorenzo Pieralisi
a008873740 irqchip/gic-v3-its-platform-msi: Fix msi-parent parsing loop
While parsing the msi-parent property to chase up the IRQ domain
a given device belongs to, the index into the msi-parent tuple should
be incremented to ensure all properties entries are taken into account.

Current code missed the index update so the parsing loop does not work
in case multiple msi-parent phandles are present and may turn into
an infinite loop in of_pmsi_get_dev_id() if phandle at index 0 does
not correspond to the domain we are actually looking-up.

Fix the code by updating the phandle index at each iteration in
of_pmsi_get_dev_id().

Fixes: deac7fc1c8 ("irqchip/gic-v3-its: Parse new version of msi-parent property")
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-08-10 16:26:54 +01:00
Hanjun Guo
fdf6e7a8c9 irqchip/gic-v3-its: Allow GIC ITS number more than MAX_NUMNODES
When enabling ITS NUMA support on D05, I got the boot log:

[    0.000000] SRAT: PXM 0 -> ITS 0 -> Node 0
[    0.000000] SRAT: PXM 0 -> ITS 1 -> Node 0
[    0.000000] SRAT: PXM 0 -> ITS 2 -> Node 0
[    0.000000] SRAT: PXM 1 -> ITS 3 -> Node 1
[    0.000000] SRAT: ITS affinity exceeding max count[4]

This is wrong on D05 as we have 8 ITSs with 4 NUMA nodes.

So dynamically alloc the memory needed instead of using
its_srat_maps[MAX_NUMNODES], which count the number of
ITS entry(ies) in SRAT and alloc its_srat_maps as needed,
then build the mapping of numa node to ITS ID. Of course,
its_srat_maps will be freed after ITS probing because
we don't need that after boot.

After doing this, I got what I wanted:

[    0.000000] SRAT: PXM 0 -> ITS 0 -> Node 0
[    0.000000] SRAT: PXM 0 -> ITS 1 -> Node 0
[    0.000000] SRAT: PXM 0 -> ITS 2 -> Node 0
[    0.000000] SRAT: PXM 1 -> ITS 3 -> Node 1
[    0.000000] SRAT: PXM 2 -> ITS 4 -> Node 2
[    0.000000] SRAT: PXM 2 -> ITS 5 -> Node 2
[    0.000000] SRAT: PXM 2 -> ITS 6 -> Node 2
[    0.000000] SRAT: PXM 3 -> ITS 7 -> Node 3

Fixes: dbd2b82672 ("irqchip/gic-v3-its: Add ACPI NUMA node mapping")
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: John Garry <john.garry@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-08-10 16:22:50 +01:00
Vitaly Kuznetsov
10e66760fa x86/smpboot: Unbreak CPU0 hotplug
A hang on CPU0 onlining after a preceding offlining is observed. Trace
shows that CPU0 is stuck in check_tsc_sync_target() waiting for source
CPU to run check_tsc_sync_source() but this never happens. Source CPU,
in its turn, is stuck on synchronize_sched() which is called from
native_cpu_up() -> do_boot_cpu() -> unregister_nmi_handler().

So it's a classic ABBA deadlock, due to the use of synchronize_sched() in
unregister_nmi_handler().

Fix the bug by moving unregister_nmi_handler() from do_boot_cpu() to
native_cpu_up() after cpu onlining is done.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170803105818.9934-1-vkuznets@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 17:02:47 +02:00
Gustavo A. R. Silva
5c23a558a6 clocksource/drivers/em_sti: Fix error return codes in em_sti_probe()
Propagate the return values of platform_get_irq and devm_request_irq on
failure.

Cc: Frans Klaver <fransklaver@gmail.com>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-10 14:48:18 +02:00
Matthias Kaehlcke
d197f79887 clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
The loop to find the best memory frame in arch_timer_mem_acpi_init()
initializes the loop counter with itself ('i = i'), which is suspicious
in the first place and pointed out by clang. The loop condition is
'i < timer_count' and a prior for loop exits when 'i' reaches
'timer_count', therefore the second loop is never executed.

Initialize the loop counter with 0 to iterate over all timers, which
supposedly was the intention before the typo monster attacked.

Fixes: c2743a3676 ("clocksource: arm_arch_timer: add GTDT support for memory-mapped timer")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-10 14:48:17 +02:00
Andy Lutomirski
e93c17301a x86/asm/64: Clear AC on NMI entries
This closes a hole in our SMAP implementation.

This patch comes from grsecurity. Good catch!

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/314cc9f294e8f14ed85485727556ad4f15bb1659.1502159503.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 13:13:15 +02:00
Peter Zijlstra
9b231d9f47 perf/core: Fix time on IOC_ENABLE
Vince reported that when we do IOC_ENABLE/IOC_DISABLE while the task
is SIGSTOP'ed state the timestamps go wobbly.

It turns out we indeed fail to correctly account time while in 'OFF'
state and doing IOC_ENABLE without getting scheduled in exposes the
problem.

Further thinking about this problem, it occurred to me that we can
suffer a similar fate when we migrate an uncore event between CPUs.
The perf_event_install() on the 'new' CPU will do add_event_to_ctx()
which will reset all the time stamp, resulting in a subsequent
update_event_times() to overwrite the total_time_* fields with smaller
values.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:01:09 +02:00
Peter Zijlstra
bfe334924c perf/x86: Fix RDPMC vs. mm_struct tracking
Vince reported the following rdpmc() testcase failure:

 > Failing test case:
 >
 >	fd=perf_event_open();
 >	addr=mmap(fd);
 >	exec()  // without closing or unmapping the event
 >	fd=perf_event_open();
 >	addr=mmap(fd);
 >	rdpmc()	// GPFs due to rdpmc being disabled

The problem is of course that exec() plays tricks with what is
current->mm, only destroying the old mappings after having
installed the new mm.

Fix this confusion by passing along vma->vm_mm instead of relying on
current->mm.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 1e0fb9ec67 ("perf: Add pmu callbacks to track event mapping and unmapping")
Link: http://lkml.kernel.org/r/20170802173930.cstykcqefmqt7jau@hirez.programming.kicks-ass.net
[ Minor cleanups. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:01:08 +02:00
Icenowy Zheng
56a9155074 arm64: allwinner: a64: sopine: add missing ethernet0 alias
The EMAC Ethernet controller was enabled, but an accompanying alias
was not added. This results in unstable numbering if other Ethernet
devices, such as a USB dongle, are present. Also, the bootloader uses
the alias to assign a generated stable MAC address to the device node.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Fixes: 96219b0048 ("arm64: allwinner: a64: add device tree for SoPine
		      with baseboard")
[wens@csie.org: Rewrite commit log as fixing a previous patch with Fixes]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2017-08-10 14:02:38 +08:00
Icenowy Zheng
dff751c689 arm64: allwinner: a64: pine64: add missing ethernet0 alias
The EMAC Ethernet controller was enabled, but an accompanying alias
was not added. This results in unstable numbering if other Ethernet
devices, such as a USB dongle, are present. Also, the bootloader uses
the alias to assign a generated stable MAC address to the device node.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Fixes: 9702394374 ("arm64: allwinner: pine64: Enable dwmac-sun8i")
[wens@csie.org: Rewrite commit log as fixing a previous patch with Fixes]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2017-08-10 14:01:38 +08:00
Icenowy Zheng
7dc88d2afb arm64: allwinner: a64: bananapi-m64: add missing ethernet0 alias
The EMAC Ethernet controller was enabled, but an accompanying alias
was not added. This results in unstable numbering if other Ethernet
devices, such as a USB dongle, are present. Also, the bootloader uses
the alias to assign a generated stable MAC address to the device node.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Fixes: e729549990 ("arm64: allwinner: bananapi-m64: Enable dwmac-sun8i")
[wens@csie.org: Rewrite commit log as fixing a previous patch with Fixes]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2017-08-10 14:00:10 +08:00
Kalle Valo
9d6b9b8d1c Merge tag 'iwlwifi-for-kalle-2018-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes
Some more fixes for 4.13

* Fix a memory leak in the SAR code;
* Fix a stuck queue case in AP mode;
* Convert a WARN to a simple debug in a legitimate race case (from
  which we can recover);
* Fix a severe throughput aggregation on 9000-family devices due to
  aggregation issues.
2017-08-09 22:37:23 +03:00
Jeffy Chen
0fa375e6bc drm/rockchip: Fix suspend crash when drm is not bound
Currently we are allocating drm_device in rockchip_drm_bind, so if the
suspend/resume code access it when drm is not bound, we would hit this
crash:

[  253.402836] Unable to handle kernel NULL pointer dereference at virtual address 00000028
[  253.402837] pgd = ffffffc06c9b0000
[  253.402841] [00000028] *pgd=0000000000000000, *pud=0000000000000000
[  253.402844] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[  253.402859] Modules linked in: btusb btrtl btbcm btintel bluetooth ath10k_pci ath10k_core ar10k_ath ar10k_mac80211 cfg80211 ip6table_filter asix usbnet mii
[  253.402864] CPU: 4 PID: 1331 Comm: cat Not tainted 4.4.70 #15
[  253.402865] Hardware name: Google Scarlet (DT)
[  253.402867] task: ffffffc076c0ce00 ti: ffffffc06c2c8000 task.ti: ffffffc06c2c8000
[  253.402871] PC is at rockchip_drm_sys_suspend+0x20/0x5c

Add sanity checks to prevent that.

Reported-by: Brian Norris <briannorris@chromium.com>
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.kernel.org/patch/9890297/
2017-08-09 13:59:25 -04:00
Shuah Khan
7ba190be87 selftests: futex: fix run_tests target
make -C tools/testing/selftests/futex/ run_tests doesn't run the futex
tests.

Running the tests when `dirname $(OUTPUT)` == $(PWD) doesn't work when
the $(OUTPUT) is $(PWD) which is the case when the test is run using
make -C tools/testing/selftests/futex/ run_tests.

Fixes: a8ba798bc8 ("selftests: enable O and KBUILD_OUTPUT")
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-08-09 10:16:24 -06:00
Cao jin
4e433fc4d1 fixdep: trivial: typo fix and correction
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-08-10 01:01:03 +09:00
Cao jin
312a3d0918 kbuild: trivial cleanups on the comments
This is a bunch of trivial fixes and cleanups.

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-08-10 00:58:20 +09:00
Nicholas Piggin
cb87481ee8 kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured
The .data and .bss sections were modified in the generic linker script to
pull in sections named .data.<C identifier>, which are generated by gcc with
-ffunction-sections and -fdata-sections options.

The problem with this pattern is it can also match section names that Linux
defines explicitly, e.g., .data.unlikely. This can cause Linux sections to
get moved into the wrong place.

The way to avoid this is to use ".." separators for explicit section names
(the dot character is valid in a section name but not a C identifier).
However currently there are sections which don't follow this rule, so for
now just disable the wild card by default.

Example: http://marc.info/?l=linux-arm-kernel&m=150106824024221&w=2

Cc: <stable@vger.kernel.org> # 4.9
Fixes: b67067f117 ("kbuild: allow archs to select link dead code/data elimination")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-08-10 00:57:42 +09:00
megha.dey@linux.intel.com
8861249c74 crypto: x86/sha1 - Fix reads beyond the number of blocks passed
It was reported that the sha1 AVX2 function(sha1_transform_avx2) is
reading ahead beyond its intended data, and causing a crash if the next
block is beyond page boundary:
http://marc.info/?l=linux-crypto-vger&m=149373371023377

This patch makes sure that there is no overflow for any buffer length.

It passes the tests written by Jan Stancek that revealed this problem:
https://github.com/jstancek/sha1-avx2-crash

I have re-enabled sha1-avx2 by reverting commit
b82ce24426

Cc: <stable@vger.kernel.org>
Fixes: b82ce24426 ("crypto: sha1-ssse3 - Disable avx2")
Originally-by: Ilya Albrekht <ilya.albrekht@intel.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-09 20:01:37 +08:00
Herbert Xu
28389575a8 crypto: ixp4xx - Fix error handling path in 'aead_perform()'
In commit 0f987e25cb, the source processing has been moved in front of
the destination processing, but the error handling path has not been
modified accordingly.
Free resources in the correct order to avoid some leaks.

Cc: <stable@vger.kernel.org>
Fixes: 0f987e25cb ("crypto: ixp4xx - Fix false lastlen uninitialised warning")
Reported-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
2017-08-09 20:01:33 +08:00
Naftali Goldstein
20fc690f38 iwlwifi: mvm: send delba upon rx ba session timeout
When an RX block-ack session times out, the firmware, which offloads
RX reordering but not the BA session negotiation, stops the session
but doesn't send a DELBA.  This causes the the session to remain
active in the remote device, so no more BA sessions will be
established, causing a severe throughput degradation due to the lack
of aggregation.

Use the new ieee80211_rx_ba_timer_expired API when the ba session timer
expires, since this will tear down the ba session and also send a delba.

The previous API used is intended for drivers that offload the
addba/delba negotiation, but not the rx reordering, while our driver
does the opposite.

This patch depends on "mac80211: add api to start ba session timer
expired flow".

Signed-off-by: Naftali Goldstein <naftali.goldstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-09 10:05:01 +03:00
Naftali Goldstein
04c2cf3436 mac80211: add api to start ba session timer expired flow
Some drivers handle rx buffer reordering internally (and by extension
handle also the rx ba session timer internally), but do not ofload the
addba/delba negotiation.
Add an api for these drivers to properly tear-down the ba session,
including sending a delba.

Signed-off-by: Naftali Goldstein <naftali.goldstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-09 09:49:42 +03:00
Sergei Shtylyov
7f5770678b dmaengine: tegra210-adma: fix of_irq_get() error check
of_irq_get() may return 0 as well as negative error number on failure,
while the driver only checks for the negative values. The driver would then
call request_irq(0, ...) in tegra_adma_alloc_chan_resources() and never get
valid channel interrupt.

Check for 'tdc->irq <= 0' instead and return -ENXIO from the driver's probe
iff of_irq_get() returned 0.

Fixes: f46b195799 ("dmaengine: tegra-adma: Add support for Tegra210 ADMA")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2017-08-09 11:39:16 +05:30
Emmanuel Grumbach
a600852a9d iwlwifi: mvm: don't WARN when a legit race happens in A-MPDU
When we start an Rx A-MPDU session, we first get the AddBA
request, then we send an ADD_STA command to the firmware
that will reply with a BAID which is a hardware resource
that tracks the BA session.
This BAID will appear on each and every frame that we get
from the firwmare until the A-MPDU session is torn down.
In the Rx path, we look at this BAID to manage the
reordering buffer.

This flow is inherently racy since the hardware will start
to put the BAID in the frames it receives even if the
firmware hasn't sent the response to the ADD_STA command.
This basically means that the driver can get frames with
a valid BAID that it doesn't know yet.
When that happens, the driver used to WARN.
Fix this by simply not WARN in this case. When the driver
will know abou the BAID, it will initialise the relevant
states and the next frame with a valid BAID will refresh
them.

Fixes: b915c10174 ("iwlwifi: mvm: add reorder buffer per queue")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-09 08:53:50 +03:00
Avraham Stern
7e39a00d59 iwlwifi: mvm: start mac queues when deferred tx frames are purged
In AP mode, if a station is removed just as it is adding a new stream,
the queue in question will remain stopped and no more TX will happen
in this queue, leading to connection failures and other problems.

This is because under DQA, when tx is deferred because a queue needs
to be allocated, the mac queue for that TID is stopped until the new
stream is added.  If at this point the station that this stream
belongs to is removed, all the deferred tx frames are purged, but the
mac queue is not restarted. As a result, all following tx on this
queue will not be transmitted.

Fix this by starting the relevant mac queues when the deferred tx
frames are purged.

Fixes: 24afba7690 ("iwlwifi: mvm: support bss dynamic alloc/dealloc of queues")
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-09 08:25:24 +03:00
Brian King
424f727b94 scsi: ses: Fix wrong page error
If a SES device returns an error on a requested diagnostic page, we are
currently printing an error indicating the wrong page was received. Fix
this up to simply return a failure and only check the returned page when
the diagnostic page buffer was populated by the device.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-08 11:49:52 -04:00
Brian King
b0e17a9b0d scsi: ipr: Fix scsi-mq lockdep issue
Fixes the following lockdep warning that can occur when scsi-mq is
enabled with ipr due to ipr calling scsi_unblock_requests from irq
context. The fix is to move the call to scsi_unblock_requests to ipr's
existing workqueue.

stack backtrace:
CPU: 28 PID: 0 Comm: swapper/28 Not tainted 4.13.0-rc2-gcc6x-gf74c89b #1
Call Trace:
[c000001fffe97550] [c000000000b50818] dump_stack+0xe8/0x160 (unreliable)
[c000001fffe97590] [c0000000001586d0] print_usage_bug+0x2d0/0x390
[c000001fffe97640] [c000000000158f34] mark_lock+0x7a4/0x8e0
[c000001fffe976f0] [c00000000015a000] __lock_acquire+0x6a0/0x1a70
[c000001fffe97860] [c00000000015befc] lock_acquire+0xec/0x2e0
[c000001fffe97930] [c000000000b71514] _raw_spin_lock+0x44/0x70
[c000001fffe97960] [c0000000005b60f4] blk_mq_sched_dispatch_requests+0xa4/0x2a0
[c000001fffe979c0] [c0000000005acac0] __blk_mq_run_hw_queue+0x100/0x2c0
[c000001fffe97a00] [c0000000005ad478] __blk_mq_delay_run_hw_queue+0x118/0x130
[c000001fffe97a40] [c0000000005ad61c] blk_mq_start_hw_queues+0x6c/0xa0
[c000001fffe97a80] [c000000000797aac] scsi_kick_queue+0x2c/0x60
[c000001fffe97aa0] [c000000000797cf0] scsi_run_queue+0x210/0x360
[c000001fffe97b10] [c00000000079b888] scsi_run_host_queues+0x48/0x80
[c000001fffe97b40] [c0000000007b6090] ipr_ioa_bringdown_done+0x70/0x1e0
[c000001fffe97bc0] [c0000000007bc860] ipr_reset_ioa_job+0x80/0xf0
[c000001fffe97bf0] [c0000000007b4d50] ipr_reset_timer_done+0xd0/0x100
[c000001fffe97c30] [c0000000001937bc] call_timer_fn+0xdc/0x4b0
[c000001fffe97cf0] [c000000000193d08] expire_timers+0x178/0x330
[c000001fffe97d60] [c0000000001940c8] run_timer_softirq+0xb8/0x120
[c000001fffe97de0] [c000000000b726a8] __do_softirq+0x168/0x6d8
[c000001fffe97ef0] [c0000000000df2c8] irq_exit+0x108/0x150
[c000001fffe97f10] [c000000000017bf4] __do_irq+0x2a4/0x4a0
[c000001fffe97f90] [c00000000002da50] call_do_irq+0x14/0x24
[c0000007fad93aa0] [c000000000017e8c] do_IRQ+0x9c/0x140
[c0000007fad93af0] [c000000000008b98] hardware_interrupt_common+0x138/0x140

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-08 11:49:51 -04:00
Bodo Stroesser
180efde0a3 scsi: st: fix blk_get_queue usage
If blk_queue_get() in st_probe fails, disk->queue must not be set to
SDp->request_queue, as that would result in put_disk() dropping a not
taken reference.

Thus, disk->queue should be set only after a successful blk_queue_get().

Fixes: 2b5bebccd2 ("st: Take additional queue ref in st_probe")
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Acked-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-08 11:49:51 -04:00
Michael Hernandez
be37aa4b99 scsi: qla2xxx: Fix system crash while triggering FW dump
This patch fixes system hang/crash while firmware dump is attempted with
Block MQ enabled in qla2xxx driver. Fix is to remove check in fw dump
template entries for existing request and response queues so that full
buffer size is calculated during template size calculation.

Following stack trace is seen during firmware dump capture process

[  694.390588] qla2xxx [0000:81:00.0]-5003:11: ISP System Error - mbx1=4b1fh mbx2=10h mbx3=2ah mbx7=0h.
[  694.402336] BUG: unable to handle kernel paging request at ffffc90008c7b000
[  694.402372] IP: memcpy_erms+0x6/0x10
[  694.402386] PGD 105f01a067
[  694.402386] PUD 85f89c067
[  694.402398] PMD 10490cb067
[  694.402409] PTE 0
[  694.402421]
[  694.402437] Oops: 0002 [#1] PREEMPT SMP
[  694.402452] Modules linked in: netconsole configfs qla2xxx scsi_transport_fc
nvme_fc nvme_fabrics bnep bluetooth rfkill xt_tcpudp unix_diag xt_multiport
ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet
iscsi_ibft iscsi_boot_sysfs xfs libcrc32c ipmi_ssif sb_edac edac_core
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass igb
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel iTCO_wdt
aes_x86_64 crypto_simd ptp iTCO_vendor_support glue_helper cryptd lpc_ich joydev
i2c_i801 pcspkr ioatdma mei_me pps_core tpm_tis mei mfd_core acpi_power_meter
tpm_tis_core ipmi_si ipmi_devintf tpm ipmi_msghandler shpchp wmi dca button
acpi_pad btrfs xor uas usb_storage hid_generic usbhid raid6_pq crc32c_intel ast
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect
[  694.402692]  sysimgblt fb_sys_fops xhci_pci ttm ehci_pci sr_mod xhci_hcd
cdrom ehci_hcd drm usbcore sg
[  694.402730] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-1-default+ #19
[  694.402753] Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1a 10/16/2015
[  694.402776] task: ffffffff81c0e4c0 task.stack: ffffffff81c00000
[  694.402798] RIP: 0010:memcpy_erms+0x6/0x10
[  694.402813] RSP: 0018:ffff88085fc03cd0 EFLAGS: 00210006
[  694.402832] RAX: ffffc90008c7ae0c RBX: 0000000000000004 RCX: 000000000001fe0c
[  694.402856] RDX: 0000000000020000 RSI: ffff8810332c01f4 RDI: ffffc90008c7b000
[  694.402879] RBP: ffff88085fc03d18 R08: 0000000000020000 R09: 0000000000279e0a
[  694.402903] R10: 0000000000000000 R11: f000000000000000 R12: ffff88085fc03d80
[  694.402927] R13: ffffc90008a01000 R14: ffffc90008a056d4 R15: ffff881052ef17e0
[  694.402951] FS:  0000000000000000(0000) GS:ffff88085fc00000(0000) knlGS:0000000000000000
[  694.402977] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  694.403012] CR2: ffffc90008c7b000 CR3: 0000000001c09000 CR4: 00000000001406f0
[  694.403036] Call Trace:
[  694.403047]  <IRQ>
[  694.403072]  ? qla27xx_fwdt_entry_t263+0x18e/0x380 [qla2xxx]
[  694.403099]  qla27xx_walk_template+0x9d/0x1a0 [qla2xxx]
[  694.403124]  qla27xx_fwdump+0x1f3/0x272 [qla2xxx]
[  694.403149]  qla2x00_async_event+0xb08/0x1a50 [qla2xxx]
[  694.403169]  ? enqueue_task_fair+0xa2/0x9d0

Signed-off-by: Mike Hernandez <michael.hernandez@cavium.com>
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-08 11:49:50 -04:00
Song Liu
a9501d7421 md/r5cache: fix io_unit handling in r5l_log_endio()
In r5l_log_endio(), once log->io_list_lock is released, the io unit
may be accessed (or even freed) by other threads. Current code
doesn't handle the io_unit properly, which leads to potential race
conditions.

This patch solves this race condition by:

1. Add a pending_stripe count flush_payload. Multiple flush_payloads
   are counted as only one pending_stripe. Flag has_flush_payload is
   added to show whether the io unit has flush_payload;
2. In r5l_log_endio(), check flags has_null_flush and
   has_flush_payload with log->io_list_lock held. After the lock
   is released, this IO unit is only accessed when we know the
   pending_stripe counter cannot be zeroed by other threads.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-08 07:42:37 -07:00
Song Liu
b44886c54a md/r5cache: call mddev_lock/unlock() in r5c_journal_mode_set
In r5c_journal_mode_set(), it is necessary to call mddev_lock()
before accessing conf and conf->log. Otherwise, the conf->log
may change (and become NULL).

Shaohua: fix unlock in failure cases

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-08 07:42:36 -07:00
NeilBrown
81fe48e9aa md: fix test in md_write_start()
md_write_start() needs to clear the in_sync flag is it is set, or if
there might be a race with set_in_sync() such that the later will
set it very soon.  In the later case it is sufficient to take the
spinlock to synchronize with set_in_sync(), and then set the flag
if needed.

The current test is incorrect.
It should be:
  if "flag is set" or "race is possible"

"flag is set" is trivially "mddev->in_sync".
"race is possible" should be tested by "mddev->sync_checkers".

If sync_checkers is 0, then there can be no race.  set_in_sync() will
wait in percpu_ref_switch_to_atomic_sync() for an RCU grace period,
and as md_write_start() holds the rcu_read_lock(), set_in_sync() will
be sure ot see the update to writes_pending.

If sync_checkers is > 0, there could be race.  If md_write_start()
happened entirely between
		if (!mddev->in_sync &&
		    percpu_ref_is_zero(&mddev->writes_pending)) {
and
			mddev->in_sync = 1;
in set_in_sync(), then it would not see that is_sync had been set,
and set_in_sync() would not see that writes_pending had been
incremented.

This bug means that in_sync is sometimes not set when it should be.
Consequently there is a small chance that the array will be marked as
"clean" when in fact it is inconsistent.

Fixes: 4ad23a9764 ("MD: use per-cpu counter for writes_pending")
cc: stable@vger.kernel.org (v4.12+)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-08 07:42:36 -07:00
NeilBrown
33182d15c6 md: always clear ->safemode when md_check_recovery gets the mddev lock.
If ->safemode == 1, md_check_recovery() will try to get the mddev lock
and perform various other checks.
If mddev->in_sync is zero, it will call set_in_sync, and clear
->safemode.  However if mddev->in_sync is not zero, ->safemode will not
be cleared.

When md_check_recovery() drops the mddev lock, the thread is woken
up again.  Normally it would just check if there was anything else to
do, find nothing, and go to sleep.  However as ->safemode was not
cleared, it will take the mddev lock again, then wake itself up
when unlocking.

This results in an infinite loop, repeatedly calling
md_check_recovery(), which RCU or the soft-lockup detector
will eventually complain about.

Prior to commit 4ad23a9764 ("MD: use per-cpu counter for
writes_pending"), safemode would only be set to one when the
writes_pending counter reached zero, and would be cleared again
when writes_pending is incremented.  Since that patch, safemode
is set more freely, but is not reliably cleared.

So in md_check_recovery() clear ->safemode before checking ->in_sync.

Fixes: 4ad23a9764 ("MD: use per-cpu counter for writes_pending")
Cc: stable@vger.kernel.org (4.12+)
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reported-by: David R <david@unsolicited.net>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-08-08 07:42:35 -07:00
Luis R. Rodriguez
8b0949d407 test_sysctl: fix sysctl.sh by making it executable
We had just forogtten to do this. Without this the following test fails:

$ sudo make -C tools/testing/selftests/sysctl/ run_tests
make: Entering directory '/home/mcgrof/linux-next/tools/testing/selftests/sysctl'
/bin/sh: ./sysctl.sh: Permission denied
selftests:  sysctl.sh [FAIL]
/home/mcgrof/linux-next/tools/testing/selftests/sysctl
make: Leaving directory '/home/mcgrof/linux-next/tools/testing/selftests/sysctl'

Fixes: 64b671204a ("test_sysctl: add generic script to expand on tests")
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-08-07 15:13:36 -06:00
Luis R. Rodriguez
0a9c40cea7 test_kmod: fix kmod.sh by making it executable
We had just forgotten to do this. Without this if we run the
following we get a permission denied:

sudo make -C tools/testing/selftests/kmod/ run_tests
make: Entering directory '/home/mcgrof/linux-next/tools/testing/selftests/kmod'
/bin/sh: ./kmod.sh: Permission denied
selftests:  kmod.sh [FAIL]
/home/mcgrof/linux-next/tools/testing/selftests/kmod
make: Leaving directory '/home/mcgrof/linux-next/tools/testing/selftests/kmod

Fixes: 39258f448d71 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-08-07 15:13:11 -06:00
zhangyi (F)
41e327b586 quota: correct space limit check
Currently we compare total space (curspace + rsvspace)
with space limit in quota-tools when setting grace time
and also in check_bdq(), but we missing rsvspace in
somewhere else, correct them. This patch also fix incorrect
zero dqb_btime and grace time updating failure when we use
rsvspace(e.g. ext4 dalloc feature).

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-08-07 16:51:28 +02:00
Florian Fainelli
c017d21147 irqchip: brcmstb-l2: Define an irq_pm_shutdown function
The Broadcom STB platforms support S5 and we allow specific hardware
wake-up events to take us out of this state. Because we were not
defining an irq_pm_shutdown() function pointer, we would not be
correctly masking non-wakeup events, which would result in spurious
wake-ups from sources that were not explicitly configured for wake-up.

Fixes: 7f646e9276 ("irqchip: brcmstb-l2: Add Broadcom Set Top Box Level-2 interrupt controller")
Acked-by: Gregory Fong <gregory.0xf0@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-08-07 13:50:52 +01:00
Kuninori Morimoto
8cd7b51ff5 arm64: renesas: salvator-common: avoid audio_clkout naming conflict
clock name of "audio_clkout" is used by Renesas sound driver.
This duplicated naming breaks its clock registering/unregistering.
Especially, when unbind/bind it can't handle clkout correctly.
This patch renames "audio_clkout" to "audio-clkout" to avoid
naming conflict.

Fixes: 8a8f181d2c ("arm64: renesas: salvator-x: use CS2000 as AUDIO_CLK_B")
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2017-08-07 09:50:10 +02:00
Christophe Jaillet
aae9d56323 iwlwifi: mvm: Fix a memory leak in an error handling path in 'iwl_mvm_sar_get_wgds_table()'
We should free 'wgds.pointer' here as done a few lines above in another
error handling path.
It was allocated within 'acpi_evaluate_object()'.

Fixes: c52030a01ccc ("iwlwifi: mvm: add GEO_TX_POWER_LIMIT cmd for geographic tx power table")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-05 21:28:24 +03:00
Martin Kaiser
8317562097 ARM: dts: i.MX25: add ranges to tscadc
Add a ranges; line to the tscadc node. This creates a 1:1 mapping between
the addresses used by tscadc and those in its child nodes (adc, tsc).

Without such a mapping, the reg = ... lines in the tsc and adc nodes do
not create a resource. Probing the fsl-imx25-tcq and fsl-imx25-tsadc
drivers will then fail since there's no IORESOURCE_MEM.

Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Fixes: 92f651f39b ("ARM: dts: imx25: Add TSC and ADC support")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2017-08-05 13:22:32 +08:00
Stefan Agner
9e01e2d56d soc: imx: gpcv2: fix regulator deferred probe
If a regulator requests a deferred probe, the power domain gets
initialized twice. This leads to a list double add (without
list debugging the kernel hangs due to the double add later):

  WARNING: CPU: 0 PID: 19 at lib/list_debug.c:31 __list_add_valid+0xbc/0xc4
  list_add double add: new=c1229754, prev=c12383b4, next=c1229754.

Initialize the power domain after we get the regulator. Also do
not print an error in case the regulator defers probing.

Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Fixes: 03aa12629f ("soc: imx: Add GPCv2 power gating driver")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Tested-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2017-08-05 11:12:20 +08:00
Vineet Gupta
b5ddb6d547 ARCv2: PAE40: set MSB even if !CONFIG_ARC_HAS_PAE40 but PAE exists in SoC
PAE40 confiuration in hardware extends some of the address registers
for TLB/cache ops to 2 words.

So far kernel was NOT setting the higher word if feature was not enabled
in software which is wrong. Those need to be set to 0 in such case.

Normally this would be done in the cache flush / tlb ops, however since
these registers only exist conditionally, this would have to be
conditional to a flag being set on boot which is expensive/ugly -
specially for the more common case of PAE exists but not in use.
Optimize that by zero'ing them once at boot - nobody will write to
them afterwards

Cc: stable@vger.kernel.org   #4.4+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-04 13:56:35 +05:30
Alexey Brodkin
7d79cee2c6 ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
It is necessary to explicitly set both SLC_AUX_RGN_START1 and SLC_AUX_RGN_END1
which hold MSB bits of the physical address correspondingly of region start
and end otherwise SLC region operation is executed in unpredictable manner

Without this patch, SLC flushes on HSDK (IOC disabled) were taking
seconds.

Cc: stable@vger.kernel.org   #4.4+
Reported-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: PAR40 regs only written if PAE40 exist]
2017-08-04 13:56:34 +05:30
Vineet Gupta
2e332fec2f ARC: dma: implement dma_unmap_page and sg variant
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-04 13:56:33 +05:30
Alexey Brodkin
b37174d95b ARCv2: SLC: Make sure busy bit is set properly for region ops
c70c473396 "ARCv2: SLC: Make sure busy bit is set properly on SLC flushing"
fixes problem for entire SLC operation where the problem was initially
caught. But given a nature of the issue it is perfectly possible for
busy bit to be read incorrectly even when region operation was started.

So extending initial fix for regional operation as well.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org   #4.10
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-04 13:50:07 +05:30
Vineet Gupta
33460f86ad ARC: [plat-sim] Include this platform unconditionally
Essentially remove CONFIG_ARC_PLAT_SIM

There is no need for any platform specific code, just the board DTS
match strings which we can include unconditionally

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-04 13:49:47 +05:30
Eugeniy Paltsev
f862b31514 ARC: [plat-axs10x]: prepare dts files for enabling PAE40 on axs103
Enable 64bit adressing, where it needed, to make possible
enabling PAE40 on axs103.

This patch doesn't affect on any functionality.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev-HKixBCOQz3hWk0Htik3J/w@public.gmane.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-04 13:49:23 +05:30
Kalle Valo
368bd88ebb Merge tag 'iwlwifi-for-kalle-2017-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes
Some fixes in iwlwifi for 4.13

* Some simple PCI HW ID fix-ups and additions for family 9000;
* A couple of bugzilla fixes:
  - Remove a bogus warning message with new FWs (196915)
  - Don't allow illegal channel options to be used (195299)
* A fix for checksum offload in family 9000;
* A fix serious throughput degradation in 11ac with multiple streams;
* An old bug in SMPS where the firmware was not aware of SMPS changes;
2017-08-03 11:01:11 +03:00
Steven Rostedt (VMware)
a7e52ad7ed ring-buffer: Have ring_buffer_alloc_read_page() return error on offline CPU
Chunyu Hu reported:
  "per_cpu trace directories and files are created for all possible cpus,
   but only the cpus which have ever been on-lined have their own per cpu
   ring buffer (allocated by cpuhp threads). While trace_buffers_open, the
   open handler for trace file 'trace_pipe_raw' is always trying to access
   field of ring_buffer_per_cpu, and would panic with the NULL pointer.

   Align the behavior of trace_pipe_raw with trace_pipe, that returns -NODEV
   when openning it if that cpu does not have trace ring buffer.

   Reproduce:
   cat /sys/kernel/debug/tracing/per_cpu/cpu31/trace_pipe_raw
   (cpu31 is never on-lined, this is a 16 cores x86_64 box)

   Tested with:
   1) boot with maxcpus=14, read trace_pipe_raw of cpu15.
      Got -NODEV.
   2) oneline cpu15, read trace_pipe_raw of cpu15.
      Get the raw trace data.

   Call trace:
   [ 5760.950995] RIP: 0010:ring_buffer_alloc_read_page+0x32/0xe0
   [ 5760.961678]  tracing_buffers_read+0x1f6/0x230
   [ 5760.962695]  __vfs_read+0x37/0x160
   [ 5760.963498]  ? __vfs_read+0x5/0x160
   [ 5760.964339]  ? security_file_permission+0x9d/0xc0
   [ 5760.965451]  ? __vfs_read+0x5/0x160
   [ 5760.966280]  vfs_read+0x8c/0x130
   [ 5760.967070]  SyS_read+0x55/0xc0
   [ 5760.967779]  do_syscall_64+0x67/0x150
   [ 5760.968687]  entry_SYSCALL64_slow_path+0x25/0x25"

This was introduced by the addition of the feature to reuse reader pages
instead of re-allocating them. The problem is that the allocation of a
reader page (which is per cpu) does not check if the cpu is online and set
up for the ring buffer.

Link: http://lkml.kernel.org/r/1500880866-1177-1-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 73a757e631 ("ring-buffer: Return reader page back into existing ring buffer")
Reported-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-02 14:23:02 -04:00
Dan Carpenter
147d88e0b5 tracing: Missing error code in tracer_alloc_buffers()
If ring_buffer_alloc() or one of the next couple function calls fail
then we should return -ENOMEM but the current code returns success.

Link: http://lkml.kernel.org/r/20170801110201.ajdkct7vwzixahvx@mwanda

Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Fixes: b32614c034 ('tracing/rb: Convert to hotplug state machine')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-02 14:19:57 -04:00
Steven Rostedt (VMware)
4bb0f0e73c tracing: Call clear_boot_tracer() at lateinit_sync
The clear_boot_tracer function is used to reset the default_bootup_tracer
string to prevent it from being accessed after boot, as it originally points
to init data. But since clear_boot_tracer() is called via the
init_lateinit() call, it races with the initcall for registering the hwlat
tracer. If someone adds "ftrace=hwlat" to the kernel command line, depending
on how the linker sets up the text, the saved command line may be cleared,
and the hwlat tracer never is initialized.

Simply have the clear_boot_tracer() be called by initcall_lateinit_sync() as
that's for tasks to be called after lateinit.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196551

Cc: stable@vger.kernel.org
Fixes: e7c15cd8a ("tracing: Added hardware latency tracer")
Reported-by: Zamir SUN <sztsian@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-02 14:19:57 -04:00
Will Deacon
39a06b67c2 irqchip/gic: Ensure we have an ISB between ack and ->handle_irq
Devices that expose their interrupt status registers via system
registers (e.g. Statistical profiling, CPU PMU, DynamIQ PMU, arch timer,
vgic (although unused by Linux), ...) rely on a context synchronising
operation on the CPU to ensure that the updated status register is
visible to the CPU when handling the interrupt. This usually happens as
a result of taking the IRQ exception in the first place, but there are
two race scenarios where this isn't the case.

For example, let's say we have two peripherals (X and Y), where Y uses a
system register for its interrupt status.

Case 1:
1. CPU takes an IRQ exception as a result of X raising an interrupt
2. Y then raises its interrupt line, but the update to its system
   register is not yet visible to the CPU
3. The GIC decides to expose Y's interrupt number first in the Ack
   register
4. The CPU runs the IRQ handler for Y, but the status register is stale

Case 2:
1. CPU takes an IRQ exception as a result of X raising an interrupt
2. CPU reads the interrupt number for X from the Ack register and runs
   its IRQ handler
3. Y raises its interrupt line and the Ack register is updated, but
   again, the update to its system register is not yet visible to the
   CPU.
4. Since the GIC drivers poll the Ack register, we read Y's interrupt
   number and run its handler without a context synchronisation
   operation, therefore seeing the stale register value.

In either case, we run the risk of missing an IRQ. This patch solves the
problem by ensuring that we execute an ISB in the GIC drivers prior
to invoking the interrupt handler. This is already the case for GICv3
and EOIMode 1 (the usual case for the host).

Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-08-02 16:55:59 +01:00
Robert Richter
d1ce263feb irqchip/gic-v3-its: Remove ACPICA version check for ACPI NUMA
The version check was added due to dependency to

 a618c7f89a ACPICA: Add support for new SRAT subtable

Now, that this code is in the kernel, remove the check. This is esp.
useful to enable backports.

Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-08-02 16:55:48 +01:00
Logan Gunthorpe
0eb4634536 ntb: ntb_test: ensure the link is up before trying to configure the mws
After the link tests, there is a race on one side of the test for
the link coming up. It's possible, in some cases, for the test script
to write to the 'peer_trans' files before the link has come up.

To fix this, we simply use the link event file to ensure both sides
see the link as up before continuning.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Fixes: a9c59ef774 ("ntb_test: Add a selftest script for the NTB subsystem")
2017-08-01 15:18:59 -04:00
Dave Jiang
f3fd2afed8 ntb: transport shouldn't disable link due to bogus values in SPADs
It seems that under certain scenarios the SPAD can have bogus values caused
by an agent (i.e. BIOS or other software) that is not the kernel driver, and
that causes memory window setup failure. This should not cause the link to
be disabled because if we do that, the driver will never recover again. We
have verified in testing that this issue happens and prevents proper link
recovery.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Fixes: 84f766855f ("ntb: stop link work when we do not have memory")
2017-08-01 13:31:44 -04:00
Tzipi Peres
558f479f68 iwlwifi: add the new 9000 series PCI IDs
Add two PCI IDs for the 9160 series.
Add five PCI IDs for the 9260 series.
Add one PCI IDs for the 9270 series.
Add seven PCI IDs for the 9460 series.
Add five PCI IDs for the 9560 series.

Signed-off-by: Tzipi Peres <tzipi.peres@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-01 12:26:17 +03:00
Naftali Goldstein
8addabf8e6 iwlwifi: mvm: set the RTS_MIMO_PROT bit in flag mask when sending sta to fw
Set the STA_FLG_RTS_MIMO_PROT bit in station_flags_msk of the add sta
command, so that when smps mode changes, the FW will know about it.

In particular, in AP mode, clients are added upon receival of an auth
request, at which point there's no knowledge of the client's smps mode.
When the assoc request arrives, the add_sta command is resent to modify
the station parameters. At this point the driver knows the smps mode,
but since the corresponding bit in the mask is not set, the fw doesn't
update this field so there's no rts protection for mimo.

Fixes: 5bc5aaad40 ("iwlwifi: mvm: set up initial SMPS/NSS station info")
Signed-off-by: Naftali Goldstein <naftali.goldstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-01 12:18:51 +03:00
Haim Dreyfuss
e9fb92e13d iwlwifi: fix fw_pre_next_step to apply also for C step
C step NICs should use the latest FW (currently B step).
Correct the condition to make C step NICs advanced its default FW name
to the latest one.
Also rename _next_ to b_or_c to avoid confusion.

Fixes: 5da083d192 ("iwlwifi: add support for 9000 HW B-step NICs")
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-01 12:05:09 +03:00
Gregory Greenman
87f55616f8 iwlwifi: mvm: rs: fix TLC statistics collection
Statistics should be collected according to the actual rate a
frame/aggregation was transmitted and not according to the initial rate
from the last LQ command (these rates are different if the frames were
retransmitted at a lower rate from the rate scale table).

This is needed to remove throughput degradation.

Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-01 11:51:02 +03:00
Gregory Greenman
9465c3f8ba iwlwifi: mvm: set A-MPDU bit upon empty BA notification from FW
The bit was set only if there was at least one reclaimed frame in an
aggregation. It's important to set it also in the case that the whole
A-MPDU was lost, otherwise rate scaling statistics will not be
updated correctly. Thus, set it always in ba notification handler.

This fixes a throughput degradation of about 20% in certain scenarios
with multiple streams on 11ac.

Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-01 11:50:30 +03:00
Emmanuel Grumbach
92b0f7b26b iwlwifi: split the regulatory rules when the bandwidth flags require it
When we create a regulatory domain out of an MCC
notification, we need to make sure that all the channels
in the rule have the exact same properties.
The current code mixes channel 36 and 40 although 36 can be
a control channel with HT40+ (36, 40) whereas 40 can't be
a control channel with HT40+ since  (40, 44) is invalid.

Because of that, cfg80211 would allow to connect in 40MHz
to APs that are configured to channel 40 HT40+ and that made
our firmware assert.

Fix this by checking the bandwidth flags before taking the
decision if the rule should be split.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=195299 partly.

Fixes: af45a9003f ("iwlwifi: create regdomain from mcc_update_cmd response")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-01 11:19:13 +03:00
Emmanuel Grumbach
58877d7428 iwlwifi: add TLV for MLME offload firmware capability
The firmware now adds a new DWORD for the MLME offload's
capability even on firmware versions that don't support
it.
Add the TLV bit to avoid getting the print:
capa flags index 3 larger than supported by driver.

This fixes the bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=196195

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-01 11:10:12 +03:00
Emmanuel Grumbach
3f25bb4b7f iwlwifi: mvm: fix TCP CSUM offload with WEP and A000 series
When we enabled TCP checksum offload, we need to tell the
firmware where the IP header starts. If we have an IV, then
we need to adapt that value since the IV is placed before
the SNAP header. This is true only for cases where the
driver adds the IV, not the WEP case in which the IV is
added by the firmware itself.

On A000 devices series, the IV is always added by the
device.

Fix this.

Fixes: 5e6a98dc48 ("iwlwifi: mvm: enable TCP/UDP checksum support for 9000 family")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-08-01 11:02:54 +03:00
Fabrice Gasnier
50dbe1f4b4 iio: adc: stm32: fix common clock rate
ADC clock input is provided to internal prescaler (that decreases its
frequency). It's then used as reference clock for conversions.

- Fix common clock rate used then by stm32-adc sub-devices. Take common
  prescaler into account. Currently, rate is used to set "boost" mode.
  It may unnecessarily be set. This impacts power consumption.
- Fix ADC max clock rate on STM32H7 (fADC from datasheet). Currently,
  prescaler may be set too low. This can result in ADC reference
  clock used for conversion to exceed max allowed clock frequency.

Fixes: 95e339b6e8 ("iio: adc: stm32: add support for STM32H7")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-07-30 17:18:38 +01:00
Stefan Brüns
ff3aa88a4d iio: adc: ina219: Avoid underflow for sleeping time
Proper support for the INA219 lowered the minimum sampling period from
2*140us to 2*84us. Subtracting 200us later leads to an underflow and
an almost infinite udelay later.

Using a signed int for the sampling period provides sufficient range
(at most 2*8640*1024us), but catches the underflow when comparing with
buffer_us.

Fixes: 18edac2e22 ("iio: adc: Fix integration time/averaging for INA219/220")
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-07-30 14:50:19 +01:00
Fabrice Gasnier
90938ca432 iio: trigger: stm32-timer: add enable attribute
In order to use encoder mode, timers needs to be enabled (e.g. CEN bit)
along with peripheral clock.
Add IIO_CHAN_INFO_ENABLE attribute to handle this.
Also, in triggered mode, CEN bit is set automatically in hardware.
Then clock must be enabled before starting triggered mode.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-07-30 14:48:14 +01:00
Fabrice Gasnier
06e3fe8998 iio: trigger: stm32-timer: fix get/set down count direction
Fixes: 4adec7da05 ("iio: stm32 trigger: Add quadrature encoder device")

This fixes two issues:
- stm32_set_count_direction: to set down direction
- stm32_get_count_direction: to get down direction

IIO core provides/expects value to be an index of iio_enum items array.
This needs to be turned by these routines into TIM_CR1_DIR (e.g. BIT(4))
value.
Also, report error when attempting to write direction, when in encoder
mode: in this case, direction is read only (given by encoder inputs).

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-07-30 14:46:02 +01:00
Fabrice Gasnier
1987a08cd9 iio: trigger: stm32-timer: fix write_raw return value
Fixes: 4adec7da05 ("iio: stm32 trigger: Add quadrature encoder device")

IIO core expects zero as return value for write_raw() callback
in case of success.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-07-30 14:44:28 +01:00
Fabrice Gasnier
50b39608ef iio: trigger: stm32-timer: fix quadrature mode get routine
Fixes: 4adec7da05 ("iio: stm32 trigger: Add quadrature encoder device")

SMS bitfiled is mode + 1. After reset, upon boot, SMS = 0. When reading
from sysfs, stm32_get_quadrature_mode() returns -1 (e.g. -EPERM) which is
wrong error code here. So, check SMS bitfiled matches valid encoder mode,
or return -EINVAL.

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-07-30 14:43:10 +01:00
Andreas Klinger
eb92b4183d iio: bmp280: properly initialize device for humidity reading
If the device is not initialized at least once it happens that the humidity
reading is skipped, which means the special value 0x8000 is delivered.

For omitting this case the oversampling of the humidity must be set before
the oversampling of the temperature und pressure is set as written in the
datasheet of the BME280.

Furthermore proper error detection is added in case a skipped value is read
from the device. This is done also for pressure and temperature reading.
Especially it don't make sense to compensate this value and treat it as
regular value.

Signed-off-by: Andreas Klinger <ak@it-klinger.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-07-30 14:41:12 +01:00
Alexander Dahl
3fb3b3c4b6 memory: atmel-ebi: Fix smc cycle xlate converter
The converter function for translating ns timings in register values was
initialized with a wrong function pointer. This resulted in wrong
register values also for the setup and pulse registers when configuring
the EBI interface trough dts.

Includes a small fix in a comment of the smc driver, which was probably
just a copy'n'paste mistake.

Signed-off-by: Alexander Dahl <ada@thorsis.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-26 22:37:54 +02:00
Alexander Dahl
1f6b53901f memory: atmel-ebi: Allow t_DF timings of zero ns
As reported in [1] and in [2] it's not possible to set the device tree
property 'atmel,smc-tdf-ns' to zero, although the SoC allows a setting
of 0ns for the t_DF time.

Allow this setting by doing the same thing as in the atmel nand
controller driver by setting ncycles to ATMEL_SMC_MODE_TDF_MIN if zero
is set in the dts.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-March/490966.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520652.html

Suggested-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Alexander Dahl <ada@thorsis.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-26 22:37:46 +02:00
Alexander Dahl
bc9b934b2f memory: atmel-ebi: Fix smc timing return value evaluation
Setting optional EBI/SMC properties through device tree always fails due
to wrong evaluation of the return value of
atmel_ebi_xslate_smc_timings().

If you put some of those properties in your dts file, but not
'atmel,smc-tdf-ns' the local variable 'required' in
atmel_ebi_xslate_smc_timings() stays on 'false' after the first 'if'
block. This leads to setting 'ret' to -EINVAL in the first run of the
following 'for' loop which is then the return value of this function.

However if you set 'atmel,smc-tdf-ns' in the dts file and everything in
atmel_ebi_xslate_smc_timings() works well, it returns the content of
'required' which is 'true' then.

So the function atmel_ebi_xslate_smc_timings() always returns non-zero
which lets its call in atmel_ebi_xslate_smc_config() always fail and
thus returning -EINVAL, so the EBI configuration for this node fails.

Judging from the following code evaluating the local 'required' variable
in atmel_ebi_xslate_smc_config() and the call of caps->xlate_config in
atmel_ebi_dev_setup() it's probably right to only let the call fail if a
negative error code is returned.

Signed-off-by: Alexander Dahl <ada@thorsis.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-26 22:37:30 +02:00
Oscar Campos
293b915fd9 Input: trackpoint - assume 3 buttons when buttons detection fails
Trackpoint buttons detection fails on ThinkPad 570 and 470 series,
this makes the middle button of the trackpoint to not being recogized.
As I don't believe there is any trackpoint with less than 3 buttons this
patch just assumes three buttons when the extended button information
read fails.

Signed-off-by: Oscar Campos <oscar.campos@member.fsf.org>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-07-18 17:28:47 -07:00
Krzysztof Kozlowski
29178c1473 ARC: defconfig: Cleanup from old Kconfig options
Remove old, dead Kconfig option INET_LRO. It is gone since
commit 7bbf3cae65 ("ipv4: Remove inet_lro library").

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-07-17 19:16:07 -07:00
Logan Gunthorpe
bc240eec4b ntb: use correct mw_count function in ntb_tool and ntb_transport
After converting to the new API, both ntb_tool and ntb_transport are
using ntb_mw_count to iterate through ntb_peer_get_addr when they
should be using ntb_peer_mw_count.

This probably isn't an issue with the Intel and AMD drivers but
this will matter for any future driver with asymetric memory window
counts.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Fixes: 443b9a14ec ("NTB: Alter MW API to support multi-ports devices")
2017-07-17 12:56:15 -04:00
Ludovic Desroches
8ff235fe7a ARM: dts: at91: sama5d2: fix EBI/NAND controllers declaration
Fix HSMC interrupt ID, PMECC registers and EBI ones.

Fixes: d9c41bf30c ("ARM: dts: at91: Declare EBI/NAND controllers")
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-17 11:37:39 +02:00
Ludovic Desroches
fa405fd9dd ARM: dts: at91: sama5d2: use sama5d2 compatible string for SMC
A new compatible string has been introduced for sama5d2 SMC to allow to
manage the registers mapping change.

Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-07-17 11:17:02 +02:00
Jonathan Liu
2a596fc9d9 drm/sun4i: Implement drm_driver lastclose to restore fbdev console
The drm_driver lastclose callback is called when the last userspace
DRM client has closed. Call drm_fbdev_cma_restore_mode to restore
the fbdev console otherwise the fbdev console will stop working.

Fixes: 9026e0d122 ("drm: Add Allwinner A10 Display Engine support")
Cc: stable@vger.kernel.org
Tested-by: Olliver Schinagl <oliver@schinagl.nl>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Jonathan Liu <net147@gmail.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
2017-07-17 08:22:03 +02:00
Fabio Estevam
05969566e6 ARM: dts: imx7d-sdb: Put pinctrl_spi4 in the correct location
pinctrl_spi4 pin group is not part of the low power iomux controller,
so move it under the normal iomuxc node.

Fixes: 184f39b57c ("ARM: dts: imx7d-sdb: Add GPIO expander node")
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2017-07-16 09:43:04 +08:00
Suzuki K Poulose
65a30f8b30 irqchip/gic-v3: Honor forced affinity setting
Honor the 'force' flag for set_affinity, by selecting a CPU
from the given mask (which may not be reported "online" by
the cpu_online_mask). Some drivers, like ARM PMU, rely on it.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-07-04 11:11:26 +01:00
Suzuki K Poulose
63c16c6eac irqchip/gic-v3: Report failures in gic_irq_domain_alloc
If the GIC cannot map an IRQ via irq_domain_ops->alloc(), it doesn't
return an error code.  This can cause a problem with drivers, where
it thinks it has successfully got an IRQ for the device, but requesting
the same ends up failure with -ENOSYS (as the IRQ's chip is not set).

Fixes: commit 443acc4f37 ("irqchip: GICv3: Convert to domain hierarchy")
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-07-04 11:11:25 +01:00
Suzuki K Poulose
456c59c31c irqchip/gic-v2: Report failures in gic_irq_domain_alloc
If the GIC cannot map an IRQ via irq_domain_ops->alloc(), it doesn't
return an error code.  This can cause a problem with drivers, where
it thinks it has successfully got an IRQ for the device, but requesting
the same ends up failure with -ENOSYS (as the IRQ's chip is not set).

Fixes: commit 9a1091ef00 ("irqchip: gic: Support hierarchy irq domain.")
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-07-04 11:11:25 +01:00
Boris Brezillon
0a46230bf0 irqchip/atmel-aic: Remove root argument from ->fixup() prototype
We are no longer using the root argument passed to the ->fixup() hooks.
Remove it.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-07-04 11:10:37 +01:00
Boris Brezillon
277867ade8 irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup()
of_find_compatible_node() is calling of_node_put() on its first argument
thus leading to an unbalanced of_node_get/put() issue if the node has not
been retained before that.

Instead of passing the root node, pass NULL, which does exactly the same:
iterate over all DT nodes, starting from the root node.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Fixes: 3d61467f9b ("irqchip: atmel-aic: Implement RTC irq fixup")
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-07-04 11:10:36 +01:00
Boris Brezillon
469bcef53c irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup()
aic_common_irq_fixup() is calling twice of_node_put() on the same node
thus leading to an unbalanced refcount on the root node.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Fixes: b2f579b58e ("irqchip: atmel-aic: Add irq fixup infrastructure")
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-07-04 11:09:50 +01:00
408 changed files with 2976 additions and 1441 deletions

View File

@@ -228,7 +228,7 @@ Learning on the device port should be enabled, as well as learning_sync:
bridge link set dev DEV learning on self
bridge link set dev DEV learning_sync on self
Learning_sync attribute enables syncing of the learned/forgotton FDB entry to
Learning_sync attribute enables syncing of the learned/forgotten FDB entry to
the bridge's FDB. It's possible, but not optimal, to enable learning on the
device port and on the bridge port, and disable learning_sync.
@@ -245,7 +245,7 @@ the responsibility of the port driver/device to age out these entries. If the
port device supports ageing, when the FDB entry expires, it will notify the
driver which in turn will notify the bridge with SWITCHDEV_FDB_DEL. If the
device does not support ageing, the driver can simulate ageing using a
garbage collection timer to monitor FBD entries. Expired entries will be
garbage collection timer to monitor FDB entries. Expired entries will be
notified to the bridge using SWITCHDEV_FDB_DEL. See rocker driver for
example of driver running ageing timer.

View File

@@ -58,20 +58,23 @@ Symbols/Function Pointers
%ps versatile_init
%pB prev_fn_of_versatile_init+0x88/0x88
For printing symbols and function pointers. The ``S`` and ``s`` specifiers
result in the symbol name with (``S``) or without (``s``) offsets. Where
this is used on a kernel without KALLSYMS - the symbol address is
printed instead.
The ``F`` and ``f`` specifiers are for printing function pointers,
for example, f->func, &gettimeofday. They have the same result as
``S`` and ``s`` specifiers. But they do an extra conversion on
ia64, ppc64 and parisc64 architectures where the function pointers
are actually function descriptors.
The ``S`` and ``s`` specifiers can be used for printing symbols
from direct addresses, for example, __builtin_return_address(0),
(void *)regs->ip. They result in the symbol name with (``S``) or
without (``s``) offsets. If KALLSYMS are disabled then the symbol
address is printed instead.
The ``B`` specifier results in the symbol name with offsets and should be
used when printing stack backtraces. The specifier takes into
consideration the effect of compiler optimisations which may occur
when tail-call``s are used and marked with the noreturn GCC attribute.
On ia64, ppc64 and parisc64 architectures function pointers are
actually function descriptors which must first be resolved. The ``F`` and
``f`` specifiers perform this resolution and then provide the same
functionality as the ``S`` and ``s`` specifiers.
Kernel Pointers
===============

View File

@@ -35,9 +35,34 @@ Table : Subdirectories in /proc/sys/net
bpf_jit_enable
--------------
This enables Berkeley Packet Filter Just in Time compiler.
Currently supported on x86_64 architecture, bpf_jit provides a framework
to speed packet filtering, the one used by tcpdump/libpcap for example.
This enables the BPF Just in Time (JIT) compiler. BPF is a flexible
and efficient infrastructure allowing to execute bytecode at various
hook points. It is used in a number of Linux kernel subsystems such
as networking (e.g. XDP, tc), tracing (e.g. kprobes, uprobes, tracepoints)
and security (e.g. seccomp). LLVM has a BPF back end that can compile
restricted C into a sequence of BPF instructions. After program load
through bpf(2) and passing a verifier in the kernel, a JIT will then
translate these BPF proglets into native CPU instructions. There are
two flavors of JITs, the newer eBPF JIT currently supported on:
- x86_64
- arm64
- ppc64
- sparc64
- mips64
- s390x
And the older cBPF JIT supported on the following archs:
- arm
- mips
- ppc
- sparc
eBPF JITs are a superset of cBPF JITs, meaning the kernel will
migrate cBPF instructions into eBPF instructions and then JIT
compile them transparently. Older cBPF JITs can only translate
tcpdump filters, seccomp rules, etc, but not mentioned eBPF
programs loaded through bpf(2).
Values :
0 - disable the JIT (default value)
1 - enable the JIT
@@ -46,9 +71,9 @@ Values :
bpf_jit_harden
--------------
This enables hardening for the Berkeley Packet Filter Just in Time compiler.
Supported are eBPF JIT backends. Enabling hardening trades off performance,
but can mitigate JIT spraying.
This enables hardening for the BPF JIT compiler. Supported are eBPF
JIT backends. Enabling hardening trades off performance, but can
mitigate JIT spraying.
Values :
0 - disable JIT hardening (default value)
1 - enable JIT hardening for unprivileged users only
@@ -57,11 +82,11 @@ Values :
bpf_jit_kallsyms
----------------
When Berkeley Packet Filter Just in Time compiler is enabled, then compiled
images are unknown addresses to the kernel, meaning they neither show up in
traces nor in /proc/kallsyms. This enables export of these addresses, which
can be used for debugging/tracing. If bpf_jit_harden is enabled, this feature
is disabled.
When BPF JIT compiler is enabled, then compiled images are unknown
addresses to the kernel, meaning they neither show up in traces nor
in /proc/kallsyms. This enables export of these addresses, which can
be used for debugging/tracing. If bpf_jit_harden is enabled, this
feature is disabled.
Values :
0 - disable JIT kallsyms export (default value)
1 - enable JIT kallsyms export for privileged users only

View File

@@ -7110,7 +7110,6 @@ M: Marc Zyngier <marc.zyngier@arm.com>
L: linux-kernel@vger.kernel.org
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
T: git git://git.infradead.org/users/jcooper/linux.git irqchip/core
F: Documentation/devicetree/bindings/interrupt-controller/
F: drivers/irqchip/

View File

@@ -1,7 +1,7 @@
VERSION = 4
PATCHLEVEL = 13
SUBLEVEL = 0
EXTRAVERSION = -rc5
EXTRAVERSION = -rc7
NAME = Fearless Coyote
# *DOCUMENTATION*
@@ -396,7 +396,7 @@ LINUXINCLUDE := \
KBUILD_CPPFLAGS := -D__KERNEL__
KBUILD_CFLAGS := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
-fno-strict-aliasing -fno-common \
-fno-strict-aliasing -fno-common -fshort-wchar \
-Werror-implicit-function-declaration \
-Wno-format-security \
-std=gnu89 $(call cc-option,-fno-PIE)
@@ -442,7 +442,7 @@ export RCS_TAR_IGNORE := --exclude SCCS --exclude BitKeeper --exclude .svn \
# ===========================================================================
# Rules shared between *config targets and build targets
# Basic helpers built in scripts/
# Basic helpers built in scripts/basic/
PHONY += scripts_basic
scripts_basic:
$(Q)$(MAKE) $(build)=scripts/basic
@@ -505,7 +505,7 @@ ifeq ($(KBUILD_EXTMOD),)
endif
endif
endif
# install and module_install need also be processed one by one
# install and modules_install need also be processed one by one
ifneq ($(filter install,$(MAKECMDGOALS)),)
ifneq ($(filter modules_install,$(MAKECMDGOALS)),)
mixed-targets := 1
@@ -964,7 +964,7 @@ export KBUILD_VMLINUX_MAIN := $(core-y) $(libs-y2) $(drivers-y) $(net-y) $(virt-
export KBUILD_VMLINUX_LIBS := $(libs-y1)
export KBUILD_LDS := arch/$(SRCARCH)/kernel/vmlinux.lds
export LDFLAGS_vmlinux
# used by scripts/pacmage/Makefile
# used by scripts/package/Makefile
export KBUILD_ALLDIRS := $(sort $(filter-out arch/%,$(vmlinux-alldirs)) arch Documentation include samples scripts tools)
vmlinux-deps := $(KBUILD_LDS) $(KBUILD_VMLINUX_INIT) $(KBUILD_VMLINUX_MAIN) $(KBUILD_VMLINUX_LIBS)
@@ -992,8 +992,8 @@ include/generated/autoksyms.h: FORCE
ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)
# Final link of vmlinux with optional arch pass after final link
cmd_link-vmlinux = \
$(CONFIG_SHELL) $< $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux) ; \
cmd_link-vmlinux = \
$(CONFIG_SHELL) $< $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux) ; \
$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
vmlinux: scripts/link-vmlinux.sh vmlinux_prereq $(vmlinux-deps) FORCE
@@ -1184,6 +1184,7 @@ PHONY += kselftest
kselftest:
$(Q)$(MAKE) -C tools/testing/selftests run_tests
PHONY += kselftest-clean
kselftest-clean:
$(Q)$(MAKE) -C tools/testing/selftests clean

View File

@@ -96,7 +96,6 @@ menu "ARC Architecture Configuration"
menu "ARC Platform/SoC/Board"
source "arch/arc/plat-sim/Kconfig"
source "arch/arc/plat-tb10x/Kconfig"
source "arch/arc/plat-axs10x/Kconfig"
#New platform adds here

View File

@@ -107,7 +107,7 @@ core-y += arch/arc/
# w/o this dtb won't embed into kernel binary
core-y += arch/arc/boot/dts/
core-$(CONFIG_ARC_PLAT_SIM) += arch/arc/plat-sim/
core-y += arch/arc/plat-sim/
core-$(CONFIG_ARC_PLAT_TB10X) += arch/arc/plat-tb10x/
core-$(CONFIG_ARC_PLAT_AXS10X) += arch/arc/plat-axs10x/
core-$(CONFIG_ARC_PLAT_EZNPS) += arch/arc/plat-eznps/

View File

@@ -15,15 +15,15 @@
/ {
compatible = "snps,arc";
#address-cells = <1>;
#size-cells = <1>;
#address-cells = <2>;
#size-cells = <2>;
cpu_card {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x00000000 0xf0000000 0x10000000>;
ranges = <0x00000000 0x0 0xf0000000 0x10000000>;
core_clk: core_clk {
#clock-cells = <0>;
@@ -91,23 +91,21 @@
mb_intc: dw-apb-ictl@0xe0012000 {
#interrupt-cells = <1>;
compatible = "snps,dw-apb-ictl";
reg = < 0xe0012000 0x200 >;
reg = < 0x0 0xe0012000 0x0 0x200 >;
interrupt-controller;
interrupt-parent = <&core_intc>;
interrupts = < 7 >;
};
memory {
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x00000000 0x80000000 0x20000000>;
device_type = "memory";
reg = <0x80000000 0x1b000000>; /* (512 - 32) MiB */
/* CONFIG_KERNEL_RAM_BASE_ADDRESS needs to match low mem start */
reg = <0x0 0x80000000 0x0 0x1b000000>; /* (512 - 32) MiB */
};
reserved-memory {
#address-cells = <1>;
#size-cells = <1>;
#address-cells = <2>;
#size-cells = <2>;
ranges;
/*
* We just move frame buffer area to the very end of
@@ -118,7 +116,7 @@
*/
frame_buffer: frame_buffer@9e000000 {
compatible = "shared-dma-pool";
reg = <0x9e000000 0x2000000>;
reg = <0x0 0x9e000000 0x0 0x2000000>;
no-map;
};
};

View File

@@ -14,15 +14,15 @@
/ {
compatible = "snps,arc";
#address-cells = <1>;
#size-cells = <1>;
#address-cells = <2>;
#size-cells = <2>;
cpu_card {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x00000000 0xf0000000 0x10000000>;
ranges = <0x00000000 0x0 0xf0000000 0x10000000>;
core_clk: core_clk {
#clock-cells = <0>;
@@ -94,30 +94,29 @@
mb_intc: dw-apb-ictl@0xe0012000 {
#interrupt-cells = <1>;
compatible = "snps,dw-apb-ictl";
reg = < 0xe0012000 0x200 >;
reg = < 0x0 0xe0012000 0x0 0x200 >;
interrupt-controller;
interrupt-parent = <&core_intc>;
interrupts = < 24 >;
};
memory {
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x00000000 0x80000000 0x40000000>;
device_type = "memory";
reg = <0x80000000 0x20000000>; /* 512MiB */
/* CONFIG_KERNEL_RAM_BASE_ADDRESS needs to match low mem start */
reg = <0x0 0x80000000 0x0 0x20000000 /* 512 MiB low mem */
0x1 0xc0000000 0x0 0x40000000>; /* 1 GiB highmem */
};
reserved-memory {
#address-cells = <1>;
#size-cells = <1>;
#address-cells = <2>;
#size-cells = <2>;
ranges;
/*
* Move frame buffer out of IOC aperture (0x8z-0xAz).
*/
frame_buffer: frame_buffer@be000000 {
compatible = "shared-dma-pool";
reg = <0xbe000000 0x2000000>;
reg = <0x0 0xbe000000 0x0 0x2000000>;
no-map;
};
};

View File

@@ -14,15 +14,15 @@
/ {
compatible = "snps,arc";
#address-cells = <1>;
#size-cells = <1>;
#address-cells = <2>;
#size-cells = <2>;
cpu_card {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x00000000 0xf0000000 0x10000000>;
ranges = <0x00000000 0x0 0xf0000000 0x10000000>;
core_clk: core_clk {
#clock-cells = <0>;
@@ -100,30 +100,29 @@
mb_intc: dw-apb-ictl@0xe0012000 {
#interrupt-cells = <1>;
compatible = "snps,dw-apb-ictl";
reg = < 0xe0012000 0x200 >;
reg = < 0x0 0xe0012000 0x0 0x200 >;
interrupt-controller;
interrupt-parent = <&idu_intc>;
interrupts = <0>;
};
memory {
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x00000000 0x80000000 0x40000000>;
device_type = "memory";
reg = <0x80000000 0x20000000>; /* 512MiB */
/* CONFIG_KERNEL_RAM_BASE_ADDRESS needs to match low mem start */
reg = <0x0 0x80000000 0x0 0x20000000 /* 512 MiB low mem */
0x1 0xc0000000 0x0 0x40000000>; /* 1 GiB highmem */
};
reserved-memory {
#address-cells = <1>;
#size-cells = <1>;
#address-cells = <2>;
#size-cells = <2>;
ranges;
/*
* Move frame buffer out of IOC aperture (0x8z-0xAz).
*/
frame_buffer: frame_buffer@be000000 {
compatible = "shared-dma-pool";
reg = <0xbe000000 0x2000000>;
reg = <0x0 0xbe000000 0x0 0x2000000>;
no-map;
};
};

View File

@@ -13,7 +13,7 @@
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x00000000 0xe0000000 0x10000000>;
ranges = <0x00000000 0x0 0xe0000000 0x10000000>;
interrupt-parent = <&mb_intc>;
i2sclk: i2sclk@100a0 {

View File

@@ -21,7 +21,6 @@ CONFIG_MODULES=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARC_PLAT_SIM=y
CONFIG_ISA_ARCV2=y
CONFIG_ARC_BUILTIN_DTB_NAME="haps_hs"
CONFIG_PREEMPT=y

View File

@@ -23,7 +23,6 @@ CONFIG_MODULES=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARC_PLAT_SIM=y
CONFIG_ISA_ARCV2=y
CONFIG_SMP=y
CONFIG_ARC_BUILTIN_DTB_NAME="haps_hs_idu"

View File

@@ -39,7 +39,6 @@ CONFIG_IP_PNP=y
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
# CONFIG_INET_LRO is not set
# CONFIG_INET_DIAG is not set
# CONFIG_IPV6 is not set
# CONFIG_WIRELESS is not set

View File

@@ -23,7 +23,6 @@ CONFIG_MODULES=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARC_PLAT_SIM=y
CONFIG_ARC_BUILTIN_DTB_NAME="nsim_700"
CONFIG_PREEMPT=y
# CONFIG_COMPACTION is not set

View File

@@ -26,7 +26,6 @@ CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARC_PLAT_SIM=y
CONFIG_ISA_ARCV2=y
CONFIG_ARC_BUILTIN_DTB_NAME="nsim_hs"
CONFIG_PREEMPT=y

View File

@@ -24,7 +24,6 @@ CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARC_PLAT_SIM=y
CONFIG_ISA_ARCV2=y
CONFIG_SMP=y
CONFIG_ARC_BUILTIN_DTB_NAME="nsim_hs_idu"

View File

@@ -23,7 +23,6 @@ CONFIG_MODULES=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARC_PLAT_SIM=y
CONFIG_ARC_BUILTIN_DTB_NAME="nsimosci"
# CONFIG_COMPACTION is not set
CONFIG_NET=y

View File

@@ -23,7 +23,6 @@ CONFIG_MODULES=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARC_PLAT_SIM=y
CONFIG_ISA_ARCV2=y
CONFIG_ARC_BUILTIN_DTB_NAME="nsimosci_hs"
# CONFIG_COMPACTION is not set

View File

@@ -18,7 +18,6 @@ CONFIG_MODULES=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARC_PLAT_SIM=y
CONFIG_ISA_ARCV2=y
CONFIG_SMP=y
# CONFIG_ARC_TIMERS_64BIT is not set

View File

@@ -38,7 +38,6 @@ CONFIG_IP_MULTICAST=y
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
# CONFIG_INET_LRO is not set
# CONFIG_INET_DIAG is not set
# CONFIG_IPV6 is not set
# CONFIG_WIRELESS is not set

View File

@@ -96,7 +96,9 @@ extern unsigned long perip_base, perip_end;
#define ARC_REG_SLC_FLUSH 0x904
#define ARC_REG_SLC_INVALIDATE 0x905
#define ARC_REG_SLC_RGN_START 0x914
#define ARC_REG_SLC_RGN_START1 0x915
#define ARC_REG_SLC_RGN_END 0x916
#define ARC_REG_SLC_RGN_END1 0x917
/* Bit val in SLC_CONTROL */
#define SLC_CTRL_DIS 0x001

View File

@@ -94,6 +94,8 @@ static inline int is_pae40_enabled(void)
return IS_ENABLED(CONFIG_ARC_HAS_PAE40);
}
extern int pae40_exist_but_not_enab(void);
#endif /* !__ASSEMBLY__ */
#endif

View File

@@ -75,10 +75,13 @@ void arc_init_IRQ(void)
* Set a default priority for all available interrupts to prevent
* switching of register banks if Fast IRQ and multiple register banks
* are supported by CPU.
* Also disable all IRQ lines so faulty external hardware won't
* trigger interrupt that kernel is not ready to handle.
*/
for (i = NR_EXCEPTIONS; i < irq_bcr.irqs + NR_EXCEPTIONS; i++) {
write_aux_reg(AUX_IRQ_SELECT, i);
write_aux_reg(AUX_IRQ_PRIORITY, ARCV2_IRQ_DEF_PRIO);
write_aux_reg(AUX_IRQ_ENABLE, 0);
}
/* setup status32, don't enable intr yet as kernel doesn't want */

View File

@@ -27,7 +27,7 @@
*/
void arc_init_IRQ(void)
{
int level_mask = 0;
int level_mask = 0, i;
/* Is timer high priority Interrupt (Level2 in ARCompact jargon) */
level_mask |= IS_ENABLED(CONFIG_ARC_COMPACT_IRQ_LEVELS) << TIMER0_IRQ;
@@ -40,6 +40,18 @@ void arc_init_IRQ(void)
if (level_mask)
pr_info("Level-2 interrupts bitset %x\n", level_mask);
/*
* Disable all IRQ lines so faulty external hardware won't
* trigger interrupt that kernel is not ready to handle.
*/
for (i = TIMER0_IRQ; i < NR_CPU_IRQS; i++) {
unsigned int ienb;
ienb = read_aux_reg(AUX_IENABLE);
ienb &= ~(1 << i);
write_aux_reg(AUX_IENABLE, ienb);
}
}
/*

View File

@@ -665,6 +665,7 @@ noinline void slc_op(phys_addr_t paddr, unsigned long sz, const int op)
static DEFINE_SPINLOCK(lock);
unsigned long flags;
unsigned int ctrl;
phys_addr_t end;
spin_lock_irqsave(&lock, flags);
@@ -694,8 +695,19 @@ noinline void slc_op(phys_addr_t paddr, unsigned long sz, const int op)
* END needs to be setup before START (latter triggers the operation)
* END can't be same as START, so add (l2_line_sz - 1) to sz
*/
write_aux_reg(ARC_REG_SLC_RGN_END, (paddr + sz + l2_line_sz - 1));
write_aux_reg(ARC_REG_SLC_RGN_START, paddr);
end = paddr + sz + l2_line_sz - 1;
if (is_pae40_enabled())
write_aux_reg(ARC_REG_SLC_RGN_END1, upper_32_bits(end));
write_aux_reg(ARC_REG_SLC_RGN_END, lower_32_bits(end));
if (is_pae40_enabled())
write_aux_reg(ARC_REG_SLC_RGN_START1, upper_32_bits(paddr));
write_aux_reg(ARC_REG_SLC_RGN_START, lower_32_bits(paddr));
/* Make sure "busy" bit reports correct stataus, see STAR 9001165532 */
read_aux_reg(ARC_REG_SLC_CTRL);
while (read_aux_reg(ARC_REG_SLC_CTRL) & SLC_CTRL_BUSY);
@@ -1111,6 +1123,13 @@ noinline void __init arc_ioc_setup(void)
__dc_enable();
}
/*
* Cache related boot time checks/setups only needed on master CPU:
* - Geometry checks (kernel build and hardware agree: e.g. L1_CACHE_BYTES)
* Assume SMP only, so all cores will have same cache config. A check on
* one core suffices for all
* - IOC setup / dma callbacks only need to be done once
*/
void __init arc_cache_init_master(void)
{
unsigned int __maybe_unused cpu = smp_processor_id();
@@ -1190,12 +1209,27 @@ void __ref arc_cache_init(void)
printk(arc_cache_mumbojumbo(0, str, sizeof(str)));
/*
* Only master CPU needs to execute rest of function:
* - Assume SMP so all cores will have same cache config so
* any geomtry checks will be same for all
* - IOC setup / dma callbacks only need to be setup once
*/
if (!cpu)
arc_cache_init_master();
/*
* In PAE regime, TLB and cache maintenance ops take wider addresses
* And even if PAE is not enabled in kernel, the upper 32-bits still need
* to be zeroed to keep the ops sane.
* As an optimization for more common !PAE enabled case, zero them out
* once at init, rather than checking/setting to 0 for every runtime op
*/
if (is_isa_arcv2() && pae40_exist_but_not_enab()) {
if (IS_ENABLED(CONFIG_ARC_HAS_ICACHE))
write_aux_reg(ARC_REG_IC_PTAG_HI, 0);
if (IS_ENABLED(CONFIG_ARC_HAS_DCACHE))
write_aux_reg(ARC_REG_DC_PTAG_HI, 0);
if (l2_line_sz) {
write_aux_reg(ARC_REG_SLC_RGN_END1, 0);
write_aux_reg(ARC_REG_SLC_RGN_START1, 0);
}
}
}

View File

@@ -153,6 +153,19 @@ static void _dma_cache_sync(phys_addr_t paddr, size_t size,
}
}
/*
* arc_dma_map_page - map a portion of a page for streaming DMA
*
* Ensure that any data held in the cache is appropriately discarded
* or written back.
*
* The device owns this memory once this call has completed. The CPU
* can regain ownership by calling dma_unmap_page().
*
* Note: while it takes struct page as arg, caller can "abuse" it to pass
* a region larger than PAGE_SIZE, provided it is physically contiguous
* and this still works correctly
*/
static dma_addr_t arc_dma_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size, enum dma_data_direction dir,
unsigned long attrs)
@@ -165,6 +178,24 @@ static dma_addr_t arc_dma_map_page(struct device *dev, struct page *page,
return plat_phys_to_dma(dev, paddr);
}
/*
* arc_dma_unmap_page - unmap a buffer previously mapped through dma_map_page()
*
* After this call, reads by the CPU to the buffer are guaranteed to see
* whatever the device wrote there.
*
* Note: historically this routine was not implemented for ARC
*/
static void arc_dma_unmap_page(struct device *dev, dma_addr_t handle,
size_t size, enum dma_data_direction dir,
unsigned long attrs)
{
phys_addr_t paddr = plat_dma_to_phys(dev, handle);
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
_dma_cache_sync(paddr, size, dir);
}
static int arc_dma_map_sg(struct device *dev, struct scatterlist *sg,
int nents, enum dma_data_direction dir, unsigned long attrs)
{
@@ -178,6 +209,18 @@ static int arc_dma_map_sg(struct device *dev, struct scatterlist *sg,
return nents;
}
static void arc_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
int nents, enum dma_data_direction dir,
unsigned long attrs)
{
struct scatterlist *s;
int i;
for_each_sg(sg, s, nents, i)
arc_dma_unmap_page(dev, sg_dma_address(s), sg_dma_len(s), dir,
attrs);
}
static void arc_dma_sync_single_for_cpu(struct device *dev,
dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
{
@@ -223,7 +266,9 @@ const struct dma_map_ops arc_dma_ops = {
.free = arc_dma_free,
.mmap = arc_dma_mmap,
.map_page = arc_dma_map_page,
.unmap_page = arc_dma_unmap_page,
.map_sg = arc_dma_map_sg,
.unmap_sg = arc_dma_unmap_sg,
.sync_single_for_device = arc_dma_sync_single_for_device,
.sync_single_for_cpu = arc_dma_sync_single_for_cpu,
.sync_sg_for_cpu = arc_dma_sync_sg_for_cpu,

View File

@@ -104,6 +104,8 @@
/* A copy of the ASID from the PID reg is kept in asid_cache */
DEFINE_PER_CPU(unsigned int, asid_cache) = MM_CTXT_FIRST_CYCLE;
static int __read_mostly pae_exists;
/*
* Utility Routine to erase a J-TLB entry
* Caller needs to setup Index Reg (manually or via getIndex)
@@ -784,7 +786,7 @@ void read_decode_mmu_bcr(void)
mmu->u_dtlb = mmu4->u_dtlb * 4;
mmu->u_itlb = mmu4->u_itlb * 4;
mmu->sasid = mmu4->sasid;
mmu->pae = mmu4->pae;
pae_exists = mmu->pae = mmu4->pae;
}
}
@@ -809,6 +811,11 @@ char *arc_mmu_mumbojumbo(int cpu_id, char *buf, int len)
return buf;
}
int pae40_exist_but_not_enab(void)
{
return pae_exists && !is_pae40_enabled();
}
void arc_mmu_init(void)
{
char str[256];
@@ -859,6 +866,9 @@ void arc_mmu_init(void)
/* swapper_pg_dir is the pgd for the kernel, used by vmalloc */
write_aux_reg(ARC_REG_SCRATCH_DATA0, swapper_pg_dir);
#endif
if (pae40_exist_but_not_enab())
write_aux_reg(ARC_REG_TLBPD1HI, 0);
}
/*

View File

@@ -1,13 +0,0 @@
#
# Copyright (C) 2007-2010, 2011-2012 Synopsys, Inc. (www.synopsys.com)
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 2 as
# published by the Free Software Foundation.
#
menuconfig ARC_PLAT_SIM
bool "ARC nSIM based simulation virtual platforms"
help
Support for nSIM based ARC simulation platforms
This includes the standalone nSIM (uart only) vs. System C OSCI VP

View File

@@ -20,11 +20,14 @@
*/
static const char *simulation_compat[] __initconst = {
#ifdef CONFIG_ISA_ARCOMPACT
"snps,nsim",
"snps,nsim_hs",
"snps,nsimosci",
#else
"snps,nsim_hs",
"snps,nsimosci_hs",
"snps,zebu_hs",
#endif
NULL,
};

View File

@@ -266,6 +266,7 @@
&hdmicec {
status = "okay";
needs-hpd;
};
&hsi2c_4 {

View File

@@ -297,6 +297,7 @@
#address-cells = <1>;
#size-cells = <1>;
status = "disabled";
ranges;
adc: adc@50030800 {
compatible = "fsl,imx25-gcq";

View File

@@ -507,7 +507,7 @@
pinctrl_pcie: pciegrp {
fsl,pins = <
/* PCIe reset */
MX6QDL_PAD_EIM_BCLK__GPIO6_IO31 0x030b0
MX6QDL_PAD_EIM_DA0__GPIO3_IO00 0x030b0
MX6QDL_PAD_EIM_DA4__GPIO3_IO04 0x030b0
>;
};
@@ -668,7 +668,7 @@
&pcie {
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_pcie>;
reset-gpio = <&gpio6 31 GPIO_ACTIVE_LOW>;
reset-gpio = <&gpio3 0 GPIO_ACTIVE_LOW>;
status = "okay";
};

View File

@@ -557,6 +557,14 @@
>;
};
pinctrl_spi4: spi4grp {
fsl,pins = <
MX7D_PAD_GPIO1_IO09__GPIO1_IO9 0x59
MX7D_PAD_GPIO1_IO12__GPIO1_IO12 0x59
MX7D_PAD_GPIO1_IO13__GPIO1_IO13 0x59
>;
};
pinctrl_tsc2046_pendown: tsc2046_pendown {
fsl,pins = <
MX7D_PAD_EPDC_BDR1__GPIO2_IO29 0x59
@@ -697,13 +705,5 @@
fsl,pins = <
MX7D_PAD_LPSR_GPIO1_IO01__PWM1_OUT 0x110b0
>;
pinctrl_spi4: spi4grp {
fsl,pins = <
MX7D_PAD_GPIO1_IO09__GPIO1_IO9 0x59
MX7D_PAD_GPIO1_IO12__GPIO1_IO12 0x59
MX7D_PAD_GPIO1_IO13__GPIO1_IO13 0x59
>;
};
};
};

View File

@@ -303,7 +303,7 @@
#size-cells = <1>;
atmel,smc = <&hsmc>;
reg = <0x10000000 0x10000000
0x40000000 0x30000000>;
0x60000000 0x30000000>;
ranges = <0x0 0x0 0x10000000 0x10000000
0x1 0x0 0x60000000 0x10000000
0x2 0x0 0x70000000 0x10000000
@@ -1048,18 +1048,18 @@
};
hsmc: hsmc@f8014000 {
compatible = "atmel,sama5d3-smc", "syscon", "simple-mfd";
compatible = "atmel,sama5d2-smc", "syscon", "simple-mfd";
reg = <0xf8014000 0x1000>;
interrupts = <5 IRQ_TYPE_LEVEL_HIGH 6>;
interrupts = <17 IRQ_TYPE_LEVEL_HIGH 6>;
clocks = <&hsmc_clk>;
#address-cells = <1>;
#size-cells = <1>;
ranges;
pmecc: ecc-engine@ffffc070 {
pmecc: ecc-engine@f8014070 {
compatible = "atmel,sama5d2-pmecc";
reg = <0xffffc070 0x490>,
<0xffffc500 0x100>;
reg = <0xf8014070 0x490>,
<0xf8014500 0x100>;
};
};

View File

@@ -1,7 +1,7 @@
menuconfig ARCH_AT91
bool "Atmel SoCs"
depends on ARCH_MULTI_V4T || ARCH_MULTI_V5 || ARCH_MULTI_V7 || ARM_SINGLE_ARMV7M
select ARM_CPU_SUSPEND if PM
select ARM_CPU_SUSPEND if PM && ARCH_MULTI_V7
select COMMON_CLK_AT91
select GPIOLIB
select PINCTRL

View File

@@ -608,6 +608,9 @@ static void __init at91_pm_init(void (*pm_idle)(void))
void __init at91rm9200_pm_init(void)
{
if (!IS_ENABLED(CONFIG_SOC_AT91RM9200))
return;
at91_dt_ramc();
/*
@@ -620,18 +623,27 @@ void __init at91rm9200_pm_init(void)
void __init at91sam9_pm_init(void)
{
if (!IS_ENABLED(CONFIG_SOC_AT91SAM9))
return;
at91_dt_ramc();
at91_pm_init(at91sam9_idle);
}
void __init sama5_pm_init(void)
{
if (!IS_ENABLED(CONFIG_SOC_SAMA5))
return;
at91_dt_ramc();
at91_pm_init(NULL);
}
void __init sama5d2_pm_init(void)
{
if (!IS_ENABLED(CONFIG_SOC_SAMA5D2))
return;
at91_pm_backup_init();
sama5_pm_init();
}

View File

@@ -51,6 +51,7 @@
compatible = "sinovoip,bananapi-m64", "allwinner,sun50i-a64";
aliases {
ethernet0 = &emac;
serial0 = &uart0;
serial1 = &uart1;
};

View File

@@ -51,6 +51,7 @@
compatible = "pine64,pine64", "allwinner,sun50i-a64";
aliases {
ethernet0 = &emac;
serial0 = &uart0;
serial1 = &uart1;
serial2 = &uart2;

View File

@@ -53,6 +53,7 @@
"allwinner,sun50i-a64";
aliases {
ethernet0 = &emac;
serial0 = &uart0;
};

View File

@@ -120,5 +120,8 @@
};
&pio {
interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 17 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 23 IRQ_TYPE_LEVEL_HIGH>;
compatible = "allwinner,sun50i-h5-pinctrl";
};

View File

@@ -45,7 +45,7 @@
stdout-path = "serial0:115200n8";
};
audio_clkout: audio_clkout {
audio_clkout: audio-clkout {
/*
* This is same as <&rcar_sound 0>
* but needed to avoid cs2000/rcar_sound probe dead-lock

View File

@@ -65,13 +65,13 @@ DECLARE_PER_CPU(const struct arch_timer_erratum_workaround *,
u64 _val; \
if (needs_unstable_timer_counter_workaround()) { \
const struct arch_timer_erratum_workaround *wa; \
preempt_disable(); \
preempt_disable_notrace(); \
wa = __this_cpu_read(timer_unstable_counter_workaround); \
if (wa && wa->read_##reg) \
_val = wa->read_##reg(); \
else \
_val = read_sysreg(reg); \
preempt_enable(); \
preempt_enable_notrace(); \
} else { \
_val = read_sysreg(reg); \
} \

View File

@@ -114,10 +114,10 @@
/*
* This is the base location for PIE (ET_DYN with INTERP) loads. On
* 64-bit, this is raised to 4GB to leave the entire 32-bit address
* 64-bit, this is above 4GB to leave the entire 32-bit address
* space open for things that want to use the area for 32-bit pointers.
*/
#define ELF_ET_DYN_BASE 0x100000000UL
#define ELF_ET_DYN_BASE (2 * TASK_SIZE_64 / 3)
#ifndef __ASSEMBLY__

View File

@@ -161,9 +161,11 @@ void fpsimd_flush_thread(void)
{
if (!system_supports_fpsimd())
return;
preempt_disable();
memset(&current->thread.fpsimd_state, 0, sizeof(struct fpsimd_state));
fpsimd_flush_task_state(current);
set_thread_flag(TIF_FOREIGN_FPSTATE);
preempt_enable();
}
/*

View File

@@ -354,7 +354,6 @@ __primary_switched:
tst x23, ~(MIN_KIMG_ALIGN - 1) // already running randomized?
b.ne 0f
mov x0, x21 // pass FDT address in x0
mov x1, x23 // pass modulo offset in x1
bl kaslr_early_init // parse FDT for KASLR options
cbz x0, 0f // KASLR disabled? just proceed
orr x23, x23, x0 // record KASLR offset

View File

@@ -75,7 +75,7 @@ extern void *__init __fixmap_remap_fdt(phys_addr_t dt_phys, int *size,
* containing function pointers) to be reinitialized, and zero-initialized
* .bss variables will be reset to 0.
*/
u64 __init kaslr_early_init(u64 dt_phys, u64 modulo_offset)
u64 __init kaslr_early_init(u64 dt_phys)
{
void *fdt;
u64 seed, offset, mask, module_range;
@@ -131,15 +131,17 @@ u64 __init kaslr_early_init(u64 dt_phys, u64 modulo_offset)
/*
* The kernel Image should not extend across a 1GB/32MB/512MB alignment
* boundary (for 4KB/16KB/64KB granule kernels, respectively). If this
* happens, increase the KASLR offset by the size of the kernel image
* rounded up by SWAPPER_BLOCK_SIZE.
* happens, round down the KASLR offset by (1 << SWAPPER_TABLE_SHIFT).
*
* NOTE: The references to _text and _end below will already take the
* modulo offset (the physical displacement modulo 2 MB) into
* account, given that the physical placement is controlled by
* the loader, and will not change as a result of the virtual
* mapping we choose.
*/
if ((((u64)_text + offset + modulo_offset) >> SWAPPER_TABLE_SHIFT) !=
(((u64)_end + offset + modulo_offset) >> SWAPPER_TABLE_SHIFT)) {
u64 kimg_sz = _end - _text;
offset = (offset + round_up(kimg_sz, SWAPPER_BLOCK_SIZE))
& mask;
}
if ((((u64)_text + offset) >> SWAPPER_TABLE_SHIFT) !=
(((u64)_end + offset) >> SWAPPER_TABLE_SHIFT))
offset = round_down(offset, 1 << SWAPPER_TABLE_SHIFT);
if (IS_ENABLED(CONFIG_KASAN))
/*

View File

@@ -435,8 +435,11 @@ retry:
* the mmap_sem because it would already be released
* in __lock_page_or_retry in mm/filemap.c.
*/
if (fatal_signal_pending(current))
if (fatal_signal_pending(current)) {
if (!user_mode(regs))
goto no_context;
return 0;
}
/*
* Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk of

View File

@@ -199,7 +199,7 @@ config PPC
select HAVE_OPTPROBES if PPC64
select HAVE_PERF_EVENTS
select HAVE_PERF_EVENTS_NMI if PPC64
select HAVE_HARDLOCKUP_DETECTOR_PERF if HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH
select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select HAVE_RCU_TABLE_FREE if SMP

View File

@@ -90,6 +90,24 @@ static inline void switch_mm_irqs_off(struct mm_struct *prev,
/* Mark this context has been used on the new CPU */
if (!cpumask_test_cpu(smp_processor_id(), mm_cpumask(next))) {
cpumask_set_cpu(smp_processor_id(), mm_cpumask(next));
/*
* This full barrier orders the store to the cpumask above vs
* a subsequent operation which allows this CPU to begin loading
* translations for next.
*
* When using the radix MMU that operation is the load of the
* MMU context id, which is then moved to SPRN_PID.
*
* For the hash MMU it is either the first load from slb_cache
* in switch_slb(), and/or the store of paca->mm_ctx_id in
* copy_mm_to_paca().
*
* On the read side the barrier is in pte_xchg(), which orders
* the store to the PTE vs the load of mm_cpumask.
*/
smp_mb();
new_on_cpu = true;
}

View File

@@ -87,6 +87,7 @@ static inline bool pte_xchg(pte_t *ptep, pte_t old, pte_t new)
unsigned long *p = (unsigned long *)ptep;
__be64 prev;
/* See comment in switch_mm_irqs_off() */
prev = (__force __be64)__cmpxchg_u64(p, (__force unsigned long)pte_raw(old),
(__force unsigned long)pte_raw(new));

View File

@@ -62,6 +62,7 @@ static inline bool pte_xchg(pte_t *ptep, pte_t old, pte_t new)
{
unsigned long *p = (unsigned long *)ptep;
/* See comment in switch_mm_irqs_off() */
return pte_val(old) == __cmpxchg_u64(p, pte_val(old), pte_val(new));
}
#endif

View File

@@ -362,7 +362,8 @@ void enable_kernel_vsx(void)
cpumsr = msr_check_and_set(MSR_FP|MSR_VEC|MSR_VSX);
if (current->thread.regs && (current->thread.regs->msr & MSR_VSX)) {
if (current->thread.regs &&
(current->thread.regs->msr & (MSR_VSX|MSR_VEC|MSR_FP))) {
check_if_tm_restore_required(current);
/*
* If a thread has already been reclaimed then the
@@ -386,7 +387,7 @@ void flush_vsx_to_thread(struct task_struct *tsk)
{
if (tsk->thread.regs) {
preempt_disable();
if (tsk->thread.regs->msr & MSR_VSX) {
if (tsk->thread.regs->msr & (MSR_VSX|MSR_VEC|MSR_FP)) {
BUG_ON(tsk != current);
giveup_vsx(tsk);
}

View File

@@ -294,32 +294,26 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
struct kvm_create_spapr_tce_64 *args)
{
struct kvmppc_spapr_tce_table *stt = NULL;
struct kvmppc_spapr_tce_table *siter;
unsigned long npages, size;
int ret = -ENOMEM;
int i;
int fd = -1;
if (!args->size)
return -EINVAL;
/* Check this LIOBN hasn't been previously allocated */
list_for_each_entry(stt, &kvm->arch.spapr_tce_tables, list) {
if (stt->liobn == args->liobn)
return -EBUSY;
}
size = _ALIGN_UP(args->size, PAGE_SIZE >> 3);
npages = kvmppc_tce_pages(size);
ret = kvmppc_account_memlimit(kvmppc_stt_pages(npages), true);
if (ret) {
stt = NULL;
goto fail;
}
if (ret)
return ret;
ret = -ENOMEM;
stt = kzalloc(sizeof(*stt) + npages * sizeof(struct page *),
GFP_KERNEL);
if (!stt)
goto fail;
goto fail_acct;
stt->liobn = args->liobn;
stt->page_shift = args->page_shift;
@@ -334,24 +328,42 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
goto fail;
}
kvm_get_kvm(kvm);
ret = fd = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
stt, O_RDWR | O_CLOEXEC);
if (ret < 0)
goto fail;
mutex_lock(&kvm->lock);
list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables);
/* Check this LIOBN hasn't been previously allocated */
ret = 0;
list_for_each_entry(siter, &kvm->arch.spapr_tce_tables, list) {
if (siter->liobn == args->liobn) {
ret = -EBUSY;
break;
}
}
if (!ret) {
list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables);
kvm_get_kvm(kvm);
}
mutex_unlock(&kvm->lock);
return anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
stt, O_RDWR | O_CLOEXEC);
if (!ret)
return fd;
fail:
if (stt) {
for (i = 0; i < npages; i++)
if (stt->pages[i])
__free_page(stt->pages[i]);
put_unused_fd(fd);
kfree(stt);
}
fail:
for (i = 0; i < npages; i++)
if (stt->pages[i])
__free_page(stt->pages[i]);
kfree(stt);
fail_acct:
kvmppc_account_memlimit(kvmppc_stt_pages(npages), false);
return ret;
}

View File

@@ -1291,6 +1291,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
/* Hypervisor doorbell - exit only if host IPI flag set */
cmpwi r12, BOOK3S_INTERRUPT_H_DOORBELL
bne 3f
BEGIN_FTR_SECTION
PPC_MSGSYNC
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
lbz r0, HSTATE_HOST_IPI(r13)
cmpwi r0, 0
beq 4f

View File

@@ -16,7 +16,22 @@ static void GLUE(X_PFX,ack_pending)(struct kvmppc_xive_vcpu *xc)
u8 cppr;
u16 ack;
/* XXX DD1 bug workaround: Check PIPR vs. CPPR first ! */
/*
* Ensure any previous store to CPPR is ordered vs.
* the subsequent loads from PIPR or ACK.
*/
eieio();
/*
* DD1 bug workaround: If PIPR is less favored than CPPR
* ignore the interrupt or we might incorrectly lose an IPB
* bit.
*/
if (cpu_has_feature(CPU_FTR_POWER9_DD1)) {
u8 pipr = __x_readb(__x_tima + TM_QW1_OS + TM_PIPR);
if (pipr >= xc->hw_cppr)
return;
}
/* Perform the acknowledge OS to register cycle. */
ack = be16_to_cpu(__x_readw(__x_tima + TM_SPC_ACK_OS_REG));
@@ -235,6 +250,11 @@ skip_ipi:
/*
* If we found an interrupt, adjust what the guest CPPR should
* be as if we had just fetched that interrupt from HW.
*
* Note: This can only make xc->cppr smaller as the previous
* loop will only exit with hirq != 0 if prio is lower than
* the current xc->cppr. Thus we don't need to re-check xc->mfrr
* for pending IPIs.
*/
if (hirq)
xc->cppr = prio;
@@ -380,6 +400,12 @@ X_STATIC int GLUE(X_PFX,h_cppr)(struct kvm_vcpu *vcpu, unsigned long cppr)
old_cppr = xc->cppr;
xc->cppr = cppr;
/*
* Order the above update of xc->cppr with the subsequent
* read of xc->mfrr inside push_pending_to_hw()
*/
smp_mb();
/*
* We are masking less, we need to look for pending things
* to deliver and set VP pending bits accordingly to trigger
@@ -420,21 +446,37 @@ X_STATIC int GLUE(X_PFX,h_eoi)(struct kvm_vcpu *vcpu, unsigned long xirr)
* used to signal MFRR changes is EOId when fetched from
* the queue.
*/
if (irq == XICS_IPI || irq == 0)
if (irq == XICS_IPI || irq == 0) {
/*
* This barrier orders the setting of xc->cppr vs.
* subsquent test of xc->mfrr done inside
* scan_interrupts and push_pending_to_hw
*/
smp_mb();
goto bail;
}
/* Find interrupt source */
sb = kvmppc_xive_find_source(xive, irq, &src);
if (!sb) {
pr_devel(" source not found !\n");
rc = H_PARAMETER;
/* Same as above */
smp_mb();
goto bail;
}
state = &sb->irq_state[src];
kvmppc_xive_select_irq(state, &hw_num, &xd);
state->in_eoi = true;
mb();
/*
* This barrier orders both setting of in_eoi above vs,
* subsequent test of guest_priority, and the setting
* of xc->cppr vs. subsquent test of xc->mfrr done inside
* scan_interrupts and push_pending_to_hw
*/
smp_mb();
again:
if (state->guest_priority == MASKED) {
@@ -461,6 +503,14 @@ again:
}
/*
* This barrier orders the above guest_priority check
* and spin_lock/unlock with clearing in_eoi below.
*
* It also has to be a full mb() as it must ensure
* the MMIOs done in source_eoi() are completed before
* state->in_eoi is visible.
*/
mb();
state->in_eoi = false;
bail:
@@ -495,6 +545,18 @@ X_STATIC int GLUE(X_PFX,h_ipi)(struct kvm_vcpu *vcpu, unsigned long server,
/* Locklessly write over MFRR */
xc->mfrr = mfrr;
/*
* The load of xc->cppr below and the subsequent MMIO store
* to the IPI must happen after the above mfrr update is
* globally visible so that:
*
* - Synchronize with another CPU doing an H_EOI or a H_CPPR
* updating xc->cppr then reading xc->mfrr.
*
* - The target of the IPI sees the xc->mfrr update
*/
mb();
/* Shoot the IPI if most favored than target cppr */
if (mfrr < xc->cppr)
__x_writeq(0, __x_trig_page(&xc->vp_ipi_data));

View File

@@ -394,7 +394,7 @@ static int sthyi(u64 vaddr)
"srl %[cc],28\n"
: [cc] "=d" (cc)
: [code] "d" (code), [addr] "a" (addr)
: "memory", "cc");
: "3", "memory", "cc");
return cc;
}
@@ -425,7 +425,7 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
VCPU_EVENT(vcpu, 3, "STHYI: fc: %llu addr: 0x%016llx", code, addr);
trace_kvm_s390_handle_sthyi(vcpu, code, addr);
if (reg1 == reg2 || reg1 & 1 || reg2 & 1 || addr & ~PAGE_MASK)
if (reg1 == reg2 || reg1 & 1 || reg2 & 1)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
if (code & 0xffff) {
@@ -433,6 +433,9 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
goto out;
}
if (addr & ~PAGE_MASK)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
/*
* If the page has not yet been faulted in, we want to do that
* now and not after all the expensive calculations.

View File

@@ -68,6 +68,7 @@ typedef struct { unsigned long iopgprot; } iopgprot_t;
#define iopgprot_val(x) ((x).iopgprot)
#define __pte(x) ((pte_t) { (x) } )
#define __pmd(x) ((pmd_t) { { (x) }, })
#define __iopte(x) ((iopte_t) { (x) } )
#define __pgd(x) ((pgd_t) { (x) } )
#define __ctxd(x) ((ctxd_t) { (x) } )
@@ -95,6 +96,7 @@ typedef unsigned long iopgprot_t;
#define iopgprot_val(x) (x)
#define __pte(x) (x)
#define __pmd(x) ((pmd_t) { { (x) }, })
#define __iopte(x) (x)
#define __pgd(x) (x)
#define __ctxd(x) (x)

View File

@@ -1266,8 +1266,6 @@ static int pci_sun4v_probe(struct platform_device *op)
* ATU group, but ATU hcalls won't be available.
*/
hv_atu = false;
pr_err(PFX "Could not register hvapi ATU err=%d\n",
err);
} else {
pr_info(PFX "Registered hvapi ATU major[%lu] minor[%lu]\n",
vatu_major, vatu_minor);

View File

@@ -602,7 +602,7 @@ void pcibios_fixup_bus(struct pci_bus *bus)
{
struct pci_dev *dev;
int i, has_io, has_mem;
unsigned int cmd;
unsigned int cmd = 0;
struct linux_pcic *pcic;
/* struct linux_pbm_info* pbm = &pcic->pbm; */
int node;

View File

@@ -5,26 +5,26 @@
.align 4
ENTRY(__multi3) /* %o0 = u, %o1 = v */
mov %o1, %g1
srl %o3, 0, %g4
mulx %g4, %g1, %o1
srl %o3, 0, %o4
mulx %o4, %g1, %o1
srlx %g1, 0x20, %g3
mulx %g3, %g4, %g5
sllx %g5, 0x20, %o5
srl %g1, 0, %g4
mulx %g3, %o4, %g7
sllx %g7, 0x20, %o5
srl %g1, 0, %o4
sub %o1, %o5, %o5
srlx %o5, 0x20, %o5
addcc %g5, %o5, %g5
addcc %g7, %o5, %g7
srlx %o3, 0x20, %o5
mulx %g4, %o5, %g4
mulx %o4, %o5, %o4
mulx %g3, %o5, %o5
sethi %hi(0x80000000), %g3
addcc %g5, %g4, %g5
srlx %g5, 0x20, %g5
addcc %g7, %o4, %g7
srlx %g7, 0x20, %g7
add %g3, %g3, %g3
movcc %xcc, %g0, %g3
addcc %o5, %g5, %o5
sllx %g4, 0x20, %g4
add %o1, %g4, %o1
addcc %o5, %g7, %o5
sllx %o4, 0x20, %o4
add %o1, %o4, %o1
add %o5, %g3, %g2
mulx %g1, %o2, %g1
add %g1, %g2, %g1

View File

@@ -100,6 +100,7 @@ config X86
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select GENERIC_TIME_VSYSCALL
select HARDLOCKUP_CHECK_TIMESTAMP if X86_64
select HAVE_ACPI_APEI if ACPI
select HAVE_ACPI_APEI_NMI if ACPI
select HAVE_ALIGNED_STRUCT_PAGE if SLUB
@@ -163,7 +164,7 @@ config X86
select HAVE_PCSPKR_PLATFORM
select HAVE_PERF_EVENTS
select HAVE_PERF_EVENTS_NMI
select HAVE_HARDLOCKUP_DETECTOR_PERF if HAVE_PERF_EVENTS_NMI
select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select HAVE_REGS_AND_STACK_ACCESS_API

View File

@@ -117,11 +117,10 @@
.set T1, REG_T1
.endm
#define K_BASE %r8
#define HASH_PTR %r9
#define BLOCKS_CTR %r8
#define BUFFER_PTR %r10
#define BUFFER_PTR2 %r13
#define BUFFER_END %r11
#define PRECALC_BUF %r14
#define WK_BUF %r15
@@ -205,14 +204,14 @@
* blended AVX2 and ALU instruction scheduling
* 1 vector iteration per 8 rounds
*/
vmovdqu ((i * 2) + PRECALC_OFFSET)(BUFFER_PTR), W_TMP
vmovdqu (i * 2)(BUFFER_PTR), W_TMP
.elseif ((i & 7) == 1)
vinsertf128 $1, (((i-1) * 2)+PRECALC_OFFSET)(BUFFER_PTR2),\
vinsertf128 $1, ((i-1) * 2)(BUFFER_PTR2),\
WY_TMP, WY_TMP
.elseif ((i & 7) == 2)
vpshufb YMM_SHUFB_BSWAP, WY_TMP, WY
.elseif ((i & 7) == 4)
vpaddd K_XMM(K_BASE), WY, WY_TMP
vpaddd K_XMM + K_XMM_AR(%rip), WY, WY_TMP
.elseif ((i & 7) == 7)
vmovdqu WY_TMP, PRECALC_WK(i&~7)
@@ -255,7 +254,7 @@
vpxor WY, WY_TMP, WY_TMP
.elseif ((i & 7) == 7)
vpxor WY_TMP2, WY_TMP, WY
vpaddd K_XMM(K_BASE), WY, WY_TMP
vpaddd K_XMM + K_XMM_AR(%rip), WY, WY_TMP
vmovdqu WY_TMP, PRECALC_WK(i&~7)
PRECALC_ROTATE_WY
@@ -291,7 +290,7 @@
vpsrld $30, WY, WY
vpor WY, WY_TMP, WY
.elseif ((i & 7) == 7)
vpaddd K_XMM(K_BASE), WY, WY_TMP
vpaddd K_XMM + K_XMM_AR(%rip), WY, WY_TMP
vmovdqu WY_TMP, PRECALC_WK(i&~7)
PRECALC_ROTATE_WY
@@ -446,6 +445,16 @@
.endm
/* Add constant only if (%2 > %3) condition met (uses RTA as temp)
* %1 + %2 >= %3 ? %4 : 0
*/
.macro ADD_IF_GE a, b, c, d
mov \a, RTA
add $\d, RTA
cmp $\c, \b
cmovge RTA, \a
.endm
/*
* macro implements 80 rounds of SHA-1, for multiple blocks with s/w pipelining
*/
@@ -463,13 +472,16 @@
lea (2*4*80+32)(%rsp), WK_BUF
# Precalc WK for first 2 blocks
PRECALC_OFFSET = 0
ADD_IF_GE BUFFER_PTR2, BLOCKS_CTR, 2, 64
.set i, 0
.rept 160
PRECALC i
.set i, i + 1
.endr
PRECALC_OFFSET = 128
/* Go to next block if needed */
ADD_IF_GE BUFFER_PTR, BLOCKS_CTR, 3, 128
ADD_IF_GE BUFFER_PTR2, BLOCKS_CTR, 4, 128
xchg WK_BUF, PRECALC_BUF
.align 32
@@ -479,8 +491,8 @@ _loop:
* we use K_BASE value as a signal of a last block,
* it is set below by: cmovae BUFFER_PTR, K_BASE
*/
cmp K_BASE, BUFFER_PTR
jne _begin
test BLOCKS_CTR, BLOCKS_CTR
jnz _begin
.align 32
jmp _end
.align 32
@@ -512,10 +524,10 @@ _loop0:
.set j, j+2
.endr
add $(2*64), BUFFER_PTR /* move to next odd-64-byte block */
cmp BUFFER_END, BUFFER_PTR /* is current block the last one? */
cmovae K_BASE, BUFFER_PTR /* signal the last iteration smartly */
/* Update Counter */
sub $1, BLOCKS_CTR
/* Move to the next block only if needed*/
ADD_IF_GE BUFFER_PTR, BLOCKS_CTR, 4, 128
/*
* rounds
* 60,62,64,66,68
@@ -532,8 +544,8 @@ _loop0:
UPDATE_HASH 12(HASH_PTR), D
UPDATE_HASH 16(HASH_PTR), E
cmp K_BASE, BUFFER_PTR /* is current block the last one? */
je _loop
test BLOCKS_CTR, BLOCKS_CTR
jz _loop
mov TB, B
@@ -575,10 +587,10 @@ _loop2:
.set j, j+2
.endr
add $(2*64), BUFFER_PTR2 /* move to next even-64-byte block */
cmp BUFFER_END, BUFFER_PTR2 /* is current block the last one */
cmovae K_BASE, BUFFER_PTR /* signal the last iteration smartly */
/* update counter */
sub $1, BLOCKS_CTR
/* Move to the next block only if needed*/
ADD_IF_GE BUFFER_PTR2, BLOCKS_CTR, 4, 128
jmp _loop3
_loop3:
@@ -641,19 +653,12 @@ _loop3:
avx2_zeroupper
lea K_XMM_AR(%rip), K_BASE
/* Setup initial values */
mov CTX, HASH_PTR
mov BUF, BUFFER_PTR
lea 64(BUF), BUFFER_PTR2
shl $6, CNT /* mul by 64 */
add BUF, CNT
add $64, CNT
mov CNT, BUFFER_END
cmp BUFFER_END, BUFFER_PTR2
cmovae K_BASE, BUFFER_PTR2
mov BUF, BUFFER_PTR2
mov CNT, BLOCKS_CTR
xmm_mov BSWAP_SHUFB_CTL(%rip), YMM_SHUFB_BSWAP

View File

@@ -201,7 +201,7 @@ asmlinkage void sha1_transform_avx2(u32 *digest, const char *data,
static bool avx2_usable(void)
{
if (false && avx_usable() && boot_cpu_has(X86_FEATURE_AVX2)
if (avx_usable() && boot_cpu_has(X86_FEATURE_AVX2)
&& boot_cpu_has(X86_FEATURE_BMI1)
&& boot_cpu_has(X86_FEATURE_BMI2))
return true;

View File

@@ -1211,6 +1211,8 @@ ENTRY(nmi)
* other IST entries.
*/
ASM_CLAC
/* Use %rdx as our temp variable throughout */
pushq %rdx

View File

@@ -2114,7 +2114,7 @@ static void refresh_pce(void *ignored)
load_mm_cr4(this_cpu_read(cpu_tlbstate.loaded_mm));
}
static void x86_pmu_event_mapped(struct perf_event *event)
static void x86_pmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
{
if (!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED))
return;
@@ -2129,22 +2129,20 @@ static void x86_pmu_event_mapped(struct perf_event *event)
* For now, this can't happen because all callers hold mmap_sem
* for write. If this changes, we'll need a different solution.
*/
lockdep_assert_held_exclusive(&current->mm->mmap_sem);
lockdep_assert_held_exclusive(&mm->mmap_sem);
if (atomic_inc_return(&current->mm->context.perf_rdpmc_allowed) == 1)
on_each_cpu_mask(mm_cpumask(current->mm), refresh_pce, NULL, 1);
if (atomic_inc_return(&mm->context.perf_rdpmc_allowed) == 1)
on_each_cpu_mask(mm_cpumask(mm), refresh_pce, NULL, 1);
}
static void x86_pmu_event_unmapped(struct perf_event *event)
static void x86_pmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
{
if (!current->mm)
return;
if (!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED))
return;
if (atomic_dec_and_test(&current->mm->context.perf_rdpmc_allowed))
on_each_cpu_mask(mm_cpumask(current->mm), refresh_pce, NULL, 1);
if (atomic_dec_and_test(&mm->context.perf_rdpmc_allowed))
on_each_cpu_mask(mm_cpumask(mm), refresh_pce, NULL, 1);
}
static int x86_pmu_event_idx(struct perf_event *event)

View File

@@ -69,7 +69,7 @@ struct bts_buffer {
struct bts_phys buf[0];
};
struct pmu bts_pmu;
static struct pmu bts_pmu;
static size_t buf_size(struct page *page)
{

View File

@@ -587,7 +587,7 @@ static __initconst const u64 p4_hw_cache_event_ids
* P4_CONFIG_ALIASABLE or bits for P4_PEBS_METRIC, they are
* either up to date automatically or not applicable at all.
*/
struct p4_event_alias {
static struct p4_event_alias {
u64 original;
u64 alternative;
} p4_event_aliases[] = {

View File

@@ -559,7 +559,7 @@ static struct attribute_group rapl_pmu_format_group = {
.attrs = rapl_formats_attr,
};
const struct attribute_group *rapl_attr_groups[] = {
static const struct attribute_group *rapl_attr_groups[] = {
&rapl_pmu_attr_group,
&rapl_pmu_format_group,
&rapl_pmu_events_group,

View File

@@ -721,7 +721,7 @@ static struct attribute *uncore_pmu_attrs[] = {
NULL,
};
static struct attribute_group uncore_pmu_attr_group = {
static const struct attribute_group uncore_pmu_attr_group = {
.attrs = uncore_pmu_attrs,
};

View File

@@ -272,7 +272,7 @@ static struct attribute *nhmex_uncore_ubox_formats_attr[] = {
NULL,
};
static struct attribute_group nhmex_uncore_ubox_format_group = {
static const struct attribute_group nhmex_uncore_ubox_format_group = {
.name = "format",
.attrs = nhmex_uncore_ubox_formats_attr,
};
@@ -299,7 +299,7 @@ static struct attribute *nhmex_uncore_cbox_formats_attr[] = {
NULL,
};
static struct attribute_group nhmex_uncore_cbox_format_group = {
static const struct attribute_group nhmex_uncore_cbox_format_group = {
.name = "format",
.attrs = nhmex_uncore_cbox_formats_attr,
};
@@ -407,7 +407,7 @@ static struct attribute *nhmex_uncore_bbox_formats_attr[] = {
NULL,
};
static struct attribute_group nhmex_uncore_bbox_format_group = {
static const struct attribute_group nhmex_uncore_bbox_format_group = {
.name = "format",
.attrs = nhmex_uncore_bbox_formats_attr,
};
@@ -484,7 +484,7 @@ static struct attribute *nhmex_uncore_sbox_formats_attr[] = {
NULL,
};
static struct attribute_group nhmex_uncore_sbox_format_group = {
static const struct attribute_group nhmex_uncore_sbox_format_group = {
.name = "format",
.attrs = nhmex_uncore_sbox_formats_attr,
};
@@ -898,7 +898,7 @@ static struct attribute *nhmex_uncore_mbox_formats_attr[] = {
NULL,
};
static struct attribute_group nhmex_uncore_mbox_format_group = {
static const struct attribute_group nhmex_uncore_mbox_format_group = {
.name = "format",
.attrs = nhmex_uncore_mbox_formats_attr,
};
@@ -1163,7 +1163,7 @@ static struct attribute *nhmex_uncore_rbox_formats_attr[] = {
NULL,
};
static struct attribute_group nhmex_uncore_rbox_format_group = {
static const struct attribute_group nhmex_uncore_rbox_format_group = {
.name = "format",
.attrs = nhmex_uncore_rbox_formats_attr,
};

View File

@@ -130,7 +130,7 @@ static struct attribute *snb_uncore_formats_attr[] = {
NULL,
};
static struct attribute_group snb_uncore_format_group = {
static const struct attribute_group snb_uncore_format_group = {
.name = "format",
.attrs = snb_uncore_formats_attr,
};
@@ -289,7 +289,7 @@ static struct attribute *snb_uncore_imc_formats_attr[] = {
NULL,
};
static struct attribute_group snb_uncore_imc_format_group = {
static const struct attribute_group snb_uncore_imc_format_group = {
.name = "format",
.attrs = snb_uncore_imc_formats_attr,
};
@@ -769,7 +769,7 @@ static struct attribute *nhm_uncore_formats_attr[] = {
NULL,
};
static struct attribute_group nhm_uncore_format_group = {
static const struct attribute_group nhm_uncore_format_group = {
.name = "format",
.attrs = nhm_uncore_formats_attr,
};

View File

@@ -602,27 +602,27 @@ static struct uncore_event_desc snbep_uncore_qpi_events[] = {
{ /* end: all zeroes */ },
};
static struct attribute_group snbep_uncore_format_group = {
static const struct attribute_group snbep_uncore_format_group = {
.name = "format",
.attrs = snbep_uncore_formats_attr,
};
static struct attribute_group snbep_uncore_ubox_format_group = {
static const struct attribute_group snbep_uncore_ubox_format_group = {
.name = "format",
.attrs = snbep_uncore_ubox_formats_attr,
};
static struct attribute_group snbep_uncore_cbox_format_group = {
static const struct attribute_group snbep_uncore_cbox_format_group = {
.name = "format",
.attrs = snbep_uncore_cbox_formats_attr,
};
static struct attribute_group snbep_uncore_pcu_format_group = {
static const struct attribute_group snbep_uncore_pcu_format_group = {
.name = "format",
.attrs = snbep_uncore_pcu_formats_attr,
};
static struct attribute_group snbep_uncore_qpi_format_group = {
static const struct attribute_group snbep_uncore_qpi_format_group = {
.name = "format",
.attrs = snbep_uncore_qpi_formats_attr,
};
@@ -1431,27 +1431,27 @@ static struct attribute *ivbep_uncore_qpi_formats_attr[] = {
NULL,
};
static struct attribute_group ivbep_uncore_format_group = {
static const struct attribute_group ivbep_uncore_format_group = {
.name = "format",
.attrs = ivbep_uncore_formats_attr,
};
static struct attribute_group ivbep_uncore_ubox_format_group = {
static const struct attribute_group ivbep_uncore_ubox_format_group = {
.name = "format",
.attrs = ivbep_uncore_ubox_formats_attr,
};
static struct attribute_group ivbep_uncore_cbox_format_group = {
static const struct attribute_group ivbep_uncore_cbox_format_group = {
.name = "format",
.attrs = ivbep_uncore_cbox_formats_attr,
};
static struct attribute_group ivbep_uncore_pcu_format_group = {
static const struct attribute_group ivbep_uncore_pcu_format_group = {
.name = "format",
.attrs = ivbep_uncore_pcu_formats_attr,
};
static struct attribute_group ivbep_uncore_qpi_format_group = {
static const struct attribute_group ivbep_uncore_qpi_format_group = {
.name = "format",
.attrs = ivbep_uncore_qpi_formats_attr,
};
@@ -1887,7 +1887,7 @@ static struct attribute *knl_uncore_ubox_formats_attr[] = {
NULL,
};
static struct attribute_group knl_uncore_ubox_format_group = {
static const struct attribute_group knl_uncore_ubox_format_group = {
.name = "format",
.attrs = knl_uncore_ubox_formats_attr,
};
@@ -1927,7 +1927,7 @@ static struct attribute *knl_uncore_cha_formats_attr[] = {
NULL,
};
static struct attribute_group knl_uncore_cha_format_group = {
static const struct attribute_group knl_uncore_cha_format_group = {
.name = "format",
.attrs = knl_uncore_cha_formats_attr,
};
@@ -2037,7 +2037,7 @@ static struct attribute *knl_uncore_pcu_formats_attr[] = {
NULL,
};
static struct attribute_group knl_uncore_pcu_format_group = {
static const struct attribute_group knl_uncore_pcu_format_group = {
.name = "format",
.attrs = knl_uncore_pcu_formats_attr,
};
@@ -2187,7 +2187,7 @@ static struct attribute *knl_uncore_irp_formats_attr[] = {
NULL,
};
static struct attribute_group knl_uncore_irp_format_group = {
static const struct attribute_group knl_uncore_irp_format_group = {
.name = "format",
.attrs = knl_uncore_irp_formats_attr,
};
@@ -2385,7 +2385,7 @@ static struct attribute *hswep_uncore_ubox_formats_attr[] = {
NULL,
};
static struct attribute_group hswep_uncore_ubox_format_group = {
static const struct attribute_group hswep_uncore_ubox_format_group = {
.name = "format",
.attrs = hswep_uncore_ubox_formats_attr,
};
@@ -2439,7 +2439,7 @@ static struct attribute *hswep_uncore_cbox_formats_attr[] = {
NULL,
};
static struct attribute_group hswep_uncore_cbox_format_group = {
static const struct attribute_group hswep_uncore_cbox_format_group = {
.name = "format",
.attrs = hswep_uncore_cbox_formats_attr,
};
@@ -2621,7 +2621,7 @@ static struct attribute *hswep_uncore_sbox_formats_attr[] = {
NULL,
};
static struct attribute_group hswep_uncore_sbox_format_group = {
static const struct attribute_group hswep_uncore_sbox_format_group = {
.name = "format",
.attrs = hswep_uncore_sbox_formats_attr,
};
@@ -3314,7 +3314,7 @@ static struct attribute *skx_uncore_cha_formats_attr[] = {
NULL,
};
static struct attribute_group skx_uncore_chabox_format_group = {
static const struct attribute_group skx_uncore_chabox_format_group = {
.name = "format",
.attrs = skx_uncore_cha_formats_attr,
};
@@ -3427,7 +3427,7 @@ static struct attribute *skx_uncore_iio_formats_attr[] = {
NULL,
};
static struct attribute_group skx_uncore_iio_format_group = {
static const struct attribute_group skx_uncore_iio_format_group = {
.name = "format",
.attrs = skx_uncore_iio_formats_attr,
};
@@ -3484,7 +3484,7 @@ static struct attribute *skx_uncore_formats_attr[] = {
NULL,
};
static struct attribute_group skx_uncore_format_group = {
static const struct attribute_group skx_uncore_format_group = {
.name = "format",
.attrs = skx_uncore_formats_attr,
};
@@ -3605,7 +3605,7 @@ static struct attribute *skx_upi_uncore_formats_attr[] = {
NULL,
};
static struct attribute_group skx_upi_uncore_format_group = {
static const struct attribute_group skx_upi_uncore_format_group = {
.name = "format",
.attrs = skx_upi_uncore_formats_attr,
};

View File

@@ -286,7 +286,7 @@
#define X86_FEATURE_PAUSEFILTER (15*32+10) /* filtered pause intercept */
#define X86_FEATURE_PFTHRESHOLD (15*32+12) /* pause filter threshold */
#define X86_FEATURE_AVIC (15*32+13) /* Virtual Interrupt Controller */
#define X86_FEATURE_VIRTUAL_VMLOAD_VMSAVE (15*32+15) /* Virtual VMLOAD VMSAVE */
#define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */
/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */
#define X86_FEATURE_AVX512VBMI (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/

View File

@@ -247,11 +247,11 @@ extern int force_personality32;
/*
* This is the base location for PIE (ET_DYN with INTERP) loads. On
* 64-bit, this is raised to 4GB to leave the entire 32-bit address
* 64-bit, this is above 4GB to leave the entire 32-bit address
* space open for things that want to use the area for 32-bit pointers.
*/
#define ELF_ET_DYN_BASE (mmap_is_ia32() ? 0x000400000UL : \
0x100000000UL)
(TASK_SIZE / 3 * 2))
/* This yields a mask that user programs can use to figure out what
instruction set this CPU supports. This could be done in user space,

View File

@@ -450,10 +450,10 @@ static inline int copy_fpregs_to_fpstate(struct fpu *fpu)
return 0;
}
static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate)
static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
{
if (use_xsave()) {
copy_kernel_to_xregs(&fpstate->xsave, -1);
copy_kernel_to_xregs(&fpstate->xsave, mask);
} else {
if (use_fxsr())
copy_kernel_to_fxregs(&fpstate->fxsave);
@@ -477,7 +477,7 @@ static inline void copy_kernel_to_fpregs(union fpregs_state *fpstate)
: : [addr] "m" (fpstate));
}
__copy_kernel_to_fpregs(fpstate);
__copy_kernel_to_fpregs(fpstate, -1);
}
extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);

View File

@@ -492,6 +492,7 @@ struct kvm_vcpu_arch {
unsigned long cr4;
unsigned long cr4_guest_owned_bits;
unsigned long cr8;
u32 pkru;
u32 hflags;
u64 efer;
u64 apic_base;

View File

@@ -140,9 +140,7 @@ static inline int init_new_context(struct task_struct *tsk,
mm->context.execute_only_pkey = -1;
}
#endif
init_new_context_ldt(tsk, mm);
return 0;
return init_new_context_ldt(tsk, mm);
}
static inline void destroy_context(struct mm_struct *mm)
{

View File

@@ -40,13 +40,16 @@ static void aperfmperf_snapshot_khz(void *dummy)
struct aperfmperf_sample *s = this_cpu_ptr(&samples);
ktime_t now = ktime_get();
s64 time_delta = ktime_ms_delta(now, s->time);
unsigned long flags;
/* Don't bother re-computing within the cache threshold time. */
if (time_delta < APERFMPERF_CACHE_THRESHOLD_MS)
return;
local_irq_save(flags);
rdmsrl(MSR_IA32_APERF, aperf);
rdmsrl(MSR_IA32_MPERF, mperf);
local_irq_restore(flags);
aperf_delta = aperf - s->aperf;
mperf_delta = mperf - s->mperf;

View File

@@ -122,7 +122,7 @@ static struct attribute *thermal_throttle_attrs[] = {
NULL
};
static struct attribute_group thermal_attr_group = {
static const struct attribute_group thermal_attr_group = {
.attrs = thermal_throttle_attrs,
.name = "thermal_throttle"
};

View File

@@ -561,7 +561,7 @@ static struct attribute *mc_default_attrs[] = {
NULL
};
static struct attribute_group mc_attr_group = {
static const struct attribute_group mc_attr_group = {
.attrs = mc_default_attrs,
.name = "microcode",
};
@@ -707,7 +707,7 @@ static struct attribute *cpu_root_microcode_attrs[] = {
NULL
};
static struct attribute_group cpu_root_microcode_group = {
static const struct attribute_group cpu_root_microcode_group = {
.name = "microcode",
.attrs = cpu_root_microcode_attrs,
};

View File

@@ -237,6 +237,18 @@ set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type typ
stop_machine(mtrr_rendezvous_handler, &data, cpu_online_mask);
}
static void set_mtrr_cpuslocked(unsigned int reg, unsigned long base,
unsigned long size, mtrr_type type)
{
struct set_mtrr_data data = { .smp_reg = reg,
.smp_base = base,
.smp_size = size,
.smp_type = type
};
stop_machine_cpuslocked(mtrr_rendezvous_handler, &data, cpu_online_mask);
}
static void set_mtrr_from_inactive_cpu(unsigned int reg, unsigned long base,
unsigned long size, mtrr_type type)
{
@@ -370,7 +382,7 @@ int mtrr_add_page(unsigned long base, unsigned long size,
/* Search for an empty MTRR */
i = mtrr_if->get_free_region(base, size, replace);
if (i >= 0) {
set_mtrr(i, base, size, type);
set_mtrr_cpuslocked(i, base, size, type);
if (likely(replace < 0)) {
mtrr_usage_table[i] = 1;
} else {
@@ -378,7 +390,7 @@ int mtrr_add_page(unsigned long base, unsigned long size,
if (increment)
mtrr_usage_table[i]++;
if (unlikely(replace != i)) {
set_mtrr(replace, 0, 0, 0);
set_mtrr_cpuslocked(replace, 0, 0, 0);
mtrr_usage_table[replace] = 0;
}
}
@@ -506,7 +518,7 @@ int mtrr_del_page(int reg, unsigned long base, unsigned long size)
goto out;
}
if (--mtrr_usage_table[reg] < 1)
set_mtrr(reg, 0, 0, 0);
set_mtrr_cpuslocked(reg, 0, 0, 0);
error = reg;
out:
mutex_unlock(&mtrr_mutex);

View File

@@ -53,6 +53,7 @@ void __head __startup_64(unsigned long physaddr)
pudval_t *pud;
pmdval_t *pmd, pmd_entry;
int i;
unsigned int *next_pgt_ptr;
/* Is the address too large? */
if (physaddr >> MAX_PHYSMEM_BITS)
@@ -91,9 +92,9 @@ void __head __startup_64(unsigned long physaddr)
* creates a bunch of nonsense entries but that is fine --
* it avoids problems around wraparound.
*/
pud = fixup_pointer(early_dynamic_pgts[next_early_pgt++], physaddr);
pmd = fixup_pointer(early_dynamic_pgts[next_early_pgt++], physaddr);
next_pgt_ptr = fixup_pointer(&next_early_pgt, physaddr);
pud = fixup_pointer(early_dynamic_pgts[(*next_pgt_ptr)++], physaddr);
pmd = fixup_pointer(early_dynamic_pgts[(*next_pgt_ptr)++], physaddr);
if (IS_ENABLED(CONFIG_X86_5LEVEL)) {
p4d = fixup_pointer(early_dynamic_pgts[next_early_pgt++], physaddr);

View File

@@ -55,7 +55,7 @@ static struct bin_attribute *boot_params_data_attrs[] = {
NULL,
};
static struct attribute_group boot_params_attr_group = {
static const struct attribute_group boot_params_attr_group = {
.attrs = boot_params_version_attrs,
.bin_attrs = boot_params_data_attrs,
};
@@ -202,7 +202,7 @@ static struct bin_attribute *setup_data_data_attrs[] = {
NULL,
};
static struct attribute_group setup_data_attr_group = {
static const struct attribute_group setup_data_attr_group = {
.attrs = setup_data_type_attrs,
.bin_attrs = setup_data_data_attrs,
};

View File

@@ -971,7 +971,8 @@ void common_cpu_up(unsigned int cpu, struct task_struct *idle)
* Returns zero if CPU booted OK, else error code from
* ->wakeup_secondary_cpu.
*/
static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
int *cpu0_nmi_registered)
{
volatile u32 *trampoline_status =
(volatile u32 *) __va(real_mode_header->trampoline_status);
@@ -979,7 +980,6 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
unsigned long start_ip = real_mode_header->trampoline_start;
unsigned long boot_error = 0;
int cpu0_nmi_registered = 0;
unsigned long timeout;
idle->thread.sp = (unsigned long)task_pt_regs(idle);
@@ -1035,7 +1035,7 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
boot_error = apic->wakeup_secondary_cpu(apicid, start_ip);
else
boot_error = wakeup_cpu_via_init_nmi(cpu, start_ip, apicid,
&cpu0_nmi_registered);
cpu0_nmi_registered);
if (!boot_error) {
/*
@@ -1080,12 +1080,6 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
*/
smpboot_restore_warm_reset_vector();
}
/*
* Clean up the nmi handler. Do this after the callin and callout sync
* to avoid impact of possible long unregister time.
*/
if (cpu0_nmi_registered)
unregister_nmi_handler(NMI_LOCAL, "wake_cpu0");
return boot_error;
}
@@ -1093,8 +1087,9 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
{
int apicid = apic->cpu_present_to_apicid(cpu);
int cpu0_nmi_registered = 0;
unsigned long flags;
int err;
int err, ret = 0;
WARN_ON(irqs_disabled());
@@ -1131,10 +1126,11 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
common_cpu_up(cpu, tidle);
err = do_boot_cpu(apicid, cpu, tidle);
err = do_boot_cpu(apicid, cpu, tidle, &cpu0_nmi_registered);
if (err) {
pr_err("do_boot_cpu failed(%d) to wakeup CPU#%u\n", err, cpu);
return -EIO;
ret = -EIO;
goto unreg_nmi;
}
/*
@@ -1150,7 +1146,15 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
touch_nmi_watchdog();
}
return 0;
unreg_nmi:
/*
* Clean up the nmi handler. Do this after the callin and callout sync
* to avoid impact of possible long unregister time.
*/
if (cpu0_nmi_registered)
unregister_nmi_handler(NMI_LOCAL, "wake_cpu0");
return ret;
}
/**

View File

@@ -469,7 +469,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
entry->ecx &= kvm_cpuid_7_0_ecx_x86_features;
cpuid_mask(&entry->ecx, CPUID_7_ECX);
/* PKU is not yet implemented for shadow paging. */
if (!tdp_enabled)
if (!tdp_enabled || !boot_cpu_has(X86_FEATURE_OSPKE))
entry->ecx &= ~F(PKU);
entry->edx &= kvm_cpuid_7_0_edx_x86_features;
entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);

View File

@@ -84,11 +84,6 @@ static inline u64 kvm_read_edx_eax(struct kvm_vcpu *vcpu)
| ((u64)(kvm_register_read(vcpu, VCPU_REGS_RDX) & -1u) << 32);
}
static inline u32 kvm_read_pkru(struct kvm_vcpu *vcpu)
{
return kvm_x86_ops->get_pkru(vcpu);
}
static inline void enter_guest_mode(struct kvm_vcpu *vcpu)
{
vcpu->arch.hflags |= HF_GUEST_MASK;

View File

@@ -185,7 +185,7 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
* index of the protection domain, so pte_pkey * 2 is
* is the index of the first bit for the domain.
*/
pkru_bits = (kvm_read_pkru(vcpu) >> (pte_pkey * 2)) & 3;
pkru_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
/* clear present bit, replace PFEC.RSVD with ACC_USER_MASK. */
offset = (pfec & ~1) +

View File

@@ -1100,7 +1100,7 @@ static __init int svm_hardware_setup(void)
if (vls) {
if (!npt_enabled ||
!boot_cpu_has(X86_FEATURE_VIRTUAL_VMLOAD_VMSAVE) ||
!boot_cpu_has(X86_FEATURE_V_VMSAVE_VMLOAD) ||
!IS_ENABLED(CONFIG_X86_64)) {
vls = false;
} else {
@@ -1777,11 +1777,6 @@ static void svm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
to_svm(vcpu)->vmcb->save.rflags = rflags;
}
static u32 svm_get_pkru(struct kvm_vcpu *vcpu)
{
return 0;
}
static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
{
switch (reg) {
@@ -5413,8 +5408,6 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
.get_rflags = svm_get_rflags,
.set_rflags = svm_set_rflags,
.get_pkru = svm_get_pkru,
.tlb_flush = svm_flush_tlb,
.run = svm_vcpu_run,

View File

@@ -636,8 +636,6 @@ struct vcpu_vmx {
u64 current_tsc_ratio;
bool guest_pkru_valid;
u32 guest_pkru;
u32 host_pkru;
/*
@@ -2383,11 +2381,6 @@ static void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
to_vmx(vcpu)->emulation_required = emulation_required(vcpu);
}
static u32 vmx_get_pkru(struct kvm_vcpu *vcpu)
{
return to_vmx(vcpu)->guest_pkru;
}
static u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu)
{
u32 interruptibility = vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
@@ -9020,8 +9013,10 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
vmx_set_interrupt_shadow(vcpu, 0);
if (vmx->guest_pkru_valid)
__write_pkru(vmx->guest_pkru);
if (static_cpu_has(X86_FEATURE_PKU) &&
kvm_read_cr4_bits(vcpu, X86_CR4_PKE) &&
vcpu->arch.pkru != vmx->host_pkru)
__write_pkru(vcpu->arch.pkru);
atomic_switch_perf_msrs(vmx);
debugctlmsr = get_debugctlmsr();
@@ -9169,13 +9164,11 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
* back on host, so it is safe to read guest PKRU from current
* XSAVE.
*/
if (boot_cpu_has(X86_FEATURE_OSPKE)) {
vmx->guest_pkru = __read_pkru();
if (vmx->guest_pkru != vmx->host_pkru) {
vmx->guest_pkru_valid = true;
if (static_cpu_has(X86_FEATURE_PKU) &&
kvm_read_cr4_bits(vcpu, X86_CR4_PKE)) {
vcpu->arch.pkru = __read_pkru();
if (vcpu->arch.pkru != vmx->host_pkru)
__write_pkru(vmx->host_pkru);
} else
vmx->guest_pkru_valid = false;
}
/*
@@ -11682,8 +11675,6 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
.get_rflags = vmx_get_rflags,
.set_rflags = vmx_set_rflags,
.get_pkru = vmx_get_pkru,
.tlb_flush = vmx_flush_tlb,
.run = vmx_vcpu_run,

View File

@@ -3245,7 +3245,12 @@ static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
u32 size, offset, ecx, edx;
cpuid_count(XSTATE_CPUID, index,
&size, &offset, &ecx, &edx);
memcpy(dest + offset, src, size);
if (feature == XFEATURE_MASK_PKRU)
memcpy(dest + offset, &vcpu->arch.pkru,
sizeof(vcpu->arch.pkru));
else
memcpy(dest + offset, src, size);
}
valid -= feature;
@@ -3283,7 +3288,11 @@ static void load_xsave(struct kvm_vcpu *vcpu, u8 *src)
u32 size, offset, ecx, edx;
cpuid_count(XSTATE_CPUID, index,
&size, &offset, &ecx, &edx);
memcpy(dest, src + offset, size);
if (feature == XFEATURE_MASK_PKRU)
memcpy(&vcpu->arch.pkru, src + offset,
sizeof(vcpu->arch.pkru));
else
memcpy(dest, src + offset, size);
}
valid -= feature;
@@ -7633,7 +7642,9 @@ void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
*/
vcpu->guest_fpu_loaded = 1;
__kernel_fpu_begin();
__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu.state);
/* PKRU is separately restored in kvm_x86_ops->run. */
__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu.state,
~XFEATURE_MASK_PKRU);
trace_kvm_fpu(1);
}

View File

@@ -50,8 +50,7 @@ unsigned long tasksize_64bit(void)
static unsigned long stack_maxrandom_size(unsigned long task_size)
{
unsigned long max = 0;
if ((current->flags & PF_RANDOMIZE) &&
!(current->personality & ADDR_NO_RANDOMIZE)) {
if (current->flags & PF_RANDOMIZE) {
max = (-1UL) & __STACK_RND_MASK(task_size == tasksize_32bit());
max <<= PAGE_SHIFT;
}
@@ -79,13 +78,13 @@ static int mmap_is_legacy(void)
static unsigned long arch_rnd(unsigned int rndbits)
{
if (!(current->flags & PF_RANDOMIZE))
return 0;
return (get_random_long() & ((1UL << rndbits) - 1)) << PAGE_SHIFT;
}
unsigned long arch_mmap_rnd(void)
{
if (!(current->flags & PF_RANDOMIZE))
return 0;
return arch_rnd(mmap_is_ia32() ? mmap32_rnd_bits : mmap64_rnd_bits);
}

View File

@@ -26,7 +26,7 @@
static struct bau_operations ops __ro_after_init;
/* timeouts in nanoseconds (indexed by UVH_AGING_PRESCALE_SEL urgency7 30:28) */
static int timeout_base_ns[] = {
static const int timeout_base_ns[] = {
20,
160,
1280,
@@ -1216,7 +1216,7 @@ static struct bau_pq_entry *find_another_by_swack(struct bau_pq_entry *msg,
* set a bit in the UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE register.
* Such a message must be ignored.
*/
void process_uv2_message(struct msg_desc *mdp, struct bau_control *bcp)
static void process_uv2_message(struct msg_desc *mdp, struct bau_control *bcp)
{
unsigned long mmr_image;
unsigned char swack_vec;

View File

@@ -75,6 +75,8 @@ static const char *const blk_queue_flag_name[] = {
QUEUE_FLAG_NAME(STATS),
QUEUE_FLAG_NAME(POLL_STATS),
QUEUE_FLAG_NAME(REGISTERED),
QUEUE_FLAG_NAME(SCSI_PASSTHROUGH),
QUEUE_FLAG_NAME(QUIESCED),
};
#undef QUEUE_FLAG_NAME
@@ -265,6 +267,7 @@ static const char *const cmd_flag_name[] = {
CMD_FLAG_NAME(RAHEAD),
CMD_FLAG_NAME(BACKGROUND),
CMD_FLAG_NAME(NOUNMAP),
CMD_FLAG_NAME(NOWAIT),
};
#undef CMD_FLAG_NAME

View File

@@ -36,12 +36,18 @@ int blk_mq_pci_map_queues(struct blk_mq_tag_set *set, struct pci_dev *pdev)
for (queue = 0; queue < set->nr_hw_queues; queue++) {
mask = pci_irq_get_affinity(pdev, queue);
if (!mask)
return -EINVAL;
goto fallback;
for_each_cpu(cpu, mask)
set->mq_map[cpu] = queue;
}
return 0;
fallback:
WARN_ON_ONCE(set->nr_hw_queues > 1);
for_each_possible_cpu(cpu)
set->mq_map[cpu] = 0;
return 0;
}
EXPORT_SYMBOL_GPL(blk_mq_pci_map_queues);

View File

@@ -360,12 +360,12 @@ struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op,
return ERR_PTR(ret);
rq = blk_mq_get_request(q, NULL, op, &alloc_data);
blk_queue_exit(q);
if (!rq)
return ERR_PTR(-EWOULDBLOCK);
blk_mq_put_ctx(alloc_data.ctx);
blk_queue_exit(q);
rq->__data_len = 0;
rq->__sector = (sector_t) -1;
@@ -411,12 +411,11 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
alloc_data.ctx = __blk_mq_get_ctx(q, cpu);
rq = blk_mq_get_request(q, NULL, op, &alloc_data);
blk_queue_exit(q);
if (!rq)
return ERR_PTR(-EWOULDBLOCK);
blk_queue_exit(q);
return rq;
}
EXPORT_SYMBOL_GPL(blk_mq_alloc_request_hctx);

View File

@@ -382,6 +382,14 @@ static unsigned int tg_iops_limit(struct throtl_grp *tg, int rw)
} \
} while (0)
static inline unsigned int throtl_bio_data_size(struct bio *bio)
{
/* assume it's one sector */
if (unlikely(bio_op(bio) == REQ_OP_DISCARD))
return 512;
return bio->bi_iter.bi_size;
}
static void throtl_qnode_init(struct throtl_qnode *qn, struct throtl_grp *tg)
{
INIT_LIST_HEAD(&qn->node);
@@ -934,6 +942,7 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg, struct bio *bio,
bool rw = bio_data_dir(bio);
u64 bytes_allowed, extra_bytes, tmp;
unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
unsigned int bio_size = throtl_bio_data_size(bio);
jiffy_elapsed = jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
@@ -947,14 +956,14 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg, struct bio *bio,
do_div(tmp, HZ);
bytes_allowed = tmp;
if (tg->bytes_disp[rw] + bio->bi_iter.bi_size <= bytes_allowed) {
if (tg->bytes_disp[rw] + bio_size <= bytes_allowed) {
if (wait)
*wait = 0;
return true;
}
/* Calc approx time to dispatch */
extra_bytes = tg->bytes_disp[rw] + bio->bi_iter.bi_size - bytes_allowed;
extra_bytes = tg->bytes_disp[rw] + bio_size - bytes_allowed;
jiffy_wait = div64_u64(extra_bytes * HZ, tg_bps_limit(tg, rw));
if (!jiffy_wait)
@@ -1034,11 +1043,12 @@ static bool tg_may_dispatch(struct throtl_grp *tg, struct bio *bio,
static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
{
bool rw = bio_data_dir(bio);
unsigned int bio_size = throtl_bio_data_size(bio);
/* Charge the bio to the group */
tg->bytes_disp[rw] += bio->bi_iter.bi_size;
tg->bytes_disp[rw] += bio_size;
tg->io_disp[rw]++;
tg->last_bytes_disp[rw] += bio->bi_iter.bi_size;
tg->last_bytes_disp[rw] += bio_size;
tg->last_io_disp[rw]++;
/*

View File

@@ -29,26 +29,25 @@
#include <scsi/scsi_cmnd.h>
/**
* bsg_destroy_job - routine to teardown/delete a bsg job
* bsg_teardown_job - routine to teardown a bsg job
* @job: bsg_job that is to be torn down
*/
static void bsg_destroy_job(struct kref *kref)
static void bsg_teardown_job(struct kref *kref)
{
struct bsg_job *job = container_of(kref, struct bsg_job, kref);
struct request *rq = job->req;
blk_end_request_all(rq, BLK_STS_OK);
put_device(job->dev); /* release reference for the request */
kfree(job->request_payload.sg_list);
kfree(job->reply_payload.sg_list);
kfree(job);
blk_end_request_all(rq, BLK_STS_OK);
}
void bsg_job_put(struct bsg_job *job)
{
kref_put(&job->kref, bsg_destroy_job);
kref_put(&job->kref, bsg_teardown_job);
}
EXPORT_SYMBOL_GPL(bsg_job_put);
@@ -100,7 +99,7 @@ EXPORT_SYMBOL_GPL(bsg_job_done);
*/
static void bsg_softirq_done(struct request *rq)
{
struct bsg_job *job = rq->special;
struct bsg_job *job = blk_mq_rq_to_pdu(rq);
bsg_job_put(job);
}
@@ -122,33 +121,20 @@ static int bsg_map_buffer(struct bsg_buffer *buf, struct request *req)
}
/**
* bsg_create_job - create the bsg_job structure for the bsg request
* bsg_prepare_job - create the bsg_job structure for the bsg request
* @dev: device that is being sent the bsg request
* @req: BSG request that needs a job structure
*/
static int bsg_create_job(struct device *dev, struct request *req)
static int bsg_prepare_job(struct device *dev, struct request *req)
{
struct request *rsp = req->next_rq;
struct request_queue *q = req->q;
struct scsi_request *rq = scsi_req(req);
struct bsg_job *job;
struct bsg_job *job = blk_mq_rq_to_pdu(req);
int ret;
BUG_ON(req->special);
job = kzalloc(sizeof(struct bsg_job) + q->bsg_job_size, GFP_KERNEL);
if (!job)
return -ENOMEM;
req->special = job;
job->req = req;
if (q->bsg_job_size)
job->dd_data = (void *)&job[1];
job->request = rq->cmd;
job->request_len = rq->cmd_len;
job->reply = rq->sense;
job->reply_len = SCSI_SENSE_BUFFERSIZE; /* Size of sense buffer
* allocated */
if (req->bio) {
ret = bsg_map_buffer(&job->request_payload, req);
if (ret)
@@ -187,7 +173,6 @@ static void bsg_request_fn(struct request_queue *q)
{
struct device *dev = q->queuedata;
struct request *req;
struct bsg_job *job;
int ret;
if (!get_device(dev))
@@ -199,7 +184,7 @@ static void bsg_request_fn(struct request_queue *q)
break;
spin_unlock_irq(q->queue_lock);
ret = bsg_create_job(dev, req);
ret = bsg_prepare_job(dev, req);
if (ret) {
scsi_req(req)->result = ret;
blk_end_request_all(req, BLK_STS_OK);
@@ -207,8 +192,7 @@ static void bsg_request_fn(struct request_queue *q)
continue;
}
job = req->special;
ret = q->bsg_job_fn(job);
ret = q->bsg_job_fn(blk_mq_rq_to_pdu(req));
spin_lock_irq(q->queue_lock);
if (ret)
break;
@@ -219,6 +203,35 @@ static void bsg_request_fn(struct request_queue *q)
spin_lock_irq(q->queue_lock);
}
static int bsg_init_rq(struct request_queue *q, struct request *req, gfp_t gfp)
{
struct bsg_job *job = blk_mq_rq_to_pdu(req);
struct scsi_request *sreq = &job->sreq;
memset(job, 0, sizeof(*job));
scsi_req_init(sreq);
sreq->sense_len = SCSI_SENSE_BUFFERSIZE;
sreq->sense = kzalloc(sreq->sense_len, gfp);
if (!sreq->sense)
return -ENOMEM;
job->req = req;
job->reply = sreq->sense;
job->reply_len = sreq->sense_len;
job->dd_data = job + 1;
return 0;
}
static void bsg_exit_rq(struct request_queue *q, struct request *req)
{
struct bsg_job *job = blk_mq_rq_to_pdu(req);
struct scsi_request *sreq = &job->sreq;
kfree(sreq->sense);
}
/**
* bsg_setup_queue - Create and add the bsg hooks so we can receive requests
* @dev: device to attach bsg device to
@@ -235,7 +248,9 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
q = blk_alloc_queue(GFP_KERNEL);
if (!q)
return ERR_PTR(-ENOMEM);
q->cmd_size = sizeof(struct scsi_request);
q->cmd_size = sizeof(struct bsg_job) + dd_job_size;
q->init_rq_fn = bsg_init_rq;
q->exit_rq_fn = bsg_exit_rq;
q->request_fn = bsg_request_fn;
ret = blk_init_allocated_queue(q);
@@ -243,7 +258,6 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
goto out_cleanup_queue;
q->queuedata = dev;
q->bsg_job_size = dd_job_size;
q->bsg_job_fn = job_fn;
queue_flag_set_unlocked(QUEUE_FLAG_BIDI, q);
queue_flag_set_unlocked(QUEUE_FLAG_SCSI_PASSTHROUGH, q);

View File

@@ -100,9 +100,13 @@ acpi_evaluate_object_typed(acpi_handle handle,
free_buffer_on_error = TRUE;
}
status = acpi_get_handle(handle, pathname, &target_handle);
if (ACPI_FAILURE(status)) {
return_ACPI_STATUS(status);
if (pathname) {
status = acpi_get_handle(handle, pathname, &target_handle);
if (ACPI_FAILURE(status)) {
return_ACPI_STATUS(status);
}
} else {
target_handle = handle;
}
full_pathname = acpi_ns_get_external_pathname(target_handle);

View File

@@ -1741,7 +1741,7 @@ error:
* functioning ECDT EC first in order to handle the events.
* https://bugzilla.kernel.org/show_bug.cgi?id=115021
*/
int __init acpi_ec_ecdt_start(void)
static int __init acpi_ec_ecdt_start(void)
{
acpi_handle handle;
@@ -2003,20 +2003,17 @@ static inline void acpi_ec_query_exit(void)
int __init acpi_ec_init(void)
{
int result;
int ecdt_fail, dsdt_fail;
/* register workqueue for _Qxx evaluations */
result = acpi_ec_query_init();
if (result)
goto err_exit;
/* Now register the driver for the EC */
result = acpi_bus_register_driver(&acpi_ec_driver);
if (result)
goto err_exit;
return result;
err_exit:
if (result)
acpi_ec_query_exit();
return result;
/* Drivers must be started after acpi_ec_query_init() */
ecdt_fail = acpi_ec_ecdt_start();
dsdt_fail = acpi_bus_register_driver(&acpi_ec_driver);
return ecdt_fail && dsdt_fail ? -ENODEV : 0;
}
/* EC driver currently not unloadable */

Some files were not shown because too many files have changed in this diff Show More