Compare commits

..

216 Commits

Author SHA1 Message Date
Linus Torvalds
cfaf025112 Linux 3.5-rc2 2012-06-08 18:40:09 -07:00
David Rientjes
1e11ad8dc4 mm, oom: fix badness score underflow
If the privileges given to root threads (3% of allowable memory) or a
negative value of /proc/pid/oom_score_adj happen to exceed the amount of
rss of a thread, its badness score underflows as a result of commit
a7f638f999 ("mm, oom: normalize oom scores to oom_score_adj scale only
for userspace").

Fix this by making the type signed and returning 1, meaning the thread
is still eligible for kill, if the value is negative.

Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-08 15:07:35 -07:00
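
A standalone sketch of the arithmetic being fixed (illustrative values only,
not the kernel's actual badness formula): with an unsigned score, subtracting
a bonus larger than the thread's rss wraps around to a huge value; with a
signed type the result can be tested and clamped to 1.

    #include <stdio.h>

    int main(void)
    {
            unsigned long points = 100;              /* tiny rss */
            unsigned long bonus  = 300;              /* e.g. the 3% root bonus */

            unsigned long broken = points - bonus;   /* wraps to a huge score */
            long ok = (long)points - (long)bonus;    /* signed: can go negative */

            printf("unsigned: %lu\n", broken);
            printf("signed:   %ld -> badness %d\n", ok, ok < 0 ? 1 : (int)ok);
            return 0;
    }
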
Linus Torvalds
7249450449 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar.

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix the relax_domain_level boot parameter
  sched: Validate assumptions in sched_init_numa()
  sched: Always initialize cpu-power
  sched: Fix domain iteration
  sched/rt: Fix lockdep annotation within find_lock_lowest_rq()
  sched/numa: Load balance between remote nodes
  sched/x86: Calculate booted cores after construction of sibling_mask
2012-06-08 14:59:29 -07:00
Randy Dunlap
cd96891d48 sched/fair: fix lots of kernel-doc warnings
Fix lots of new kernel-doc warnings in kernel/sched/fair.c:

  Warning(kernel/sched/fair.c:3625): No description found for parameter 'env'
  Warning(kernel/sched/fair.c:3625): Excess function parameter 'sd' description in 'update_sg_lb_stats'
  Warning(kernel/sched/fair.c:3735): No description found for parameter 'env'
  Warning(kernel/sched/fair.c:3735): Excess function parameter 'sd' description in 'update_sd_pick_busiest'
  Warning(kernel/sched/fair.c:3735): Excess function parameter 'this_cpu' description in 'update_sd_pick_busiest'
  .. more warnings

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-08 14:59:10 -07:00
Linus Torvalds
8f53369b75 Revert "drm/i915/crt: Do not rely upon the HPD presence pin"
This reverts commit 9e612a008f.

It incorrectly finds VGA connectors where none are attached, apparently
not noticing that nothing replied to the EDID queries, and happily using
the default EDID modes that have nothing to do with actual hardware.

That in turn causes X to fall back to the lowest common denominator,
which is usually the default 1024x768 mode that is in the default EDID
(and that pretty much anything supports).

I'd suggest that if not relying on the HPD pin, the code should at least
check whether it gets valid EDID data back, rather than just assume
there's something on the VGA connector.

Cc: Dave Airlie <airlied@linux.ie>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-08 14:53:06 -07:00
Linus Torvalds
77249539cd Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 bug fixes from Theodore Ts'o:
 "This update contains two bug fixes, both destined for the stable tree.
  Perhaps the most important is one which fixes ext4 when used with file
  systems originally formatted for use with ext3, but then later
  converted to take advantage of ext4."

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: don't set i_flags in EXT4_IOC_SETFLAGS
  ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg
2012-06-08 11:15:31 -07:00
Linus Torvalds
3e9ca02241 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
Pull powerpc fixes from Paul Mackerras:
 "Two small fixes for powerpc:
   - a fix for a regression since 3.2 that causes 4-second (or longer)
     pauses
   - a fix for a potential oops when loading kernel modules on 32-bit
     embedded systems."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  powerpc: Fix kernel panic during kernel module load
  powerpc/time: Sanity check of decrementer expiration is necessary
2012-06-08 11:06:01 -07:00
Linus Torvalds
e72643088f Merge tag 'upstream-3.5-rc2' of git://git.infradead.org/linux-ubifs
Pull UBI/UBIFS fixes from Artem Bityutskiy:
 "Fix UBI and UBIFS - they refuse to work without debugfs.  This was
  broken by the 3.5-rc1 UBI/UBIFS changes when we removed the debugging
  Kconfig switches.

  Also, correct locking in 'ubi_wl_flush()' - it was extended to support
  flushing a specific LEB in 3.5-rc1, and the locking was sub-optimal."

* tag 'upstream-3.5-rc2' of git://git.infradead.org/linux-ubifs:
  UBI: correct ubi_wl_flush locking
  UBIFS: fix debugfs-less systems support
  UBI: fix debugfs-less systems support
2012-06-08 11:04:06 -07:00
Linus Torvalds
32ba9c3fca Revert "vfs: stop d_splice_alias creating directory aliases"
This reverts commit 7732a557b1 (and commit
3f50fff4da, which was a follow-up
cleanup).

We're chasing an elusive bug that Dave Jones can apparently reproduce
using his system call fuzzer tool, and that looks like some kind of
locking ordering problem on the directory i_mutex chain.  Our i_mutex
locking is rather complex, and depends on the topological ordering of
the directories, which is why we have been very wary of splicing
directory entries around.

Of course, we really don't want to ever see aliased unconnected
directories anyway, so none of this should ever happen, but this revert
aims to basically get us back to a known older state.

Bruce points to some of the previous discussion at

       http://marc.info/?i=<20110310105821.GE22723@ZenIV.linux.org.uk>

and in particular a long post from Neil:

       http://marc.info/?i=<20110311150749.2fa2be66@notabene.brown>

It should be noted that it's possible that Dave's problems come from
other changes altogether, including possibly just the fact that Dave is
constantly teaching his fuzzer new tricks.  So what appears to be a new
bug could in fact be an old one that just gets newly triggered, but
reverting these patches as "still under heavy discussion" is the right
thing regardless.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-08 10:34:03 -07:00
Linus Torvalds
0b35d326f8 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/nmi: Fix section mismatch warnings on 32-bit
  x86/uv: Fix UV2 BAU legacy mode
  x86/mm: Only add extra pages count for the first memory range during pre-allocation early page table space
  x86, efi stub: Add .reloc section back into image
  x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs
  x86/reboot: Fix a warning message triggered by stop_other_cpus()
  x86/intel/moorestown: Change intel_scu_devices_create() to __devinit
  x86/numa: Set numa_nodes_parsed at acpi_numa_memory_affinity_init()
  x86/gart: Fix kmemleak warning
  x86: mce: Add the dropped timer interval init back
  x86/mce: Fix the MCE poll timer logic
2012-06-08 09:26:55 -07:00
Linus Torvalds
106544d81d Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "A bit larger than what I'd wish for - half of it is due to hw driver
  updates to Intel Ivy-Bridge which info got recently released,
  cycles:pp should work there now too, amongst other things.  (but we
  are generally making exceptions for hardware enablement of this type.)

  There are also callchain fixes in it - responding to mostly
  theoretical (but valid) concerns.  The tooling side sports perf.data
  endianness/portability fixes which did not make it for the merge
  window - and various other fixes as well."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
  perf/x86: Check user address explicitly in copy_from_user_nmi()
  perf/x86: Check if user fp is valid
  perf: Limit callchains to 127
  perf/x86: Allow multiple stacks
  perf/x86: Update SNB PEBS constraints
  perf/x86: Enable/Add IvyBridge hardware support
  perf/x86: Implement cycles:p for SNB/IVB
  perf/x86: Fix Intel shared extra MSR allocation
  x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix
  perf: Remove duplicate invocation on perf_event_for_each
  perf uprobes: Remove unnecessary check before strlist__delete
  perf symbols: Check for valid dso before creating map
  perf evsel: Fix 32 bit values endianity swap for sample_id_all header
  perf session: Handle endianity swap on sample_id_all header data
  perf symbols: Handle different endians properly during symbol load
  perf evlist: Pass third argument to ioctl explicitly
  perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP
  perf tools: Make --version show kernel version instead of pull req tag
  perf tools: Check if callchain is corrupted
  perf callchain: Make callchain cursors TLS
  ...
2012-06-08 09:14:46 -07:00
Linus Torvalds
03d8f54082 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm intel and exynos fixes from Dave Airlie:
 "A bunch of fixes for Intel and exynos, nothing too major, a new intel
  PCI ID, and a fix for CRT detection."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: pch_irq_handler -> {ibx, cpt}_irq_handler
  char/agp: add another Ironlake host bridge
  drm/i915: fix up ivb plane 3 pageflips
  drm/exynos: fixed blending for hdmi graphic layer
  drm/exynos: Remove dummy encoder get_crtc operation implementation
  drm/exynos: Keep a reference to frame buffer GEM objects
  drm/exynos: Don't cast GEM object to Exynos GEM object when not needed
  drm/exynos: DRIVER_BUS_PLATFORM is not a driver feature
  drm/exynos: fixed size type.
  drm/exynos: Use DRM_FORMAT_{NV12, YUV420} instead of DRM_FORMAT_{NV12M, YUV420M}
  drm/i915: hold forcewake around ring hw init
  drm/i915: Mark the ringbuffers as being in the GTT domain
  drm/i915/crt: Do not rely upon the HPD presence pin
  drm/i915: Reset last_retired_head when resetting ring
2012-06-08 09:12:21 -07:00
Linus Torvalds
b1e25f4109 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull leap second timer fix from Thomas Gleixner.

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond
2012-06-08 09:11:33 -07:00
Linus Torvalds
857505fae8 Merge tag 'moduleparam-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
Pull minor module param fixes from Rusty Russell:
 "One bugfix for multiple moduleparam levels, one removal of overzealous
  printk."

* tag 'moduleparam-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  init: Drop initcall level output
  module_param: stop double-calling parameters.
2012-06-08 09:10:35 -07:00
Don Zickus
eeaaa96a3a x86/nmi: Fix section mismatch warnings on 32-bit
It was reported that compiling for 32-bit caused a bunch of
section mismatch warnings:

 VDSOSYM arch/x86/vdso/vdso32-syms.lds
  LD      arch/x86/vdso/built-in.o
  LD      arch/x86/built-in.o

 WARNING: arch/x86/built-in.o(.data+0x5af0): Section mismatch in
 reference from the variable test_nmi_ipi_callback_na.10451 to
 the function .init.text:test_nmi_ipi_callback() [...]

 WARNING: arch/x86/built-in.o(.data+0x5b04): Section mismatch in
 reference from the variable nmi_unk_cb_na.10399 to the function
 .init.text:nmi_unk_cb() The variable nmi_unk_cb_na.10399
 references the function __init nmi_unk_cb() [...]

Both of these are attributed to the internal representation of
the nmiaction struct created during register_nmi_handler.  The
reason for this is that those structs are not defined in the
init section whereas the rest of the code in nmi_selftest.c is.

To resolve this, I created a new #define,
register_nmi_handler_initonly, that tags the struct as
__initdata to resolve the mismatch.  This #define should only be
used in rare situations where the register/unregister is called
during init of the kernel.

Big thanks to Jan Beulich for decoding this for me as I didn't
have a clue what was going on.

Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Tested-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1338991542-23000-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 12:19:27 +02:00
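
A sketch of what such a define looks like, modeled on the existing
register_nmi_handler() macro (the field names and helper are assumptions,
not a verbatim copy of the patch):

    /* like register_nmi_handler(), but the nmiaction struct itself is
     * placed in .init.data, matching the __init handler it points at */
    #define register_nmi_handler_initonly(t, fn, fl, n)     \
    ({                                                      \
            static struct nmiaction fn##_na __initdata = {  \
                    .handler = (fn),                        \
                    .name    = (n),                         \
                    .flags   = (fl),                        \
            };                                              \
            __register_nmi_handler((t), &fn##_na);          \
    })
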
Steffen Rumler
3c75296562 powerpc: Fix kernel panic during kernel module load
This fixes a problem which can cause kernel oopses while loading
a kernel module.

According to the PowerPC EABI specification, GPR r11 is assigned
the dedicated function to point to the previous stack frame.
In the powerpc-specific kernel module loader, do_plt_call()
(in arch/powerpc/kernel/module_32.c), GPR r11 is also used
to generate trampoline code.

This combination crashes the kernel, in the case where the compiler
chooses to use a helper function for saving GPRs on entry, and the
module loader has placed the .init.text section far away from the
.text section, meaning that it has to generate a trampoline for
functions in the .init.text section to call the GPR save helper.
Because the trampoline trashes r11, references to the stack frame
using r11 can cause an oops.

The fix just uses GPR r12 instead of GPR r11 for generating the
trampoline code.  According to the statements from Freescale, this is
safe from an EABI perspective.

I've tested the fix for kernel 2.6.33 on MPC8541.

Cc: stable@vger.kernel.org
Signed-off-by: Steffen Rumler <steffen.rumler.ext@nsn.com>
[paulus@samba.org: reworded the description]
Signed-off-by: Paul Mackerras <paulus@samba.org>
2012-06-08 19:59:08 +10:00
Cliff Wickman
d5d2d2eea8 x86/uv: Fix UV2 BAU legacy mode
The SGI Altix UV2 BAU (Broadcast Assist Unit) as used for
tlb-shootdown (selective broadcast mode) always uses UV2
broadcast descriptor format. There is no need to clear the
'legacy' (UV1) mode, because the hardware always uses UV2 mode
for selective broadcast.

But the BIOS uses general broadcast and legacy mode, and the
hardware pays attention to the legacy mode bit for general
broadcast. So the kernel must not clear that mode bit.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/E1SccoO-0002Lh-Cb@eag09.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 11:48:28 +02:00
Yinghai Lu
bd2753b2dd x86/mm: Only add extra pages count for the first memory range during pre-allocation early page table space
Robin found this regression:

| I just tried to boot an 8TB system.  It fails very early in boot with:
| Kernel panic - not syncing: Cannot find space for the kernel page tables

git bisect points to commit 722bc6b167.

A git revert of that commit does boot past that point on the 8TB
configuration.

That commit adds up extra pages for every memory range, even those
above 4g.

Limit that extra page count to the first entry only.

Bisected-by: Robin Holt <holt@sgi.com>
Tested-by: Robin Holt <holt@sgi.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/CAE9FiQUj3wyzQxtq9yzBNc9u220p8JZ1FYHG7t%3DMOzJ%3D9BZMYA@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 11:40:50 +02:00
Dave Airlie
2d5c7cd35f Merge branch 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung into drm-fixes
* 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung:
  drm/exynos: fixed blending for hdmi graphic layer
  drm/exynos: Remove dummy encoder get_crtc operation implementation
  drm/exynos: Keep a reference to frame buffer GEM objects
  drm/exynos: Don't cast GEM object to Exynos GEM object when not needed
  drm/exynos: DRIVER_BUS_PLATFORM is not a driver feature
  drm/exynos: fixed size type.
  drm/exynos: Use DRM_FORMAT_{NV12, YUV420} instead of DRM_FORMAT_{NV12M, YUV420M}
2012-06-08 09:42:51 +01:00
Dave Airlie
6cf98d6ebb Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
  drm/i915: pch_irq_handler -> {ibx, cpt}_irq_handler
  char/agp: add another Ironlake host bridge
  drm/i915: fix up ivb plane 3 pageflips
  drm/i915: hold forcewake around ring hw init
  drm/i915: Mark the ringbuffers as being in the GTT domain
  drm/i915/crt: Do not rely upon the HPD presence pin
  drm/i915: Reset last_retired_head when resetting ring
2012-06-08 09:42:35 +01:00
Borislav Petkov
19efb72fdc init: Drop initcall level output
9fb48c744b ("params: add 3rd arg to option handler callback
signature") added similar lines to dmesg:

initlevel:0=early, 4 registered initcalls
initlevel:1=core, 31 registered initcalls
initlevel:2=postcore, 11 registered initcalls
initlevel:3=arch, 7 registered initcalls
initlevel:4=subsys, 40 registered initcalls
initlevel:5=fs, 30 registered initcalls
initlevel:6=device, 250 registered initcalls
initlevel:7=late, 35 registered initcalls

but they don't contain any info for the general user staring at dmesg.
I'm very doubtful the count of initcalls registered per level helps
anyone so drop that output completely.

Cc: Jim Cromie <jim.cromie@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jason Baron <jbaron@redhat.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-06-08 14:58:14 +09:30
Rusty Russell
ae82fdb140 module_param: stop double-calling parameters.
Commit 026cee0086 "params:
<level>_initcall-like kernel parameters" set old-style module
parameters to level 0.  And we call those level 0 calls where we used
to, early in start_kernel().

We also loop through the initcall levels and call the levelled
module_params before the corresponding initcall.  Unfortunately level
0 is early_init(), so we call the standard module_param calls twice.

(Turns out most things don't care, but at least ubi.mtd does).

Change the level to -1 for standard module_param calls.

Reported-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2012-06-08 14:58:13 +09:30
Paul Mackerras
860aed25a1 powerpc/time: Sanity check of decrementer expiration is necessary
This reverts 68568add2c ("powerpc/time: Remove unnecessary sanity check
of decrementer expiration").  We do need to check whether we have reached
the expiration time of the next event, because we sometimes get an early
decrementer interrupt, most notably when we set the decrementer to 1 in
arch_irq_work_raise().  The effect of not having the sanity check is that
if timer_interrupt() gets called early, we leave the decrementer set to
its maximum value, which means we then don't get any more decrementer
interrupts for about 4 seconds (or longer, depending on timebase
frequency).  I saw these pauses as a consequence of getting a stray
hypervisor decrementer interrupt left over from exiting a KVM guest.

This isn't quite a straight revert because of changes to the surrounding
code, but it restores the same algorithm as was previously used.

Cc: stable@vger.kernel.org
Acked-by: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2012-06-08 14:07:35 +10:00
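
Roughly, the restored check in timer_interrupt() has this shape (a
simplified sketch of the old algorithm, not the exact patch):

    now = get_tb_or_rtc();
    if (now >= *next_tb) {
            /* the next event really is due: run the clockevent handler */
            *next_tb = ~(u64)0;
            if (evt->event_handler)
                    evt->event_handler(evt);
    } else {
            /* early or stray interrupt: re-arm the decrementer for the
             * remaining time instead of leaving it at its maximum value */
            now = *next_tb - now;
            if (now <= DECREMENTER_MAX)
                    set_dec((int)now);
    }
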
Linus Torvalds
48d212a2ee Revert "mm: correctly synchronize rss-counters at exit/exec"
This reverts commit 40af1bbdca.

It's horribly and utterly broken for at least the following reasons:

 - calling sync_mm_rss() from mmput() is fundamentally wrong, because
   there's absolutely no reason to believe that the task that does the
   mmput() always does it on its own VM.  Example: fork, ptrace, /proc -
   you name it.

 - calling it *after* having done mmdrop() on it is doubly insane, since
   the mm struct may well be gone now.

 - testing mm against NULL before you call it is insane too, since a
NULL mm there would have caused oopses long before.

.. and those are just the three bugs I found before I decided to give up
looking for more and revert it asap.  I should have caught it before I
even took it, but I trusted Andrew too much.

Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 17:54:07 -07:00
Tao Ma
b22b1f178f ext4: don't set i_flags in EXT4_IOC_SETFLAGS
Commit 7990696 uses the ext4_{set,clear}_inode_flags() functions to
change the i_flags automatically but fails to remove the erroneous
direct setting of i_flags.  So we still have the problem of trashing
state flags.  Fix this by removing the assignment.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
2012-06-07 19:04:19 -04:00
Theodore Ts'o
b0dd6b70f0 ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg
Ext3 filesystems that are converted to use as many ext4 file system
features as possible will enable uninit_bg to speed up e2fsck times.
These file systems will have a native ext3 layout of inode tables and
block allocation bitmaps (as opposed to ext4's flex_bg layout).
Unfortunately, in these cases, when first allocating a block in an
uninitialized block group, ext4 would incorrectly calculate the number
of free blocks in that block group, and then erroneously report that
the file system was corrupt:

EXT4-fs error (device vdd): ext4_mb_generate_buddy:741: group 30, 32254 clusters in bitmap, 32258 in gd

This problem can be reproduced via:

    mke2fs -q -t ext4 -O ^flex_bg /dev/vdd 5g
    mount -t ext4 /dev/vdd /mnt
    fallocate -l 4600m /mnt/test

The problem was caused by a bone headed mistake in the check to see if a
particular metadata block was part of the block group.

Many thanks to Kees Cook for finding and bisecting the buggy commit
which introduced this bug (commit fd034a84e1, present since v3.2).

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: stable@kernel.org
2012-06-07 18:56:06 -04:00
Linus Torvalds
46edaedaf3 Merge branch 'akpm' (Andrew's fixups)
Merge random fixes from Andrew Morton.

* emailed from Andrew Morton <akpm@linux-foundation.org>: (11 patches)
  mm: correctly synchronize rss-counters at exit/exec
  btree: catch NULL value before it does harm
  btree: fix tree corruption in btree_get_prev()
  ipc: shm: restore MADV_REMOVE functionality on shared memory segments
  drivers/platform/x86/acerhdf.c: correct Boris' mail address
  c/r: prctl: drop VMA flags test on PR_SET_MM_ stack data assignment
  c/r: prctl: add ability to get clear_tid_address
  c/r: prctl: add minimal address test to PR_SET_MM
  c/r: prctl: update prctl_set_mm_exe_file() after mm->num_exe_file_vmas removal
  MAINTAINERS: whitespace fixes
  shmem: replace_page must flush_dcache and others
2012-06-07 15:05:43 -07:00
Konstantin Khlebnikov
40af1bbdca mm: correctly synchronize rss-counters at exit/exec
mm->rss_stat counters have a per-task delta: task->rss_stat.  Before
changing the task->mm pointer the kernel must flush this delta with
sync_mm_rss().

do_exit() already calls sync_mm_rss() to flush the rss-counters before
committing the rss statistics into task->signal->maxrss, taskstats,
audit and other stuff.  Unfortunately the kernel does this before
calling mm_release(), which can call put_user() for processing
task->clear_child_tid.  So at this point we can trigger page-faults and
task->rss_stat becomes non-zero again.  As a result mm->rss_stat becomes
inconsistent and check_mm() will print something like this:

| BUG: Bad rss-counter state mm:ffff88020813c380 idx:1 val:-1
| BUG: Bad rss-counter state mm:ffff88020813c380 idx:2 val:1

This patch moves sync_mm_rss() into mm_release(), and moves mm_release()
out of do_exit() and calls it earlier.  After mm_release() there should
be no pagefaults.

[akpm@linux-foundation.org: tweak comment]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>		[3.4.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:55 -07:00
Joern Engel
39caa0916e btree: catch NULL value before it does harm
Storing NULL values in the btree is illegal and can lead to memory
corruption and possible other fun as well.  Catch it on insert, instead
of waiting for the inevitable.

Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:55 -07:00
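
A minimal sketch of the kind of guard added on the insert path (assuming
lib/btree.c's btree_insert() signature):

    int btree_insert(struct btree_head *head, struct btree_geo *geo,
                     unsigned long *key, void *val, gfp_t gfp)
    {
            BUG_ON(!val);   /* a NULL value would corrupt later lookups */
            return btree_insert_level(head, geo, key, val, 1, gfp);
    }
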
Roland Dreier
cbf8ae32f6 btree: fix tree corruption in btree_get_prev()
The memory the parameter __key points to is used as an iterator in
btree_get_prev(), so if we save off a bkey() pointer in retry_key and
then assign that to __key, we'll end up corrupting the btree internals
when we do eg

	longcpy(__key, bkey(geo, node, i), geo->keylen);

to return the key value.  What we should do instead is use longcpy() to
copy the key value that retry_key points to into __key.

This can cause a btree to get corrupted by seemingly read-only
operations such as btree_for_each_safe.

[akpm@linux-foundation.org: avoid the double longcpy()]
Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: Joern Engel <joern@logfs.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:55 -07:00
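
In other words (a sketch restating the message above, not the literal diff):

    /* broken: __key now aliases storage inside the btree node ...      */
    __key = retry_key;
    /* ... so this later longcpy() scribbles over the node itself       */
    longcpy(__key, bkey(geo, node, i), geo->keylen);

    /* fixed: copy the retried key *value* into the caller's buffer     */
    longcpy(__key, retry_key, geo->keylen);
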
Will Deacon
7d8a45695c ipc: shm: restore MADV_REMOVE functionality on shared memory segments
Commit 17cf28afea ("mm/fs: remove truncate_range") removed the
truncate_range inode operation in favour of the fallocate file
operation.

When using SYSV IPC shared memory segments, calling madvise with the
MADV_REMOVE advice on an area of shared memory will attempt to invoke
the .fallocate function for the shm_file_operations, which is NULL and
therefore returns -EOPNOTSUPP to userspace.  The previous behaviour
would inherit the inode_operations from the underlying tmpfs file and
invoke truncate_range there.

This patch restores the previous behaviour by wrapping the underlying
fallocate function in shm_fallocate, as we do for fsync.

[hughd@google.com: use -ENOTSUPP in shm_fallocate()]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:55 -07:00
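
The wrapper follows the same delegation pattern already used for shm fsync;
a sketch (the exact error code and field names are assumptions based on the
note above):

    static long shm_fallocate(struct file *file, int mode, loff_t offset,
                              loff_t len)
    {
            struct shm_file_data *sfd = shm_file_data(file);

            if (!sfd->file->f_op->fallocate)
                    return -ENOTSUPP;
            /* delegate to the underlying tmpfs file, as shm_fsync() does */
            return sfd->file->f_op->fallocate(sfd->file, mode, offset, len);
    }
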
Borislav Petkov
4e791c98ae drivers/platform/x86/acerhdf.c: correct Boris' mail address
Correct mail address reference to a mail account which I actually read.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Cc: Peter Feuerer <peter@piie.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:55 -07:00
Cyrill Gorcunov
736f24d5e5 c/r: prctl: drop VMA flags test on PR_SET_MM_ stack data assignment
In commit b76437579d ("procfs: mark thread stack correctly in
proc/<pid>/maps") the stack allocated via clone() is marked in
/proc/<pid>/maps as [stack:%d], and thus it might lie outside the former
mm->start_stack/end_stack values (and may even have some custom VMA
flags set).

So to be able to restore mm->start_stack/end_stack, drop the VMA flags
test, but still require the underlying VMA to exist.

As always note this feature is under CONFIG_CHECKPOINT_RESTORE and
requires CAP_SYS_RESOURCE to be granted.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:55 -07:00
Cyrill Gorcunov
300f786b26 c/r: prctl: add ability to get clear_tid_address
Zero is written at clear_tid_address when the process exits.  This
functionality is used by pthread_join().

We already have sys_set_tid_address() to change this address for the
current task but there is no way to obtain it from user space.

Without the ability to find this address and dump it we can't restore
pthread'ed apps which call pthread_join() once they have been restored.

This patch introduces the PR_GET_TID_ADDRESS prctl option which allows
the current process to obtain its own clear_tid_address.

This feature is available only if CONFIG_CHECKPOINT_RESTORE is set.

[akpm@linux-foundation.org: fix prctl numbering]
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pedro Alves <palves@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:55 -07:00
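
Usage from user space would look roughly like this (a sketch; the fallback
value 40 for the option and the exact calling convention are assumptions
based on this series, and the call only succeeds on kernels built with
CONFIG_CHECKPOINT_RESTORE):

    #include <stdio.h>
    #include <sys/prctl.h>

    #ifndef PR_GET_TID_ADDRESS
    #define PR_GET_TID_ADDRESS 40
    #endif

    int main(void)
    {
            int *tid_addr = NULL;

            /* the kernel stores current->clear_child_tid into tid_addr */
            if (prctl(PR_GET_TID_ADDRESS, (unsigned long)&tid_addr, 0, 0, 0))
                    perror("prctl(PR_GET_TID_ADDRESS)");
            else
                    printf("clear_child_tid address: %p\n", (void *)tid_addr);
            return 0;
    }
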
Cyrill Gorcunov
1ad75b9e16 c/r: prctl: add minimal address test to PR_SET_MM
Make sure the address being set is greater than mmap_min_addr (as
suggested by Kees Cook).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:55 -07:00
Konstantin Khlebnikov
bafb282df2 c/r: prctl: update prctl_set_mm_exe_file() after mm->num_exe_file_vmas removal
A fix for commit b32dfe3771 ("c/r: prctl: add ability to set new
mm_struct::exe_file").

After removing mm->num_exe_file_vmas, the kernel keeps mm->exe_file
until the final mmput(); it never becomes NULL while the task is alive.

We can check for other mapped files in mm instead of checking
mm->num_exe_file_vmas, and mark mm with the flag MMF_EXE_FILE_CHANGED in
order to forbid a second change of mm->exe_file.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:55 -07:00
Joe Perches
6305902c2f MAINTAINERS: whitespace fixes
Remove trailing spaces at EOL.
Always use a tab after the entry type field (e.g. "M:").

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:54 -07:00
Hugh Dickins
0142ef6cdc shmem: replace_page must flush_dcache and others
Commit bde05d1ccd ("shmem: replace page if mapping excludes its zone")
is not at all likely to break for anyone, but it was an earlier version
from before review feedback was incorporated.  Fix that up now.

* shmem_replace_page must flush_dcache_page after copy_highpage [akpm]
* Expand comment on why shmem_unuse_inode needs page_swapcount [akpm]
* Remove excess of VM_BUG_ONs from shmem_replace_page [wangcong]
* Check page_private matches swap before calling shmem_replace_page [hughd]
* shmem_replace_page allow for unexpected race in radix_tree lookup [hughd]

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Stephane Marchesin <marcheu@chromium.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Rob Clark <rob.clark@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-07 14:43:54 -07:00
Jordan Justen
743628e868 x86, efi stub: Add .reloc section back into image
Some UEFI firmware will not load a .efi with a .reloc section
with a size of 0.

Therefore, we create a .efi image with 4 main areas and 3 sections.
1. PE/COFF file header
2. .setup section (covers all setup code following the first sector)
3. .reloc section (contains 1 dummy reloc entry, created in build.c)
4. .text section (covers the remaining kernel image)

To make room for the new .setup section data, the header
bugger_off_msg had to be shortened.

Reported-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Link: http://lkml.kernel.org/r/1339085121-12760-1-git-send-email-jordan.l.justen@intel.com
Tested-by: Lee G Rosenbaum <lee.g.rosenbaum@intel.com>
Tested-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-07 09:52:33 -07:00
Linus Torvalds
513335f964 Merge tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6
Pull PARISC fixes from James Bottomley:
 "This is a set of three bug fixes for minor build breakages that got
  introduced just before 3.5-rc1 was released."

* tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
  [PARISC] fix code to find libgcc
  [PARISC] fix compile break in use of lib/strncopy_from_user.c
  [PARISC] fix missing TAINT_WARN problem
2012-06-07 09:06:54 -07:00
Linus Torvalds
0c30989cc9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull tile fixes from Chris Metcalf:
 "These two minor bug fixes fix build failures from some changes that
  were merged in during the 3.5 merge window."

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: add #include to unbreak build after generic init_task conversion
  tile: remove cpu_idle_on_new_stack
2012-06-07 09:06:13 -07:00
Artem Bityutskiy
12027f1b3f UBI: correct ubi_wl_flush locking
Commit "62f38455 UBI: modify ubi_wl_flush function to clear work queue for a lnum"
takes the 'work_sem' semaphore in write mode for the entire loop, which is not
very good because it will block other workers for potentially long time. We do
not need to have it in write mode - read mode is enough, and we do not need to
hole it over the entire loop. So this patch turns changes the locking: takes
'work_sem' in read mode and pushes it down to the loop.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-06-07 15:22:21 +03:00
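
A standalone model of the locking change, with a pthread rwlock standing in
for the kernel's 'work_sem' rw_semaphore (not UBI code):

    #include <pthread.h>

    static pthread_rwlock_t work_sem = PTHREAD_RWLOCK_INITIALIZER;

    /* before: the writer lock is held across the whole loop, blocking workers */
    void flush_before(int passes)
    {
            pthread_rwlock_wrlock(&work_sem);
            for (int i = 0; i < passes; i++)
                    ; /* process one pending work item */
            pthread_rwlock_unlock(&work_sem);
    }

    /* after: a reader lock, taken and dropped per iteration, is enough */
    void flush_after(int passes)
    {
            for (int i = 0; i < passes; i++) {
                    pthread_rwlock_rdlock(&work_sem);
                    ; /* process one pending work item */
                    pthread_rwlock_unlock(&work_sem);
            }
    }
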
Artem Bityutskiy
818039c7d5 UBIFS: fix debugfs-less systems support
Commit "f70b7e5 UBIFS: remove Kconfig debugging option" broke UBIFS and it
refuses to initialize if debugfs (CONFIG_DEBUG_FS) is disabled. I incorrectly
assumed that debugfs files creation function will return success if debugfs
is disabled, but they actually return -ENODEV. This patch fixes the issue.

Reported-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Paul Parsons <lost.distance@yahoo.com>
2012-06-07 10:43:54 +03:00
Artem Bityutskiy
e9b4cf2094 UBI: fix debugfs-less systems support
Commit "aa44d1d UBI: remove Kconfig debugging option" broke UBI and it
refuses to initialize if debugfs (CONFIG_DEBUG_FS) is disabled. I incorrectly
assumed that debugfs files creation function will return success if debugfs
is disabled, but they actually return -ENODEV. This patch fixes the issue.

Reported-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Paul Parsons <lost.distance@yahoo.com>
2012-06-07 10:43:54 +03:00
Adam Jackson
23e81d691a drm/i915: pch_irq_handler -> {ibx, cpt}_irq_handler
Cougar/Panther Point redefine the bits in SDEIIR pretty completely.
This function is just debugging, but if we're debugging we probably want
to be told accurate things instead of lies.

I'm told Lynx Point changes this yet more, but I have no idea how...

Note from Eugeni's review:

"For the record and for future enabling efforts, for LPT, bits 28-31
and 1-14 are gone since CPT/PPT (e.g., those must be zero). And there
is the bit 15 as a new addition, but we are not using it yet and
probably won't be using in foreseeable future."

Signed-off-by: Adam Jackson <ajax@redhat.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35103
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-06 23:01:08 +02:00
Linus Torvalds
71fae7e714 Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
 "Update e-mail address in MAINTAINERS"

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  MAINTAINERS: Update my e-mail address
2012-06-06 11:56:45 -07:00
Linus Torvalds
ff39d0e8f0 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull ACPI and Power Management changes from Len Brown.

This does an evil merge to fix up what I think is a mismerge by Len to
the gma500 driver, and restore it to the mainline state.

In that driver, both branches had commented out the call to
acpi_video_register(), and Len resolved the merge to that commented-out
version.

However, in mainline, further changes by Alan (commit d839ede47a:
"gma500: opregion and ACPI" to be exact) had re-enabled the ACPI video
registration, so the current state of the driver seems to want it.

Alan is apparently still feeling the effects of partying with the Queen,
so he didn't reply to my query, but I'll do the evil merge anyway.

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  ACPI: fix acpi_bus.h build warnings when ACPI is not enabled
  drivers: acpi: Fix dependency for ACPI_HOTPLUG_CPU
  tools/power turbostat: fix IVB support
  tools/power turbostat: fix un-intended affinity of forked program
  ACPI video: use after input_unregister_device()
  gma500: don't register the ACPI video bus
  acpi_video: Intel video is not always i915
  acpi_video: fix leaking PCI references
  ACPI: Ignore invalid _PSS entries, but use valid ones
  ACPI battery: only refresh the sysfs files when pertinent information changes
2012-06-06 10:47:15 -07:00
Linus Torvalds
ae501be0f6 Merge tag 'rdma-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull InfiniBand/RDMA fixes from Roland Dreier:
 "All in hardware drivers:
   - Fix crash in cxgb4
   - Fixes to new ocrdma driver
   - Regression fixes for mlx4"

* tag 'rdma-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Fix max_wqe capacity reported from query device
  mlx4_core: Fix setting VL_cap in mlx4_SET_PORT wrapper flow
  IB/mlx4: Fix EQ deallocation in legacy mode
  RDMA/cxgb4: Fix crash when peer address is 0.0.0.0
  RDMA/ocrdma: Remove unnecessary version.h includes
  RDMA/ocrdma: Fix signaled event for SRQ_LIMIT_REACHED
  RDMA/ocrdma: Correct queue free count math
2012-06-06 10:45:21 -07:00
Roland Dreier
20952cdd8e Merge branches 'cxgb4', 'mlx4' and 'ocrdma' into for-linus 2012-06-06 10:08:11 -07:00
Sagi Grimberg
fc2d004419 IB/mlx4: Fix max_wqe capacity reported from query device
1. Limit the max number of WQEs per QP reported when querying the
   device, so that ib_create_qp() will not fail for a QP size that the
   device claimed to support due to additional headroom WQEs being
   allocated.

2. Limit qp resources accepted for ib_create_qp() to the limits
   reported in ib_query_device().  In kernel space, make sure that the
   limits returned to the caller following qp creation also lie within
   the reported device limits. For userspace, report as before, and do
   adjustment in libmlx4 (so as not to break ABI).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Sagi Grimberg <sagig@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-06-06 10:08:03 -07:00
Jack Morgenstein
edc4a67e15 mlx4_core: Fix setting VL_cap in mlx4_SET_PORT wrapper flow
Commit 096335b3f9 ("mlx4_core: Allow dynamic MTU configuration for
IB ports") modifies the port VL setting.  This exposes a bug in
mlx4_common_set_port(), where the VL cap value passed in (inside the
command mailbox) is incorrectly zeroed-out:

mlx4_SET_PORT modifies the VL_cap field (byte 3 of the mailbox).
Since the SET_PORT command is paravirtualized on the master as well as
on the slaves, mlx4_SET_PORT_wrapper() is invoked on the master.  This
calls mlx4_common_set_port() where mailbox byte 3 gets overwritten by
code which should only set a single bit in that byte (for the reset
qkey counter flag) -- but instead overwrites the entire byte.

The result is that when running in SR-IOV mode, the VL_cap will be set
to zero -- fix this.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-06-06 10:07:54 -07:00
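
The underlying mistake is the generic "assign a flag vs OR it in" bug; a toy
illustration with made-up bit positions:

    #include <stdio.h>

    int main(void)
    {
            unsigned char byte3 = 0x04;           /* pretend VL_cap lives here   */

            unsigned char broken = 0x40;          /* whole byte overwritten:
                                                     the VL_cap bits are lost    */
            unsigned char fixed  = byte3 | 0x40;  /* only the qkey-reset bit set */

            printf("broken=0x%02x fixed=0x%02x\n", broken, fixed);
            return 0;
    }
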
Linus Torvalds
374916ed16 Merge tag 'md-3.5-fixes' of git://neil.brown.name/md
Pull two md fixes from NeilBrown:
 "One sparse-warning fix, one bugfix for 3.4-stable"

* tag 'md-3.5-fixes' of git://neil.brown.name/md:
  md: raid1/raid10: fix problem with merge_bvec_fn
  lib/raid6: fix sparse warnings in recovery functions
2012-06-06 09:49:28 -07:00
Linus Torvalds
9e68447f5b Merge tag 'iommu-fixes-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
 "Two patches are in here which fix AMD IOMMU specific issues.  One
  patch fixes a long-standing warning on resume because the
  amd_iommu_resume function enabled interrupts.  The other patch fixes a
  deadlock in an error-path of the page-fault request handling code of
  the IOMMU driver.

* tag 'iommu-fixes-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/amd: Fix deadlock in ppr-handling error path
  iommu/amd: Cache pdev pointer to root-bridge
2012-06-06 09:47:57 -07:00
Chris Metcalf
2ded5c2484 tile: add #include to unbreak build after generic init_task conversion
Some code was moved from init_task.c to setup.c but the appropriate
header needed to be moved as well.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-06-06 11:29:35 -04:00
Chris Metcalf
10db9e009a tile: remove cpu_idle_on_new_stack
This routine isn't used unless CONFIG_HOMECACHE is enabled, which
isn't even available as a public configuration option yet.
Since it no longer links correctly in 3.4, just remove it for now.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-06-06 11:29:31 -04:00
Arun Sharma
db0dc75d64 perf/x86: Check user address explicitly in copy_from_user_nmi()
Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-5-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 17:08:04 +02:00
Arun Sharma
bc6ca7b342 perf/x86: Check if user fp is valid
Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-4-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 17:08:01 +02:00
Arun Sharma
0b0d9cf6ec perf: Limit callchains to 127
Stack depth of 255 seems excessive, given that copy_from_user_nmi()
could be slow.

Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-3-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 17:08:00 +02:00
Arun Sharma
302fa4b58a perf/x86: Allow multiple stacks
Without this patch, applications with two different stack
regions (eg: native stack vs JIT stack) get truncated
callchains even when RBP chaining is present. GDB shows proper
stack traces and the frame pointer chaining is intact.

This patch disables the (fp < RSP) check, hoping that other checks
in the code save the day for us. In our limited testing, this
didn't seem to break anything.

In the long term, we could potentially have userspace advise
the kernel on the range of valid stack addresses, so we don't
spend a lot of time unwinding from bogus addresses.

Signed-off-by: Arun Sharma <asharma@fb.com>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-2-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 17:07:58 +02:00
Dimitri Sivanich
a841f8cef4 sched: Fix the relax_domain_level boot parameter
It does not get processed because sched_domain_level_max is 0 at the
time that setup_relax_domain_level() is run.

Simply accept the value as it is, as we don't know the value of
sched_domain_level_max until sched domain construction is completed.

Fix sched_relax_domain_level in cpuset.  The build_sched_domain() routine
calls the set_domain_attribute() routine prior to setting sd->level;
however, set_domain_attribute() relies on sd->level to decide whether
idle load balancing will be off or on.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120605184436.GA15668@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 17:07:41 +02:00
Eugeni Dodonov
67384fe3fd char/agp: add another Ironlake host bridge
This seems to come on Gigabyte H55M-S2V and was discovered through the
https://bugs.freedesktop.org/show_bug.cgi?id=50381 debugging.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50381
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-06 17:05:29 +02:00
Peter Zijlstra
8440ccb43f perf/x86: Update SNB PEBS constraints
Afaict there's no need to (incompletely) iterate the
MEM_UOPS_RETIRED.* umask state.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:59:52 +02:00
Peter Zijlstra
b6db437ba8 perf/x86: Enable/Add IvyBridge hardware support
Implement rudimentary IVB perf support. The SDM states it's identical
to SNB with the exception of the exact event tables, but a quick look
suggests they're similar enough.

Also mark SNB-EP as broken for now.

Requested-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:59:49 +02:00
Peter Zijlstra
cccb9ba9e4 perf/x86: Implement cycles:p for SNB/IVB
Now that there's finally a chip with working PEBS (IvyBridge), we can
enable the hardware and implement cycles:p for SNB/IVB.

Cc: Stephane Eranian <eranian@google.com>
Requested-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:59:47 +02:00
Peter Zijlstra
b430f7c470 perf/x86: Fix Intel shared extra MSR allocation
Zheng Yan reported that event group validation can wreck event state
when Intel extra_reg allocation changes event state.

Validation shouldn't change any persistent state. Cloning events in
validate_{event,group}() isn't really pretty either, so add a few
special cases to avoid modifying the event state.

The code is restructured to minimize the special case impact.

Reported-by: Zheng Yan <zheng.z.yan@linux.intel.com>
Acked-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338903031.28282.175.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:59:44 +02:00
Peter Zijlstra
d039ac6080 sched: Validate assumptions in sched_init_numa()
Add some code to validate the assumptions we're making and output
warnings if they do not hold.

If this triggers, we want to know about it.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Alex Shi <lkml.alex@gmail.com>
Link: http://lkml.kernel.org/n/tip-6uc3wk5s9udxtdl9cnku0vtt@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:52:30 +02:00
Peter Zijlstra
c3decf0dfb sched: Always initialize cpu-power
Often when we run into misshapen topologies the balance iteration
fails to update the cpu power properly and we'll end up in divide-by-zero traps.

Always initialize the cpu-power to a semi-sane value so that we can
at least boot the machine, even if the load-balancer might not
function correctly.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-3lbhyj25sr169ha7z3qht5na@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:52:27 +02:00
Peter Zijlstra
c117487687 sched: Fix domain iteration
Weird topologies can lead to asymmetric domain setups. This needs
further consideration since these setups are typically non-minimal
too.

For now, make it work by adding an extra mask selecting which CPUs
are allowed to iterate up.

The topology that triggered it is the one from David Rientjes:

	10 20 20 30
	20 10 20 20
	20 20 10 20
	30 20 20 10

resulting in boxes that wouldn't even boot.

Reported-by: David Rientjes <rientjes@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-3p86l9cuaqnxz7uxsojmz5rm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:52:26 +02:00
Peter Zijlstra
7f1b43936f sched/rt: Fix lockdep annotation within find_lock_lowest_rq()
Roland Dreier reported spurious, hard to trigger lockdep warnings
within the scheduler - without any real lockup.

This bit gives us the right clue:

> [89945.640512]  [<ffffffff8103fa1a>] double_lock_balance+0x5a/0x90
> [89945.640568]  [<ffffffff8104c546>] push_rt_task+0xc6/0x290

if you look at that code you'll find the double_lock_balance() in
question is the one in find_lock_lowest_rq() [yay for inlining].

Now find_lock_lowest_rq() has a bug.. it fails to use
double_unlock_balance() in one exit path, if this results in a retry in
push_rt_task() we'll call double_lock_balance() again, at which point
we'll run into said lockdep confusion.

Reported-by: Roland Dreier <roland@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337282386.4281.77.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:52:26 +02:00
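
The shape of the fix (a simplified sketch; the condition is a hypothetical
stand-in for the real "task moved/changed" test):

    /* inside find_lock_lowest_rq(), after double_lock_balance() succeeded */
    if (task_no_longer_pushable(task, rq)) {
            /*
             * Every path out of here holds both rq->lock and
             * lowest_rq->lock, so it must unwind with
             * double_unlock_balance(); the buggy path dropped only one
             * lock, and the retry in push_rt_task() then confused lockdep.
             */
            double_unlock_balance(rq, lowest_rq);
            lowest_rq = NULL;
            break;
    }
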
Alex Shi
10717dcde1 sched/numa: Load balance between remote nodes
Commit cb83b629b ("sched/numa: Rewrite the CONFIG_NUMA sched
domain support") removed the NODE sched domain and started checking
if the node distance in SLIT table is farther than REMOTE_DISTANCE,
if so, it will lose the load balance chance at exec/fork/wake_affine
points.

But actually, even when the node distance is farther than
REMOTE_DISTANCE, modern CPUs have QPI-like interconnects, which ensure
that memory access between nodes is not too slow. So the above change in
behavior on NUMA machines causes a performance regression on various
benchmarks: hackbench, tbench, netperf, oltp, etc.

This patch restores the old scheduler behavior on all my Intel
platforms: NHM EP/EX, WSM EP, SNB EP/EP4S, and thus fixes the
performance regressions. (All of them have just 2 kinds of distance, 10
and 21.)

Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338965571-9812-1-git-send-email-alex.shi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:52:25 +02:00
Kamalesh Babulal
ceb1cbac8e sched/x86: Calculate booted cores after construction of sibling_mask
Commit 316ad24830 ("sched/x86: Rewrite set_cpu_sibling_map()")
broke the booted_cores accounting.

The problem is that the booted_cores accounting needs all the
sibling links set up. So restore the second loop and add a comment as
to why it's needed.

On qemu booted with -smp sockets=1,cores=2,threads=2;
Before:
 $ grep cores /proc/cpuinfo
 cpu cores       : 2
 cpu cores       : 1
 cpu cores       : 4
 cpu cores       : 3

With the patch:
 $ grep cores /proc/cpuinfo
 cpu cores       : 2
 cpu cores       : 2
 cpu cores       : 2
 cpu cores       : 2

Reported-by: Prarit Bhargava <prarit@redhat.com>
Reported-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120531073738.GH7511@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:37:59 +02:00
Tomoki Sekiyama
f6175f5bfb x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs
In current Linux, the percpu variable `vector_irq' is not cleared on
offlined cpus while disabling devices' irqs. If a cpu that still has
disabled irqs recorded in vector_irq is hotplugged back in,
__setup_vector_irq() hits an invalid irq vector and may crash.

This bug can be reproduced as following;

  # echo 0 > /sys/devices/system/cpu/cpu7/online
  # modprobe -r some_driver_using_interrupts      # vector_irq@cpu7 uncleared
  # echo 1 > /sys/devices/system/cpu/cpu7/online  # kernel may crash

This patch fixes this bug by clearing vector_irq in
__clear_irq_vector() even if the cpu is offlined.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: ltc-kernel@ml.yrl.intra.hitachi.co.jp
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/4FC340BE.7080101@hitachi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 12:03:25 +02:00
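
A sketch of the idea in __clear_irq_vector() (the iteration macro and field
names are assumptions about the surrounding code, not the literal diff):

    /* clear stale entries for every CPU in the vector domain,
     * not only the CPUs that happen to be online right now */
    for_each_cpu(cpu, cfg->domain)
            per_cpu(vector_irq, cpu)[vector] = -1;
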
Feng Tang
55c844a4dd x86/reboot: Fix a warning message triggered by stop_other_cpus()
When rebooting our 24 CPU Westmere servers with 3.4-rc6, we
always see this warning msg:

Restarting system.
machine restart
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:125
native_smp_send_reschedule+0x74/0xa7() Hardware name: X8DTN
Modules linked in: igb [last unloaded: scsi_wait_scan]
Pid: 1, comm: systemd-shutdow Not tainted 3.4.0-rc6+ #22
Call Trace:
 <IRQ>  [<ffffffff8102a41f>] warn_slowpath_common+0x7e/0x96
 [<ffffffff8102a44c>] warn_slowpath_null+0x15/0x17
 [<ffffffff81018cf7>] native_smp_send_reschedule+0x74/0xa7
 [<ffffffff810561c1>] trigger_load_balance+0x279/0x2a6
 [<ffffffff81050112>] scheduler_tick+0xe0/0xe9
 [<ffffffff81036768>] update_process_times+0x60/0x70
 [<ffffffff81062f2f>] tick_sched_timer+0x68/0x92
 [<ffffffff81046e33>] __run_hrtimer+0xb3/0x13c
 [<ffffffff81062ec7>] ? tick_nohz_handler+0xd0/0xd0
 [<ffffffff810474f2>] hrtimer_interrupt+0xdb/0x198
 [<ffffffff81019a35>] smp_apic_timer_interrupt+0x81/0x94
 [<ffffffff81655187>] apic_timer_interrupt+0x67/0x70
 <EOI>  [<ffffffff8101a3c4>] ? default_send_IPI_mask_allbutself_phys+0xb4/0xc4
 [<ffffffff8101c680>] physflat_send_IPI_allbutself+0x12/0x14
 [<ffffffff81018db4>] native_nmi_stop_other_cpus+0x8a/0xd6
 [<ffffffff810188ba>] native_machine_shutdown+0x50/0x67
 [<ffffffff81018926>] machine_shutdown+0xa/0xc
 [<ffffffff8101897e>] native_machine_restart+0x20/0x32
 [<ffffffff810189b0>] machine_restart+0xa/0xc
 [<ffffffff8103b196>] kernel_restart+0x47/0x4c
 [<ffffffff8103b2e6>] sys_reboot+0x13e/0x17c
 [<ffffffff8164e436>] ? _raw_spin_unlock_bh+0x10/0x12
 [<ffffffff810fcac9>] ? bdi_queue_work+0xcf/0xd8
 [<ffffffff810fe82f>] ? __bdi_start_writeback+0xae/0xb7
 [<ffffffff810e0d64>] ? iterate_supers+0xa3/0xb7
 [<ffffffff816547a2>] system_call_fastpath+0x16/0x1b
---[ end trace 320af5cb1cb60c5b ]---

The root cause seems to be that
default_send_IPI_mask_allbutself_phys() takes quite some time (I
measured it could be several ms) to finish sending NMIs to all
the other 23 CPUs, and on an HZ=250/1000 system that is long
enough for a timer interrupt to happen, which in turn kicks a
load balance to a stopped CPU and triggers this warning in
native_smp_send_reschedule().

So disabling local irqs before stop_other_cpus() fixes this
problem (25 test reboots all completed cleanly), and it is fine
because nobody should care about the timer interrupt at this
stage of the reboot.

The latest 3.4 kernel slightly changes this behavior by sending
REBOOT_VECTOR first and only sending NMI_VECTOR if the REBOOT_VECTOR
fails, and this patch is still needed to prevent the problem.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20120530231541.4c13433a@feng-i7
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 12:03:23 +02:00
Sebastian Andrzej Siewior
7071f6b288 x86/intel/moorestown: Change intel_scu_devices_create() to __devinit
The allmodconfig hits:

 WARNING: vmlinux.o(.text+0x6553d): Section mismatch in
          reference from the function intel_scu_devices_create() to the
          function .devinit.text: spi_register_board_info()
	  [...]

This patch marks intel_scu_devices_create() as __devinit because
it only calls a __devinit function, spi_register_board_info().

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Link: http://lkml.kernel.org/r/20120531212025.GA8519@breakpoint.cc
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 11:58:40 +02:00
Yasuaki Ishimatsu
4af463d28f x86/numa: Set numa_nodes_parsed at acpi_numa_memory_affinity_init()
When hot-adding a CPU, the system outputs the following messages,
since node_to_cpumask_map[2] had not been allocated memory.

Booting Node 2 Processor 32 APIC 0xc0
node_to_cpumask_map[2] NULL
Pid: 0, comm: swapper/32 Tainted: G       A     3.3.5-acd #21
Call Trace:
 [<ffffffff81048845>] debug_cpumask_set_cpu+0x155/0x160
 [<ffffffff8105e28a>] ? add_timer_on+0xaa/0x120
 [<ffffffff8150665f>] numa_add_cpu+0x1e/0x22
 [<ffffffff815020bb>] identify_cpu+0x1df/0x1e4
 [<ffffffff815020d6>] identify_econdary_cpu+0x16/0x1d
 [<ffffffff81504614>] smp_store_cpu_info+0x3c/0x3e
 [<ffffffff81505263>] smp_callin+0x139/0x1be
 [<ffffffff815052fb>] start_secondary+0x13/0xeb

The reason is that the bit for node 2 was not set in
numa_nodes_parsed. numa_nodes_parsed is set only by
acpi_numa_processor_affinity_init() /
acpi_numa_x2apic_affinity_init(). Thus, even if hot-added memory
in the same PXM as the hot-added CPU is described in the ACPI
SRAT table, numa_nodes_parsed is not set as long as the
hot-added CPU itself is not described there.

But the ACPI Spec Rev 5.0 says the following about the SRAT:
"This optional table provides information that allows OSPM to
associate processors and memory ranges, including ranges of
memory provided by hot-added memory devices, with system
localities / proximity domains and clock domains."

This means the SRAT only describes CPUs present at boot time,
while for memory it also covers hot-added ranges. So hot-added
memory is described in the ACPI SRAT table, but a hot-added CPU
is not. Thus numa_nodes_parsed should be set not only by
acpi_numa_processor_affinity_init() /
acpi_numa_x2apic_affinity_init() but also by
acpi_numa_memory_affinity_init() for this case.

Additionally, if the system has a CPU-less memory node,
acpi_numa_processor_affinity_init() /
acpi_numa_x2apic_affinity_init() cannot set numa_nodes_parsed,
since these functions cannot find a CPU description for that
node. In this case, numa_nodes_parsed needs to be set by
acpi_numa_memory_affinity_init().

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: liuj97@gmail.com
Cc: kosaki.motohiro@gmail.com
Link: http://lkml.kernel.org/r/4FCC2098.4030007@jp.fujitsu.com
[ merged it ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 11:58:39 +02:00
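
A minimal user-space sketch of the idea (hypothetical types, not the ACPI
parsing code): the node bitmap has to be populated from memory affinity
entries as well, otherwise a node described only by hot-addable memory in
the SRAT never gets marked as parsed:

    #include <stdio.h>

    enum srat_type { SRAT_CPU, SRAT_MEMORY };

    struct srat_entry { enum srat_type type; int node; };

    int main(void)
    {
        struct srat_entry srat[] = {
            { SRAT_CPU,    0 },
            { SRAT_MEMORY, 0 },
            { SRAT_MEMORY, 2 },   /* hot-addable memory; no CPU entry for node 2 */
        };
        unsigned long numa_nodes_parsed = 0;

        for (unsigned i = 0; i < sizeof(srat) / sizeof(srat[0]); i++) {
            /* the point of the fix: memory affinity entries mark the
             * node as parsed too, not only CPU entries */
            numa_nodes_parsed |= 1UL << srat[i].node;
        }
        printf("node 2 parsed: %s\n",
               (numa_nodes_parsed & (1UL << 2)) ? "yes" : "no");
        return 0;
    }
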
Xiaotian Feng
aff5a62d52 x86/gart: Fix kmemleak warning
aperture_64.c now uses memblock, so the previous
kmemleak_ignore() call for alloc_bootmem() should be removed.

Otherwise, with kmemleak enabled, the kernel will throw warnings
like:

[    0.000000] kmemleak: Trying to color unknown object at 0xffff8800c4000000 as Black
[    0.000000] Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc1-next-20120605+ #130
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff811b27e6>] paint_ptr+0x66/0xc0
[    0.000000]  [<ffffffff816b90fb>] kmemleak_ignore+0x2b/0x60
[    0.000000]  [<ffffffff81ef7bc0>] kmemleak_init+0x217/0x2c1
[    0.000000]  [<ffffffff81ed2b97>] start_kernel+0x32d/0x3eb
[    0.000000]  [<ffffffff81ed25e4>] ? repair_env_string+0x5a/0x5a
[    0.000000]  [<ffffffff81ed2356>] x86_64_start_reservations+0x131/0x135
[    0.000000]  [<ffffffff81ed2120>] ? early_idt_handlers+0x120/0x120
[    0.000000]  [<ffffffff81ed245c>] x86_64_start_kernel+0x102/0x111
[    0.000000] kmemleak: Early log backtrace:
[    0.000000]    [<ffffffff816b911b>] kmemleak_ignore+0x4b/0x60
[    0.000000]    [<ffffffff81ee6a38>] gart_iommu_hole_init+0x3e7/0x547
[    0.000000]    [<ffffffff81edb20b>] pci_iommu_alloc+0x44/0x6f
[    0.000000]    [<ffffffff81ee81ad>] mem_init+0x19/0xec
[    0.000000]    [<ffffffff81ed2a54>] start_kernel+0x1ea/0x3eb
[    0.000000]    [<ffffffff81ed2356>] x86_64_start_reservations+0x131/0x135
[    0.000000]    [<ffffffff81ed245c>] x86_64_start_kernel+0x102/0x111
[    0.000000]    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Cc: Xiaotian Feng <xtfeng@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/1338922831-2847-1-git-send-email-xtfeng@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 11:58:38 +02:00
Thomas Gleixner
1a87fc1ec7 x86: mce: Add the dropped timer interval init back
commit 82f7af09 ("x86/mce: Cleanup timer mess") dropped the
initialization of the per cpu timer interval. Duh :(

Restore the previous behaviour.

Reported-by: Chen Gong <gong.chen@linux.intel.com>
Cc: bp@amd64.org
Cc: tony.luck@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-06-06 11:33:21 +02:00
Masami Hiramatsu
436d03faf6 x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix
Fix the x86 instruction decoder to decode bsr/bsf/jmpe with
operand-size prefix (66h). This fixes the test case failure
reported by Linus, attached below.

bsf/bsr/jmpe have a special encoding. The opcode map in the
Intel Software Developer's Manual vol. 2 says they have
TZCNT/LZCNT variants with an F3h prefix, but gives no
information about the 66h or F2h prefixes. The current
instruction decoder assumes such encodings are bad
instructions, while the hardware actually accepts at least the
operand-size prefix.

H. Peter Anvin further explains:

 " TZCNT/LZCNT are F3 + BSF/BSR exactly because the F2 and
   F3 prefixes have historically been no-ops with most instructions.
   This allows software to unconditionally use the prefixed versions
   and get TZCNT/LZCNT on the processors that have them if they don't
   care about the difference. "

This fixes errors reported by test_get_len:

  Warning: arch/x86/tools/test_get_len found difference at <em_bsf>:ffffffff81036d87
  Warning: ffffffff81036de5:	66 0f bc c2          	bsf    %dx,%ax
  Warning: objdump says 4 bytes, but insn_get_length() says 3
  Warning: arch/x86/tools/test_get_len found difference at <em_bsr>:ffffffff81036ea6
  Warning: ffffffff81036f04:	66 0f bd c2          	bsr    %dx,%ax
  Warning: objdump says 4 bytes, but insn_get_length() says 3
  Warning: decoded and checked 13298882 instructions with 2 warnings

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <yrl.pp-manager.tt@hitachi.com>
Link: http://lkml.kernel.org/r/20120604150911.22338.43296.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 08:54:18 +02:00
Ingo Molnar
02e03040a3 Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf fixes from Arnaldo Carvalho de Melo:

 * Endianness fixes from Jiri Olsa

 * Fixes for make perf tarball

 * Fix for DSO name in perf script callchains, from David Ahern

 * Segfault fixes for perf top --callchain, from Namhyung Kim

 * Minor function result fixes from Srikar Dronamraju

 * Add missing 3rd ioctl parameter, from Namhyung Kim

 * Fix pager usage in minimal embedded systems, from Avik Sil

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 08:46:33 +02:00
Chen Gong
958fb3c512 x86/mce: Fix the MCE poll timer logic
In commit 82f7af09 ("x86/mce: Cleanup timer mess"), Thomas just
forgot the "/ 2" there while cleaning up.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@amd64.org
Cc: tony.luck@intel.com
Link: http://lkml.kernel.org/r/1338863702-9245-1-git-send-email-gong.chen@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 08:28:21 +02:00
Linus Torvalds
eea5b5510f Merge tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull MCE regression fix from Tony Luck:
 "Typo/thinko in a cleanup caused a semantic change. Fix it."

* tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  x86/mce: Fix the MCE poll timer logic
2012-06-05 15:15:04 -07:00
Linus Torvalds
ecc728467f Merge branch 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull arm CMA fix from Marek Szyprowski:
 "This removes the ARMv6+ CMA dependency and lets one use old, well-
  tested dma-mapping implementation also on ARMv6+ systems without the
  need to use EXPERIMENTAL stuff."

Russell King complained (rightly) about the experimental feature being
forced on by the ARM config.

Here CMA is "contiguous memory allocator", not "cross-memory attach".
We really need to stop using insane TLAs for things that aren't big
industry standards.

* 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: dma-mapping: remove unconditional dependency on CMA
2012-06-05 13:23:17 -07:00
Bryan Wu
aa69cb8c1e MAINTAINERS: add linux-leds mail list address and git URL
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-05 12:50:54 -07:00
Daniel Vetter
cb05d8dede drm/i915: fix up ivb plane 3 pageflips
Or at least plug another gaping hole. Apparently the hw designers only
moved the bit field, but did not bother to re-enumerate the planes
when adding support for a 3rd pipe.

Discovered by i-g-t/flip_test.

This may or may not fix the reference bugzilla, because that one
smells like we have still larger fish to fry.

v2: Fixup the impossible case to catch programming errors, noticed by
Chris Wilson.

References: https://bugs.freedesktop.org/show_bug.cgi?id=50069
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Eugeni Dodonov <eugeni.dodonov@intel.com>
Cc: stable@vger.kernel.org
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-05 21:05:21 +02:00
Karel Zak
256cccbeca MAINTAINERS: update util-linux info
Signed-off-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-05 11:57:31 -07:00
Linus Torvalds
5f73639c63 Merge branch '3.5-merge-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull two scsi target fixes from Nicholas Bellinger:
 "The first is a small name-space collision fix from Stefan for the new
  sbp-target / ieee-1394 code, and second is the FILEIO backend
  conversion patch to always use O_DSYNC by default instead of O_SYNC as
  recommended by hch.  Note the latter is CC'ed stable."

* '3.5-merge-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target/file: Use O_DSYNC by default for FILEIO backends
  sbp-target: rename a variable to avoid name clash
2012-06-05 11:55:27 -07:00
Linus Torvalds
365f0e173f Merge branch 'for-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
 "This fixes the possible premature superblock release on umount bug
  mentioned during v3.5-rc1 pull request.

  Originally, cgroup dentry destruction path assumed that cgroup dentry
  didn't have any reference left after cgroup removal thus put super
  during dentry removal.  Now that there can be lingering dentry
  references, this led to super being put with live dentries.  This
  patch fixes the problem by putting super ref on dentry release instead
  of removal."

* 'for-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: superblock can't be released with active dentries
2012-06-05 11:54:12 -07:00
Linus Torvalds
690efa08e2 Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux
Pull embedded i2c update from Wolfram Sang:
 "This only contains one new driver which had multiple dependencies
  (pinctrl, i2c-mux-rework, new devm_* functions), so I decided to wait
  for rc1.  Plus, it had to wait a little for the ack of a devicetree
  maintainer since the bindings were not trivial enough for me to pass
  through.

  So, given that, I hope there is still something like the "new driver
  rule", so we could have the driver in 3.5 and people can start using
  it.  That would make merging support for some boards easier for 3.6
  since the dependency on this driver is gone then."

* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
  i2c: Add generic I2C multiplexer using pinctrl API
2012-06-05 10:55:31 -07:00
Linus Torvalds
5698e9180c Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fix from Avi Kivity:
 "A one-liner fix for a buffer overflow"

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Fix buffer overflow in kvm_set_irq()
2012-06-05 10:54:30 -07:00
Linus Torvalds
f80c43efb3 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm radeon fixes from Dave Airlie:
 "This is just radeon fixes and a bunch of new PCI ids.  The fixes are
  for a deadlock, an audio regression, and a couple of audio fixes."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon/kms: add new SI PCI ids
  drm/radeon/kms: add new BTC PCI ids
  drm/radeon/kms: add new Palm, Sumo PCI ids
  drm/radeon/kms: add new Trinity PCI ids
  drm/radeon: fix vm deadlocks on cayman
  drm/radeon: fix gpu_init on si
  drm/radeon/hdmi: don't set SEND_MAX_PACKETS bit
  drm/radeon/audio: don't hardcode CRTC id
  drm/radeon: make audio_init consistent across asics
2012-06-05 10:53:26 -07:00
Konstantin Khlebnikov
fffaee365f radix-tree: fix contiguous iterator
This patch fixes bug in macro radix_tree_for_each_contig().

If radix_tree_next_slot() sees a NULL in the next slot it returns NULL, but
the following radix_tree_next_chunk() then switches iteration to the next
chunk. As a result, iteration becomes non-contiguous and breaks vfs "splice"
and all its users.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reported-and-bisected-by: Hans de Bruin <jmdebruin@xmsnet.nl>
Reported-and-bisected-by: Ondrej Zary <linux@rainbow-software.org>
Reported-bisected-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Link: https://lkml.org/lkml/2012/6/5/64
Cc: stable <stable@vger.kernel.org> # 3.4.x
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-05 10:46:40 -07:00
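
A minimal user-space model of the intended semantics, not the radix-tree
code itself: a contiguous iterator must stop at the first empty slot instead
of skipping ahead to the next populated chunk:

    #include <stdio.h>

    #define NSLOTS 8

    int main(void)
    {
        /* slots 0-2 populated, slot 3 empty, slots 4-5 populated again */
        void *slots[NSLOTS] = { "a", "b", "c", NULL, "d", "e", NULL, NULL };

        for (int i = 0; i < NSLOTS; i++) {
            if (!slots[i])
                break;              /* contiguous iteration ends here */
            printf("slot %d: %s\n", i, (char *)slots[i]);
        }
        return 0;
    }
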
Chen Gong
c2238f10e0 x86/mce: Fix the MCE poll timer logic
In commit 82f7af09 (x86/mce: Cleanup timer mess), Thomas just forgot
the "/ 2" there while cleaning up.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-06-05 10:15:07 -07:00
Linus Torvalds
f9ba7179ce Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix blksize calculation
  fuse: fix stat call on 32 bit platforms
  fuse: optimize fallocate on permanent failure
  fuse: add FALLOCATE operation
  fuse: Convert to kstrtoul_from_user
2012-06-05 10:11:11 -07:00
Linus Torvalds
0b3e9f3f21 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar.

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Remove NULL assignment of dattr_cur
  sched: Remove the last NULL entry from sched_feat_names
  sched: Make sched_feat_names const
  sched/rt: Fix SCHED_RR across cgroups
  sched: Move nr_cpus_allowed out of 'struct sched_rt_entity'
  sched: Make sure to not re-read variables after validation
  sched: Fix SD_OVERLAP
  sched: Don't try allocating memory from offline nodes
  sched/nohz: Fix rq->cpu_load calculations some more
  sched/x86: Use cpu_llc_shared_mask(cpu) for coregroup_mask
2012-06-05 09:47:15 -07:00
Alex Deucher
7aaa61b347 drm/radeon/kms: add new SI PCI ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-06-05 15:11:12 +01:00
Alex Deucher
a2bef8ce82 drm/radeon/kms: add new BTC PCI ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-06-05 15:11:11 +01:00
Alex Deucher
4a6991cc1f drm/radeon/kms: add new Palm, Sumo PCI ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-06-05 15:11:11 +01:00
Alex Deucher
d430f7dbf7 drm/radeon/kms: add new Trinity PCI ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-06-05 15:11:09 +01:00
Avi Kivity
f2ebd422f7 KVM: Fix buffer overflow in kvm_set_irq()
kvm_set_irq() has an internal buffer of three irq routing entries, allowing
a GSI to be connected to three IRQ chips or one MSI.  However
setup_routing_entry() does not properly enforce this, allowing three irqchip
routes followed by an MSI route to overflow the buffer.

Fix by ensuring that an MSI entry is only added to an empty list.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-05 16:39:58 +03:00
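
A minimal user-space model of the overflow and the fix, with hypothetical
names: the routing buffer holds at most three entries, and an MSI entry is
only accepted into an empty list:

    #include <stdio.h>

    enum route_type { ROUTE_IRQCHIP, ROUTE_MSI };

    #define MAX_ROUTES 3

    static enum route_type routes[MAX_ROUTES];
    static int nr_routes;

    static int setup_routing_entry(enum route_type type)
    {
        if (type == ROUTE_MSI && nr_routes != 0)
            return -1;                  /* MSI only on an empty list */
        if (nr_routes >= MAX_ROUTES)
            return -1;                  /* would overflow the buffer */
        routes[nr_routes++] = type;
        return 0;
    }

    int main(void)
    {
        setup_routing_entry(ROUTE_IRQCHIP);
        setup_routing_entry(ROUTE_IRQCHIP);
        setup_routing_entry(ROUTE_IRQCHIP);
        printf("adding MSI after 3 irqchip routes: %d\n",
               setup_routing_entry(ROUTE_MSI));   /* rejected: -1 */
        return 0;
    }
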
Christian König
bb40915582 drm/radeon: fix vm deadlocks on cayman
Locking mutexes in different orders just screams for
deadlocks, and some testing showed that it is actually
quite easy to trigger them.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-06-05 09:27:23 +01:00
Alex Deucher
1a8ca7502c drm/radeon: fix gpu_init on si
- Properly set up the RBs
- Properly set up the SPI
- Properly set up gb_addr_config

This should fix rendering issues on certain cards.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-06-05 09:25:54 +01:00
Rafał Miłecki
7838e05a0d drm/radeon/hdmi: don't set SEND_MAX_PACKETS bit
Many TVs and A/V receivers don't work with this bit set. The problem was
confirmed using an Onkyo TX-SR605, a Sony BRAVIA KDL-52X3500 and a Sony
BRAVIA KDL-40S40xx. In theory this bit shouldn't affect the audio engine
when feeding it with data, but it seems it does. The fglrx driver doesn't
set this bit in any of the above cases.
This fixes a regression introduced in 3.5-rc1.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-06-05 09:25:33 +01:00
Rafał Miłecki
0aecb5a4ba drm/radeon/audio: don't hardcode CRTC id
This is based on info released by AMD, should allow using audio in much
more cases.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-06-05 09:25:01 +01:00
Alex Deucher
d4e30ef05c drm/radeon: make audio_init consistent across asics
Call it in the asic startup callback on all asics.
Previously r600 and rv770 called it in the startup
and resume callbacks while all the other asics called
it in the startup callback.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-06-05 09:24:33 +01:00
James Bottomley
4c01acc01d [PARISC] fix code to find libgcc
Sam broke this with

commit 1f2bfbd00e
Author: Sam Ravnborg <sam@ravnborg.org>
Date:   Sat May 5 10:18:41 2012 +0200

    kbuild: link of vmlinux moved to a script

But we should be deriving the location of libgcc in the same way as all
the other archs, so fix this by adding a LIBGCC variable which is
evaluated in the makefile.

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-06-05 14:10:23 +09:00
James Bottomley
731455624f [PARISC] fix compile break in use of lib/strncopy_from_user.c
Linus broke us with

commit 36126f8f2e
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sat May 26 10:43:17 2012 -0700

    word-at-a-time: make the interfaces truly generic

By moving functions defined in strncopy_from_user.c into the asm-generic
version of word-at-a-time.h.  Sparc and OpenRisc were fixed to use this, but
not parisc.  Fix this by adding it to generic-y in asm/Kbuild.

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-06-05 14:10:22 +09:00
James Bottomley
f1ea8b66e5 [PARISC] fix missing TAINT_WARN problem
Al Viro broke us with

commit edd63a2763
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Fri Apr 27 13:42:45 2012 -0400

    set_restore_sigmask() is never called without SIGPENDING (and never should be)

Although it's pretty much our fault since parisc's asm/bug.h uses
BUGWARN_TAINT but doesn't include the file that defines it.  Fix that.

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-06-05 14:10:17 +09:00
Seung-Woo Kim
5736603bef drm/exynos: fixed blending for hdmi graphic layer
Blending for graphic layer 0 of the hdmi mixer was not set, so the video
layer could not be shown if graphic layer 0 was enabled.
This patch fixes the blending values to support blending between
graphic layer 0 and the video layer.

Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2012-06-05 13:25:18 +09:00
Laurent Pinchart
f56fdcef4d drm/exynos: Remove dummy encoder get_crtc operation implementation
The encoder get_crtc operation is called to retrieve a pointer to the
CRTC the encoder is currently connected to, right after setting the
encoder::crtc field to the new CRTC. The implementation of this
operation returns the pointer to the new CRTC, which is then pointlessly
compared to itself.

As the operation is not mandatory, don't implement it.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-06-05 13:25:15 +09:00
Laurent Pinchart
07b6835f2c drm/exynos: Keep a reference to frame buffer GEM objects
GEM objects used by frame buffers must be referenced for the whole life
of the frame buffer. Release the references in the frame buffer
destructor instead of its constructor.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-06-05 11:53:58 +09:00
Laurent Pinchart
6037bafa2e drm/exynos: Don't cast GEM object to Exynos GEM object when not needed
The exynos_drm_gem_dumb_map_offset() doesn't need to access any
Exynos-specific GEM object fields, don't cast the GEM object.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-06-05 11:53:58 +09:00
Laurent Pinchart
293a1c128e drm/exynos: DRIVER_BUS_PLATFORM is not a driver feature
DRIVER_BUS_PLATFORM is a bus type used internally in the DRM core, not a
flag for the drm_driver::driver_features field.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-06-05 11:53:58 +09:00
Inki Dae
13b87b2742 drm/exynos: fixed size type.
The size member of the drm_exynos_gem_mmap struct is changed to uint64_t,
and padding is added so the struct is 64-bit aligned.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-06-05 11:51:47 +09:00
Ville Syrjälä
363b06aaa5 drm/exynos: Use DRM_FORMAT_{NV12, YUV420} instead of DRM_FORMAT_{NV12M, YUV420M}
The NV12M/YUV420M formats are identical to the already existing standard
NV12/YUV420 formats. The M variants will be removed, so convert the
driver to use the standard names.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-06-05 11:51:43 +09:00
Linus Torvalds
99becf1328 Pull 'for-linus' branches of git://git.kernel.org/pub/scm/linux/kernel/git/viro/{signal,vfs}
Pull signal and vfs compile breakage fixes from Al Viro.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  fixups for signal breakage

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  nommu: fix compilation of nommu.c
2012-06-04 15:09:08 -07:00
Linus Torvalds
bf2785a818 Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French.

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Move get_next_mid to ops struct
  CIFS: Make accessing is_valid_oplock/dump_detail ops struct field safe
  CIFS: Improve identation in cifs_unlock_range
  CIFS: Fix possible wrong memory allocation
2012-06-04 15:00:58 -07:00
Al Viro
03240b279d fixups for signal breakage
Obvious brainos spotted by Geert.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-04 17:47:34 -04:00
Greg Ungerer
ad1ed2937e nommu: fix compilation of nommu.c
Compiling 3.5-rc1 for nommu targets gives:

  CC      mm/nommu.o
mm/nommu.c: In function ‘sys_mmap_pgoff’:
mm/nommu.c:1489:2: error: ‘ret’ undeclared (first use in this function)
mm/nommu.c:1489:2: note: each undeclared identifier is reported only once for each function it appears in

It is trivially fixed by replacing 'ret' with the local variable that is
already defined for the return value 'retval'.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-04 17:17:31 -04:00
John Stultz
fad0c66c4b timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond
Commit 6b43ae8a61 (ntp: Fix leap-second hrtimer livelock) broke the
leapsecond update of CLOCK_MONOTONIC. The missing leapsecond update to
wall_to_monotonic causes discontinuities in CLOCK_MONOTONIC.

Adjust wall_to_monotonic when NTP inserted a leapsecond.

Reported-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Richard Cochran <richardcochran@gmail.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1338400497-12420-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-06-04 21:46:29 +02:00
Linus Torvalds
a3fe778c78 Merge tag 'stable/frontswap.v16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm
Pull frontswap feature from Konrad Rzeszutek Wilk:
 "Frontswap provides a "transcendent memory" interface for swap pages.
  In some environments, dramatic performance savings may be obtained
  because swapped pages are saved in RAM (or a RAM-like device) instead
  of a swap disk.  This tag provides the basic infrastructure along with
  some changes to the existing backends."

Fix up trivial conflict in mm/Makefile due to removal of swap token code
changing a line next to the new frontswap entry.

This pull request came in before the merge window even opened, it got
delayed to after the merge window by me just wanting to make sure it had
actual users.  Apparently IBM is using this on their embedded side, and
Jan Beulich says that it's already made available for SLES and OpenSUSE
users.

Also acked by Rik van Riel, and Konrad points to other people liking it
too.  So in it goes.

By Dan Magenheimer (4) and Konrad Rzeszutek Wilk (2)
via Konrad Rzeszutek Wilk
* tag 'stable/frontswap.v16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm:
  frontswap: s/put_page/store/g s/get_page/load
  MAINTAINER: Add myself for the frontswap API
  mm: frontswap: config and doc files
  mm: frontswap: core frontswap functionality
  mm: frontswap: core swap subsystem hooks and headers
  mm: frontswap: add frontswap header file
2012-06-04 12:28:45 -07:00
Linus Torvalds
9171c670b4 Merge branches 'irq-urgent-for-linus' and 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq and smpboot updates from Thomas Gleixner:
 "Just cleanup patches with no functional change and a fix for suspend
  issues."

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Introduce irq_do_set_affinity() to reduce duplicated code
  genirq: Add IRQS_PENDING for nested and simple irq

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smpboot, idle: Fix comment mismatch over idle_threads_init()
  smpboot, idle: Optimize calls to smp_processor_id() in idle_threads_init()
2012-06-04 11:36:51 -07:00
Linus Torvalds
c22072bdf0 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "The clocksource driver is pure hardware enablement and the skew option
  is default off, well tested and non dangerous."

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick: Move skew_tick option into the HIGH_RES_TIMER section
  clocksource: em_sti: Add DT support
  clocksource: em_sti: Emma Mobile STI driver
  clockevents: Make clockevents_config() a global symbol
  tick: Add tick skew boot option
2012-06-04 11:25:31 -07:00
Daniel Vetter
b7884eb45e drm/i915: hold forcewake around ring hw init
Empirical evidence suggests that we need to: on at least one ivb
machine running the hangman i-g-t test, the rings don't initialize
properly - the RING_START registers seem to be stuck at
all zeros.

Holding forcewake around this register init sequence makes chip reset
reliable again. Note that this is not the first such issue:

commit f01db988ef
Author: Sean Paul <seanpaul@chromium.org>
Date:   Fri Mar 16 12:43:22 2012 -0400

    drm/i915: Add wait_for in init_ring_common

added delay loops to make RING_START and RING_CTL initialization
reliable on the blt ring at boot-up. So I guess it won't hurt if we do
this unconditionally for all gpus that need force_wake.

To avoid copy&pasting of the HAS_FORCE_WAKE check I've added a new
intel_info bit for that.

v2: Fixup missing commas in static struct and properly handling the
error case in init_ring_common, both noticed by Jani Nikula.

Cc: stable@vger.kernel.org
Reported-and-tested-by: Yang Guang <guang.a.yang@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50522
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-04 20:25:29 +02:00
Chris Wilson
3eef8918ff drm/i915: Mark the ringbuffers as being in the GTT domain
By correctly describing the ringbuffers as being in the GTT domain, it
appears that we are more careful with the management of the CPU cache
upon resume and so prevent some coherency issues when submitting commands
to the GPU later. A secondary effect is that the debug logs are then
consistent with the actual usage (i.e. they no longer describe the
ringbuffers as being in the CPU write domain when we are accessing them
through a WC iomapping.)

Reported-and-tested-by: Daniel Gnoutcheff <daniel@gnoutcheff.name>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41092
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-04 20:16:40 +02:00
Linus Torvalds
0640113be2 vfs: Fix /proc/<tid>/fdinfo/<fd> file handling
Cyrill Gorcunov reports that I broke the fdinfo files with commit
30a08bf2d3 ("proc: move fd symlink i_mode calculations into
tid_fd_revalidate()"), and he's quite right.

The tid_fd_revalidate() function is not just used for the <tid>/fd
symlinks, it's also used for the <tid>/fdinfo/<fd> files, and the
permission model for those are different.

So do the dynamic symlink permission handling just for symlinks, making
the fdinfo files once more appear as the proper regular files they are.

Of course, Al Viro argued (probably correctly) that we shouldn't do the
symlink permission games at all, and make the symlinks always just be
the normal 'lrwxrwxrwx'.  That would have avoided this issue too, but
since somebody noticed that the permissions had changed (which was the
reason for that original commit 30a08bf2d3 in the first place), people
do apparently use this feature.

[ Basically, you can use the symlink permission data as a cheap "fdinfo"
  replacement, since you see whether the file is open for reading and/or
  writing by just looking at st_mode of the symlink.  So the feature
  does make sense, even if the pain it has caused means we probably
  shouldn't have done it to begin with. ]

Reported-and-tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-04 11:00:45 -07:00
Stephen Warren
ae58d1e406 i2c: Add generic I2C multiplexer using pinctrl API
This is useful for SoCs whose I2C module's signals can be routed to
different sets of pins at run-time, using the pinctrl API.

                                 +-----+  +-----+
                                 | dev |  | dev |
    +------------------------+   +-----+  +-----+
    | SoC                    |      |        |
    |                   /----|------+--------+
    |   +---+   +------+     | child bus A, on first set of pins
    |   |I2C|---|Pinmux|     |
    |   +---+   +------+     | child bus B, on second set of pins
    |                   \----|------+--------+--------+
    |                        |      |        |        |
    +------------------------+  +-----+  +-----+  +-----+
                                | dev |  | dev |  | dev |
                                +-----+  +-----+  +-----+

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
2012-06-04 16:49:43 +02:00
Joerg Roedel
eee53537c4 iommu/amd: Fix deadlock in ppr-handling error path
In the error path of the ppr_notifier it can happen that the
iommu->lock is taken recursively. This patch fixes the
problem by releasing the iommu->lock before any notifier is
invoked. This also requires moving the erratum workaround
for the ppr-log (the interrupt may arrive faster than the data
in the log) one function up.

Cc: stable@vger.kernel.org # v3.3, v3.4
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-04 12:47:44 +02:00
Joerg Roedel
c1bf94ec1e iommu/amd: Cache pdev pointer to root-bridge
At some point pci_get_bus_and_slot started to enable
interrupts. Since this function is used in the
amd_iommu_resume path it will enable interrupts on resume
which causes a warning. The fix will use a cached pointer
to the root-bridge to re-enable the IOMMU in case the BIOS
is broken.

Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-04 12:47:44 +02:00
Shlomo Pongratz
3aac6ff16a IB/mlx4: Fix EQ deallocation in legacy mode
Commit e605b743f3 ("IB/mlx4: Increase the number of vectors (EQs)
available for ULPs") didn't correctly handle the case where there
aren't enough MSI-X vectors to increase the number of EQs, so only the
legacy EQs are allocated.  This results in an attempt to memset() to
zero an EQ table that was never allocated, and a kernel crash.

Fix this by checking in the teardown flow if the table of EQs was ever
allocated.  Also remove some unneeded setting to zero of the EQ
related fields in struct mlx4_ib_dev.

Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-06-03 23:02:16 -07:00
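
A minimal user-space sketch of the teardown guard, with hypothetical names:
the EQ table is only touched if it was actually allocated, which is not the
case when the driver fell back to the legacy EQs:

    #include <stdlib.h>
    #include <string.h>

    struct ib_dev {
        int *eq_table;
        int  num_eqs;
    };

    static void teardown_eqs(struct ib_dev *dev)
    {
        if (!dev->eq_table)          /* legacy mode: nothing was allocated */
            return;
        memset(dev->eq_table, 0, dev->num_eqs * sizeof(*dev->eq_table));
        free(dev->eq_table);
        dev->eq_table = NULL;
    }

    int main(void)
    {
        struct ib_dev dev = { .eq_table = NULL, .num_eqs = 0 };
        teardown_eqs(&dev);          /* safe even though nothing was allocated */
        return 0;
    }
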
Marek Szyprowski
f1ae98da85 ARM: dma-mapping: remove unconditional dependency on CMA
CMA has been enabled unconditionally on all ARMv6+ systems to solve the
long-standing issue of double kernel mappings for all dma coherent
buffers. This however created a dependency on CONFIG_EXPERIMENTAL for
the whole ARM architecture, which should really be avoided. This patch
removes this dependency and lets one use the old, well-tested dma-mapping
implementation also on ARMv6+ systems without the need to use
EXPERIMENTAL stuff.

Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-06-04 08:01:24 +02:00
Thadeu Lima de Souza Cascardo
71b43fd573 RDMA/cxgb4: Fix crash when peer address is 0.0.0.0
When using rping -c -a 0.0.0.0 with iw_cxgb4, the system crashes when
rdma_connect() is called.  ip_dev_find() will return NULL, but pdev is
accessed anyway.

Checking that pdev is NULL and returning -ENODEV prevents the system
from crashing.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-06-03 22:59:15 -07:00
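
A minimal sketch of the check described above, with a hypothetical stand-in
for ip_dev_find(): bail out with -ENODEV instead of dereferencing a NULL
device pointer when the peer address does not resolve:

    #include <errno.h>
    #include <stdio.h>

    struct net_device { const char *name; };

    /* hypothetical stand-in for ip_dev_find(): returns NULL for 0.0.0.0 */
    static struct net_device *lookup_dev(unsigned int addr)
    {
        static struct net_device lo = { "lo" };
        return addr ? &lo : NULL;
    }

    static int connect_peer(unsigned int addr)
    {
        struct net_device *pdev = lookup_dev(addr);

        if (!pdev)
            return -ENODEV;          /* previously: crash on pdev->... */
        printf("using device %s\n", pdev->name);
        return 0;
    }

    int main(void)
    {
        printf("connect to 0.0.0.0: %d\n", connect_peer(0));
        return 0;
    }
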
Len Brown
d3514abcf5 Merge branches 'bugfix-battery', 'bugfix-misc', 'bugfix-rafael', 'bugfix-turbostat', 'bugfix-video' and 'workaround-pss' into release
bug fixes

Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-04 00:48:41 -04:00
Len Brown
7e1bd6e38b Merge branch 'upstream' into bugfix-video
Update the bugfix-video branch to 3.5-rc1
so I don't have to resolve the conflict
in these patches vs. upstream again.

Conflicts:
	drivers/gpu/drm/gma500/psb_drv.c

	text conflict: add comment vs delete neighboring line

	keep just this:
	/* igd_opregion_init(&dev_priv->opregion_dev); */
	/* acpi_video_register(); */

Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-04 00:35:19 -04:00
Len Brown
7ae30986dc ACPI: fix acpi_bus.h build warnings when ACPI is not enabled
introduced in Linux-3.5-rc1 by
66886d6f8c
(ACPI: Add stubs for (un)register_acpi_bus_type)

Fix header file warnings when CONFIG_ACPI is not enabled:

include/acpi/acpi_bus.h:443:42: warning: 'struct acpi_bus_type' declared inside parameter list
include/acpi/acpi_bus.h:443:42: warning: its scope is only this definition or declaration, which is probably not
include/acpi/acpi_bus.h:444:44: warning: 'struct acpi_bus_type' declared inside parameter list

Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-04 00:29:11 -04:00
Kukjin Kim
5041caa4d5 gpio/samsung: fix the typo 'exynos5_xxx' instead of 'exonys5_xxx'
Should be 'exynos5_xxx' instead of 'exonys5_xxx'.

It happened in commit 30b842889e ("Merge tag 'soc2' of
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc")
during the v3.5 merge window.

Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
[ My bad  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-03 21:21:01 -07:00
Fabio Estevam
2f07a6134f drivers: acpi: Fix dependency for ACPI_HOTPLUG_CPU
Fix the following build warning:

warning: (ACPI_HOTPLUG_CPU) selects ACPI_CONTAINER which has unmet direct dependencies (ACPI && EXPERIMENTAL)

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-04 00:07:33 -04:00
Len Brown
650a37f32d tools/power turbostat: fix IVB support
Initial IVB support went into turbostat in Linux-3.1:
553575f1ae
(tools turbostat: recognize and run properly on IVB)

However, when running on IVB, turbostat would fail
to report the new counters added with SNB: c7, pc2 and pc7.
So in scenarios where these counters are non-zero on IVB,
turbostat would report erroneous residency results.

In particular c7 time would be added to c1 time,
since c1 time is calculated as "that which is left over".

Also, turbostat reports MHz capabilities when passed
the "-v" option, and it would incorrectly report 133MHz
bclk instead of 100MHz bclk for IVB, which would inflate
GHz reported with that option.

This patch is a backport of a fix already included in turbostat v2.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-03 23:47:49 -04:00
Len Brown
d15cf7c129 tools/power turbostat: fix un-intended affinity of forked program
Linux 3.4 included a modification to turbostat to
lower cross-call overhead by using scheduler affinity:

15aaa34654
(tools turbostat: reduce measurement overhead due to IPIs)

In the use-case where turbostat forks a child program,
that change had the unintended side effect of binding
the child to the last cpu in the system.

This change removes the binding before forking the child.

This is a back-port of a fix already included in turbostat v2.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-03 23:24:00 -04:00
Linus Torvalds
4d578573b8 Merge branch 'pm-acpi' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull some left-over PM patches from Rafael J. Wysocki.

* 'pm-acpi' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / PM: Make acpi_pm_device_sleep_state() follow the specification
  ACPI / PM: Make __acpi_bus_get_power() cover D3cold correctly
  ACPI / PM: Fix error messages in drivers/acpi/bus.c
  rtc-cmos / PM: report wakeup event on ACPI RTC alarm
  ACPI / PM: Generate wakeup events on fixed power button
2012-06-03 20:15:57 -07:00
Linus Torvalds
68e3e92620 Revert "mm: compaction: handle incorrect MIGRATE_UNMOVABLE type pageblocks"
This reverts commit 5ceb9ce6fe.

That commit seems to be the cause of the mm compaction list corruption
issues that Dave Jones reported.  The locking (or rather, absence
thereof) is dubious, as is the use of the 'page' variable once it has
been found to be outside the pageblock range.

So revert it for now, we can re-visit this for 3.6.  If we even need to:
as Minchan Kim says, "The patch wasn't a bug fix and even test workload
was very theoretical".

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-03 20:05:57 -07:00
Hugh Dickins
752dc185da mm: fix warning in __set_page_dirty_nobuffers
New tmpfs use of !PageUptodate pages for fallocate() is triggering the
WARNING: at mm/page-writeback.c:1990 when __set_page_dirty_nobuffers()
is called from migrate_page_copy() for compaction.

It is anomalous that migration should use __set_page_dirty_nobuffers()
on an address_space that does not participate in dirty and writeback
accounting; and this has also been observed to insert surprising dirty
tags into a tmpfs radix_tree, despite tmpfs not using tags at all.

We should probably give migrate_page_copy() a better way to preserve the
tag and migrate accounting info, when mapping_cap_account_dirty().  But
that needs some more work: so in the interim, avoid the warning by using
a simple SetPageDirty on PageSwapBacked pages.

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-03 20:05:47 -07:00
Linus Torvalds
2f9d3df8aa vfs: move inode stat information closer together
The comment above it says "Stat data, not accessed from path walking",
but in fact some of the inode fields we use for the common stat data were
way down at the end of the inode, causing unnecessary cache misses for the
common stat operations.

The inode structure is pretty big, and this can change padding depending
on field width, but at least on the common 64-bit configurations this
doesn't change the size.  Some of our inode layout has historically been
to tro to avoid unnecessary padding fields, but cache locality is at
least as important for layout, if not more.

Noticed by looking at kernel profiles, and noticing that the "i_blkbits"
access stood out like a sore thumb.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-03 14:50:19 -07:00
Guenter Roeck
ca4620853a MAINTAINERS: Update my e-mail address
Maintainer activities moved to home server. Update e-mail address to reflect
reality.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Jean Delvare <khali@linux-fr.org>
2012-06-03 08:13:41 -07:00
Nicholas Bellinger
a4dff3043c target/file: Use O_DSYNC by default for FILEIO backends
Convert to use O_DSYNC for all cases at FILEIO backend creation time to
avoid the extra syncing of pure timestamp updates with legacy O_SYNC during
default operation as recommended by hch.  Continue to do this independently of
Write Cache Enable (WCE) bit, as WCE=0 is currently the default for all backend
devices, and WCE can be enabled by the user on a per-device basis via
attrib/emulate_write_cache.

This patch drops the now unnecessary fd_buffered_io= token usage that was
originally signalling when to explicitly disable O_SYNC at backend creation
time for buffered I/O operation.  This can end up being dangerous for a number
of reasons during physical node failure, so go ahead and drop this option
for now when O_DSYNC is used as the default.

Also allow explicit FUA WRITEs -> vfs_fsync_range() calls to function in
fd_execute_cmd() independently of the WCE bit setting.

Reported-by: Christoph Hellwig <hch@lst.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-06-02 23:47:20 -07:00
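
As a runnable illustration of the O_SYNC vs. O_DSYNC distinction the patch
relies on (arbitrary file name, not the target code): O_DSYNC makes each
write() persist the data and the metadata needed to read it back, but not
pure timestamp updates:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/dsync-demo", O_RDWR | O_CREAT | O_DSYNC, 0600);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* each write returns only after the data itself is on stable storage */
        if (write(fd, "hello\n", 6) != 6)
            perror("write");
        close(fd);
        return 0;
    }
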
Dan Carpenter
301f33fbcf ACPI video: use after input_unregister_device()
We can't use "input" anymore after calling input_unregister_device().
The call to input_free_device() is a double free.  The normal way to
deal with this is to make input_register_device() the last call in
the function.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-01 13:41:40 -04:00
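
A minimal sketch of the ordering the fix describes, with hypothetical names
modelled loosely on the input API: registration is the last step, so the
object is only freed on the failure path and never after a successful
register:

    #include <stdio.h>
    #include <stdlib.h>

    struct input_dev_model { const char *name; };

    static int register_model(struct input_dev_model *d)
    {
        printf("registered %s\n", d->name);
        return 0;                        /* assume success for the demo */
    }

    int main(void)
    {
        struct input_dev_model *dev = malloc(sizeof(*dev));
        if (!dev)
            return 1;
        dev->name = "acpi-video-demo";   /* ... all other setup first ... */

        if (register_model(dev)) {       /* register last */
            free(dev);                   /* the only path that frees here */
            return 1;
        }
        /* after a successful registration, teardown goes through
         * unregister, never through a second free of the same object */
        return 0;
    }
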
Alan Cox
155689defc gma500: don't register the ACPI video bus
We are not yet ready for this and it makes a mess on some devices.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-01 13:40:56 -04:00
Alan Cox
c6996bdd85 acpi_video: Intel video is not always i915
Stop it from poking at random registers on the i740 cards that may still
be out there.

As per Matthew's feedback remove the conditional checks and never enable the
opregion handling unless an appropriate driver has been loaded.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-01 13:40:55 -04:00
Alan Cox
cfb46f433a acpi_video: fix leaking PCI references
Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-01 13:40:55 -04:00
Pavel Shilovsky
8825736060 CIFS: Move get_next_mid to ops struct
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-06-01 12:35:19 -05:00
Pavel Shilovsky
7f0adb53bc CIFS: Make accessing is_valid_oplock/dump_detail ops struct field safe
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-06-01 12:35:16 -05:00
Pavel Shilovsky
ea319d57d3 CIFS: Improve identation in cifs_unlock_range
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-06-01 12:35:12 -05:00
Pavel Shilovsky
0013fb4ca3 CIFS: Fix possible wrong memory allocation
when cifs_reconnect sets maxBuf to 0 and we try to calculate the size
of the memory we need to store locks.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-06-01 12:35:08 -05:00
Namhyung Kim
cb7225feec perf: Remove duplicate invocation on perf_event_for_each
The @func callback was invoked twice for the group leader when
perf_event_for_each() was called. It seems commit 75f937f24b
("perf_counter: Fix ctx->mutex vs counter->mutex inversion") introduced
the mistake during that change.

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338443506-25009-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 14:01:00 -03:00
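
A minimal user-space model of the fixed iteration, with hypothetical types:
the callback runs exactly once for the group leader and once per sibling,
rather than visiting the leader twice:

    #include <stdio.h>

    struct event { const char *name; };

    static void for_each_event(struct event *leader,
                               struct event *siblings, int n,
                               void (*func)(struct event *))
    {
        func(leader);                   /* leader exactly once */
        for (int i = 0; i < n; i++)
            func(&siblings[i]);         /* then each sibling */
    }

    static void show(struct event *e) { printf("visit %s\n", e->name); }

    int main(void)
    {
        struct event leader = { "leader" };
        struct event sibs[] = { { "sibling-0" }, { "sibling-1" } };

        for_each_event(&leader, sibs, 2, show);
        return 0;
    }
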
Srikar Dronamraju
a23c4dc422 perf uprobes: Remove unnecessary check before strlist__delete
Since strlist__delete() already does the check itself, the additional
check before calling it is redundant.

No Functional change.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20120531114643.23691.38666.sendpatchset@srdronam.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 12:08:49 -03:00
Srikar Dronamraju
378474e4b2 perf symbols: Check for valid dso before creating map
dso__new() can return NULL. Hence verify dso before creating a new map.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20120531114656.23691.54223.sendpatchset@srdronam.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 12:08:22 -03:00
Jiri Olsa
37073f9e44 perf evsel: Fix 32 bit values endianity swap for sample_id_all header
We swap the sample_id_all header by u64 pointers. Some members of the
header happen to be 32 bit values. We need to handle them separately.

Together with the other endianness patches, this change fixes perf report
discrepancies on origin and target systems as described in test 1 below,
e.g. the following perf report diff:

...
      0.12%               ps  [kernel.kallsyms]    [k] clear_page
-     0.12%              awk  bash                 [.] alloc_word_desc
+     0.12%              awk  bash                 [.] yyparse
      0.11%   beah-rhts-task  libpython2.6.so.1.0  [.] 0x5560e
      0.10%             perf  libc-2.12.so         [.] __ctype_toupper_loc
-     0.09%  rhts-test-runne  bash                 [.] maybe_make_export_env
+     0.09%  rhts-test-runne  bash                 [.] 0x385a0
      0.09%               ps  [kernel.kallsyms]    [k] page_fault
...

Note, run the following to test perf endianness handling:
test 1)
  - origin system:
    # perf record -a -- sleep 10 (any perf record will do)
    # perf report > report.origin
    # perf archive perf.data

  - copy the perf.data, report.origin and perf.data.tar.bz2
    to a target system and run:
    # tar xjvf perf.data.tar.bz2 -C ~/.debug
    # perf report > report.target
    # diff -u report.origin report.target

  - the diff should produce no output
    (besides some white space stuff and possibly different
     date/TZ output)

test 2)
  - origin system:
    # perf record -ag -fo /tmp/perf.data -- sleep 1
  - mount origin system root to the target system on /mnt/origin
  - target system:
    # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
     --kallsyms /mnt/origin/proc/kallsyms
  - complete perf.data header is displayed

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338380624-7443-4-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 11:59:01 -03:00
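
A runnable illustration of the point above (hypothetical values): when a u64
slot actually carries two independent u32 members, byte-swapping it as a
single u64 also exchanges the two halves, so each 32-bit member has to be
swapped separately:

    #include <byteswap.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        uint32_t vals[2] = { 0x1234, 0x5678 };   /* e.g. pid and tid */
        uint64_t slot;
        memcpy(&slot, vals, sizeof(slot));

        /* wrong: one u64 swap also exchanges the two 32-bit members */
        uint64_t wrong = bswap_64(slot);

        /* right: swap each 32-bit member on its own */
        uint32_t fixed[2] = { bswap_32(vals[0]), bswap_32(vals[1]) };

        printf("u64 swap : %016llx\n", (unsigned long long)wrong);
        printf("u32 swaps: %08x %08x\n", fixed[0], fixed[1]);
        return 0;
    }
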
Jiri Olsa
268fb20f83 perf session: Handle endianity swap on sample_id_all header data
Add endianness swapping for the event header attached via sample_id_all.

Currently we don't do that, and it causes wrong data to be read when
running report on an architecture with a different endianness than the
one that did the record.

The perf is currently able to process 32-bit PPC samples on 32-bit
and 64-bit x86.

Together with the other endianness patches, this change fixes perf report
discrepancies on origin and target systems as described in test 1
below, e.g. the following perf report diff:

...
      0.12%               ps  [kernel.kallsyms]    [k] clear_page
-     0.12%              awk  bash                 [.] alloc_word_desc
+     0.12%              awk  bash                 [.] yyparse
      0.11%   beah-rhts-task  libpython2.6.so.1.0  [.] 0x5560e
      0.10%             perf  libc-2.12.so         [.] __ctype_toupper_loc
-     0.09%  rhts-test-runne  bash                 [.] maybe_make_export_env
+     0.09%  rhts-test-runne  bash                 [.] 0x385a0
      0.09%               ps  [kernel.kallsyms]    [k] page_fault
...

Note, run the following to test perf endianness handling:
test 1)
  - origin system:
    # perf record -a -- sleep 10 (any perf record will do)
    # perf report > report.origin
    # perf archive perf.data

  - copy the perf.data, report.origin and perf.data.tar.bz2
    to a target system and run:
    # tar xjvf perf.data.tar.bz2 -C ~/.debug
    # perf report > report.target
    # diff -u report.origin report.target

  - the diff should produce no output
    (besides some white space stuff and possibly different
     date/TZ output)

test 2)
  - origin system:
    # perf record -ag -fo /tmp/perf.data -- sleep 1
  - mount origin system root to the target system on /mnt/origin
  - target system:
    # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
     --kallsyms /mnt/origin/proc/kallsyms
  - complete perf.data header is displayed

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338380624-7443-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 11:58:14 -03:00
Jiri Olsa
8db4841fc7 perf symbols: Handle different endians properly during symbol load
Currently we don't care about the file object's endianness. It's possible
that we read a buildid file object from a different architecture than the
one we are currently running on, so we need to read such an object's data
properly, handling the different endianness.

Adding:
	needs_swap DSO field
	dso__swap_init function to initialize DSO's needs_swap
	DSO__SWAP to read the data with proper swaps

Together with the other endianness patches, this change fixes perf report
discrepancies between origin and target systems as described in test 1 below,
e.g. the following perf report diff:

...
      0.12%               ps  [kernel.kallsyms]    [k] clear_page
-     0.12%              awk  bash                 [.] alloc_word_desc
+     0.12%              awk  bash                 [.] yyparse
      0.11%   beah-rhts-task  libpython2.6.so.1.0  [.] 0x5560e
      0.10%             perf  libc-2.12.so         [.] __ctype_toupper_loc
-     0.09%  rhts-test-runne  bash                 [.] maybe_make_export_env
+     0.09%  rhts-test-runne  bash                 [.] 0x385a0
      0.09%               ps  [kernel.kallsyms]    [k] page_fault
...

Note: run the following to test perf endianness handling:
test 1)
  - origin system:
    # perf record -a -- sleep 10 (any perf record will do)
    # perf report > report.origin
    # perf archive perf.data

  - copy the perf.data, report.origin and perf.data.tar.bz2
    to a target system and run:
    # tar xjvf perf.data.tar.bz2 -C ~/.debug
    # perf report > report.target
    # diff -u report.origin report.target

  - the diff should produce no output
    (besides some white space stuff and possibly different
     date/TZ output)

test 2)
  - origin system:
    # perf record -ag -fo /tmp/perf.data -- sleep 1
  - mount origin system root to the target system on /mnt/origin
  - target system:
    # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
     --kallsyms /mnt/origin/proc/kallsyms
  - complete perf.data header is displayed

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338380624-7443-2-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 11:55:36 -03:00
Namhyung Kim
55da80059d perf evlist: Pass third argument to ioctl explicitly
The ioctl on a perf event fd takes 3 arguments but we only passed 2. As
the only user of these functions is perf record, which calls them for
every event (regardless of group setting), just pass 0 for now.
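
A minimal sketch of the call in question (these are the standard perf_event
ioctl constants; 0 applies the request to this event only,
PERF_IOC_FLAG_GROUP to the whole event group):

  /* enable just this event */
  ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

  /* enable the whole group this event belongs to */
  ioctl(fd, PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);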

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338443506-25009-3-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 11:39:16 -03:00
Namhyung Kim
a59e64a13a perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP
The ioctl interface of a perf event fd takes 3 arguments to control
event group behavior, but this lacked documentation.

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338443506-25009-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 11:38:42 -03:00
Arnaldo Carvalho de Melo
aa5cdd308d perf tools: Make --version show kernel version instead of pull req tag
Before:

  $ perf --version
  perf version perf.urgent.for.mingo.5.g37da28

After:

  $ perf --version
  perf version 3.4.8941.g37da28.dirty

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-vc9b4e6023iegz9kabr3yvyv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 11:20:59 -03:00
Namhyung Kim
114067b69e perf tools: Check if callchain is corrupted
We faced a segmentation fault in 'perf top -G' at a very high sampling
rate due to a corrupted callchain. While the root cause was not revealed
(I failed to figure it out), this patch tries to protect us from the
segfault in such cases.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sunjin Yang <fan4326@gmail.com>
Link: http://lkml.kernel.org/r/1338443007-24857-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 11:20:34 -03:00
Namhyung Kim
472606458f perf callchain: Make callchain cursors TLS
perf top -G has a race on the callchain cursor between the main thread
and the display thread. Since the callchain cursors are used locally,
making them thread-local data solves the problem.
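
A minimal sketch of the change (illustrative; the cursor used to be a
single shared object):

  /* before: one cursor shared by the main and display threads */
  struct callchain_cursor callchain_cursor;

  /* after: one cursor per thread, removing the race */
  __thread struct callchain_cursor callchain_cursor;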

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Reported-by: Sunjin Yang <fan4326@gmail.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sunjin Yang <fan4326@gmail.com>
Link: http://lkml.kernel.org/r/1338443007-24857-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-31 10:47:12 -03:00
Chris Wilson
9e612a008f drm/i915/crt: Do not rely upon the HPD presence pin
Whilst most monitors do wire up the HPD presence pin, it seems quite a
few KVMs do not. Therefore, if we simply rely on the HPD pin being
asserted to indicate a connected monitor, we fail miserably, so fall back
to performing a DDC query for the EDID.

Reported-and-tested-by: Matthieu LAVIE <boiteamadmax@hotmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50501
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-31 14:50:31 +02:00
NeilBrown
aba336bd1d md: raid1/raid10: fix problem with merge_bvec_fn
The new merge_bvec_fn which calls the corresponding function
in subsidiary devices requires that mddev->merge_check_needed
be set if any child has a merge_bvec_fn.

However, we were only setting that when a device was hot-added,
not when a device was present from the start.

This bug was introduced in 3.4 so the patch is suitable for 3.4.y
kernels.  However, there are conflicts in raid10.c, so a separate
patch will be needed for 3.4.y.

Cc: stable@vger.kernel.org
Reported-by: Sebastian Riemer <sebastian.riemer@profitbricks.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-05-31 15:56:30 +10:00
Stefan Richter
5f2a3d6191 sbp-target: rename a variable to avoid name clash
'int login_id' shadows 'static atomic_t login_id'.
Seen as a compilation warning on x86-32.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Chris Boot <bootc@bootc.net>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-05-30 16:15:33 -07:00
Avik Sil
ea1b3ebac9 perf tools: Fix pager on minimal-install embedded systems
Some distributions may lack the "less" package by default, e.g. the
Linaro nano rootfs. In those cases use the portable "pager" command
instead of "less".

Signed-off-by: Avik Sil <avik.sil@linaro.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338287725-26382-1-git-send-email-avik.sil@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30 15:10:39 -03:00
Arnaldo Carvalho de Melo
f1439c315b perf tools: Fix make tarballs
The patch series that introduced the top-level tools/ Makefile and
libtraceevent broke this feature: files needed to build from a detached
tarball were not included in the MANIFEST file and thus not included in
the tarball.

Fix it by adding the relevant files to the MANIFEST.

Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-z3mjj74927xvqwhlmu18kj80@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30 15:05:59 -03:00
David Ahern
52deff71bc perf script: Fix regression in callchain dso name
$ perf script -i /tmp/perf.data
...
gcc 13623 544315.062858: context-switches:
    ffffffff815f65c9 __schedule ([kernel.kallsyms])
    ffffffff81087cea __cond_resched ([kernel.kallsyms])
    ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
    ffffffff815fb87a do_page_fault ([kernel.kallsyms])
    ffffffff815f8465 page_fault ([kernel.kallsyms])
        2b7a71ea0303 _dl_lookup_symbol_x ([kernel.kallsyms])
        2b7a71ea1eb5 _dl_relocate_object ([kernel.kallsyms])
        2b7a71e99b2e dl_main ([kernel.kallsyms])
        2b7a71eab7f4 _dl_sysdep_start ([kernel.kallsyms])

All DSO's in a callchain are printed as [kernel.kallsyms].

git bisect chased it to:

547a92e0ae is the first bad commit
commit 547a92e0ae
Author: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
Date:   Mon Jan 30 13:42:57 2012 +0900

    perf script: Unify the expressions indicating "unknown"

    The perf script command uses various expressions to indicate "unknown".

    It is unfriendly for user scripts to parse it. So, this patch unifies
    the expressions to "[unknown]".

Looks like a copy-paste issue, in that the other references use al.map but
this one should use node->map.

With this patch you get:

$ perf script -i /tmp/perf.data
...
gcc 13623 544315.062858: context-switches:
    ffffffff815f65c9 __schedule ([kernel.kallsyms])
    ffffffff81087cea __cond_resched ([kernel.kallsyms])
    ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
    ffffffff815fb87a do_page_fault ([kernel.kallsyms])
    ffffffff815f8465 page_fault ([kernel.kallsyms])
        2b7a71ea0303 _dl_lookup_symbol_x (/lib64/ld-2.14.90.so)
        2b7a71ea1eb5 _dl_relocate_object (/lib64/ld-2.14.90.so)
        2b7a71e99b2e dl_main (/lib64/ld-2.14.90.so)
        2b7a71eab7f4 _dl_sysdep_start (/lib64/ld-2.14.90.so)

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1338353906-60706-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30 14:24:38 -03:00
Arnaldo Carvalho de Melo
79695e1bb6 perf stat: Initialize default events wrt exclude_{guest,host}
When no event is specified the tools use perf_evlist__add_default(), which
will call event_attr_init to initialize the KVM exclusion bits.

When the change was made to the tools so that by default guest samples would be
excluded, the changes were made just to the parsing routines and to
perf_evlist__add_default(), not to perf_evlist__add_attrs, which is used so far
just by perf stat to add multiple events according to the level of detail
specified.

Recently the tools were changed to reconstruct the event name from all the
details in perf_event_attr, not just from .type and .config, but taking into
account all the feature bits (.exclude_{guest,host,user,kernel,etc},
.precise_ip, etc).

That is when we noticed that the default for perf stat wasn't the one for the
rest of the tools, i.e. the .exclude_guest bit wasn't being set.

I.e. the default, which doesn't call event_attr_init, was showing the :HG
modifier:

  $ perf stat usleep 1

   Performance counter stats for 'usleep 1':

            0.942119 task-clock                #    0.454 CPUs utilized
                   1 context-switches          #    0.001 M/sec
                   0 CPU-migrations            #    0.000 K/sec
                 126 page-faults               #    0.134 M/sec
             693,193 cycles:HG                 #    0.736 GHz                     [40.11%]
             407,461 stalled-cycles-frontend:HG #   58.78% frontend cycles idle    [72.29%]
             365,403 stalled-cycles-backend:HG #   52.71% backend  cycles idle
             465,982 instructions:HG           #    0.67  insns per cycle
                                               #    0.87  stalled cycles per insn
              89,760 branches:HG               #   95.275 M/sec
               6,178 branch-misses:HG          #    6.88% of all branches

         0.002077228 seconds time elapsed

Whereas if one explicitly specifies the same events, the parsing code is
used and thus event_attr_init is called:

  $ perf stat -e task-clock,context-switches,migrations,page-faults,cycles,stalled-cycles-frontend,stalled-cycles-backend,instructions,branches,branch-misses usleep 1

   Performance counter stats for 'usleep 1':

            1.040349 task-clock                #    0.500 CPUs utilized
                   2 context-switches          #    0.002 M/sec
                   0 CPU-migrations            #    0.000 K/sec
                 127 page-faults               #    0.122 M/sec
             587,966 cycles                    #    0.565 GHz                     [13.18%]
             459,167 stalled-cycles-frontend   #   78.09% frontend cycles idle
             390,249 stalled-cycles-backend    #   66.37% backend  cycles idle
             504,006 instructions              #    0.86  insns per cycle
                                               #    0.91  stalled cycles per insn
              96,455 branches                  #   92.714 M/sec
               6,522 branch-misses             #    6.76% of all branches         [96.12%]

         0.002078681 seconds time elapsed

Fix it by introducing a perf_evlist__add_default_attrs method that will call
event_attr_init on all the perf_event_attr entries before adding the events.
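
A rough sketch of the idea (simplified, error handling omitted):

  int perf_evlist__add_default_attrs(struct perf_evlist *evlist,
                                     struct perf_event_attr *attrs,
                                     size_t nr_attrs)
  {
          size_t i;

          /* apply the same defaults (.exclude_guest etc.) the parser applies */
          for (i = 0; i < nr_attrs; i++)
                  event_attr_init(attrs + i);

          return perf_evlist__add_attrs(evlist, attrs, nr_attrs);
  }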

Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4eysr236r0pgiyum9epwxw7s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30 14:02:38 -03:00
Arnaldo Carvalho de Melo
107baecaca perf annotate browser: Fix help window entry for navigating to hottest line
It's 'H', not 'h'. The latter is for getting to the help window.

Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-7zvwphhm815y2zczoxgstzuf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30 12:31:44 -03:00
Arnaldo Carvalho de Melo
91557e847d perf report: Use the right symbol for annotation
In non symbolic views, i.e. --sort without "symbol", as in:

 perf report --sort comm

We're segfaulting in the --tui because we're testing the resolved symbol
and then trying to use the symbol on the histogram entry where we're
coalescing all hits for a COMM, and the first hist_entry for a comm may
have a NULL symbol, i.e. the RIP didn't resolve to any symbol.

In this case we're segfaulting, fix it by testing against the symbol in
the histogram entry.

Reported-by: William Cohen <wcohen@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-8ylwubbcmu27ucc9ffrku3yv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-30 12:25:52 -03:00
Kamalesh Babulal
6a4c96eef4 sched: Remove NULL assignment of dattr_cur
Remove explicit NULL assignment of static pointer
dattr_cur from init_sched_domains().

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120523091411.GG5005@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 14:02:27 +02:00
Hiroshi Shimamoto
7997a456ef sched: Remove the last NULL entry from sched_feat_names
No need to have the last NULL entry.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4FBF29E7.5020805@ct.jp.nec.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 14:02:27 +02:00
Hiroshi Shimamoto
1292531f6f sched: Make sched_feat_names const
The strings sched_feat_names are never changed.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4FBF29B2.9030904@ct.jp.nec.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 14:02:26 +02:00
Colin Cross
454c79999f sched/rt: Fix SCHED_RR across cgroups
task_tick_rt() has an optimization to only reschedule SCHED_RR tasks
if they were the only element on their rq.  However, with cgroups
a SCHED_RR task could be the only element on its per-cgroup rq but
still be competing with other SCHED_RR tasks in its parent's
cgroup.  In this case, the SCHED_RR task in the child cgroup would
never yield at the end of its timeslice.  If the child cgroup
rt_runtime_us was the same as the parent cgroup rt_runtime_us,
the task in the parent cgroup would starve completely.

Modify task_tick_rt() to check that the task is the only task on its
rq, and that each of the scheduling entities of its ancestors
is also the only entity on its rq.
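
A sketch of the described check (assuming the for_each_sched_rt_entity()
iterator and the run_list layout used by kernel/sched/rt.c at the time):

  /* requeue only if we and all of our ancestors are alone on their rq */
  for_each_sched_rt_entity(rt_se) {
          if (rt_se->run_list.prev != rt_se->run_list.next) {
                  requeue_task_rt(rq, p, 0);
                  set_tsk_need_resched(p);
                  return;
          }
  }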

Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337229266-15798-1-git-send-email-ccross@android.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 14:02:25 +02:00
Peter Zijlstra
29baa7478b sched: Move nr_cpus_allowed out of 'struct sched_rt_entity'
Since nr_cpus_allowed is used outside of sched/rt.c, and will be used
outside of there even more, move it to a more natural site.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-kr61f02y9brwzkh6x53pdptm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 14:02:25 +02:00
Peter Zijlstra
b654f7de41 sched: Make sure to not re-read variables after validation
We could re-read rq->rt_avg after we validated it was smaller than
total, invalidating the check and resulting in an unintended negative.
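
A sketch of the general technique (read the shared value once into a
local and only use the local copy afterwards; illustrative):

  u64 avg = ACCESS_ONCE(rq->rt_avg);   /* single read of the shared field */

  if (unlikely(avg > total))
          avg = total;                 /* clamp; never re-read rq->rt_avg */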

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Rientjes <rientjes@google.com>
Link: http://lkml.kernel.org/r/1337688268.9698.29.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 14:02:24 +02:00
Peter Zijlstra
74a5ce20e6 sched: Fix SD_OVERLAP
SD_OVERLAP exists to allow overlapping groups; overlapping groups
appear in NUMA topologies that aren't fully connected.

The typical result of not fully connected NUMA is that each cpu (or
rather node) will have different spans for a particular distance.
However due to how sched domains are traversed -- only the first cpu
in the mask goes one level up -- the next level only cares about the
spans of the cpus that went up.

Due to this two things were observed to be broken:

 - build_overlap_sched_groups() -- since it's possible the cpu we're
   building the groups for exists in multiple (or all) groups, the
   selection criterion for the first group didn't ensure there was a cpu
   for which it was true that cpumask_first(span) == cpu. Thus load
   balancing would terminate.

 - update_group_power() -- assumed that the cpu span of the first
   group of the domain was covered by all groups of the child domain.
   The above explains why this isn't true, so deal with it.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Rientjes <rientjes@google.com>
Link: http://lkml.kernel.org/r/1337788843.9783.14.camel@laptop
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 14:02:24 +02:00
Peter Zijlstra
2ea45800d8 sched: Don't try allocating memory from offline nodes
Allocators don't appreciate it when you try and allocate memory from
offline nodes.

Reported-and-tested-by: Tony Luck <tony.luck@intel.com>
Reported-and-tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-epfc1io9whb7o22bcujf31vn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 14:02:23 +02:00
Peter Zijlstra
5aaa0b7a2e sched/nohz: Fix rq->cpu_load calculations some more
Follow up on commit 556061b00 ("sched/nohz: Fix rq->cpu_load[]
calculations") since while that fixed the busy case it regressed the
mostly idle case.

Add a callback from the nohz exit to also age the rq->cpu_load[]
array. This closes the hole where either there was no nohz load
balance pass during the nohz, or there was a 'significant' amount of
idle time between the last nohz balance and the nohz exit.

So we'll update unconditionally from the tick to not insert any
accidental 0 load periods while busy, and we try and catch up from
nohz idle balance and nohz exit. Both these are still prone to missing
a jiffy, but that has always been the case.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: pjt@google.com
Cc: Venkatesh Pallipadi <venki@google.com>
Link: http://lkml.kernel.org/n/tip-kt0trz0apodbf84ucjfdbr1a@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 14:02:16 +02:00
Peter Zijlstra
9f646389aa sched/x86: Use cpu_llc_shared_mask(cpu) for coregroup_mask
Commit 8e7fbcbc2 ("sched: Remove stale power aware scheduling
remnants and dysfunctional knobs") made a boo-boo when removing the
power aware scheduling muck from the x86 topology bits.

We should unconditionally use the llc_shared mask for multi-core.
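
A minimal sketch of what that amounts to (simplified):

  const struct cpumask *cpu_coregroup_mask(int cpu)
  {
          return cpu_llc_shared_mask(cpu);
  }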

Reported-and-tested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Borislav Petkov <bp@amd64.org>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Link: http://lkml.kernel.org/n/tip-lsksc2kfyeveb13avh327p0d@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 11:05:44 +02:00
Ingo Molnar
063e047761 Merge branch 'linus' into perf/urgent
Merge back Linus's latest branch so that we pick up the uprobes changes.

( I tested this branch locally and while it's one from the middle of the
  merge window it's a good one to base further work off. )

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 10:59:04 +02:00
Devendra Naga
7ad5e449b9 RDMA/ocrdma: Remove unnecessary version.h includes
"make versioncheck" shows:

    drivers/infiniband/hw/ocrdma/ocrdma_main.c: 29 linux/version.h not needed.
    drivers/infiniband/hw/ocrdma/ocrdma_verbs.h: 31 linux/version.h not needed.

Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-29 12:53:11 -07:00
Parav Pandit
804eaf29ba RDMA/ocrdma: Fix signaled event for SRQ_LIMIT_REACHED
Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-29 12:52:00 -07:00
Parav Pandit
cd4fedf9cf RDMA/ocrdma: Correct queue free count math
Correct the queue free count math for SQ and RQ for all hardware types.
Update the user-kernel ABI interface.

Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-29 12:49:36 -07:00
Rafael J. Wysocki
dbe9a2edd1 ACPI / PM: Make acpi_pm_device_sleep_state() follow the specification
The comparison between the system sleep state being entered
and the lowest system sleep state the given device may wake up
from in acpi_pm_device_sleep_state() is reversed, because the
specification (ACPI 5.0) says that for wakeup to work:

"The sleeping state being entered must be less than or equal to the
 power state declared in element 1 of the _PRW object."

In other words, the state returned by _PRW is the deepest
(lowest-power) system sleep state the device is capable of waking up
the system from.

Moreover, acpi_pm_device_sleep_state() also should check if the
wakeup capability is supported through ACPI, because in principle it
may be done via native PCIe PME, for example, in which case _SxW
should not be evaluated.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-05-29 21:21:07 +02:00
Rafael J. Wysocki
38c92fff98 ACPI / PM: Make __acpi_bus_get_power() cover D3cold correctly
After recent changes of the ACPI device power states definitions, if
power resources are not used for the device's power management, the
state returned by __acpi_bus_get_power() cannot exceed D3hot, because
the return values of _PSC are 0 through 3.  However, if the _PR3
method is not present for the device and _PS3 returns 3, we have to
assume that the device is in D3cold, so the value returned by
__acpi_bus_get_power() in that case should be 4.

Similarly, acpi_power_get_inferred_state() should take the power
resources for the D3hot state into account in general, so that it
can return 3 if those resources are "on" or 4 (D3cold) otherwise.

Fix the above two issues and make sure that if both _PSC and
_PR3 are present for the device, the power resources listed by _PR3
will be used to determine if the number 3 returned by _PSC is meant
to represent D3cold or D3hot.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-05-29 21:20:24 +02:00
Rafael J. Wysocki
63a1a765df ACPI / PM: Fix error messages in drivers/acpi/bus.c
After the recent changes of the ACPI device low-power state definitions,
kernel messages in drivers/acpi/bus.c need to be updated so that they
include the correct names of the states in question (currently "D3" is
used for D3hot and "D4" for D3cold, which is incorrect).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-05-29 21:20:23 +02:00
Daniel Drake
b2201e5482 rtc-cmos / PM: report wakeup event on ACPI RTC alarm
When the ACPI-driven RTC alarm wakes the system, report it as a wakeup
event. This allows userspace to determine that the reason for system
wakeup was RTC alarm.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-05-29 21:20:23 +02:00
Daniel Drake
c10d7a1384 ACPI / PM: Generate wakeup events on fixed power button
When the system is woken up by the ACPI fixed power button, there is
currently no way for userspace to become aware that the power button was
pressed.

OLPC would like to know this, so that we can respond appropriately.
For example, if the system was woken up by a network packet, we know
we can go back to sleep very quickly. But if the user explicitly woke the
system with the power button, we're going to want to stay awake for a
while.

The wakeup count mechanism seems like a good fit for communicating this.
Mark the fixed power button as wakeup-enabled, and increment its wakeup
counter when the system is woken with the power button. (The wakeup counter
is also incremented when the power button is pressed during system
operation; this is already handled by an existing acpi-button codepath).

Signed-off-by: Daniel Drake <dsd@laptop.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-05-29 21:20:23 +02:00
Chris Wilson
c3b2003792 drm/i915: Reset last_retired_head when resetting ring
When we reset the ring control registers, including the HEAD and TAIL of
the ring, we also need to reset associated state. In this instance, we
were failing to reset the cached value of ring->last_retired_head and so
upon the first request for more space following a resume would
potentially (depending on a narrow race window) believe that the HEAD had
advanced much further than reality.

This is a regression from:

commit a71d8d9452
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Feb 15 11:25:36 2012 +0000

    drm/i915: Record the tail at each request and use it to estimate the head

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org # 3.4
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-29 20:06:58 +02:00
Jim Kukunas
2aa4ee2a88 lib/raid6: fix sparse warnings in recovery functions
Make the recovery functions static to fix the following sparse warnings:

lib/raid6/recov.c:25:6: warning: symbol 'raid6_2data_recov_intx1' was
not declared. Should it be static?
lib/raid6/recov.c:69:6: warning: symbol 'raid6_datap_recov_intx1' was
not declared. Should it be static?
lib/raid6/recov_ssse3.c:22:6: warning: symbol 'raid6_2data_recov_ssse3'
was not declared. Should it be static?
lib/raid6/recov_ssse3.c:197:6: warning: symbol 'raid6_datap_recov_ssse3'
was not declared. Should it be static?

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-05-28 14:10:22 +10:00
Tejun Heo
fa980ca87d cgroup: superblock can't be released with active dentries
48ddbe1946 "cgroup: make css->refcnt clearing on cgroup removal
optional" allowed a css to linger after the associated cgroup is
removed.  As a css holds a reference on the cgroup's dentry, it means
that cgroup dentries may linger for a while.

cgroup_create() does grab an active reference on the superblock to
prevent it from going away while there are !root cgroups; however, the
reference is put from cgroup_diput() which is invoked on cgroup
removal, so cgroup dentries which are removed but persisting due to
lingering csses already have released their superblock active refs
allowing superblock to be killed while those dentries are around.

Given the right condition, this makes cgroup_kill_sb() call
kill_litter_super() with dentries with non-zero d_count leading to
BUG() in shrink_dcache_for_umount_subtree().

Fix it by adding cgroup_dops->d_release() operation and moving
deactivate_super() to it.  cgroup_diput() now marks dentry->d_fsdata
with itself if superblock should be deactivated and cgroup_d_release()
deactivates the superblock on dentry release.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
LKML-Reference: <CA+1xoqe5hMuxzCRhMy7J0XchDk2ZnuxOHJKikROk1-ReAzcT6g@mail.gmail.com>
Acked-by: Li Zefan <lizefan@huawei.com>
2012-05-27 17:22:56 -07:00
Thomas Gleixner
62cf20b32a tick: Move skew_tick option into the HIGH_RES_TIMER section
commit 5307c95 (tick: Add tick skew boot option) broke the
!CONFIG_HIGH_RES_TIMERS build.

Move the boot option parsing into the CONFIG_HIGH_RES_TIMERS section.
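
A sketch of the move (the variable name is an assumption based on the
tick skew patch; only the placement of the parsing changes):

  #ifdef CONFIG_HIGH_RES_TIMERS
  /* ... existing high resolution timer code ... */

  static int __init skew_tick(char *str)
  {
          get_option(&str, &sched_skew_tick);
          return 0;
  }
  early_param("skew_tick", skew_tick);
  #endif /* CONFIG_HIGH_RES_TIMERS */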

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <mgalbraith@suse.de>
2012-05-25 14:08:57 +02:00
Magnus Damm
fc0830fe01 clocksource: em_sti: Add DT support
Update the em-sti driver to support DT.

Signed-off-by: Magnus Damm <damm@opensource.se>
Cc: arnd@arndb.de
Cc: horms@verge.net.au
Cc: johnstul@us.ibm.com
Cc: rjw@sisk.pl
Cc: lethal@linux-sh.org
Cc: gregkh@linuxfoundation.org
Cc: olof@lixom.net
Link: http://lkml.kernel.org/r/20120509143950.27521.7949.sendpatchset@w520
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-25 11:32:06 +02:00
Magnus Damm
b9dbf95177 clocksource: em_sti: Emma Mobile STI driver
The STI hardware is based on a single 48-bit 32kHz
counter that together with two individual compare
registers can generate interrupts. There are no
selectable timer operating modes, which means that
the timer cannot clear on match.

This driver is providing clocksource support for the
48-bit counter. Clockevents are also supported using
the same timer in oneshot mode.

Signed-off-by: Magnus Damm <damm@opensource.se>
Cc: horms@verge.net.au
Cc: arnd@arndb.de
Cc: johnstul@us.ibm.com
Cc: rjw@sisk.pl
Cc: lethal@linux-sh.org
Cc: gregkh@linuxfoundation.org
Cc: olof@lixom.net
Link: http://lkml.kernel.org/r/20120525070344.23443.69756.sendpatchset@w520
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-25 11:32:06 +02:00
Magnus Damm
e5400321a6 clockevents: Make clockevents_config() a global symbol
Make clockevents_config() into a global symbol to allow it to be used
by compiled-in clockevent drivers. This is needed by drivers that want
to update the timer frequency after registration time.

Signed-off-by: Magnus Damm <damm@opensource.se>
Tested-by: Simon Horman <horms@verge.net.au>
Cc: arnd@arndb.de
Cc: johnstul@us.ibm.com
Cc: rjw@sisk.pl
Cc: lethal@linux-sh.org
Cc: gregkh@linuxfoundation.org
Cc: olof@lixom.net
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20120509143934.27521.46553.sendpatchset@w520
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-25 01:44:51 +02:00
Mike Galbraith
5307c9556b tick: Add tick skew boot option
Let the user decide whether power consumption or jitter is the
more important consideration for their machines.

Quoting removal commit af5ab277de:

"Historically, Linux has tried to make the regular timer tick on the
 various CPUs not happen at the same time, to avoid contention on
 xtime_lock.
    
 Nowadays, with the tickless kernel, this contention no longer happens
 since time keeping and updating are done differently. In addition,
 this skew is actually hurting power consumption in a measurable way on
 many-core systems."

Problems:

- Contrary to the above, systems do encounter contention on both
  xtime_lock and RCU structure locks when the tick is synchronized.
  
- Moderate sized RT systems suffer intolerable jitter due to the tick
  being synchronized.

- SGI reports the same for their large systems.

- Fully utilized systems reap no power saving benefit from skew removal,
  but do suffer from resulting induced lock contention.

- 0209f649 rcu: limit rcu_node leaf-level fanout
  This patch was born to combat lock contention which testing showed
  to have been _induced by_ skew removal.  Skew the tick, contention
  disappeared virtually completely.

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Link: http://lkml.kernel.org/r/1336472458.21924.78.camel@marge.simpson.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-25 01:44:50 +02:00
Srivatsa S. Bhat
4a70d2d990 smpboot, idle: Fix comment mismatch over idle_threads_init()
The comment over idle_threads_init() really talks about the functionality
of idle_init(). Move that comment to idle_init(), and add a suitable
comment over idle_threads_init().

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: suresh.b.siddha@intel.com
Cc: venki@google.com
Cc: nikunj@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20120524151100.2549.66501.stgit@srivatsabhat.in.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-24 22:58:08 +02:00
Srivatsa S. Bhat
ee74d13229 smpboot, idle: Optimize calls to smp_processor_id() in idle_threads_init()
While trying to initialize idle threads for all cpus, idle_threads_init()
calls smp_processor_id() in a loop, which is unnecessary. The intent
is to initialize idle threads for all non-boot cpus. So just use a variable
to note the boot cpu and use it in the loop.
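
A minimal sketch of the resulting loop (simplified):

  void __init idle_threads_init(void)
  {
          unsigned int cpu, boot_cpu = smp_processor_id();  /* read once */

          for_each_possible_cpu(cpu) {
                  if (cpu != boot_cpu)
                          idle_init(cpu);
          }
  }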

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: suresh.b.siddha@intel.com
Cc: venki@google.com
Cc: nikunj@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20120524151055.2549.64309.stgit@srivatsabhat.in.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-24 22:58:08 +02:00
Jiang Liu
818b0f3bfb genirq: Introduce irq_do_set_affinity() to reduce duplicated code
All invocations of chip->irq_set_affinity() are doing the same return
value checks. Let them all use a common function.

[ tglx: removed the silly likely while at it ]
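
A sketch of the shared helper's shape (simplified; details per the
description above, not the exact hunk):

  int irq_do_set_affinity(struct irq_data *data,
                          const struct cpumask *mask, bool force)
  {
          struct irq_chip *chip = irq_data_get_irq_chip(data);
          int ret;

          ret = chip->irq_set_affinity(data, mask, force);
          switch (ret) {
          case IRQ_SET_MASK_OK:
                  cpumask_copy(data->affinity, mask);
                  /* fall through */
          case IRQ_SET_MASK_OK_NOCOPY:
                  irq_set_thread_affinity(irq_data_to_desc(data));
                  ret = 0;
          }
          return ret;
  }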

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Keping Chen <chenkeping@huawei.com>
Link: http://lkml.kernel.org/r/1333120296-13563-3-git-send-email-jiang.liu@huawei.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-24 22:36:40 +02:00
Ning Jiang
23812b9d9e genirq: Add IRQS_PENDING for nested and simple irq
Every interrupt which is an active wakeup source needs the ability to
abort suspend if there is a pending irq. Right now only edge and level
irqs can do that.

            |
       +---------+
       |   INTC  |
       +---------+
               | GPIO_IRQ
            +------------+
            |  gpio-exp  |
            +------------+
              |        |
         GPIO0_IRQ  GPIO1_IRQ

In the above diagram, the gpio expander has irq number GPIO_IRQ; it is
connected to two sub GPIO pins, GPIO0 and GPIO1.

During suspend, we set IRQF_NO_SUSPEND for GPIO_IRQ so that the gpio
expander driver can handle the sub irqs GPIO0_IRQ and GPIO1_IRQ, and
these two irqs themselves can further be handled by simple or nested
irq handlers in some drivers (typically gpio and mfd drivers). If they
are used as wakeup sources during suspend, we want them to be able to
abort suspend too.

Setting IRQS_PENDING flag in handle_nested_irq() and handle_simple_irq()
when the irq is disabled allows check_wakeup_irqs() to identify such
irqs as source for aborting suspend.
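
A sketch of the kind of change described, for handle_simple_irq()
(simplified; the nested handler gets the analogous treatment):

  if (unlikely(!action || irqd_irq_disabled(&desc->irq_data))) {
          desc->istate |= IRQS_PENDING;   /* let check_wakeup_irqs() see it */
          goto out_unlock;
  }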

Signed-off-by: Ning Jiang <ning.n.jiang@gmail.com>
Cc: rjw@sisk.pl
Link: http://lkml.kernel.org/r/CAH3Oq6T905%2B3fkF43NAMMFvJvq7dsk_so6T2vQ8ZJrA5xiU3YA@mail.gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-24 22:27:45 +02:00
Konrad Rzeszutek Wilk
165c8aed5b frontswap: s/put_page/store/g s/get_page/load
Sounds so much more natural.

Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-15 11:34:08 -04:00
Konrad Rzeszutek Wilk
839a1f79ed MAINTAINER: Add myself for the frontswap API
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-15 11:34:05 -04:00
Dan Magenheimer
27c6aec214 mm: frontswap: config and doc files
This patch 4of4 adds configuration and documentation files including a FAQ.

[v14: updated docs/FAQ to use zcache and RAMster as examples]
[v10: no change]
[v9: akpm@linux-foundation.org: sysfs->debugfs; no longer need Doc/ABI file]
[v8: rebase to 3.0-rc4]
[v7: rebase to 3.0-rc3]
[v6: rebase to 3.0-rc1]
[v5: change config default to n]
[v4: rebase to 2.6.39]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Acked-by: Jan Beulich <JBeulich@novell.com>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-15 11:34:03 -04:00
Dan Magenheimer
29f233cfff mm: frontswap: core frontswap functionality
This patch, 3of4, provides the core frontswap code that interfaces between
the hooks in the swap subsystem and a frontswap backend via frontswap_ops.

---
New file added: mm/frontswap.c

[v14: add support for writethrough, per suggestion by aarcange@redhat.com]
[v11: sjenning@linux.vnet.ibm.com: s/puts/failed_puts/]
[v10: sjenning@linux.vnet.ibm.com: fix debugfs calls on 32-bit]
[v9: akpm@linux-foundation.org: change "flush" to "invalidate", part 1]
[v9: akpm@linux-foundation.org: mark some statics __read_mostly]
[v9: akpm@linux-foundation.org: add clarifying comments]
[v9: akpm@linux-foundation.org: no need to loop repeating try_to_unuse]
[v9: error27@gmail.com: remove superfluous check for NULL]
[v8: rebase to 3.0-rc4]
[v8: kamezawa.hiroyu@jp.fujitsu.com: add comment to clarify find_next_to_unuse]
[v7: rebase to 3.0-rc3]
[v7: JBeulich@novell.com: use new static inlines, no-ops if not config'd]
[v6: rebase to 3.1-rc1]
[v6: lliubbo@gmail.com: use vzalloc]
[v6: lliubbo@gmail.com: fix null pointer deref if vzalloc fails]
[v6: konrad.wilk@oracl.com: various checks and code clarifications/comments]
[v4: rebase to 2.6.39]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Acked-by: Jan Beulich <JBeulich@novell.com>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
[v12: Squashed s/flush/invalidate/ in]
[v15: A bit of cleanup and seperate DEBUGFS]
Signed-off-by: Konrad Wilk <konrad.wilk@oracle.com>
2012-05-15 11:34:00 -04:00
Dan Magenheimer
38b5faf4b1 mm: frontswap: core swap subsystem hooks and headers
This patch, 2of4, contains the changes to the core swap subsystem.
This includes:

(1) makes available core swap data structures (swap_lock, swap_list and
swap_info) that are needed by frontswap.c. We don't need to expose them
to the dozens of files that include swap.h, so we create a new swapfile.h
just to extern-ify these and modify their declarations to non-static.

(2) adds frontswap-related elements to swap_info_struct.  frontswap_map
points to vzalloc'ed one-bit-per-swap-page metadata that indicates
whether the swap page is in frontswap or in the device, and frontswap_pages
counts how many pages are in frontswap.

(3) adds hooks in the swap subsystem and extends try_to_unuse so that
frontswap_shrink can do a "partial swapoff".

Note that a failed frontswap_map allocation is safe... failure is noted
by lack of "FS" in the subsequent printk.

---

[v14: rebase to 3.4-rc2]
[v10: no change]
[v9: akpm@linux-foundation.org: mark some statics __read_mostly]
[v9: akpm@linux-foundation.org: add clarifying comments]
[v9: akpm@linux-foundation.org: no need to loop repeating try_to_unuse]
[v9: error27@gmail.com: remove superfluous check for NULL]
[v8: rebase to 3.0-rc4]
[v8: kamezawa.hiroyu@jp.fujitsu.com: change counter to atomic_t to avoid races]
[v8: kamezawa.hiroyu@jp.fujitsu.com: comment to clarify informational counters]
[v7: rebase to 3.0-rc3]
[v7: JBeulich@novell.com: add new swap struct elements only if config'd]
[v6: rebase to 3.0-rc1]
[v6: lliubbo@gmail.com: fix null pointer deref if vzalloc fails]
[v6: konrad.wilk@oracl.com: various checks and code clarifications/comments]
[v5: no change from v4]
[v4: rebase to 2.6.39]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Reviewed-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Jan Beulich <JBeulich@novell.com>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
[v11: Rebased, fixed mm/swapfile.c context change]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-15 11:33:58 -04:00
Dan Magenheimer
c3ba969815 mm: frontswap: add frontswap header file
Frontswap is the alter ego of cleancache, the "yang" to cleancache's
"yin"... and more precisely frontswap is the provider of anonymous
pages to transcendent memory to nicely complement cleancache's providing
of clean pagecache pages to transcendent memory.  For optimal use
of transcendent memory, both are necessary... because a kernel
under memory pressure first reclaims clean pagecache pages and,
when under more memory pressure, starts swapping anonymous pages.

Frontswap and cleancache (which was merged at 3.0) are the "frontends"
and the only necessary changes to the core kernel for transcendent memory;
all other supporting code -- the "backends" -- is implemented as drivers.
See the LWN.net article "Transcendent memory in a nutshell" for a detailed
overview of frontswap and related kernel parts:
https://lwn.net/Articles/454795/

Frontswap code was first posted publicly in January 2009 and on LKML in
May 2009, and has remained functionally stable for nearly three years now.
It is barely invasive, touching only the swap subsystem and adding less
than 100 lines of code to existing swap subsystem code files.
It has improved syntactically substantially between V1 and this posting
of V14, thanks to the review of a few kernel developers, and has adapted
easily to at least one major swap subsystem change.  As of 3.4, there are
three in-tree users of frontswap patiently waiting for this patchset and
for CONFIG_FRONTSWAP to be enabled: zcache (staging driver merged at
2.6.39), Xen tmem (merged at 3.0 and 3.1) and RAMster (staging driver
merged at 3.4).  In addition, a RFC has been posted for a KVM backend.
The frontswap patchset has been in linux-next since next-110603.  Earlier
versions of frontswap already ship in the Oracle Unbreakable Enterprise Kernel
and SuSE SLES.

This patch, 1of4, provides the header file for the core code for frontswap
that interfaces between the hooks in the swap subsystem and a frontswap
backend via frontswap_ops.
---
New file added: include/linux/frontswap.h

[v14: add support for writethrough, per suggestion by aarcange@redhat.com]
[v14: rebase to 3.4-rc2]
[v11: konrad.wilk@oracle.com: squashed s/flush/invalidate/ in]
[v10: no change]
[v9: akpm@linux-foundation.org: change "flush" to "invalidate", part 1]
[v8: rebase to 3.0-rc4]
[v7: rebase to 3.0-rc3]
[v7: JBeulich@novell.com: new static inlines resolve to no-ops if not config'd]
[v7: JBeulich@novell.com: avoid redundant shifts/divides for *_bit lib calls]
[v6: rebase to 3.1-rc1]
[v5: no change from v4]
[v4: rebase to 2.6.39]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Jan Beulich <JBeulich@novell.com>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
[v15: int/bool on some functions]
Signed-off-by: Konrad Wilk <konrad.wilk@oracle.com>
2012-05-15 11:33:46 -04:00
Miklos Szeredi
203627bbc9 fuse: fix blksize calculation
Don't use inode->i_blkbits, which might be stale; instead calculate the
blksize information from the freshly obtained attributes.
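
A sketch of the idea (field names as used in fuse's attr reply; simplified):

  blkbits = attr->blksize ? ilog2(attr->blksize)
                          : inode->i_sb->s_blocksize_bits;
  stat->blksize = 1 << blkbits;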

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-05-14 17:12:56 +02:00
Pavel Shilovsky
45c72cd73c fuse: fix stat call on 32 bit platforms
Now we store attr->ino in inode->i_ino, return attr->ino the
first time, and then return inode->i_ino if the attribute timeout
hasn't expired. That's wrong on 32-bit platforms because attr->ino
is 64-bit while inode->i_ino is 32-bit in this case.

Fix this by saving the 64-bit ino in the fuse_inode structure and returning
it every time we call getattr. Also squash attr->ino into inode->i_ino
explicitly.
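
A sketch of the described squashing (helper name illustrative; the point
is to keep the full 64-bit ino in the fuse_inode while i_ino gets a value
folded into 32 bits):

  static u64 fuse_squash_ino(u64 ino64)
  {
          u64 ino = ino64 & 0xffffffff;

          if (!ino)
                  ino = ino64 >> 32;
          return ino;
  }

  fi->orig_ino = attr->ino;                  /* full 64-bit ino, for getattr */
  inode->i_ino = fuse_squash_ino(attr->ino); /* what a 32-bit i_ino can hold */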

Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-05-14 17:06:42 +02:00
Marco Aurelio da Costa
d8e725f356 ACPI: Ignore invalid _PSS entries, but use valid ones
The EliteBook 8560W has uninitialized entries in its _PSS ACPI
table. Instead of bailing out when the first uninitialized entry is
found, ignore it and use only the valid entries. Only bail out if there
is no valid entry at all.

[v3: Fixes suggested by Konrad]

Signed-off-by: Marco Aurelio da Costa <costa@gamic.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-05-08 01:56:37 -04:00
Andy Whitcroft
c597145696 ACPI battery: only refresh the sysfs files when pertinent information changes
We only need to regenerate the sysfs files when the capacity units
change; avoid the update otherwise.

The origin of this issue dates way back to 2.6.38:
da8aeb92d4
(ACPI / Battery: Update information on info notification and resume)

cc: <stable@vger.kernel.org>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Tested-by: Ralf Jung <post@ralfj.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-05-08 01:49:57 -04:00
Miklos Szeredi
519c6040ce fuse: optimize fallocate on permanent failure
If the userspace filesystem doesn't support fallocate, remember this and
don't send the request next time.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-04-26 10:56:36 +02:00
Anatol Pomozov
05ba1f0823 fuse: add FALLOCATE operation
The fallocate filesystem operation preallocates media space for the given
file. If fallocate returns success, then any subsequent write to the given
range never fails with a 'not enough space' error.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-04-25 12:25:05 +02:00
Peter Huewe
e2690695ce fuse: Convert to kstrtoul_from_user
This patch replaces the code for getting a number from a
userspace buffer with a simple call to kstrtoul_from_user.
This makes it easier to read and less error-prone.
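
A minimal sketch of the pattern (kstrtoul_from_user copies from the user
buffer and parses it in one step):

  unsigned long val;
  int err = kstrtoul_from_user(buf, count, 0, &val);

  if (err)
          return err;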

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-04-25 12:25:05 +02:00
197 changed files with 3836 additions and 1466 deletions

View File

@@ -0,0 +1,93 @@
Pinctrl-based I2C Bus Mux
This binding describes an I2C bus multiplexer that uses pin multiplexing to
route the I2C signals, and represents the pin multiplexing configuration
using the pinctrl device tree bindings.
                                 +-----+  +-----+
                                 | dev |  | dev |
    +------------------------+   +-----+  +-----+
    | SoC                    |      |        |
    |                   /----|------+--------+
    |   +---+   +------+     | child bus A, on first set of pins
    |   |I2C|---|Pinmux|     |
    |   +---+   +------+     | child bus B, on second set of pins
    |                   \----|------+--------+--------+
    |                        |      |        |        |
    +------------------------+   +-----+  +-----+  +-----+
                                 | dev |  | dev |  | dev |
                                 +-----+  +-----+  +-----+
Required properties:
- compatible: i2c-mux-pinctrl
- i2c-parent: The phandle of the I2C bus that this multiplexer's master-side
port is connected to.
Also required are:
* Standard pinctrl properties that specify the pin mux state for each child
bus. See ../pinctrl/pinctrl-bindings.txt.
* Standard I2C mux properties. See mux.txt in this directory.
* I2C child bus nodes. See mux.txt in this directory.
For each named state defined in the pinctrl-names property, an I2C child bus
will be created. I2C child bus numbers are assigned based on the index into
the pinctrl-names property.
The only exception is that no bus will be created for a state named "idle". If
such a state is defined, it must be the last entry in pinctrl-names. For
example:
pinctrl-names = "ddc", "pta", "idle" -> ddc = bus 0, pta = bus 1
pinctrl-names = "ddc", "idle", "pta" -> Invalid ("idle" not last)
pinctrl-names = "idle", "ddc", "pta" -> Invalid ("idle" not last)
Whenever an access is made to a device on a child bus, the relevant pinctrl
state will be programmed into hardware.
If an idle state is defined, whenever an access is not being made to a device
on a child bus, the idle pinctrl state will be programmed into hardware.
If an idle state is not defined, the most recently used pinctrl state will be
left programmed into hardware whenever no access is being made of a device on
a child bus.
Example:
i2cmux {
    compatible = "i2c-mux-pinctrl";
    #address-cells = <1>;
    #size-cells = <0>;

    i2c-parent = <&i2c1>;

    pinctrl-names = "ddc", "pta", "idle";
    pinctrl-0 = <&state_i2cmux_ddc>;
    pinctrl-1 = <&state_i2cmux_pta>;
    pinctrl-2 = <&state_i2cmux_idle>;

    i2c@0 {
        reg = <0>;
        #address-cells = <1>;
        #size-cells = <0>;

        eeprom {
            compatible = "eeprom";
            reg = <0x50>;
        };
    };

    i2c@1 {
        reg = <1>;
        #address-cells = <1>;
        #size-cells = <0>;

        eeprom {
            compatible = "eeprom";
            reg = <0x50>;
        };
    };
};

View File

@@ -2543,6 +2543,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
sched_debug [KNL] Enables verbose scheduler debug messages.
skew_tick= [KNL] Offset the periodic timer tick per cpu to mitigate
xtime_lock contention on larger systems, and/or RCU lock
contention on all systems with CONFIG_MAXSMP set.
Format: { "0" | "1" }
0 -- disable. (may be 1 via CONFIG_CMDLINE="skew_tick=1")
1 -- enable.
Note: increases power consumption, thus should only be
enabled if running jitter sensitive (HPC/RT) workloads.
security= [SECURITY] Choose a security module to enable at boot.
If this boot parameter is not specified, only the first
security module asking for security registration will be

View File

@@ -0,0 +1,278 @@
Frontswap provides a "transcendent memory" interface for swap pages.
In some environments, dramatic performance savings may be obtained because
swapped pages are saved in RAM (or a RAM-like device) instead of a swap disk.
(Note, frontswap -- and cleancache (merged at 3.0) -- are the "frontends"
and the only necessary changes to the core kernel for transcendent memory;
all other supporting code -- the "backends" -- is implemented as drivers.
See the LWN.net article "Transcendent memory in a nutshell" for a detailed
overview of frontswap and related kernel parts:
https://lwn.net/Articles/454795/ )
Frontswap is so named because it can be thought of as the opposite of
a "backing" store for a swap device. The storage is assumed to be
a synchronous concurrency-safe page-oriented "pseudo-RAM device" conforming
to the requirements of transcendent memory (such as Xen's "tmem", or
in-kernel compressed memory, aka "zcache", or future RAM-like devices);
this pseudo-RAM device is not directly accessible or addressable by the
kernel and is of unknown and possibly time-varying size. The driver
links itself to frontswap by calling frontswap_register_ops to set the
frontswap_ops funcs appropriately and the functions it provides must
conform to certain policies as follows:
An "init" prepares the device to receive frontswap pages associated
with the specified swap device number (aka "type"). A "store" will
copy the page to transcendent memory and associate it with the type and
offset associated with the page. A "load" will copy the page, if found,
from transcendent memory into kernel memory, but will NOT remove the page
from from transcendent memory. An "invalidate_page" will remove the page
from transcendent memory and an "invalidate_area" will remove ALL pages
associated with the swap type (e.g., like swapoff) and notify the "device"
to refuse further stores with that swap type.
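
A sketch of what such a backend registration looks like (the mybackend_*
names are placeholders for a backend's own implementations; signatures as
suggested by the description above, the exact prototypes live in
include/linux/frontswap.h):

  static struct frontswap_ops mybackend_ops = {
          .init            = mybackend_init,            /* void (unsigned type) */
          .store           = mybackend_store,           /* int  (unsigned, pgoff_t, struct page *) */
          .load            = mybackend_load,            /* int  (unsigned, pgoff_t, struct page *) */
          .invalidate_page = mybackend_invalidate_page, /* void (unsigned, pgoff_t) */
          .invalidate_area = mybackend_invalidate_area, /* void (unsigned) */
  };

  frontswap_register_ops(&mybackend_ops);
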
Once a page is successfully stored, a matching load on the page will normally
succeed. So when the kernel finds itself in a situation where it needs
to swap out a page, it first attempts to use frontswap. If the store returns
success, the data has been successfully saved to transcendent memory and
a disk write and, if the data is later read back, a disk read are avoided.
If a store returns failure, transcendent memory has rejected the data, and the
page can be written to swap as usual.
If a backend chooses, frontswap can be configured as a "writethrough
cache" by calling frontswap_writethrough(). In this mode, the reduction
in swap device writes is lost (and also a non-trivial performance advantage)
in order to allow the backend to arbitrarily "reclaim" space used to
store frontswap pages to more completely manage its memory usage.
Note that if a page is stored and the page already exists in transcendent memory
(a "duplicate" store), either the store succeeds and the data is overwritten,
or the store fails AND the page is invalidated. This ensures stale data may
never be obtained from frontswap.
If properly configured, monitoring of frontswap is done via debugfs in
the /sys/kernel/debug/frontswap directory. The effectiveness of
frontswap can be measured (across all swap devices) with:
failed_stores - how many store attempts have failed
loads - how many loads were attempted (all should succeed)
succ_stores - how many store attempts have succeeded
invalidates - how many invalidates were attempted
A backend implementation may provide additional metrics.
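A rough effectiveness figure can be derived from these counters, for example
the store hit rate: succ_stores / (succ_stores + failed_stores). A value near
1.0 means the backend is absorbing nearly all swap-out traffic.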
FAQ
1) Where's the value?
When a workload starts swapping, performance falls through the floor.
Frontswap significantly increases performance in many such workloads by
providing a clean, dynamic interface to read and write swap pages to
"transcendent memory" that is otherwise not directly addressable to the kernel.
This interface is ideal when data is transformed to a different form
and size (such as with compression) or secretly moved (as might be
useful for write-balancing for some RAM-like devices). Swap pages (and
evicted page-cache pages) are a great use for this kind of slower-than-RAM-
but-much-faster-than-disk "pseudo-RAM device" and the frontswap (and
cleancache) interface to transcendent memory provides a nice way to read
and write -- and indirectly "name" -- the pages.
Frontswap -- and cleancache -- with a fairly small impact on the kernel,
provides a huge amount of flexibility for more dynamic, flexible RAM
utilization in various system configurations:
In the single kernel case, aka "zcache", pages are compressed and
stored in local memory, thus increasing the total anonymous pages
that can be safely kept in RAM. Zcache essentially trades off CPU
cycles used in compression/decompression for better memory utilization.
Benchmarks have shown little or no impact when memory pressure is
low while providing a significant performance improvement (25%+)
on some workloads under high memory pressure.
"RAMster" builds on zcache by adding "peer-to-peer" transcendent memory
support for clustered systems. Frontswap pages are locally compressed
as in zcache, but then "remotified" to another system's RAM. This
allows RAM to be dynamically load-balanced back-and-forth as needed,
i.e. when system A is overcommitted, it can swap to system B, and
vice versa. RAMster can also be configured as a memory server so
many servers in a cluster can swap, dynamically as needed, to a single
server configured with a large amount of RAM... without pre-configuring
how much of the RAM is available for each of the clients!
In the virtual case, the whole point of virtualization is to statistically
multiplex physical resources across the varying demands of multiple
virtual machines. This is really hard to do with RAM and efforts to do
it well with no kernel changes have essentially failed (except in some
well-publicized special-case workloads).
Specifically, the Xen Transcendent Memory backend allows otherwise
"fallow" hypervisor-owned RAM to not only be "time-shared" between multiple
virtual machines, but the pages can be compressed and deduplicated to
optimize RAM utilization. And when guest OS's are induced to surrender
underutilized RAM (e.g. with "selfballooning"), sudden unexpected
memory pressure may result in swapping; frontswap allows those pages
to be swapped to and from hypervisor RAM (if overall host system memory
conditions allow), thus mitigating the potentially awful performance impact
of unplanned swapping.
A KVM implementation is underway and has been RFC'ed to lkml. And,
using frontswap, investigation is also underway on the use of NVM as
a memory extension technology.
2) Sure there may be performance advantages in some situations, but
what's the space/time overhead of frontswap?
If CONFIG_FRONTSWAP is disabled, every frontswap hook compiles into
nothingness and the only overhead is a few extra bytes per swapon'ed
swap device. If CONFIG_FRONTSWAP is enabled but no frontswap "backend"
registers, the only added cost is a single comparison of a global variable
against zero for every swap page read or written. If CONFIG_FRONTSWAP is enabled
AND a frontswap backend registers AND the backend fails every "store"
request (i.e. provides no memory despite claiming it might),
CPU overhead is still negligible -- and since every frontswap fail
precedes a swap page write-to-disk, the system is highly likely
to be I/O bound and using a small fraction of a percent of a CPU
will be irrelevant anyway.
As for space, if CONFIG_FRONTSWAP is enabled AND a frontswap backend
registers, one bit is allocated for every swap page for every swap
device that is swapon'd. This is added to the EIGHT bits (which
was sixteen until about 2.6.34) that the kernel already allocates
for every swap page for every swap device that is swapon'd. (Hugh
Dickins has observed that frontswap could probably steal one of
the existing eight bits, but let's worry about that minor optimization
later.) For very large swap disks (which are rare) on a standard
4K pagesize, this is 1MB per 32GB swap.
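To check the arithmetic: 32 GB of swap at a 4 KB page size is 8,388,608 pages;
one bit per page is 8,388,608 bits, i.e. 1,048,576 bytes, or exactly 1 MB.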
When swap pages are stored in transcendent memory instead of written
out to disk, there is a side effect that this may create more memory
pressure that can potentially outweigh the other advantages. A
backend, such as zcache, must implement policies to carefully (but
dynamically) manage memory limits to ensure this doesn't happen.
3) OK, how about a quick overview of what this frontswap patch does
in terms that a kernel hacker can grok?
Let's assume that a frontswap "backend" has registered during
kernel initialization; this registration indicates that this
frontswap backend has access to some "memory" that is not directly
accessible by the kernel. Exactly how much memory it provides is
entirely dynamic and random.
Whenever a swap-device is swapon'd frontswap_init() is called,
passing the swap device number (aka "type") as a parameter.
This notifies frontswap to expect attempts to "store" swap pages
associated with that number.
Whenever the swap subsystem is readying a page to write to a swap
device (cf. swap_writepage()), frontswap_store is called. Frontswap
consults with the frontswap backend and if the backend says it does NOT
have room, frontswap_store returns -1 and the kernel swaps the page
to the swap device as normal. Note that the response from the frontswap
backend is unpredictable to the kernel; it may choose to never accept a
page, it could accept every ninth page, or it might accept every
page. But if the backend does accept a page, the data from the page
has already been copied and associated with the type and offset,
and the backend guarantees the persistence of the data. In this case,
frontswap sets a bit in the "frontswap_map" for the swap device
corresponding to the page offset on the swap device to which it would
otherwise have written the data.
When the swap subsystem needs to swap-in a page (swap_readpage()),
it first calls frontswap_load() which checks the frontswap_map to
see if the page was earlier accepted by the frontswap backend. If
it was, the page of data is filled from the frontswap backend and
the swap-in is complete. If not, the normal swap-in code is
executed to obtain the page of data from the real swap device.
So every time the frontswap backend accepts a page, a swap device read
and (potentially) a swap device write are replaced by a "frontswap backend
store" and (possibly) a "frontswap backend loads", which are presumably much
faster.
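A hedged pseudocode sketch of these two hook points follows; it is a
simplification, not the actual swap code, and the helper names
__swap_writepage_to_device() and swap_readpage_from_device() are purely
illustrative placeholders:

/* illustrative only -- not the real swap_writepage()/swap_readpage() */
int swap_writepage(struct page *page, struct writeback_control *wbc)
{
	if (frontswap_store(page) == 0) {
		/* backend accepted the page: the frontswap_map bit is set
		 * and no block I/O is issued for this page */
		end_page_writeback(page);
		return 0;
	}
	/* backend rejected the page: fall back to the real swap device */
	return __swap_writepage_to_device(page, wbc);	/* hypothetical helper */
}

int swap_readpage(struct page *page)
{
	if (frontswap_load(page) == 0)
		return 0;	/* page filled from transcendent memory */
	return swap_readpage_from_device(page);		/* hypothetical helper */
}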
4) Can't frontswap be configured as a "special" swap device that is
just higher priority than any real swap device (e.g. like zswap,
or maybe swap-over-nbd/NFS)?
No. First, the existing swap subsystem doesn't allow for any kind of
swap hierarchy. Perhaps it could be rewritten to accommodate a hierarchy,
but this would require fairly drastic changes. Even if it were
rewritten, the existing swap subsystem uses the block I/O layer which
assumes a swap device is fixed size and any page in it is linearly
addressable. Frontswap barely touches the existing swap subsystem,
and works around the constraints of the block I/O subsystem to provide
a great deal of flexibility and dynamicity.
For example, the acceptance of any swap page by the frontswap backend is
entirely unpredictable. This is critical to the definition of frontswap
backends because it grants completely dynamic discretion to the
backend. In zcache, one cannot know a priori how compressible a page is.
"Poorly" compressible pages can be rejected, and "poorly" can itself be
defined dynamically depending on current memory constraints.
Further, frontswap is entirely synchronous whereas a real swap
device is, by definition, asynchronous and uses block I/O. The
block I/O layer is not only unnecessary, but may perform "optimizations"
that are inappropriate for a RAM-oriented device including delaying
the write of some pages for a significant amount of time. Synchrony is
required to ensure the dynamicity of the backend and to avoid thorny race
conditions that would unnecessarily and greatly complicate frontswap
and/or the block I/O subsystem. That said, only the initial "store"
and "load" operations need be synchronous. A separate asynchronous thread
is free to manipulate the pages stored by frontswap. For example,
the "remotification" thread in RAMster uses standard asynchronous
kernel sockets to move compressed frontswap pages to a remote machine.
Similarly, a KVM guest-side implementation could do in-guest compression
and use "batched" hypercalls.
In a virtualized environment, the dynamicity allows the hypervisor
(or host OS) to do "intelligent overcommit". For example, it can
choose to accept pages only until host-swapping might be imminent,
then force guests to do their own swapping.
There is a downside to the transcendent memory specifications for
frontswap: Since any "store" might fail, there must always be a real
slot on a real swap device to swap the page. Thus frontswap must be
implemented as a "shadow" to every swapon'd device with the potential
capability of holding every page that the swap device might have held
and the possibility that it might hold no pages at all. This means
that frontswap cannot contain more pages than the total of swapon'd
swap devices. For example, if NO swap device is configured on some
installation, frontswap is useless. Swapless portable devices
can still use frontswap but a backend for such devices must configure
some kind of "ghost" swap device and ensure that it is never used.
5) Why this weird definition about "duplicate stores"? If a page
has been previously successfully stored, can't it always be
successfully overwritten?
Nearly always it can, but no, sometimes it cannot. Consider an example
where data is compressed and the original 4K page has been compressed
to 1K. Now an attempt is made to overwrite the page with data that
is non-compressible and so would take the entire 4K. But the backend
has no more space. In this case, the store must be rejected. Whenever
frontswap rejects a store that would overwrite, it also must invalidate
the old data and ensure that it is no longer accessible. Since the
swap subsystem then writes the new data to the real swap device,
this is the correct course of action to ensure coherency.
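Inside a backend's store callback, the rule above boils down to something like
the sketch below; every helper name is hypothetical:

/* hypothetical duplicate-store handling inside a backend's store() */
if (already_stored(type, offset) && !fits(type, compressed_len)) {
	backend_invalidate(type, offset); /* never leave stale data reachable */
	return -1;                        /* kernel writes the page to real swap */
}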
6) What is frontswap_shrink for?
When the (non-frontswap) swap subsystem swaps out a page to a real
swap device, that page is only taking up low-value pre-allocated disk
space. But if frontswap has placed a page in transcendent memory, that
page may be taking up valuable real estate. The frontswap_shrink
routine allows code outside of the swap subsystem to force pages out
of the memory managed by frontswap and back into kernel-addressable memory.
For example, in RAMster, a "suction driver" thread will attempt
to "repatriate" pages sent to a remote machine back to the local machine;
this is driven using the frontswap_shrink mechanism when memory pressure
subsides.
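For example, such code might react to easing memory pressure with a call like
the one below; the prototype and argument meaning are assumptions, as this file
only names the routine:

/* hypothetical: push pages back out until only nr_target_pages remain */
frontswap_shrink(nr_target_pages);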
7) Why does the frontswap patch create the new include file swapfile.h?
The frontswap code depends on some swap-subsystem-internal data
structures that have, over the years, moved back and forth between
static and global. This seemed a reasonable compromise: Define
them as global but declare them in a new include file that isn't
included by the large number of source files that include swap.h.
Dan Magenheimer, last updated April 9, 2012

View File

@@ -1077,7 +1077,7 @@ F: drivers/media/video/s5p-fimc/
ARM/SAMSUNG S5P SERIES Multi Format Codec (MFC) SUPPORT
M: Kyungmin Park <kyungmin.park@samsung.com>
M: Kamil Debski <k.debski@samsung.com>
M: Jeongtae Park <jtp.park@samsung.com>
M: Jeongtae Park <jtp.park@samsung.com>
L: linux-arm-kernel@lists.infradead.org
L: linux-media@vger.kernel.org
S: Maintained
@@ -1743,10 +1743,10 @@ F: include/linux/can/platform/
CAPABILITIES
M: Serge Hallyn <serge.hallyn@canonical.com>
L: linux-security-module@vger.kernel.org
S: Supported
S: Supported
F: include/linux/capability.h
F: security/capability.c
F: security/commoncap.c
F: security/commoncap.c
F: kernel/capability.c
CELL BROADBAND ENGINE ARCHITECTURE
@@ -2146,11 +2146,11 @@ S: Orphan
F: drivers/net/wan/pc300*
CYTTSP TOUCHSCREEN DRIVER
M: Javier Martinez Canillas <javier@dowhile0.org>
L: linux-input@vger.kernel.org
S: Maintained
F: drivers/input/touchscreen/cyttsp*
F: include/linux/input/cyttsp.h
M: Javier Martinez Canillas <javier@dowhile0.org>
L: linux-input@vger.kernel.org
S: Maintained
F: drivers/input/touchscreen/cyttsp*
F: include/linux/input/cyttsp.h
DAMA SLAVE for AX.25
M: Joerg Reuter <jreuter@yaina.de>
@@ -2270,7 +2270,7 @@ F: include/linux/device-mapper.h
F: include/linux/dm-*.h
DIOLAN U2C-12 I2C DRIVER
M: Guenter Roeck <guenter.roeck@ericsson.com>
M: Guenter Roeck <linux@roeck-us.net>
L: linux-i2c@vger.kernel.org
S: Maintained
F: drivers/i2c/busses/i2c-diolan-u2c.c
@@ -2930,6 +2930,13 @@ F: Documentation/power/freezing-of-tasks.txt
F: include/linux/freezer.h
F: kernel/freezer.c
FRONTSWAP API
M: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
L: linux-kernel@vger.kernel.org
S: Maintained
F: mm/frontswap.c
F: include/linux/frontswap.h
FS-CACHE: LOCAL CACHING FOR NETWORK FILESYSTEMS
M: David Howells <dhowells@redhat.com>
L: linux-cachefs@redhat.com
@@ -3138,7 +3145,7 @@ F: drivers/tty/hvc/
HARDWARE MONITORING
M: Jean Delvare <khali@linux-fr.org>
M: Guenter Roeck <guenter.roeck@ericsson.com>
M: Guenter Roeck <linux@roeck-us.net>
L: lm-sensors@lm-sensors.org
W: http://www.lm-sensors.org/
T: quilt kernel.org/pub/linux/kernel/people/jdelvare/linux-2.6/jdelvare-hwmon/
@@ -4096,6 +4103,8 @@ F: drivers/scsi/53c700*
LED SUBSYSTEM
M: Bryan Wu <bryan.wu@canonical.com>
M: Richard Purdie <rpurdie@rpsys.net>
L: linux-leds@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds.git
S: Maintained
F: drivers/leds/
F: include/linux/leds.h
@@ -4411,6 +4420,13 @@ S: Orphan
F: drivers/video/matrox/matroxfb_*
F: include/linux/matroxfb.h
MAX16065 HARDWARE MONITOR DRIVER
M: Guenter Roeck <linux@roeck-us.net>
L: lm-sensors@lm-sensors.org
S: Maintained
F: Documentation/hwmon/max16065
F: drivers/hwmon/max16065.c
MAX6650 HARDWARE MONITOR AND FAN CONTROLLER DRIVER
M: "Hans J. Koch" <hjk@hansjkoch.de>
L: lm-sensors@lm-sensors.org
@@ -5149,7 +5165,7 @@ F: drivers/leds/leds-pca9532.c
F: include/linux/leds-pca9532.h
PCA9541 I2C BUS MASTER SELECTOR DRIVER
M: Guenter Roeck <guenter.roeck@ericsson.com>
M: Guenter Roeck <linux@roeck-us.net>
L: linux-i2c@vger.kernel.org
S: Maintained
F: drivers/i2c/muxes/i2c-mux-pca9541.c
@@ -5169,7 +5185,7 @@ S: Maintained
F: drivers/firmware/pcdp.*
PCI ERROR RECOVERY
M: Linas Vepstas <linasvepstas@gmail.com>
M: Linas Vepstas <linasvepstas@gmail.com>
L: linux-pci@vger.kernel.org
S: Supported
F: Documentation/PCI/pci-error-recovery.txt
@@ -5299,7 +5315,7 @@ F: drivers/video/fb-puv3.c
F: drivers/rtc/rtc-puv3.c
PMBUS HARDWARE MONITORING DRIVERS
M: Guenter Roeck <guenter.roeck@ericsson.com>
M: Guenter Roeck <linux@roeck-us.net>
L: lm-sensors@lm-sensors.org
W: http://www.lm-sensors.org/
W: http://www.roeck-us.net/linux/drivers/
@@ -7291,11 +7307,11 @@ F: Documentation/DocBook/uio-howto.tmpl
F: drivers/uio/
F: include/linux/uio*.h
UTIL-LINUX-NG PACKAGE
UTIL-LINUX PACKAGE
M: Karel Zak <kzak@redhat.com>
L: util-linux-ng@vger.kernel.org
W: http://kernel.org/~kzak/util-linux-ng/
T: git git://git.kernel.org/pub/scm/utils/util-linux-ng/util-linux-ng.git
L: util-linux@vger.kernel.org
W: http://en.wikipedia.org/wiki/Util-linux
T: git git://git.kernel.org/pub/scm/utils/util-linux/util-linux.git
S: Maintained
UVESAFB DRIVER

View File

@@ -1,7 +1,7 @@
VERSION = 3
PATCHLEVEL = 5
SUBLEVEL = 0
EXTRAVERSION = -rc1
EXTRAVERSION = -rc2
NAME = Saber-toothed Squirrel
# *DOCUMENTATION*

View File

@@ -7,7 +7,6 @@ config ARM
select HAVE_IDE if PCI || ISA || PCMCIA
select HAVE_DMA_ATTRS
select HAVE_DMA_CONTIGUOUS if (CPU_V6 || CPU_V6K || CPU_V7)
select CMA if (CPU_V6 || CPU_V6K || CPU_V7)
select HAVE_MEMBLOCK
select RTC_LIB
select SYS_SUPPORTS_APM_EMULATION

View File

@@ -186,6 +186,12 @@ config SH_TIMER_TMU
help
This enables build of the TMU timer driver.
config EM_TIMER_STI
bool "STI timer driver"
default y
help
This enables build of the STI timer driver.
endmenu
config SH_CLK_CPG

View File

@@ -268,10 +268,8 @@ static int __init consistent_init(void)
unsigned long base = consistent_base;
unsigned long num_ptes = (CONSISTENT_END - base) >> PMD_SHIFT;
#ifndef CONFIG_ARM_DMA_USE_IOMMU
if (cpu_architecture() >= CPU_ARCH_ARMv6)
if (IS_ENABLED(CONFIG_CMA) && !IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU))
return 0;
#endif
consistent_pte = kmalloc(num_ptes * sizeof(pte_t), GFP_KERNEL);
if (!consistent_pte) {
@@ -342,7 +340,7 @@ static int __init coherent_init(void)
struct page *page;
void *ptr;
if (cpu_architecture() < CPU_ARCH_ARMv6)
if (!IS_ENABLED(CONFIG_CMA))
return 0;
ptr = __alloc_from_contiguous(NULL, size, prot, &page);
@@ -704,7 +702,7 @@ static void *__dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
if (arch_is_coherent() || nommu())
addr = __alloc_simple_buffer(dev, size, gfp, &page);
else if (cpu_architecture() < CPU_ARCH_ARMv6)
else if (!IS_ENABLED(CONFIG_CMA))
addr = __alloc_remap_buffer(dev, size, gfp, prot, &page, caller);
else if (gfp & GFP_ATOMIC)
addr = __alloc_from_pool(dev, size, &page, caller);
@@ -773,7 +771,7 @@ void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
if (arch_is_coherent() || nommu()) {
__dma_free_buffer(page, size);
} else if (cpu_architecture() < CPU_ARCH_ARMv6) {
} else if (!IS_ENABLED(CONFIG_CMA)) {
__dma_free_remap(cpu_addr, size);
__dma_free_buffer(page, size);
} else {

View File

@@ -300,7 +300,7 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, struct thread_info *ti)
if ((sysreg_read(SR) & MODE_MASK) == MODE_SUPERVISOR)
syscall = 1;
if (ti->flags & _TIF_SIGPENDING))
if (ti->flags & _TIF_SIGPENDING)
do_signal(regs, syscall);
if (ti->flags & _TIF_NOTIFY_RESUME) {

View File

@@ -173,7 +173,7 @@ asmlinkage int bfin_clone(struct pt_regs *regs)
unsigned long newsp;
#ifdef __ARCH_SYNC_CORE_DCACHE
if (current->rt.nr_cpus_allowed == num_possible_cpus())
if (current->nr_cpus_allowed == num_possible_cpus())
set_cpus_allowed_ptr(current, cpumask_of(smp_processor_id()));
#endif

View File

@@ -21,6 +21,7 @@ KBUILD_DEFCONFIG := default_defconfig
NM = sh $(srctree)/arch/parisc/nm
CHECKFLAGS += -D__hppa__=1
LIBGCC = $(shell $(CC) $(KBUILD_CFLAGS) -print-libgcc-file-name)
MACHINE := $(shell uname -m)
ifeq ($(MACHINE),parisc*)
@@ -79,7 +80,7 @@ kernel-y := mm/ kernel/ math-emu/
kernel-$(CONFIG_HPUX) += hpux/
core-y += $(addprefix arch/parisc/, $(kernel-y))
libs-y += arch/parisc/lib/ `$(CC) -print-libgcc-file-name`
libs-y += arch/parisc/lib/ $(LIBGCC)
drivers-$(CONFIG_OPROFILE) += arch/parisc/oprofile/

View File

@@ -1,3 +1,4 @@
include include/asm-generic/Kbuild.asm
header-y += pdc.h
generic-y += word-at-a-time.h

View File

@@ -1,6 +1,8 @@
#ifndef _PARISC_BUG_H
#define _PARISC_BUG_H
#include <linux/kernel.h> /* for BUGFLAG_TAINT */
/*
* Tell the user there is some problem.
* The offending file and line are encoded in the __bug_table section.

View File

@@ -176,8 +176,8 @@ int module_frob_arch_sections(Elf32_Ehdr *hdr,
static inline int entry_matches(struct ppc_plt_entry *entry, Elf32_Addr val)
{
if (entry->jump[0] == 0x3d600000 + ((val + 0x8000) >> 16)
&& entry->jump[1] == 0x396b0000 + (val & 0xffff))
if (entry->jump[0] == 0x3d800000 + ((val + 0x8000) >> 16)
&& entry->jump[1] == 0x398c0000 + (val & 0xffff))
return 1;
return 0;
}
@@ -204,10 +204,9 @@ static uint32_t do_plt_call(void *location,
entry++;
}
/* Stolen from Paul Mackerras as well... */
entry->jump[0] = 0x3d600000+((val+0x8000)>>16); /* lis r11,sym@ha */
entry->jump[1] = 0x396b0000 + (val&0xffff); /* addi r11,r11,sym@l*/
entry->jump[2] = 0x7d6903a6; /* mtctr r11 */
entry->jump[0] = 0x3d800000+((val+0x8000)>>16); /* lis r12,sym@ha */
entry->jump[1] = 0x398c0000 + (val&0xffff); /* addi r12,r12,sym@l*/
entry->jump[2] = 0x7d8903a6; /* mtctr r12 */
entry->jump[3] = 0x4e800420; /* bctr */
DEBUGP("Initialized plt for 0x%x at %p\n", val, entry);

View File

@@ -475,6 +475,7 @@ void timer_interrupt(struct pt_regs * regs)
struct pt_regs *old_regs;
u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
struct clock_event_device *evt = &__get_cpu_var(decrementers);
u64 now;
/* Ensure a positive value is written to the decrementer, or else
* some CPUs will continue to take decrementer exceptions.
@@ -509,9 +510,16 @@ void timer_interrupt(struct pt_regs * regs)
irq_work_run();
}
*next_tb = ~(u64)0;
if (evt->event_handler)
evt->event_handler(evt);
now = get_tb_or_rtc();
if (now >= *next_tb) {
*next_tb = ~(u64)0;
if (evt->event_handler)
evt->event_handler(evt);
} else {
now = *next_tb - now;
if (now <= DECREMENTER_MAX)
set_dec((int)now);
}
#ifdef CONFIG_PPC64
/* collect purr register values often, for accurate calculations */

View File

@@ -91,11 +91,6 @@ extern void smp_nap(void);
/* Enable interrupts racelessly and nap forever: helper for cpu_idle(). */
extern void _cpu_idle(void);
/* Switch boot idle thread to a freshly-allocated stack and free old stack. */
extern void cpu_idle_on_new_stack(struct thread_info *old_ti,
unsigned long new_sp,
unsigned long new_ss10);
#else /* __ASSEMBLY__ */
/*

View File

@@ -68,20 +68,6 @@ STD_ENTRY(KBacktraceIterator_init_current)
jrp lr /* keep backtracer happy */
STD_ENDPROC(KBacktraceIterator_init_current)
/*
* Reset our stack to r1/r2 (sp and ksp0+cpu respectively), then
* free the old stack (passed in r0) and re-invoke cpu_idle().
* We update sp and ksp0 simultaneously to avoid backtracer warnings.
*/
STD_ENTRY(cpu_idle_on_new_stack)
{
move sp, r1
mtspr SPR_SYSTEM_SAVE_K_0, r2
}
jal free_thread_info
j cpu_idle
STD_ENDPROC(cpu_idle_on_new_stack)
/* Loop forever on a nap during SMP boot. */
STD_ENTRY(smp_nap)
nap

View File

@@ -29,6 +29,7 @@
#include <linux/smp.h>
#include <linux/timex.h>
#include <linux/hugetlb.h>
#include <linux/start_kernel.h>
#include <asm/setup.h>
#include <asm/sections.h>
#include <asm/cacheflush.h>

View File

@@ -94,10 +94,10 @@ bs_die:
.section ".bsdata", "a"
bugger_off_msg:
.ascii "Direct booting from floppy is no longer supported.\r\n"
.ascii "Please use a boot loader program instead.\r\n"
.ascii "Direct floppy boot is not supported. "
.ascii "Use a boot loader program instead.\r\n"
.ascii "\n"
.ascii "Remove disk and press any key to reboot . . .\r\n"
.ascii "Remove disk and press any key to reboot ...\r\n"
.byte 0
#ifdef CONFIG_EFI_STUB
@@ -111,7 +111,7 @@ coff_header:
#else
.word 0x8664 # x86-64
#endif
.word 2 # nr_sections
.word 3 # nr_sections
.long 0 # TimeDateStamp
.long 0 # PointerToSymbolTable
.long 1 # NumberOfSymbols
@@ -158,8 +158,8 @@ extra_header_fields:
#else
.quad 0 # ImageBase
#endif
.long 0x1000 # SectionAlignment
.long 0x200 # FileAlignment
.long 0x20 # SectionAlignment
.long 0x20 # FileAlignment
.word 0 # MajorOperatingSystemVersion
.word 0 # MinorOperatingSystemVersion
.word 0 # MajorImageVersion
@@ -200,8 +200,10 @@ extra_header_fields:
# Section table
section_table:
.ascii ".text"
.byte 0
#
# The offset & size fields are filled in by build.c.
#
.ascii ".setup"
.byte 0
.byte 0
.long 0
@@ -217,9 +219,8 @@ section_table:
#
# The EFI application loader requires a relocation section
# because EFI applications must be relocatable. But since
# we don't need the loader to fixup any relocs for us, we
# just create an empty (zero-length) .reloc section header.
# because EFI applications must be relocatable. The .reloc
# offset & size fields are filled in by build.c.
#
.ascii ".reloc"
.byte 0
@@ -233,6 +234,25 @@ section_table:
.word 0 # NumberOfRelocations
.word 0 # NumberOfLineNumbers
.long 0x42100040 # Characteristics (section flags)
#
# The offset & size fields are filled in by build.c.
#
.ascii ".text"
.byte 0
.byte 0
.byte 0
.long 0
.long 0x0 # startup_{32,64}
.long 0 # Size of initialized data
# on disk
.long 0x0 # startup_{32,64}
.long 0 # PointerToRelocations
.long 0 # PointerToLineNumbers
.word 0 # NumberOfRelocations
.word 0 # NumberOfLineNumbers
.long 0x60500020 # Characteristics (section flags)
#endif /* CONFIG_EFI_STUB */
# Kernel attributes; used by setup. This is part 1 of the

View File

@@ -50,6 +50,8 @@ typedef unsigned int u32;
u8 buf[SETUP_SECT_MAX*512];
int is_big_kernel;
#define PECOFF_RELOC_RESERVE 0x20
/*----------------------------------------------------------------------*/
static const u32 crctab32[] = {
@@ -133,11 +135,103 @@ static void usage(void)
die("Usage: build setup system [> image]");
}
#ifdef CONFIG_EFI_STUB
static void update_pecoff_section_header(char *section_name, u32 offset, u32 size)
{
unsigned int pe_header;
unsigned short num_sections;
u8 *section;
pe_header = get_unaligned_le32(&buf[0x3c]);
num_sections = get_unaligned_le16(&buf[pe_header + 6]);
#ifdef CONFIG_X86_32
section = &buf[pe_header + 0xa8];
#else
section = &buf[pe_header + 0xb8];
#endif
while (num_sections > 0) {
if (strncmp((char*)section, section_name, 8) == 0) {
/* section header size field */
put_unaligned_le32(size, section + 0x8);
/* section header vma field */
put_unaligned_le32(offset, section + 0xc);
/* section header 'size of initialised data' field */
put_unaligned_le32(size, section + 0x10);
/* section header 'file offset' field */
put_unaligned_le32(offset, section + 0x14);
break;
}
section += 0x28;
num_sections--;
}
}
static void update_pecoff_setup_and_reloc(unsigned int size)
{
u32 setup_offset = 0x200;
u32 reloc_offset = size - PECOFF_RELOC_RESERVE;
u32 setup_size = reloc_offset - setup_offset;
update_pecoff_section_header(".setup", setup_offset, setup_size);
update_pecoff_section_header(".reloc", reloc_offset, PECOFF_RELOC_RESERVE);
/*
* Modify .reloc section contents with a single entry. The
* relocation is applied to offset 10 of the relocation section.
*/
put_unaligned_le32(reloc_offset + 10, &buf[reloc_offset]);
put_unaligned_le32(10, &buf[reloc_offset + 4]);
}
static void update_pecoff_text(unsigned int text_start, unsigned int file_sz)
{
unsigned int pe_header;
unsigned int text_sz = file_sz - text_start;
pe_header = get_unaligned_le32(&buf[0x3c]);
/* Size of image */
put_unaligned_le32(file_sz, &buf[pe_header + 0x50]);
/*
* Size of code: Subtract the size of the first sector (512 bytes)
* which includes the header.
*/
put_unaligned_le32(file_sz - 512, &buf[pe_header + 0x1c]);
#ifdef CONFIG_X86_32
/*
* Address of entry point.
*
* The EFI stub entry point is +16 bytes from the start of
* the .text section.
*/
put_unaligned_le32(text_start + 16, &buf[pe_header + 0x28]);
#else
/*
* Address of entry point. startup_32 is at the beginning and
* the 64-bit entry point (startup_64) is always 512 bytes
* after. The EFI stub entry point is 16 bytes after that, as
* the first instruction allows legacy loaders to jump over
* the EFI stub initialisation
*/
put_unaligned_le32(text_start + 528, &buf[pe_header + 0x28]);
#endif /* CONFIG_X86_32 */
update_pecoff_section_header(".text", text_start, text_sz);
}
#endif /* CONFIG_EFI_STUB */
int main(int argc, char ** argv)
{
#ifdef CONFIG_EFI_STUB
unsigned int file_sz, pe_header;
#endif
unsigned int i, sz, setup_sectors;
int c;
u32 sys_size;
@@ -163,6 +257,12 @@ int main(int argc, char ** argv)
die("Boot block hasn't got boot flag (0xAA55)");
fclose(file);
#ifdef CONFIG_EFI_STUB
/* Reserve 0x20 bytes for .reloc section */
memset(buf+c, 0, PECOFF_RELOC_RESERVE);
c += PECOFF_RELOC_RESERVE;
#endif
/* Pad unused space with zeros */
setup_sectors = (c + 511) / 512;
if (setup_sectors < SETUP_SECT_MIN)
@@ -170,6 +270,10 @@ int main(int argc, char ** argv)
i = setup_sectors*512;
memset(buf+c, 0, i-c);
#ifdef CONFIG_EFI_STUB
update_pecoff_setup_and_reloc(i);
#endif
/* Set the default root device */
put_unaligned_le16(DEFAULT_ROOT_DEV, &buf[508]);
@@ -194,66 +298,8 @@ int main(int argc, char ** argv)
put_unaligned_le32(sys_size, &buf[0x1f4]);
#ifdef CONFIG_EFI_STUB
file_sz = sz + i + ((sys_size * 16) - sz);
pe_header = get_unaligned_le32(&buf[0x3c]);
/* Size of image */
put_unaligned_le32(file_sz, &buf[pe_header + 0x50]);
/*
* Subtract the size of the first section (512 bytes) which
* includes the header and .reloc section. The remaining size
* is that of the .text section.
*/
file_sz -= 512;
/* Size of code */
put_unaligned_le32(file_sz, &buf[pe_header + 0x1c]);
#ifdef CONFIG_X86_32
/*
* Address of entry point.
*
* The EFI stub entry point is +16 bytes from the start of
* the .text section.
*/
put_unaligned_le32(i + 16, &buf[pe_header + 0x28]);
/* .text size */
put_unaligned_le32(file_sz, &buf[pe_header + 0xb0]);
/* .text vma */
put_unaligned_le32(0x200, &buf[pe_header + 0xb4]);
/* .text size of initialised data */
put_unaligned_le32(file_sz, &buf[pe_header + 0xb8]);
/* .text file offset */
put_unaligned_le32(0x200, &buf[pe_header + 0xbc]);
#else
/*
* Address of entry point. startup_32 is at the beginning and
* the 64-bit entry point (startup_64) is always 512 bytes
* after. The EFI stub entry point is 16 bytes after that, as
* the first instruction allows legacy loaders to jump over
* the EFI stub initialisation
*/
put_unaligned_le32(i + 528, &buf[pe_header + 0x28]);
/* .text size */
put_unaligned_le32(file_sz, &buf[pe_header + 0xc0]);
/* .text vma */
put_unaligned_le32(0x200, &buf[pe_header + 0xc4]);
/* .text size of initialised data */
put_unaligned_le32(file_sz, &buf[pe_header + 0xc8]);
/* .text file offset */
put_unaligned_le32(0x200, &buf[pe_header + 0xcc]);
#endif /* CONFIG_X86_32 */
#endif /* CONFIG_EFI_STUB */
update_pecoff_text(setup_sectors * 512, sz + i + ((sys_size * 16) - sz));
#endif
crc = partial_crc32(buf, i, crc);
if (fwrite(buf, 1, i, stdout) != i)

View File

@@ -54,6 +54,20 @@ struct nmiaction {
__register_nmi_handler((t), &fn##_na); \
})
/*
* For special handlers that register/unregister in the
* init section only. This should be considered rare.
*/
#define register_nmi_handler_initonly(t, fn, fg, n) \
({ \
static struct nmiaction fn##_na __initdata = { \
.handler = (fn), \
.name = (n), \
.flags = (fg), \
}; \
__register_nmi_handler((t), &fn##_na); \
})
int __register_nmi_handler(unsigned int, struct nmiaction *);
void unregister_nmi_handler(unsigned int, const char *);

View File

@@ -33,9 +33,8 @@
#define segment_eq(a, b) ((a).seg == (b).seg)
#define user_addr_max() (current_thread_info()->addr_limit.seg)
#define __addr_ok(addr) \
((unsigned long __force)(addr) < \
(current_thread_info()->addr_limit.seg))
#define __addr_ok(addr) \
((unsigned long __force)(addr) < user_addr_max())
/*
* Test whether a block of memory is a valid user space address.
@@ -47,14 +46,14 @@
* This needs 33-bit (65-bit for x86_64) arithmetic. We have a carry...
*/
#define __range_not_ok(addr, size) \
#define __range_not_ok(addr, size, limit) \
({ \
unsigned long flag, roksum; \
__chk_user_ptr(addr); \
asm("add %3,%1 ; sbb %0,%0 ; cmp %1,%4 ; sbb $0,%0" \
: "=&r" (flag), "=r" (roksum) \
: "1" (addr), "g" ((long)(size)), \
"rm" (current_thread_info()->addr_limit.seg)); \
"rm" (limit)); \
flag; \
})
@@ -77,7 +76,8 @@
* checks that the pointer is in the user space range - after calling
* this function, memory access functions may still return -EFAULT.
*/
#define access_ok(type, addr, size) (likely(__range_not_ok(addr, size) == 0))
#define access_ok(type, addr, size) \
(likely(__range_not_ok(addr, size, user_addr_max()) == 0))
/*
* The exception table consists of pairs of addresses relative to the

View File

@@ -149,7 +149,6 @@
/* 4 bits of software ack period */
#define UV2_ACK_MASK 0x7UL
#define UV2_ACK_UNITS_SHFT 3
#define UV2_LEG_SHFT UV2H_LB_BAU_MISC_CONTROL_USE_LEGACY_DESCRIPTOR_FORMATS_SHFT
#define UV2_EXT_SHFT UV2H_LB_BAU_MISC_CONTROL_ENABLE_EXTENDED_SB_STATUS_SHFT
/*

View File

@@ -20,7 +20,6 @@
#include <linux/bitops.h>
#include <linux/ioport.h>
#include <linux/suspend.h>
#include <linux/kmemleak.h>
#include <asm/e820.h>
#include <asm/io.h>
#include <asm/iommu.h>
@@ -95,11 +94,6 @@ static u32 __init allocate_aperture(void)
return 0;
}
memblock_reserve(addr, aper_size);
/*
* Kmemleak should not scan this block as it may not be mapped via the
* kernel direct mapping.
*/
kmemleak_ignore(phys_to_virt(addr));
printk(KERN_INFO "Mapping aperture over %d KB of RAM @ %lx\n",
aper_size >> 10, addr);
insert_aperture_resource((u32)addr, aper_size);

View File

@@ -1195,7 +1195,7 @@ static void __clear_irq_vector(int irq, struct irq_cfg *cfg)
BUG_ON(!cfg->vector);
vector = cfg->vector;
for_each_cpu_and(cpu, cfg->domain, cpu_online_mask)
for_each_cpu(cpu, cfg->domain)
per_cpu(vector_irq, cpu)[vector] = -1;
cfg->vector = 0;
@@ -1203,7 +1203,7 @@ static void __clear_irq_vector(int irq, struct irq_cfg *cfg)
if (likely(!cfg->move_in_progress))
return;
for_each_cpu_and(cpu, cfg->old_domain, cpu_online_mask) {
for_each_cpu(cpu, cfg->old_domain) {
for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS;
vector++) {
if (per_cpu(vector_irq, cpu)[vector] != irq)

View File

@@ -1274,7 +1274,7 @@ static void mce_timer_fn(unsigned long data)
*/
iv = __this_cpu_read(mce_next_interval);
if (mce_notify_irq())
iv = max(iv, (unsigned long) HZ/100);
iv = max(iv / 2, (unsigned long) HZ/100);
else
iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
__this_cpu_write(mce_next_interval, iv);
@@ -1557,7 +1557,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
static void __mcheck_cpu_init_timer(void)
{
struct timer_list *t = &__get_cpu_var(mce_timer);
unsigned long iv = __this_cpu_read(mce_next_interval);
unsigned long iv = check_interval * HZ;
setup_timer(t, mce_timer_fn, smp_processor_id());

View File

@@ -1496,6 +1496,7 @@ static struct cpu_hw_events *allocate_fake_cpuc(void)
if (!cpuc->shared_regs)
goto error;
}
cpuc->is_fake = 1;
return cpuc;
error:
free_fake_cpuc(cpuc);
@@ -1756,6 +1757,12 @@ perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
dump_trace(NULL, regs, NULL, 0, &backtrace_ops, entry);
}
static inline int
valid_user_frame(const void __user *fp, unsigned long size)
{
return (__range_not_ok(fp, size, TASK_SIZE) == 0);
}
#ifdef CONFIG_COMPAT
#include <asm/compat.h>
@@ -1780,7 +1787,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
if (bytes != sizeof(frame))
break;
if (fp < compat_ptr(regs->sp))
if (!valid_user_frame(fp, sizeof(frame)))
break;
perf_callchain_store(entry, frame.return_address);
@@ -1826,7 +1833,7 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
if (bytes != sizeof(frame))
break;
if ((unsigned long)fp < regs->sp)
if (!valid_user_frame(fp, sizeof(frame)))
break;
perf_callchain_store(entry, frame.return_address);

View File

@@ -117,6 +117,7 @@ struct cpu_hw_events {
struct perf_event *event_list[X86_PMC_IDX_MAX]; /* in enabled order */
unsigned int group_flag;
int is_fake;
/*
* Intel DebugStore bits
@@ -364,6 +365,7 @@ struct x86_pmu {
int pebs_record_size;
void (*drain_pebs)(struct pt_regs *regs);
struct event_constraint *pebs_constraints;
void (*pebs_aliases)(struct perf_event *event);
/*
* Intel LBR

View File

@@ -1119,27 +1119,33 @@ intel_bts_constraints(struct perf_event *event)
return NULL;
}
static bool intel_try_alt_er(struct perf_event *event, int orig_idx)
static int intel_alt_er(int idx)
{
if (!(x86_pmu.er_flags & ERF_HAS_RSP_1))
return false;
return idx;
if (event->hw.extra_reg.idx == EXTRA_REG_RSP_0) {
event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
event->hw.config |= 0x01bb;
event->hw.extra_reg.idx = EXTRA_REG_RSP_1;
event->hw.extra_reg.reg = MSR_OFFCORE_RSP_1;
} else if (event->hw.extra_reg.idx == EXTRA_REG_RSP_1) {
if (idx == EXTRA_REG_RSP_0)
return EXTRA_REG_RSP_1;
if (idx == EXTRA_REG_RSP_1)
return EXTRA_REG_RSP_0;
return idx;
}
static void intel_fixup_er(struct perf_event *event, int idx)
{
event->hw.extra_reg.idx = idx;
if (idx == EXTRA_REG_RSP_0) {
event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
event->hw.config |= 0x01b7;
event->hw.extra_reg.idx = EXTRA_REG_RSP_0;
event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0;
} else if (idx == EXTRA_REG_RSP_1) {
event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
event->hw.config |= 0x01bb;
event->hw.extra_reg.reg = MSR_OFFCORE_RSP_1;
}
if (event->hw.extra_reg.idx == orig_idx)
return false;
return true;
}
/*
@@ -1157,14 +1163,18 @@ __intel_shared_reg_get_constraints(struct cpu_hw_events *cpuc,
struct event_constraint *c = &emptyconstraint;
struct er_account *era;
unsigned long flags;
int orig_idx = reg->idx;
int idx = reg->idx;
/* already allocated shared msr */
if (reg->alloc)
/*
* reg->alloc can be set due to existing state, so for fake cpuc we
* need to ignore this, otherwise we might fail to allocate proper fake
* state for this extra reg constraint. Also see the comment below.
*/
if (reg->alloc && !cpuc->is_fake)
return NULL; /* call x86_get_event_constraint() */
again:
era = &cpuc->shared_regs->regs[reg->idx];
era = &cpuc->shared_regs->regs[idx];
/*
* we use spin_lock_irqsave() to avoid lockdep issues when
* passing a fake cpuc
@@ -1173,6 +1183,29 @@ again:
if (!atomic_read(&era->ref) || era->config == reg->config) {
/*
* If its a fake cpuc -- as per validate_{group,event}() we
* shouldn't touch event state and we can avoid doing so
* since both will only call get_event_constraints() once
* on each event, this avoids the need for reg->alloc.
*
* Not doing the ER fixup will only result in era->reg being
* wrong, but since we won't actually try and program hardware
* this isn't a problem either.
*/
if (!cpuc->is_fake) {
if (idx != reg->idx)
intel_fixup_er(event, idx);
/*
* x86_schedule_events() can call get_event_constraints()
* multiple times on events in the case of incremental
* scheduling(). reg->alloc ensures we only do the ER
* allocation once.
*/
reg->alloc = 1;
}
/* lock in msr value */
era->config = reg->config;
era->reg = reg->reg;
@@ -1180,17 +1213,17 @@ again:
/* one more user */
atomic_inc(&era->ref);
/* no need to reallocate during incremental event scheduling */
reg->alloc = 1;
/*
* need to call x86_get_event_constraint()
* to check if associated event has constraints
*/
c = NULL;
} else if (intel_try_alt_er(event, orig_idx)) {
raw_spin_unlock_irqrestore(&era->lock, flags);
goto again;
} else {
idx = intel_alt_er(idx);
if (idx != reg->idx) {
raw_spin_unlock_irqrestore(&era->lock, flags);
goto again;
}
}
raw_spin_unlock_irqrestore(&era->lock, flags);
@@ -1204,11 +1237,14 @@ __intel_shared_reg_put_constraints(struct cpu_hw_events *cpuc,
struct er_account *era;
/*
* only put constraint if extra reg was actually
* allocated. Also takes care of event which do
* not use an extra shared reg
* Only put constraint if extra reg was actually allocated. Also takes
* care of event which do not use an extra shared reg.
*
* Also, if this is a fake cpuc we shouldn't touch any event state
* (reg->alloc) and we don't care about leaving inconsistent cpuc state
* either since it'll be thrown out.
*/
if (!reg->alloc)
if (!reg->alloc || cpuc->is_fake)
return;
era = &cpuc->shared_regs->regs[reg->idx];
@@ -1300,15 +1336,9 @@ static void intel_put_event_constraints(struct cpu_hw_events *cpuc,
intel_put_shared_regs_event_constraints(cpuc, event);
}
static int intel_pmu_hw_config(struct perf_event *event)
static void intel_pebs_aliases_core2(struct perf_event *event)
{
int ret = x86_pmu_hw_config(event);
if (ret)
return ret;
if (event->attr.precise_ip &&
(event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
if ((event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
/*
* Use an alternative encoding for CPU_CLK_UNHALTED.THREAD_P
* (0x003c) so that we can use it with PEBS.
@@ -1329,10 +1359,48 @@ static int intel_pmu_hw_config(struct perf_event *event)
*/
u64 alt_config = X86_CONFIG(.event=0xc0, .inv=1, .cmask=16);
alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
event->hw.config = alt_config;
}
}
static void intel_pebs_aliases_snb(struct perf_event *event)
{
if ((event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
/*
* Use an alternative encoding for CPU_CLK_UNHALTED.THREAD_P
* (0x003c) so that we can use it with PEBS.
*
* The regular CPU_CLK_UNHALTED.THREAD_P event (0x003c) isn't
* PEBS capable. However we can use UOPS_RETIRED.ALL
* (0x01c2), which is a PEBS capable event, to get the same
* count.
*
* UOPS_RETIRED.ALL counts the number of cycles that retires
* CNTMASK micro-ops. By setting CNTMASK to a value (16)
* larger than the maximum number of micro-ops that can be
* retired per cycle (4) and then inverting the condition, we
* count all cycles that retire 16 or less micro-ops, which
* is every cycle.
*
* Thereby we gain a PEBS capable cycle counter.
*/
u64 alt_config = X86_CONFIG(.event=0xc2, .umask=0x01, .inv=1, .cmask=16);
alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
event->hw.config = alt_config;
}
}
static int intel_pmu_hw_config(struct perf_event *event)
{
int ret = x86_pmu_hw_config(event);
if (ret)
return ret;
if (event->attr.precise_ip && x86_pmu.pebs_aliases)
x86_pmu.pebs_aliases(event);
if (intel_pmu_needs_lbr_smpl(event)) {
ret = intel_pmu_setup_lbr_filter(event);
@@ -1607,6 +1675,7 @@ static __initconst const struct x86_pmu intel_pmu = {
.max_period = (1ULL << 31) - 1,
.get_event_constraints = intel_get_event_constraints,
.put_event_constraints = intel_put_event_constraints,
.pebs_aliases = intel_pebs_aliases_core2,
.format_attrs = intel_arch3_formats_attr,
@@ -1840,8 +1909,9 @@ __init int intel_pmu_init(void)
break;
case 42: /* SandyBridge */
x86_add_quirk(intel_sandybridge_quirk);
case 45: /* SandyBridge, "Romely-EP" */
x86_add_quirk(intel_sandybridge_quirk);
case 58: /* IvyBridge */
memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
@@ -1849,6 +1919,7 @@ __init int intel_pmu_init(void)
x86_pmu.event_constraints = intel_snb_event_constraints;
x86_pmu.pebs_constraints = intel_snb_pebs_event_constraints;
x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
x86_pmu.extra_regs = intel_snb_extra_regs;
/* all extra regs are per-cpu when HT is on */
x86_pmu.er_flags |= ERF_HAS_RSP_1;

View File

@@ -400,14 +400,7 @@ struct event_constraint intel_snb_pebs_event_constraints[] = {
INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xcd, 0x8), /* MEM_TRANS_RETIRED.* */
INTEL_UEVENT_CONSTRAINT(0x11d0, 0xf), /* MEM_UOP_RETIRED.STLB_MISS_LOADS */
INTEL_UEVENT_CONSTRAINT(0x12d0, 0xf), /* MEM_UOP_RETIRED.STLB_MISS_STORES */
INTEL_UEVENT_CONSTRAINT(0x21d0, 0xf), /* MEM_UOP_RETIRED.LOCK_LOADS */
INTEL_UEVENT_CONSTRAINT(0x22d0, 0xf), /* MEM_UOP_RETIRED.LOCK_STORES */
INTEL_UEVENT_CONSTRAINT(0x41d0, 0xf), /* MEM_UOP_RETIRED.SPLIT_LOADS */
INTEL_UEVENT_CONSTRAINT(0x42d0, 0xf), /* MEM_UOP_RETIRED.SPLIT_STORES */
INTEL_UEVENT_CONSTRAINT(0x81d0, 0xf), /* MEM_UOP_RETIRED.ANY_LOADS */
INTEL_UEVENT_CONSTRAINT(0x82d0, 0xf), /* MEM_UOP_RETIRED.ANY_STORES */
INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
INTEL_UEVENT_CONSTRAINT(0x02d4, 0xf), /* MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS */

View File

@@ -42,7 +42,7 @@ static int __init nmi_unk_cb(unsigned int val, struct pt_regs *regs)
static void __init init_nmi_testsuite(void)
{
/* trap all the unknown NMIs we may generate */
register_nmi_handler(NMI_UNKNOWN, nmi_unk_cb, 0, "nmi_selftest_unk");
register_nmi_handler_initonly(NMI_UNKNOWN, nmi_unk_cb, 0, "nmi_selftest_unk");
}
static void __init cleanup_nmi_testsuite(void)
@@ -64,7 +64,7 @@ static void __init test_nmi_ipi(struct cpumask *mask)
{
unsigned long timeout;
if (register_nmi_handler(NMI_LOCAL, test_nmi_ipi_callback,
if (register_nmi_handler_initonly(NMI_LOCAL, test_nmi_ipi_callback,
NMI_FLAG_FIRST, "nmi_selftest")) {
nmi_fail = FAILURE;
return;

View File

@@ -639,9 +639,11 @@ void native_machine_shutdown(void)
set_cpus_allowed_ptr(current, cpumask_of(reboot_cpu_id));
/*
* O.K Now that I'm on the appropriate processor,
* stop all of the others.
* O.K Now that I'm on the appropriate processor, stop all of the
* others. Also disable the local irq to not receive the per-cpu
* timer interrupt which may trigger scheduler's load balance.
*/
local_irq_disable();
stop_other_cpus();
#endif

View File

@@ -382,6 +382,15 @@ void __cpuinit set_cpu_sibling_map(int cpu)
if ((i == cpu) || (has_mc && match_llc(c, o)))
link_mask(llc_shared, cpu, i);
}
/*
* This needs a separate iteration over the cpus because we rely on all
* cpu_sibling_mask links to be set-up.
*/
for_each_cpu(i, cpu_sibling_setup_mask) {
o = &cpu_data(i);
if ((i == cpu) || (has_mc && match_mc(c, o))) {
link_mask(core, cpu, i);
@@ -410,15 +419,7 @@ void __cpuinit set_cpu_sibling_map(int cpu)
/* maps the cpu to the sched domain representing multi-core */
const struct cpumask *cpu_coregroup_mask(int cpu)
{
struct cpuinfo_x86 *c = &cpu_data(cpu);
/*
* For perf, we return last level cache shared map.
* And for power savings, we return cpu_core_map
*/
if (!(cpu_has(c, X86_FEATURE_AMD_DCM)))
return cpu_core_mask(cpu);
else
return cpu_llc_shared_mask(cpu);
return cpu_llc_shared_mask(cpu);
}
static void impress_friends(void)

View File

@@ -8,6 +8,7 @@
#include <linux/module.h>
#include <asm/word-at-a-time.h>
#include <linux/sched.h>
/*
* best effort, GUP based copy_from_user() that is NMI-safe
@@ -21,6 +22,9 @@ copy_from_user_nmi(void *to, const void __user *from, unsigned long n)
void *map;
int ret;
if (__range_not_ok(from, n, TASK_SIZE) == 0)
return len;
do {
ret = __get_user_pages_fast(addr, 1, 0, &page);
if (!ret)

View File

@@ -28,7 +28,7 @@
# - (66): the last prefix is 0x66
# - (F3): the last prefix is 0xF3
# - (F2): the last prefix is 0xF2
#
# - (!F3) : the last prefix is not 0xF3 (including non-last prefix case)
Table: one byte opcode
Referrer:
@@ -515,12 +515,12 @@ b4: LFS Gv,Mp
b5: LGS Gv,Mp
b6: MOVZX Gv,Eb
b7: MOVZX Gv,Ew
b8: JMPE | POPCNT Gv,Ev (F3)
b8: JMPE (!F3) | POPCNT Gv,Ev (F3)
b9: Grp10 (1A)
ba: Grp8 Ev,Ib (1A)
bb: BTC Ev,Gv
bc: BSF Gv,Ev | TZCNT Gv,Ev (F3)
bd: BSR Gv,Ev | LZCNT Gv,Ev (F3)
bc: BSF Gv,Ev (!F3) | TZCNT Gv,Ev (F3)
bd: BSR Gv,Ev (!F3) | LZCNT Gv,Ev (F3)
be: MOVSX Gv,Eb
bf: MOVSX Gv,Ew
# 0x0f 0xc0-0xcf

View File

@@ -62,7 +62,8 @@ static void __init find_early_table_space(struct map_range *mr, unsigned long en
extra += PMD_SIZE;
#endif
/* The first 2/4M doesn't use large pages. */
extra += mr->end - mr->start;
if (mr->start < PMD_SIZE)
extra += mr->end - mr->start;
ptes = (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
} else

View File

@@ -176,6 +176,8 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
return;
}
node_set(node, numa_nodes_parsed);
printk(KERN_INFO "SRAT: Node %u PXM %u [mem %#010Lx-%#010Lx]\n",
node, pxm,
(unsigned long long) start, (unsigned long long) end - 1);

View File

@@ -782,7 +782,7 @@ BLOCKING_NOTIFIER_HEAD(intel_scu_notifier);
EXPORT_SYMBOL_GPL(intel_scu_notifier);
/* Called by IPC driver */
void intel_scu_devices_create(void)
void __devinit intel_scu_devices_create(void)
{
int i;

View File

@@ -1295,7 +1295,6 @@ static void __init enable_timeouts(void)
*/
mmr_image |= (1L << SOFTACK_MSHIFT);
if (is_uv2_hub()) {
mmr_image &= ~(1L << UV2_LEG_SHFT);
mmr_image |= (1L << UV2_EXT_SHFT);
}
write_mmr_misc_control(pnode, mmr_image);

View File

@@ -66,9 +66,10 @@ BEGIN {
rex_expr = "^REX(\\.[XRWB]+)*"
fpu_expr = "^ESC" # TODO
lprefix1_expr = "\\(66\\)"
lprefix1_expr = "\\((66|!F3)\\)"
lprefix2_expr = "\\(F3\\)"
lprefix3_expr = "\\(F2\\)"
lprefix3_expr = "\\((F2|!F3)\\)"
lprefix_expr = "\\((66|F2|F3)\\)"
max_lprefix = 4
# All opcodes starting with lower-case 'v' or with (v1) superscript
@@ -333,13 +334,16 @@ function convert_operands(count,opnd, i,j,imm,mod)
if (match(ext, lprefix1_expr)) {
lptable1[idx] = add_flags(lptable1[idx],flags)
variant = "INAT_VARIANT"
} else if (match(ext, lprefix2_expr)) {
}
if (match(ext, lprefix2_expr)) {
lptable2[idx] = add_flags(lptable2[idx],flags)
variant = "INAT_VARIANT"
} else if (match(ext, lprefix3_expr)) {
}
if (match(ext, lprefix3_expr)) {
lptable3[idx] = add_flags(lptable3[idx],flags)
variant = "INAT_VARIANT"
} else {
}
if (!match(ext, lprefix_expr)){
table[idx] = add_flags(table[idx],flags)
}
}

View File

@@ -31,5 +31,5 @@ asmlinkage long sys_pselect6(int n, fd_set __user *inp, fd_set __user *outp,
asmlinkage long sys_ppoll(struct pollfd __user *ufds, unsigned int nfds,
struct timespec __user *tsp, const sigset_t __user *sigmask,
size_t sigsetsize);
asmlinkage long sys_rt_sigsuspend(sigset_t __user *unewset,
size_t sigsetsize);

View File

@@ -493,7 +493,7 @@ static void do_signal(struct pt_regs *regs)
if (ret)
return;
signal_delivered(signr, info, ka, regs, 0);
signal_delivered(signr, &info, &ka, regs, 0);
if (current->ptrace & PT_SINGLESTEP)
task_pt_regs(current)->icountlevel = 1;

View File

@@ -208,7 +208,7 @@ config ACPI_IPMI
config ACPI_HOTPLUG_CPU
bool
depends on ACPI_PROCESSOR && HOTPLUG_CPU
depends on EXPERIMENTAL && ACPI_PROCESSOR && HOTPLUG_CPU
select ACPI_CONTAINER
default y

View File

@@ -643,11 +643,19 @@ static int acpi_battery_update(struct acpi_battery *battery)
static void acpi_battery_refresh(struct acpi_battery *battery)
{
int power_unit;
if (!battery->bat.dev)
return;
power_unit = battery->power_unit;
acpi_battery_get_info(battery);
/* The battery may have changed its reporting units. */
if (power_unit == battery->power_unit)
return;
/* The battery has changed its reporting units. */
sysfs_remove_battery(battery);
sysfs_add_battery(battery);
}

View File

@@ -182,41 +182,66 @@ EXPORT_SYMBOL(acpi_bus_get_private_data);
Power Management
-------------------------------------------------------------------------- */
static const char *state_string(int state)
{
switch (state) {
case ACPI_STATE_D0:
return "D0";
case ACPI_STATE_D1:
return "D1";
case ACPI_STATE_D2:
return "D2";
case ACPI_STATE_D3_HOT:
return "D3hot";
case ACPI_STATE_D3_COLD:
return "D3";
default:
return "(unknown)";
}
}
static int __acpi_bus_get_power(struct acpi_device *device, int *state)
{
int result = 0;
acpi_status status = 0;
unsigned long long psc = 0;
int result = ACPI_STATE_UNKNOWN;
if (!device || !state)
return -EINVAL;
*state = ACPI_STATE_UNKNOWN;
if (device->flags.power_manageable) {
/*
* Get the device's power state either directly (via _PSC) or
* indirectly (via power resources).
*/
if (device->power.flags.power_resources) {
result = acpi_power_get_inferred_state(device, state);
if (result)
return result;
} else if (device->power.flags.explicit_get) {
status = acpi_evaluate_integer(device->handle, "_PSC",
NULL, &psc);
if (ACPI_FAILURE(status))
return -ENODEV;
*state = (int)psc;
}
} else {
if (!device->flags.power_manageable) {
/* TBD: Non-recursive algorithm for walking up hierarchy. */
*state = device->parent ?
device->parent->power.state : ACPI_STATE_D0;
goto out;
}
ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Device [%s] power state is D%d\n",
device->pnp.bus_id, *state));
/*
* Get the device's power state either directly (via _PSC) or
* indirectly (via power resources).
*/
if (device->power.flags.explicit_get) {
unsigned long long psc;
acpi_status status = acpi_evaluate_integer(device->handle,
"_PSC", NULL, &psc);
if (ACPI_FAILURE(status))
return -ENODEV;
result = psc;
}
/* The test below covers ACPI_STATE_UNKNOWN too. */
if (result <= ACPI_STATE_D2) {
; /* Do nothing. */
} else if (device->power.flags.power_resources) {
int error = acpi_power_get_inferred_state(device, &result);
if (error)
return error;
} else if (result == ACPI_STATE_D3_HOT) {
result = ACPI_STATE_D3;
}
*state = result;
out:
ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Device [%s] power state is %s\n",
device->pnp.bus_id, state_string(*state)));
return 0;
}
@@ -234,13 +259,14 @@ static int __acpi_bus_set_power(struct acpi_device *device, int state)
/* Make sure this is a valid target state */
if (state == device->power.state) {
ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Device is already at D%d\n",
state));
ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Device is already at %s\n",
state_string(state)));
return 0;
}
if (!device->power.states[state].flags.valid) {
printk(KERN_WARNING PREFIX "Device does not support D%d\n", state);
printk(KERN_WARNING PREFIX "Device does not support %s\n",
state_string(state));
return -ENODEV;
}
if (device->parent && (state < device->parent->power.state)) {
@@ -294,13 +320,13 @@ static int __acpi_bus_set_power(struct acpi_device *device, int state)
end:
if (result)
printk(KERN_WARNING PREFIX
"Device [%s] failed to transition to D%d\n",
device->pnp.bus_id, state);
"Device [%s] failed to transition to %s\n",
device->pnp.bus_id, state_string(state));
else {
device->power.state = state;
ACPI_DEBUG_PRINT((ACPI_DB_INFO,
"Device [%s] transitioned to D%d\n",
device->pnp.bus_id, state));
"Device [%s] transitioned to %s\n",
device->pnp.bus_id, state_string(state)));
}
return result;

View File

@@ -631,7 +631,7 @@ int acpi_power_get_inferred_state(struct acpi_device *device, int *state)
* We know a device's inferred power state when all the resources
* required for a given D-state are 'on'.
*/
for (i = ACPI_STATE_D0; i < ACPI_STATE_D3_HOT; i++) {
for (i = ACPI_STATE_D0; i <= ACPI_STATE_D3_HOT; i++) {
list = &device->power.states[i].resources;
if (list->count < 1)
continue;

View File

@@ -333,6 +333,7 @@ static int acpi_processor_get_performance_states(struct acpi_processor *pr)
struct acpi_buffer state = { 0, NULL };
union acpi_object *pss = NULL;
int i;
int last_invalid = -1;
status = acpi_evaluate_object(pr->handle, "_PSS", NULL, &buffer);
@@ -394,14 +395,33 @@ static int acpi_processor_get_performance_states(struct acpi_processor *pr)
((u32)(px->core_frequency * 1000) !=
(px->core_frequency * 1000))) {
printk(KERN_ERR FW_BUG PREFIX
"Invalid BIOS _PSS frequency: 0x%llx MHz\n",
px->core_frequency);
result = -EFAULT;
kfree(pr->performance->states);
goto end;
"Invalid BIOS _PSS frequency found for processor %d: 0x%llx MHz\n",
pr->id, px->core_frequency);
if (last_invalid == -1)
last_invalid = i;
} else {
if (last_invalid != -1) {
/*
* Copy this valid entry over last_invalid entry
*/
memcpy(&(pr->performance->states[last_invalid]),
px, sizeof(struct acpi_processor_px));
++last_invalid;
}
}
}
if (last_invalid == 0) {
printk(KERN_ERR FW_BUG PREFIX
"No valid BIOS _PSS frequency found for processor %d\n", pr->id);
result = -EFAULT;
kfree(pr->performance->states);
pr->performance->states = NULL;
}
if (last_invalid > 0)
pr->performance->state_count = last_invalid;
end:
kfree(buffer.pointer);

View File

@@ -1567,6 +1567,7 @@ static int acpi_bus_scan_fixed(void)
ACPI_BUS_TYPE_POWER_BUTTON,
ACPI_STA_DEFAULT,
&ops);
device_init_wakeup(&device->dev, true);
}
if ((acpi_gbl_FADT.flags & ACPI_FADT_SLEEP_BUTTON) == 0) {

View File

@@ -57,6 +57,7 @@ MODULE_PARM_DESC(gts, "Enable evaluation of _GTS on suspend.");
MODULE_PARM_DESC(bfs, "Enable evaluation of _BFS on resume.");
static u8 sleep_states[ACPI_S_STATE_COUNT];
static bool pwr_btn_event_pending;
static void acpi_sleep_tts_switch(u32 acpi_state)
{
@@ -184,6 +185,14 @@ static int acpi_pm_prepare(void)
return error;
}
static int find_powerf_dev(struct device *dev, void *data)
{
struct acpi_device *device = to_acpi_device(dev);
const char *hid = acpi_device_hid(device);
return !strcmp(hid, ACPI_BUTTON_HID_POWERF);
}
/**
* acpi_pm_finish - Instruct the platform to leave a sleep state.
*
@@ -192,6 +201,7 @@ static int acpi_pm_prepare(void)
*/
static void acpi_pm_finish(void)
{
struct device *pwr_btn_dev;
u32 acpi_state = acpi_target_sleep_state;
acpi_ec_unblock_transactions();
@@ -209,6 +219,23 @@ static void acpi_pm_finish(void)
acpi_set_firmware_waking_vector((acpi_physical_address) 0);
acpi_target_sleep_state = ACPI_STATE_S0;
/* If we were woken with the fixed power button, provide a small
* hint to userspace in the form of a wakeup event on the fixed power
* button device (if it can be found).
*
* We delay the event generation til now, as the PM layer requires
* timekeeping to be running before we generate events. */
if (!pwr_btn_event_pending)
return;
pwr_btn_event_pending = false;
pwr_btn_dev = bus_find_device(&acpi_bus_type, NULL, NULL,
find_powerf_dev);
if (pwr_btn_dev) {
pm_wakeup_event(pwr_btn_dev, 0);
put_device(pwr_btn_dev);
}
}
/**
@@ -298,9 +325,23 @@ static int acpi_suspend_enter(suspend_state_t pm_state)
/* ACPI 3.0 specs (P62) says that it's the responsibility
* of the OSPM to clear the status bit [ implying that the
* POWER_BUTTON event should not reach userspace ]
*
* However, we do generate a small hint for userspace in the form of
* a wakeup event. We flag this condition for now and generate the
* event later, as we're currently too early in resume to be able to
* generate wakeup events.
*/
if (ACPI_SUCCESS(status) && (acpi_state == ACPI_STATE_S3))
acpi_clear_event(ACPI_EVENT_POWER_BUTTON);
if (ACPI_SUCCESS(status) && (acpi_state == ACPI_STATE_S3)) {
acpi_event_status pwr_btn_status;
acpi_get_event_status(ACPI_EVENT_POWER_BUTTON, &pwr_btn_status);
if (pwr_btn_status & ACPI_EVENT_FLAG_SET) {
acpi_clear_event(ACPI_EVENT_POWER_BUTTON);
/* Flag for later */
pwr_btn_event_pending = true;
}
}
/*
* Disable and clear GPE status before interrupt is enabled. Some GPEs
@@ -730,8 +771,8 @@ int acpi_pm_device_sleep_state(struct device *dev, int *d_min_p)
* can wake the system. _S0W may be valid, too.
*/
if (acpi_target_sleep_state == ACPI_STATE_S0 ||
(device_may_wakeup(dev) &&
adev->wakeup.sleep_state <= acpi_target_sleep_state)) {
(device_may_wakeup(dev) && adev->wakeup.flags.valid &&
adev->wakeup.sleep_state >= acpi_target_sleep_state)) {
acpi_status status;
acpi_method[3] = 'W';


@@ -1687,10 +1687,6 @@ static int acpi_video_bus_add(struct acpi_device *device)
set_bit(KEY_BRIGHTNESS_ZERO, input->keybit);
set_bit(KEY_DISPLAY_OFF, input->keybit);
error = input_register_device(input);
if (error)
goto err_stop_video;
printk(KERN_INFO PREFIX "%s [%s] (multi-head: %s rom: %s post: %s)\n",
ACPI_VIDEO_DEVICE_NAME, acpi_device_bid(device),
video->flags.multihead ? "yes" : "no",
@@ -1701,12 +1697,16 @@ static int acpi_video_bus_add(struct acpi_device *device)
video->pm_nb.priority = 0;
error = register_pm_notifier(&video->pm_nb);
if (error)
goto err_unregister_input_dev;
goto err_stop_video;
error = input_register_device(input);
if (error)
goto err_unregister_pm_notifier;
return 0;
err_unregister_input_dev:
input_unregister_device(input);
err_unregister_pm_notifier:
unregister_pm_notifier(&video->pm_nb);
err_stop_video:
acpi_video_bus_stop_devices(video);
err_free_input_dev:
@@ -1743,9 +1743,18 @@ static int acpi_video_bus_remove(struct acpi_device *device, int type)
return 0;
}
static int __init is_i740(struct pci_dev *dev)
{
if (dev->device == 0x00D1)
return 1;
if (dev->device == 0x7000)
return 1;
return 0;
}
static int __init intel_opregion_present(void)
{
#if defined(CONFIG_DRM_I915) || defined(CONFIG_DRM_I915_MODULE)
int opregion = 0;
struct pci_dev *dev = NULL;
u32 address;
@@ -1754,13 +1763,15 @@ static int __init intel_opregion_present(void)
continue;
if (dev->vendor != PCI_VENDOR_ID_INTEL)
continue;
/* We don't want to poke around undefined i740 registers */
if (is_i740(dev))
continue;
pci_read_config_dword(dev, 0xfc, &address);
if (!address)
continue;
return 1;
opregion = 1;
}
#endif
return 0;
return opregion;
}
int acpi_video_register(void)


@@ -898,6 +898,7 @@ static struct pci_device_id agp_intel_pci_table[] = {
ID(PCI_DEVICE_ID_INTEL_B43_HB),
ID(PCI_DEVICE_ID_INTEL_B43_1_HB),
ID(PCI_DEVICE_ID_INTEL_IRONLAKE_D_HB),
ID(PCI_DEVICE_ID_INTEL_IRONLAKE_D2_HB),
ID(PCI_DEVICE_ID_INTEL_IRONLAKE_M_HB),
ID(PCI_DEVICE_ID_INTEL_IRONLAKE_MA_HB),
ID(PCI_DEVICE_ID_INTEL_IRONLAKE_MC2_HB),


@@ -212,6 +212,7 @@
#define PCI_DEVICE_ID_INTEL_G41_HB 0x2E30
#define PCI_DEVICE_ID_INTEL_G41_IG 0x2E32
#define PCI_DEVICE_ID_INTEL_IRONLAKE_D_HB 0x0040
#define PCI_DEVICE_ID_INTEL_IRONLAKE_D2_HB 0x0069
#define PCI_DEVICE_ID_INTEL_IRONLAKE_D_IG 0x0042
#define PCI_DEVICE_ID_INTEL_IRONLAKE_M_HB 0x0044
#define PCI_DEVICE_ID_INTEL_IRONLAKE_MA_HB 0x0062


@@ -6,6 +6,7 @@ obj-$(CONFIG_CS5535_CLOCK_EVENT_SRC) += cs5535-clockevt.o
obj-$(CONFIG_SH_TIMER_CMT) += sh_cmt.o
obj-$(CONFIG_SH_TIMER_MTU2) += sh_mtu2.o
obj-$(CONFIG_SH_TIMER_TMU) += sh_tmu.o
obj-$(CONFIG_EM_TIMER_STI) += em_sti.o
obj-$(CONFIG_CLKBLD_I8253) += i8253.o
obj-$(CONFIG_CLKSRC_MMIO) += mmio.o
obj-$(CONFIG_DW_APB_TIMER) += dw_apb_timer.o


@@ -0,0 +1,406 @@
/*
* Emma Mobile Timer Support - STI
*
* Copyright (C) 2012 Magnus Damm
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <linux/init.h>
#include <linux/platform_device.h>
#include <linux/spinlock.h>
#include <linux/interrupt.h>
#include <linux/ioport.h>
#include <linux/io.h>
#include <linux/clk.h>
#include <linux/irq.h>
#include <linux/err.h>
#include <linux/delay.h>
#include <linux/clocksource.h>
#include <linux/clockchips.h>
#include <linux/slab.h>
#include <linux/module.h>
enum { USER_CLOCKSOURCE, USER_CLOCKEVENT, USER_NR };
struct em_sti_priv {
void __iomem *base;
struct clk *clk;
struct platform_device *pdev;
unsigned int active[USER_NR];
unsigned long rate;
raw_spinlock_t lock;
struct clock_event_device ced;
struct clocksource cs;
};
#define STI_CONTROL 0x00
#define STI_COMPA_H 0x10
#define STI_COMPA_L 0x14
#define STI_COMPB_H 0x18
#define STI_COMPB_L 0x1c
#define STI_COUNT_H 0x20
#define STI_COUNT_L 0x24
#define STI_COUNT_RAW_H 0x28
#define STI_COUNT_RAW_L 0x2c
#define STI_SET_H 0x30
#define STI_SET_L 0x34
#define STI_INTSTATUS 0x40
#define STI_INTRAWSTATUS 0x44
#define STI_INTENSET 0x48
#define STI_INTENCLR 0x4c
#define STI_INTFFCLR 0x50
static inline unsigned long em_sti_read(struct em_sti_priv *p, int offs)
{
return ioread32(p->base + offs);
}
static inline void em_sti_write(struct em_sti_priv *p, int offs,
unsigned long value)
{
iowrite32(value, p->base + offs);
}
static int em_sti_enable(struct em_sti_priv *p)
{
int ret;
/* enable clock */
ret = clk_enable(p->clk);
if (ret) {
dev_err(&p->pdev->dev, "cannot enable clock\n");
return ret;
}
/* configure channel, periodic mode and maximum timeout */
p->rate = clk_get_rate(p->clk);
/* reset the counter */
em_sti_write(p, STI_SET_H, 0x40000000);
em_sti_write(p, STI_SET_L, 0x00000000);
/* mask and clear pending interrupts */
em_sti_write(p, STI_INTENCLR, 3);
em_sti_write(p, STI_INTFFCLR, 3);
/* enable updates of counter registers */
em_sti_write(p, STI_CONTROL, 1);
return 0;
}
static void em_sti_disable(struct em_sti_priv *p)
{
/* mask interrupts */
em_sti_write(p, STI_INTENCLR, 3);
/* stop clock */
clk_disable(p->clk);
}
static cycle_t em_sti_count(struct em_sti_priv *p)
{
cycle_t ticks;
unsigned long flags;
/* the STI hardware buffers the 48-bit count, but to
* break it out into two 32-bit access the registers
* must be accessed in a certain order.
* Always read STI_COUNT_H before STI_COUNT_L.
*/
raw_spin_lock_irqsave(&p->lock, flags);
ticks = (cycle_t)(em_sti_read(p, STI_COUNT_H) & 0xffff) << 32;
ticks |= em_sti_read(p, STI_COUNT_L);
raw_spin_unlock_irqrestore(&p->lock, flags);
return ticks;
}
static cycle_t em_sti_set_next(struct em_sti_priv *p, cycle_t next)
{
unsigned long flags;
raw_spin_lock_irqsave(&p->lock, flags);
/* mask compare A interrupt */
em_sti_write(p, STI_INTENCLR, 1);
/* update compare A value */
em_sti_write(p, STI_COMPA_H, next >> 32);
em_sti_write(p, STI_COMPA_L, next & 0xffffffff);
/* clear compare A interrupt source */
em_sti_write(p, STI_INTFFCLR, 1);
/* unmask compare A interrupt */
em_sti_write(p, STI_INTENSET, 1);
raw_spin_unlock_irqrestore(&p->lock, flags);
return next;
}
static irqreturn_t em_sti_interrupt(int irq, void *dev_id)
{
struct em_sti_priv *p = dev_id;
p->ced.event_handler(&p->ced);
return IRQ_HANDLED;
}
static int em_sti_start(struct em_sti_priv *p, unsigned int user)
{
unsigned long flags;
int used_before;
int ret = 0;
raw_spin_lock_irqsave(&p->lock, flags);
used_before = p->active[USER_CLOCKSOURCE] | p->active[USER_CLOCKEVENT];
if (!used_before)
ret = em_sti_enable(p);
if (!ret)
p->active[user] = 1;
raw_spin_unlock_irqrestore(&p->lock, flags);
return ret;
}
static void em_sti_stop(struct em_sti_priv *p, unsigned int user)
{
unsigned long flags;
int used_before, used_after;
raw_spin_lock_irqsave(&p->lock, flags);
used_before = p->active[USER_CLOCKSOURCE] | p->active[USER_CLOCKEVENT];
p->active[user] = 0;
used_after = p->active[USER_CLOCKSOURCE] | p->active[USER_CLOCKEVENT];
if (used_before && !used_after)
em_sti_disable(p);
raw_spin_unlock_irqrestore(&p->lock, flags);
}
static struct em_sti_priv *cs_to_em_sti(struct clocksource *cs)
{
return container_of(cs, struct em_sti_priv, cs);
}
static cycle_t em_sti_clocksource_read(struct clocksource *cs)
{
return em_sti_count(cs_to_em_sti(cs));
}
static int em_sti_clocksource_enable(struct clocksource *cs)
{
int ret;
struct em_sti_priv *p = cs_to_em_sti(cs);
ret = em_sti_start(p, USER_CLOCKSOURCE);
if (!ret)
__clocksource_updatefreq_hz(cs, p->rate);
return ret;
}
static void em_sti_clocksource_disable(struct clocksource *cs)
{
em_sti_stop(cs_to_em_sti(cs), USER_CLOCKSOURCE);
}
static void em_sti_clocksource_resume(struct clocksource *cs)
{
em_sti_clocksource_enable(cs);
}
static int em_sti_register_clocksource(struct em_sti_priv *p)
{
struct clocksource *cs = &p->cs;
memset(cs, 0, sizeof(*cs));
cs->name = dev_name(&p->pdev->dev);
cs->rating = 200;
cs->read = em_sti_clocksource_read;
cs->enable = em_sti_clocksource_enable;
cs->disable = em_sti_clocksource_disable;
cs->suspend = em_sti_clocksource_disable;
cs->resume = em_sti_clocksource_resume;
cs->mask = CLOCKSOURCE_MASK(48);
cs->flags = CLOCK_SOURCE_IS_CONTINUOUS;
dev_info(&p->pdev->dev, "used as clock source\n");
/* Register with dummy 1 Hz value, gets updated in ->enable() */
clocksource_register_hz(cs, 1);
return 0;
}
static struct em_sti_priv *ced_to_em_sti(struct clock_event_device *ced)
{
return container_of(ced, struct em_sti_priv, ced);
}
static void em_sti_clock_event_mode(enum clock_event_mode mode,
struct clock_event_device *ced)
{
struct em_sti_priv *p = ced_to_em_sti(ced);
/* deal with old setting first */
switch (ced->mode) {
case CLOCK_EVT_MODE_ONESHOT:
em_sti_stop(p, USER_CLOCKEVENT);
break;
default:
break;
}
switch (mode) {
case CLOCK_EVT_MODE_ONESHOT:
dev_info(&p->pdev->dev, "used for oneshot clock events\n");
em_sti_start(p, USER_CLOCKEVENT);
clockevents_config(&p->ced, p->rate);
break;
case CLOCK_EVT_MODE_SHUTDOWN:
case CLOCK_EVT_MODE_UNUSED:
em_sti_stop(p, USER_CLOCKEVENT);
break;
default:
break;
}
}
static int em_sti_clock_event_next(unsigned long delta,
struct clock_event_device *ced)
{
struct em_sti_priv *p = ced_to_em_sti(ced);
cycle_t next;
int safe;
next = em_sti_set_next(p, em_sti_count(p) + delta);
safe = em_sti_count(p) < (next - 1);
return !safe;
}
static void em_sti_register_clockevent(struct em_sti_priv *p)
{
struct clock_event_device *ced = &p->ced;
memset(ced, 0, sizeof(*ced));
ced->name = dev_name(&p->pdev->dev);
ced->features = CLOCK_EVT_FEAT_ONESHOT;
ced->rating = 200;
ced->cpumask = cpumask_of(0);
ced->set_next_event = em_sti_clock_event_next;
ced->set_mode = em_sti_clock_event_mode;
dev_info(&p->pdev->dev, "used for clock events\n");
/* Register with dummy 1 Hz value, gets updated in ->set_mode() */
clockevents_config_and_register(ced, 1, 2, 0xffffffff);
}
static int __devinit em_sti_probe(struct platform_device *pdev)
{
struct em_sti_priv *p;
struct resource *res;
int irq, ret;
p = kzalloc(sizeof(*p), GFP_KERNEL);
if (p == NULL) {
dev_err(&pdev->dev, "failed to allocate driver data\n");
ret = -ENOMEM;
goto err0;
}
p->pdev = pdev;
platform_set_drvdata(pdev, p);
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res) {
dev_err(&pdev->dev, "failed to get I/O memory\n");
ret = -EINVAL;
goto err0;
}
irq = platform_get_irq(pdev, 0);
if (irq < 0) {
dev_err(&pdev->dev, "failed to get irq\n");
ret = -EINVAL;
goto err0;
}
/* map memory, let base point to the STI instance */
p->base = ioremap_nocache(res->start, resource_size(res));
if (p->base == NULL) {
dev_err(&pdev->dev, "failed to remap I/O memory\n");
ret = -ENXIO;
goto err0;
}
/* get hold of clock */
p->clk = clk_get(&pdev->dev, "sclk");
if (IS_ERR(p->clk)) {
dev_err(&pdev->dev, "cannot get clock\n");
ret = PTR_ERR(p->clk);
goto err1;
}
if (request_irq(irq, em_sti_interrupt,
IRQF_TIMER | IRQF_IRQPOLL | IRQF_NOBALANCING,
dev_name(&pdev->dev), p)) {
dev_err(&pdev->dev, "failed to request low IRQ\n");
ret = -ENOENT;
goto err2;
}
raw_spin_lock_init(&p->lock);
em_sti_register_clockevent(p);
em_sti_register_clocksource(p);
return 0;
err2:
clk_put(p->clk);
err1:
iounmap(p->base);
err0:
kfree(p);
return ret;
}
static int __devexit em_sti_remove(struct platform_device *pdev)
{
return -EBUSY; /* cannot unregister clockevent and clocksource */
}
static const struct of_device_id em_sti_dt_ids[] __devinitconst = {
{ .compatible = "renesas,em-sti", },
{},
};
MODULE_DEVICE_TABLE(of, em_sti_dt_ids);
static struct platform_driver em_sti_device_driver = {
.probe = em_sti_probe,
.remove = __devexit_p(em_sti_remove),
.driver = {
.name = "em_sti",
.of_match_table = em_sti_dt_ids,
}
};
module_platform_driver(em_sti_device_driver);
MODULE_AUTHOR("Magnus Damm");
MODULE_DESCRIPTION("Renesas Emma Mobile STI Timer Driver");
MODULE_LICENSE("GPL v2");


@@ -2833,7 +2833,7 @@ static __init void exynos5_gpiolib_init(void)
}
/* need to set base address for gpc4 */
exonys5_gpios_1[11].base = gpio_base1 + 0x2E0;
exynos5_gpios_1[11].base = gpio_base1 + 0x2E0;
/* need to set base address for gpx */
chip = &exynos5_gpios_1[21];


@@ -244,8 +244,8 @@ static const struct file_operations exynos_drm_driver_fops = {
};
static struct drm_driver exynos_drm_driver = {
.driver_features = DRIVER_HAVE_IRQ | DRIVER_BUS_PLATFORM |
DRIVER_MODESET | DRIVER_GEM | DRIVER_PRIME,
.driver_features = DRIVER_HAVE_IRQ | DRIVER_MODESET |
DRIVER_GEM | DRIVER_PRIME,
.load = exynos_drm_load,
.unload = exynos_drm_unload,
.open = exynos_drm_open,


@@ -172,19 +172,12 @@ static void exynos_drm_encoder_commit(struct drm_encoder *encoder)
manager_ops->commit(manager->dev);
}
static struct drm_crtc *
exynos_drm_encoder_get_crtc(struct drm_encoder *encoder)
{
return encoder->crtc;
}
static struct drm_encoder_helper_funcs exynos_encoder_helper_funcs = {
.dpms = exynos_drm_encoder_dpms,
.mode_fixup = exynos_drm_encoder_mode_fixup,
.mode_set = exynos_drm_encoder_mode_set,
.prepare = exynos_drm_encoder_prepare,
.commit = exynos_drm_encoder_commit,
.get_crtc = exynos_drm_encoder_get_crtc,
};
static void exynos_drm_encoder_destroy(struct drm_encoder *encoder)


@@ -51,11 +51,22 @@ struct exynos_drm_fb {
static void exynos_drm_fb_destroy(struct drm_framebuffer *fb)
{
struct exynos_drm_fb *exynos_fb = to_exynos_fb(fb);
unsigned int i;
DRM_DEBUG_KMS("%s\n", __FILE__);
drm_framebuffer_cleanup(fb);
for (i = 0; i < ARRAY_SIZE(exynos_fb->exynos_gem_obj); i++) {
struct drm_gem_object *obj;
if (exynos_fb->exynos_gem_obj[i] == NULL)
continue;
obj = &exynos_fb->exynos_gem_obj[i]->base;
drm_gem_object_unreference_unlocked(obj);
}
kfree(exynos_fb);
exynos_fb = NULL;
}
@@ -134,11 +145,11 @@ exynos_user_fb_create(struct drm_device *dev, struct drm_file *file_priv,
return ERR_PTR(-ENOENT);
}
drm_gem_object_unreference_unlocked(obj);
fb = exynos_drm_framebuffer_init(dev, mode_cmd, obj);
if (IS_ERR(fb))
if (IS_ERR(fb)) {
drm_gem_object_unreference_unlocked(obj);
return fb;
}
exynos_fb = to_exynos_fb(fb);
nr = exynos_drm_format_num_buffers(fb->pixel_format);
@@ -152,8 +163,6 @@ exynos_user_fb_create(struct drm_device *dev, struct drm_file *file_priv,
return ERR_PTR(-ENOENT);
}
drm_gem_object_unreference_unlocked(obj);
exynos_fb->exynos_gem_obj[i] = to_exynos_gem_obj(obj);
}


@@ -31,10 +31,10 @@
static inline int exynos_drm_format_num_buffers(uint32_t format)
{
switch (format) {
case DRM_FORMAT_NV12M:
case DRM_FORMAT_NV12:
case DRM_FORMAT_NV12MT:
return 2;
case DRM_FORMAT_YUV420M:
case DRM_FORMAT_YUV420:
return 3;
default:
return 1;


@@ -689,7 +689,6 @@ int exynos_drm_gem_dumb_map_offset(struct drm_file *file_priv,
struct drm_device *dev, uint32_t handle,
uint64_t *offset)
{
struct exynos_drm_gem_obj *exynos_gem_obj;
struct drm_gem_object *obj;
int ret = 0;
@@ -710,15 +709,13 @@ int exynos_drm_gem_dumb_map_offset(struct drm_file *file_priv,
goto unlock;
}
exynos_gem_obj = to_exynos_gem_obj(obj);
if (!exynos_gem_obj->base.map_list.map) {
ret = drm_gem_create_mmap_offset(&exynos_gem_obj->base);
if (!obj->map_list.map) {
ret = drm_gem_create_mmap_offset(obj);
if (ret)
goto out;
}
*offset = (u64)exynos_gem_obj->base.map_list.hash.key << PAGE_SHIFT;
*offset = (u64)obj->map_list.hash.key << PAGE_SHIFT;
DRM_DEBUG_KMS("offset = 0x%lx\n", (unsigned long)*offset);
out:


@@ -365,7 +365,7 @@ static void vp_video_buffer(struct mixer_context *ctx, int win)
switch (win_data->pixel_format) {
case DRM_FORMAT_NV12MT:
tiled_mode = true;
case DRM_FORMAT_NV12M:
case DRM_FORMAT_NV12:
crcb_mode = false;
buf_num = 2;
break;
@@ -601,18 +601,20 @@ static void mixer_win_reset(struct mixer_context *ctx)
mixer_reg_write(res, MXR_BG_COLOR2, 0x008080);
/* setting graphical layers */
val = MXR_GRP_CFG_COLOR_KEY_DISABLE; /* no blank key */
val |= MXR_GRP_CFG_WIN_BLEND_EN;
val |= MXR_GRP_CFG_BLEND_PRE_MUL;
val |= MXR_GRP_CFG_PIXEL_BLEND_EN;
val |= MXR_GRP_CFG_ALPHA_VAL(0xff); /* non-transparent alpha */
/* the same configuration for both layers */
mixer_reg_write(res, MXR_GRAPHIC_CFG(0), val);
val |= MXR_GRP_CFG_BLEND_PRE_MUL;
val |= MXR_GRP_CFG_PIXEL_BLEND_EN;
mixer_reg_write(res, MXR_GRAPHIC_CFG(1), val);
/* setting video layers */
val = MXR_GRP_CFG_ALPHA_VAL(0);
mixer_reg_write(res, MXR_VIDEO_CFG, val);
/* configuration of Video Processor Registers */
vp_win_reset(ctx);
vp_default_filter(res);


@@ -233,6 +233,7 @@ static const struct intel_device_info intel_sandybridge_d_info = {
.has_blt_ring = 1,
.has_llc = 1,
.has_pch_split = 1,
.has_force_wake = 1,
};
static const struct intel_device_info intel_sandybridge_m_info = {
@@ -243,6 +244,7 @@ static const struct intel_device_info intel_sandybridge_m_info = {
.has_blt_ring = 1,
.has_llc = 1,
.has_pch_split = 1,
.has_force_wake = 1,
};
static const struct intel_device_info intel_ivybridge_d_info = {
@@ -252,6 +254,7 @@ static const struct intel_device_info intel_ivybridge_d_info = {
.has_blt_ring = 1,
.has_llc = 1,
.has_pch_split = 1,
.has_force_wake = 1,
};
static const struct intel_device_info intel_ivybridge_m_info = {
@@ -262,6 +265,7 @@ static const struct intel_device_info intel_ivybridge_m_info = {
.has_blt_ring = 1,
.has_llc = 1,
.has_pch_split = 1,
.has_force_wake = 1,
};
static const struct intel_device_info intel_valleyview_m_info = {
@@ -289,6 +293,7 @@ static const struct intel_device_info intel_haswell_d_info = {
.has_blt_ring = 1,
.has_llc = 1,
.has_pch_split = 1,
.has_force_wake = 1,
};
static const struct intel_device_info intel_haswell_m_info = {
@@ -298,6 +303,7 @@ static const struct intel_device_info intel_haswell_m_info = {
.has_blt_ring = 1,
.has_llc = 1,
.has_pch_split = 1,
.has_force_wake = 1,
};
static const struct pci_device_id pciidlist[] = { /* aka */
@@ -1139,10 +1145,9 @@ MODULE_LICENSE("GPL and additional rights");
/* We give fast paths for the really cool registers */
#define NEEDS_FORCE_WAKE(dev_priv, reg) \
(((dev_priv)->info->gen >= 6) && \
((reg) < 0x40000) && \
((reg) != FORCEWAKE)) && \
(!IS_VALLEYVIEW((dev_priv)->dev))
((HAS_FORCE_WAKE((dev_priv)->dev)) && \
((reg) < 0x40000) && \
((reg) != FORCEWAKE))
#define __i915_read(x, y) \
u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \


@@ -285,6 +285,7 @@ struct intel_device_info {
u8 is_ivybridge:1;
u8 is_valleyview:1;
u8 has_pch_split:1;
u8 has_force_wake:1;
u8 is_haswell:1;
u8 has_fbc:1;
u8 has_pipe_cxsr:1;
@@ -1101,6 +1102,8 @@ struct drm_i915_file_private {
#define HAS_PCH_CPT(dev) (INTEL_PCH_TYPE(dev) == PCH_CPT)
#define HAS_PCH_IBX(dev) (INTEL_PCH_TYPE(dev) == PCH_IBX)
#define HAS_FORCE_WAKE(dev) (INTEL_INFO(dev)->has_force_wake)
#include "i915_trace.h"
/**


@@ -510,7 +510,7 @@ out:
return ret;
}
static void pch_irq_handler(struct drm_device *dev, u32 pch_iir)
static void ibx_irq_handler(struct drm_device *dev, u32 pch_iir)
{
drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
int pipe;
@@ -550,6 +550,35 @@ static void pch_irq_handler(struct drm_device *dev, u32 pch_iir)
DRM_DEBUG_DRIVER("PCH transcoder A underrun interrupt\n");
}
static void cpt_irq_handler(struct drm_device *dev, u32 pch_iir)
{
drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
int pipe;
if (pch_iir & SDE_AUDIO_POWER_MASK_CPT)
DRM_DEBUG_DRIVER("PCH audio power change on port %d\n",
(pch_iir & SDE_AUDIO_POWER_MASK_CPT) >>
SDE_AUDIO_POWER_SHIFT_CPT);
if (pch_iir & SDE_AUX_MASK_CPT)
DRM_DEBUG_DRIVER("AUX channel interrupt\n");
if (pch_iir & SDE_GMBUS_CPT)
DRM_DEBUG_DRIVER("PCH GMBUS interrupt\n");
if (pch_iir & SDE_AUDIO_CP_REQ_CPT)
DRM_DEBUG_DRIVER("Audio CP request interrupt\n");
if (pch_iir & SDE_AUDIO_CP_CHG_CPT)
DRM_DEBUG_DRIVER("Audio CP change interrupt\n");
if (pch_iir & SDE_FDI_MASK_CPT)
for_each_pipe(pipe)
DRM_DEBUG_DRIVER(" pipe %c FDI IIR: 0x%08x\n",
pipe_name(pipe),
I915_READ(FDI_RX_IIR(pipe)));
}
static irqreturn_t ivybridge_irq_handler(DRM_IRQ_ARGS)
{
struct drm_device *dev = (struct drm_device *) arg;
@@ -591,7 +620,7 @@ static irqreturn_t ivybridge_irq_handler(DRM_IRQ_ARGS)
if (pch_iir & SDE_HOTPLUG_MASK_CPT)
queue_work(dev_priv->wq, &dev_priv->hotplug_work);
pch_irq_handler(dev, pch_iir);
cpt_irq_handler(dev, pch_iir);
/* clear PCH hotplug event before clear CPU irq */
I915_WRITE(SDEIIR, pch_iir);
@@ -684,7 +713,10 @@ static irqreturn_t ironlake_irq_handler(DRM_IRQ_ARGS)
if (de_iir & DE_PCH_EVENT) {
if (pch_iir & hotplug_mask)
queue_work(dev_priv->wq, &dev_priv->hotplug_work);
pch_irq_handler(dev, pch_iir);
if (HAS_PCH_CPT(dev))
cpt_irq_handler(dev, pch_iir);
else
ibx_irq_handler(dev, pch_iir);
}
if (de_iir & DE_PCU_EVENT) {


@@ -210,6 +210,14 @@
#define MI_DISPLAY_FLIP MI_INSTR(0x14, 2)
#define MI_DISPLAY_FLIP_I915 MI_INSTR(0x14, 1)
#define MI_DISPLAY_FLIP_PLANE(n) ((n) << 20)
/* IVB has funny definitions for which plane to flip. */
#define MI_DISPLAY_FLIP_IVB_PLANE_A (0 << 19)
#define MI_DISPLAY_FLIP_IVB_PLANE_B (1 << 19)
#define MI_DISPLAY_FLIP_IVB_SPRITE_A (2 << 19)
#define MI_DISPLAY_FLIP_IVB_SPRITE_B (3 << 19)
#define MI_DISPLAY_FLIP_IVB_PLANE_C (4 << 19)
#define MI_DISPLAY_FLIP_IVB_SPRITE_C (5 << 19)
#define MI_SET_CONTEXT MI_INSTR(0x18, 0)
#define MI_MM_SPACE_GTT (1<<8)
#define MI_MM_SPACE_PHYSICAL (0<<8)
@@ -3313,7 +3321,7 @@
/* PCH */
/* south display engine interrupt */
/* south display engine interrupt: IBX */
#define SDE_AUDIO_POWER_D (1 << 27)
#define SDE_AUDIO_POWER_C (1 << 26)
#define SDE_AUDIO_POWER_B (1 << 25)
@@ -3349,15 +3357,44 @@
#define SDE_TRANSA_CRC_ERR (1 << 1)
#define SDE_TRANSA_FIFO_UNDER (1 << 0)
#define SDE_TRANS_MASK (0x3f)
/* CPT */
#define SDE_CRT_HOTPLUG_CPT (1 << 19)
/* south display engine interrupt: CPT/PPT */
#define SDE_AUDIO_POWER_D_CPT (1 << 31)
#define SDE_AUDIO_POWER_C_CPT (1 << 30)
#define SDE_AUDIO_POWER_B_CPT (1 << 29)
#define SDE_AUDIO_POWER_SHIFT_CPT 29
#define SDE_AUDIO_POWER_MASK_CPT (7 << 29)
#define SDE_AUXD_CPT (1 << 27)
#define SDE_AUXC_CPT (1 << 26)
#define SDE_AUXB_CPT (1 << 25)
#define SDE_AUX_MASK_CPT (7 << 25)
#define SDE_PORTD_HOTPLUG_CPT (1 << 23)
#define SDE_PORTC_HOTPLUG_CPT (1 << 22)
#define SDE_PORTB_HOTPLUG_CPT (1 << 21)
#define SDE_CRT_HOTPLUG_CPT (1 << 19)
#define SDE_HOTPLUG_MASK_CPT (SDE_CRT_HOTPLUG_CPT | \
SDE_PORTD_HOTPLUG_CPT | \
SDE_PORTC_HOTPLUG_CPT | \
SDE_PORTB_HOTPLUG_CPT)
#define SDE_GMBUS_CPT (1 << 17)
#define SDE_AUDIO_CP_REQ_C_CPT (1 << 10)
#define SDE_AUDIO_CP_CHG_C_CPT (1 << 9)
#define SDE_FDI_RXC_CPT (1 << 8)
#define SDE_AUDIO_CP_REQ_B_CPT (1 << 6)
#define SDE_AUDIO_CP_CHG_B_CPT (1 << 5)
#define SDE_FDI_RXB_CPT (1 << 4)
#define SDE_AUDIO_CP_REQ_A_CPT (1 << 2)
#define SDE_AUDIO_CP_CHG_A_CPT (1 << 1)
#define SDE_FDI_RXA_CPT (1 << 0)
#define SDE_AUDIO_CP_REQ_CPT (SDE_AUDIO_CP_REQ_C_CPT | \
SDE_AUDIO_CP_REQ_B_CPT | \
SDE_AUDIO_CP_REQ_A_CPT)
#define SDE_AUDIO_CP_CHG_CPT (SDE_AUDIO_CP_CHG_C_CPT | \
SDE_AUDIO_CP_CHG_B_CPT | \
SDE_AUDIO_CP_CHG_A_CPT)
#define SDE_FDI_MASK_CPT (SDE_FDI_RXC_CPT | \
SDE_FDI_RXB_CPT | \
SDE_FDI_RXA_CPT)
#define SDEISR 0xc4000
#define SDEIMR 0xc4004


@@ -6158,17 +6158,34 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
struct intel_ring_buffer *ring = &dev_priv->ring[BCS];
uint32_t plane_bit = 0;
int ret;
ret = intel_pin_and_fence_fb_obj(dev, obj, ring);
if (ret)
goto err;
switch(intel_crtc->plane) {
case PLANE_A:
plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_A;
break;
case PLANE_B:
plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_B;
break;
case PLANE_C:
plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_C;
break;
default:
WARN_ONCE(1, "unknown plane in flip command\n");
ret = -ENODEV;
goto err;
}
ret = intel_ring_begin(ring, 4);
if (ret)
goto err_unpin;
intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | (intel_crtc->plane << 19));
intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit);
intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode));
intel_ring_emit(ring, (obj->gtt_offset));
intel_ring_emit(ring, (MI_NOOP));


@@ -266,10 +266,15 @@ u32 intel_ring_get_active_head(struct intel_ring_buffer *ring)
static int init_ring_common(struct intel_ring_buffer *ring)
{
drm_i915_private_t *dev_priv = ring->dev->dev_private;
struct drm_device *dev = ring->dev;
drm_i915_private_t *dev_priv = dev->dev_private;
struct drm_i915_gem_object *obj = ring->obj;
int ret = 0;
u32 head;
if (HAS_FORCE_WAKE(dev))
gen6_gt_force_wake_get(dev_priv);
/* Stop the ring if it's running. */
I915_WRITE_CTL(ring, 0);
I915_WRITE_HEAD(ring, 0);
@@ -317,7 +322,8 @@ static int init_ring_common(struct intel_ring_buffer *ring)
I915_READ_HEAD(ring),
I915_READ_TAIL(ring),
I915_READ_START(ring));
return -EIO;
ret = -EIO;
goto out;
}
if (!drm_core_check_feature(ring->dev, DRIVER_MODESET))
@@ -326,9 +332,14 @@ static int init_ring_common(struct intel_ring_buffer *ring)
ring->head = I915_READ_HEAD(ring);
ring->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
ring->space = ring_space(ring);
ring->last_retired_head = -1;
}
return 0;
out:
if (HAS_FORCE_WAKE(dev))
gen6_gt_force_wake_put(dev_priv);
return ret;
}
static int
@@ -987,6 +998,10 @@ static int intel_init_ring_buffer(struct drm_device *dev,
if (ret)
goto err_unref;
ret = i915_gem_object_set_to_gtt_domain(obj, true);
if (ret)
goto err_unpin;
ring->virtual_start = ioremap_wc(dev->agp->base + obj->gtt_offset,
ring->size);
if (ring->virtual_start == NULL) {


@@ -460,15 +460,28 @@ static void cayman_gpu_init(struct radeon_device *rdev)
rdev->config.cayman.max_pipes_per_simd = 4;
rdev->config.cayman.max_tile_pipes = 2;
if ((rdev->pdev->device == 0x9900) ||
(rdev->pdev->device == 0x9901)) {
(rdev->pdev->device == 0x9901) ||
(rdev->pdev->device == 0x9905) ||
(rdev->pdev->device == 0x9906) ||
(rdev->pdev->device == 0x9907) ||
(rdev->pdev->device == 0x9908) ||
(rdev->pdev->device == 0x9909) ||
(rdev->pdev->device == 0x9910) ||
(rdev->pdev->device == 0x9917)) {
rdev->config.cayman.max_simds_per_se = 6;
rdev->config.cayman.max_backends_per_se = 2;
} else if ((rdev->pdev->device == 0x9903) ||
(rdev->pdev->device == 0x9904)) {
(rdev->pdev->device == 0x9904) ||
(rdev->pdev->device == 0x990A) ||
(rdev->pdev->device == 0x9913) ||
(rdev->pdev->device == 0x9918)) {
rdev->config.cayman.max_simds_per_se = 4;
rdev->config.cayman.max_backends_per_se = 2;
} else if ((rdev->pdev->device == 0x9990) ||
(rdev->pdev->device == 0x9991)) {
} else if ((rdev->pdev->device == 0x9919) ||
(rdev->pdev->device == 0x9990) ||
(rdev->pdev->device == 0x9991) ||
(rdev->pdev->device == 0x9994) ||
(rdev->pdev->device == 0x99A0)) {
rdev->config.cayman.max_simds_per_se = 3;
rdev->config.cayman.max_backends_per_se = 1;
} else {


@@ -2426,6 +2426,12 @@ int r600_startup(struct radeon_device *rdev)
if (r)
return r;
r = r600_audio_init(rdev);
if (r) {
DRM_ERROR("radeon: audio init failed\n");
return r;
}
return 0;
}
@@ -2462,12 +2468,6 @@ int r600_resume(struct radeon_device *rdev)
return r;
}
r = r600_audio_init(rdev);
if (r) {
DRM_ERROR("radeon: audio resume failed\n");
return r;
}
return r;
}
@@ -2577,9 +2577,6 @@ int r600_init(struct radeon_device *rdev)
rdev->accel_working = false;
}
r = r600_audio_init(rdev);
if (r)
return r; /* TODO error handling */
return 0;
}


@@ -192,6 +192,7 @@ void r600_audio_set_clock(struct drm_encoder *encoder, int clock)
struct radeon_device *rdev = dev->dev_private;
struct radeon_encoder *radeon_encoder = to_radeon_encoder(encoder);
struct radeon_encoder_atom_dig *dig = radeon_encoder->enc_priv;
struct radeon_crtc *radeon_crtc = to_radeon_crtc(encoder->crtc);
int base_rate = 48000;
switch (radeon_encoder->encoder_id) {
@@ -217,8 +218,8 @@ void r600_audio_set_clock(struct drm_encoder *encoder, int clock)
WREG32(EVERGREEN_AUDIO_PLL1_DIV, clock * 10);
WREG32(EVERGREEN_AUDIO_PLL1_UNK, 0x00000071);
/* Some magic trigger or src sel? */
WREG32_P(0x5ac, 0x01, ~0x77);
/* Select DTO source */
WREG32(0x5ac, radeon_crtc->crtc_id);
} else {
switch (dig->dig_encoder) {
case 0:


@@ -348,7 +348,6 @@ void r600_hdmi_setmode(struct drm_encoder *encoder, struct drm_display_mode *mod
WREG32(HDMI0_AUDIO_PACKET_CONTROL + offset,
HDMI0_AUDIO_SAMPLE_SEND | /* send audio packets */
HDMI0_AUDIO_DELAY_EN(1) | /* default audio delay */
HDMI0_AUDIO_SEND_MAX_PACKETS | /* send NULL packets if no audio is available */
HDMI0_AUDIO_PACKETS_PER_LINE(3) | /* should be suffient for all audio modes and small enough for all hblanks */
HDMI0_60958_CS_UPDATE); /* allow 60958 channel status fields to be updated */
}


@@ -1374,9 +1374,9 @@ struct cayman_asic {
struct si_asic {
unsigned max_shader_engines;
unsigned max_pipes_per_simd;
unsigned max_tile_pipes;
unsigned max_simds_per_se;
unsigned max_cu_per_sh;
unsigned max_sh_per_se;
unsigned max_backends_per_se;
unsigned max_texture_channel_caches;
unsigned max_gprs;
@@ -1387,7 +1387,6 @@ struct si_asic {
unsigned sc_hiz_tile_fifo_size;
unsigned sc_earlyz_tile_fifo_size;
unsigned num_shader_engines;
unsigned num_tile_pipes;
unsigned num_backends_per_se;
unsigned backend_disable_mask_per_asic;


@@ -476,12 +476,18 @@ int radeon_vm_bo_add(struct radeon_device *rdev,
mutex_lock(&vm->mutex);
if (last_pfn > vm->last_pfn) {
/* grow va space 32M by 32M */
unsigned align = ((32 << 20) >> 12) - 1;
/* release mutex and lock in right order */
mutex_unlock(&vm->mutex);
radeon_mutex_lock(&rdev->cs_mutex);
radeon_vm_unbind_locked(rdev, vm);
mutex_lock(&vm->mutex);
/* and check again */
if (last_pfn > vm->last_pfn) {
/* grow va space 32M by 32M */
unsigned align = ((32 << 20) >> 12) - 1;
radeon_vm_unbind_locked(rdev, vm);
vm->last_pfn = (last_pfn + align) & ~align;
}
radeon_mutex_unlock(&rdev->cs_mutex);
vm->last_pfn = (last_pfn + align) & ~align;
}
head = &vm->va;
last_offset = 0;
@@ -595,8 +601,8 @@ int radeon_vm_bo_rmv(struct radeon_device *rdev,
if (bo_va == NULL)
return 0;
mutex_lock(&vm->mutex);
radeon_mutex_lock(&rdev->cs_mutex);
mutex_lock(&vm->mutex);
radeon_vm_bo_update_pte(rdev, vm, bo, NULL);
radeon_mutex_unlock(&rdev->cs_mutex);
list_del(&bo_va->vm_list);
@@ -641,9 +647,8 @@ void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm)
struct radeon_bo_va *bo_va, *tmp;
int r;
mutex_lock(&vm->mutex);
radeon_mutex_lock(&rdev->cs_mutex);
mutex_lock(&vm->mutex);
radeon_vm_unbind_locked(rdev, vm);
radeon_mutex_unlock(&rdev->cs_mutex);


@@ -273,7 +273,7 @@ int radeon_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
break;
case RADEON_INFO_MAX_PIPES:
if (rdev->family >= CHIP_TAHITI)
value = rdev->config.si.max_pipes_per_simd;
value = rdev->config.si.max_cu_per_sh;
else if (rdev->family >= CHIP_CAYMAN)
value = rdev->config.cayman.max_pipes_per_simd;
else if (rdev->family >= CHIP_CEDAR)


@@ -908,12 +908,6 @@ static int rs600_startup(struct radeon_device *rdev)
return r;
}
r = r600_audio_init(rdev);
if (r) {
dev_err(rdev->dev, "failed initializing audio\n");
return r;
}
r = radeon_ib_pool_start(rdev);
if (r)
return r;
@@ -922,6 +916,12 @@ static int rs600_startup(struct radeon_device *rdev)
if (r)
return r;
r = r600_audio_init(rdev);
if (r) {
dev_err(rdev->dev, "failed initializing audio\n");
return r;
}
return 0;
}


@@ -637,12 +637,6 @@ static int rs690_startup(struct radeon_device *rdev)
return r;
}
r = r600_audio_init(rdev);
if (r) {
dev_err(rdev->dev, "failed initializing audio\n");
return r;
}
r = radeon_ib_pool_start(rdev);
if (r)
return r;
@@ -651,6 +645,12 @@ static int rs690_startup(struct radeon_device *rdev)
if (r)
return r;
r = r600_audio_init(rdev);
if (r) {
dev_err(rdev->dev, "failed initializing audio\n");
return r;
}
return 0;
}


@@ -956,6 +956,12 @@ static int rv770_startup(struct radeon_device *rdev)
if (r)
return r;
r = r600_audio_init(rdev);
if (r) {
DRM_ERROR("radeon: audio init failed\n");
return r;
}
return 0;
}
@@ -978,12 +984,6 @@ int rv770_resume(struct radeon_device *rdev)
return r;
}
r = r600_audio_init(rdev);
if (r) {
dev_err(rdev->dev, "radeon: audio init failed\n");
return r;
}
return r;
}
@@ -1092,12 +1092,6 @@ int rv770_init(struct radeon_device *rdev)
rdev->accel_working = false;
}
r = r600_audio_init(rdev);
if (r) {
dev_err(rdev->dev, "radeon: audio init failed\n");
return r;
}
return 0;
}


@@ -867,200 +867,6 @@ void dce6_bandwidth_update(struct radeon_device *rdev)
/*
* Core functions
*/
static u32 si_get_tile_pipe_to_backend_map(struct radeon_device *rdev,
u32 num_tile_pipes,
u32 num_backends_per_asic,
u32 *backend_disable_mask_per_asic,
u32 num_shader_engines)
{
u32 backend_map = 0;
u32 enabled_backends_mask = 0;
u32 enabled_backends_count = 0;
u32 num_backends_per_se;
u32 cur_pipe;
u32 swizzle_pipe[SI_MAX_PIPES];
u32 cur_backend = 0;
u32 i;
bool force_no_swizzle;
/* force legal values */
if (num_tile_pipes < 1)
num_tile_pipes = 1;
if (num_tile_pipes > rdev->config.si.max_tile_pipes)
num_tile_pipes = rdev->config.si.max_tile_pipes;
if (num_shader_engines < 1)
num_shader_engines = 1;
if (num_shader_engines > rdev->config.si.max_shader_engines)
num_shader_engines = rdev->config.si.max_shader_engines;
if (num_backends_per_asic < num_shader_engines)
num_backends_per_asic = num_shader_engines;
if (num_backends_per_asic > (rdev->config.si.max_backends_per_se * num_shader_engines))
num_backends_per_asic = rdev->config.si.max_backends_per_se * num_shader_engines;
/* make sure we have the same number of backends per se */
num_backends_per_asic = ALIGN(num_backends_per_asic, num_shader_engines);
/* set up the number of backends per se */
num_backends_per_se = num_backends_per_asic / num_shader_engines;
if (num_backends_per_se > rdev->config.si.max_backends_per_se) {
num_backends_per_se = rdev->config.si.max_backends_per_se;
num_backends_per_asic = num_backends_per_se * num_shader_engines;
}
/* create enable mask and count for enabled backends */
for (i = 0; i < SI_MAX_BACKENDS; ++i) {
if (((*backend_disable_mask_per_asic >> i) & 1) == 0) {
enabled_backends_mask |= (1 << i);
++enabled_backends_count;
}
if (enabled_backends_count == num_backends_per_asic)
break;
}
/* force the backends mask to match the current number of backends */
if (enabled_backends_count != num_backends_per_asic) {
u32 this_backend_enabled;
u32 shader_engine;
u32 backend_per_se;
enabled_backends_mask = 0;
enabled_backends_count = 0;
*backend_disable_mask_per_asic = SI_MAX_BACKENDS_MASK;
for (i = 0; i < SI_MAX_BACKENDS; ++i) {
/* calc the current se */
shader_engine = i / rdev->config.si.max_backends_per_se;
/* calc the backend per se */
backend_per_se = i % rdev->config.si.max_backends_per_se;
/* default to not enabled */
this_backend_enabled = 0;
if ((shader_engine < num_shader_engines) &&
(backend_per_se < num_backends_per_se))
this_backend_enabled = 1;
if (this_backend_enabled) {
enabled_backends_mask |= (1 << i);
*backend_disable_mask_per_asic &= ~(1 << i);
++enabled_backends_count;
}
}
}
memset((uint8_t *)&swizzle_pipe[0], 0, sizeof(u32) * SI_MAX_PIPES);
switch (rdev->family) {
case CHIP_TAHITI:
case CHIP_PITCAIRN:
case CHIP_VERDE:
force_no_swizzle = true;
break;
default:
force_no_swizzle = false;
break;
}
if (force_no_swizzle) {
bool last_backend_enabled = false;
force_no_swizzle = false;
for (i = 0; i < SI_MAX_BACKENDS; ++i) {
if (((enabled_backends_mask >> i) & 1) == 1) {
if (last_backend_enabled)
force_no_swizzle = true;
last_backend_enabled = true;
} else
last_backend_enabled = false;
}
}
switch (num_tile_pipes) {
case 1:
case 3:
case 5:
case 7:
DRM_ERROR("odd number of pipes!\n");
break;
case 2:
swizzle_pipe[0] = 0;
swizzle_pipe[1] = 1;
break;
case 4:
if (force_no_swizzle) {
swizzle_pipe[0] = 0;
swizzle_pipe[1] = 1;
swizzle_pipe[2] = 2;
swizzle_pipe[3] = 3;
} else {
swizzle_pipe[0] = 0;
swizzle_pipe[1] = 2;
swizzle_pipe[2] = 1;
swizzle_pipe[3] = 3;
}
break;
case 6:
if (force_no_swizzle) {
swizzle_pipe[0] = 0;
swizzle_pipe[1] = 1;
swizzle_pipe[2] = 2;
swizzle_pipe[3] = 3;
swizzle_pipe[4] = 4;
swizzle_pipe[5] = 5;
} else {
swizzle_pipe[0] = 0;
swizzle_pipe[1] = 2;
swizzle_pipe[2] = 4;
swizzle_pipe[3] = 1;
swizzle_pipe[4] = 3;
swizzle_pipe[5] = 5;
}
break;
case 8:
if (force_no_swizzle) {
swizzle_pipe[0] = 0;
swizzle_pipe[1] = 1;
swizzle_pipe[2] = 2;
swizzle_pipe[3] = 3;
swizzle_pipe[4] = 4;
swizzle_pipe[5] = 5;
swizzle_pipe[6] = 6;
swizzle_pipe[7] = 7;
} else {
swizzle_pipe[0] = 0;
swizzle_pipe[1] = 2;
swizzle_pipe[2] = 4;
swizzle_pipe[3] = 6;
swizzle_pipe[4] = 1;
swizzle_pipe[5] = 3;
swizzle_pipe[6] = 5;
swizzle_pipe[7] = 7;
}
break;
}
for (cur_pipe = 0; cur_pipe < num_tile_pipes; ++cur_pipe) {
while (((1 << cur_backend) & enabled_backends_mask) == 0)
cur_backend = (cur_backend + 1) % SI_MAX_BACKENDS;
backend_map |= (((cur_backend & 0xf) << (swizzle_pipe[cur_pipe] * 4)));
cur_backend = (cur_backend + 1) % SI_MAX_BACKENDS;
}
return backend_map;
}
static u32 si_get_disable_mask_per_asic(struct radeon_device *rdev,
u32 disable_mask_per_se,
u32 max_disable_mask_per_se,
u32 num_shader_engines)
{
u32 disable_field_width_per_se = r600_count_pipe_bits(disable_mask_per_se);
u32 disable_mask_per_asic = disable_mask_per_se & max_disable_mask_per_se;
if (num_shader_engines == 1)
return disable_mask_per_asic;
else if (num_shader_engines == 2)
return disable_mask_per_asic | (disable_mask_per_asic << disable_field_width_per_se);
else
return 0xffffffff;
}
static void si_tiling_mode_table_init(struct radeon_device *rdev)
{
const u32 num_tile_mode_states = 32;
@@ -1562,18 +1368,151 @@ static void si_tiling_mode_table_init(struct radeon_device *rdev)
DRM_ERROR("unknown asic: 0x%x\n", rdev->family);
}
static void si_select_se_sh(struct radeon_device *rdev,
u32 se_num, u32 sh_num)
{
u32 data = INSTANCE_BROADCAST_WRITES;
if ((se_num == 0xffffffff) && (sh_num == 0xffffffff))
data = SH_BROADCAST_WRITES | SE_BROADCAST_WRITES;
else if (se_num == 0xffffffff)
data |= SE_BROADCAST_WRITES | SH_INDEX(sh_num);
else if (sh_num == 0xffffffff)
data |= SH_BROADCAST_WRITES | SE_INDEX(se_num);
else
data |= SH_INDEX(sh_num) | SE_INDEX(se_num);
WREG32(GRBM_GFX_INDEX, data);
}
static u32 si_create_bitmask(u32 bit_width)
{
u32 i, mask = 0;
for (i = 0; i < bit_width; i++) {
mask <<= 1;
mask |= 1;
}
return mask;
}
static u32 si_get_cu_enabled(struct radeon_device *rdev, u32 cu_per_sh)
{
u32 data, mask;
data = RREG32(CC_GC_SHADER_ARRAY_CONFIG);
if (data & 1)
data &= INACTIVE_CUS_MASK;
else
data = 0;
data |= RREG32(GC_USER_SHADER_ARRAY_CONFIG);
data >>= INACTIVE_CUS_SHIFT;
mask = si_create_bitmask(cu_per_sh);
return ~data & mask;
}
static void si_setup_spi(struct radeon_device *rdev,
u32 se_num, u32 sh_per_se,
u32 cu_per_sh)
{
int i, j, k;
u32 data, mask, active_cu;
for (i = 0; i < se_num; i++) {
for (j = 0; j < sh_per_se; j++) {
si_select_se_sh(rdev, i, j);
data = RREG32(SPI_STATIC_THREAD_MGMT_3);
active_cu = si_get_cu_enabled(rdev, cu_per_sh);
mask = 1;
for (k = 0; k < 16; k++) {
mask <<= k;
if (active_cu & mask) {
data &= ~mask;
WREG32(SPI_STATIC_THREAD_MGMT_3, data);
break;
}
}
}
}
si_select_se_sh(rdev, 0xffffffff, 0xffffffff);
}
static u32 si_get_rb_disabled(struct radeon_device *rdev,
u32 max_rb_num, u32 se_num,
u32 sh_per_se)
{
u32 data, mask;
data = RREG32(CC_RB_BACKEND_DISABLE);
if (data & 1)
data &= BACKEND_DISABLE_MASK;
else
data = 0;
data |= RREG32(GC_USER_RB_BACKEND_DISABLE);
data >>= BACKEND_DISABLE_SHIFT;
mask = si_create_bitmask(max_rb_num / se_num / sh_per_se);
return data & mask;
}
static void si_setup_rb(struct radeon_device *rdev,
u32 se_num, u32 sh_per_se,
u32 max_rb_num)
{
int i, j;
u32 data, mask;
u32 disabled_rbs = 0;
u32 enabled_rbs = 0;
for (i = 0; i < se_num; i++) {
for (j = 0; j < sh_per_se; j++) {
si_select_se_sh(rdev, i, j);
data = si_get_rb_disabled(rdev, max_rb_num, se_num, sh_per_se);
disabled_rbs |= data << ((i * sh_per_se + j) * TAHITI_RB_BITMAP_WIDTH_PER_SH);
}
}
si_select_se_sh(rdev, 0xffffffff, 0xffffffff);
mask = 1;
for (i = 0; i < max_rb_num; i++) {
if (!(disabled_rbs & mask))
enabled_rbs |= mask;
mask <<= 1;
}
for (i = 0; i < se_num; i++) {
si_select_se_sh(rdev, i, 0xffffffff);
data = 0;
for (j = 0; j < sh_per_se; j++) {
switch (enabled_rbs & 3) {
case 1:
data |= (RASTER_CONFIG_RB_MAP_0 << (i * sh_per_se + j) * 2);
break;
case 2:
data |= (RASTER_CONFIG_RB_MAP_3 << (i * sh_per_se + j) * 2);
break;
case 3:
default:
data |= (RASTER_CONFIG_RB_MAP_2 << (i * sh_per_se + j) * 2);
break;
}
enabled_rbs >>= 2;
}
WREG32(PA_SC_RASTER_CONFIG, data);
}
si_select_se_sh(rdev, 0xffffffff, 0xffffffff);
}
static void si_gpu_init(struct radeon_device *rdev)
{
u32 cc_rb_backend_disable = 0;
u32 cc_gc_shader_array_config;
u32 gb_addr_config = 0;
u32 mc_shared_chmap, mc_arb_ramcfg;
u32 gb_backend_map;
u32 cgts_tcc_disable;
u32 sx_debug_1;
u32 gc_user_shader_array_config;
u32 gc_user_rb_backend_disable;
u32 cgts_user_tcc_disable;
u32 hdp_host_path_cntl;
u32 tmp;
int i, j;
@@ -1581,9 +1520,9 @@ static void si_gpu_init(struct radeon_device *rdev)
switch (rdev->family) {
case CHIP_TAHITI:
rdev->config.si.max_shader_engines = 2;
rdev->config.si.max_pipes_per_simd = 4;
rdev->config.si.max_tile_pipes = 12;
rdev->config.si.max_simds_per_se = 8;
rdev->config.si.max_cu_per_sh = 8;
rdev->config.si.max_sh_per_se = 2;
rdev->config.si.max_backends_per_se = 4;
rdev->config.si.max_texture_channel_caches = 12;
rdev->config.si.max_gprs = 256;
@@ -1594,12 +1533,13 @@ static void si_gpu_init(struct radeon_device *rdev)
rdev->config.si.sc_prim_fifo_size_backend = 0x100;
rdev->config.si.sc_hiz_tile_fifo_size = 0x30;
rdev->config.si.sc_earlyz_tile_fifo_size = 0x130;
gb_addr_config = TAHITI_GB_ADDR_CONFIG_GOLDEN;
break;
case CHIP_PITCAIRN:
rdev->config.si.max_shader_engines = 2;
rdev->config.si.max_pipes_per_simd = 4;
rdev->config.si.max_tile_pipes = 8;
rdev->config.si.max_simds_per_se = 5;
rdev->config.si.max_cu_per_sh = 5;
rdev->config.si.max_sh_per_se = 2;
rdev->config.si.max_backends_per_se = 4;
rdev->config.si.max_texture_channel_caches = 8;
rdev->config.si.max_gprs = 256;
@@ -1610,13 +1550,14 @@ static void si_gpu_init(struct radeon_device *rdev)
rdev->config.si.sc_prim_fifo_size_backend = 0x100;
rdev->config.si.sc_hiz_tile_fifo_size = 0x30;
rdev->config.si.sc_earlyz_tile_fifo_size = 0x130;
gb_addr_config = TAHITI_GB_ADDR_CONFIG_GOLDEN;
break;
case CHIP_VERDE:
default:
rdev->config.si.max_shader_engines = 1;
rdev->config.si.max_pipes_per_simd = 4;
rdev->config.si.max_tile_pipes = 4;
rdev->config.si.max_simds_per_se = 2;
rdev->config.si.max_cu_per_sh = 2;
rdev->config.si.max_sh_per_se = 2;
rdev->config.si.max_backends_per_se = 4;
rdev->config.si.max_texture_channel_caches = 4;
rdev->config.si.max_gprs = 256;
@@ -1627,6 +1568,7 @@ static void si_gpu_init(struct radeon_device *rdev)
rdev->config.si.sc_prim_fifo_size_backend = 0x40;
rdev->config.si.sc_hiz_tile_fifo_size = 0x30;
rdev->config.si.sc_earlyz_tile_fifo_size = 0x130;
gb_addr_config = VERDE_GB_ADDR_CONFIG_GOLDEN;
break;
}
@@ -1648,31 +1590,7 @@ static void si_gpu_init(struct radeon_device *rdev)
mc_shared_chmap = RREG32(MC_SHARED_CHMAP);
mc_arb_ramcfg = RREG32(MC_ARB_RAMCFG);
cc_rb_backend_disable = RREG32(CC_RB_BACKEND_DISABLE);
cc_gc_shader_array_config = RREG32(CC_GC_SHADER_ARRAY_CONFIG);
cgts_tcc_disable = 0xffff0000;
for (i = 0; i < rdev->config.si.max_texture_channel_caches; i++)
cgts_tcc_disable &= ~(1 << (16 + i));
gc_user_rb_backend_disable = RREG32(GC_USER_RB_BACKEND_DISABLE);
gc_user_shader_array_config = RREG32(GC_USER_SHADER_ARRAY_CONFIG);
cgts_user_tcc_disable = RREG32(CGTS_USER_TCC_DISABLE);
rdev->config.si.num_shader_engines = rdev->config.si.max_shader_engines;
rdev->config.si.num_tile_pipes = rdev->config.si.max_tile_pipes;
tmp = ((~gc_user_rb_backend_disable) & BACKEND_DISABLE_MASK) >> BACKEND_DISABLE_SHIFT;
rdev->config.si.num_backends_per_se = r600_count_pipe_bits(tmp);
tmp = (gc_user_rb_backend_disable & BACKEND_DISABLE_MASK) >> BACKEND_DISABLE_SHIFT;
rdev->config.si.backend_disable_mask_per_asic =
si_get_disable_mask_per_asic(rdev, tmp, SI_MAX_BACKENDS_PER_SE_MASK,
rdev->config.si.num_shader_engines);
rdev->config.si.backend_map =
si_get_tile_pipe_to_backend_map(rdev, rdev->config.si.num_tile_pipes,
rdev->config.si.num_backends_per_se *
rdev->config.si.num_shader_engines,
&rdev->config.si.backend_disable_mask_per_asic,
rdev->config.si.num_shader_engines);
tmp = ((~cgts_user_tcc_disable) & TCC_DISABLE_MASK) >> TCC_DISABLE_SHIFT;
rdev->config.si.num_texture_channel_caches = r600_count_pipe_bits(tmp);
rdev->config.si.mem_max_burst_length_bytes = 256;
tmp = (mc_arb_ramcfg & NOOFCOLS_MASK) >> NOOFCOLS_SHIFT;
rdev->config.si.mem_row_size_in_kb = (4 * (1 << (8 + tmp))) / 1024;
@@ -1683,55 +1601,8 @@ static void si_gpu_init(struct radeon_device *rdev)
rdev->config.si.num_gpus = 1;
rdev->config.si.multi_gpu_tile_size = 64;
gb_addr_config = 0;
switch (rdev->config.si.num_tile_pipes) {
case 1:
gb_addr_config |= NUM_PIPES(0);
break;
case 2:
gb_addr_config |= NUM_PIPES(1);
break;
case 4:
gb_addr_config |= NUM_PIPES(2);
break;
case 8:
default:
gb_addr_config |= NUM_PIPES(3);
break;
}
tmp = (rdev->config.si.mem_max_burst_length_bytes / 256) - 1;
gb_addr_config |= PIPE_INTERLEAVE_SIZE(tmp);
gb_addr_config |= NUM_SHADER_ENGINES(rdev->config.si.num_shader_engines - 1);
tmp = (rdev->config.si.shader_engine_tile_size / 16) - 1;
gb_addr_config |= SHADER_ENGINE_TILE_SIZE(tmp);
switch (rdev->config.si.num_gpus) {
case 1:
default:
gb_addr_config |= NUM_GPUS(0);
break;
case 2:
gb_addr_config |= NUM_GPUS(1);
break;
case 4:
gb_addr_config |= NUM_GPUS(2);
break;
}
switch (rdev->config.si.multi_gpu_tile_size) {
case 16:
gb_addr_config |= MULTI_GPU_TILE_SIZE(0);
break;
case 32:
default:
gb_addr_config |= MULTI_GPU_TILE_SIZE(1);
break;
case 64:
gb_addr_config |= MULTI_GPU_TILE_SIZE(2);
break;
case 128:
gb_addr_config |= MULTI_GPU_TILE_SIZE(3);
break;
}
/* fix up row size */
gb_addr_config &= ~ROW_SIZE_MASK;
switch (rdev->config.si.mem_row_size_in_kb) {
case 1:
default:
@@ -1745,26 +1616,6 @@ static void si_gpu_init(struct radeon_device *rdev)
break;
}
tmp = (gb_addr_config & NUM_PIPES_MASK) >> NUM_PIPES_SHIFT;
rdev->config.si.num_tile_pipes = (1 << tmp);
tmp = (gb_addr_config & PIPE_INTERLEAVE_SIZE_MASK) >> PIPE_INTERLEAVE_SIZE_SHIFT;
rdev->config.si.mem_max_burst_length_bytes = (tmp + 1) * 256;
tmp = (gb_addr_config & NUM_SHADER_ENGINES_MASK) >> NUM_SHADER_ENGINES_SHIFT;
rdev->config.si.num_shader_engines = tmp + 1;
tmp = (gb_addr_config & NUM_GPUS_MASK) >> NUM_GPUS_SHIFT;
rdev->config.si.num_gpus = tmp + 1;
tmp = (gb_addr_config & MULTI_GPU_TILE_SIZE_MASK) >> MULTI_GPU_TILE_SIZE_SHIFT;
rdev->config.si.multi_gpu_tile_size = 1 << tmp;
tmp = (gb_addr_config & ROW_SIZE_MASK) >> ROW_SIZE_SHIFT;
rdev->config.si.mem_row_size_in_kb = 1 << tmp;
gb_backend_map =
si_get_tile_pipe_to_backend_map(rdev, rdev->config.si.num_tile_pipes,
rdev->config.si.num_backends_per_se *
rdev->config.si.num_shader_engines,
&rdev->config.si.backend_disable_mask_per_asic,
rdev->config.si.num_shader_engines);
/* setup tiling info dword. gb_addr_config is not adequate since it does
* not have bank info, so create a custom tiling dword.
* bits 3:0 num_pipes
@@ -1789,34 +1640,30 @@ static void si_gpu_init(struct radeon_device *rdev)
rdev->config.si.tile_config |= (3 << 0);
break;
}
rdev->config.si.tile_config |=
((mc_arb_ramcfg & NOOFBANK_MASK) >> NOOFBANK_SHIFT) << 4;
if ((mc_arb_ramcfg & NOOFBANK_MASK) >> NOOFBANK_SHIFT)
rdev->config.si.tile_config |= 1 << 4;
else
rdev->config.si.tile_config |= 0 << 4;
rdev->config.si.tile_config |=
((gb_addr_config & PIPE_INTERLEAVE_SIZE_MASK) >> PIPE_INTERLEAVE_SIZE_SHIFT) << 8;
rdev->config.si.tile_config |=
((gb_addr_config & ROW_SIZE_MASK) >> ROW_SIZE_SHIFT) << 12;
rdev->config.si.backend_map = gb_backend_map;
WREG32(GB_ADDR_CONFIG, gb_addr_config);
WREG32(DMIF_ADDR_CONFIG, gb_addr_config);
WREG32(HDP_ADDR_CONFIG, gb_addr_config);
/* primary versions */
WREG32(CC_RB_BACKEND_DISABLE, cc_rb_backend_disable);
WREG32(CC_SYS_RB_BACKEND_DISABLE, cc_rb_backend_disable);
WREG32(CC_GC_SHADER_ARRAY_CONFIG, cc_gc_shader_array_config);
WREG32(CGTS_TCC_DISABLE, cgts_tcc_disable);
/* user versions */
WREG32(GC_USER_RB_BACKEND_DISABLE, cc_rb_backend_disable);
WREG32(GC_USER_SYS_RB_BACKEND_DISABLE, cc_rb_backend_disable);
WREG32(GC_USER_SHADER_ARRAY_CONFIG, cc_gc_shader_array_config);
WREG32(CGTS_USER_TCC_DISABLE, cgts_tcc_disable);
si_tiling_mode_table_init(rdev);
si_setup_rb(rdev, rdev->config.si.max_shader_engines,
rdev->config.si.max_sh_per_se,
rdev->config.si.max_backends_per_se);
si_setup_spi(rdev, rdev->config.si.max_shader_engines,
rdev->config.si.max_sh_per_se,
rdev->config.si.max_cu_per_sh);
/* set HW defaults for 3D engine */
WREG32(CP_QUEUE_THRESHOLDS, (ROQ_IB1_START(0x16) |
ROQ_IB2_START(0x2b)));


@@ -24,6 +24,11 @@
#ifndef SI_H
#define SI_H
#define TAHITI_RB_BITMAP_WIDTH_PER_SH 2
#define TAHITI_GB_ADDR_CONFIG_GOLDEN 0x12011003
#define VERDE_GB_ADDR_CONFIG_GOLDEN 0x12010002
#define CG_MULT_THERMAL_STATUS 0x714
#define ASIC_MAX_TEMP(x) ((x) << 0)
#define ASIC_MAX_TEMP_MASK 0x000001ff
@@ -408,6 +413,12 @@
#define SOFT_RESET_IA (1 << 15)
#define GRBM_GFX_INDEX 0x802C
#define INSTANCE_INDEX(x) ((x) << 0)
#define SH_INDEX(x) ((x) << 8)
#define SE_INDEX(x) ((x) << 16)
#define SH_BROADCAST_WRITES (1 << 29)
#define INSTANCE_BROADCAST_WRITES (1 << 30)
#define SE_BROADCAST_WRITES (1 << 31)
#define GRBM_INT_CNTL 0x8060
# define RDERR_INT_ENABLE (1 << 0)
@@ -480,6 +491,8 @@
#define VGT_TF_MEMORY_BASE 0x89B8
#define CC_GC_SHADER_ARRAY_CONFIG 0x89bc
#define INACTIVE_CUS_MASK 0xFFFF0000
#define INACTIVE_CUS_SHIFT 16
#define GC_USER_SHADER_ARRAY_CONFIG 0x89c0
#define PA_CL_ENHANCE 0x8A14
@@ -688,6 +701,12 @@
#define RLC_MC_CNTL 0xC344
#define RLC_UCODE_CNTL 0xC348
#define PA_SC_RASTER_CONFIG 0x28350
# define RASTER_CONFIG_RB_MAP_0 0
# define RASTER_CONFIG_RB_MAP_1 1
# define RASTER_CONFIG_RB_MAP_2 2
# define RASTER_CONFIG_RB_MAP_3 3
#define VGT_EVENT_INITIATOR 0x28a90
# define SAMPLE_STREAMOUTSTATS1 (1 << 0)
# define SAMPLE_STREAMOUTSTATS2 (2 << 0)


@@ -37,4 +37,16 @@ config I2C_MUX_PCA954x
This driver can also be built as a module. If so, the module
will be called i2c-mux-pca954x.
config I2C_MUX_PINCTRL
tristate "pinctrl-based I2C multiplexer"
depends on PINCTRL
help
If you say yes to this option, support will be included for an I2C
multiplexer that uses the pinctrl subsystem, i.e. pin multiplexing.
This is useful for SoCs whose I2C module's signals can be routed to
different sets of pins at run-time.
This driver can also be built as a module. If so, the module will be
called pinctrl-i2cmux.
endmenu
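The help text above describes the run-time pin-routing use case; boards that do not use device tree can instantiate the same driver from platform data, since the probe routine below only falls back to DT parsing when no platform data is supplied. A hypothetical board-file sketch follows — the field names mirror those consumed by the i2c-mux-pinctrl driver added later in this series, while the state names, bus numbers and device id are made-up illustrations, not values from any real board:

#include <linux/kernel.h>
#include <linux/platform_device.h>
#include <linux/i2c-mux-pinctrl.h>

static const char *example_mux_states[] = { "i2c-bus-a", "i2c-bus-b" };

static struct i2c_mux_pinctrl_platform_data example_mux_pdata = {
	.parent_bus_num		= 0,	/* adapter whose pins are multiplexed */
	.base_bus_num		= 0,	/* 0: let the I2C core pick child bus numbers */
	.bus_count		= ARRAY_SIZE(example_mux_states),
	.pinctrl_states		= example_mux_states,	/* one pinctrl state per child bus */
	.pinctrl_state_idle	= "idle",	/* optional; enables the deselect callback */
};

static struct platform_device example_mux_device = {
	.name	= "i2c-mux-pinctrl",
	.id	= -1,
	.dev	= {
		.platform_data = &example_mux_pdata,
	},
};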


@@ -4,5 +4,6 @@
obj-$(CONFIG_I2C_MUX_GPIO) += i2c-mux-gpio.o
obj-$(CONFIG_I2C_MUX_PCA9541) += i2c-mux-pca9541.o
obj-$(CONFIG_I2C_MUX_PCA954x) += i2c-mux-pca954x.o
obj-$(CONFIG_I2C_MUX_PINCTRL) += i2c-mux-pinctrl.o
ccflags-$(CONFIG_I2C_DEBUG_BUS) := -DDEBUG


@@ -0,0 +1,279 @@
/*
* I2C multiplexer using pinctrl API
*
* Copyright (c) 2012, NVIDIA CORPORATION. All rights reserved.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <linux/i2c.h>
#include <linux/i2c-mux.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/of_i2c.h>
#include <linux/pinctrl/consumer.h>
#include <linux/i2c-mux-pinctrl.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
struct i2c_mux_pinctrl {
struct device *dev;
struct i2c_mux_pinctrl_platform_data *pdata;
struct pinctrl *pinctrl;
struct pinctrl_state **states;
struct pinctrl_state *state_idle;
struct i2c_adapter *parent;
struct i2c_adapter **busses;
};
static int i2c_mux_pinctrl_select(struct i2c_adapter *adap, void *data,
u32 chan)
{
struct i2c_mux_pinctrl *mux = data;
return pinctrl_select_state(mux->pinctrl, mux->states[chan]);
}
static int i2c_mux_pinctrl_deselect(struct i2c_adapter *adap, void *data,
u32 chan)
{
struct i2c_mux_pinctrl *mux = data;
return pinctrl_select_state(mux->pinctrl, mux->state_idle);
}
#ifdef CONFIG_OF
static int i2c_mux_pinctrl_parse_dt(struct i2c_mux_pinctrl *mux,
struct platform_device *pdev)
{
struct device_node *np = pdev->dev.of_node;
int num_names, i, ret;
struct device_node *adapter_np;
struct i2c_adapter *adapter;
if (!np)
return 0;
mux->pdata = devm_kzalloc(&pdev->dev, sizeof(*mux->pdata), GFP_KERNEL);
if (!mux->pdata) {
dev_err(mux->dev,
"Cannot allocate i2c_mux_pinctrl_platform_data\n");
return -ENOMEM;
}
num_names = of_property_count_strings(np, "pinctrl-names");
if (num_names < 0) {
dev_err(mux->dev, "Cannot parse pinctrl-names: %d\n",
num_names);
return num_names;
}
mux->pdata->pinctrl_states = devm_kzalloc(&pdev->dev,
sizeof(*mux->pdata->pinctrl_states) * num_names,
GFP_KERNEL);
if (!mux->pdata->pinctrl_states) {
dev_err(mux->dev, "Cannot allocate pinctrl_states\n");
return -ENOMEM;
}
for (i = 0; i < num_names; i++) {
ret = of_property_read_string_index(np, "pinctrl-names", i,
&mux->pdata->pinctrl_states[mux->pdata->bus_count]);
if (ret < 0) {
dev_err(mux->dev, "Cannot parse pinctrl-names: %d\n",
ret);
return ret;
}
if (!strcmp(mux->pdata->pinctrl_states[mux->pdata->bus_count],
"idle")) {
if (i != num_names - 1) {
dev_err(mux->dev, "idle state must be last\n");
return -EINVAL;
}
mux->pdata->pinctrl_state_idle = "idle";
} else {
mux->pdata->bus_count++;
}
}
adapter_np = of_parse_phandle(np, "i2c-parent", 0);
if (!adapter_np) {
dev_err(mux->dev, "Cannot parse i2c-parent\n");
return -ENODEV;
}
adapter = of_find_i2c_adapter_by_node(adapter_np);
if (!adapter) {
dev_err(mux->dev, "Cannot find parent bus\n");
return -ENODEV;
}
mux->pdata->parent_bus_num = i2c_adapter_id(adapter);
put_device(&adapter->dev);
return 0;
}
#else
static inline int i2c_mux_pinctrl_parse_dt(struct i2c_mux_pinctrl *mux,
struct platform_device *pdev)
{
return 0;
}
#endif
static int __devinit i2c_mux_pinctrl_probe(struct platform_device *pdev)
{
struct i2c_mux_pinctrl *mux;
int (*deselect)(struct i2c_adapter *, void *, u32);
int i, ret;
mux = devm_kzalloc(&pdev->dev, sizeof(*mux), GFP_KERNEL);
if (!mux) {
dev_err(&pdev->dev, "Cannot allocate i2c_mux_pinctrl\n");
ret = -ENOMEM;
goto err;
}
platform_set_drvdata(pdev, mux);
mux->dev = &pdev->dev;
mux->pdata = pdev->dev.platform_data;
if (!mux->pdata) {
ret = i2c_mux_pinctrl_parse_dt(mux, pdev);
if (ret < 0)
goto err;
}
if (!mux->pdata) {
dev_err(&pdev->dev, "Missing platform data\n");
ret = -ENODEV;
goto err;
}
mux->states = devm_kzalloc(&pdev->dev,
sizeof(*mux->states) * mux->pdata->bus_count,
GFP_KERNEL);
if (!mux->states) {
dev_err(&pdev->dev, "Cannot allocate states\n");
ret = -ENOMEM;
goto err;
}
mux->busses = devm_kzalloc(&pdev->dev,
sizeof(mux->busses) * mux->pdata->bus_count,
GFP_KERNEL);
if (!mux->states) {
dev_err(&pdev->dev, "Cannot allocate busses\n");
ret = -ENOMEM;
goto err;
}

	mux->pinctrl = devm_pinctrl_get(&pdev->dev);
	if (IS_ERR(mux->pinctrl)) {
		ret = PTR_ERR(mux->pinctrl);
		dev_err(&pdev->dev, "Cannot get pinctrl: %d\n", ret);
		goto err;
	}
	for (i = 0; i < mux->pdata->bus_count; i++) {
		mux->states[i] = pinctrl_lookup_state(mux->pinctrl,
						      mux->pdata->pinctrl_states[i]);
		if (IS_ERR(mux->states[i])) {
			ret = PTR_ERR(mux->states[i]);
			dev_err(&pdev->dev,
				"Cannot look up pinctrl state %s: %d\n",
				mux->pdata->pinctrl_states[i], ret);
			goto err;
		}
	}
	if (mux->pdata->pinctrl_state_idle) {
		mux->state_idle = pinctrl_lookup_state(mux->pinctrl,
						       mux->pdata->pinctrl_state_idle);
		if (IS_ERR(mux->state_idle)) {
			ret = PTR_ERR(mux->state_idle);
			dev_err(&pdev->dev,
				"Cannot look up pinctrl state %s: %d\n",
				mux->pdata->pinctrl_state_idle, ret);
			goto err;
		}

		deselect = i2c_mux_pinctrl_deselect;
	} else {
		deselect = NULL;
	}

	mux->parent = i2c_get_adapter(mux->pdata->parent_bus_num);
	if (!mux->parent) {
		dev_err(&pdev->dev, "Parent adapter (%d) not found\n",
			mux->pdata->parent_bus_num);
		ret = -ENODEV;
		goto err;
	}

	for (i = 0; i < mux->pdata->bus_count; i++) {
		u32 bus = mux->pdata->base_bus_num ?
				(mux->pdata->base_bus_num + i) : 0;

		mux->busses[i] = i2c_add_mux_adapter(mux->parent, &pdev->dev,
						     mux, bus, i,
						     i2c_mux_pinctrl_select,
						     deselect);
		if (!mux->busses[i]) {
			ret = -ENODEV;
			dev_err(&pdev->dev, "Failed to add adapter %d\n", i);
			goto err_del_adapter;
		}
	}

	return 0;

err_del_adapter:
	for (; i > 0; i--)
		i2c_del_mux_adapter(mux->busses[i - 1]);
	i2c_put_adapter(mux->parent);
err:
	return ret;
}

static int __devexit i2c_mux_pinctrl_remove(struct platform_device *pdev)
{
	struct i2c_mux_pinctrl *mux = platform_get_drvdata(pdev);
	int i;

	for (i = 0; i < mux->pdata->bus_count; i++)
		i2c_del_mux_adapter(mux->busses[i]);

	i2c_put_adapter(mux->parent);

	return 0;
}

#ifdef CONFIG_OF
static const struct of_device_id i2c_mux_pinctrl_of_match[] __devinitconst = {
	{ .compatible = "i2c-mux-pinctrl", },
	{},
};
MODULE_DEVICE_TABLE(of, i2c_mux_pinctrl_of_match);
#endif

static struct platform_driver i2c_mux_pinctrl_driver = {
	.driver	= {
		.name	= "i2c-mux-pinctrl",
		.owner	= THIS_MODULE,
		.of_match_table = of_match_ptr(i2c_mux_pinctrl_of_match),
	},
	.probe	= i2c_mux_pinctrl_probe,
	.remove	= __devexit_p(i2c_mux_pinctrl_remove),
};
module_platform_driver(i2c_mux_pinctrl_driver);

MODULE_DESCRIPTION("pinctrl-based I2C multiplexer driver");
MODULE_AUTHOR("Stephen Warren <swarren@nvidia.com>");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS("platform:i2c-mux-pinctrl");


@@ -1593,6 +1593,10 @@ static int import_ep(struct c4iw_ep *ep, __be32 peer_ip, struct dst_entry *dst,
struct net_device *pdev;
pdev = ip_dev_find(&init_net, peer_ip);
if (!pdev) {
err = -ENODEV;
goto out;
}
ep->l2t = cxgb4_l2t_get(cdev->rdev.lldi.l2t,
n, pdev, 0);
if (!ep->l2t)


@@ -140,7 +140,7 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
props->max_mr_size = ~0ull;
props->page_size_cap = dev->dev->caps.page_size_cap;
props->max_qp = dev->dev->caps.num_qps - dev->dev->caps.reserved_qps;
props->max_qp_wr = dev->dev->caps.max_wqes;
props->max_qp_wr = dev->dev->caps.max_wqes - MLX4_IB_SQ_MAX_SPARE;
props->max_sge = min(dev->dev->caps.max_sq_sg,
dev->dev->caps.max_rq_sg);
props->max_cq = dev->dev->caps.num_cqs - dev->dev->caps.reserved_cqs;
@@ -1084,12 +1084,9 @@ static void mlx4_ib_alloc_eqs(struct mlx4_dev *dev, struct mlx4_ib_dev *ibdev)
int total_eqs = 0;
int i, j, eq;
/* Init eq table */
ibdev->eq_table = NULL;
ibdev->eq_added = 0;
/* Legacy mode? */
if (dev->caps.comp_pool == 0)
/* Legacy mode or comp_pool is not large enough */
if (dev->caps.comp_pool == 0 ||
dev->caps.num_ports > dev->caps.comp_pool)
return;
eq_per_port = rounddown_pow_of_two(dev->caps.comp_pool/
@@ -1135,7 +1132,10 @@ static void mlx4_ib_alloc_eqs(struct mlx4_dev *dev, struct mlx4_ib_dev *ibdev)
static void mlx4_ib_free_eqs(struct mlx4_dev *dev, struct mlx4_ib_dev *ibdev)
{
int i;
int total_eqs;
/* no additional eqs were added */
if (!ibdev->eq_table)
return;
/* Reset the advertised EQ number */
ibdev->ib_dev.num_comp_vectors = dev->caps.num_comp_vectors;
@@ -1148,12 +1148,7 @@ static void mlx4_ib_free_eqs(struct mlx4_dev *dev, struct mlx4_ib_dev *ibdev)
mlx4_release_eq(dev, ibdev->eq_table[i]);
}
total_eqs = dev->caps.num_comp_vectors + ibdev->eq_added;
memset(ibdev->eq_table, 0, total_eqs * sizeof(int));
kfree(ibdev->eq_table);
ibdev->eq_table = NULL;
ibdev->eq_added = 0;
}
static void *mlx4_ib_add(struct mlx4_dev *dev)


@@ -44,6 +44,14 @@
#include <linux/mlx4/device.h>
#include <linux/mlx4/doorbell.h>
enum {
MLX4_IB_SQ_MIN_WQE_SHIFT = 6,
MLX4_IB_MAX_HEADROOM = 2048
};
#define MLX4_IB_SQ_HEADROOM(shift) ((MLX4_IB_MAX_HEADROOM >> (shift)) + 1)
#define MLX4_IB_SQ_MAX_SPARE (MLX4_IB_SQ_HEADROOM(MLX4_IB_SQ_MIN_WQE_SHIFT))
struct mlx4_ib_ucontext {
struct ib_ucontext ibucontext;
struct mlx4_uar uar;
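
As a quick check of the new constants (my annotation, not part of the diff): MLX4_IB_SQ_HEADROOM(shift) is (2048 >> shift) + 1, so at the minimum WQE shift of 6 the driver holds back (2048 >> 6) + 1 = 33 spare WQEs, which is the amount subtracted from max_wqes in the surrounding mlx4_ib hunks. A standalone sketch that just reproduces the macros and asserts that value:

/* Illustrative, self-contained copy of the macros from the hunk above. */
#define MLX4_IB_SQ_MIN_WQE_SHIFT	6
#define MLX4_IB_MAX_HEADROOM		2048
#define MLX4_IB_SQ_HEADROOM(shift)	((MLX4_IB_MAX_HEADROOM >> (shift)) + 1)
#define MLX4_IB_SQ_MAX_SPARE		(MLX4_IB_SQ_HEADROOM(MLX4_IB_SQ_MIN_WQE_SHIFT))

/* (2048 >> 6) + 1 == 33 work queue entries reserved as headroom. */
_Static_assert(MLX4_IB_SQ_MAX_SPARE == 33, "expected 33 spare WQEs");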


@@ -310,8 +310,8 @@ static int set_rq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap,
int is_user, int has_rq, struct mlx4_ib_qp *qp)
{
/* Sanity check RQ size before proceeding */
if (cap->max_recv_wr > dev->dev->caps.max_wqes ||
cap->max_recv_sge > dev->dev->caps.max_rq_sg)
if (cap->max_recv_wr > dev->dev->caps.max_wqes - MLX4_IB_SQ_MAX_SPARE ||
cap->max_recv_sge > min(dev->dev->caps.max_sq_sg, dev->dev->caps.max_rq_sg))
return -EINVAL;
if (!has_rq) {
@@ -329,8 +329,17 @@ static int set_rq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap,
qp->rq.wqe_shift = ilog2(qp->rq.max_gs * sizeof (struct mlx4_wqe_data_seg));
}
cap->max_recv_wr = qp->rq.max_post = qp->rq.wqe_cnt;
cap->max_recv_sge = qp->rq.max_gs;
/* leave userspace return values as they were, so as not to break ABI */
if (is_user) {
cap->max_recv_wr = qp->rq.max_post = qp->rq.wqe_cnt;
cap->max_recv_sge = qp->rq.max_gs;
} else {
cap->max_recv_wr = qp->rq.max_post =
min(dev->dev->caps.max_wqes - MLX4_IB_SQ_MAX_SPARE, qp->rq.wqe_cnt);
cap->max_recv_sge = min(qp->rq.max_gs,
min(dev->dev->caps.max_sq_sg,
dev->dev->caps.max_rq_sg));
}
return 0;
}
@@ -341,8 +350,8 @@ static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap,
int s;
/* Sanity check SQ size before proceeding */
if (cap->max_send_wr > dev->dev->caps.max_wqes ||
cap->max_send_sge > dev->dev->caps.max_sq_sg ||
if (cap->max_send_wr > (dev->dev->caps.max_wqes - MLX4_IB_SQ_MAX_SPARE) ||
cap->max_send_sge > min(dev->dev->caps.max_sq_sg, dev->dev->caps.max_rq_sg) ||
cap->max_inline_data + send_wqe_overhead(type, qp->flags) +
sizeof (struct mlx4_wqe_inline_seg) > dev->dev->caps.max_sq_desc_sz)
return -EINVAL;


@@ -231,7 +231,6 @@ struct ocrdma_qp_hwq_info {
u32 entry_size;
u32 max_cnt;
u32 max_wqe_idx;
u32 free_delta;
u16 dbid; /* qid, where to ring the doorbell. */
u32 len;
dma_addr_t pa;


@@ -101,8 +101,6 @@ struct ocrdma_create_qp_uresp {
u32 rsvd1;
u32 num_wqe_allocated;
u32 num_rqe_allocated;
u32 free_wqe_delta;
u32 free_rqe_delta;
u32 db_sq_offset;
u32 db_rq_offset;
u32 db_shift;
@@ -126,8 +124,7 @@ struct ocrdma_create_srq_uresp {
u32 db_rq_offset;
u32 db_shift;
u32 free_rqe_delta;
u32 rsvd2;
u64 rsvd2;
u64 rsvd3;
} __packed;


@@ -732,7 +732,7 @@ static void ocrdma_dispatch_ibevent(struct ocrdma_dev *dev,
break;
case OCRDMA_SRQ_LIMIT_EVENT:
ib_evt.element.srq = &qp->srq->ibsrq;
ib_evt.event = IB_EVENT_QP_LAST_WQE_REACHED;
ib_evt.event = IB_EVENT_SRQ_LIMIT_REACHED;
srq_event = 1;
qp_event = 0;
break;
@@ -1990,19 +1990,12 @@ static void ocrdma_get_create_qp_rsp(struct ocrdma_create_qp_rsp *rsp,
max_wqe_allocated = 1 << max_wqe_allocated;
max_rqe_allocated = 1 << ((u16)rsp->max_wqe_rqe);
if (qp->dev->nic_info.dev_family == OCRDMA_GEN2_FAMILY) {
qp->sq.free_delta = 0;
qp->rq.free_delta = 1;
} else
qp->sq.free_delta = 1;
qp->sq.max_cnt = max_wqe_allocated;
qp->sq.max_wqe_idx = max_wqe_allocated - 1;
if (!attrs->srq) {
qp->rq.max_cnt = max_rqe_allocated;
qp->rq.max_wqe_idx = max_rqe_allocated - 1;
qp->rq.free_delta = 1;
}
}


@@ -26,7 +26,6 @@
*******************************************************************/
#include <linux/module.h>
#include <linux/version.h>
#include <linux/idr.h>
#include <rdma/ib_verbs.h>
#include <rdma/ib_user_verbs.h>


@@ -940,8 +940,6 @@ static int ocrdma_copy_qp_uresp(struct ocrdma_qp *qp,
uresp.db_rq_offset = OCRDMA_DB_RQ_OFFSET;
uresp.db_shift = 16;
}
uresp.free_wqe_delta = qp->sq.free_delta;
uresp.free_rqe_delta = qp->rq.free_delta;
if (qp->dpp_enabled) {
uresp.dpp_credit = dpp_credit_lmt;
@@ -1307,8 +1305,6 @@ static int ocrdma_hwq_free_cnt(struct ocrdma_qp_hwq_info *q)
free_cnt = (q->max_cnt - q->head) + q->tail;
else
free_cnt = q->tail - q->head;
if (q->free_delta)
free_cnt -= q->free_delta;
return free_cnt;
}
@@ -1501,7 +1497,6 @@ static int ocrdma_copy_srq_uresp(struct ocrdma_srq *srq, struct ib_udata *udata)
(srq->pd->id * srq->dev->nic_info.db_page_size);
uresp.db_page_size = srq->dev->nic_info.db_page_size;
uresp.num_rqe_allocated = srq->rq.max_cnt;
uresp.free_rqe_delta = 1;
if (srq->dev->nic_info.dev_family == OCRDMA_GEN2_FAMILY) {
uresp.db_rq_offset = OCRDMA_DB_GEN2_RQ1_OFFSET;
uresp.db_shift = 24;


@@ -28,7 +28,6 @@
#ifndef __OCRDMA_VERBS_H__
#define __OCRDMA_VERBS_H__
#include <linux/version.h>
int ocrdma_post_send(struct ib_qp *, struct ib_send_wr *,
struct ib_send_wr **bad_wr);
int ocrdma_post_recv(struct ib_qp *, struct ib_recv_wr *,


@@ -547,26 +547,12 @@ static void iommu_poll_events(struct amd_iommu *iommu)
spin_unlock_irqrestore(&iommu->lock, flags);
}
static void iommu_handle_ppr_entry(struct amd_iommu *iommu, u32 head)
static void iommu_handle_ppr_entry(struct amd_iommu *iommu, u64 *raw)
{
struct amd_iommu_fault fault;
volatile u64 *raw;
int i;
INC_STATS_COUNTER(pri_requests);
raw = (u64 *)(iommu->ppr_log + head);
/*
* Hardware bug: Interrupt may arrive before the entry is written to
* memory. If this happens we need to wait for the entry to arrive.
*/
for (i = 0; i < LOOP_TIMEOUT; ++i) {
if (PPR_REQ_TYPE(raw[0]) != 0)
break;
udelay(1);
}
if (PPR_REQ_TYPE(raw[0]) != PPR_REQ_FAULT) {
pr_err_ratelimited("AMD-Vi: Unknown PPR request received\n");
return;
@@ -578,12 +564,6 @@ static void iommu_handle_ppr_entry(struct amd_iommu *iommu, u32 head)
fault.tag = PPR_TAG(raw[0]);
fault.flags = PPR_FLAGS(raw[0]);
/*
* To detect the hardware bug we need to clear the entry
* to back to zero.
*/
raw[0] = raw[1] = 0;
atomic_notifier_call_chain(&ppr_notifier, 0, &fault);
}
@@ -595,25 +575,62 @@ static void iommu_poll_ppr_log(struct amd_iommu *iommu)
if (iommu->ppr_log == NULL)
return;
/* enable ppr interrupts again */
writel(MMIO_STATUS_PPR_INT_MASK, iommu->mmio_base + MMIO_STATUS_OFFSET);
spin_lock_irqsave(&iommu->lock, flags);
head = readl(iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
tail = readl(iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
while (head != tail) {
volatile u64 *raw;
u64 entry[2];
int i;
/* Handle PPR entry */
iommu_handle_ppr_entry(iommu, head);
raw = (u64 *)(iommu->ppr_log + head);
/* Update and refresh ring-buffer state*/
/*
* Hardware bug: Interrupt may arrive before the entry is
* written to memory. If this happens we need to wait for the
* entry to arrive.
*/
for (i = 0; i < LOOP_TIMEOUT; ++i) {
if (PPR_REQ_TYPE(raw[0]) != 0)
break;
udelay(1);
}
/* Avoid memcpy function-call overhead */
entry[0] = raw[0];
entry[1] = raw[1];
/*
* To detect the hardware bug we need to clear the entry
* back to zero.
*/
raw[0] = raw[1] = 0UL;
/* Update head pointer of hardware ring-buffer */
head = (head + PPR_ENTRY_SIZE) % PPR_LOG_SIZE;
writel(head, iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
/*
* Release iommu->lock because ppr-handling might need to
* re-acquire it
*/
spin_unlock_irqrestore(&iommu->lock, flags);
/* Handle PPR entry */
iommu_handle_ppr_entry(iommu, entry);
spin_lock_irqsave(&iommu->lock, flags);
/* Refresh ring-buffer information */
head = readl(iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
tail = readl(iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
}
/* enable ppr interrupts again */
writel(MMIO_STATUS_PPR_INT_MASK, iommu->mmio_base + MMIO_STATUS_OFFSET);
spin_unlock_irqrestore(&iommu->lock, flags);
}


@@ -1029,6 +1029,9 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
if (!iommu->dev)
return 1;
iommu->root_pdev = pci_get_bus_and_slot(iommu->dev->bus->number,
PCI_DEVFN(0, 0));
iommu->cap_ptr = h->cap_ptr;
iommu->pci_seg = h->pci_seg;
iommu->mmio_phys = h->mmio_phys;
@@ -1323,20 +1326,16 @@ static void iommu_apply_resume_quirks(struct amd_iommu *iommu)
{
int i, j;
u32 ioc_feature_control;
struct pci_dev *pdev = NULL;
struct pci_dev *pdev = iommu->root_pdev;
/* RD890 BIOSes may not have completely reconfigured the iommu */
if (!is_rd890_iommu(iommu->dev))
if (!is_rd890_iommu(iommu->dev) || !pdev)
return;
/*
* First, we need to ensure that the iommu is enabled. This is
* controlled by a register in the northbridge
*/
pdev = pci_get_bus_and_slot(iommu->dev->bus->number, PCI_DEVFN(0, 0));
if (!pdev)
return;
/* Select Northbridge indirect register 0x75 and enable writing */
pci_write_config_dword(pdev, 0x60, 0x75 | (1 << 7));
@@ -1346,8 +1345,6 @@ static void iommu_apply_resume_quirks(struct amd_iommu *iommu)
if (!(ioc_feature_control & 0x1))
pci_write_config_dword(pdev, 0x64, ioc_feature_control | 1);
pci_dev_put(pdev);
/* Restore the iommu BAR */
pci_write_config_dword(iommu->dev, iommu->cap_ptr + 4,
iommu->stored_addr_lo);


@@ -481,6 +481,9 @@ struct amd_iommu {
/* Pointer to PCI device of this IOMMU */
struct pci_dev *dev;
/* Cache pdev to root device for resume quirks */
struct pci_dev *root_pdev;
/* physical address of MMIO space */
u64 mmio_phys;
/* virtual address of MMIO space */


@@ -2550,6 +2550,7 @@ static struct r1conf *setup_conf(struct mddev *mddev)
err = -EINVAL;
spin_lock_init(&conf->device_lock);
rdev_for_each(rdev, mddev) {
struct request_queue *q;
int disk_idx = rdev->raid_disk;
if (disk_idx >= mddev->raid_disks
|| disk_idx < 0)
@@ -2562,6 +2563,9 @@ static struct r1conf *setup_conf(struct mddev *mddev)
if (disk->rdev)
goto abort;
disk->rdev = rdev;
q = bdev_get_queue(rdev->bdev);
if (q->merge_bvec_fn)
mddev->merge_check_needed = 1;
disk->head_position = 0;
}


@@ -3475,6 +3475,7 @@ static int run(struct mddev *mddev)
rdev_for_each(rdev, mddev) {
long long diff;
struct request_queue *q;
disk_idx = rdev->raid_disk;
if (disk_idx < 0)
@@ -3493,6 +3494,9 @@ static int run(struct mddev *mddev)
goto out_free_conf;
disk->rdev = rdev;
}
q = bdev_get_queue(rdev->bdev);
if (q->merge_bvec_fn)
mddev->merge_check_needed = 1;
diff = (rdev->new_data_offset - rdev->data_offset);
if (!mddev->reshape_backwards)
diff = -diff;


@@ -264,6 +264,9 @@ static struct dentry *dfs_rootdir;
*/
int ubi_debugfs_init(void)
{
if (!IS_ENABLED(CONFIG_DEBUG_FS))
return 0;
dfs_rootdir = debugfs_create_dir("ubi", NULL);
if (IS_ERR_OR_NULL(dfs_rootdir)) {
int err = dfs_rootdir ? -ENODEV : PTR_ERR(dfs_rootdir);
@@ -281,7 +284,8 @@ int ubi_debugfs_init(void)
*/
void ubi_debugfs_exit(void)
{
debugfs_remove(dfs_rootdir);
if (IS_ENABLED(CONFIG_DEBUG_FS))
debugfs_remove(dfs_rootdir);
}
/* Read an UBI debugfs file */
@@ -403,6 +407,9 @@ int ubi_debugfs_init_dev(struct ubi_device *ubi)
struct dentry *dent;
struct ubi_debug_info *d = ubi->dbg;
if (!IS_ENABLED(CONFIG_DEBUG_FS))
return 0;
n = snprintf(d->dfs_dir_name, UBI_DFS_DIR_LEN + 1, UBI_DFS_DIR_NAME,
ubi->ubi_num);
if (n == UBI_DFS_DIR_LEN) {
@@ -470,5 +477,6 @@ out:
*/
void ubi_debugfs_exit_dev(struct ubi_device *ubi)
{
debugfs_remove_recursive(ubi->dbg->dfs_dir);
if (IS_ENABLED(CONFIG_DEBUG_FS))
debugfs_remove_recursive(ubi->dbg->dfs_dir);
}


@@ -1262,11 +1262,11 @@ int ubi_wl_flush(struct ubi_device *ubi, int vol_id, int lnum)
dbg_wl("flush pending work for LEB %d:%d (%d pending works)",
vol_id, lnum, ubi->works_count);
down_write(&ubi->work_sem);
while (found) {
struct ubi_work *wrk;
found = 0;
down_read(&ubi->work_sem);
spin_lock(&ubi->wl_lock);
list_for_each_entry(wrk, &ubi->works, list) {
if ((vol_id == UBI_ALL || wrk->vol_id == vol_id) &&
@@ -1277,18 +1277,27 @@ int ubi_wl_flush(struct ubi_device *ubi, int vol_id, int lnum)
spin_unlock(&ubi->wl_lock);
err = wrk->func(ubi, wrk, 0);
if (err)
goto out;
if (err) {
up_read(&ubi->work_sem);
return err;
}
spin_lock(&ubi->wl_lock);
found = 1;
break;
}
}
spin_unlock(&ubi->wl_lock);
up_read(&ubi->work_sem);
}
out:
/*
* Make sure all the works which have been done in parallel are
* finished.
*/
down_write(&ubi->work_sem);
up_write(&ubi->work_sem);
return err;
}


@@ -697,10 +697,10 @@ static int mlx4_common_set_port(struct mlx4_dev *dev, int slave, u32 in_mod,
if (slave != dev->caps.function)
memset(inbox->buf, 0, 256);
if (dev->flags & MLX4_FLAG_OLD_PORT_CMDS) {
*(u8 *) inbox->buf = !!reset_qkey_viols << 6;
*(u8 *) inbox->buf |= !!reset_qkey_viols << 6;
((__be32 *) inbox->buf)[2] = agg_cap_mask;
} else {
((u8 *) inbox->buf)[3] = !!reset_qkey_viols;
((u8 *) inbox->buf)[3] |= !!reset_qkey_viols;
((__be32 *) inbox->buf)[1] = agg_cap_mask;
}

Some files were not shown because too many files have changed in this diff.