Pull MM updates from Andrew Morton:
- "Add folio_mk_pte()" from Matthew Wilcox simplifies the act of
creating a pte which addresses the first page in a folio and reduces
the amount of plumbing which architecture must implement to provide
this.
- "Misc folio patches for 6.16" from Matthew Wilcox is a shower of
largely unrelated folio infrastructure changes which clean things up
and better prepare us for future work.
- "memory,x86,acpi: hotplug memory alignment advisement" from Gregory
Price adds early-init code to prevent x86 from leaving physical
memory unused when physical address regions are not aligned to memory
block size.
- "mm/compaction: allow more aggressive proactive compaction" from
Michal Clapinski provides some tuning of the (sadly, hard-coded (more
sadly, not auto-tuned)) thresholds for our invokation of proactive
compaction. In a simple test case, the reduction of a guest VM's
memory consumption was dramatic.
- "Minor cleanups and improvements to swap freeing code" from Kemeng
Shi provides some code cleaups and a small efficiency improvement to
this part of our swap handling code.
- "ptrace: introduce PTRACE_SET_SYSCALL_INFO API" from Dmitry Levin
adds the ability for a ptracer to modify syscalls arguments. At this
time we can alter only "system call information that are used by
strace system call tampering, namely, syscall number, syscall
arguments, and syscall return value.
This series should have been incorporated into mm.git's "non-MM"
branch, but I goofed.
- "fs/proc: extend the PAGEMAP_SCAN ioctl to report guard regions" from
Andrei Vagin extends the info returned by the PAGEMAP_SCAN ioctl
against /proc/pid/pagemap. This permits CRIU to more efficiently get
at the info about guard regions.
- "Fix parameter passed to page_mapcount_is_type()" from Gavin Shan
implements that fix. No runtime effect is expected because
validate_page_before_insert() happens to fix up this error.
- "kernel/events/uprobes: uprobe_write_opcode() rewrite" from David
Hildenbrand basically brings uprobe text poking into the current
decade. Remove a bunch of hand-rolled implementation in favor of
using more current facilities.
- "mm/ptdump: Drop assumption that pxd_val() is u64" from Anshuman
Khandual provides enhancements and generalizations to the pte dumping
code. This might be needed when 128-bit Page Table Descriptors are
enabled for ARM.
- "Always call constructor for kernel page tables" from Kevin Brodsky
ensures that the ctor/dtor is always called for kernel pgtables, as
it already is for user pgtables.
This permits the addition of more functionality such as "insert hooks
to protect page tables". This change does result in various
architectures performing unnecesary work, but this is fixed up where
it is anticipated to occur.
- "Rust support for mm_struct, vm_area_struct, and mmap" from Alice
Ryhl adds plumbing to permit Rust access to core MM structures.
- "fix incorrectly disallowed anonymous VMA merges" from Lorenzo
Stoakes takes advantage of some VMA merging opportunities which we've
been missing for 15 years.
- "mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE" from
SeongJae Park optimizes process_madvise()'s TLB flushing.
Instead of flushing each address range in the provided iovec, we
batch the flushing across all the iovec entries. The syscall's cost
was approximately halved with a microbenchmark which was designed to
load this particular operation.
- "Track node vacancy to reduce worst case allocation counts" from
Sidhartha Kumar makes the maple tree smarter about its node
preallocation.
stress-ng mmap performance increased by single-digit percentages and
the amount of unnecessarily preallocated memory was dramaticelly
reduced.
- "mm/gup: Minor fix, cleanup and improvements" from Baoquan He removes
a few unnecessary things which Baoquan noted when reading the code.
- ""Enhance sysfs handling for memory hotplug in weighted interleave"
from Rakie Kim "enhances the weighted interleave policy in the memory
management subsystem by improving sysfs handling, fixing memory
leaks, and introducing dynamic sysfs updates for memory hotplug
support". Fixes things on error paths which we are unlikely to hit.
- "mm/damon: auto-tune DAMOS for NUMA setups including tiered memory"
from SeongJae Park introduces new DAMOS quota goal metrics which
eliminate the manual tuning which is required when utilizing DAMON
for memory tiering.
- "mm/vmalloc.c: code cleanup and improvements" from Baoquan He
provides cleanups and small efficiency improvements which Baoquan
found via code inspection.
- "vmscan: enforce mems_effective during demotion" from Gregory Price
changes reclaim to respect cpuset.mems_effective during demotion when
possible. because presently, reclaim explicitly ignores
cpuset.mems_effective when demoting, which may cause the cpuset
settings to violated.
This is useful for isolating workloads on a multi-tenant system from
certain classes of memory more consistently.
- "Clean up split_huge_pmd_locked() and remove unnecessary folio
pointers" from Gavin Guo provides minor cleanups and efficiency gains
in in the huge page splitting and migrating code.
- "Use kmem_cache for memcg alloc" from Huan Yang creates a slab cache
for `struct mem_cgroup', yielding improved memory utilization.
- "add max arg to swappiness in memory.reclaim and lru_gen" from
Zhongkun He adds a new "max" argument to the "swappiness=" argument
for memory.reclaim MGLRU's lru_gen.
This directs proactive reclaim to reclaim from only anon folios
rather than file-backed folios.
- "kexec: introduce Kexec HandOver (KHO)" from Mike Rapoport is the
first step on the path to permitting the kernel to maintain existing
VMs while replacing the host kernel via file-based kexec. At this
time only memblock's reserve_mem is preserved.
- "mm: Introduce for_each_valid_pfn()" from David Woodhouse provides
and uses a smarter way of looping over a pfn range. By skipping
ranges of invalid pfns.
- "sched/numa: Skip VMA scanning on memory pinned to one NUMA node via
cpuset.mems" from Libo Chen removes a lot of pointless VMA scanning
when a task is pinned a single NUMA mode.
Dramatic performance benefits were seen in some real world cases.
- "JFS: Implement migrate_folio for jfs_metapage_aops" from Shivank
Garg addresses a warning which occurs during memory compaction when
using JFS.
- "move all VMA allocation, freeing and duplication logic to mm" from
Lorenzo Stoakes moves some VMA code from kernel/fork.c into the more
appropriate mm/vma.c.
- "mm, swap: clean up swap cache mapping helper" from Kairui Song
provides code consolidation and cleanups related to the folio_index()
function.
- "mm/gup: Cleanup memfd_pin_folios()" from Vishal Moola does that.
- "memcg: Fix test_memcg_min/low test failures" from Waiman Long
addresses some bogus failures which are being reported by the
test_memcontrol selftest.
- "eliminate mmap() retry merge, add .mmap_prepare hook" from Lorenzo
Stoakes commences the deprecation of file_operations.mmap() in favor
of the new file_operations.mmap_prepare().
The latter is more restrictive and prevents drivers from messing with
things in ways which, amongst other problems, may defeat VMA merging.
- "memcg: decouple memcg and objcg stocks"" from Shakeel Butt decouples
the per-cpu memcg charge cache from the objcg's one.
This is a step along the way to making memcg and objcg charging
NMI-safe, which is a BPF requirement.
- "mm/damon: minor fixups and improvements for code, tests, and
documents" from SeongJae Park is yet another batch of miscellaneous
DAMON changes. Fix and improve minor problems in code, tests and
documents.
- "memcg: make memcg stats irq safe" from Shakeel Butt converts memcg
stats to be irq safe. Another step along the way to making memcg
charging and stats updates NMI-safe, a BPF requirement.
- "Let unmap_hugepage_range() and several related functions take folio
instead of page" from Fan Ni provides folio conversions in the
hugetlb code.
* tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (285 commits)
mm: pcp: increase pcp->free_count threshold to trigger free_high
mm/hugetlb: convert use of struct page to folio in __unmap_hugepage_range()
mm/hugetlb: refactor __unmap_hugepage_range() to take folio instead of page
mm/hugetlb: refactor unmap_hugepage_range() to take folio instead of page
mm/hugetlb: pass folio instead of page to unmap_ref_private()
memcg: objcg stock trylock without irq disabling
memcg: no stock lock for cpu hot-unplug
memcg: make __mod_memcg_lruvec_state re-entrant safe against irqs
memcg: make count_memcg_events re-entrant safe against irqs
memcg: make mod_memcg_state re-entrant safe against irqs
memcg: move preempt disable to callers of memcg_rstat_updated
memcg: memcg_rstat_updated re-entrant safe against irqs
mm: khugepaged: decouple SHMEM and file folios' collapse
selftests/eventfd: correct test name and improve messages
alloc_tag: check mem_profiling_support in alloc_tag_init
Docs/damon: update titles and brief introductions to explain DAMOS
selftests/damon/_damon_sysfs: read tried regions directories in order
mm/damon/tests/core-kunit: add a test for damos_set_filters_default_reject()
mm/damon/paddr: remove unused variable, folio_list, in damon_pa_stat()
mm/damon/sysfs-schemes: fix wrong comment on damons_sysfs_quota_goal_metric_strs
...
Pull EFI updates from Ard Biesheuvel:
"Not a lot going on in the EFI tree this cycle. The only thing that
stands out is the new support for SBAT metadata, which was a bit
contentious when it was first proposed, because in the initial
incarnation, it would have required us to maintain a revocation index,
and bump it each time a vulnerability affecting UEFI secure boot got
fixed. This was shot down for obvious reasons.
This time, only the changes needed to emit the SBAT section into the
PE/COFF image are being carried upstream, and it is up to the distros
to decide what to put in there when creating and signing the build.
This only has the EFI zboot bits (which the distros will be using for
arm64); the x86 bzImage changes should be arriving next cycle,
presumably via the -tip tree.
Summary:
- Add support for emitting a .sbat section into the EFI zboot image,
so that downstreams can easily include revocation metadata in the
signed EFI images
- Align PE symbolic constant names with other projects
- Bug fix for the efi_test module
- Log the physical address and size of the EFI memory map when
failing to map it
- A kerneldoc fix for the EFI stub code"
* tag 'efi-next-for-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
include: pe.h: Fix PE definitions
efi/efi_test: Fix missing pending status update in getwakeuptime
efi: zboot specific mechanism for embedding SBAT section
efi/libstub: Describe missing 'out' parameter in efi_load_initrd
efi: Improve logging around memmap init
The main CPUID header <asm/cpuid.h> was originally a storefront for the
headers:
<asm/cpuid/api.h>
<asm/cpuid/leaf_0x2_api.h>
Now that the latter CPUID(0x2) header has been merged into the former,
there is no practical difference between <asm/cpuid.h> and
<asm/cpuid/api.h>.
Migrate all users to the <asm/cpuid/api.h> header, in preparation of
the removal of <asm/cpuid.h>.
Don't remove <asm/cpuid.h> just yet, in case some new code in -next
started using it.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250508150240.172915-3-darwi@linutronix.de
The global pseudo-constants 'page_offset_base', 'vmalloc_base' and
'vmemmap_base' are not used extremely early during the boot, and cannot be
used safely until after the KASLR memory randomization code in
kernel_randomize_memory() executes, which may update their values.
So there is no point in setting these variables extremely early, and it
can wait until after the kernel itself is mapped and running from its
permanent virtual mapping.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250513111157.717727-9-ardb+git@google.com
Prepare to resolve conflicts with an upstream series of fixes that conflict
with pending x86 changes:
6f5bf947ba Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Prepare to resolve conflicts with an upstream series of fixes that conflict
with pending x86 changes:
6f5bf947ba Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Most of the SEV support code used to reside in a single C source file
that was included in two places: the core kernel, and the decompressor.
The code that is actually shared with the decompressor was moved into a
separate, shared source file under startup/, on the basis that the
decompressor also executes from the early 1:1 mapping of memory.
However, while the elaborate #VC handling and instruction decoding that
it involves is also performed by the decompressor, it does not actually
occur in the core kernel at early boot, and therefore, does not need to
be part of the confined early startup code.
So split off the #VC handling code and move it back into arch/x86/coco
where it came from, into another C source file that is included from
both the decompressor and the core kernel.
Code movement only - no functional change intended.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250504095230.2932860-31-ardb+git@google.com
Commit:
d54d610243 ("x86/boot/sev: Avoid shared GHCB page for early memory acceptance")
provided a fix for SEV-SNP memory acceptance from the EFI stub when
running at VMPL #0. However, that fix was insufficient for SVSM SEV-SNP
guests running at VMPL >0, as those rely on a SVSM calling area, which
is a shared buffer whose address is programmed into a SEV-SNP MSR, and
the SEV init code that sets up this calling area executes much later
during the boot.
Given that booting via the EFI stub at VMPL >0 implies that the firmware
has configured this calling area already, reuse it for performing memory
acceptance in the EFI stub.
Fixes: fcd042e864 ("x86/sev: Perform PVALIDATE using the SVSM when not at VMPL0")
Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Co-developed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250428174322.2780170-2-ardb+git@google.com
The GNU coreutils version of truncate, which is the original, accepts a
% prefix for the -s size argument which means the file in question
should be padded to a multiple of the given size. This is currently used
to pad the setup block of bzImage to a multiple of 4k before appending
the decompressor.
busybox reimplements truncate but does not support this idiom, and
therefore fails the build since commit
9c54baab44 ("x86/boot: Drop CRC-32 checksum and the build tool that generates it")
Since very little build code within the kernel depends on the 'truncate'
utility, work around this incompatibility by avoiding truncate altogether,
and relying on dd to perform the padding.
Fixes: 9c54baab44 ("x86/boot: Drop CRC-32 checksum and the build tool that generates it")
Reported-by: <phasta@kernel.org>
Tested-by: Philipp Stanner <phasta@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250424101917.1552527-2-ardb+git@google.com
This commits breaks SNP guests:
234cf67fc3 ("x86/sev: Split off startup code from core code")
The SNP guest boots, but no longer has access to the VMPCK keys needed
to communicate with the ASP, which is used, for example, to obtain an
attestation report.
The secrets_pa value is defined as static in both startup.c and
core.c. It is set by a function in startup.c and so when used in
core.c its value will be 0.
Share it again and add the sev_ prefix to put it into the global
SEV symbols namespace.
[ mingo: Renamed to sev_secrets_pa ]
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Link: https://lore.kernel.org/r/cf878810-81ed-3017-52c6-ce6aa41b5f01@amd.com
objtool already struggles to identify jump tables correctly in non-PIC
code, where the idiom is something like
jmpq *table(,%idx,8)
and the table is a list of absolute addresses of jump targets.
When using -fPIC, both the table reference as well as the jump targets
are emitted in a RIP-relative manner, resulting in something like
leaq table(%rip), %tbl
movslq (%tbl,%idx,4), %offset
addq %offset, %tbl
jmpq *%tbl
and the table is a list of offsets of the jump targets relative to the
start of the entire table.
Considering that this sequence of instructions can be interleaved with
other instructions that have nothing to do with the jump table in
question, it is extremely difficult to infer the control flow by
deriving the jump targets from the indirect jump, the location of the
table and the relative offsets it contains.
So let's not bother and disable jump tables for code built with -fPIC
under arch/x86/boot/startup.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250422210510.600354-2-ardb+git@google.com
In particular we need this fix before applying subsequent changes:
d54d610243 ("x86/boot/sev: Avoid shared GHCB page for early memory acceptance")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Minimum version of binutils required to compile the kernel is 2.25.
This version correctly handles the "rep" prefixes, so it is possible
to remove the semicolon, which was used to support ancient versions
of GNU as.
Due to the semicolon, the compiler considers "rep; insn" (or its
alternate "rep\n\tinsn" form) as two separate instructions. Removing
the semicolon makes asm length calculations more accurate, consequently
making scheduling and inlining decisions of the compiler more accurate.
Removing the semicolon also enables assembler checks involving "rep"
prefixes. Trying to assemble e.g. "rep addl %eax, %ebx" results in:
Error: invalid instruction `add' after `rep'
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Mares <mj@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250418071437.4144391-1-ubizjak@gmail.com
When building with CONFIG_LTO_CLANG, there is an error in the x86 boot
startup code because it builds with a different code model than the rest
of the kernel:
ld.lld: error: Function Import: link error: linking module flags 'Code Model': IDs have conflicting values: 'i32 2' from vmlinux.a(head64.o at 1302448), and 'i32 1' from vmlinux.a(map_kernel.o at 1314208)
ld.lld: error: Function Import: link error: linking module flags 'Code Model': IDs have conflicting values: 'i32 2' from vmlinux.a(common.o at 1306108), and 'i32 1' from vmlinux.a(gdt_idt.o at 1314148)
As this directory is for code that only runs during early system
initialization, LTO is not very important, so filter out the LTO flags
from KBUILD_CFLAGS for arch/x86/boot/startup to resolve the build error.
Fixes: 4cecebf200 ("x86/boot: Move the early GDT/IDT setup code into startup/")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20250414-x86-boot-startup-lto-error-v1-1-7c8bed7c131c@kernel.org
Closes: https://lore.kernel.org/CA+G9fYvnun+bhYgtt425LWxzOmj+8Jf3ruKeYxQSx-F6U7aisg@mail.gmail.com/
The 5-level paging trampoline is used by both the EFI stub and the
traditional decompressor. Move it out of the decompressor sources into
the newly minted arch/x86/boot/startup/ sub-directory which will hold
startup code that may be shared between the decompressor, the EFI stub
and the kernel proper, and needs to tolerate being called during early
boot, before the kernel virtual mapping has been created.
This will allow the 5-level paging trampoline to be used by EFI boot
images such as zboot that omit the traditional decompressor entirely.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250401133416.1436741-10-ardb+git@google.com
Pull Kbuild updates from Masahiro Yamada:
- Improve performance in gendwarfksyms
- Remove deprecated EXTRA_*FLAGS and KBUILD_ENABLE_EXTRA_GCC_CHECKS
- Support CONFIG_HEADERS_INSTALL for ARCH=um
- Use more relative paths to sources files for better reproducibility
- Support the loong64 Debian architecture
- Add Kbuild bash completion
- Introduce intermediate vmlinux.unstripped for architectures that need
static relocations to be stripped from the final vmlinux
- Fix versioning in Debian packages for -rc releases
- Treat missing MODULE_DESCRIPTION() as an error
- Convert Nios2 Makefiles to use the generic rule for built-in DTB
- Add debuginfo support to the RPM package
* tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (40 commits)
kbuild: rpm-pkg: build a debuginfo RPM
kconfig: merge_config: use an empty file as initfile
nios2: migrate to the generic rule for built-in DTB
rust: kbuild: skip `--remap-path-prefix` for `rustdoc`
kbuild: pacman-pkg: hardcode module installation path
kbuild: deb-pkg: don't set KBUILD_BUILD_VERSION unconditionally
modpost: require a MODULE_DESCRIPTION()
kbuild: make all file references relative to source root
x86: drop unnecessary prefix map configuration
kbuild: deb-pkg: add comment about future removal of KDEB_COMPRESS
kbuild: Add a help message for "headers"
kbuild: deb-pkg: remove "version" variable in mkdebian
kbuild: deb-pkg: fix versioning for -rc releases
Documentation/kbuild: Fix indentation in modules.rst example
x86: Get rid of Makefile.postlink
kbuild: Create intermediate vmlinux build with relocations preserved
kbuild: Introduce Kconfig symbol for linking vmlinux with relocations
kbuild: link-vmlinux.sh: Make output file name configurable
kbuild: do not generate .tmp_vmlinux*.map when CONFIG_VMLINUX_MAP=y
Revert "kheaders: Ignore silly-rename files"
...
Pull EFI updates from Ard Biesheuvel:
- Decouple mixed mode startup code from the traditional x86
decompressor
- Revert zero-length file hack in efivarfs
- Prevent EFI zboot from using the CopyMem/SetMem boot services after
ExitBootServices()
- Update EFI zboot to use the ZLIB/ZSTD library interfaces directly
* tag 'efi-next-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/libstub: Avoid legacy decompressor zlib/zstd wrappers
efi/libstub: Avoid CopyMem/SetMem EFI services after ExitBootServices
efi: efibc: change kmalloc(size * count, ...) to kmalloc_array()
efivarfs: Revert "allow creation of zero length files"
x86/efi/mixed: Move mixed mode startup code into libstub
x86/efi/mixed: Simplify and document thunking logic
x86/efi/mixed: Remove dependency on legacy startup_32 code
x86/efi/mixed: Set up 1:1 mapping of lower 4GiB in the stub
x86/efi/mixed: Factor out and clean up long mode entry
x86/efi/mixed: Check CPU compatibility without relying on verify_cpu()
x86/efistub: Merge PE and handover entrypoints
We've had this before: when we remove infrastructure to generate files,
the old stale build artifacts still remain in-tree. And when the
infrastructure to generate them is gone, so is the gitignore file for
those build artifacts.
End result: git will see the old generated files, and people will
mistakenly commit them. That's what happened with the 'genheaders' file
not that long ago (see commit 04a3389b35 "Remove stale generated
'genheaders' file").
This time it's commit 9c54baab44 ("x86/boot: Drop CRC-32 checksum and
the build tool that generates it") that removed the 'build' file from
the arch/x86/boot/tools/ subdirectory, and removed the .gitignore file
too (because the whole subdirectory is gone).
And as a result, if you don't do a 'git clean -dqfx' or similar to clean
up your tree, 'git status' will say
Untracked files:
(use "git add <file>..." to include in what will be committed)
arch/x86/boot/tools/
and some hapless sleep-deprived developer will inevitably decide that
that means that they need to 'git add' that directory. Which would
bring back some stale generated file that we most definitely do not want
in the tree.
So when removing directories that had special .gitignore patterns, make
sure to add a new gitignore entry in the parent directory for the no
longer existing subdirectory.
It will avoid mistakes.
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Fixes: 9c54baab44 ("x86/boot: Drop CRC-32 checksum and the build tool that generates it")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x86 boot code updates from Ingo Molnar:
- Memblock setup and other early boot code cleanups (Mike Rapoport)
- Export e820_table_kexec[] to sysfs (Dave Young)
- Baby steps of adding relocate_kernel() debugging support (David
Woodhouse)
- Replace open-coded parity calculation with parity8() (Kuan-Wei Chiu)
- Move the LA57 trampoline to separate source file (Ard Biesheuvel)
- Misc micro-optimizations (Uros Bizjak)
- Drop obsolete E820_TYPE_RESERVED_KERN and related code (Mike
Rapoport)
* tag 'x86-boot-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kexec: Add relocate_kernel() debugging support: Load a GDT
x86/boot: Move the LA57 trampoline to separate source file
x86/boot: Do not test if AC and ID eflags are changeable on x86_64
x86/bootflag: Replace open-coded parity calculation with parity8()
x86/bootflag: Micro-optimize sbf_write()
x86/boot: Add missing has_cpuflag() prototype
x86/kexec: Export e820_table_kexec[] to sysfs
x86/boot: Change some static bootflag functions to bool
x86/e820: Drop obsolete E820_TYPE_RESERVED_KERN and related code
x86/boot: Split parsing of boot_params into the parse_boot_params() helper function
x86/boot: Split kernel resources setup into the setup_kernel_resources() helper function
x86/boot: Move setting of memblock parameters to e820__memblock_setup()
Pull x86 build updates from Ingo Molnar:
- Drop CRC-32 checksum and the build tool that generates it (Ard
Biesheuvel)
- Fix broken copy command in genimage.sh when making isoimage (Nir
Lichtman)
* tag 'x86-build-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Add back some padding for the CRC-32 checksum
x86/boot: Drop CRC-32 checksum and the build tool that generates it
x86/build: Fix broken copy command in genimage.sh when making isoimage
The toplevel Makefile already provides -fmacro-prefix-map as part of
KBUILD_CPPFLAGS. In contrast to the KBUILD_CFLAGS and KBUILD_AFLAGS
variables, KBUILD_CPPFLAGS is not redefined in the architecture specific
Makefiles. Therefore the toplevel KBUILD_CPPFLAGS do apply just fine, to
both C and ASM sources.
The custom configuration was necessary when it was added in
commit 9e2276fa6e ("arch/x86/boot: Use prefix map to avoid embedded
paths") but has since become unnecessary in commit a716bd7432
("kbuild: use -fmacro-prefix-map for .S sources").
Drop the now unnecessary custom prefix map configuration.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
While the GCC and Clang compilers already define __ASSEMBLER__
automatically when compiling assembly code, __ASSEMBLY__ is a
macro that only gets defined by the Makefiles in the kernel.
This can be very confusing when switching between userspace
and kernelspace coding, or when dealing with UAPI headers that
rather should use __ASSEMBLER__ instead. So let's standardize on
the __ASSEMBLER__ macro that is provided by the compilers now.
This is mostly a mechanical patch (done with a simple "sed -i"
statement), with some manual tweaks in <asm/frame.h>, <asm/hw_irq.h>
and <asm/setup.h> that mentioned this macro in comments with some
missing underscores.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250314071013.1575167-38-thuth@redhat.com
Introduce an AWK script to auto-generate the <asm/cpufeaturemasks.h> header
with required and disabled feature masks based on <asm/cpufeatures.h>
and the current build config.
Thus for any CPU feature with a build config, e.g., X86_FRED, simply add:
config X86_DISABLED_FEATURE_FRED
def_bool y
depends on !X86_FRED
to arch/x86/Kconfig.cpufeatures, instead of adding a conditional CPU
feature disable flag, e.g., DISABLE_FRED.
Lastly, the generated required and disabled feature masks will be added to
their corresponding feature masks for this particular compile-time
configuration.
[ Xin: build integration improvements ]
[ mingo: Improved changelog and comments ]
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250305184725.3341760-3-xin@zytor.com
Instead of generating the vmlinux.relocs file (needed by the
decompressor build to construct the KASLR relocation tables) as a
vmlinux postlink step, which is dubious because it depends on data that
is stripped from vmlinux before the build completes, generate it from
vmlinux.unstripped, which has been introduced specifically for this
purpose.
This ensures that each artifact is rebuilt as needed, rather than as a
side effect of another build rule.
This effectively reverts commit
9d9173e9ce ("x86/build: Avoid relocation information in final vmlinux")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>