The .cold function parent/child correlation logic has two passes: one in
read_symbols() and one in add_jump_destinations().
The second pass was added with commit cd77849a69 ("objtool: Fix GCC 8
cold subfunction detection for aliased functions") to ensure that if the
parent symbol had aliases then the canonical symbol was chosen as the
parent.
That solution was rather clunky, not to mention incomplete due to the
existence of alternatives and switch tables. Now that we have
sym->alias, the canonical alias fix can be done much simpler in the
first pass, making the second pass obsolete.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/bdab245a38000a5407f663a031f39e14c67a43d4.1763671318.git.jpoimboe@kernel.org
The objtool .cold child/parent correlation is done in two phases: first
in elf_add_symbol() and later in add_jump_destinations().
The first phase is rather crude and can pick the wrong parent if there
are duplicates with the same name.
The second phase usually fixes that, but only if the parent has a direct
jump to the child. It does *not* work if the only branch from the
parent to the child is an alternative or jump table entry.
Make the first phase more robust by looking for the parent in the same
STT_FILE as the child.
Fixes the following objtool warnings in an AutoFDO build with a large
CLANG_AUTOFDO_PROFILE profile:
vmlinux.o: warning: objtool: rdev_add_key() falls through to next function rdev_add_key.cold()
vmlinux.o: warning: objtool: rdev_set_default_key() falls through to next function rdev_set_default_key.cold()
Fixes: 13810435b9 ("objtool: Support GCC 8's cold subfunctions")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/82c7b52e40efa75dd10e1c550cc75c1ce10ac2c9.1763671318.git.jpoimboe@kernel.org
AutoFDO enables -fsplit-machine-functions which can move the cold parts
of a function to a <func>.cold symbol in a .text.split.<func> section.
Unlike GCC, the Clang <func>.cold symbols are not marked STT_FUNC. This
confuses objtool in several ways, resulting in warnings like the
following:
vmlinux.o: warning: objtool: apply_retpolines.cold+0xfc: unsupported instruction in callable function
vmlinux.o: warning: objtool: machine_check_poll.cold+0x2e: unsupported instruction in callable function
vmlinux.o: warning: objtool: free_deferred_objects.cold+0x1f: relocation to !ENDBR: free_deferred_objects.cold+0x26
vmlinux.o: warning: objtool: rpm_idle.cold+0xe0: relocation to !ENDBR: rpm_idle.cold+0xe7
vmlinux.o: warning: objtool: tcp_rcv_state_process.cold+0x1c: relocation to !ENDBR: tcp_rcv_state_process.cold+0x23
Fix it by marking the .cold symbols as STT_FUNC.
Fixes: 2fd65f7afd ("AutoFDO: Enable machine function split optimization for AutoFDO")
Closes: https://lore.kernel.org/20251103215244.2080638-2-xur@google.com
Reported-by: Rong Xu <xur@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: xur@google.com
Tested-by: xur@google.com
Link: https://patch.msgid.link/20a67326f04b2a361c031b56d58e8a803b3c5893.1763671318.git.jpoimboe@kernel.org
In preparation for klp-build, enable "classic" objtool to work on
livepatch modules:
- Avoid duplicate symbol/section warnings for prefix symbols and the
.static_call_sites and __mcount_loc sections which may have already
been extracted by klp diff.
- Add __klp_funcs to the IBT function pointer section whitelist.
- Prevent KLP symbols from getting incorrectly classified as cold
subfunctions.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
The prefix symbol creation code currently ignores all errors, presumably
because some functions don't have the leading NOPs.
Shuffle the code around a bit, improve the error handling and document
why some errors are ignored.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Add a new klp diff subcommand which performs a binary diff between two
object files and extracts changed functions into a new object which can
then be linked into a livepatch module.
This builds on concepts from the longstanding out-of-tree kpatch [1]
project which began in 2012 and has been used for many years to generate
livepatch modules for production kernels. However, this is a complete
rewrite which incorporates hard-earned lessons from 12+ years of
maintaining kpatch.
Key improvements compared to kpatch-build:
- Integrated with objtool: Leverages objtool's existing control-flow
graph analysis to help detect changed functions.
- Works on vmlinux.o: Supports late-linked objects, making it
compatible with LTO, IBT, and similar.
- Simplified code base: ~3k fewer lines of code.
- Upstream: No more out-of-tree #ifdef hacks, far less cruft.
- Cleaner internals: Vastly simplified logic for symbol/section/reloc
inclusion and special section extraction.
- Robust __LINE__ macro handling: Avoids false positive binary diffs
caused by the __LINE__ macro by introducing a fix-patch-lines script
(coming in a later patch) which injects #line directives into the
source .patch to preserve the original line numbers at compile time.
Note the end result of this subcommand is not yet functionally complete.
Livepatch needs some ELF magic which linkers don't like:
- Two relocation sections (.rela*, .klp.rela*) for the same text
section.
- Use of SHN_LIVEPATCH to mark livepatch symbols.
Unfortunately linkers tend to mangle such things. To work around that,
klp diff generates a linker-compliant intermediate binary which encodes
the relevant KLP section/reloc/symbol metadata.
After module linking, a klp post-link step (coming soon) will clean up
the mess and convert the linked .ko into a fully compliant livepatch
module.
Note this subcommand requires the diffed binaries to have been compiled
with -ffunction-sections and -fdata-sections, and processed with
'objtool --checksum'. Those constraints will be handled by a klp-build
script introduced in a later patch.
Without '-ffunction-sections -fdata-sections', reliable object diffing
would be infeasible due to toolchain limitations:
- For intra-file+intra-section references, the compiler might
occasionally generated hard-coded instruction offsets instead of
relocations.
- Section-symbol-based references can be ambiguous:
- Overlapping or zero-length symbols create ambiguity as to which
symbol is being referenced.
- A reference to the end of a symbol (e.g., checking array bounds)
can be misinterpreted as a reference to the next symbol, or vice
versa.
A potential future alternative to '-ffunction-sections -fdata-sections'
would be to introduce a toolchain option that forces symbol-based
(non-section) relocations.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
In preparation for the objtool klp diff subcommand, add a command-line
option to generate a unique checksum for each function. This will
enable detection of functions which have changed between two versions of
an object file.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
elf_create_rela_section() is quite limited in that it requires the
caller to know how many relocations need to be allocated up front.
In preparation for the objtool klp diff subcommand, allow an arbitrary
number of relocations to be created and initialized on demand after
section creation.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
In preparation for the objtool klp diff subcommand, refactor
elf_add_string() by adding a new elf_add_data() helper which allows the
adding of arbitrary data to a section.
Make both interfaces global so they can be used by the upcoming klp diff
code.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
In preparation for the objtool klp diff subcommand, broaden the
elf_create_section() interface to give callers more control and reduce
duplication of some subtle setup logic.
While at it, make elf_create_rela_section() global so sections can be
created by the upcoming klp diff code.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
In preparation for the objtool klp diff subcommand, broaden the
elf_create_symbol() interface to give callers more control and reduce
duplication of some subtle setup logic.
While at it, make elf_create_symbol() and elf_create_section_symbol()
global so sections can be created by the upcoming klp diff code.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
!sym->sec isn't actually a thing: even STT_UNDEF and other special
symbol types belong to NULL section 0.
Simplify the initialization of 'shndx' accordingly.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
In preparation for the objtool klp diff subcommand, introduce a flag to
identify __pfx_*() and __cfi_*() functions in advance so they don't need
to be manually identified every time a check is needed.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Introduce a flag to identify .cold subfunctions so they can be detected
easier and faster.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
KBUILD_HOSTCFLAGS and KBUILD_HOSTLDFLAGS aren't defined when objtool is
built standalone. Also, the EXTRA_WARNINGS flags are rather arbitrary.
Make things simpler and more consistent by specifying compiler flags
explicitly and tweaking the warnings. Also make a few code tweaks to
make the new warnings happy.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
find_symbol_hole_containing() fails to find a symbol hole (aka stripped
weak symbol) if its section has no symbols before the hole. This breaks
weak symbol detection if -ffunction-sections is enabled.
Fix that by allowing the interval tree to contain section symbols, which
are always at offset zero for a given section.
Fixes a bunch of (-ffunction-sections) warnings like:
vmlinux.o: warning: objtool: .text.__x64_sys_io_setup+0x10: unreachable instruction
Fixes: 4adb236867 ("objtool: Ignore extra-symbol code")
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
The following commit
5da6aea375 ("objtool: Fix find_{symbol,func}_containing()")
fixed the issue where overlapping symbols weren't getting sorted
properly in the symbol tree. Therefore the workaround to skip adding
empty symbols from the following commit
a2e38dffcd ("objtool: Don't add empty symbols to the rbtree")
is no longer needed.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Properly check and propagate the return value of elf_truncate_section()
to avoid silent failures.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
The free(sym) call in the read_symbols() error path is fundamentally
broken: 'sym' doesn't point to any allocated block. If triggered,
things would go from bad to worse.
Remove the free() and simplify the error paths. Freeing memory isn't
necessary here anyway, these are fatal errors which lead to an immediate
exit().
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
In the rare case of overlapping symbols, find_symbol_containing() just
returns the first one it finds. Make it slightly less arbitrary by
returning the smallest symbol with size > 0.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
After elf_update_group_sh_info() was introduced, a prototype version of
"objtool klp diff" went from taking ~1s to several minutes, due to
looping almost endlessly in elf_update_group_sh_info() while creating
thousands of local symbols in a file with thousands of sections.
Dramatically improve the performance by marking all symbols' correlated
SHT_GROUP sections while reading the object. That way there's no need
to search for it every time a symbol gets reindexed.
Fixes: 2cb291596e ("objtool: Fix up st_info in COMDAT group section")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Rong Xu <xur@google.com>
Link: https://lkml.kernel.org/r/2a33e583c87e3283706f346f9d59aac20653b7fd.1746662991.git.jpoimboe@kernel.org
When __elf_create_symbol creates a local symbol, it relocates the first
global symbol upwards to make space. Subsequently, elf_update_symbol()
is called to refresh the symbol table section. However, this isn't
sufficient, as other sections might have the reference to the old
symbol index, for instance, the sh_info field of an SHT_GROUP section.
This patch updates the `sh_info` field when necessary. This field
serves as the key for the COMDAT group. An incorrect key would prevent
the linker's from deduplicating COMDAT symbols, leading to duplicate
definitions in the final link.
Signed-off-by: Rong Xu <xur@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250425200541.113015-1-xur@google.com
In the presence of both weak and strong function definitions, the
linker drops the weak symbol in favor of a strong symbol, but
leaves the code in place. Code in ignore_unreachable_insn() has
some heuristics to suppress the warning, but it does not work when
-ffunction-sections is enabled.
Suppose function foo has both strong and weak definitions.
Case 1: The strong definition has an annotated section name,
like .init.text. Only the weak definition will be placed into
.text.foo. But since the section has no symbols, there will be no
"hole" in the section.
Case 2: Both sections are without an annotated section name.
Both will be placed into .text.foo section, but there will be only one
symbol (the strong one). If the weak code is before the strong code,
there is no "hole" as it fails to find the right-most symbol before
the offset.
The fix is to use the first node to compute the hole if hole.sym
is empty. If there is no symbol in the section, the first node
will be NULL, in which case, -1 is returned to skip the whole
section.
Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Yabin Cui <yabinc@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Function elf_open_read() only zero initializes the initial part of
allocated struct elf; num_relocs member was recently added outside the
zeroed part so that it was left uninitialized, resulting in build failures
on some systems.
The partial initialization is a relic of times when struct elf had large
hash tables embedded. This is no longer the case so remove the trap and
initialize the whole structure instead.
Fixes: eb0481bbc4 ("objtool: Fix reloc_hash size")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/20230629102051.42E8360467@lion.mk-sys.cz
Objtool doesn't use DWARF at all, and the DWARF sections' data take up a
lot of memory. Skip reading them.
Note this only skips the DWARF base sections, not the rela sections.
The relas are needed because their symbol references may need to be
reindexed if any local symbols get added by elf_create_symbol().
Also note the DWARF data will eventually be read by libelf anyway, when
writing the object file. But that's fine, the goal here is to reduce
*peak* memory usage, and the previous patch (which freed insn memory)
gave some breathing room. So the allocation gets shifted to a later
time, resulting in lower peak memory usage.
With allyesconfig + CONFIG_DEBUG_INFO:
- Before: peak heap memory consumption: 29.93G
- After: peak heap memory consumption: 25.47G
Link: https://lore.kernel.org/r/52a9698835861dd35f2ec35c49f96d0bb39fb177.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
When creating an annotation section, allocate the reloc section data at
the beginning. This simplifies the data model a bit and also saves
memory due to the removal of malloc() in elf_rebuild_reloc_section().
With allyesconfig + CONFIG_DEBUG_INFO:
- Before: peak heap memory consumption: 53.49G
- After: peak heap memory consumption: 49.02G
Link: https://lore.kernel.org/r/048e908f3ede9b66c15e44672b6dda992b1dae3e.1685464332.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>