mirror of
https://github.com/torvalds/linux.git
synced 2025-12-07 11:56:58 +00:00
Patch series "mm: remove is_swap_[pte, pmd]() + non-swap entries,
introduce leaf entries", v3.
There's an established convention in the kernel that we treat leaf page
tables (so far at the PTE, PMD level) as containing 'swap entries' should
they be neither empty (i.e. p**_none() evaluating true) nor present (i.e.
p**_present() evaluating true).
However, at the same time we also have helper predicates - is_swap_pte(),
is_swap_pmd() - which are inconsistently used.
This is problematic, as it is logical to assume that should somebody wish
to operate upon a page table swap entry they should first check to see if
it is in fact one.
It also implies that perhaps, in future, we might introduce a non-present,
none page table entry that is not a swap entry.
This series resolves this issue by systematically eliminating all use of
the is_swap_pte() and is swap_pmd() predicates so we retain only the
convention that should a leaf page table entry be neither none nor present
it is a swap entry.
We also have the further issue that 'swap entry' is unfortunately a really
rather overloaded term and in fact refers to both entries for swap and for
other information such as migration entries, page table markers, and
device private entries.
We therefore have the rather 'unique' concept of a 'non-swap' swap entry.
This series therefore introduces the concept of 'software leaf entries',
of type softleaf_t, to eliminate this confusion.
A software leaf entry in this sense is any page table entry which is
non-present, and represented by the softleaf_t type. That is - page table
leaf entries which are software-controlled by the kernel.
This includes 'none' or empty entries, which are simply represented by an
zero leaf entry value.
In order to maintain compatibility as we transition the kernel to this new
type, we simply typedef swp_entry_t to softleaf_t.
We introduce a number of predicates and helpers to interact with software
leaf entries in include/linux/leafops.h which, as it imports swapops.h,
can be treated as a drop-in replacement for swapops.h wherever leaf entry
helpers are used.
Since softleaf_from_[pte, pmd]() treats present entries as they were
empty/none leaf entries, this allows for a great deal of simplification of
code throughout the code base, which this series utilises a great deal.
We additionally change from swap entry to software leaf entry handling
where it makes sense to and eliminate functions from swapops.h where
software leaf entries obviate the need for the functions.
This patch (of 16):
PTE markers were previously only concerned with UFFD-specific logic - that
is, PTE entries with the UFFD WP marker set or those marked via
UFFDIO_POISON.
However since the introduction of guard markers in commit 7c53dfbdb0
("mm: add PTE_MARKER_GUARD PTE marker"), this has no longer been the case.
Issues have been avoided as guard regions are not permitted in conjunction
with UFFD, but it still leaves very confusing logic in place, most notably
the misleading and poorly named pte_none_mostly() and
huge_pte_none_mostly().
This predicate returns true for PTE entries that ought to be treated as
none, but only in certain circumstances, and on the assumption we are
dealing with H/W poison markers or UFFD WP markers.
This patch removes these functions and makes each invocation of these
functions instead explicitly check what it needs to check.
As part of this effort it introduces is_uffd_pte_marker() to explicitly
determine if a marker in fact is used as part of UFFD or not.
In the HMM logic we note that the only time we would need to check for a
fault is in the case of a UFFD WP marker, otherwise we simply encounter a
fault error (VM_FAULT_HWPOISON for H/W poisoned marker, VM_FAULT_SIGSEGV
for a guard marker), so only check for the UFFD WP case.
While we're here we also refactor code to make it easier to understand.
[akpm@linux-foundation.org: fix comment typo, per Mike]
Link: https://lkml.kernel.org/r/cover.1762812360.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/c38625fd9a1c1f1cf64ae8a248858e45b3dcdf11.1762812360.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Chris Li <chrisl@kernel.org>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Wei Xu <weixugc@google.com>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
132 lines
2.8 KiB
C
132 lines
2.8 KiB
C
/* SPDX-License-Identifier: GPL-2.0 */
|
|
#ifndef _ASM_GENERIC_HUGETLB_H
|
|
#define _ASM_GENERIC_HUGETLB_H
|
|
|
|
#include <linux/swap.h>
|
|
#include <linux/swapops.h>
|
|
|
|
static inline unsigned long huge_pte_write(pte_t pte)
|
|
{
|
|
return pte_write(pte);
|
|
}
|
|
|
|
static inline unsigned long huge_pte_dirty(pte_t pte)
|
|
{
|
|
return pte_dirty(pte);
|
|
}
|
|
|
|
static inline pte_t huge_pte_mkwrite(pte_t pte)
|
|
{
|
|
return pte_mkwrite_novma(pte);
|
|
}
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTE_WRPROTECT
|
|
static inline pte_t huge_pte_wrprotect(pte_t pte)
|
|
{
|
|
return pte_wrprotect(pte);
|
|
}
|
|
#endif
|
|
|
|
static inline pte_t huge_pte_mkdirty(pte_t pte)
|
|
{
|
|
return pte_mkdirty(pte);
|
|
}
|
|
|
|
static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot)
|
|
{
|
|
return pte_modify(pte, newprot);
|
|
}
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTE_MKUFFD_WP
|
|
static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
|
|
{
|
|
return huge_pte_wrprotect(pte_mkuffd_wp(pte));
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTE_CLEAR_UFFD_WP
|
|
static inline pte_t huge_pte_clear_uffd_wp(pte_t pte)
|
|
{
|
|
return pte_clear_uffd_wp(pte);
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTE_UFFD_WP
|
|
static inline int huge_pte_uffd_wp(pte_t pte)
|
|
{
|
|
return pte_uffd_wp(pte);
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTE_CLEAR
|
|
static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
|
|
pte_t *ptep, unsigned long sz)
|
|
{
|
|
pte_clear(mm, addr, ptep);
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
|
|
static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
|
|
pte_t *ptep, pte_t pte, unsigned long sz)
|
|
{
|
|
set_pte_at(mm, addr, ptep, pte);
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
|
|
static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
|
|
unsigned long addr, pte_t *ptep, unsigned long sz)
|
|
{
|
|
return ptep_get_and_clear(mm, addr, ptep);
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
|
|
static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
|
|
unsigned long addr, pte_t *ptep)
|
|
{
|
|
return ptep_clear_flush(vma, addr, ptep);
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTE_NONE
|
|
static inline int huge_pte_none(pte_t pte)
|
|
{
|
|
return pte_none(pte);
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
|
|
static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
|
|
unsigned long addr, pte_t *ptep)
|
|
{
|
|
ptep_set_wrprotect(mm, addr, ptep);
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
|
|
static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
|
|
unsigned long addr, pte_t *ptep,
|
|
pte_t pte, int dirty)
|
|
{
|
|
return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_HUGE_PTEP_GET
|
|
static inline pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
|
|
{
|
|
return ptep_get(ptep);
|
|
}
|
|
#endif
|
|
|
|
#ifndef __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED
|
|
static inline bool gigantic_page_runtime_supported(void)
|
|
{
|
|
return IS_ENABLED(CONFIG_ARCH_HAS_GIGANTIC_PAGE);
|
|
}
|
|
#endif /* __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED */
|
|
|
|
#endif /* _ASM_GENERIC_HUGETLB_H */
|