summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2024-11-12Merge tag 'kvm-s390-next-6.13-1' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - second part of the ucontrol selftest - cpumodel sanity check selftest - gen17 cpumodel changes
2024-11-11KVM: s390: add gen17 facilities to CPU modelHendrik Brueckner
Add gen17 facilities and let KVM_CAP_S390_VECTOR_REGISTERS handle the enablement of the vector extension facilities. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241107152319.77816-4-brueckner@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Message-ID: <20241107152319.77816-4-brueckner@linux.ibm.com>
2024-11-11KVM: s390: add msa11 to cpu modelHendrik Brueckner
Message-security-assist 11 introduces pckmo subfunctions to encrypt hmac keys. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241107152319.77816-3-brueckner@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Message-ID: <20241107152319.77816-3-brueckner@linux.ibm.com>
2024-11-11KVM: s390: add concurrent-function facility to cpu modelHendrik Brueckner
Adding support for concurrent-functions facility which provides additional subfunctions. Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241107152319.77816-2-brueckner@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Message-ID: <20241107152319.77816-2-brueckner@linux.ibm.com>
2024-11-08Merge tag 'kvm-riscv-6.13-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini
KVM/riscv changes for 6.13 - Accelerate KVM RISC-V when running as a guest - Perf support to collect KVM guest statistics from host side
2024-11-05riscv: kvm: Fix out-of-bounds array accessBjörn Töpel
In kvm_riscv_vcpu_sbi_init() the entry->ext_idx can contain an out-of-bound index. This is used as a special marker for the base extensions, that cannot be disabled. However, when traversing the extensions, that special marker is not checked prior indexing the array. Add an out-of-bounds check to the function. Fixes: 56d8a385b605 ("RISC-V: KVM: Allow some SBI extensions to be disabled by default") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20241104191503.74725-1-bjorn@kernel.org Signed-off-by: Anup Patel <anup@brainfault.org>
2024-11-05RISC-V: KVM: Fix APLIC in_clrip and clripnum write emulationYong-Xuan Wang
In the section "4.7 Precise effects on interrupt-pending bits" of the RISC-V AIA specification defines that: "If the source mode is Level1 or Level0 and the interrupt domain is configured in MSI delivery mode (domaincfg.DM = 1): The pending bit is cleared whenever the rectified input value is low, when the interrupt is forwarded by MSI, or by a relevant write to an in_clrip register or to clripnum." Update the aplic_write_pending() to match the spec. Fixes: d8dd9f113e16 ("RISC-V: KVM: Fix APLIC setipnum_le/be write emulation") Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20241029085542.30541-1-yongxuan.wang@sifive.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Use NACL HFENCEs for KVM request based HFENCEsAnup Patel
When running under some other hypervisor, use SBI NACL based HFENCEs for TLB shoot-down via KVM requests. This makes HFENCEs faster whenever SBI nested acceleration is available. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-14-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Save trap CSRs in kvm_riscv_vcpu_enter_exit()Anup Patel
Save trap CSRs in the kvm_riscv_vcpu_enter_exit() function instead of the kvm_arch_vcpu_ioctl_run() function so that HTVAL and HTINST CSRs are accessed in more optimized manner while running under some other hypervisor. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-13-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Use SBI sync SRET call when availableAnup Patel
Implement an optimized KVM world-switch using SBI sync SRET call when SBI nested acceleration extension is available. This improves KVM world-switch when KVM RISC-V is running as a Guest under some other hypervisor. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-12-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Use nacl_csr_xyz() for accessing AIA CSRsAnup Patel
When running under some other hypervisor, prefer nacl_csr_xyz() for accessing AIA CSRs in the run-loop. This makes CSR access faster whenever SBI nested acceleration is available. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-11-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Use nacl_csr_xyz() for accessing H-extension CSRsAnup Patel
When running under some other hypervisor, prefer nacl_csr_xyz() for accessing H-extension CSRs in the run-loop. This makes CSR access faster whenever SBI nested acceleration is available. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-10-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Add common nested acceleration supportAnup Patel
Add a common nested acceleration support which will be shared by all parts of KVM RISC-V. This nested acceleration support detects and enables SBI NACL extension usage based on static keys which ensures minimum impact on the non-nested scenario. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-9-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: Add defines for the SBI nested acceleration extensionAnup Patel
Add defines for the new SBI nested acceleration extension which was ratified as part of the SBI v2.0 specification. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-8-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Don't setup SGEI for zero guest external interruptsAnup Patel
No need to setup SGEI local interrupt when there are zero guest external interrupts (i.e. zero HW IMSIC guest files). Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-7-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Replace aia_set_hvictl() with aia_hvictl_value()Anup Patel
The aia_set_hvictl() internally writes the HVICTL CSR which makes it difficult to optimize the CSR write using SBI NACL extension for kvm_riscv_vcpu_aia_update_hvip() function so replace aia_set_hvictl() with new aia_hvictl_value() which only computes the HVICTL value. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-6-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Break down the __kvm_riscv_switch_to() into macrosAnup Patel
Break down the __kvm_riscv_switch_to() function into macros so that these macros can be later re-used by SBI NACL extension based low-level switch function. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-5-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Save/restore SCOUNTEREN in C sourceAnup Patel
The SCOUNTEREN CSR need not be saved/restored in the low-level __kvm_riscv_switch_to() function hence move the SCOUNTEREN CSR save/restore to the kvm_riscv_vcpu_swap_in_guest_state() and kvm_riscv_vcpu_swap_in_host_state() functions in C sources. Also, re-arrange the CSR save/restore and related GPR usage in the low-level __kvm_riscv_switch_to() low-level function. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-4-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Save/restore HSTATUS in C sourceAnup Patel
We will be optimizing HSTATUS CSR access via shared memory setup using the SBI nested acceleration extension. To facilitate this, we first move HSTATUS save/restore in kvm_riscv_vcpu_enter_exit(). Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-3-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28RISC-V: KVM: Order the object files alphabeticallyAnup Patel
Order the object files alphabetically in the Makefile so that it is very predictable inserting new object files in the future. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-2-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28riscv: KVM: add basic support for host vs guest profilingQuan Zhou
For the information collected on the host side, we need to identify which data originates from the guest and record these events separately, this can be achieved by having KVM register perf callbacks. Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/00342d535311eb0629b9ba4f1e457a48e2abee33.1728957131.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28riscv: perf: add guest vs host distinctionQuan Zhou
Introduce basic guest support in perf, enabling it to distinguish between PMU interrupts in the host or guest, and collect fundamental information. Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/a67d527dc1b11493fe11f7f53584772fdd983744.1728957131.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-27Merge tag 'x86_urgent_for_v6.12_rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Prevent a certain range of pages which get marked as hypervisor-only, to get allocated to a CoCo (SNP) guest which cannot use them and thus fail booting - Fix the microcode loader on AMD to pay attention to the stepping of a patch and to handle the case where a BIOS config option splits the machine into logical NUMA nodes per L3 cache slice - Disable LAM from being built by default due to security concerns * tag 'x86_urgent_for_v6.12_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Ensure that RMP table fixups are reserved x86/microcode/AMD: Split load_microcode_amd() x86/microcode/AMD: Pay attention to the stepping dynamically x86/lam: Disable ADDRESS_MASKING in most cases
2024-10-25KVM: arm64: Don't mark "struct page" accessed when making SPTE youngSean Christopherson
Don't mark pages/folios as accessed in the primary MMU when making a SPTE young in KVM's secondary MMU, as doing so relies on kvm_pfn_to_refcounted_page(), and generally speaking is unnecessary and wasteful. KVM participates in page aging via mmu_notifiers, so there's no need to push "accessed" updates to the primary MMU. Dropping use of kvm_set_pfn_accessed() also paves the way for removing kvm_pfn_to_refcounted_page() and all its users. Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-84-seanjc@google.com>
2024-10-25KVM: x86/mmu: Don't mark "struct page" accessed when zapping SPTEsSean Christopherson
Don't mark pages/folios as accessed in the primary MMU when zapping SPTEs, as doing so relies on kvm_pfn_to_refcounted_page(), and generally speaking is unnecessary and wasteful. KVM participates in page aging via mmu_notifiers, so there's no need to push "accessed" updates to the primary MMU. And if KVM zaps a SPTe in response to an mmu_notifier, marking it accessed _after_ the primary MMU has decided to zap the page is likely to go unnoticed, i.e. odds are good that, if the page is being zapped for reclaim, the page will be swapped out regardless of whether or not KVM marks the page accessed. Dropping x86's use of kvm_set_pfn_accessed() also paves the way for removing kvm_pfn_to_refcounted_page() and all its users. Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-83-seanjc@google.com>
2024-10-25KVM: s390: Use kvm_release_page_dirty() to unpin "struct page" memorySean Christopherson
Use kvm_release_page_dirty() when unpinning guest pages, as the pfn was retrieved via pin_guest_page(), i.e. is guaranteed to be backed by struct page memory. This will allow dropping kvm_release_pfn_dirty() and friends. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-81-seanjc@google.com>
2024-10-25KVM: PPC: Explicitly require struct page memory for Ultravisor sharingSean Christopherson
Explicitly require "struct page" memory when sharing memory between guest and host via an Ultravisor. Given the number of pfn_to_page() calls in the code, it's safe to assume that KVM already requires that the pfn returned by gfn_to_pfn() is backed by struct page, i.e. this is likely a bug fix, not a reduction in KVM capabilities. Switching to gfn_to_page() will eventually allow removing gfn_to_pfn() and kvm_pfn_to_refcounted_page(). Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-79-seanjc@google.com>
2024-10-25KVM: arm64: Use __gfn_to_page() when copying MTE tags to/from userspaceSean Christopherson
Use __gfn_to_page() instead when copying MTE tags between guest and userspace. This will eventually allow removing gfn_to_pfn_prot(), gfn_to_pfn(), kvm_pfn_to_refcounted_page(), and related APIs. Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-78-seanjc@google.com>
2024-10-25KVM: PPC: Use kvm_vcpu_map() to map guest memory to patch dcbz instructionsSean Christopherson
Use kvm_vcpu_map() when patching dcbz in guest memory, as a regular GUP isn't technically sufficient when writing to data in the target pages. As per Documentation/core-api/pin_user_pages.rst: Correct (uses FOLL_PIN calls): pin_user_pages() write to the data within the pages unpin_user_pages() INCORRECT (uses FOLL_GET calls): get_user_pages() write to the data within the pages put_page() As a happy bonus, using kvm_vcpu_{,un}map() takes care of creating a mapping and marking the page dirty. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-75-seanjc@google.com>
2024-10-25KVM: PPC: Remove extra get_page() to fix page refcount leakSean Christopherson
Don't manually do get_page() when patching dcbz, as gfn_to_page() gifts the caller a reference. I.e. doing get_page() will leak the page due to not putting all references. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-74-seanjc@google.com>
2024-10-25KVM: MIPS: Use kvm_faultin_pfn() to map pfns into the guestSean Christopherson
Convert MIPS to kvm_faultin_pfn()+kvm_release_faultin_page(), which are new APIs to consolidate arch code and provide consistent behavior across all KVM architectures. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-73-seanjc@google.com>
2024-10-25KVM: MIPS: Mark "struct page" pfns accessed prior to dropping mmu_lockSean Christopherson
Mark pages accessed before dropping mmu_lock when faulting in guest memory so that MIPS can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-72-seanjc@google.com>
2024-10-25KVM: MIPS: Mark "struct page" pfns accessed only in "slow" page fault pathSean Christopherson
Mark pages accessed only in the slow page fault path in order to remove an unnecessary user of kvm_pfn_to_refcounted_page(). Marking pages accessed in the primary MMU during KVM page fault handling isn't harmful, but it's largely pointless and likely a waste of a cycles since the primary MMU will call into KVM via mmu_notifiers when aging pages. I.e. KVM participates in a "pull" model, so there's no need to also "push" updates. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-71-seanjc@google.com>
2024-10-25KVM: MIPS: Mark "struct page" pfns dirty only in "slow" page fault pathSean Christopherson
Mark pages/folios dirty only the slow page fault path, i.e. only when mmu_lock is held and the operation is mmu_notifier-protected, as marking a page/folio dirty after it has been written back can make some filesystems unhappy (backing KVM guests will such filesystem files is uncommon, and the race is minuscule, hence the lack of complaints). See the link below for details. Link: https://lore.kernel.org/all/cover.1683044162.git.lstoakes@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-70-seanjc@google.com>
2024-10-25KVM: LoongArch: Use kvm_faultin_pfn() to map pfns into the guestSean Christopherson
Convert LoongArch to kvm_faultin_pfn()+kvm_release_faultin_page(), which are new APIs to consolidate arch code and provide consistent behavior across all KVM architectures. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-69-seanjc@google.com>
2024-10-25KVM: LoongArch: Mark "struct page" pfn accessed before dropping mmu_lockSean Christopherson
Mark pages accessed before dropping mmu_lock when faulting in guest memory so that LoongArch can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-68-seanjc@google.com>
2024-10-25KVM: LoongArch: Mark "struct page" pfns accessed only in "slow" page fault pathSean Christopherson
Mark pages accessed only in the slow path, before dropping mmu_lock when faulting in guest memory so that LoongArch can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-67-seanjc@google.com>
2024-10-25KVM: LoongArch: Mark "struct page" pfns dirty only in "slow" page fault pathSean Christopherson
Mark pages/folios dirty only the slow page fault path, i.e. only when mmu_lock is held and the operation is mmu_notifier-protected, as marking a page/folio dirty after it has been written back can make some filesystems unhappy (backing KVM guests will such filesystem files is uncommon, and the race is minuscule, hence the lack of complaints). See the link below for details. Link: https://lore.kernel.org/all/cover.1683044162.git.lstoakes@gmail.com Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-66-seanjc@google.com>
2024-10-25KVM: PPC: Use kvm_faultin_pfn() to handle page faults on Book3s PRSean Christopherson
Convert Book3S PR to __kvm_faultin_pfn()+kvm_release_faultin_page(), which are new APIs to consolidate arch code and provide consistent behavior across all KVM architectures. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-65-seanjc@google.com>
2024-10-25KVM: PPC: Book3S: Mark "struct page" pfns dirty/accessed after installing PTESean Christopherson
Mark pages/folios dirty/accessed after installing a PTE, and more specifically after acquiring mmu_lock and checking for an mmu_notifier invalidation. Marking a page/folio dirty after it has been written back can make some filesystems unhappy (backing KVM guests will such filesystem files is uncommon, and the race is minuscule, hence the lack of complaints). See the link below for details. This will also allow converting Book3S to kvm_release_faultin_page(), which requires that mmu_lock be held (for the aforementioned reason). Link: https://lore.kernel.org/all/cover.1683044162.git.lstoakes@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-64-seanjc@google.com>
2024-10-25KVM: PPC: Drop unused @kvm_ro param from kvmppc_book3s_instantiate_page()Sean Christopherson
Drop @kvm_ro from kvmppc_book3s_instantiate_page() as it is now only written, and never read. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-63-seanjc@google.com>
2024-10-25KVM: PPC: Use __kvm_faultin_pfn() to handle page faults on Book3s RadixSean Christopherson
Replace Book3s Radix's homebrewed (read: copy+pasted) fault-in logic with __kvm_faultin_pfn(), which functionally does pretty much the exact same thing. Note, when the code was written, KVM indeed didn't do fast GUP without "!atomic && !async", but that has long since changed (KVM tries fast GUP for all writable mappings). Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-62-seanjc@google.com>
2024-10-25KVM: PPC: Use __kvm_faultin_pfn() to handle page faults on Book3s HVSean Christopherson
Replace Book3s HV's homebrewed fault-in logic with __kvm_faultin_pfn(), which functionally does pretty much the exact same thing. Note, when the code was written, KVM indeed didn't do fast GUP without "!atomic && !async", but that has long since changed (KVM tries fast GUP for all writable mappings). Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-61-seanjc@google.com>
2024-10-25KVM: RISC-V: Use kvm_faultin_pfn() when mapping pfns into the guestSean Christopherson
Convert RISC-V to __kvm_faultin_pfn()+kvm_release_faultin_page(), which are new APIs to consolidate arch code and provide consistent behavior across all KVM architectures. Opportunisticaly fix a s/priort/prior typo in the related comment. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Anup Patel <anup@brainfault.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-60-seanjc@google.com>
2024-10-25KVM: RISC-V: Mark "struct page" pfns accessed before dropping mmu_lockSean Christopherson
Mark pages accessed before dropping mmu_lock when faulting in guest memory so that RISC-V can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Marking pages accessed outside of mmu_lock is ok (not great, but safe), but marking pages _dirty_ outside of mmu_lock can make filesystems unhappy (see the link below). Do both under mmu_lock to minimize the chances of doing the wrong thing in the future. Link: https://lore.kernel.org/all/cover.1683044162.git.lstoakes@gmail.com Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Anup Patel <anup@brainfault.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-59-seanjc@google.com>
2024-10-25KVM: RISC-V: Mark "struct page" pfns dirty iff a stage-2 PTE is installedSean Christopherson
Don't mark pages dirty if KVM bails from the page fault handler without installing a stage-2 mapping, i.e. if the page is guaranteed to not be written by the guest. In addition to being a (very) minor fix, this paves the way for converting RISC-V to use kvm_release_faultin_page(). Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Anup Patel <anup@brainfault.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-58-seanjc@google.com>
2024-10-25KVM: arm64: Use __kvm_faultin_pfn() to handle memory abortsSean Christopherson
Convert arm64 to use __kvm_faultin_pfn()+kvm_release_faultin_page(). Three down, six to go. Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-57-seanjc@google.com>
2024-10-25KVM: arm64: Mark "struct page" pfns accessed/dirty before dropping mmu_lockSean Christopherson
Mark pages/folios accessed+dirty prior to dropping mmu_lock, as marking a page/folio dirty after it has been written back can make some filesystems unhappy (backing KVM guests will such filesystem files is uncommon, and the race is minuscule, hence the lack of complaints). While scary sounding, practically speaking the worst case scenario is that KVM would trigger this WARN in filemap_unaccount_folio(): /* * At this point folio must be either written or cleaned by * truncate. Dirty folio here signals a bug and loss of * unwritten data - on ordinary filesystems. * * But it's harmless on in-memory filesystems like tmpfs; and can * occur when a driver which did get_user_pages() sets page dirty * before putting it, while the inode is being finally evicted. * * Below fixes dirty accounting after removing the folio entirely * but leaves the dirty flag set: it has no effect for truncated * folio and anyway will be cleared before returning folio to * buddy allocator. */ if (WARN_ON_ONCE(folio_test_dirty(folio) && mapping_can_writeback(mapping))) folio_account_cleaned(folio, inode_to_wb(mapping->host)); KVM won't actually write memory because the stage-2 mappings are protected by the mmu_notifier, i.e. there is no risk of loss of data, even if the VM were backed by memory that needs writeback. See the link below for additional details. This will also allow converting arm64 to kvm_release_faultin_page(), which requires that mmu_lock be held (for the aforementioned reason). Link: https://lore.kernel.org/all/cover.1683044162.git.lstoakes@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-56-seanjc@google.com>
2024-10-25KVM: PPC: e500: Use __kvm_faultin_pfn() to handle page faultsSean Christopherson
Convert PPC e500 to use __kvm_faultin_pfn()+kvm_release_faultin_page(), and continue the inexorable march towards the demise of kvm_pfn_to_refcounted_page(). Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-55-seanjc@google.com>
2024-10-25KVM: PPC: e500: Mark "struct page" pfn accessed before dropping mmu_lockSean Christopherson
Mark pages accessed before dropping mmu_lock when faulting in guest memory so that shadow_map() can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Marking pages accessed outside of mmu_lock is ok (not great, but safe), but marking pages _dirty_ outside of mmu_lock can make filesystems unhappy. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-54-seanjc@google.com>