aboutsummaryrefslogtreecommitdiff
path: root/target/i386/kvm
AgeCommit message (Collapse)AuthorFilesLines
2022-03-23KVM: x86: workaround invalid CPUID[0xD,9] info on some AMD processorsPaolo Bonzini1-7/+12
Some AMD processors expose the PKRU extended save state even if they do not have the related PKU feature in CPUID. Worse, when they do they report a size of 64, whereas the expected size of the PKRU extended save state is 8, therefore the esa->size == eax assertion does not hold. The state is already ignored by KVM_GET_SUPPORTED_CPUID because it was not enabled in the host XCR0. However, QEMU kvm_cpu_xsave_init() runs before QEMU invokes arch_prctl() to enable dynamically-enabled save states such as XTILEDATA, and KVM_GET_SUPPORTED_CPUID hides save states that have yet to be enabled. Therefore, kvm_cpu_xsave_init() needs to consult the host CPUID instead of KVM_GET_SUPPORTED_CPUID, and dies with an assertion failure. When setting up the ExtSaveArea array to match the host, ignore features that KVM does not report as supported. This will cause QEMU to skip the incorrect CPUID leaf instead of tripping the assertion. Closes: https://gitlab.com/qemu-project/qemu/-/issues/916 Reported-by: Daniel P. Berrangé <berrange@redhat.com> Analyzed-by: Yang Zhong <yang.zhong@intel.com> Reported-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-23i386: Set MCG_STATUS_RIPV bit for mce SRAR errorluofei1-1/+1
In the physical machine environment, when a SRAR error occurs, the IA32_MCG_STATUS RIPV bit is set, but qemu does not set this bit. When qemu injects an SRAR error into virtual machine, the virtual machine kernel just call do_machine_check() to kill the current task, but not call memory_failure() to isolate the faulty page, which will cause the faulty page to be allocated and used repeatedly. If used by the virtual machine kernel, it will cause the virtual machine to crash Signed-off-by: luofei <luofei@unicloud.com> Message-Id: <20220120084634.131450-1-luofei@unicloud.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-23target/i386/kvm: Free xsave_buf when destroying vCPUPhilippe Mathieu-Daudé1-0/+2
Fix vCPU hot-unplug related leak reported by Valgrind: ==132362== 4,096 bytes in 1 blocks are definitely lost in loss record 8,440 of 8,549 ==132362== at 0x4C3B15F: memalign (vg_replace_malloc.c:1265) ==132362== by 0x4C3B288: posix_memalign (vg_replace_malloc.c:1429) ==132362== by 0xB41195: qemu_try_memalign (memalign.c:53) ==132362== by 0xB41204: qemu_memalign (memalign.c:73) ==132362== by 0x7131CB: kvm_init_xsave (kvm.c:1601) ==132362== by 0x7148ED: kvm_arch_init_vcpu (kvm.c:2031) ==132362== by 0x91D224: kvm_init_vcpu (kvm-all.c:516) ==132362== by 0x9242C9: kvm_vcpu_thread_fn (kvm-accel-ops.c:40) ==132362== by 0xB2EB26: qemu_thread_start (qemu-thread-posix.c:556) ==132362== by 0x7EB2159: start_thread (in /usr/lib64/libpthread-2.28.so) ==132362== by 0x9D45DD2: clone (in /usr/lib64/libc-2.28.so) Reported-by: Mark Kanda <mark.kanda@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Mark Kanda <mark.kanda@oracle.com> Message-Id: <20220322120522.26200-1-philippe.mathieu.daude@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-20target/i386: kvm: do not access uninitialized variable on older kernelsPaolo Bonzini1-4/+13
KVM support for AMX includes a new system attribute, KVM_X86_XCOMP_GUEST_SUPP. Commit 19db68ca68 ("x86: Grant AMX permission for guest", 2022-03-15) however did not fully consider the behavior on older kernels. First, it warns too aggressively. Second, it invokes the KVM_GET_DEVICE_ATTR ioctl unconditionally and then uses the "bitmask" variable, which remains uninitialized if the ioctl fails. Third, kvm_ioctl returns -errno rather than -1 on errors. While at it, explain why the ioctl is needed and KVM_GET_SUPPORTED_CPUID is not enough. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15x86: Support XFD and AMX xsave data migrationZeng Guang1-0/+18
XFD(eXtended Feature Disable) allows to enable a feature on xsave state while preventing specific user threads from using the feature. Support save and restore XFD MSRs if CPUID.D.1.EAX[4] enumerate to be valid. Likewise migrate the MSRs and related xsave state necessarily. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-8-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15x86: add support for KVM_CAP_XSAVE2 and AMX state migrationJing Liu1-15/+27
When dynamic xfeatures (e.g. AMX) are used by the guest, the xsave area would be larger than 4KB. KVM_GET_XSAVE2 and KVM_SET_XSAVE under KVM_CAP_XSAVE2 works with a xsave buffer larger than 4KB. Always use the new ioctls under KVM_CAP_XSAVE2 when KVM supports it. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Zeng Guang <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-7-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15x86: Add AMX CPUIDs enumerationJing Liu1-1/+3
Add AMX primary feature bits XFD and AMX_TILE to enumerate the CPU's AMX capability. Meanwhile, add AMX TILE and TMUL CPUID leaf and subleaves which exist when AMX TILE is present to provide the maximum capability of TILE and TMUL. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-6-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15x86: Grant AMX permission for guestYang Zhong3-6/+64
Kernel allocates 4K xstate buffer by default. For XSAVE features which require large state component (e.g. AMX), Linux kernel dynamically expands the xstate buffer only after the process has acquired the necessary permissions. Those are called dynamically- enabled XSAVE features (or dynamic xfeatures). There are separate permissions for native tasks and guests. Qemu should request the guest permissions for dynamic xfeatures which will be exposed to the guest. This only needs to be done once before the first vcpu is created. KVM implemented one new ARCH_GET_XCOMP_SUPP system attribute API to get host side supported_xcr0 and Qemu can decide if it can request dynamically enabled XSAVE features permission. https://lore.kernel.org/all/20220126152210.3044876-1-pbonzini@redhat.com/ Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Message-Id: <20220217060434.52460-4-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15x86: Fix the 64-byte boundary enumeration for extended stateJing Liu1-0/+1
The extended state subleaves (EAX=0Dh, ECX=n, n>1).ECX[1] indicate whether the extended state component locates on the next 64-byte boundary following the preceding state component when the compacted format of an XSAVE area is used. Right now, they are all zero because no supported component needed the bit to be set, but the upcoming AMX feature will use it. Fix the subleaves value according to KVM's supported cpuid. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-2-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15kvm/msi: do explicit commit when adding msi routesLongpeng(Mike)1-1/+3
We invoke the kvm_irqchip_commit_routes() for each addition to MSI route table, which is not efficient if we are adding lots of routes in some cases. This patch lets callers invoke the kvm_irqchip_commit_routes(), so the callers can decide how to optimize. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg00967.html Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20220222141116.2091-3-longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-07osdep: Move memalign-related functions to their own headerPeter Maydell1-0/+1
Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org
2022-01-12KVM: x86: ignore interrupt_bitmap field of KVM_GET/SET_SREGSPaolo Bonzini1-15/+9
This is unnecessary, because the interrupt would be retrieved and queued anyway by KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS respectively, and it makes the flow more similar to the one for KVM_GET/SET_SREGS2. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-12KVM: use KVM_{GET|SET}_SREGS2 when supported.Maxim Levitsky1-2/+106
This allows to make PDPTRs part of the migration stream and thus not reload them after migration which is against X86 spec. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211101132300.192584-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-17target/i386/kvm: Replace use of __u32 typePhilippe Mathieu-Daudé1-1/+1
QEMU coding style mandates to not use Linux kernel internal types for scalars types. Replace __u32 by uint32_t. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211116193955.2793171-1-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-11-02KVM: SVM: add migration support for nested TSC scalingMaxim Levitsky1-0/+15
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211101132300.192584-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-13target/i386/sev: Declare system-specific functions in 'sev.h'Philippe Mathieu-Daudé2-2/+1
"sysemu/sev.h" is only used from x86-specific files. Let's move it to include/hw/i386, and merge it with target/i386/sev.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-16-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-13target/i386/sev: Rename sev_i386.h -> sev.hPhilippe Mathieu-Daudé1-1/+1
SEV is a x86 specific feature, and the "sev_i386.h" header is already in target/i386/. Rename it as "sev.h" to simplify. Patch created mechanically using: $ git mv target/i386/sev_i386.h target/i386/sev.h $ sed -i s/sev_i386.h/sev.h/ $(git grep -l sev_i386.h) Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20211007161716.453984-15-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-13target/i386/kvm: Restrict SEV stubs to x86 architecturePhilippe Mathieu-Daudé2-0/+24
SEV is x86-specific, no need to add its stub to other architectures. Move the stub file to target/i386/kvm/. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-5-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-13target/i386/kvm: Introduce i386_softmmu_kvm Meson source setPhilippe Mathieu-Daudé1-1/+5
Introduce the i386_softmmu_kvm Meson source set to be able to add features dependent on CONFIG_KVM. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-4-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01i386: Make Hyper-V version id configurableVitaly Kuznetsov1-10/+16
Currently, we hardcode Hyper-V version id (CPUID 0x40000002) to WS2008R2 and it is known that certain tools in Windows check this. It seems useful to provide some flexibility by making it possible to change this info at will. CPUID information is defined in TLFS as: EAX: Build Number EBX Bits 31-16: Major Version Bits 15-0: Minor Version ECX Service Pack EDX Bits 31-24: Service Branch Bits 23-0: Service Number Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-8-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01i386: Implement pseudo 'hv-avic' ('hv-apicv') enlightenmentVitaly Kuznetsov2-1/+10
The enlightenment allows to use Hyper-V SynIC with hardware APICv/AVIC enabled. Normally, Hyper-V SynIC disables these hardware features and suggests the guest to use paravirtualized AutoEOI feature. Linux-4.15 gains support for conditional APICv/AVIC disablement, the feature stays on until the guest tries to use AutoEOI feature with SynIC. With 'HV_DEPRECATING_AEOI_RECOMMENDED' bit exposed, modern enough Windows/ Hyper-V versions should follow the recommendation and not use the (unwanted) feature. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-7-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01i386: Move HV_APIC_ACCESS_RECOMMENDED bit setting to hyperv_fill_cpuids()Vitaly Kuznetsov1-3/+6
In preparation to enabling Hyper-V + APICv/AVIC move HV_APIC_ACCESS_RECOMMENDED setting out of kvm_hyperv_properties[]: the 'real' feature bit for the vAPIC features is HV_APIC_ACCESS_AVAILABLE, HV_APIC_ACCESS_RECOMMENDED is a recommendation to use the feature which we may not always want to give. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01i386: Support KVM_CAP_HYPERV_ENFORCE_CPUIDVitaly Kuznetsov1-0/+9
By default, KVM allows the guest to use all currently supported Hyper-V enlightenments when Hyper-V CPUID interface was exposed, regardless of if some features were not announced in guest visible CPUIDs. hv-enforce-cpuid feature alters this behavior and only allows the guest to use exposed Hyper-V enlightenments. The feature is supported by Linux >= 5.14 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01i386: Support KVM_CAP_ENFORCE_PV_FEATURE_CPUIDVitaly Kuznetsov1-0/+10
By default, KVM allows the guest to use all currently supported PV features even when they were not announced in guest visible CPUIDs. Introduce a new "kvm-pv-enforce-cpuid" flag to limit the supported feature set to the exposed features. The feature is supported by Linux >= 5.10 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30memory: Name all the memory listenersPeter Xu1-1/+1
Provide a name field for all the memory listeners. It can be used to identify which memory listener is which. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210817013553.30584-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30i386: Propagate SGX CPUID sub-leafs to KVMSean Christopherson1-0/+19
The SGX sub-leafs are enumerated at CPUID 0x12. Indices 0 and 1 are always present when SGX is supported, and enumerate SGX features and capabilities. Indices >=2 are directly correlated with the platform's EPC sections. Because the number of EPC sections is dynamic and user defined, the number of SGX sub-leafs is "NULL" terminated. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-15-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30i386: kvm: Add support for exposing PROVISIONKEY to guestSean Christopherson2-0/+31
If the guest want to fully use SGX, the guest needs to be able to access provisioning key. Add a new KVM_CAP_SGX_ATTRIBUTE to KVM to support provisioning key to KVM guests. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-14-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30i386: Add feature control MSR dependency when SGX is enabledSean Christopherson1-0/+5
SGX adds multiple flags to FEATURE_CONTROL to enable SGX and Flexible Launch Control. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-12-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30i386: Add get/set/migrate support for SGX_LEPUBKEYHASH MSRsSean Christopherson1-0/+22
On real hardware, on systems that supports SGX Launch Control, those MSRs are initialized to digest of Intel's signing key; on systems that don't support SGX Launch Control, those MSRs are not available but hardware always uses digest of Intel's signing key in EINIT. KVM advertises SGX LC via CPUID if and only if the MSRs are writable. Unconditionally initialize those MSRs to digest of Intel's signing key when CPU is realized and reset to reflect the fact. This avoids potential bug in case kvm_arch_put_registers() is called before kvm_arch_get_registers() is called, in which case guest's virtual SGX_LEPUBKEYHASH MSRs will be set to 0, although KVM initializes those to digest of Intel's signing key by default, since KVM allows those MSRs to be updated by Qemu to support live migration. Save/restore the SGX Launch Enclave Public Key Hash MSRs if SGX Launch Control (LC) is exposed to the guest. Likewise, migrate the MSRs if they are writable by the guest. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-11-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-26migration: Unify failure check for migrate_add_blocker()Markus Armbruster1-3/+3
Most callers check the return value. Some check whether it set an error. Functionally equivalent, but the former tends to be easier on the eyes, so do that everywhere. Prior art: commit c6ecec43b2 "qemu-option: Check return value instead of @err where convenient". Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-26i386: Never free migration blocker objects instead of sometimesMarkus Armbruster1-3/+0
invtsc_mig_blocker has static storage duration. When a CPU with certain features is initialized, and invtsc_mig_blocker is still null, we add a migration blocker and store it in invtsc_mig_blocker. The object is freed when migrate_add_blocker() fails, leaving invtsc_mig_blocker dangling. It is not freed on later failures. Same for hv_passthrough_mig_blocker and hv_no_nonarch_cs_mig_blocker. All failures are actually fatal, so whether we free or not doesn't really matter, except as bad examples to be copied / imitated. Clean this up in a minimal way: never free these blocker objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-7-armbru@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-29i386: assert 'cs->kvm_state' is not nullVitaly Kuznetsov1-0/+14
Coverity reports potential NULL pointer dereference in get_supported_hv_cpuid_legacy() when 'cs->kvm_state' is NULL. While 'cs->kvm_state' can indeed be NULL in hv_cpuid_get_host(), kvm_hyperv_expand_features() makes sure that it only happens when KVM_CAP_SYS_HYPERV_CPUID is supported and KVM_CAP_SYS_HYPERV_CPUID implies KVM_CAP_HYPERV_CPUID so get_supported_hv_cpuid_legacy() is never really called. Add asserts to strengthen the protection against broken KVM behavior. Coverity: CID 1458243 Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210716115852.418293-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-23i386: do not call cpudef-only models functions for max, host, baseClaudio Fontana1-49/+56
Some cpu properties have to be set only for cpu models in builtin_x86_defs, registered with x86_register_cpu_model_type, and not for cpu models "base", "max", and the subclass "host". These properties are the ones set by function x86_cpu_apply_props, (also including kvm_default_props, tcg_default_props), and the "vendor" property for the KVM and HVF accelerators. After recent refactoring of cpu, which also affected these properties, they were instead set unconditionally for all x86 cpus. This has been detected as a bug with Nested on AMD with cpu "host", as svm was not turned on by default, due to the wrongful setting of kvm_default_props via x86_cpu_apply_props, which set svm to "off". Rectify the bug introduced in commit "i386: split cpu accelerators" and document the functions that are builtin_x86_defs-only. Signed-off-by: Claudio Fontana <cfontana@suse.de> Tested-by: Alexander Bulekov <alxndr@bu.edu> Fixes: f5cc5a5c ("i386: split cpu accelerators from cpu.c,"...) Resolves: https://gitlab.com/qemu-project/qemu/-/issues/477 Message-Id: <20210723112921.12637-1-cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-13i386: Hyper-V SynIC requires POST_MESSAGES/SIGNAL_EVENTS privilegesVitaly Kuznetsov2-0/+12
When Hyper-V SynIC is enabled, we may need to allow Windows guests to make hypercalls (POST_MESSAGES/SIGNAL_EVENTS). No issue is currently observed because KVM is very permissive, allowing these hypercalls regarding of guest visible CPUid bits. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-9-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-13i386: HV_HYPERCALL_AVAILABLE privilege bit is always neededVitaly Kuznetsov1-5/+5
According to TLFS, Hyper-V guest is supposed to check HV_HYPERCALL_AVAILABLE privilege bit before accessing HV_X64_MSR_GUEST_OS_ID/HV_X64_MSR_HYPERCALL MSRs but at least some Windows versions ignore that. As KVM is very permissive and allows accessing these MSRs unconditionally, no issue is observed. We may, however, want to tighten the checks eventually. Conforming to the spec is probably also a good idea. Enable HV_HYPERCALL_AVAILABLE bit unconditionally. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-8-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-13i386: kill off hv_cpuid_check_and_set()Vitaly Kuznetsov1-68/+36
hv_cpuid_check_and_set() does too much: - Checks if the feature is supported by KVM; - Checks if all dependencies are enabled; - Sets the feature bit in cpu->hyperv_features for 'passthrough' mode. To reduce the complexity, move all the logic except for dependencies check out of it. Also, in 'passthrough' mode we don't really need to check dependencies because KVM is supposed to provide a consistent set anyway. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-7-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-13i386: expand Hyper-V features during CPU feature expansion timeVitaly Kuznetsov3-4/+26
To make Hyper-V features appear in e.g. QMP query-cpu-model-expansion we need to expand and set the corresponding CPUID leaves early. Modify x86_cpu_get_supported_feature_word() to call newly intoduced Hyper-V specific kvm_hv_get_supported_cpuid() instead of kvm_arch_get_supported_cpuid(). We can't use kvm_arch_get_supported_cpuid() as Hyper-V specific CPUID leaves intersect with KVM's. Note, early expansion will only happen when KVM supports system wide KVM_GET_SUPPORTED_HV_CPUID ioctl (KVM_CAP_SYS_HYPERV_CPUID). Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-6-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-13i386: make hyperv_expand_features() return boolVitaly Kuznetsov1-19/+21
Return 'false' when hyperv_expand_features() sets an error. No functional change intended. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-5-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-13i386: hardcode supported eVMCS version to '1'Vitaly Kuznetsov1-4/+35
Currently, the only eVMCS version, supported by KVM (and described in TLFS) is '1'. When Enlightened VMCS feature is enabled, QEMU takes the supported eVMCS version range (from KVM_CAP_HYPERV_ENLIGHTENED_VMCS enablement) and puts it to guest visible CPUIDs. When (and if) eVMCS ver.2 appears a problem on migration is expected: it doesn't seem to be possible to migrate from a host supporting eVMCS ver.2 to a host, which only support eVMCS ver.1. Hardcode eVMCS ver.1 as the result of 'hv-evmcs' enablement for now. Newer eVMCS versions will have to have their own enablement options (e.g. 'hv-evmcs=2'). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210608120817.1325125-4-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-06target/i386: Populate x86_ext_save_areas offsets using cpuid where possibleDavid Edmondson2-0/+37
Rather than relying on the X86XSaveArea structure definition, determine the offset of XSAVE state areas using CPUID leaf 0xd where possible (KVM and HVF). Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210705104632.2902400-8-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06target/i386: Pass buffer and length to XSAVE helperDavid Edmondson1-6/+7
In preparation for removing assumptions about XSAVE area offsets, pass a buffer pointer and buffer length to the XSAVE helper functions. Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210705104632.2902400-5-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06target/i386: Consolidate the X86XSaveArea offset checksDavid Edmondson1-39/+0
Rather than having similar but different checks in cpu.h and kvm.c, move them all to cpu.h. Message-Id: <20210705104632.2902400-3-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25target/i386: kvm: add support for TSC scalingPaolo Bonzini1-4/+8
Linux 5.14 will add support for nested TSC scaling. Add the corresponding feature in QEMU; to keep support for existing kernels, do not add it to any processor yet. The handling of the VMCS enumeration MSR is ugly; once we have more than one case, we may want to add a table to check VMX features against. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-17i386: Add ratelimit for bus locks acquired in guestChenyi Qiang1-0/+41
A bus lock is acquired through either split locked access to writeback (WB) memory or any locked access to non-WB memory. It is typically >1000 cycles slower than an atomic operation within a cache and can also disrupts performance on other cores. Virtual Machines can exploit bus locks to degrade the performance of system. To address this kind of performance DOS attack coming from the VMs, bus lock VM exit is introduced in KVM and it can report the bus locks detected in guest. If enabled in KVM, it would exit to the userspace to let the user enforce throttling policies once bus locks acquired in VMs. The availability of bus lock VM exit can be detected through the KVM_CAP_X86_BUS_LOCK_EXIT. The returned bitmap contains the potential policies supported by KVM. The field KVM_BUS_LOCK_DETECTION_EXIT in bitmap is the only supported strategy at present. It indicates that KVM will exit to userspace to handle the bus locks. This patch adds a ratelimit on the bus locks acquired in guest as a mitigation policy. Introduce a new field "bus_lock_ratelimit" to record the limited speed of bus locks in the target VM. The user can specify it through the "bus-lock-ratelimit" as a machine property. In current implementation, the default value of the speed is 0 per second, which means no restrictions on the bus locks. As for ratelimit on detected bus locks, simply set the ratelimit interval to 1s and restrict the quota of bus lock occurence to the value of "bus_lock_ratelimit". A potential alternative is to introduce the time slice as a property which can help the user achieve more precise control. The detail of bus lock VM exit can be found in spec: https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20210521043820.29678-1-chenyi.qiang@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-06-04i386: reorder call to cpu_exec_realizefnClaudio Fontana1-2/+10
i386 realizefn code is sensitive to ordering, and recent commits aimed at refactoring it, splitting accelerator-specific code, broke assumptions which need to be fixed. We need to: * process hyper-v enlightements first, as they assume features not to be expanded * only then, expand features * after expanding features, attempt to check them and modify them in the accel-specific realizefn code called by cpu_exec_realizefn(). * after the framework has been called via cpu_exec_realizefn, the code can check for what has or hasn't been set by accel-specific code, or extend its results, ie: - check and evenually set code_urev default - modify cpu->mwait after potentially being set from host CPUID. - finally check for phys_bits assuming all user and accel-specific adjustments have already been taken into account. Fixes: f5cc5a5c ("i386: split cpu accelerators from cpu.c"...) Fixes: 30565f10 ("cpu: call AccelCPUClass::cpu_realizefn in"...) Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20210603123001.17843-2-cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-02Merge remote-tracking branch ↵Peter Maydell1-1/+1
'remotes/thuth-gitlab/tags/pull-request-2021-06-02' into staging * Update the references to some doc files (use *.rst instead of *.txt) * Bump minimum versions of some requirements after removing CentOS 7 support # gpg: Signature made Wed 02 Jun 2021 08:12:18 BST # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * remotes/thuth-gitlab/tags/pull-request-2021-06-02: configure: bump min required CLang to 6.0 / XCode 10.0 configure: bump min required GCC to 7.5.0 configure: bump min required glib version to 2.56 tests/docker: drop CentOS 7 container tests/vm: convert centos VM recipe to CentOS 8 crypto: drop used conditional check crypto: bump min gnutls to 3.5.18, dropping RHEL-7 support crypto: bump min gcrypt to 1.8.0, dropping RHEL-7 support crypto: drop back compatibility typedefs for nettle crypto: bump min nettle to 3.4, dropping RHEL-7 support patchew: move quick build job from CentOS 7 to CentOS 8 container block/ssh: Bump minimum libssh version to 0.8.7 docs: fix references to docs/devel/s390-dasd-ipl.rst docs: fix references to docs/specs/tpm.rst docs: fix references to docs/devel/build-system.rst docs: fix references to docs/devel/atomics.rst docs: fix references to docs/devel/tracing.rst Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-02docs: fix references to docs/devel/tracing.rstStefano Garzarella1-1/+1
Commit e50caf4a5c ("tracing: convert documentation to rST") converted docs/devel/tracing.txt to docs/devel/tracing.rst. We still have several references to the old file, so let's fix them with the following command: sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt) Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210517151702.109066-2-sgarzare@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-05-31i386: use global kvm_state in hyperv_enabled() checkVitaly Kuznetsov1-2/+1
There is no need to use vCPU-specific kvm state in hyperv_enabled() check and we need to do that when feature expansion happens early, before vCPU specific KVM state is created. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210422161130.652779-15-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-05-31i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's oneVitaly Kuznetsov1-4/+13
KVM_GET_SUPPORTED_HV_CPUID was made a system wide ioctl which can be called prior to creating vCPUs and we are going to use that to expand Hyper-V cpu features early. Use it when it is supported by KVM. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210422161130.652779-14-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-05-31i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array sizeVitaly Kuznetsov1-1/+2
SYNDBG leaves were recently (Linux-5.8) added to KVM but we haven't updated the expected size of KVM_GET_SUPPORTED_HV_CPUID output in KVM so we now make serveral tries before succeeding. Update the default. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210422161130.652779-13-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>