aboutsummaryrefslogtreecommitdiff
path: root/hw/i386
AgeCommit message (Collapse)AuthorFilesLines
2024-05-22hw/i386/pc: Support smp.modules for x86 PC machineZhao Liu1-0/+1
As module-level topology support is added to X86CPU, now we can enable the support for the modules parameter on PC machines. With this support, we can define a 5-level x86 CPU topology with "-smp": -smp cpus=*,maxcpus=*,sockets=*,dies=*,modules=*,cores=*,threads=*. So, add the 5-level topology example in description of "-smp". Additionally, add the missed drawers and books options in previous example. Tested-by: Yongwei Ma <yongwei.ma@intel.com> Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Message-ID: <20240424154929.1487382-19-zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22i386/cpu: Introduce module-id to X86CPUZhao Liu1-8/+25
Introduce module-id to be consistent with the module-id field in CpuInstanceProperties. Following the legacy smp check rules, also add the module_id validity into x86_cpu_pre_plug(). Tested-by: Yongwei Ma <yongwei.ma@intel.com> Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Message-ID: <20240424154929.1487382-17-zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22i386: Support module_id in X86CPUTopoIDsZhao Liu2-10/+21
Add module_id member in X86CPUTopoIDs. module_id can be parsed from APIC ID, so also update APIC ID parsing rule to support module level. With this support, the conversions with module level between X86CPUTopoIDs, X86CPUTopoInfo and APIC ID are completed. module_id can be also generated from cpu topology, and before i386 supports "modules" in smp, the default "modules per die" (modules * clusters) is only 1, thus the module_id generated in this way is 0, so that it will not conflict with the module_id generated by APIC ID. Tested-by: Yongwei Ma <yongwei.ma@intel.com> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Message-ID: <20240424154929.1487382-16-zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22i386: Expose module level in CPUID[0x1F]Zhao Liu1-1/+1
Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology: Fix erroneous smp_num_siblings on Intel Hybrid platforms") is able to handle platforms with Module level enumerated via CPUID.1F. Expose the module level in CPUID[0x1F] if the machine has more than 1 modules. Tested-by: Yongwei Ma <yongwei.ma@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Message-ID: <20240424154929.1487382-15-zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22i386: Support modules_per_die in X86CPUTopoInfoZhao Liu1-1/+8
Support module level in i386 cpu topology structure "X86CPUTopoInfo". Since x86 does not yet support the "modules" parameter in "-smp", X86CPUTopoInfo.modules_per_die is currently always 1. Therefore, the module level width in APIC ID, which can be calculated by "apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0 for now, so we can directly add APIC ID related helpers to support module level parsing. In addition, update topology structure in test-x86-topo.c. Tested-by: Yongwei Ma <yongwei.ma@intel.com> Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Message-ID: <20240424154929.1487382-14-zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22i386: Introduce module level cpu topology to CPUX86StateZhao Liu1-0/+5
Intel CPUs implement module level on hybrid client products (e.g., ADL-N, MTL, etc) and E-core server products. A module contains a set of cores that share certain resources (in current products, the resource usually includes L2 cache, as well as module scoped features and MSRs). Module level support is the prerequisite for L2 cache topology on module level. With module level, we can implement the Guest's CPU topology and future cache topology to be consistent with the Host's on Intel hybrid client/E-core server platforms. Tested-by: Yongwei Ma <yongwei.ma@intel.com> Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Message-ID: <20240424154929.1487382-13-zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22i386/cpu: Introduce bitmap to cache available CPU topology levelsZhao Liu1-1/+4
Currently, QEMU checks the specify number of topology domains to detect if there's extended topology levels (e.g., checking nr_dies). With this bitmap, the extended CPU topology (the levels other than SMT, core and package) could be easier to detect without touching the topology details. This is also in preparation for the follow-up to decouple CPUID[0x1F] subleaf with specific topology level. Tested-by: Yongwei Ma <yongwei.ma@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240424154929.1487382-10-zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22i386/cpu: Fix i/d-cache topology to core level for Intel CPUZhao Liu1-0/+1
For i-cache and d-cache, current QEMU hardcodes the maximum IDs for CPUs sharing cache (CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) to 0, and this means i-cache and d-cache are shared in the SMT level. This is correct if there's single thread per core, but is wrong for the hyper threading case (one core contains multiple threads) since the i-cache and d-cache are shared in the core level other than SMT level. For AMD CPU, commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information for cpuid 0x8000001D") has already introduced i/d cache topology as core level by default. Therefore, in order to be compatible with both multi-threaded and single-threaded situations, we should set i-cache and d-cache be shared at the core level by default. This fix changes the default i/d cache topology from per-thread to per-core. Potentially, this change in L1 cache topology may affect the performance of the VM if the user does not specifically specify the topology or bind the vCPU. However, the way to achieve optimal performance should be to create a reasonable topology and set the appropriate vCPU affinity without relying on QEMU's default topology structure. Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently") Suggested-by: Robert Hoo <robert.hu@linux.intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20240424154929.1487382-6-zhao1.liu@intel.com> [Add compat property. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22hw/i386/pc_sysfw: Alias rather than copy isa-bios regionBernhard Beschow4-1/+13
In the -bios case the "isa-bios" memory region is an alias to the BIOS mapped to the top of the 4G memory boundary. Do the same in the -pflash case, but only for new machine versions for migration compatibility. This establishes common behavior and makes pflash commands work in the "isa-bios" region which some real-world legacy bioses rely on. Note that in the sev_enabled() case, the "isa-bios" memory region in the -pflash case will now also point to encrypted memory, just like it already does in the -bios case. When running `info mtree` before and after this commit with `qemu-system-x86_64 -S -drive \ if=pflash,format=raw,readonly=on,file=/usr/share/qemu/bios-256k.bin` and running `diff -u before.mtree after.mtree` results in the following changes in the memory tree: --- before.mtree +++ after.mtree @@ -71,7 +71,7 @@ 0000000000000000-ffffffffffffffff (prio -1, i/o): pci 00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem 00000000000c0000-00000000000dffff (prio 1, rom): pc.rom - 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios + 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff 00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff @@ -108,7 +108,7 @@ 0000000000000000-ffffffffffffffff (prio -1, i/o): pci 00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem 00000000000c0000-00000000000dffff (prio 1, rom): pc.rom - 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios + 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff 00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff @@ -131,11 +131,14 @@ memory-region: pc.ram 0000000000000000-0000000007ffffff (prio 0, ram): pc.ram +memory-region: system.flash0 + 00000000fffc0000-00000000ffffffff (prio 0, romd): system.flash0 + memory-region: pci 0000000000000000-ffffffffffffffff (prio -1, i/o): pci 00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem 00000000000c0000-00000000000dffff (prio 1, rom): pc.rom - 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios + 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff memory-region: smram 00000000000a0000-00000000000bffff (prio 0, ram): alias smram-low @pc.ram 00000000000a0000-00000000000bffff Note that in both cases the "system" memory region contains the entry 00000000fffc0000-00000000ffffffff (prio 0, romd): system.flash0 but the "system.flash0" memory region only appears standalone when "isa-bios" is an alias. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-ID: <20240508175507.22270-7-shentey@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10kconfig: express dependency of individual boards on libfdtPaolo Bonzini1-1/+2
Now that boards are enabled by default and the "CONFIG_FOO=y" entries are gone from configs/devices/, there cannot be any more a conflicts between the default contents of configs/devices/ and a failed "depends on" clause. With this change, each individual board or target can express whether it needs FDT. It can then include the common code in the build via "select DEVICE_TREE", which will also as tell meson to link with libfdt. This allows building non-microvm x86 emulators without having libfdt available. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10hw/i386: move rtc-reset-reinjection command out of hw/rtcPaolo Bonzini2-0/+47
The rtc-reset-reinjection QMP command is specific to x86, other boards do not have the ACK tracking functionality that is needed for RTC interrupt reinjection. Therefore the QMP command is only included in x86, but qmp_rtc_reset_reinjection() is implemented by hw/rtc/mc146818rtc.c and requires tracking of all created RTC devices. Move the implementation to hw/i386, so that 1) it is available even if no RTC device exist 2) the only RTC that exists is easily found in x86ms->rtc. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240509170044.190795-12-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10hw/i386: split x86.c in multiple partsPaolo Bonzini4-1050/+1110
Keep the basic X86MachineState definition in x86.c. Move out functions that are only needed by other files: x86-common.c for the pc and microvm machines, x86-cpu.c for those used by accelerator code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240509170044.190795-11-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10i386: pc: remove unnecessary MachineClass overridesPaolo Bonzini2-6/+3
There is no need to override these fields of MachineClass because they are already set to the right value in the superclass. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240509170044.190795-10-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10i386: correctly select code in hw/i386 that depends on other componentsPaolo Bonzini2-1/+3
fw_cfg.c and vapic.c are currently included unconditionally but depend on other components. vapic.c depends on the local APIC, while fw_cfg.c includes a piece of AML builder code that depends on CONFIG_ACPI. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240509170044.190795-9-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10xen: initialize legacy backends from xen_bus_init()Paolo Bonzini1-1/+0
Prepare for moving the calls to xen_be_register() under the control of xen_bus_init(), using the normal xen_backend_init() method that is used by the "modern" backends. This requires the xenstore global variable to be initialized, which is done by xen_be_init(). To ensure that everything is ready at the time the xen_backend_init() functions are called, remove the xen_be_init() function from all the boards and place it directly in xen_bus_init(). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240509170044.190795-7-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-09hw/i386/x86: Extract x86_isa_bios_init() from x86_bios_rom_init()Bernhard Beschow1-9/+16
The function is inspired by pc_isa_bios_init() and should eventually replace it. Using x86_isa_bios_init() rather than pc_isa_bios_init() fixes pflash commands to work in the isa-bios region. While at it convert the magic number 0x100000 (== 1MiB) to increase readability. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-ID: <20240508175507.22270-6-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-05-09hw/i386/x86: Don't leak "pc.bios" memory regionBernhard Beschow1-7/+6
Fix the leaking in x86_bios_rom_init() by adding a "bios" attribute to X86MachineState. Note that it is only used in the -bios case. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-ID: <20240508175507.22270-5-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-05-09hw/i386/x86: Don't leak "isa-bios" memory regionsBernhard Beschow2-9/+7
Fix the leaking in x86_bios_rom_init() and pc_isa_bios_init() by adding an "isa_bios" attribute to X86MachineState. Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-ID: <20240508175507.22270-4-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-05-08hw/i386: Have x86_bios_rom_init() take X86MachineState rather than MachineStateBernhard Beschow3-5/+5
The function creates and leaks two MemoryRegion objects regarding the BIOS which will be moved into X86MachineState in the next steps to avoid the leakage. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240430150643.111976-3-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-05-08hw/i386/x86: Eliminate two if statements in x86_bios_rom_init()Bernhard Beschow1-6/+2
Given that memory_region_set_readonly() is a no-op when the readonlyness is already as requested it is possible to simplify the pattern if (condition) { foo(true); } to foo(condition); which is shorter and allows to see the invariant of the code more easily. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240430150643.111976-2-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-05-08hw/i386: Add the possibility to use i440fx and isapc without FDCThomas Huth2-4/+4
The i440fx and the isapc machines can be used in binaries without FDC, too. We just have to make sure that they don't try to instantiate the FDC when it is not available. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240425184315.553329-4-thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-05-08hw/i386/Kconfig: Allow to compile Q35 without FDC_ISAThomas Huth1-1/+3
The q35 machine can be used without floppy disk controller (FDC), but due to our current Kconfig setup, the FDC code is still always included in the binary. To fix this, the "PC" config option should only imply the "FDC_ISA" instead of always selecting it. The i440fx and the isa-pc machine currently always instantiate the FDC, so we have to add the select statements now there instead. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240425184315.553329-3-thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-05-08hw/i386/pc: Allow to compile without CONFIG_FDC_ISAThomas Huth1-4/+9
The q35 machine can work without FDC. But to be able to also link a QEMU binary that does not include the FDC code, we have to make it possible to disable the spots that call into the FDC code. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240425184315.553329-2-thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-05-07target/i386: Fix CPUID encoding of Fn8000001E_ECXBabu Moger1-0/+1
Observed the following failure while booting the SEV-SNP guest and the guest fails to boot with the smp parameters: "-smp 192,sockets=1,dies=12,cores=8,threads=2". qemu-system-x86_64: sev_snp_launch_update: SNP_LAUNCH_UPDATE ret=-5 fw_error=22 'Invalid parameter' qemu-system-x86_64: SEV-SNP: CPUID validation failed for function 0x8000001e, index: 0x0. provided: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000b00, edx: 0x00000000 expected: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000300, edx: 0x00000000 qemu-system-x86_64: SEV-SNP: failed update CPUID page Reason for the failure is due to overflowing of bits used for "Node per processor" in CPUID Fn8000001E_ECX. This field's width is 3 bits wide and can hold maximum value 0x7. With dies=12 (0xB), it overflows and spills over into the reserved bits. In the case of SEV-SNP, this causes CPUID enforcement failure and guest fails to boot. The PPR documentation for CPUID_Fn8000001E_ECX [Node Identifiers] ================================================================= Bits Description 31:11 Reserved. 10:8 NodesPerProcessor: Node per processor. Read-only. ValidValues: Value Description 0h 1 node per processor. 7h-1h Reserved. 7:0 NodeId: Node ID. Read-only. Reset: Fixed,XXh. ================================================================= As in the spec, the valid value for "node per processor" is 0 and rest are reserved. Looking back at the history of decoding of CPUID_Fn8000001E_ECX, noticed that there were cases where "node per processor" can be more than 1. It is valid only for pre-F17h (pre-EPYC) architectures. For EPYC or later CPUs, the linux kernel does not use this information to build the L3 topology. Also noted that the CPUID Function 0x8000001E_ECX is available only when TOPOEXT feature is enabled. This feature is enabled only for EPYC(F17h) or later processors. So, previous generation of processors do not not enumerate 0x8000001E_ECX leaf. There could be some corner cases where the older guests could enable the TOPOEXT feature by running with -cpu host, in which case legacy guests might notice the topology change. To address those cases introduced a new CPU property "legacy-multi-node". It will be true for older machine types to maintain compatibility. By default, it will be false, so new decoding will be used going forward. The documentation is taken from Preliminary Processor Programming Reference (PPR) for AMD Family 19h Model 11h, Revision B1 Processors 55901 Rev 0.25 - Oct 6, 2022. Cc: qemu-stable@nongnu.org Fixes: 31ada106d891 ("Simplify CPUID_8000_001E for AMD") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Message-ID: <0ee4b0a8293188a53970a2b0e4f4ef713425055e.1714757834.git.babu.moger@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-03i386: switch boards to "default y"Paolo Bonzini1-1/+9
Some targets use "default y" for boards to filter out those that require TCG. For consistency we are switching all other targets to do the same. Continue with i386. No changes to generated config-devices.mak files, other than adding CONFIG_I386 to the x86_64-softmmu target. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-25hw/i386/pc_sysfw: Remove unused parameter from pc_isa_bios_init()Bernhard Beschow1-3/+2
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240422200625.2768-2-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-04-23Merge tag 'migration-20240423-pull-request' of ↵Richard Henderson1-2/+3
https://gitlab.com/peterx/qemu into staging Migration pull for 9.1 - Het's new test cases for "channels" - Het's fix for a typo for vsock parsing - Cedric's VFIO error report series - Cedric's one more patch for dirty-bitmap error reports - Zhijian's rdma deprecation patch - Yuan's zeropage optimization to fix double faults on anon mem - Zhijian's COLO fix on a crash # -----BEGIN PGP SIGNATURE----- # # iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZig4HxIccGV0ZXJ4QHJl # ZGhhdC5jb20ACgkQO1/MzfOr1wbQiwD/V5nSJzSuAG4Ra1Fjo+LRG2TT6qk8eNCi # fIytehSw6cYA/0wqarxOF0tr7ikeyhtG3w4xFf44kk6KcPkoVSl1tqoL # =pJmQ # -----END PGP SIGNATURE----- # gpg: Signature made Tue 23 Apr 2024 03:37:19 PM PDT # gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706 # gpg: issuer "peterx@redhat.com" # gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown] # gpg: aka "Peter Xu <peterx@redhat.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706 * tag 'migration-20240423-pull-request' of https://gitlab.com/peterx/qemu: (26 commits) migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed. migration/multifd: solve zero page causing multiple page faults migration: Add Error** argument to add_bitmaps_to_list() migration: Modify ram_init_bitmaps() to report dirty tracking errors migration: Add Error** argument to xbzrle_init() migration: Add Error** argument to ram_state_init() memory: Add Error** argument to the global_dirty_log routines migration: Introduce ram_bitmaps_destroy() memory: Add Error** argument to .log_global_start() handler migration: Add Error** argument to .load_setup() handler migration: Add Error** argument to .save_setup() handler migration: Add Error** argument to qemu_savevm_state_setup() migration: Add Error** argument to vmstate_save() migration: Always report an error in ram_save_setup() migration: Always report an error in block_save_setup() vfio: Always report an error in vfio_save_setup() s390/stattrib: Add Error** argument to set_migrationmode() handler tests/qtest/migration: Fix typo for vsock in SocketAddress_to_str tests/qtest/migration: Add negative tests to validate migration QAPIs tests/qtest/migration: Add multifd_tcp_plain test using list of channels instead of uri ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-04-23memory: Add Error** argument to the global_dirty_log routinesCédric Le Goater1-1/+1
Now that the log_global*() handlers take an Error** parameter and return a bool, do the same for memory_global_dirty_log_start() and memory_global_dirty_log_stop(). The error is reported in the callers for now and it will be propagated in the call stack in the next changes. To be noted a functional change in ram_init_bitmaps(), if the dirty pages logger fails to start, there is no need to synchronize the dirty pages bitmaps. colo_incoming_start_dirty_log() could be modified in a similar way. Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Paul Durrant <paul@xen.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hyman Huang <yong.huang@smartx.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Acked-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-12-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-04-23memory: Add Error** argument to .log_global_start() handlerCédric Le Goater1-1/+2
Modify all .log_global_start() handlers to take an Error** parameter and return a bool. Adapt memory_global_dirty_log_start() to interrupt on the first error the loop on handlers. In such case, a rollback is performed to stop dirty logging on all listeners where it was previously enabled. Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Paul Durrant <paul@xen.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-10-clg@redhat.com [peterx: modify & enrich the comment for listener_add_address_space() ] Signed-off-by: Peter Xu <peterx@redhat.com>
2024-04-23hw/i386/sev: Use legacy SEV VM types for older machine typesMichael Roth1-0/+1
Newer 9.1 machine types will default to using the KVM_SEV_INIT2 API for creating SEV/SEV-ES going forward. However, this API results in guest measurement changes which are generally not expected for users of these older guest types and can cause disruption if they switch to a newer QEMU/kernel version. Avoid this by continuing to use the older KVM_SEV_INIT/KVM_SEV_ES_INIT APIs for older machine types. Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240409230743.962513-4-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23target/i386: Implement mc->kvm_type() to get VM typePaolo Bonzini1-0/+11
KVM is introducing a new API to create confidential guests, which will be used by TDX and SEV-SNP but is also available for SEV and SEV-ES. The API uses the VM type argument to KVM_CREATE_VM to identify which confidential computing technology to use. Since there are no other expected uses of VM types, delegate mc->kvm_type() for x86 boards to the confidential-guest-support object pointed to by ms->cgs. For example, if a sev-guest object is specified to confidential-guest-support, like, qemu -machine ...,confidential-guest-support=sev0 \ -object sev-guest,id=sev0,... it will check if a VM type KVM_X86_SEV_VM or KVM_X86_SEV_ES_VM is supported, and if so use them together with the KVM_SEV_INIT2 function of the KVM_MEMORY_ENCRYPT_OP ioctl. If not, it will fall back to KVM_SEV_INIT and KVM_SEV_ES_INIT. This is a preparatory work towards TDX and SEV-SNP support, but it will also enable support for VMSA features such as DebugSwap, which are only available via KVM_SEV_INIT2. Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23linux-headers: update to current kvm/nextPaolo Bonzini1-8/+0
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23hw/i386/acpi: Set PCAT_COMPAT bit only when pic is not disabledXiaoyao Li1-1/+3
A value 1 of PCAT_COMPAT (bit 0) of MADT.Flags indicates that the system also has a PC-AT-compatible dual-8259 setup, i.e., the PIC. When PIC is not enabled (pic=off) for x86 machine, the PCAT_COMPAT bit needs to be cleared. The PIC probe should then print: [ 0.155970] Using NULL legacy PIC However, no such log printed in guest kernel unless PCAT_COMPAT is cleared. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240403145953.3082491-1-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23q35: Introduce smm_ranges property for q35-pci-hostIsaku Yamahata1-0/+2
Add a q35 property to check whether or not SMM ranges, e.g. SMRAM, TSEG, etc... exist for the target platform. TDX doesn't support SMM and doesn't play nice with QEMU modifying related guest memory ranges. Signed-off-by: Isaku Yamahata <isaku.yamahata@linux.intel.com> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240320083945.991426-19-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-18target/i386: add guest-phys-bits cpu propertyGerd Hoffmann1-1/+3
Allows to set guest-phys-bits (cpuid leaf 80000008, eax[23:16]) via -cpu $model,guest-phys-bits=$nr. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-ID: <20240318155336.156197-3-kraxel@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-18hw: Add compat machines for 9.1Paolo Bonzini3-5/+29
Add 9.1 machine types for arm/i440fx/m68k/q35/s390x/spapr. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Cc: Gavin Shan <gshan@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-03Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into stagingPeter Maydell1-1/+0
* lsi53c895a: fix assertion failure with invalid Block Move * vga: fix assertion failure with 4- and 16-color modes * remove unnecessary assignment # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmYNKboUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroNMDgf/Wgw+qNkNooAhEH1V5l0xdyiF4QQU # stz4kcKdWkQB5dsVy8utC3nN2baRFPgj6Utr2e8FqzxGuY8qYL3olh8k1ygiFiFz # joSOxAlBuRUOsJq90EJUyGeFykJ/F/neJ2n6VjOtKyry9c8PnInjmuNMFYsxeLow # j1VF6defALut/8wvxPm5WmfFzS1Hv3I9k/GqKSlAjNpY2COlibshEoNFuZZtpfeI # JnUL5oB+sICoZH2/mM5a9Nv2z0NCHAwKF7alXVjfHWvdaRQO6bLlraDmPXmh0ZMY # MsoULMQaeZCtC0vfc8XJZj/C/s2iO14gfqA23/mfGCLalyo7l1yh4e6JyQ== # =xDOl # -----END PGP SIGNATURE----- # gpg: Signature made Wed 03 Apr 2024 11:04:42 BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: pc_q35: remove unnecessary m->alias assignment lsi53c895a: avoid out of bounds access to s->msg[] vga: do not treat horiz pel panning value of 8 as "enabled" vga: adjust dirty memory region if pel panning is active vga: move computation of dirty memory region later vga: merge conditionals on shift control register Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-04-02pc_q35: remove unnecessary m->alias assignmentPaolo Bonzini1-1/+0
The assignment is already inherited from pc-q35-8.2. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-02hw/xen_evtchn: Initialize flush_kvm_routesArtem Chernyshev1-1/+1
In xen_evtchn_soft_reset() variable flush_kvm_routes can be used before being initialized. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Oleg Sviridov <oleg.sviridov@red-soft.ru> Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240329113939.257033-1-artem.chernyshev@red-soft.ru> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-04-02hw/i386/pc: Restrict CXL to PCI-based machinesPhilippe Mathieu-Daudé1-1/+3
CXL is based on PCIe. In is pointless to initialize its context on non-PCI machines. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-ID: <20240327161642.33574-1-philmd@linaro.org>
2024-03-18pc/q35: set SMBIOS entry point type to 'auto' by defaultIgor Mammedov3-1/+8
Use smbios-entry-point-type='auto' for newer machine types as a workaround for Windows not detecting SMBIOS tables. Which makes QEMU pick SMBIOS tables based on configuration (with 2.x preferred and fallback to 3.x if the former isn't compatible with configuration) Default compat setting of smbios-entry-point-type after series for pc/q35 machines: * 9.0-newer: 'auto' * 8.1-8.2: '64' * 8.0-older: '32' Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2008 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Message-Id: <20240314152302.2324164-20-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-18smbios: get rid of global smbios_ep_typeIgor Mammedov3-6/+7
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Message-Id: <20240314152302.2324164-14-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-18smbios: handle errors consistentlyIgor Mammedov1-1/+2
Current code uses mix of error_report()+exit(1) and error_setg() to handle errors. Use newer error_setg() everywhere, beside consistency it will allow to detect error condition without killing QEMU and attempt switch-over to SMBIOS3.x tables/entrypoint in follow up patch. while at it, clear smbios_tables pointer after freeing. that will avoid double free if smbios_get_tables() is called multiple times. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20240314152302.2324164-13-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-18smbios: build legacy mode code only for 'pc' machineIgor Mammedov1-0/+1
basically moving code around without functional change. And exposing some symbols so that they could be shared between smbbios.c and new smbios_legacy.c plus some meson magic to build smbios_legacy.c only for 'pc' machine and otherwise replace it with stub if not selected. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20240314152302.2324164-12-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-18smbios: don't check type4 structures in legacy modeIgor Mammedov1-2/+1
legacy mode doesn't support structures of type 2 and more, and CLI has a check for '-smbios type' option, however it's still possible to sneak in type4 as a blob with '-smbios file' option. However doing the later makes SMBIOS tables broken since SeaBIOS doesn't expect that. Rather than trying to add support for type4 to legacy code (both QEMU and SeaBIOS), simplify smbios_get_table_legacy() by dropping not relevant check in legacy code and error out on type4 blob. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Message-Id: <20240314152302.2324164-9-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-18smbios: get rid of smbios_legacy globalIgor Mammedov1-3/+4
clean up smbios_set_defaults() which is reused by legacy and non legacy machines from being aware of 'legacy' notion and need to turn it off. And push legacy handling up to PC machine code where it's relevant. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Acked-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Message-Id: <20240314152302.2324164-7-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-18smbios: get rid of smbios_smp_sockets globalIgor Mammedov1-1/+1
it makes smbios_validate_table() independent from smbios_smp_sockets global, which in turn lets smbios_get_tables() avoid using not related legacy code. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Message-Id: <20240314152302.2324164-6-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-18smbios: cleanup smbios_get_tables() from legacy handlingIgor Mammedov1-0/+1
smbios_get_tables() bails out right away if leagacy mode is enabled and won't generate any SMBIOS tables. At the same time x86 specific fw_cfg_build_smbios() will genarate legacy tables and then proceed to preparing temporary mem_array for useless call to smbios_get_tables() and then discard it. Drop legacy related check in smbios_get_tables() and return from fw_cfg_build_smbios() early if legacy tables where built without proceeding to non legacy part of the function. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Message-Id: <20240314152302.2324164-5-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-13Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵Peter Maydell5-32/+35
into staging virtio,pc,pci: features, cleanups, fixes more memslots support in libvhost-user support PCIe Gen5/Gen6 link speeds in pcie more traces in vdpa network simulation devices support in vdpa SMBIOS type 9 descriptor implementation Bump max_cpus to 4096 vcpus in q35 aw-bits and granule options in VIRTIO-IOMMU Support report NUMA nodes for device memory using GI in acpi Beginning of shutdown event support in pvpanic fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmXw0TMPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRp8x4H+gLMoGwaGAX7gDGPgn2Ix4j/3kO77ZJ9X9k/ # 1KqZu/9eMS1j2Ei+vZqf05w7qRjxxhwDq3ilEXF/+UFqgAehLqpRRB8j5inqvzYt # +jv0DbL11PBp/oFjWcytm5CbiVsvq8KlqCF29VNzc162XdtcduUOWagL96y8lJfZ # uPrOoyeR7SMH9lp3LLLHWgu+9W4nOS03RroZ6Umj40y5B7yR0Rrppz8lMw5AoQtr # 0gMRnFhYXeiW6CXdz+Tzcr7XfvkkYDi/j7ibiNSURLBfOpZa6Y8+kJGKxz5H1K1G # 6ZY4PBcOpQzl+NMrktPHogczgJgOK10t+1i/R3bGZYw2Qn/93Eg= # =C0UU # -----END PGP SIGNATURE----- # gpg: Signature made Tue 12 Mar 2024 22:03:31 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (68 commits) docs/specs/pvpanic: document shutdown event hw/cxl: Fix missing reserved data in CXL Device DVSEC hmat acpi: Fix out of bounds access due to missing use of indirection hmat acpi: Do not add Memory Proximity Domain Attributes Structure targetting non existent memory. qemu-options.hx: Document the virtio-iommu-pci aw-bits option hw/arm/virt: Set virtio-iommu aw-bits default value to 48 hw/i386/q35: Set virtio-iommu aw-bits default value to 39 virtio-iommu: Add an option to define the input range width virtio-iommu: Trace domain range limits as unsigned int qemu-options.hx: Document the virtio-iommu-pci granule option virtio-iommu: Change the default granule to the host page size virtio-iommu: Add a granule property hw/i386/acpi-build: Add support for SRAT Generic Initiator structures hw/acpi: Implement the SRAT GI affinity structure qom: new object to associate device to NUMA node hw/i386/pc: Inline pc_cmos_init() into pc_cmos_init_late() and remove it hw/i386/pc: Set "normal" boot device order in pc_basic_device_init() hw/i386/pc: Avoid one use of the current_machine global hw/i386/pc: Remove "rtc_state" link again Revert "hw/i386/pc: Confine system flash handling to pc_sysfw" ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # hw/core/machine.c
2024-03-12hw/i386/q35: Set virtio-iommu aw-bits default value to 39Eric Auger1-0/+9
Currently the default input range can extend to 64 bits. On x86, when the virtio-iommu protects vfio devices, the physical iommu may support only 39 bits. Let's set the default to 39, as done for the intel-iommu. We use hw_compat_8_2 to handle the compatibility for machines before 9.0 which used to have a virtio-iommu default input range of 64 bits. Of course if aw-bits is set from the command line, the default is overriden. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240307134445.92296-8-eric.auger@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>