aboutsummaryrefslogtreecommitdiff
path: root/hw/vfio
AgeCommit message (Collapse)AuthorFilesLines
2025-02-11vfio: Introduce vfio_get_vfio_device()Cédric Le Goater1-0/+10
This helper will be useful in the listener handlers to extract the VFIO device from a memory region using memory_region_owner(). At the moment, we only care for PCI passthrough devices. If the need arises, we will add more. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250206131438.1505542-5-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-02-11vfio: Rephrase comment in vfio_listener_region_add() error pathCédric Le Goater1-5/+10
Rephrase a bit the ending comment about how errors are handled depending on the phase in which vfio_listener_region_add() is called. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250206131438.1505542-4-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-02-11vfio/pci: Replace "iommu_device" by "vIOMMU"Cédric Le Goater1-1/+1
This is to be consistent with other reported errors related to vIOMMU devices. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250206131438.1505542-3-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-02-11vfio/iommufd: Fix SIGSEV in iommufd_cdev_attach()Zhenzhong Duan1-2/+3
When iommufd_cdev_ram_block_discard_disable() fails for whatever reason, errp should be set or else SIGSEV is triggered in vfio_realize() when error_prepend() is called. By this chance, use the same error message for both legacy and iommufd backend. Fixes: 5ee3dc7af785 ("vfio/iommufd: Implement the iommufd backend") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20250116102307.260849-1-zhenzhong.duan@intel.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-02-11vfio/igd: use VFIOConfigMirrorQuirk for mirrored registersTomita Moeko1-94/+31
With the introduction of config_offset field, VFIOConfigMirrorQuirk can now be used for those mirrored register in igd bar0. This eliminates the need for the macro intoduced in 1a2623b5c9e7 ("vfio/igd: add macro for declaring mirrored registers"). Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20250104154219.7209-4-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-02-11vfio/pci: introduce config_offset field in VFIOConfigMirrorQuirkTomita Moeko2-1/+4
Device may only expose a specific portion of PCI config space through a region in a BAR, such behavior is seen in igd GGC and BDSM mirrors in BAR0. To handle these, config_offset is introduced to allow mirroring arbitrary region in PCI config space. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20250104154219.7209-3-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-02-11vfio/pci: declare generic quirks in a new header fileTomita Moeko2-51/+75
Declare generic vfio_generic_{window_address,window_data,mirror}_quirk in newly created pci_quirks.h so that they can be used elsewhere, like igd.c. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20250104154219.7209-2-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-02-11vfio/igd: Fix potential overflow in igd_gtt_memory_size()Cédric Le Goater1-1/+1
The risk is mainly theoretical since the applied bit mask will keep the 'ggms' shift value below 3. Nevertheless, let's use a 64 bit integer type and resolve the coverity issue. Resolves: Coverity CID 1585908 Fixes: 1e1eac5f3dcd ("vfio/igd: canonicalize memory size calculations") Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20250107130604.669697-1-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-02-10qapi: Move include/qapi/qmp/ to include/qobject/Daniel P. Berrangé1-1/+1
The general expectation is that header files should follow the same file/path naming scheme as the corresponding source file. There are various historical exceptions to this practice in QEMU, with one of the most notable being the include/qapi/qmp/ directory. Most of the headers there correspond to source files in qobject/. This patch corrects most of that inconsistency by creating include/qobject/ and moving the headers for qobject/ there. This also fixes MAINTAINERS for include/qapi/qmp/dispatch.h: scripts/get_maintainer.pl now reports "QAPI" instead of "No maintainers found". Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> #s390x Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20241118151235.2665921-2-armbru@redhat.com> [Rebased]
2025-01-09hw/pci: Use -1 as the default value for rombarAkihiko Odaki1-3/+2
vfio_pci_size_rom() distinguishes whether rombar is explicitly set to 1 by checking dev->opts, bypassing the QOM property infrastructure. Use -1 as the default value for rombar to tell if the user explicitly set it to 1. The property is also converted from unsigned to signed. -1 is signed so it is safe to give it a new meaning. The values in [2 ^ 31, 2 ^ 32) become invalid, but nobody should have typed these values by chance. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20250104-reuse-v18-13-c349eafd8673@daynix.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-12-26vfio/migration: Rename vfio_devices_all_dirty_tracking()Avihai Horon1-2/+2
vfio_devices_all_dirty_tracking() is used to check if dirty page log sync is needed. However, besides checking the dirty page tracking status, it also checks the pre_copy_dirty_page_tracking flag. Rename it to vfio_devices_log_sync_needed() which reflects its purpose more accurately and makes the code clearer as there are already several helpers with similar names. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Tested-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20241218134022.21264-5-avihaih@nvidia.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/migration: Refactor vfio_devices_all_running_and_mig_active() logicAvihai Horon2-33/+9
During DMA unmap with vIOMMU, vfio_devices_all_running_and_mig_active() is used to check whether a dirty page log sync of the unmapped pages is required. Such log sync is needed during migration pre-copy phase, and the current logic detects it by checking if migration is active and if the VFIO devices are running. However, recently there has been an effort to simplify the migration status API and reduce it to a single migration_is_running() function. To accommodate this, refactor vfio_devices_all_running_and_mig_active() logic so it won't use migration_is_active(). Do it by simply checking if dirty tracking has been started using internal VFIO flags. This should be equivalent to the previous logic as during migration dirty tracking is active and when the guest is stopped there shouldn't be DMA unmaps coming from it. As a side effect, now that migration status is no longer used, DMA unmap log syncs are untied from migration. This will make calc-dirty-rate more accurate as now it will also include VFIO dirty pages that were DMA unmapped. Also rename the function to properly reflect its new logic and extract common code from vfio_devices_all_dirty_tracking(). Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Tested-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20241218134022.21264-4-avihaih@nvidia.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/migration: Refactor vfio_devices_all_dirty_tracking() logicAvihai Horon1-1/+16
During dirty page log sync, vfio_devices_all_dirty_tracking() is used to check if dirty tracking has been started in order to avoid errors. The current logic checks if migration is in ACTIVE or DEVICE states to ensure dirty tracking has been started. However, recently there has been an effort to simplify the migration status API and reduce it to a single migration_is_running() function. To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so it won't use migration_is_active() and migration_is_device(). Instead, use internal VFIO dirty tracking flags. As a side effect, now that migration status is no longer used to detect dirty tracking status, VFIO log syncs are untied from migration. This will make calc-dirty-rate more accurate as now it will also include VFIO dirty pages. While at it, as VFIODevice->dirty_tracking is now used to detect dirty tracking status, add a comment that states how it's protected. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Tested-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20241218134022.21264-3-avihaih@nvidia.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/container: Add dirty tracking started flagAvihai Horon1-1/+11
Add a flag to VFIOContainerBase that indicates whether dirty tracking has been started for the container or not. This will be used in the following patches to allow dirty page syncs only if dirty tracking has been started. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Tested-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20241218134022.21264-2-avihaih@nvidia.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/igd: add x-igd-gms option back to set DSM region size for guestTomita Moeko1-0/+26
DSM region is likely to store framebuffer in Windows, a small DSM region may cause display issues (e.g. half of the screen is black). Since 971ca22f041b ("vfio/igd: don't set stolen memory size to zero"), the x-igd-gms option was functionally removed, QEMU uses host's original value, which is determined by DVMT Pre-Allocated option in Intel FSP of host bios. However, some vendors do not expose this config item to users. In such cases, x-igd-gms option can be used to manually set the data stolen memory size for guest. So this commit brings this option back, keeping its old behavior. When it is not specified, QEMU uses host's value. When DVMT Pre-Allocated option is available in host BIOS, user should set DSM region size there instead of using x-igd-gms option. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20241206122749.9893-11-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/igd: emulate BDSM in mmio bar0 for gen 6-10 devicesTomita Moeko1-8/+18
A recent commit in i915 driver [1] claims the BDSM register at 0x1080c0 of mmio bar0 has been there since gen 6. Mirror this register to the 32 bit BDSM register at 0x5c in pci config space for gen6-10 devices. [1] https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-7-ville.syrjala@linux.intel.com Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20241206122749.9893-10-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/igd: emulate GGC register in mmio bar0Tomita Moeko1-2/+11
The GGC register at 0x50 of pci config space is a mirror of the same register at 0x108040 of mmio bar0 [1]. i915 driver also reads that register from mmio bar0 instead of config space. As GGC is programmed and emulated by qemu, the mmio address should also be emulated, in the same way of BDSM register. [1] 4.1.28, 12th Generation Intel Core Processors Datasheet Volume 2 https://www.intel.com/content/www/us/en/content-details/655259 Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20241206122749.9893-9-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/igd: add macro for declaring mirrored registersTomita Moeko1-24/+36
igd devices have multipe registers mirroring mmio address and pci config space, more than a single BDSM register. To support this, the read/write functions are made common and a macro is defined to simplify the declaration of MemoryRegionOps. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20241206122749.9893-8-tomitamoeko@gmail.com [ clg : Fixed conversion specifier on 32-bit platform ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/igd: add Alder/Raptor/Rocket/Ice/Jasper Lake device idsTomita Moeko1-0/+5
All gen 11 and 12 igd devices have 64 bit BDSM register at 0xC0 in its config space, add them to the list to support igd passthrough on Alder/ Raptor/Rocket/Ice/Jasper Lake platforms. Tested legacy mode of igd passthrough works properly on both linux and windows guests with AlderLake-S GT1 (8086:4680). Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20241206122749.9893-7-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/igd: add Gemini Lake and Comet Lake device idsTomita Moeko1-0/+2
Both Gemini Lake and Comet Lake are gen 9 devices. Many user reports on internet shows legacy mode of igd passthrough works as qemu treats them as gen 8 devices by default before e433f208973f ("vfio/igd: return an invalid generation for unknown devices"). Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20241206122749.9893-6-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/igd: canonicalize memory size calculationsTomita Moeko1-44/+57
Add helper functions igd_gtt_memory_size() and igd_stolen_size() for calculating GTT stolen memory and Data stolen memory size in bytes, and use macros to replace the hardware-related magic numbers for better readability. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20241206122749.9893-5-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/igd: align generation with i915 kernel driverTomita Moeko1-22/+23
Define the igd device generations according to i915 kernel driver to avoid confusion, and adjust comment placement to clearly reflect the relationship between ids and devices. The condition of how GTT stolen memory size is calculated is changed accordingly as GGMS is in multiple of 2 starting from gen 8. Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20241206122749.9893-4-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/igd: remove unsupported device idsTomita Moeko1-10/+0
Since e433f208973f ("vfio/igd: return an invalid generation for unknown devices"), the default return of igd_gen() was changed to unsupported. There is no need to filter out those unsupported devices. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Link: https://lore.kernel.org/r/20241206122749.9893-3-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-26vfio/igd: fix GTT stolen memory size calculation for gen 8+Tomita Moeko1-2/+2
On gen 8 and later devices, the GTT stolen memory size when GGMS equals 0 is 0 (no preallocated memory) rather than 1MB [1]. [1] 3.1.13, 5th Generation Intel Core Processor Family Datasheet Vol. 2 https://www.intel.com/content/www/us/en/content-details/330835 Fixes: c4c45e943e51 ("vfio/pci: Intel graphics legacy mode assignment") Reported-By: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20241206122749.9893-2-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-12-21Merge tag 'exec-20241220' of https://github.com/philmd/qemu into stagingStefan Hajnoczi11-18/+18
Accel & Exec patch queue - Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander) - Add '-d invalid_mem' logging option (Zoltan) - Create QOM containers explicitly (Peter) - Rename sysemu/ -> system/ (Philippe) - Re-orderning of include/exec/ headers (Philippe) Move a lot of declarations from these legacy mixed bag headers: . "exec/cpu-all.h" . "exec/cpu-common.h" . "exec/cpu-defs.h" . "exec/exec-all.h" . "exec/translate-all" to these more specific ones: . "exec/page-protection.h" . "exec/translation-block.h" . "user/cpu_loop.h" . "user/guest-host.h" . "user/page-protection.h" # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t # wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt # KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K # A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8 # 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe/// # 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r # xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl # VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay # ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP # 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd # +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6 # x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo= # =cjz8 # -----END PGP SIGNATURE----- # gpg: Signature made Fri 20 Dec 2024 11:45:20 EST # gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE * tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits) util/qemu-timer: fix indentation meson: Do not define CONFIG_DEVICES on user emulation system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header system/numa: Remove unnecessary 'exec/cpu-common.h' header hw/xen: Remove unnecessary 'exec/cpu-common.h' header target/mips: Drop left-over comment about Jazz machine target/mips: Remove tswap() calls in semihosting uhi_fstat_cb() target/xtensa: Remove tswap() calls in semihosting simcall() helper accel/tcg: Un-inline translator_is_same_page() accel/tcg: Include missing 'exec/translation-block.h' header accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h' accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h' qemu/coroutine: Include missing 'qemu/atomic.h' header exec/translation-block: Include missing 'qemu/atomic.h' header accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h' exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined target/sparc: Move sparc_restore_state_to_opc() to cpu.c target/sparc: Uninline cpu_get_tb_cpu_state() target/loongarch: Declare loongarch_cpu_dump_state() locally user: Move various declarations out of 'exec/exec-all.h' ... Conflicts: hw/char/riscv_htif.c hw/intc/riscv_aplic.c target/s390x/cpu.c Apply sysemu header path changes to not in the pull request. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-20include: Rename sysemu/ -> system/Philippe Mathieu-Daudé11-18/+18
Headers in include/sysemu/ are not only related to system *emulation*, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>
2024-12-19Constify all opaque Property pointersRichard Henderson1-2/+2
Via sed "s/ Property [*]/ const Property */". The opaque pointers passed to ObjectProperty callbacks are the last instances of non-const Property pointers in the tree. For the most part, these callbacks only use object_field_prop_ptr, which now takes a const pointer itself. This logically should have accompanied d36f165d952 which allowed const Property to be registered. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Link: https://lore.kernel.org/r/20241218134251.4724-25-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-19include/hw/qdev-properties: Remove DEFINE_PROP_END_OF_LISTRichard Henderson4-5/+0
Now that all of the Property arrays are counted, we can remove the terminator object from each array. Update the assertions in device_class_set_props to match. With struct Property being 88 bytes, this was a rather large form of terminator. Saves 30k from qemu-system-aarch64. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Link: https://lore.kernel.org/r/20241218134251.4724-21-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-15hw/vfio: Constify all PropertyRichard Henderson4-5/+5
Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-11-18Merge tag 'pull-request-2024-11-18' of https://gitlab.com/thuth/qemu into ↵Peter Maydell1-0/+1
staging * Fixes & doc updates for the new "boot order" s390x bios feature * Provide a "loadparm" property for scsi-hd & scsi-cd devices on s390x (required for the "boot order" feature) * Fix the floating-point multiply-and-add NaN rules on s390x * Raise timeout on cross-accel build jobs to 60m # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmc7ercRHHRodXRoQHJl # ZGhhdC5jb20ACgkQLtnXdP5wLbVjyg//ZuhSDCj+oBSU6vwM7Lwh3CS6GwZvGECU # h60V3tizKypiRNtTJRXHoWcx95brXmoZgI+QQhDEXe3fFLkOEKT6AIlDhrKZRUsd # rpLPr6O8TVKO+rSE7JVJAP3X1tpOOQDxnq83uWBv53b0S+Da0VwDRtI9gcugRMmh # d58P8Q1bV344fQdcrebejstpSUG7RxSA4Plj2uSQx4mSHT7cy/hN+vA34Ha7reE3 # tcN9yfQq3Rmfvt0MV5I9Umd6JXEoDlEAwjSNsWRsCzo69jBZwiMtXSH8LyLtwRTp # C919G/MIRuhvImF74dStLVCr82sNq54YR1NP6CGcmqPH76FOH8Mx3vmx9Cxj9ckA # 6NI6SvIg++bW2O1efG2apz8p5fjbDzYXSAbHnaWTcEu3gPgH4PQ5QXoyKaDymvWV # JIh5/gXEy+twEXgIBsdWQ44A9E06lL/tNfKnqGdXK4ZYF2JIrI+Lq7AKBee7tebP # +72I4PljHLSHQ3GxdkoOeJ8ahu70IBdSz2/VEIwOWK1wIf5C5WFNBerLJyDmkyx8 # xIvIm0vlRLwPcuOC711nlaMaKqTNT+8W4DIqIY6fHs2Jy0psMdgey1uHQxYEj9Kh # fg7CvalK8n3MkGAwTqAvRJIwMFe0a4Ss6c6CaemSaYa38ud/pCNnv+IT+Eqr+mjq # 6y5PZWNrZi0= # =UaDH # -----END PGP SIGNATURE----- # gpg: Signature made Mon 18 Nov 2024 17:34:47 GMT # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * tag 'pull-request-2024-11-18' of https://gitlab.com/thuth/qemu: .gitlab-ci.d: Raise timeout on cross-accel build jobs to 60m pc-bios: Update the s390 bios images with the recent fixes pc-bios/s390-ccw: Re-initialize receive queue index before each boot attempt pc-bios/s390x: Initialize machine loadparm before probing IPL devices pc-bios/s390x: Initialize cdrom type to false for each IPL device hw: Add "loadparm" property to scsi disk devices for booting on s390x hw/s390x: Restrict "loadparm" property to devices that can be used for booting docs/system/bootindex: Make it clear that s390x can also boot from virtio-net docs/system/s390x/bootdevices: Update loadparm documentation tests/tcg/s390x: Add the floating-point multiply-and-add test target/s390x: Fix the floating-point multiply-and-add NaN rules hw/usb: Use __attribute__((packed)) vs __packed Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-11-18hw/s390x: Restrict "loadparm" property to devices that can be used for bootingThomas Huth1-0/+1
Commit bb185de423 ("s390x: Add individual loadparm assignment to CCW device") added a "loadparm" property to all CCW devices. This was a little bit unfortunate, since this property is only useful for devices that can be used for booting, but certainly it is not useful for devices like virtio-gpu or virtio-tablet. Thus let's restrict the property to CCW devices that we can boot from (i.e. virtio-block, virtio-net and vfio-ccw devices). Message-ID: <20241113114741.681096-1-thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Jared Rossi <jrossi@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-11-18vfio/container: Fix container object destructionCédric Le Goater1-1/+1
When commit 96b7af4388b3 intoduced a .instance_finalize() handler, it did not take into account that the container was not necessarily inserted into the container list of the address space. Hence, if the container object is destroyed, by calling object_unref() for example, before vfio_address_space_insert() is called, QEMU may crash when removing the container from the list as done in vfio_container_instance_finalize(). This was seen with an SEV-SNP guest for which discarding of RAM fails. To resolve this issue, use the safe version of QLIST_REMOVE(). Cc: Zhenzhong Duan <zhenzhong.duan@intel.com> Cc: Eric Auger <eric.auger@redhat.com> Fixes: 96b7af4388b3 ("vfio/container: Move vfio_container_destroy() to an instance_finalize() handler") Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-11-18vfio/igd: fix calculation of graphics stolen memoryCorvin Köhne1-1/+1
When copying the calculation of the stolen memory size for Intels integrated graphics device of gen 9 and later from the Linux kernel [1], we missed subtracting 0xf0 from the graphics mode select value for values above 0xf0. This leads to QEMU reporting a very large size of the graphics stolen memory area. That's just a waste of memory. Additionally the guest firmware might be unable to allocate such a large buffer. [1] https://github.com/torvalds/linux/blob/7c626ce4bae1ac14f60076d00eafe71af30450ba/arch/x86/kernel/early-quirks.c#L455-L460 Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Fixes: 871922416683 ("vfio/igd: correctly calculate stolen memory size for gen 9 and later") Reviewed-by: Alex Williamson <alex.williamson@redhat.com> [ clg: Changed commit subject ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-11-18vfio/igd: add pci id for Coffee LakeCorvin Köhne1-0/+3
I've tested and verified that Coffee Lake devices are working properly. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2024-11-05vfio/migration: Add vfio_save_block_precopy_empty_hit trace eventMaciej S. Szmigiero2-0/+9
This way it is clearly known when there's no more data to send for that device. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
2024-11-05vfio/migration: Add save_{iterate, complete_precopy}_start trace eventsMaciej S. Szmigiero2-0/+11
This way both the start and end points of migrating a particular VFIO device are known. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
2024-10-31migration: Drop migration_is_setup_or_active()Peter Xu1-1/+1
This helper is mostly the same as migration_is_running(), except that one has COLO reported as true, the other has CANCELLING reported as true. Per my past years experience on the state changes, none of them should matter. To make it slightly safer, report both COLO || CANCELLING to be true in migration_is_running(), then drop the other one. We kept the 1st only because the name is simpler, and clear enough. Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20241024213056.1395400-5-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-10-23vfio/helpers: Align mmapsAlex Williamson1-2/+30
Thanks to work by Peter Xu, support is introduced in Linux v6.12 to allow pfnmap insertions at PMD and PUD levels of the page table. This means that provided a properly aligned mmap, the vfio driver is able to map MMIO at significantly larger intervals than PAGE_SIZE. For example on x86_64 (the only architecture currently supporting huge pfnmaps for PUD), rather than 4KiB mappings, we can map device MMIO using 2MiB and even 1GiB page table entries. Typically mmap will already provide PMD aligned mappings, so devices with moderately sized MMIO ranges, even GPUs with standard 256MiB BARs, will already take advantage of this support. However in order to better support devices exposing multi-GiB MMIO, such as 3D accelerators or GPUs with resizable BARs enabled, we need to manually align the mmap. There doesn't seem to be a way for userspace to easily learn about PMD and PUD mapping level sizes, therefore this takes the simple approach to align the mapping to the power-of-two size of the region, up to 1GiB, which is currently the maximum alignment we care about. Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-10-23vfio/helpers: Refactor vfio_region_mmap() error handlingAlex Williamson1-17/+17
Move error handling code to the end of the function so that it can more easily be shared by new mmap failure conditions. No functional change intended. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-10-23vfio/migration: Change trace formats from hex to decimalAvihai Horon1-5/+5
Data sizes in VFIO migration trace events are printed in hex format while in migration core trace events they are printed in decimal format. This inconsistency makes it less readable when using both trace event types. Hence, change the data sizes print format to decimal in VFIO migration trace events. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-10-23vfio/migration: Report only stop-copy size in vfio_state_pending_exact()Avihai Horon1-3/+0
vfio_state_pending_exact() is used to update migration core how much device data is left for the device migration. Currently, the sum of pre-copy and stop-copy sizes of the VFIO device are reported. The pre-copy size is obtained via the VFIO_MIG_GET_PRECOPY_INFO ioctl, which returns the amount of device data available to be transferred while the device is in the PRE_COPY states. The stop-copy size is obtained via the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE ioctl, which returns the total amount of device data left to be transferred in order to complete the device migration. According to the above, current implementation is wrong -- it reports extra overlapping data because pre-copy size is already contained in stop-copy size. Fix it by reporting only stop-copy size. Fixes: eda7362af959 ("vfio/migration: Add VFIO migration pre-copy support") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-09-17vfio/igd: correctly calculate stolen memory size for gen 9 and laterCorvin Köhne1-4/+11
We have to update the calculation of the stolen memory size because we've seen devices using values of 0xf0 and above for the graphics mode select field. The new calculation was taken from the linux kernel [1]. [1] https://github.com/torvalds/linux/blob/7c626ce4bae1ac14f60076d00eafe71af30450ba/arch/x86/kernel/early-quirks.c#L455-L460 Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2024-09-17vfio/igd: don't set stolen memory size to zeroCorvin Köhne1-17/+18
The stolen memory is required for the GOP (EFI) driver and the Windows driver. While the GOP driver seems to work with any stolen memory size, the Windows driver will crash if the size doesn't match the size allocated by the host BIOS. For that reason, it doesn't make sense to overwrite the stolen memory size. It's true that this wastes some VM memory. In the worst case, the stolen memory can take up more than a GB. However, that's uncommon. Additionally, it's likely that a bunch of RAM is assigned to VMs making use of GPU passthrough. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2024-09-17vfio/igd: add ID's for ElkhartLake and TigerLakeCorvin Köhne1-0/+6
ElkhartLake and TigerLake devices were tested in legacy mode with Linux and Windows VMs. Both are working properly. It's likely that other Intel GPUs of gen 11 and 12 like IceLake device are working too. However, we're only adding known good devices for now. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2024-09-17vfio/igd: add new bar0 quirk to emulate BDSM mirrorCorvin Köhne3-0/+100
The BDSM register is mirrored into MMIO space at least for gen 11 and later devices. Unfortunately, the Windows driver reads the register value from MMIO space instead of PCI config space for those devices [1]. Therefore, we either have to keep a 1:1 mapping for the host and guest address or we have to emulate the MMIO register too. Using the igd in legacy mode is already hard due to it's many constraints. Keeping a 1:1 mapping may not work in all cases and makes it even harder to use. An MMIO emulation has to trap the whole MMIO page. This makes accesses to this page slower compared to using second level address translation. Nevertheless, it doesn't have any constraints and I haven't noticed any performance degradation yet making it a better solution. [1] https://github.com/projectacrn/acrn-hypervisor/blob/5c351bee0f6ae46250eefc07f44b4a31e770f3cf/devicemodel/hw/pci/passthrough.c#L650-L653 Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2024-09-17vfio/igd: use new BDSM register location and size for gen 11 and laterCorvin Köhne1-7/+24
Intel changed the location and size of the BDSM register for gen 11 devices and later. We have to adjust our emulation for these devices to properly support them. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2024-09-17vfio/igd: support legacy mode for all known generationsCorvin Köhne1-1/+1
We're soon going to add support for legacy mode to ElkhartLake and TigerLake devices. Those are gen 11 and 12 devices. At the moment, all devices identified by our igd_gen function do support legacy mode. This won't change when adding our new devices of gen 11 and 12. Therefore, it makes more sense to accept legacy mode for all known devices instead of maintaining a long list of known good generations. If we add a new generation to igd_gen which doesn't support legacy mode for some reason, it'll be easy to advance the check to reject legacy mode for this specific generation. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2024-09-17vfio/igd: return an invalid generation for unknown devicesCorvin Köhne1-1/+5
Intel changes it's specification quite often e.g. the location and size of the BDSM register has change for gen 11 devices and later. This causes our emulation to fail on those devices. So, it's impossible for us to use a suitable default value for unknown devices. Instead of returning a random generation value and hoping that everthing works fine, we should verify that different devices are working and add them to our list of known devices. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2024-09-17hw/vfio/pci.c: Use correct type in trace_vfio_msix_early_setup()Peter Maydell1-1/+1
The tracepoint trace_vfio_msix_early_setup() uses "int" for the type of the table_bar argument, but we use this to print a uint32_t. Coverity warns that this means that we could end up treating it as a negative number. We only use this in printing the value in the tracepoint, so mishandling it as a negative number would be harmless, but it's better to use the right type in the tracepoint. Use uint64_t to match how we print the table_offset in the vfio_msix_relo() tracepoint. Resolves: Coverity CID 1547690 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-09-13hw: Use device_class_set_legacy_reset() instead of opencodingPeter Maydell3-3/+3
Use device_class_set_legacy_reset() instead of opencoding an assignment to DeviceClass::reset. This change was produced with: spatch --macro-file scripts/cocci-macro-file.h \ --sp-file scripts/coccinelle/device-reset.cocci \ --keep-comments --smpl-spacing --in-place --dir hw Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240830145812.1967042-8-peter.maydell@linaro.org