aboutsummaryrefslogtreecommitdiff
path: root/hw/vfio/common.c
AgeCommit message (Collapse)AuthorFilesLines
2024-04-23memory: Add Error** argument to .log_global_start() handlerCédric Le Goater1-1/+3
Modify all .log_global_start() handlers to take an Error** parameter and return a bool. Adapt memory_global_dirty_log_start() to interrupt on the first error the loop on handlers. In such case, a rollback is performed to stop dirty logging on all listeners where it was previously enabled. Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Paul Durrant <paul@xen.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-10-clg@redhat.com [peterx: modify & enrich the comment for listener_add_address_space() ] Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-12Merge tag 'migration-20240311-pull-request' of ↵Peter Maydell1-13/+4
https://gitlab.com/peterx/qemu into staging Migration pull request - Avihai's fix to allow vmstate iterators to not starve for VFIO - Maksim's fix on additional check on precopy load error - Fabiano's fix on fdatasync() hang in mapped-ram - Jonathan's fix on vring cached access over MMIO regions - Cedric's cleanup patches 1-4 out of his error report series - Yu's fix for RDMA migration (which used to be broken even for 8.2) - Anthony's small cleanup/fix on err message - Steve's patches on privatize migration.h - Xiang's patchset to enable zero page detections in multifd threads # -----BEGIN PGP SIGNATURE----- # # iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZe9+uBIccGV0ZXJ4QHJl # ZGhhdC5jb20ACgkQO1/MzfOr1wamaQD/SvmpMEcuRndT9LPSxzXowAGDZTBpYUfv # 5XAbx80dS9IBAO8PJJgQJIBHBeacyLBjHP9CsdVtgw5/VW+wCsbfV4AB # =xavb # -----END PGP SIGNATURE----- # gpg: Signature made Mon 11 Mar 2024 21:59:20 GMT # gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706 # gpg: issuer "peterx@redhat.com" # gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal] # gpg: aka "Peter Xu <peterx@redhat.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706 * tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu: (34 commits) migration/multifd: Add new migration test cases for legacy zero page checking. migration/multifd: Enable multifd zero page checking by default. migration/multifd: Implement ram_save_target_page_multifd to handle multifd version of MigrationOps::ram_save_target_page. migration/multifd: Implement zero page transmission on the multifd thread. migration/multifd: Add new migration option zero-page-detection. migration/multifd: Allow clearing of the file_bmap from multifd migration/multifd: Allow zero pages in file migration migration: purge MigrationState from public interface migration: delete unused accessors migration: privatize colo interfaces migration: migration_file_set_error migration: migration_is_device migration: migration_thread_is_self migration: export vcpu_dirty_limit_period migration: export migration_is_running migration: export migration_is_active migration: export migration_is_setup_or_active migration: remove migration.h references migration: export fewer options migration: Fix format in error message ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-03-11migration: migration_file_set_errorSteve Sistare1-8/+1
Define and export migration_file_set_error to eliminate a dependency on MigrationState. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179338-294359-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration: migration_is_deviceSteve Sistare1-3/+1
Define and export migration_is_device to eliminate a dependency on MigrationState. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179338-294359-8-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration: export migration_is_activeSteve Sistare1-2/+2
Delete the MigrationState parameter from migration_is_active so it can be exported and used without including migration.h. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179338-294359-4-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration: export migration_is_setup_or_activeSteve Sistare1-1/+1
Delete the MigrationState parameter from migration_is_setup_or_active and move it to the public API in misc.h. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/1710179338-294359-3-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-08vfio: allow cpr-reboot migration if suspendedSteve Sistare1-1/+1
Allow cpr-reboot for vfio if the guest is in the suspended runstate. The guest drivers' suspend methods flush outstanding requests and re-initialize the devices, and thus there is no device state to save and restore. The user is responsible for suspending the guest before initiating cpr, such as by issuing guest-suspend-ram to the qemu guest agent. Relax the vfio blocker so it does not apply to cpr, and add a notifier that verifies the guest is suspended. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-01-29vfio: use matching sizeof typePaolo Bonzini1-1/+1
Do not use uint64_t for the type of the declaration and __u64 when computing the number of elements in the array. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-01-18remove unnecessary casts from uintptr_tPaolo Bonzini1-2/+2
uintptr_t, or unsigned long which is equivalent on Linux I32LP64 systems, is an unsigned type and there is no need to further cast to __u64 which is another unsigned integer type; widening casts from unsigned integers zero-extend the value. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-05hw/vfio: fix iteration over global VFIODevice listVolker Rümelin1-4/+4
Commit 3d779abafe ("vfio/common: Introduce a global VFIODevice list") introduced a global VFIODevice list, but forgot to update the list element field name when iterating over the new list. Change the code to use the correct list element field. Fixes: 3d779abafe ("vfio/common: Introduce a global VFIODevice list") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2061 Signed-off-by: Volker Rümelin <vr_qemu@t-online.de> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-01-05vfio/iommufd: Remove CONFIG_IOMMUFD usageCédric Le Goater1-3/+0
Availability of the IOMMUFD backend can now be fully determined at runtime and the ifdef check was a build time protection (for PPC not supporting it mostly). Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05vfio/iommufd: Introduce a VFIOIOMMU iommufd QOM interfaceCédric Le Goater1-1/+1
As previously done for the sPAPR and legacy IOMMU backends, convert the VFIOIOMMUOps struct to a QOM interface. The set of of operations for this backend can be referenced with a literal typename instead of a C struct. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05vfio/container: Introduce a VFIOIOMMU legacy QOM interfaceCédric Le Goater1-1/+5
Convert the legacy VFIOIOMMUOps struct to the new VFIOIOMMU QOM interface. The set of of operations for this backend can be referenced with a literal typename instead of a C struct. This will simplify support of multiple backends. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-01-05vfio/container: Introduce a VFIOIOMMU QOM interfaceCédric Le Goater1-1/+1
VFIOContainerBase was not introduced as an abstract QOM object because it felt unnecessary to expose all the IOMMU backends to the QEMU machine and human interface. However, we can still abstract the IOMMU backend handlers using a QOM interface class. This provides more flexibility when referencing the various implementations. Simply transform the VFIOIOMMUOps struct in an InterfaceClass and do some initial name replacements. Next changes will start converting VFIOIOMMUOps. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio: Make VFIOContainerBase poiner parameter const in VFIOIOMMUOps callbacksZhenzhong Duan1-4/+5
Some of the callbacks in VFIOIOMMUOps pass VFIOContainerBase poiner, those callbacks only need read access to the sub object of VFIOContainerBase. So make VFIOContainerBase, VFIOContainer and VFIOIOMMUFDContainer as const in these callbacks. Local functions called by those callbacks also need same changes to avoid build error. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/iommufd: Implement the iommufd backendYi Liu1-0/+6
The iommufd backend is implemented based on the new /dev/iommu user API. This backend obviously depends on CONFIG_IOMMUFD. So far, the iommufd backend doesn't support dirty page sync yet. Co-authored-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/common: return early if space isn't emptyZhenzhong Duan1-3/+6
This is a trivial optimization. If there is active container in space, vfio_reset_handler will never be unregistered. So revert the check of space->containers and return early. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/spapr: switch to spapr IOMMU BE add/del_section_windowZhenzhong Duan1-6/+2
No functional change intended. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Implement attach/detach_deviceEric Auger1-0/+16
No functional change intended. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Move iova_ranges to base containerZhenzhong Duan1-2/+3
Meanwhile remove the helper function vfio_free_container as it only calls g_free now. No functional change intended. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Move listener to base containerEric Auger1-55/+55
Move listener to base container. Also error and initialized fields are moved at the same time. No functional change intended. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Move vrdl_list to base containerZhenzhong Duan1-19/+19
No functional change intended. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Move pgsizes and dma_max_mappings to base containerEric Auger1-8/+9
No functional change intended. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Convert functions to base containerEric Auger1-24/+18
In the prospect to get rid of VFIOContainer refs in common.c lets convert misc functions to use the base container object instead: vfio_devices_all_dirty_tracking vfio_devices_all_device_dirty_tracking vfio_devices_all_running_and_mig_active vfio_devices_query_dirty_bitmap vfio_get_dirty_bitmap Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Move per container device list in base containerZhenzhong Duan1-8/+15
VFIO Device is also changed to point to base container instead of legacy container. No functional change intended. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Switch to IOMMU BE ↵Eric Auger1-4/+8
set_dirty_page_tracking/query_dirty_bitmap API dirty_pages_supported field is also moved to the base container No functional change intended. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Move space field to base containerEric Auger1-2/+2
Move the space field to the base object. Also the VFIOAddressSpace now contains a list of base containers. No functional change intended. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/common: Move giommu_list in base containerEric Auger1-6/+11
Move the giommu_list field in the base container and store the base container in the VFIOGuestIOMMU. No functional change intended. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Switch to dma_map|unmap APIEric Auger1-20/+25
No functional change intended. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-11-06vfio/common: Move vfio_host_win_add/del into spapr.cZhenzhong Duan1-68/+2
Only spapr supports a customed host window list, other vfio driver assume 64bit host window. So remove the check in listener callback and move vfio_host_win_add/del into spapr.c and make it static. With the check removed, we still need to do the same check for VFIO_SPAPR_TCE_IOMMU which allows a single host window range [dma32_window_start, dma32_window_size). Move vfio_find_hostwin into spapr.c and do same check in vfio_container_add_section_window instead. When mapping a ram device section, if it's unaligned with hostwin->iova_pgsizes, this mapping is bypassed. With hostwin moved into spapr, we changed to check container->pgsizes. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-11-06vfio/container: Move IBM EEH related functions into spapr_pci_vfio.cZhenzhong Duan1-1/+0
With vfio_eeh_as_ok/vfio_eeh_as_op moved and made static, vfio.h becomes empty and is deleted. No functional changes intended. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-11-03vfio: Collect container iova range infoEric Auger1-0/+9
Collect iova range information if VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE capability is supported. This allows to propagate the information though the IOMMU MR set_iova_ranges() callback so that virtual IOMMUs get aware of those aperture constraints. This is only done if the info is available and the number of iova ranges is greater than 0. A new vfio_get_info_iova_range helper is introduced matching the coding style of existing vfio_get_info_dma_avail. The boolean returned value isn't used though. Code is aligned between both. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Yanghang Liu <yanghliu@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-20migration: simplify blockersSteve Sistare1-8/+2
Modify migrate_add_blocker and migrate_del_blocker to take an Error ** reason. This allows migration to own the Error object, so that if an error occurs in migrate_add_blocker, migration code can free the Error and clear the client handle, simplifying client code. It also simplifies the migrate_del_blocker call site. In addition, this is a pre-requisite for a proposed future patch that would add a mode argument to migration requests to support live update, and maintain a list of blockers for each mode. A blocker may apply to a single mode or to multiple modes, and passing Error** will allow one Error object to be registered for multiple modes. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Tested-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1697634216-84215-1-git-send-email-steven.sistare@oracle.com>
2023-10-18vfio/pci: Fix a potential memory leak in vfio_listener_region_addZhenzhong Duan1-1/+1
When there is an failure in vfio_listener_region_add() and the section belongs to a ram device, there is an inaccurate error report which should never be related to vfio_dma_map failure. The memory holding err is also incrementally leaked in each failure. Fix it by reporting the real error and free it. Fixes: 567b5b309ab ("vfio/pci: Relax DMA map errors for MMIO regions") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/common: Move legacy VFIO backend code into separate container.cYi Liu1-1139/+16
Move all the code really dependent on the legacy VFIO container/group into a separate file: container.c. What does remain in common.c is the code related to VFIOAddressSpace, MemoryListeners, migration and all other general operations. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/common: Introduce a global VFIODevice listZhenzhong Duan1-26/+19
Some functions iterate over all the VFIODevices. This is currently achieved by iterating over all groups/devices. Let's introduce a global list of VFIODevices simplifying that scan. This will also be useful while migrating to IOMMUFD by hiding the group specificity. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/common: Store the parent container in VFIODeviceZhenzhong Duan1-1/+7
let's store the parent contaienr within the VFIODevice. This simplifies the logic in vfio_viommu_preset() and brings the benefice to hide the group specificity which is useful for IOMMUFD migration. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/common: Introduce a per container device listZhenzhong Duan1-78/+63
Several functions need to iterate over the VFIO devices attached to a given container. This is currently achieved by iterating over the groups attached to the container and then over the devices in the group. Let's introduce a per container device list that simplifies this search. Per container list is used in below functions: vfio_devices_all_dirty_tracking vfio_devices_all_device_dirty_tracking vfio_devices_all_running_and_mig_active vfio_devices_dma_logging_stop vfio_devices_dma_logging_start vfio_devices_query_dirty_bitmap This will also ease the migration of IOMMUFD by hiding the group specificity. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/common: Move VFIO reset handler registration to a group agnostic functionZhenzhong Duan1-8/+7
Move the reset handler registration/unregistration to a place that is not group specific. vfio_[get/put]_address_space are the best places for that purpose. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/ccw: Use vfio_[attach/detach]_deviceEric Auger1-5/+5
Let the vfio-ccw device use vfio_attach_device() and vfio_detach_device(), hence hiding the details of the used IOMMU backend. Note that the migration reduces the following trace "vfio: subchannel %s has already been attached" (featuring cssid.ssid.devid) into "device is already attached" Also now all the devices have been migrated to use the new vfio_attach_device/vfio_detach_device API, let's turn the legacy functions into static functions, local to container.c. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/pci: Introduce vfio_[attach/detach]_deviceEric Auger1-0/+74
We want the VFIO devices to be able to use two different IOMMU backends, the legacy VFIO one and the new iommufd one. Introduce vfio_[attach/detach]_device which aim at hiding the underlying IOMMU backend (IOCTLs, datatypes, ...). Once vfio_attach_device completes, the device is attached to a security context and its fd can be used. Conversely When vfio_detach_device completes, the device has been detached from the security context. At the moment only the implementation based on the legacy container/group exists. Let's use it from the vfio-pci device. Subsequent patches will handle other devices. We also take benefit of this patch to properly free vbasedev->name on failure. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/common: Extract out vfio_kvm_device_[add/del]_fdZhenzhong Duan1-16/+39
Introduce two new helpers, vfio_kvm_device_[add/del]_fd which take as input a file descriptor which can be either a group fd or a cdev fd. This uses the new KVM_DEV_VFIO_FILE VFIO KVM device group, which aliases to the legacy KVM_DEV_VFIO_GROUP. vfio_kvm_device_[add/del]_group then call those new helpers. Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/common: Introduce vfio_container_add|del_section_window()Eric Auger1-67/+89
Introduce helper functions that isolate the code used for VFIO_SPAPR_TCE_v2_IOMMU. Those helpers hide implementation details beneath the container object and make the vfio_listener_region_add/del() implementations more readable. No code change intended. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/common: Propagate KVM_SET_DEVICE_ATTR error if anyEric Auger1-5/+5
In the VFIO_SPAPR_TCE_v2_IOMMU container case, when KVM_SET_DEVICE_ATTR fails, we currently don't propagate the error as we do on the vfio_spapr_create_window() failure case. Let's align the code. Take the opportunity to reword the error message and make it more explicit. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/common: Move IOMMU agnostic helpers to a separate fileYi Liu1-588/+0
Move low-level iommu agnostic helpers to a separate helpers.c file. They relate to regions, interrupts, device/region capabilities and etc. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-09-11vfio/common: Separate vfio-pci rangesJoao Martins1-11/+60
QEMU computes the DMA logging ranges for two predefined ranges: 32-bit and 64-bit. In the OVMF case, when the dynamic MMIO window is enabled, QEMU includes in the 64-bit range the RAM regions at the lower part and vfio-pci device RAM regions which are at the top of the address space. This range contains a large gap and the size can be bigger than the dirty tracking HW limits of some devices (MLX5 has a 2^42 limit). To avoid such large ranges, introduce a new PCI range covering the vfio-pci device RAM regions, this only if the addresses are above 4GB to avoid breaking potential SeaBIOS guests. [ clg: - wrote commit log - fixed overlapping 32-bit and PCI ranges when using SeaBIOS ] Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Fixes: 5255bbf4ec16 ("vfio/common: Add device dirty page tracking start/stop") Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-09-11vfio/migration: Fail adding device with enable-migration=on and existing blockerAvihai Horon1-2/+5
If a device with enable-migration=on is added and it causes a migration blocker, adding the device should fail with a proper error. This is not the case with multiple device migration blocker when the blocker already exists. If the blocker already exists and a device with enable-migration=on is added which causes a migration blocker, adding the device will succeed. Fix it by failing adding the device in such case. Fixes: 8bbcb64a71d8 ("vfio/migration: Make VFIO migration non-experimental") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-09-11vfio/migration: Allow migration of multiple P2P supporting devicesAvihai Horon1-8/+18
Now that P2P support has been added to VFIO migration, allow migration of multiple devices if all of them support P2P migration. Single device migration is allowed regardless of P2P migration support. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-09-11vfio/migration: Add P2P support for VFIO migrationAvihai Horon1-2/+4
VFIO migration uAPI defines an optional intermediate P2P quiescent state. While in the P2P quiescent state, P2P DMA transactions cannot be initiated by the device, but the device can respond to incoming ones. Additionally, all outstanding P2P transactions are guaranteed to have been completed by the time the device enters this state. The purpose of this state is to support migration of multiple devices that might do P2P transactions between themselves. Add support for P2P migration by transitioning all the devices to the P2P quiescent state before stopping or starting the devices. Use the new VMChangeStateHandler prepare_cb to achieve that behavior. This will allow migration of multiple VFIO devices if all of them support P2P migration. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-09-11vfio/migration: Refactor PRE_COPY and RUNNING state checksJoao Martins1-4/+18
Move the PRE_COPY and RUNNING state checks to helper functions. This is in preparation for adding P2P VFIO migration support, where these helpers will also test for PRE_COPY_P2P and RUNNING_P2P states. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>