aboutsummaryrefslogtreecommitdiff
path: root/hw/vfio/trace-events
AgeCommit message (Collapse)AuthorFilesLines
2025-03-06vfio/migration: Multifd device state transfer support - send sideMaciej S. Szmigiero1-0/+2
Implement the multifd device state transfer via additional per-device thread inside save_live_complete_precopy_thread handler. Switch between doing the data transfer in the new handler and doing it in the old save_state handler depending if VFIO multifd transfer is enabled or not. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/4d727e2e0435e0022d50004e474077632830e08d.1741124640.git.maciej.szmigiero@oracle.com [ clg: - Reordered savevm_vfio_handlers - Updated save_live_complete_precopy* documentation ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-03-06vfio/migration: Multifd device state transfer support - load threadMaciej S. Szmigiero1-0/+7
Add a thread which loads the VFIO device state buffers that were received via multifd. Each VFIO device that has multifd device state transfer enabled has one such thread, which is created using migration core API qemu_loadvm_start_load_thread(). Since it's important to finish loading device state transferred via the main migration channel (via save_live_iterate SaveVMHandler) before starting loading the data asynchronously transferred via multifd the thread doing the actual loading of the multifd transferred data is only started from switchover_start SaveVMHandler. switchover_start handler is called when MIG_CMD_SWITCHOVER_START sub-command of QEMU_VM_COMMAND is received via the main migration channel. This sub-command is only sent after all save_live_iterate data have already been posted so it is safe to commence loading of the multifd-transferred device state upon receiving it - loading of save_live_iterate data happens synchronously in the main migration thread (much like the processing of MIG_CMD_SWITCHOVER_START) so by the time MIG_CMD_SWITCHOVER_START is processed all the proceeding data must have already been loaded. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/9abe612d775aaf42e31646796acd2363c723a57a.1741124640.git.maciej.szmigiero@oracle.com [ clg: - Reordered savevm_vfio_handlers - Added switchover_start documentation ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-03-06vfio/migration: Multifd device state transfer support - received buffers queuingMaciej S. Szmigiero1-0/+1
The multifd received data needs to be reassembled since device state packets sent via different multifd channels can arrive out-of-order. Therefore, each VFIO device state packet carries a header indicating its position in the stream. The raw device state data is saved into a VFIOStateBuffer for later in-order loading into the device. The last such VFIO device state packet should have VFIO_DEVICE_STATE_CONFIG_STATE flag set and carry the device config state. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/e3bff515a8d61c582b94b409eb12a45b1a143a69.1741124640.git.maciej.szmigiero@oracle.com [ clg: - Reordered savevm_vfio_handlers - Added load_state_buffer documentation ] Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-03-06vfio/migration: Add load_device_config_state_start trace eventMaciej S. Szmigiero1-1/+2
And rename existing load_device_config_state trace event to load_device_config_state_end for consistency since it is triggered at the end of loading of the VFIO device config state. This way both the start and end points of particular device config loading operation (a long, BQL-serialized operation) are known. Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Link: https://lore.kernel.org/qemu-devel/1b6c5a2097e64c272eb7e53f9e4cca4b79581b38.1741124640.git.maciej.szmigiero@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-02-21hw/vfio/common: Add a trace point in vfio_reset_handlerEric Auger1-0/+1
To ease the debug of reset sequence, let's add a trace point in vfio_reset_handler() Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20250218182737.76722-5-eric.auger@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-05vfio/migration: Add vfio_save_block_precopy_empty_hit trace eventMaciej S. Szmigiero1-0/+1
This way it is clearly known when there's no more data to send for that device. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
2024-11-05vfio/migration: Add save_{iterate, complete_precopy}_start trace eventsMaciej S. Szmigiero1-0/+2
This way both the start and end points of migrating a particular VFIO device are known. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
2024-10-23vfio/migration: Change trace formats from hex to decimalAvihai Horon1-5/+5
Data sizes in VFIO migration trace events are printed in hex format while in migration core trace events they are printed in decimal format. This inconsistency makes it less readable when using both trace event types. Hence, change the data sizes print format to decimal in VFIO migration trace events. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-09-17hw/vfio/pci.c: Use correct type in trace_vfio_msix_early_setup()Peter Maydell1-1/+1
The tracepoint trace_vfio_msix_early_setup() uses "int" for the type of the table_bar argument, but we use this to print a uint32_t. Coverity warns that this means that we could end up treating it as a negative number. We only use this in printing the value in the tracepoint, so mishandling it as a negative number would be harmless, but it's better to use the right type in the tracepoint. Use uint64_t to match how we print the table_offset in the vfio_msix_relo() tracepoint. Resolves: Coverity CID 1547690 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com>
2024-07-22hw/vfio/common: Add vfio_listener_region_del_iommu trace eventEric Auger1-1/+2
Trace when VFIO gets notified about the deletion of an IOMMU MR. Also trace the name of the region in the add_iommu trace message. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-6-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-06-10hw/vfio: Remove newline character in trace eventsPhilippe Mathieu-Daudé1-2/+2
Trace events aren't designed to be multi-lines. Remove the newline characters. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Mads Ynddal <mads@ynddal.dk> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-id: 20240606103943.79116-5-philmd@linaro.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-05-16vfio/migration: Enhance VFIO migration state tracingAvihai Horon1-1/+2
Move trace_vfio_migration_set_state() to the top of the function, add recover_state to it, and add a new trace event to vfio_migration_set_device_state(). This improves tracing of device state changes as state changes are now also logged when vfio_migration_set_state() fails (covering recover state and device reset transitions) and in no-op state transitions to the same state. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-02-28migration: MigrationEvent for notifiersSteve Sistare1-1/+1
Passing MigrationState to notifiers is unsound because they could access unstable migration state internals or even modify the state. Instead, pass the minimal info needed in a new MigrationEvent struct, which could be extended in the future if needed. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-5-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2023-12-19vfio/iommufd: Enable pci hot reset through iommufd cdev interfaceZhenzhong Duan1-0/+1
Implement the newly introduced pci_hot_reset callback named iommufd_cdev_pci_hot_reset to do iommufd specific check and reset operation. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/iommufd: Implement the iommufd backendYi Liu1-0/+10
The iommufd backend is implemented based on the new /dev/iommu user API. This backend obviously depends on CONFIG_IOMMUFD. So far, the iommufd backend doesn't support dirty page sync yet. Co-authored-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Convert functions to base containerEric Auger1-1/+1
In the prospect to get rid of VFIOContainer refs in common.c lets convert misc functions to use the base container object instead: vfio_devices_all_dirty_tracking vfio_devices_all_device_dirty_tracking vfio_devices_all_running_and_mig_active vfio_devices_query_dirty_bitmap vfio_get_dirty_bitmap Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19vfio/container: Switch to dma_map|unmap APIEric Auger1-1/+1
No functional change intended. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/platform: Use vfio_[attach/detach]_deviceEric Auger1-1/+0
Let the vfio-platform device use vfio_attach_device() and vfio_detach_device(), hence hiding the details of the used IOMMU backend. Drop the trace event for vfio-platform as we have similar one in vfio_attach_device. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-18vfio/pci: Introduce vfio_[attach/detach]_deviceEric Auger1-1/+2
We want the VFIO devices to be able to use two different IOMMU backends, the legacy VFIO one and the new iommufd one. Introduce vfio_[attach/detach]_device which aim at hiding the underlying IOMMU backend (IOCTLs, datatypes, ...). Once vfio_attach_device completes, the device is attached to a security context and its fd can be used. Conversely When vfio_detach_device completes, the device has been detached from the security context. At the moment only the implementation based on the legacy container/group exists. Let's use it from the vfio-pci device. Subsequent patches will handle other devices. We also take benefit of this patch to properly free vbasedev->name on failure. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-10-05vfio/pci: detect the support of dynamic MSI-X allocationJing Liu1-1/+1
Kernel provides the guidance of dynamic MSI-X allocation support of passthrough device, by clearing the VFIO_IRQ_INFO_NORESIZE flag to guide user space. Fetch the flags from host to determine if dynamic MSI-X allocation is supported. Originally-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-09-18spapr: Remove support for NVIDIA V100 GPU with NVLink2Cédric Le Goater1-4/+0
NVLink2 support was removed from the PPC PowerNV platform and VFIO in Linux 5.13 with commits : 562d1e207d32 ("powerpc/powernv: remove the nvlink support") b392a1989170 ("vfio/pci: remove vfio_pci_nvlink2") This was 2.5 years ago. Do the same in QEMU with a revert of commit ec132efaa81f ("spapr: Support NVIDIA V100 GPU with NVLink2"). Some adjustements are required on the NUMA part. Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Message-ID: <20230918091717.149950-1-clg@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2023-09-11vfio/common: Separate vfio-pci rangesJoao Martins1-1/+1
QEMU computes the DMA logging ranges for two predefined ranges: 32-bit and 64-bit. In the OVMF case, when the dynamic MMIO window is enabled, QEMU includes in the 64-bit range the RAM regions at the lower part and vfio-pci device RAM regions which are at the top of the address space. This range contains a large gap and the size can be bigger than the dirty tracking HW limits of some devices (MLX5 has a 2^42 limit). To avoid such large ranges, introduce a new PCI range covering the vfio-pci device RAM regions, this only if the addresses are above 4GB to avoid breaking potential SeaBIOS guests. [ clg: - wrote commit log - fixed overlapping 32-bit and PCI ranges when using SeaBIOS ] Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Fixes: 5255bbf4ec16 ("vfio/common: Add device dirty page tracking start/stop") Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-09-11vfio/migration: Add P2P support for VFIO migrationAvihai Horon1-0/+1
VFIO migration uAPI defines an optional intermediate P2P quiescent state. While in the P2P quiescent state, P2P DMA transactions cannot be initiated by the device, but the device can respond to incoming ones. Additionally, all outstanding P2P transactions are guaranteed to have been completed by the time the device enters this state. The purpose of this state is to support migration of multiple devices that might do P2P transactions between themselves. Add support for P2P migration by transitioning all the devices to the P2P quiescent state before stopping or starting the devices. Use the new VMChangeStateHandler prepare_cb to achieve that behavior. This will allow migration of multiple VFIO devices if all of them support P2P migration. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-06-30vfio/migration: Make VFIO migration non-experimentalAvihai Horon1-1/+1
The major parts of VFIO migration are supported today in QEMU. This includes basic VFIO migration, device dirty page tracking and precopy support. Thus, at this point in time, it seems appropriate to make VFIO migration non-experimental: remove the x prefix from enable_migration property, change it to ON_OFF_AUTO and let the default value be AUTO. In addition, make the following adjustments: 1. When enable_migration is ON and migration is not supported, fail VFIO device realization. 2. When enable_migration is AUTO (i.e., not explicitly enabled), require device dirty tracking support. This is because device dirty tracking is currently the only method to do dirty page tracking, which is essential for migrating in a reasonable downtime. Setting enable_migration to ON will not require device dirty tracking. 3. Make migration error and blocker messages more elaborate. 4. Remove error prints in vfio_migration_query_flags(). 5. Rename trace_vfio_migration_probe() to trace_vfio_migration_realize(). Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-06-30vfio/migration: Add VFIO migration pre-copy supportAvihai Horon1-1/+3
Pre-copy support allows the VFIO device data to be transferred while the VM is running. This helps to accommodate VFIO devices that have a large amount of data that needs to be transferred, and it can reduce migration downtime. Pre-copy support is optional in VFIO migration protocol v2. Implement pre-copy of VFIO migration protocol v2 and use it for devices that support it. Full description of it can be found in the following Linux commit: 4db52602a607 ("vfio: Extend the device migration protocol with PRE_COPY"). Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-06-13hw/vfio: Add number of dirty pages to vfio_get_dirty_bitmap tracepointJoao Martins1-1/+1
Include the number of dirty pages on the vfio_get_dirty_bitmap tracepoint. These are fetched from the newly added return value in cpu_physical_memory_set_dirty_lebitmap(). Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Alex Williamson <alex.williamson@redhat.com> Message-Id: <20230530180556.24441-3-joao.m.martins@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-03-07vfio: Fix vfio_get_dev_region() trace eventCédric Le Goater1-1/+1
Simply transpose 'x8' to fix the typo and remove the ending '8' Fixes: e61a424f05 ("vfio: Create device specific region info helper") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1526 Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20230303074330.2609377-1-clg@kaod.org [aw: commit log s/revert/transpose/] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07vfio/common: Add device dirty page tracking start/stopJoao Martins1-0/+1
Add device dirty page tracking start/stop functionality. This uses the device DMA logging uAPI to start and stop dirty page tracking by device. Device dirty page tracking is used only if all devices within a container support device dirty page tracking. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20230307125450.62409-11-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07vfio/common: Record DMA mapped IOVA rangesJoao Martins1-0/+1
According to the device DMA logging uAPI, IOVA ranges to be logged by the device must be provided all at once upon DMA logging start. As preparation for the following patches which will add device dirty page tracking, keep a record of all DMA mapped IOVA ranges so later they can be used for DMA logging start. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-10-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-03-07vfio/common: Use a single tracepoint for skipped sectionsJoao Martins1-2/+1
In preparation to turn more of the memory listener checks into common functions, one of the affected places is how we trace when sections are skipped. Right now there is one for each. Change it into one single tracepoint `vfio_listener_region_skip` which receives a name which refers to the callback i.e. region_add and region_del. Suggested-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-7-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-16vfio: Alphabetize migration section of VFIO trace-events fileAvihai Horon1-10/+10
Sort the migration section of VFIO trace events file alphabetically and move two misplaced traces to common.c section. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-11-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-16vfio/migration: Remove VFIO migration protocol v1Avihai Horon1-9/+0
Now that v2 protocol implementation has been added, remove the deprecated v1 implementation. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-10-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-16vfio/migration: Implement VFIO migration protocol v2Avihai Horon1-0/+7
Implement the basic mandatory part of VFIO migration protocol v2. This includes all functionality that is necessary to support VFIO_MIGRATION_STOP_COPY part of the v2 protocol. The two protocols, v1 and v2, will co-exist and in the following patches v1 protocol code will be removed. There are several main differences between v1 and v2 protocols: - VFIO device state is now represented as a finite state machine instead of a bitmap. - Migration interface with kernel is now done using VFIO_DEVICE_FEATURE ioctl and normal read() and write() instead of the migration region. - Pre-copy is made optional in v2 protocol. Support for pre-copy will be added later on. Detailed information about VFIO migration protocol v2 and its difference compared to v1 protocol can be found here [1]. [1] https://lore.kernel.org/all/20220224142024.147653-10-yishaih@nvidia.com/ Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Juan Quintela <quintela@redhat.com>. Link: https://lore.kernel.org/r/20230216143630.25610-9-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-16vfio/migration: Rename functions/structs related to v1 protocolAvihai Horon1-6/+6
To avoid name collisions, rename functions and structs related to VFIO migration protocol v1. This will allow the two protocols to co-exist when v2 protocol is added, until v1 is removed. No functional changes intended. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-8-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-16vfio/migration: Move migration v1 logic to vfio_migration_init()Avihai Horon1-1/+1
Move vfio_dev_get_region_info() logic from vfio_migration_probe() to vfio_migration_init(). This logic is specific to v1 protocol and moving it will make it easier to add the v2 protocol implementation later. No functional changes intended. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-7-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-02-15migration: Remove unused res_compatibleJuan Quintela1-1/+1
Nothing assigns to it after previous commit. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-06migration: Split save_live_pending() into state_pending_*Juan Quintela1-1/+1
We split the function into to: - state_pending_estimate: We estimate the remaining state size without stopping the machine. - state pending_exact: We calculate the exact amount of remaining state. The only "device" that implements different functions for _estimate() and _exact() is ram. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-06-08vfio/common: remove spurious warning on vfio_listener_region_delEric Auger1-1/+1
851d6d1a0f ("vfio/common: remove spurious tpm-crb-cmd misalignment warning") removed the warning on vfio_listener_region_add() path. However the same warning also hits on region_del path. Let's remove it and reword the dynamic trace as this can be called on both map and unmap path. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220524091405.416256-1-eric.auger@redhat.com Fixes: 851d6d1a0ff2 ("vfio/common: remove spurious tpm-crb-cmd misalignment warning") Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-05-06vfio/common: remove spurious tpm-crb-cmd misalignment warningEric Auger1-0/+1
The CRB command buffer currently is a RAM MemoryRegion and given its base address alignment, it causes an error report on vfio_listener_region_add(). This region could have been a RAM device region, easing the detection of such safe situation but this option was not well received. So let's add a helper function that uses the memory region owner type to detect the situation is safe wrt the assignment. Other device types can be checked here if such kind of problem occurs again. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220506132510.1847942-3-eric.auger@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-02docs: fix references to docs/devel/tracing.rstStefano Garzarella1-1/+1
Commit e50caf4a5c ("tracing: convert documentation to rST") converted docs/devel/tracing.txt to docs/devel/tracing.rst. We still have several references to the old file, so let's fix them with the following command: sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt) Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210517151702.109066-2-sgarzare@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-03-16hw/vfio/pci-quirks: Replace the word 'blacklist'Philippe Mathieu-Daudé1-1/+1
Follow the inclusive terminology from the "Conscious Language in your Open Source Projects" guidelines [*] and replace the word "blacklist" appropriately. [*] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210205171817.2108907-9-philmd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01vfio: Dirty page tracking when vIOMMU is enabledKirti Wankhede1-0/+1
When vIOMMU is enabled, register MAP notifier from log_sync when all devices in container are in stop and copy phase of migration. Call replay and get dirty pages from notifier callback. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01vfio: Add vfio_listener_log_sync to mark dirty pagesKirti Wankhede1-0/+1
vfio_listener_log_sync gets list of dirty pages from container using VFIO_IOMMU_GET_DIRTY_BITMAP ioctl and mark those pages dirty when all devices are stopped and saving state. Return early for the RAM block section of mapped MMIO region. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> [aw: fix error_report types, fix cpu_physical_memory_set_dirty_lebitmap() cast] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01vfio: Add load state functions to SaveVMHandlersKirti Wankhede1-0/+4
Sequence during _RESUMING device state: While data for this device is available, repeat below steps: a. read data_offset from where user application should write data. b. write data of data_size to migration region from data_offset. c. write data_size which indicates vendor driver that data is written in staging buffer. For user, data is opaque. User should write data in the same order as received. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01vfio: Add save state functions to SaveVMHandlersKirti Wankhede1-0/+6
Added .save_live_pending, .save_live_iterate and .save_live_complete_precopy functions. These functions handles pre-copy and stop-and-copy phase. In _SAVING|_RUNNING device state or pre-copy phase: - read pending_bytes. If pending_bytes > 0, go through below steps. - read data_offset - indicates kernel driver to write data to staging buffer. - read data_size - amount of data in bytes written by vendor driver in migration region. - read data_size bytes of data from data_offset in the migration region. - Write data packet to file stream as below: {VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data, VFIO_MIG_FLAG_END_OF_STATE } In _SAVING device state or stop-and-copy phase a. read config space of device and save to migration file stream. This doesn't need to be from vendor driver. Any other special config state from driver can be saved as data in following iteration. b. read pending_bytes. If pending_bytes > 0, go through below steps. c. read data_offset - indicates kernel driver to write data to staging buffer. d. read data_size - amount of data in bytes written by vendor driver in migration region. e. read data_size bytes of data from data_offset in the migration region. f. Write data packet as below: {VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data} g. iterate through steps b to f while (pending_bytes > 0) h. Write {VFIO_MIG_FLAG_END_OF_STATE} When data region is mapped, its user's responsibility to read data from data_offset of data_size before moving to next steps. Added fix suggested by Artem Polyakov to reset pending_bytes in vfio_save_iterate(). Added fix suggested by Zhi Wang to add 0 as data size in migration stream and add END_OF_STATE delimiter to indicate phase complete. Suggested-by: Artem Polyakov <artemp@nvidia.com> Suggested-by: Zhi Wang <zhi.wang.linux@gmail.com> Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01vfio: Register SaveVMHandlers for VFIO deviceKirti Wankhede1-0/+2
Define flags to be used as delimiter in migration stream for VFIO devices. Added .save_setup and .save_cleanup functions. Map & unmap migration region from these functions at source during saving or pre-copy phase. Set VFIO device state depending on VM's state. During live migration, VM is running when .save_setup is called, _SAVING | _RUNNING state is set for VFIO device. During save-restore, VM is paused, _SAVING state is set for VFIO device. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01vfio: Add migration state change notifierKirti Wankhede1-0/+1
Added migration state change notifier to get notification on migration state change. These states are translated to VFIO device state and conveyed to vendor driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01vfio: Add VM state change handler to know state of VMKirti Wankhede1-0/+2
VM state change handler is called on change in VM's state. Based on VM state, VFIO device state should be changed. Added read/write helper functions for migration region. Added function to set device_state. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [aw: lx -> HWADDR_PRIx, remove redundant parens] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01vfio: Add migration region initialization and finalize functionKirti Wankhede1-0/+3
Whether the VFIO device supports migration or not is decided based of migration region query. If migration region query is successful and migration region initialization is successful then migration is supported else migration is blocked. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01vfio: Add function to unmap VFIO regionKirti Wankhede1-0/+1
This function will be used for migration region. Migration region is mmaped when migration starts and will be unmapped when migration is complete. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>