Age | Commit message (Collapse) | Author | Files | Lines |
|
This will be useful to report errors at a higher level, mostly in VFIO
today.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-9-clg@redhat.com
[peterx: drop comment for ERRP_GUARD, per Markus]
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
The purpose is to record a potential error in the migration stream if
qemu_savevm_state_setup() fails. Most of the current .save_setup()
handlers can be modified to use the Error argument instead of managing
their own and calling locally error_report().
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: John Snow <jsnow@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-8-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
This will prepare ground for future changes adding an Error** argument
to the save_setup() handler. We need to make sure that on failure,
vfio_save_setup() always sets a new error.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240320064911.545001-3-clg@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
https://gitlab.com/peterx/qemu into staging
Migration pull request
- Avihai's fix to allow vmstate iterators to not starve for VFIO
- Maksim's fix on additional check on precopy load error
- Fabiano's fix on fdatasync() hang in mapped-ram
- Jonathan's fix on vring cached access over MMIO regions
- Cedric's cleanup patches 1-4 out of his error report series
- Yu's fix for RDMA migration (which used to be broken even for 8.2)
- Anthony's small cleanup/fix on err message
- Steve's patches on privatize migration.h
- Xiang's patchset to enable zero page detections in multifd threads
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZe9+uBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wamaQD/SvmpMEcuRndT9LPSxzXowAGDZTBpYUfv
# 5XAbx80dS9IBAO8PJJgQJIBHBeacyLBjHP9CsdVtgw5/VW+wCsbfV4AB
# =xavb
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 11 Mar 2024 21:59:20 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-20240311-pull-request' of https://gitlab.com/peterx/qemu: (34 commits)
migration/multifd: Add new migration test cases for legacy zero page checking.
migration/multifd: Enable multifd zero page checking by default.
migration/multifd: Implement ram_save_target_page_multifd to handle multifd version of MigrationOps::ram_save_target_page.
migration/multifd: Implement zero page transmission on the multifd thread.
migration/multifd: Add new migration option zero-page-detection.
migration/multifd: Allow clearing of the file_bmap from multifd
migration/multifd: Allow zero pages in file migration
migration: purge MigrationState from public interface
migration: delete unused accessors
migration: privatize colo interfaces
migration: migration_file_set_error
migration: migration_is_device
migration: migration_thread_is_self
migration: export vcpu_dirty_limit_period
migration: export migration_is_running
migration: export migration_is_active
migration: export migration_is_setup_or_active
migration: remove migration.h references
migration: export fewer options
migration: Fix format in error message
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Define and export migration_file_set_error to eliminate a dependency
on MigrationState.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179338-294359-9-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
A small number of migration options are accessed by migration clients,
but to see them clients must include all of options.h, which is mostly
for migration core code. migrate_mode() in particular will be needed by
multiple clients.
Refactor the option declarations so clients can see the necessary few via
misc.h, which already exports a portion of the client API.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1710179319-294320-1-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
VFIO migration buffer size is currently limited to 1MB. Therefore, there
is no need to check if migration rate exceeded, as in the worst case it
will exceed by only 1MB.
However, if the buffer size is later changed to a bigger value,
vfio_save_iterate() should enforce migration rate (similar to migration
RAM code).
Add a note about this in vfio_save_iterate() to serve as a reminder.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240304105339.20713-4-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Currently, vfio_save_state() returns 1 regardless of whether there is
more data to send or not. This was done to prevent a fast changing VFIO
device from potentially blocking other devices from sending their data,
as qemu_savevm_state_iterate() serialized devices.
Now that qemu_savevm_state_iterate() no longer serializes devices, there
is no need for that.
Refactor vfio_save_state() to return 0 if more data is available and 1
if no more data is available.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240304105339.20713-3-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Allow cpr-reboot for vfio if the guest is in the suspended runstate. The
guest drivers' suspend methods flush outstanding requests and re-initialize
the devices, and thus there is no device state to save and restore. The
user is responsible for suspending the guest before initiating cpr, such as
by issuing guest-suspend-ram to the qemu guest agent.
Relax the vfio blocker so it does not apply to cpr, and add a notifier that
verifies the guest is suspended.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
|
|
Define MigrationNotifyFunc to improve type safety and simplify migration
notifiers.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Passing MigrationState to notifiers is unsound because they could access
unstable migration state internals or even modify the state. Instead, pass
the minimal info needed in a new MigrationEvent struct, which could be
extended in the future if needed.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Change all migration notifiers to type NotifierWithReturn, so notifiers
can return an error status in a future patch. For now, pass NULL for the
notifier error parameter, and do not check the return value.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-4-git-send-email-steven.sistare@oracle.com
[peterx: dropped unexpected update to roms/seabios-hppa]
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
There are several places where failure in setting the device state leads
to a device reset, which is done by setting ERROR as the recover state.
Add a helper function that sets the device state and resets the device
in case of failure. This will make the code cleaner and remove duplicate
comments.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
Pass the callback function to add_migration_state_change_notifier so
that migration can initialize the notifier on add and clear it on
delete, which simplifies the call sites. Shorten the function names
so the extra arg can be added more legibly. Hide the global notifier
list in a new function migration_call_notifiers, and make it externally
visible so future live update code can call it.
No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1686148954-250144-1-git-send-email-steven.sistare@oracle.com>
|
|
Modify migrate_add_blocker and migrate_del_blocker to take an Error **
reason. This allows migration to own the Error object, so that if
an error occurs in migrate_add_blocker, migration code can free the Error
and clear the client handle, simplifying client code. It also simplifies
the migrate_del_blocker call site.
In addition, this is a pre-requisite for a proposed future patch that would
add a mode argument to migration requests to support live update, and
maintain a list of blockers for each mode. A blocker may apply to a single
mode or to multiple modes, and passing Error** will allow one Error object
to be registered for multiple modes.
No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Tested-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1697634216-84215-1-git-send-email-steven.sistare@oracle.com>
|
|
Background snapshot allows creating a snapshot of the VM while it's
running and keeping it small by not including dirty RAM pages.
The way it works is by first stopping the VM, saving the non-iterable
devices' state and then starting the VM and saving the RAM while write
protecting it with UFFD. The resulting snapshot represents the VM state
at snapshot start.
VFIO migration is not compatible with background snapshot.
First of all, VFIO device state is not even saved in background snapshot
because only non-iterable device state is saved. But even if it was
saved, after starting the VM, a VFIO device could dirty pages without it
being detected by UFFD write protection. This would corrupt the
snapshot, as the RAM in it would not represent the RAM at snapshot
start.
To prevent this, block VFIO migration with background snapshot.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
VFIO migration is not compatible with postcopy migration. A VFIO device
in the destination can't handle page faults for pages that have not been
sent yet.
Doing such migration will cause the VM to crash in the destination:
qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
To prevent this, block VFIO migration with postcopy migration.
Reported-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
VFIO migration uAPI defines an optional intermediate P2P quiescent
state. While in the P2P quiescent state, P2P DMA transactions cannot be
initiated by the device, but the device can respond to incoming ones.
Additionally, all outstanding P2P transactions are guaranteed to have
been completed by the time the device enters this state.
The purpose of this state is to support migration of multiple devices
that might do P2P transactions between themselves.
Add support for P2P migration by transitioning all the devices to the
P2P quiescent state before stopping or starting the devices. Use the new
VMChangeStateHandler prepare_cb to achieve that behavior.
This will allow migration of multiple VFIO devices if all of them
support P2P migration.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Move the PRE_COPY and RUNNING state checks to helper functions.
This is in preparation for adding P2P VFIO migration support, where
these helpers will also test for PRE_COPY_P2P and RUNNING_P2P states.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Changing the device state from STOP_COPY to STOP can take time as the
device may need to free resources and do other operations as part of the
transition. Currently, this is done in vfio_save_complete_precopy() and
therefore it is counted in the migration downtime.
To avoid this, change the device state from STOP_COPY to STOP in
vfio_save_cleanup(), which is called after migration has completed and
thus is not part of migration downtime.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Make vfio_migration_realize() adhere to the convention of other realize()
callbacks(like qdev_realize) by returning bool instead of int.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
When vfio_realize() succeeds, hot unplug will call vfio_exitfn()
to free resources allocated in vfio_realize(); when vfio_realize()
fails, vfio_exitfn() is never called and we need to free resources
in vfio_realize().
In the case that vfio_migration_realize() fails,
e.g: with -only-migratable & enable-migration=off, we see below:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off
0000:81:11.1: Migration disabled
Error: disallowing migration blocker (--only-migratable) for: 0000:81:11.1: Migration is disabled for VFIO device
If we hotplug again we should see same log as above, but we see:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off
Error: vfio 0000:81:11.1: device is already attached
That's because some references to VFIO device isn't released.
For resources allocated in vfio_migration_realize(), free them by
jumping to out_deinit path with calling a new function
vfio_migration_deinit(). For resources allocated in vfio_realize(),
free them by jumping to de-register path in vfio_realize().
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Fixes: a22651053b59 ("vfio: Make vfio-pci device migration capable")
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Contrary to multiple device blocker which needs to consider already-attached
devices to unblock/block dynamically, the vIOMMU migration blocker is a device
specific config. Meaning it only needs to know whether the device is bypassing
or not the vIOMMU (via machine property, or per pxb-pcie::bypass_iommu), and
does not need the state of currently present devices. For this reason, the
vIOMMU global migration blocker can be consolidated into the per-device
migration blocker, allowing us to remove some unnecessary code.
This change also makes vfio_mig_active() more accurate as it doesn't check for
global blocker.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
The major parts of VFIO migration are supported today in QEMU. This
includes basic VFIO migration, device dirty page tracking and precopy
support.
Thus, at this point in time, it seems appropriate to make VFIO migration
non-experimental: remove the x prefix from enable_migration property,
change it to ON_OFF_AUTO and let the default value be AUTO.
In addition, make the following adjustments:
1. When enable_migration is ON and migration is not supported, fail VFIO
device realization.
2. When enable_migration is AUTO (i.e., not explicitly enabled), require
device dirty tracking support. This is because device dirty tracking
is currently the only method to do dirty page tracking, which is
essential for migrating in a reasonable downtime. Setting
enable_migration to ON will not require device dirty tracking.
3. Make migration error and blocker messages more elaborate.
4. Remove error prints in vfio_migration_query_flags().
5. Rename trace_vfio_migration_probe() to
trace_vfio_migration_realize().
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Currently, VFIO bytes_transferred is not reset properly:
1. bytes_transferred is not reset after a VM snapshot (so a migration
following a snapshot will report incorrect value).
2. bytes_transferred is a single counter for all VFIO devices, however
upon migration failure it is reset multiple times, by each VFIO
device.
Fix it by introducing a new function vfio_reset_bytes_transferred() and
calling it during migration and snapshot start.
Remove existing bytes_transferred reset in VFIO migration state
notifier, which is not needed anymore.
Fixes: 3710586caa5d ("qapi: Add VFIO devices migration stats in Migration stats")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Loading of a VFIO device's data can take a substantial amount of time as
the device may need to allocate resources, prepare internal data
structures, etc. This can increase migration downtime, especially for
VFIO devices with a lot of resources.
To solve this, VFIO migration uAPI defines "initial bytes" as part of
its precopy data stream. Initial bytes can be used in various ways to
improve VFIO migration performance. For example, it can be used to
transfer device metadata to pre-allocate resources in the destination.
However, for this to work we need to make sure that all initial bytes
are sent and loaded in the destination before the source VM is stopped.
Use migration switchover ack capability to make sure a VFIO device's
initial bytes are sent and loaded in the destination before the source
stops the VM and attempts to complete the migration.
This can significantly reduce migration downtime for some devices.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Pre-copy support allows the VFIO device data to be transferred while the
VM is running. This helps to accommodate VFIO devices that have a large
amount of data that needs to be transferred, and it can reduce migration
downtime.
Pre-copy support is optional in VFIO migration protocol v2.
Implement pre-copy of VFIO migration protocol v2 and use it for devices
that support it. Full description of it can be found in the following
Linux commit: 4db52602a607 ("vfio: Extend the device migration protocol
with PRE_COPY").
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
VFIO migration flags are queried once in vfio_migration_init(). Store
them in VFIOMigration so they can be used later to check the device's
migration capabilities without re-querying them.
This will be used in the next patch to check if the device supports
precopy migration.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Refactor vfio_save_block() to return the size of saved data on success
and -errno on error.
This will be used in next patch to implement VFIO migration pre-copy
support.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
This had been pulled in via qemu/plugin.h from hw/core/cpu.h,
but that will be removed.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230310195252.210956-5-richard.henderson@linaro.org>
[AJB: add various additional cases shown by CI]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230315174331.2959-15-alex.bennee@linaro.org>
Reviewed-by: Emilio Cota <cota@braap.org>
|
|
Pick names that align with the section drivers should use them from,
avoiding the confusion of calling a _finalize() function from _exit()
and generalizing the actual _finalize() to handle removing the viommu
blocker.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/167820912978.606734.12740287349119694623.stgit@omen
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Now that everything has been set up for device dirty page tracking,
query the device for device dirty page tracking support.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-15-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Migrating with vIOMMU will require either tracking maximum
IOMMU supported address space (e.g. 39/48 address width on Intel)
or range-track current mappings and dirty track the new ones
post starting dirty tracking. This will be done as a separate
series, so add a live migration blocker until that is fixed.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20230307125450.62409-14-joao.m.martins@oracle.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Now that v2 protocol implementation has been added, remove the
deprecated v1 implementation.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Link: https://lore.kernel.org/r/20230216143630.25610-10-avihaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Implement the basic mandatory part of VFIO migration protocol v2.
This includes all functionality that is necessary to support
VFIO_MIGRATION_STOP_COPY part of the v2 protocol.
The two protocols, v1 and v2, will co-exist and in the following patches
v1 protocol code will be removed.
There are several main differences between v1 and v2 protocols:
- VFIO device state is now represented as a finite state machine instead
of a bitmap.
- Migration interface with kernel is now done using VFIO_DEVICE_FEATURE
ioctl and normal read() and write() instead of the migration region.
- Pre-copy is made optional in v2 protocol. Support for pre-copy will be
added later on.
Detailed information about VFIO migration protocol v2 and its difference
compared to v1 protocol can be found here [1].
[1]
https://lore.kernel.org/all/20220224142024.147653-10-yishaih@nvidia.com/
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>.
Link: https://lore.kernel.org/r/20230216143630.25610-9-avihaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
To avoid name collisions, rename functions and structs related to VFIO
migration protocol v1. This will allow the two protocols to co-exist
when v2 protocol is added, until v1 is removed. No functional changes
intended.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Link: https://lore.kernel.org/r/20230216143630.25610-8-avihaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Move vfio_dev_get_region_info() logic from vfio_migration_probe() to
vfio_migration_init(). This logic is specific to v1 protocol and moving
it will make it easier to add the v2 protocol implementation later.
No functional changes intended.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Link: https://lore.kernel.org/r/20230216143630.25610-7-avihaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Currently VFIO migration doesn't implement some kind of intermediate
quiescent state in which P2P DMAs are quiesced before stopping or
running the device. This can cause problems in multi-device migration
where the devices are doing P2P DMAs, since the devices are not stopped
together at the same time.
Until such support is added, block migration of multiple devices.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Link: https://lore.kernel.org/r/20230216143630.25610-6-avihaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Currently, if IOMMU of a VFIO container doesn't support dirty page
tracking, migration is blocked. This is because a DMA-able VFIO device
can dirty RAM pages without updating QEMU about it, thus breaking the
migration.
However, this doesn't mean that migration can't be done at all.
In such case, allow migration and let QEMU VFIO code mark all pages
dirty.
This guarantees that all pages that might have gotten dirty are reported
back, and thus guarantees a valid migration even without VFIO IOMMU
dirty tracking support.
The motivation for this patch is the introduction of iommufd [1].
iommufd can directly implement the /dev/vfio/vfio container IOCTLs by
mapping them into its internal ops, allowing the usage of these IOCTLs
over iommufd. However, VFIO IOMMU dirty tracking is not supported by
this VFIO compatibility API.
This patch will allow migration by hosts that use the VFIO compatibility
API and prevent migration regressions caused by the lack of VFIO IOMMU
dirty tracking support.
[1]
https://lore.kernel.org/kvm/0-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com/
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Link: https://lore.kernel.org/r/20230216143630.25610-4-avihaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
As part of its error flow, vfio_vmstate_change() accesses
MigrationState->to_dst_file without any checks. This can cause a NULL
pointer dereference if the error flow is taken and
MigrationState->to_dst_file is not set.
For example, this can happen if VM is started or stopped not during
migration and vfio_vmstate_change() error flow is taken, as
MigrationState->to_dst_file is not set at that time.
Fix it by checking that MigrationState->to_dst_file is set before using
it.
Fixes: 02a7e71b1e5b ("vfio: Add VM state change handler to know state of VM")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Link: https://lore.kernel.org/r/20230216143630.25610-3-avihaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Once that res_compatible is removed, they don't make sense anymore.
We remove the _only preffix. And to make things clearer we rename
them to must_precopy and can_postcopy.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Nothing assigns to it after previous commit.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
|
|
Until previous commit, save_live_pending() was used for ram. Now with
the split into state_pending_estimate() and state_pending_exact() it
is not needed anymore, so remove them.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
|
|
We split the function into to:
- state_pending_estimate: We estimate the remaining state size without
stopping the machine.
- state pending_exact: We calculate the exact amount of remaining
state.
The only "device" that implements different functions for _estimate()
and _exact() is ram.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
|
|
So remove it everywhere.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
|
|
vfio_migration_init() initializes VFIOMigration->device_state using enum
of VFIO migration protocol v2. Current implemented protocol is v1 so v1
enum should be used. Fix it.
Fixes: 429c72800654 ("vfio/migration: Fix incorrect initialization value for parameters in VFIOMigration")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Link: https://lore.kernel.org/r/20221016085752.32740-1-avihaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
VFIOMigration
The structure VFIOMigration of a VFIODevice is allocated and initialized
in vfio_migration_init(). "device_state" and "vm_running" are initialized
to 0, indicating that VFIO device is_STOP and VM is not-running. The
initialization value is incorrect. According to the agreement, default
state of VFIO device is _RUNNING. And if a VFIO device is hot-plugged
while the VM is running, "vm_running" should be 1. This patch fixes it.
Fixes: 02a7e71b1e5b ("vfio: Add VM state change handler to know state of VM")
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Link: https://lore.kernel.org/r/20220711014651.1327-1-jiangkunkun@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Update to c5eb0a61238d ("Linux 5.18-rc6"). Mechanical search and
replace of vfio defines with white space massaging.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
When migrate_add_blocker(blocker, &err) is followed by
error_propagate(errp, err), we can often just as well do
migrate_add_blocker(..., errp). This is the case in
vfio_migration_probe().
Prior art: commit 386f6c07d2 "error: Avoid error_propagate() after
migrate_add_blocker()".
Cc: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210720125408.387910-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed by: Kirti Wankhede <kwankhede@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Set _SAVING flag for device state from vmstate change handler when it
gets called from savevm.
Currently State transition savevm/suspend is seen as:
_RUNNING -> _STOP -> Stop-and-copy -> _STOP
State transition savevm/suspend should be:
_RUNNING -> Stop-and-copy -> _STOP
State transition from _RUNNING to _STOP occurs from
vfio_vmstate_change() where when vmstate changes from running to
!running, _RUNNING flag is reset but at the same time when
vfio_vmstate_change() is called for RUN_STATE_SAVE_VM, _SAVING bit
should be set.
Reported by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Message-Id: <1623177441-27496-1-git-send-email-kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|