aboutsummaryrefslogtreecommitdiff
path: root/migration/migration.c
AgeCommit message (Collapse)AuthorFilesLines
2024-06-21migration/postcopy: Add postcopy-recover-setup phasePeter Xu1-6/+34
This patch adds a migration state on src called "postcopy-recover-setup". The new state will describe the intermediate step starting from when the src QEMU received a postcopy recovery request, until the migration channels are properly established, but before the recovery process take place. The request came from Libvirt where Libvirt currently rely on the migration state events to detect migration state changes. That works for most of the migration process but except postcopy recovery failures at the beginning. Currently postcopy recovery only has two major states: - postcopy-paused: this is the state that both sides of QEMU will be in for a long time as long as the migration channel was interrupted. - postcopy-recover: this is the state where both sides of QEMU handshake with each other, preparing for a continuation of postcopy which used to be interrupted. The issue here is when the recovery port is invalid, the src QEMU will take the URI/channels, noticing the ports are not valid, and it'll silently keep in the postcopy-paused state, with no event sent to Libvirt. In this case, the only thing Libvirt can do is to poll the migration status with a proper interval, however that's less optimal. Considering that this is the only case where Libvirt won't get a notification from QEMU on such events, let's add postcopy-recover-setup state to mimic what we have with the "setup" state of a newly initialized migration, describing the phase of connection establishment. With that, postcopy recovery will have two paths to go now, and either path will guarantee an event generated. Now the events will look like this during a recovery process on src QEMU: - Initially when the recovery is initiated on src, QEMU will go from "postcopy-paused" -> "postcopy-recover-setup". Old QEMUs don't have this event. - Depending on whether the channel re-establishment is succeeded: - In succeeded case, src QEMU will move from "postcopy-recover-setup" to "postcopy-recover". Old QEMUs also have this event. - In failure case, src QEMU will move from "postcopy-recover-setup" to "postcopy-paused" again. Old QEMUs don't have this event. This guarantees that Libvirt will always receive a notification for recovery process properly. One thing to mention is, such new status is only needed on src QEMU not both. On dest QEMU, the state machine doesn't change. Hence the events don't change either. It's done like so because dest QEMU may not have an explicit point of setup start. E.g., it can happen that when dest QEMUs doesn't use migrate-recover command to use a new URI/channel, but the old URI/channels can be reused in recovery, in which case the old ports simply can work again after the network routes are fixed up. Add a new helper postcopy_is_paused() detecting whether postcopy is still paused, taking RECOVER_SETUP into account too. When using it on both src/dst, a slight change is done altogether to always wait for the semaphore before checking the status, because for both sides a sem_post() will be required for a recovery. Cc: Jiri Denemark <jdenemar@redhat.com> Cc: Prasad Pandit <ppandit@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Buglink: https://issues.redhat.com/browse/RHEL-38485 Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-06-21migration: Cleanup incoming migration setup state changePeter Xu1-2/+26
Destination QEMU can setup incoming ports for two purposes: either a fresh new incoming migration, in which QEMU will switch to SETUP for channel establishment, or a paused postcopy migration, in which QEMU will stay in POSTCOPY_PAUSED until kicking off the RECOVER phase. Now the state machine worked on dest node for the latter, only because migrate_set_state() implicitly will become a noop if the current state check failed. It wasn't clear at all. Clean it up by providing a helper migration_incoming_state_setup() doing proper checks over current status. Postcopy-paused will be explicitly checked now, and then we can bail out for unknown states. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-06-21migration: Use MigrationStatus instead of intPeter Xu1-17/+7
QEMU uses "int" in most cases even if it stores MigrationStatus. I don't know why, so let's try to do that right and see what blows up.. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-06-21migration: Rename thread debug namesPeter Xu1-3/+3
The postcopy thread names on dest QEMU are slightly confusing, partly I'll need to blame myself on 36f62f11e4 ("migration: Postcopy preemption preparation on channel creation"). E.g., "fault-fast" reads like a fast version of "fault-default", but it's actually the fast version of "postcopy/listen". Taking this chance, rename all the migration threads with proper rules. Considering we only have 15 chars usable, prefix all threads with "mig/", meanwhile identify src/dst threads properly this time. So now most thread names will look like "mig/DIR/xxx", where DIR will be "src"/"dst", except the bg-snapshot thread which doesn't have a direction. For multifd threads, making them "mig/{src|dst}/{send|recv}_%d". We used to have "live_migration" thread for a very long time, now it's called "mig/src/main". We may hope to have "mig/dst/main" soon but not yet. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Zhijian Li (Fujitsu) <lizhijian@fujitsu.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-06-21migration/multifd: Add direct-io supportFabiano Rosas1-0/+23
When multifd is used along with mapped-ram, we can take benefit of a filesystem that supports the O_DIRECT flag and perform direct I/O in the multifd threads. This brings a significant performance improvement because direct-io writes bypass the page cache which would otherwise be thrashed by the multifd data which is unlikely to be needed again in a short period of time. To be able to use a multifd channel opened with O_DIRECT, we must ensure that a certain aligment is used. Filesystems usually require a block-size alignment for direct I/O. The way to achieve this is by enabling the mapped-ram feature, which already aligns its I/O properly (see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c). By setting O_DIRECT on the multifd channels, all writes to the same file descriptor need to be aligned as well, even the ones that come from outside multifd, such as the QEMUFile I/O from the main migration code. This makes it impossible to use the same file descriptor for the QEMUFile and for the multifd channels. The various flags and metadata written by the main migration code will always be unaligned by virtue of their small size. To workaround this issue, we'll require a second file descriptor to be used exclusively for direct I/O. The second file descriptor can be obtained by QEMU by re-opening the migration file (already possible), or by being provided by the user or management application (support to be added in future patches). Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-22Merge tag 'migration-20240522-pull-request' of ↵Richard Henderson1-6/+6
https://gitlab.com/farosas/qemu into staging Migration pull request - Li Zhijian's COLO minor fixes - Marc-André's virtio-gpu fix - Fiona's virtio-net USO fix - A couple of migration-test fixes from Thomas # -----BEGIN PGP SIGNATURE----- # # iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZObggQHGZhcm9zYXNA # c3VzZS5kZQAKCRDHmNx0G+wxnWE8D/49RGE+g29qyk9aKx3lU8mSq+ZzmX5GncBt # 5+Mx5qoHDsBCQTE+dQpEVIoeMJ2HIbgbOML4qsnp6Hw/4/TWkfwC/R6+ZmHBevRk # fVLkVh2JMHVg8Tq+0FO1X1QnMU03uJ7EAuWdDa8HqlJ5dQY/K3gDaku8oQBXk96X # 13pChSbMob76tdb+wiwbdEakabigH7XfrPdI6lzI8MCGTIcPKc/UKTFYuoj/OsNx # raqy+uBtvKtfHxiaYnIgHIPNAF/1f4tP3iAOcPoZWIMXWxFkE8+ANDJAbWo6xIcL # DGg/wEzZO/OnXLjOhjvLBUHK/fx4wQ5bsqA09BVxoRyBGblkXr+bcwBLYjgiEqzT # aniPiAx5W/Db+T7HqZPIWesFYj3cmcwvYUTrx/RPMdC0epG+ZczDMtescHdZbxvt # Pjs3nFeCLhyYcVhlTI72eXRCxdd/26+r6/OmrBC2+GaZrybM61TvNo+3XvO0Pfhi # UmwF2EN27XmSMelLvH/MnflUVgBHKDs3CCQzDlxreHq2jMVR0SL7LU5wMJJ58Iok # M3u74izQM25bwYxiASH+4iRn0puH1mOwgOx28W0uiQfZY/678/lCnwa1Tul15BRE # fIQZJhyIGzhSpwLqEXmdXdlLQs1isqIgpd/mzKgZ285nLr7kz+4gxCUqiXgVbrl7 # P45Dym1u4g== # =DDrh # -----END PGP SIGNATURE----- # gpg: Signature made Wed 22 May 2024 03:13:28 PM PDT # gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D # gpg: issuer "farosas@suse.de" # gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown] # gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D * tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu: tests/qtest/migration-test: Fix the check for a successful run of analyze-migration.py tests/qtest/migration-test: Run some basic tests on s390x and ppc64 with TCG, too hw/core/machine: move compatibility flags for VirtIO-net USO to machine 8.1 virtio-gpu: fix v2 migration migration: fix a typo migration: add "exists" info to load-state-field trace migration/colo: Tidy up bql_unlock() around bdrv_activate_all() migration/colo: make colo_incoming_co() return void migration/colo: Minor fix for colo error message Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-22migration/colo: make colo_incoming_co() return voidLi Zhijian1-3/+3
Currently, it always returns 0, no need to check the return value at all. In addition, enter colo coroutine only if migration_incoming_colo_enabled() is true. Once the destination side enters the COLO* state, the COLO process will take over the remaining processes until COLO exits. Cc: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> [fixed mangled author email address] Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-22migration/colo: Minor fix for colo error messageLi Zhijian1-3/+3
- Explicitly show the missing module name: replication - Fix capability name to x-colo Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Suggested-by: Michael Tokarev <mjt@tls.msk.ru> [fixed mangled author email address] Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-16migration: Extend migration_file_set_error() with Error* argumentCédric Le Goater1-2/+4
Use it to update the current error of the migration stream if available and if not, simply print out the error. Next changes will update with an error to report. Reviewed-by: Avihai Horon <avihaih@nvidia.com> Acked-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-05-08migration: Remove non-multifd compressionFabiano Rosas1-13/+0
The 'compress' migration capability enables the old compression code which has shown issues over the years and is thought to be less stable and tested than the more recent multifd-based compression. The old compression code has been deprecated in 8.2 and now is time to remove it. Deprecation commit 864128df46 ("migration: Deprecate old compression method"). Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-08migration: Remove block migrationFabiano Rosas1-15/+0
The block migration has been considered obsolete since QEMU 8.2 in favor of the more flexible storage migration provided by the blockdev-mirror driver. Two releases have passed so now it's time to remove it. Deprecation commit 66db46ca83 ("migration: Deprecate block migration"). Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-08migration: Remove 'blk/-b' option from migrate commandsFabiano Rosas1-27/+4
The block migration is considered obsolete and has been deprecated in 8.2. Remove the migrate command option that enables it. This only affects the QMP and HMP commands, the feature can still be accessed by setting the migration 'block' capability. The whole feature will be removed in a future patch. Deprecation commit 8846b5bfca ("migration: migrate 'blk' command option is deprecated."). Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-08migration: Remove 'inc' option from migrate commandFabiano Rosas1-17/+7
The block incremental option for block migration has been deprecated in 8.2 in favor of using the block-mirror feature. Remove it now. Deprecation commit 40101f320d ("migration: migrate 'inc' command option is deprecated."). Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-08migration: Remove 'skipped' field from MigrationStatsFabiano Rosas1-2/+0
The 'skipped' field of the MigrationStats struct has been deprecated in 8.1. Time to remove it. Deprecation commit 7b24d32634 ("migration: skipped field is really obsolete."). Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-08qapi: introduce exit-on-error parameter for migrate-incomingVladimir Sementsov-Ogievskiy1-6/+27
Now we do set MIGRATION_FAILED state, but don't give a chance to orchestrator to query migration state and get the error. Let's provide a possibility for QMP-based orchestrators to get an error like with outgoing migration. For hmp_migrate_incoming(), let's enable the new behavior: HMP is not and ABI, it's mostly intended to use by developer and it makes sense not to stop the process. For x-exit-preconfig, let's keep the old behavior: - it's called from init(), so here we want to keep current behavior by default - it does exit on error by itself as well So, if we want to change the behavior of x-exit-preconfig, it should be another patch. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-08migration: process_incoming_migration_co(): rework error reportingVladimir Sementsov-Ogievskiy1-10/+13
Unify error reporting in the function. This simplifies the following commit, which will not-exit-on-error behavior variant to the function. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-08migration: process_incoming_migration_co(): fix reporting s->errorVladimir Sementsov-Ogievskiy1-0/+1
It's bad idea to leave critical section with error object freed, but s->error still set, this theoretically may lead to use-after-free crash. Let's avoid it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-08migration: process_incoming_migration_co(): complete cleanup on failureVladimir Sementsov-Ogievskiy1-4/+1
Make call to migration_incoming_state_destroy(), instead of doing only partial of it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-05-08migration: move trace-point from migrate_fd_error to migrate_set_errorVladimir Sementsov-Ogievskiy1-1/+3
Cover more cases by trace-point. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-04-24Merge tag 'pull-error-2024-04-24' of https://repo.or.cz/qemu/armbru into stagingRichard Henderson1-1/+1
Error reporting patches for 2024-04-24 # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmYouloSHGFybWJydUBy # ZWRoYXQuY29tAAoJEDhwtADrkYZTzLwP+wQjCWJHpTB+uQ3+U5Tb77BUJxuEjDMj # txNIJBXHOo7erxTSCieLuQICm8e30z62QAK4nVStyMDcyGh1KfwdSDAxBFnuLpA2 # 7X5bXbvCrm4vXVASRTV1zKCYDlIXFfrMWLvN5KgM90RsodLcy0szlXg+qYyoIM3Z # 8zp0Ug0fQPFHiOAQJi9ZTOsCYJBhZc2sbzgQEmf/g6q9bJaZHzPEHvVT4AQhTAtn # 7BIJY+vGDZNZwbP/0obWy2lai3kbGak8OXpwq/bewdrxeRmvqmM7sk+V/P2tXQD+ # kZe0/HWuDoO5J8L3KHiJnBJ0KCk8fbo4I0T6v9vf55Sj8K0r7O9sykgXXWv8q0lO # GrQa0YcyWAckI41stYQpwEpIlRanuZv/p8OZFJIqsTAfaw7RlbIBYA9xZCUnTton # FbHO/t2BLfo8eO9/xRD4r1u6vMbVozImPETuUMPyLHzlrdw2thxddKQNInHYYZ2U # SvvaByceEP2UywOnOflZhVL2dIhhnrBztiW2Vqod1fQHpfBAcJn909PZIlPZyMkr # gUnABI/rtC/lW3pBee6HmfzJ6Fah0e0XCpCY20qFe27Bi/z3xKi5NWYuyAUG5csp # CuTsc4pXfPVj5Z+Mk4pyY8PK5k4jSa7vAVLCLTNzXJLZlJTb6yuf0HsJ7768nHDc # hSEIjLwQWYtw # =r8Rv # -----END PGP SIGNATURE----- # gpg: Signature made Wed 24 Apr 2024 12:52:58 AM PDT # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [undefined] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * tag 'pull-error-2024-04-24' of https://repo.or.cz/qemu/armbru: qapi: Inline and remove QERR_PROPERTY_VALUE_BAD definition qapi: Inline and remove QERR_MIGRATION_ACTIVE definition qapi: Correct error message for 'vcpu_dirty_limit' parameter qapi: Inline and remove QERR_INVALID_PARAMETER_TYPE definition qapi: Inline QERR_INVALID_PARAMETER_TYPE definition (constant value) qapi: Inline and remove QERR_INVALID_PARAMETER definition qapi: Inline and remove QERR_DEVICE_NO_HOTPLUG definition qapi: Inline and remove QERR_DEVICE_HAS_NO_MEDIUM definition qapi: Inline and remove QERR_BUS_NO_HOTPLUG definition error: Drop superfluous #include "qapi/qmp/qerror.h" Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-04-24qapi: Inline and remove QERR_MIGRATION_ACTIVE definitionPhilippe Mathieu-Daudé1-1/+1
Address the comment added in commit 4629ed1e98 ("qerror: Finally unused, clean up"), from 2015: /* * These macros will go away, please don't use * in new code, and do not add new ones! */ Mechanical transformation using sed, manually removing the definition in include/qapi/qmp/qerror.h. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240312141343.3168265-10-armbru@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> [Straightforward conflict with commit aeaafb1e59f (migration: export migration_is_running) resolved]
2024-04-23migration: Add Error** argument to qemu_savevm_state_setup()Cédric Le Goater1-2/+31
This prepares ground for the changes coming next which add an Error** argument to the .save_setup() handler. Callers of qemu_savevm_state_setup() now handle the error and fail earlier setting the migration state from MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED. In qemu_savevm_state(), move the cleanup to preserve the error reported by .save_setup() handlers. Since the previous behavior was to ignore errors at this step of migration, this change should be examined closely to check that cleanups are still correctly done. Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-7-clg@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-31migration/postcopy: Ensure postcopy_start() sets errp if it failsAvihai Horon1-0/+8
There are several places where postcopy_start() fails without setting errp. This can cause a null pointer de-reference, as in case of error, the caller of postcopy_start() copies/prints the error set in errp. Fix it by setting errp in all of postcopy_start() error paths. Cc: qemu-stable <qemu-stable@nongnu.org> Fixes: 908927db28ea ("migration: Update error description whenever migration fails") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240328140252.16756-3-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-31migration: Set migration error in migration_completion()Avihai Horon1-0/+10
After commit 9425ef3f990a ("migration: Use migrate_has_error() in close_return_path_on_source()"), close_return_path_on_source() assumes that migration error is set if an error occurs during migration. This may not be true if migration errors in migration_completion(). For example, if qemu_savevm_state_complete_precopy() errors, migration error will not be set. This in turn, will cause a migration hang bug, similar to the bug that was fixed by commit 22b04245f0d5 ("migration: Join the return path thread before releasing to_dst_file"), as shutdown() will not be issued for the return-path channel. Fix it by ensuring migration error is set in case of error in migration_completion(). Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Fixes: 9425ef3f990a ("migration: Use migrate_has_error() in close_return_path_on_source()") Acked-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20240328140252.16756-2-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-22migration/postcopy: Fix high frequency syncPeter Xu1-4/+3
With current code base I can observe extremely high sync count during precopy, as long as one enables postcopy-ram=on before switchover to postcopy. To provide some context of when QEMU decides to do a full sync: it checks must_precopy (which implies "data must be sent during precopy phase"), and as long as it is lower than the threshold size we calculated (out of bandwidth and expected downtime) QEMU will kick off the slow/exact sync. However, when postcopy is enabled (even if still during precopy phase), RAM only reports all pages as can_postcopy, and report must_precopy==0. Then "must_precopy <= threshold_size" mostly always triggers and enforces a slow sync for every call to migration_iteration_run() when postcopy is enabled even if not used. That is insane. It turns out it was a regress bug introduced in the previous refactoring in 8.0 as reported by Nina [1]: (a) c8df4a7aef ("migration: Split save_live_pending() into state_pending_*") Then a workaround patch is applied at the end of release (8.0-rc4) to fix it: (b) 28ef5339c3 ("migration: fix ram_state_pending_exact()") However that "workaround" was overlooked when during the cleanup in this 9.0 release in this commit.. (c) b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()") Then the issue was re-exposed as reported by Nina [1]. The problem with (b) is that it only fixed the case for RAM, rather than all the rest of iterators. Here a slow sync should only be required if all dirty data (precopy+postcopy) is less than the threshold_size that QEMU calculated. It is even debatable whether a sync is needed when switched to postcopy. Currently ram_state_pending_exact() will be mostly noop if switched to postcopy, and that logic seems to apply too for all the rest of iterators, as sync dirty bitmap during a postcopy doesn't make much sense. However let's leave such change for later, as we're in rc phase. So rather than reusing commit (b), this patch provides the complete fix for all iterators. When at it, cleanup a little bit on the lines around. [1] https://gitlab.com/qemu-project/qemu/-/issues/1565 Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Fixes: b0504edd40 ("migration: Drop unnecessary check in ram's pending_exact()") Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240320214453.584374-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-22migration: Revert mapped-ram multifd support to fd: URIFabiano Rosas1-13/+0
This reverts commit decdc76772c453ff1444612e910caa0d45cd8eac in full and also the relevant migration-tests from 7a09f092834641b7a793d50a3a261073bbb404a6. After the addition of the new QAPI-based migration address API in 8.2 we've been converting an "fd:" URI into a SocketAddress, missing the fact that the "fd:" syntax could also be used for a plain file instead of a socket. This is a problem because the SocketAddress is part of the API, so we're effectively asking users to create a "socket" channel to pass in a plain file. The easiest way to fix this situation is to deprecate the usage of both SocketAddress and "fd:" when used with a plain file for migration. Since this has been possible since 8.2, we can wait until 9.1 to deprecate it. For 9.0, however, we should avoid adding further support to migration to a plain file using the old "fd:" syntax or the new SocketAddress API, and instead require the usage of either the old-style "file:" URI or the FileMigrationArgs::filename field of the new API with the "/dev/fdset/NN" syntax, both of which are already supported. Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240319210941.1907-1-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-15migration/multifd: Ensure we're not given a socket for file migrationFabiano Rosas1-3/+3
When doing migration using the fd: URI, QEMU will fetch the file descriptor passed in via the monitor at fd_start_outgoing|incoming_migration(), which means the checks at migration_channels_and_transport_compatible() happen too soon and we don't know at that point whether the FD refers to a plain file or a socket. For this reason, we've been allowing a migration channel of type SOCKET_ADDRESS_TYPE_FD to pass the initial verifications in scenarios where the socket migration is not supported, such as with fd + multifd. The commit decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI") was supposed to add a second check prior to starting migration to make sure a socket fd is not passed instead of a file fd, but failed to do so. Add the missing verification and update the comment explaining this situation which is currently incorrect. Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI") Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240315032040.7974-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration: delete unused accessorsSteve Sistare1-10/+0
Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179338-294359-11-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration: migration_file_set_errorSteve Sistare1-0/+11
Define and export migration_file_set_error to eliminate a dependency on MigrationState. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179338-294359-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration: migration_is_deviceSteve Sistare1-0/+7
Define and export migration_is_device to eliminate a dependency on MigrationState. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179338-294359-8-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration: migration_thread_is_selfSteve Sistare1-0/+7
Define and export migration_thread_is_self to eliminate a dependency on MigrationState. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179338-294359-7-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration: export migration_is_runningSteve Sistare1-4/+6
Delete the MigrationState parameter from migration_is_running and move it to the public API in misc.h. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179338-294359-5-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration: export migration_is_activeSteve Sistare1-4/+6
Delete the MigrationState parameter from migration_is_active so it can be exported and used without including migration.h. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Link: https://lore.kernel.org/r/1710179338-294359-4-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11migration: export migration_is_setup_or_activeSteve Sistare1-6/+6
Delete the MigrationState parameter from migration_is_setup_or_active and move it to the public API in misc.h. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/1710179338-294359-3-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-01migration/multifd: Add mapped-ram support to fd: URIFabiano Rosas1-0/+4
If we receive a file descriptor that points to a regular file, there's nothing stopping us from doing multifd migration with mapped-ram to that file. Enable the fd: URI to work with multifd + mapped-ram. Note that the fds passed into multifd are duplicated because we want to avoid cross-thread effects when doing cleanup (i.e. close(fd)). The original fd doesn't need to be duplicated because monitor_get_fd() transfers ownership to the caller. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240229153017.2221-23-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-01migration/multifd: Support outgoing mapped-ram stream formatFabiano Rosas1-5/+12
The new mapped-ram stream format uses a file transport and puts ram pages in the migration file at their respective offsets and can be done in parallel by using the pwritev system call which takes iovecs and an offset. Add support to enabling the new format along with multifd to make use of the threading and page handling already in place. This requires multifd to stop sending headers and leaving the stream format to the mapped-ram code. When it comes time to write the data, we need to call a version of qio_channel_write that can take an offset. Usage on HMP is: (qemu) stop (qemu) migrate_set_capability multifd on (qemu) migrate_set_capability mapped-ram on (qemu) migrate_set_parameter max-bandwidth 0 (qemu) migrate_set_parameter multifd-channels 8 (qemu) migrate file:migfile Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-21-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-01migration/multifd: Add incoming QIOChannelFile supportFabiano Rosas1-1/+2
On the receiving side we don't need to differentiate between main channel and threads, so whichever channel is defined first gets to be the main one. And since there are no packets, use the atomic channel count to index into the params array. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-19-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-01migration: Add mapped-ram URI compatibility checkFabiano Rosas1-0/+29
The mapped-ram migration format needs a channel that supports seeking to be able to write each page to an arbitrary offset in the migration stream. Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-9-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-01migration/ram: Introduce 'mapped-ram' migration capabilityFabiano Rosas1-0/+7
Add a new migration capability 'mapped-ram'. The core of the feature is to ensure that RAM pages are mapped directly to offsets in the resulting migration file instead of being streamed at arbitrary points. The reasons why we'd want such behavior are: - The resulting file will have a bounded size, since pages which are dirtied multiple times will always go to a fixed location in the file, rather than constantly being added to a sequential stream. This eliminates cases where a VM with, say, 1G of RAM can result in a migration file that's 10s of GBs, provided that the workload constantly redirties memory. - It paves the way to implement O_DIRECT-enabled save/restore of the migration stream as the pages are ensured to be written at aligned offsets. - It allows the usage of multifd so we can write RAM pages to the migration file in parallel. For now, enabling the capability has no effect. The next couple of patches implement the core functionality. Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-8-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: Use migrate_has_error() in close_return_path_on_source()Cédric Le Goater1-2/+1
close_return_path_on_source() retrieves the migration error from the the QEMUFile '->to_dst_file' to know if a shutdown is required. This shutdown is required to exit the return-path thread. Avoid relying on '->to_dst_file' and use migrate_has_error() instead. (using to_dst_file is a heuristic to infer whether rp_state.from_dst_file might be stuck on a recvmsg(). Using a generic method for detecting errors is more reliable. We also want to reduce dependency on QEMUFile::last_error) Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> [added some words about the motivation for this patch] Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240226203122.22894-3-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: Join the return path thread before releasing to_dst_fileFabiano Rosas1-13/+9
The return path thread might hang at a blocking system call. Before joining the thread we might need to issue a shutdown() on the socket file descriptor to release it. To determine whether the shutdown() is necessary we look at the QEMUFile error. Make sure we only clean up the QEMUFile after the return path has been waited for. This fixes a hang when qemu_savevm_state_setup() produced an error that was detected by migration_detect_error(). That skips migration_completion() so close_return_path_on_source() would get stuck waiting for the RP thread to terminate. Reported-by: Cédric Le Goater <clg@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240226203122.22894-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: Fix qmp_query_migrate mbps valueFabiano Rosas1-9/+14
The QMP command query_migrate might see incorrect throughput numbers if it runs after we've set the migration completion status but before migration_calculate_complete() has updated s->total_time and s->mbps. The migration status would show COMPLETED, but the throughput value would be the one from the last iteration and not the one from the whole migration. This will usually be a larger value due to the time period being smaller (one iteration). Move migration_calculate_complete() earlier so that the status MIGRATION_STATUS_COMPLETED is only emitted after the final counters update. Keep everything under the BQL so the QMP thread sees the updates as atomic. Rename migration_calculate_complete to migration_completion_end to reflect its new purpose of also updating s->state. Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240226143335.14282-1-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: options incompatible with cprSteve Sistare1-0/+17
Fail the migration request if options are set that are incompatible with cpr. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-15-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: stop vm for cprSteve Sistare1-20/+31
When migration for cpr is initiated, stop the vm and set state RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the possibility of ram and device state being out of sync, and guarantees that a guest in the suspended state remains suspended, because qmp_cont rejects a cont command in the RUN_STATE_FINISH_MIGRATE state. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-11-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: notifier error checkingSteve Sistare1-9/+16
Check the status returned by migration notifiers for event type MIG_EVENT_PRECOPY_SETUP, and report errors. None of the notifiers return an error status at this time. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-10-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: refactor migrate_fd_connect failuresSteve Sistare1-5/+8
Move common code for the error path in migrate_fd_connect to a shared fail label. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: per-mode notifiersSteve Sistare1-5/+17
Keep a separate list of migration notifiers for each migration mode. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-8-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: MigrationNotifyFuncSteve Sistare1-2/+2
Define MigrationNotifyFunc to improve type safety and simplify migration notifiers. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-7-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: remove postcopy_after_devicesSteve Sistare1-7/+0
postcopy_after_devices and migration_in_postcopy_after_devices are no longer used, so delete them. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-6-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28migration: MigrationEvent for notifiersSteve Sistare1-5/+12
Passing MigrationState to notifiers is unsound because they could access unstable migration state internals or even modify the state. Instead, pass the minimal info needed in a new MigrationEvent struct, which could be extended in the future if needed. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-5-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>