Commit log for path: root/migration
(Each entry: date  commit subject  [author]  (files changed, -deleted/+added lines))
2023-10-04  migration: Unify and trace vmstate field_exists() checks  [Peter Xu]  (2 files, -8/+27)
For both save/load we actually share the logic on deciding whether a field should exist. Merge the checks into a helper and use it for both save and load. When doing so, add documentation and reformat the code to make it much easier to read. The real benefit here (besides code cleanup) is that we add a trace point for this; this is a known spot where we can easily break migration compatibility between binaries, and this trace point will be critical for us to identify such issues. For example, this will be handy when debugging things like: https://gitlab.com/qemu-project/qemu/-/issues/932 Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230906204722.514474-1-peterx@redhat.com>
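A minimal sketch of what such a unified helper could look like, assuming the VMStateField layout from include/migration/vmstate.h; the helper name and tracepoint are illustrative, not necessarily the exact code that was merged:

    static bool vmstate_field_exists(const VMStateDescription *vmsd,
                                     const VMStateField *field,
                                     void *opaque, int version_id)
    {
        bool result;

        if (field->field_exists) {
            /* If the predicate is provided, it decides alone. */
            result = field->field_exists(opaque, version_id);
        } else {
            /* Otherwise the field exists iff the stream is new enough. */
            result = field->version_id <= version_id;
        }

        /* One tracepoint shared by save and load makes cross-binary
         * compatibility breakages visible. */
        trace_vmstate_field_exists(vmsd->name, field->name,
                                   field->version_id, version_id, result);
        return result;
    }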
2023-10-04  migration: file URI offset  [Steve Sistare]  (1 file, -2/+43)
Allow an offset option to be specified as part of the file URI, in the form "file:filename,offset=offset", where offset accepts the common size suffixes, or the 0x prefix, but not both. Migration data is written to and read from the file starting at offset. If unspecified, it defaults to 0. This is needed by libvirt to store its own data at the head of the file. Suggested-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1694182931-61390-3-git-send-email-steven.sistare@oracle.com>
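For illustration, a self-contained sketch of the "size suffix or 0x prefix, but not both" rule; the real parser presumably builds on QEMU's qemu_strtosz() helpers rather than this hand-rolled version:

    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>

    static int parse_file_offset(const char *s, uint64_t *offset)
    {
        char *end;

        if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
            *offset = strtoull(s, &end, 16);
            return *end ? -EINVAL : 0;   /* no size suffix after 0x */
        }

        *offset = strtoull(s, &end, 10);
        switch (*end) {
        case 'k': case 'K': *offset <<= 10; end++; break;
        case 'M':           *offset <<= 20; end++; break;
        case 'G':           *offset <<= 30; end++; break;
        case 'T':           *offset <<= 40; end++; break;
        case '\0':          return 0;
        default:            return -EINVAL;
        }
        return *end ? -EINVAL : 0;
    }

Under this rule, "file:/tmp/mig.bin,offset=0x1000" and "file:/tmp/mig.bin,offset=4k" (hypothetical paths) are both accepted, while "offset=0x1k" is rejected.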
2023-10-04  migration: file URI  [Steve Sistare]  (5 files, -0/+86)
Extend the migration URI to support file:<filename>. This can be used for any migration scenario that does not require a reverse path. It can be used as an alternative to 'exec:cat > file' in minimized containers that do not contain /bin/sh, and it is easier to use than the fd:<fdname> URI. It can be used in HMP commands, and as a qemu command-line parameter. For best performance, guest ram should be shared and x-ignore-shared should be true, so guest pages are not written to the file, in which case the guest may remain running. If ram is not so configured, then the user is advised to stop the guest first. Otherwise, a busy guest may re-dirty the same page, causing it to be appended to the file multiple times, and the file may grow unboundedly. That issue is being addressed in the "fixed-ram" patch series. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Tested-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1694182931-61390-2-git-send-email-steven.sistare@oracle.com>
2023-10-04  migration/rdma: zero out head.repeat to make the error clearer  [Li Zhijian]  (1 file, -1/+1)
Previously, we got a confusing error complaining about RDMAControlHeader.repeat: qemu-system-x86_64: rdma: Too many requests in this message (3638950032). Bailing. Actually, it's caused by an unexpected RDMAControlHeader.type. After this patch, the error becomes: qemu-system-x86_64: Unknown control message QEMU FILE Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230926100103.201564-2-lizhijian@fujitsu.com>
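A minimal sketch of the idea behind the one-line fix (struct name per migration/rdma.c; the exact changed line is not reproduced here): zero-initialize the header so an unknown message type can never be reported with a garbage repeat count:

    RDMAControlHeader head = {};   /* instead of leaving head.repeat uninitialized */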
2023-10-04  migration: Update error description outside migration.c  [Tejus GK]  (2 files, -6/+18)
A few code paths exist in the source code, where a migration is marked as failed via MIGRATION_STATUS_FAILED, but the failure happens outside of migration.c. In such cases, an error_report() call is made; however, the current MigrationState is never updated with the error description, and hence clients like libvirt never know the actual reason for the failure. This patch covers such cases outside of migration.c and updates the error description at the appropriate places. Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Tejus GK <tejus.gk@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231003065538.244752-3-tejus.gk@nutanix.com>
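A sketch of the pattern this patch applies, assuming QEMU's migrate_set_error() stores a copy of the error for later retrieval via query-migrate; the message text is made up for the example:

    MigrationState *s = migrate_get_current();
    Error *local_err = NULL;

    error_setg(&local_err, "failed to load device state");
    migrate_set_error(s, local_err);   /* now visible to libvirt */
    error_report_err(local_err);       /* still printed; frees local_err */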
2023-10-04  migration/vmstate: Introduce vmstate_save_state_with_err  [Tejus GK]  (2 files, -4/+10)
Currently, a few code paths exist in the function vmstate_save_state_v, which ultimately lead to a migration failure. However, the error description in the current MigrationState is never updated. vmstate.c somehow doesn't seem to allow the use of migrate_set_error due to some dependencies for unit tests. Hence, this patch introduces a new function vmstate_save_state_with_err, which will eventually propagate the error message to savevm.c, where a migrate_set_error call can eventually be made. Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Tejus GK <tejus.gk@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231003065538.244752-2-tejus.gk@nutanix.com>
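A sketch of the shape of the new entry point, with the old call becoming a thin wrapper so existing callers are untouched; the parameter list is approximated from the surrounding vmstate API:

    int vmstate_save_state_with_err(QEMUFile *f, const VMStateDescription *vmsd,
                                    void *opaque, JSONWriter *vmdesc,
                                    Error **errp);

    int vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd,
                           void *opaque, JSONWriter *vmdesc)
    {
        /* Callers that cannot produce an Error object pass NULL. */
        return vmstate_save_state_with_err(f, vmsd, opaque, vmdesc, NULL);
    }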
2023-10-02  Merge tag 'migration-20231002-pull-request' of https://gitlab.com/juan.quintela/qemu into staging  [Stefan Hajnoczi]  (9 files, -85/+60)
Migration Pull request (20231002)

In this migration pull request:
- Refactor repeated call of yank_unregister_instance (tejus)
- More migration-test changes

Please, apply.

# [PGP signature and key fingerprint elided]
# gpg: Signature made Mon 02 Oct 2023 08:20:14 EDT
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]

* tag 'migration-20231002-pull-request' of https://gitlab.com/juan.quintela/qemu:
  migration/rdma: Simplify the function that saves a page
  migration: Remove unused qemu_file_credit_transfer()
  migration/rdma: Don't use imaginary transfers
  migration/rdma: Remove QEMUFile parameter when not used
  migration/RDMA: It is accounting for zero/normal pages in two places
  migration: Don't abuse qemu_file transferred for RDMA
  migration: Use qemu_file_transferred_noflush() for block migration.
  migration: Refactor repeated call of yank_unregister_instance
  migration-test: simplify shmem_opts handling
  migration-test: dirtylimit checks for x86_64 arch before
  migration-test: Add bootfile_create/delete() functions
  migration-test: bootpath is the same for all tests and for all archs
  migration-test: Create kvm_opts

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-09-29  migration/rdma: Simplify the function that saves a page  [Juan Quintela]  (4 files, -37/+18)
When we send a page through QEMUFile hooks (RDMA) there are three possibilities:
- We are not using RDMA: return RAM_SAVE_CONTROL_NOT_SUPP, and control_save_page() returns false to let anything else proceed.
- We are using RDMA and there is an error: we return a negative value, and control_save_page() needs to return true.
- Everything goes well and RDMA starts sending the page asynchronously: it returns RAM_SAVE_CONTROL_DELAYED and we need to return 1 for ram_save_page_legacy.
Clear? I know, I know, the interface is as bad as it gets. I think that now it is a bit clearer, but this needs to be done some other way. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-16-quintela@redhat.com>
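The resulting shape, as a hedged sketch (helper name and signature approximate migration/ram.c, not a verbatim copy):

    static bool control_save_page(PageSearchStatus *pss, RAMBlock *block,
                                  ram_addr_t offset, int *pages)
    {
        int ret = rdma_control_save_page(pss->pss_channel, block->offset,
                                         offset, TARGET_PAGE_SIZE);

        if (ret == RAM_SAVE_CONTROL_NOT_SUPP) {
            return false;        /* not using RDMA: caller handles the page */
        }
        if (ret == RAM_SAVE_CONTROL_DELAYED) {
            *pages = 1;          /* RDMA queued the page asynchronously */
            return true;
        }
        *pages = ret;            /* ret < 0: RDMA error for the caller */
        return true;
    }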
2023-09-29  migration: Remove unused qemu_file_credit_transfer()  [Juan Quintela]  (2 files, -13/+0)
After this change, nothing abuses QEMUFile to account for data transferred during migration. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-15-quintela@redhat.com>
2023-09-29  migration/rdma: Don't use imaginary transfers  [Juan Quintela]  (2 files, -5/+1)
The RDMA protocol is completely asynchronous, so qemu_rdma_save_page() "invents" that a byte has been transferred and then calls qemu_file_credit_transfer() and ram_transferred_add() with that byte. Just remove those calls, as nothing has actually been sent. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-14-quintela@redhat.com>
2023-09-29  migration/rdma: Remove QEMUFile parameter when not used  [Juan Quintela]  (1 file, -12/+11)
Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-13-quintela@redhat.com>
2023-09-29  migration/RDMA: It is accounting for zero/normal pages in two places  [Juan Quintela]  (1 file, -7/+0)
Remove the one in control_save_page(). Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-12-quintela@redhat.com>
2023-09-29  migration: Don't abuse qemu_file transferred for RDMA  [Juan Quintela]  (4 files, -5/+28)
Just create a variable for it, the same way that multifd does. This way it is safe to use from other threads, etc. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-11-quintela@redhat.com>
2023-09-29  migration: Use qemu_file_transferred_noflush() for block migration.  [Juan Quintela]  (1 file, -2/+2)
We only care about the number of bytes transferred. Flushing is done by the system somewhere else. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230530183941.7223-4-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-09-29  migration: Refactor repeated call of yank_unregister_instance  [Tejus GK]  (1 file, -4/+0)
In the function qmp_migrate(), yank_unregister_instance() gets called twice, which isn't required. Hence, refactor it so that it gets called once, during the local_error cleanup. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Tejus GK <tejus.gk@nutanix.com> Message-ID: <20230621130940.178659-3-tejus.gk@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-09-29  migration: Clean up local variable shadowing  [Markus Armbruster]  (4 files, -11/+11)
Local variables shadowing other local variables or parameters make the code needlessly hard to understand. Tracked down with -Wshadow=local. Clean up: delete inner declarations when they are actually redundant, else rename variables. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Message-ID: <20230921121312.1301864-3-armbru@redhat.com>
2023-09-29  migration/rdma: Fix save_page method to fail on polling error  [Markus Armbruster]  (1 file, -2/+4)
qemu_rdma_save_page() reports polling error with error_report(), then succeeds anyway. This is because the variable holding the polling status *shadows* the variable the function returns. The latter remains zero. Broken since day one, and duplicated more recently. Fixes: 2da776db4846 (rdma: core logic) Fixes: b390afd8c50b (migration/rdma: Fix out of order wrid) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Message-ID: <20230921121312.1301864-2-armbru@redhat.com>
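A self-contained demonstration of this bug class; compile with -Wshadow=local to get a warning. The stand-in functions are made up, but the control flow mirrors the description above:

    #include <stdio.h>

    static int poll_cq(void) { return -1; }   /* pretend polling always fails */

    static int save_page(void)
    {
        int ret = 0;

        for (int i = 0; i < 2; i++) {
            int ret = poll_cq();              /* BUG: shadows the outer ret */
            if (ret < 0) {
                fprintf(stderr, "rdma: polling error\n");
            }
        }
        return ret;                           /* always 0: error swallowed */
    }

    int main(void)
    {
        return save_page() == 0 ? 0 : 1;      /* exits 0 despite the errors */
    }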
2023-09-27  migration: Move return path cleanup to main migration thread  [Fabiano Rosas]  (1 file, -1/+10)
Now that the return path thread is allowed to finish during a paused migration, we can move the cleanup of the QEMUFiles to the main migration thread. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-9-farosas@suse.de>
2023-09-27  migration: Replace the return path retry logic  [Fabiano Rosas]  (2 files, -50/+11)
Replace the return path retry logic with finishing and restarting the thread. This fixes a race when resuming the migration that leads to a segfault. Currently when doing postcopy we consider that an IO error on the return path file could be due to a network intermittency. We then keep the thread alive but have it do cleanup of the 'from_dst_file' and wait on the 'postcopy_pause_rp' semaphore. When the user issues a migrate resume, a new return path is opened and the thread is allowed to continue. There's a race condition in the above mechanism. It is possible for the new return path file to be set up *before* the cleanup code in the return path thread has had a chance to run, leading to the *new* file being closed and the pointer set to NULL. When the thread is released after the resume, it tries to dereference 'from_dst_file' and crashes:

  Thread 7 "return path" received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 0x7fffd1dbf700 (LWP 9611)]
  0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
  154         return f->last_error;
  (gdb) bt
  #0  0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
  #1  0x00005555560e4983 in qemu_file_get_error (f=0x0) at ../migration/qemu-file.c:206
  #2  0x0000555555b9a1df in source_return_path_thread (opaque=0x555556e06000) at ../migration/migration.c:1876
  #3  0x000055555602e14f in qemu_thread_start (args=0x55555782e780) at ../util/qemu-thread-posix.c:541
  #4  0x00007ffff38d76ea in start_thread (arg=0x7fffd1dbf700) at pthread_create.c:477
  #5  0x00007ffff35efa6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Here's the race, as an interleaving of the migration, qmp and return path threads (important bit is open_return_path happening before migration_release_dst_files):

  qmp_migrate_pause()
    shutdown(ms->to_dst_file)
      f->last_error = -EIO
  migrate_detect_error()
    postcopy_pause()
      set_state(PAUSED)
      wait(postcopy_pause_sem)
  qmp_migrate(resume)
    migrate_fd_connect()
      resume = state == PAUSED
      open_return_path   <-- TOO SOON!
      set_state(RECOVER)
      post(postcopy_pause_sem)
  (incoming closes to_src_file)
  res = qemu_file_get_error(rp)
  migration_release_dst_files()
    ms->rp_state.from_dst_file = NULL
  post(postcopy_pause_rp_sem)
  postcopy_pause_return_path_thread()
    wait(postcopy_pause_rp_sem)
  rp = ms->rp_state.from_dst_file
  goto retry
  qemu_file_get_error(rp)
  SIGSEGV

We can keep the retry logic without having the thread alive and waiting. The only piece of data used by it is the 'from_dst_file' and it is only allowed to proceed after a migrate resume is issued and the semaphore released at migrate_fd_connect(). Move the retry logic to outside the thread by waiting for the thread to finish before pausing the migration. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-8-farosas@suse.de>
2023-09-27  migration: Consolidate return path closing code  [Fabiano Rosas]  (1 file, -15/+14)
We'll start calling the await_return_path_close_on_source() function from other parts of the code, so move all of the related checks and tracepoints into it. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-7-farosas@suse.de>
2023-09-27  migration: Remove redundant cleanup of postcopy_qemufile_src  [Fabiano Rosas]  (1 file, -6/+0)
This file is owned by the return path thread which is already doing cleanup. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-6-farosas@suse.de>
2023-09-27  migration: Fix possible race when shutting down to_dst_file  [Fabiano Rosas]  (1 file, -7/+11)
It's not safe to call qemu_file_shutdown() on the to_dst_file without first checking for the file's presence under the lock. The cleanup of this file happens at postcopy_pause() and migrate_fd_cleanup() which are not necessarily running in the same thread as migrate_fd_cancel(). Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-5-farosas@suse.de>
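A hedged sketch of the fixed pattern, using QEMU's lock-guard macro (field names per migration/migration.c; the actual patch may be structured differently):

    WITH_QEMU_LOCK_GUARD(&s->qemu_file_lock) {
        if (s->to_dst_file) {
            qemu_file_shutdown(s->to_dst_file);
        }
    }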
2023-09-27  migration: Fix possible races when shutting down the return path  [Fabiano Rosas]  (1 file, -9/+10)
We cannot call qemu_file_shutdown() on the return path file without taking the file lock. The return path thread could be running its cleanup code and have just cleared the from_dst_file pointer. Checking ms->to_dst_file for errors could also race with migrate_fd_cleanup(), which clears the to_dst_file pointer. Protect both accesses by taking the file lock. This was caught by inspection; it should be rare, but the next patches will start calling this code from other places, so let's do the correct thing. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-4-farosas@suse.de>
2023-09-27  migration: Fix possible race when setting rp_state.error  [Fabiano Rosas]  (1 file, -1/+0)
We don't need to set the rp_state.error right after a shutdown because qemu_file_shutdown() always sets the QEMUFile error, so the return path thread would have seen it and set the rp error itself. Setting the error outside of the thread is also racy because the thread could clear it after we set it. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-3-farosas@suse.de>
2023-09-27  migration: Fix race where the dest preempt thread closes too early  [Peter Xu]  (3 files, -3/+51)
We hit an intermittent CI issue where migration-test fails at the unit test preempt/plain:

  qemu-system-x86_64: Unable to read from socket: Connection reset by peer
  Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
  ** ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
  (test program exited with status code -6)

Fabiano debugged it and found that the preempt thread can quit even without receiving all the pages, which can leave the guest without all its pages and corrupt the guest memory. To make sure the preempt thread has finished receiving all the pages, we can rely on page_requested_count being zero, because the preempt channel will only receive requested page faults. Note, not all the faulted pages are required to be sent via the preempt channel/thread; imagine the case when a requested page is just queued into the background main channel for migration: the src qemu will still send it via the background channel. Here, instead of spinning over reading the count, we add a condvar so the main thread can wait on it if that unusual case happens, without burning cpu for no good reason, even if the duration is short; spinning in this rare case would probably be fine, but it's better not to. The condvar is only used when that special case is triggered. A memory ordering trick is needed to guarantee it works (against the preempt thread status field), so the main thread will always get a kick when that case triggers. Closes: https://gitlab.com/qemu-project/qemu/-/issues/1886 Debugged-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-2-farosas@suse.de>
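A hedged sketch of the synchronization described above (field names are illustrative; the actual patch may differ in detail):

    /* preempt thread, after fully receiving one requested page: */
    qemu_mutex_lock(&mis->page_request_mutex);
    if (--mis->page_requested_count == 0) {
        /* wake the main thread if it is waiting for the count to drain */
        qemu_cond_signal(&mis->page_request_cond);
    }
    qemu_mutex_unlock(&mis->page_request_mutex);

    /* main thread, before allowing the preempt thread to exit: */
    qemu_mutex_lock(&mis->page_request_mutex);
    while (mis->page_requested_count > 0) {
        qemu_cond_wait(&mis->page_request_cond, &mis->page_request_mutex);
    }
    qemu_mutex_unlock(&mis->page_request_mutex);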
2023-09-11  migration: Add .save_prepare() handler to struct SaveVMHandlers  [Avihai Horon]  (4 files, -4/+43)
Add a new .save_prepare() handler to struct SaveVMHandlers. This handler is called early, even before migration starts, and can be used by devices to perform early checks. Refactor migrate_init() to be able to return errors and call .save_prepare() from there. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
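A sketch of how a device would hook in, using the handler signature this patch adds; the VFIO function names and the helper check are illustrative:

    static int vfio_save_prepare(void *opaque, Error **errp)
    {
        /* early, device-specific checks; runs before migration starts */
        if (!migration_supported_by_device(opaque)) {   /* hypothetical check */
            error_setg(errp, "device cannot migrate in its current state");
            return -EOPNOTSUPP;
        }
        return 0;
    }

    static const SaveVMHandlers savevm_vfio_handlers = {
        .save_prepare = vfio_save_prepare,
        /* ...remaining handlers... */
    };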
2023-09-11  migration: Move more initializations to migrate_init()  [Avihai Horon]  (2 files, -10/+7)
Initialization of mig_stats, compression_counters and VFIO bytes transferred is hard-coded in migration code path and snapshot code path. Make the code cleaner by initializing them in migrate_init(). Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-09-11  migration: Add migration prefix to functions in target.c  [Avihai Horon]  (4 files, -10/+10)
The functions in target.c are not static, yet they don't have a proper migration prefix. Add such prefix. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-09-07  io: follow coroutine AioContext in qio_channel_yield()  [Stefan Hajnoczi]  (2 files, -13/+15)
The ongoing QEMU multi-queue block layer effort makes it possible for multiple threads to process I/O in parallel. The nbd block driver is not compatible with the multi-queue block layer yet because QIOChannel cannot be used easily from coroutines running in multiple threads. This series changes the QIOChannel API to make that possible. In the current API, calling qio_channel_attach_aio_context() sets the AioContext where qio_channel_yield() installs an fd handler prior to yielding:

  qio_channel_attach_aio_context(ioc, my_ctx);
  ...
  qio_channel_yield(ioc); // my_ctx is used here
  ...
  qio_channel_detach_aio_context(ioc);

This API design has limitations: reading and writing must be done in the same AioContext, and moving between AioContexts involves a cumbersome sequence of API calls that is not suitable for doing on a per-request basis. There is no fundamental reason why a QIOChannel needs to run within the same AioContext every time qio_channel_yield() is called. QIOChannel only uses the AioContext while inside qio_channel_yield(). The rest of the time, QIOChannel is independent of any AioContext. In the new API, qio_channel_yield() queries the AioContext from the current coroutine using qemu_coroutine_get_aio_context(). There is no need to explicitly attach/detach AioContexts anymore, and qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone. One coroutine can read from the QIOChannel while another coroutine writes from a different AioContext. This API change allows the nbd block driver to use QIOChannel from any thread. It's important to keep in mind that the block driver already synchronizes QIOChannel access and ensures that two coroutines never read simultaneously or write simultaneously. This patch updates all users of qio_channel_attach_aio_context() to the new API. Most conversions are simple, but vhost-user-server requires a new qemu_coroutine_yield() call to quiesce the vu_client_trip() coroutine when not attached to any AioContext. While the API has become simpler, there is one wart: QIOChannel has a special case for the iohandler AioContext (used for handlers that must not run in nested event loops). I didn't find an elegant way to preserve that behavior, so I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true|false) for opting in to the new AioContext model. By default QIOChannel uses the iohandler AioHandler. Code that formerly called qio_channel_attach_aio_context() now calls qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is created. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20230830224802.493686-5-stefanha@redhat.com> [eblake: also fix migration/rdma.c] Signed-off-by: Eric Blake <eblake@redhat.com>
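A sketch of a converted call site under the new model (the GIOCondition argument is shown as in the full qio_channel_yield() signature, which the simplified snippet above omits):

    /* once, after the QIOChannel is created */
    qio_channel_set_follow_coroutine_ctx(ioc, true);

    /* later, from any coroutine in any AioContext */
    qio_channel_yield(ioc, G_IO_IN);  /* fd handler follows the caller's context */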
2023-08-30  Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging  [Stefan Hajnoczi]  (1 file, -2/+9)
Pull request v3:
- Drop UFS emulation due to CI failures
- Add "aio-posix: zero out io_uring sqe user_data"

# [PGP signature and key fingerprint elided]
# gpg: Signature made Wed 30 Aug 2023 07:48:17 EDT
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [ultimate]

* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
  aio-posix: zero out io_uring sqe user_data
  tests/qemu-iotests/197: add testcase for CoR with subclusters
  block/io: align requests to subcluster_size
  block: add subcluster_size field to BlockDriverInfo
  block-migration: Ensure we don't crash during migration cleanup

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-08-30  block-migration: Ensure we don't crash during migration cleanup  [Fabiano Rosas]  (1 file, -2/+9)
We can fail the blk_insert_bs() at init_blk_migration(), leaving the BlkMigDevState without a dirty_bitmap and BlockDriverState. Account for the possibly missing elements when doing cleanup. Fix the following crashes:

  Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
  0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
  359         BlockDriverState *bs = bitmap->bs;
  #0  0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
  #1  0x0000555555bba331 in unset_dirty_tracking () at ../migration/block.c:371
  #2  0x0000555555bbad98 in block_migration_cleanup_bmds () at ../migration/block.c:681

  Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
  0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
  7073        QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
  #0  0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
  #1  0x0000555555e9734a in bdrv_op_unblock_all (bs=0x0, reason=0x0) at ../block.c:7095
  #2  0x0000555555bbae13 in block_migration_cleanup_bmds () at ../migration/block.c:690

Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-id: 20230731203338.27581-1-farosas@suse.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
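A hedged sketch of the guarded cleanup (field names follow the backtraces above; the exact code in migration/block.c may differ):

    /* Tolerate a BlkMigDevState that init_blk_migration() only partially
     * set up before failing. */
    if (bmds->dirty_bitmap) {
        bdrv_release_dirty_bitmap(bmds->dirty_bitmap);
    }
    bs = blk_bs(bmds->blk);
    if (bs) {
        bdrv_op_unblock_all(bs, bmds->blocker);
    }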
2023-08-29  migration/dirtyrate: Fix precision losses and g_usleep overshoot  [Andrei Gudkov]  (1 file, -2/+8)
Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Message-Id: <8ddb0d40d143f77aab8f602bd494e01e5fa01614.1691161009.git.gudkov.andrei@huawei.com> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
2023-07-26  migration/rdma: Split qemu_fopen_rdma() into input/output functions  [Juan Quintela]  (3 files, -33/+19)
This is how everything else in QEMUFile is structured. As a bonus, it is three fewer lines of code. Reviewed-by: Peter Xu <peterx@redhat.com> Message-ID: <20230530183941.7223-17-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  qemu-file: Make qemu_file_get_error_obj() static  [Juan Quintela]  (2 files, -2/+1)
It was not used outside of qemu_file.c anyway. Reviewed-by: Peter Xu <peterx@redhat.com> Message-ID: <20230530183941.7223-21-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  qemu-file: Simplify qemu_file_shutdown()  [Juan Quintela]  (1 file, -4/+2)
Reviewed-by: Peter Xu <peterx@redhat.com> Message-ID: <20230530183941.7223-20-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  qemu_file: Make qemu_file_is_writable() static  [Juan Quintela]  (2 files, -2/+1)
It is not used outside of qemu_file, and it shouldn't be. Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230530183941.7223-19-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  migration: Change qemu_file_transferred to noflush  [Juan Quintela]  (1 file, -1/+1)
We do a qemu_fclose() just after that, which also does a qemu_fflush(), so remove the redundant qemu_fflush(). Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20230530183941.7223-3-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  qemu-file: Rename qemu_file_transferred_ fast -> noflush  [Juan Quintela]  (4 files, -11/+10)
"Fast" doesn't say much. "Noflush" indicates more clearly that it is like qemu_file_transferred but without the flush. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20230530183941.7223-2-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  migration: enforce multifd and postcopy preempt to be set before incoming  [Wei Wang]  (1 file, -0/+15)
qemu_start_incoming_migration needs to check the number of multifd channels or postcopy ram channels to configure the backlog parameter (i.e. the maximum length to which the queue of pending connections for sockfd may grow) of listen(). So enforce the usage of postcopy-preempt and multifd as below:
- need to use "-incoming defer" on the destination; and
- set_capability and set_parameter need to be done before migrate_incoming
Otherwise, disable the use of the features and report error messages to remind users to adjust the commands. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230606101910.20456-2-wei.w.wang@intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Juan Quintela <quintela@redhat.com>
2023-07-26  migration: Update error description whenever migration fails  [Tejus GK]  (1 file, -7/+12)
There are places in migration.c where the migration is marked as failed with MIGRATION_STATUS_FAILED, but the failure reason is never updated. Hence libvirt doesn't know why the migration failed when it queries for it. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Tejus GK <tejus.gk@nutanix.com> Message-ID: <20230621130940.178659-2-tejus.gk@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  migration: Extend query-migrate to provide dirty page limit info  [Hyman Huang(黄勇)]  (2 files, -0/+20)
Extend query-migrate to provide throttle time and estimated ring full time with the dirty-limit capability enabled, through which we can observe if the dirty limit takes effect during live migration. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <168733225273.5845.15871826788879741674-8@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  migration: Implement dirty-limit convergence algo  [Hyman Huang(黄勇)]  (3 files, -0/+40)
Implement the dirty-limit convergence algo for live migration, which is kind of like the auto-converge algo but uses dirty-limit instead of cpu throttle to make migration convergent. Enable the dirty page limit if dirty_rate_high_cnt is greater than 2 when the dirty-limit capability is enabled; disable dirty-limit if the migration is canceled. Note that the "set_vcpu_dirty_limit" and "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-ID: <168733225273.5845.15871826788879741674-7@git.sr.ht> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  migration: Put the detection logic before auto-converge checking  [Hyman Huang(黄勇)]  (1 file, -10/+11)
This commit prepares for the implementation of the dirty-limit convergence algo. The detection logic for the throttling condition can apply to both the auto-converge and dirty-limit algos, so put it before the checking logic for the auto-converge feature. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <168733225273.5845.15871826788879741674-6@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  migration: Refactor auto-converge capability logic  [Hyman Huang(黄勇)]  (1 file, -1/+5)
Check if block migration is running before throttling the guest down in the auto-converge way. Note that this modification is kind of a code cleanup, because block migration does not depend on the auto-converge capability, so the order of checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-5@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  migration: Introduce dirty-limit capability  [Hyman Huang(黄勇)]  (2 files, -1/+23)
Introduce the migration dirty-limit capability, which can be turned on before live migration to limit the dirty page rate during live migration. Introduce the migrate_dirty_limit function to help check if the dirty-limit capability is enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that the period can be configured instead of hardcoded. The dirty-limit capability is kind of like auto-converge but uses the dirty limit instead of the traditional cpu-throttle to throttle the guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period" and "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-4@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  qapi/migration: Introduce vcpu-dirty-limit parameters  [Hyman Huang(黄勇)]  (2 files, -0/+29)
Introduce the "vcpu-dirty-limit" migration parameter, used to limit the dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. These two parameters are used to help implement the dirty page rate limit algo of migration. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-3@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  qapi/migration: Introduce x-vcpu-dirty-limit-period parameter  [Hyman Huang(黄勇)]  (2 files, -0/+36)
Introduce the "x-vcpu-dirty-limit-period" experimental migration parameter, which is in the range of 1 to 1000ms and is used to make the dirty-rate calculation period configurable. As "x-vcpu-dirty-limit-period" varies, the total time of live migration changes; test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000ms. "x-vcpu-dirty-limit-period" should be made stable once it has proven itself; the best value can not yet be determined from the developers' experiments alone. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-2@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26  migration/multifd: Protect accesses to migration_threads  [Fabiano Rosas]  (2 files, -3/+14)
This doubly linked list is common for all the multifd and migration threads so we need to avoid concurrent access. Add a mutex to protect the data from concurrent access. This fixes a crash when removing two MigrationThread objects from the list at the same time during cleanup of multifd threads. Fixes: 671326201d ("migration: Introduce interface query-migrationthreads") Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230607161306.31425-3-farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com>
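A hedged sketch of the locking added around the shared list (function and field names approximate migration/threadinfo.c):

    static QemuMutex migration_threads_lock;
    static QLIST_HEAD(, MigrationThread) migration_threads;

    MigrationThread *migration_threads_add(const char *name)
    {
        MigrationThread *thread = g_new0(MigrationThread, 1);

        thread->name = name;
        WITH_QEMU_LOCK_GUARD(&migration_threads_lock) {
            QLIST_INSERT_HEAD(&migration_threads, thread, node);
        }
        return thread;
    }

    void migration_threads_remove(MigrationThread *thread)
    {
        WITH_QEMU_LOCK_GUARD(&migration_threads_lock) {
            QLIST_REMOVE(thread, node);
        }
        g_free(thread);
    }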
2023-07-26  migration/multifd: Rename threadinfo.c functions  [Fabiano Rosas]  (4 files, -9/+8)
We're about to add more functions to this file so make it use the same coding style as the rest of the code. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20230607161306.31425-2-farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-25  migration: spelling fixes  [Michael Tokarev]  (7 files, -8/+8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Fabiano Rosas <farosas@suse.de>