path: root/migration/rdma.c
Age  Commit message  Author  Files  Lines
2023-10-11migration/rdma: Fix qemu_get_cm_event_timeout() to always set errorMarkus Armbruster1-2/+8
qemu_get_cm_event_timeout() neglects to set an error when it fails because rdma_get_cm_event() fails. Harmless, as its caller qemu_rdma_connect() substitutes a generic error then. Fix it anyway. qemu_rdma_connect() also sets the generic error when its own call of rdma_get_cm_event() fails. Make the error handling more obvious: set a specific error right after rdma_get_cm_event() fails. Delete the generic error. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-22-armbru@redhat.com>
2023-10-11migration/rdma: Fix qemu_rdma_broken_ipv6_kernel() to set errorMarkus Armbruster1-0/+2
qemu_rdma_resolve_host() and qemu_rdma_dest_init() try addresses until they find one that works. If none works, they return the first Error set by qemu_rdma_broken_ipv6_kernel(), or else return a generic one. qemu_rdma_broken_ipv6_kernel() neglects to set an Error when ibv_open_device() fails. If a later address fails differently, we use that Error instead, or else the generic one. Harmless enough, but needs fixing all the same. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-21-armbru@redhat.com>
2023-10-11migration/rdma: Replace dangerous macro CHECK_ERROR_STATE()Markus Armbruster1-16/+27
Hiding return statements in macros is a bad idea. Use a function instead, and open code the return part. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-20-armbru@redhat.com>
2023-10-11migration/rdma: Fix io_writev(), io_readv() methods to obey contractMarkus Armbruster1-2/+10
QIOChannelClass methods qio_channel_rdma_readv() and qio_channel_rdma_writev() violate their method contract when rdma->error_state is non-zero: 1. They return whatever is in rdma->error_state then. Only -1 will be fine. -2 will be misinterpreted as "would block". Anything less than -2 isn't defined in the contract. A positive value would be misinterpreted as success, but I believe that's not actually possible. 2. They neglect to set an error then. If something up the call stack dereferences the error when failure is returned, it will crash. If it ignores the return value and checks the error instead, it will miss the error. Crap like this happens when return statements hide in macros, especially when their uses are far away from the definition. I elected not to investigate how callers are impacted. Expand the two bad macro uses, so we can set an error and return -1. The next commit will then get rid of the macro altogether. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-19-armbru@redhat.com>
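A rough sketch of the contract described above, under stated assumptions: demo_readv() and DemoContext are hypothetical, and a plain string pointer stands in for QEMU's Error object. On failure the method returns exactly -1 and always sets the error, whatever value the internal error state happens to hold.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical stand-in for RDMAContext. */
typedef struct {
    int error_state;     /* may hold any non-zero value after a failure */
} DemoContext;

/*
 * Hypothetical read method. The message pointer plays the role of
 * Error **errp. The internal error_state is never returned directly,
 * because -2 would be read as "would block" by the caller.
 */
static ssize_t demo_readv(const DemoContext *ctx, size_t want,
                          const char **errmsg)
{
    assert(ctx != NULL && errmsg != NULL);

    if (ctx->error_state) {
        *errmsg = "RDMA is in an error state";  /* always set on failure */
        return -1;                              /* the only valid error value here */
    }

    *errmsg = NULL;
    return (ssize_t)want;    /* pretend the whole request was satisfied */
}
```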
2023-10-11migration/rdma: Ditch useless numeric error codes in error messagesMarkus Armbruster1-10/+10
Several error messages include numeric error codes returned by failed functions:
* ibv_poll_cq() returns an unspecified negative value. Useless.
* rdma_accept and rdma_get_cm_event() return -1. Useless.
* qemu_rdma_poll() returns either -1 or an unspecified negative value. Useless.
* qemu_rdma_block_for_wrid(), qemu_rdma_write_flush(), qemu_rdma_exchange_send(), qemu_rdma_exchange_recv(), qemu_rdma_write() return a negative value that may or may not be an errno value. While reporting human-readable errno information (which a number is not) can be useful, reporting an error code that may or may not be an errno value is useless.
Drop these error codes from the error messages.
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-18-armbru@redhat.com>
2023-10-11migration/rdma: Fix or document problematic uses of errnoMarkus Armbruster1-6/+39
We use errno after calling Libibverbs functions that are not documented to set errno (manual page does not mention errno), or where the documentation is unclear ("returns [...] the value of errno on failure"). While this could be read as "sets errno and returns it", a glance at the source code[*] kills that hope:

    static inline int ibv_post_send(struct ibv_qp *qp, struct ibv_send_wr *wr,
                                    struct ibv_send_wr **bad_wr)
    {
        return qp->context->ops.post_send(qp, wr, bad_wr);
    }

The callback can be

    static int mana_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
                              struct ibv_send_wr **bad)
    {
        /* This version of driver supports RAW QP only.
         * Posting WR is done directly in the application.
         */
        return EOPNOTSUPP;
    }

Neither of them touches errno. One of these errno uses is easy to fix, so do that now. Several more will go away later in the series; add temporary FIXME comments. Three will remain; add TODO comments. TODO, not FIXME, because the bug might be in Libibverbs documentation.
[*] https://github.com/linux-rdma/rdma-core.git commit 55fa316b4b18f258d8ac1ceb4aa5a7a35b094dcf
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-17-armbru@redhat.com>
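The pitfall can be reproduced without Libibverbs. In the sketch below, demo_post_send() is a hypothetical stand-in that behaves like the callback quoted above: it reports failure through its return value and never writes errno. A caller that inspects errno afterwards therefore learns nothing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a provider callback: failure is reported
 * through the return value only, errno is left alone.
 */
static int demo_post_send(void)
{
    return EOPNOTSUPP;
}

/*
 * Calls the function and records what errno looked like right after.
 * The return value carries the error; errno still holds whatever it
 * held before the call (0 here, since we reset it first).
 */
static int demo_post_send_and_probe(int *errno_after)
{
    int ret;

    assert(errno_after != NULL);
    errno = 0;
    ret = demo_post_send();
    *errno_after = errno;
    return ret;
}
```

Code that wants a readable message in this situation would pass the return value, not errno, to strerror().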
2023-10-11migration/rdma: Use bool for two RDMAContext flagsMarkus Armbruster1-3/+3
@error_reported and @received_error are flags. The latter is even assigned bool true. Change them from int to bool. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-16-armbru@redhat.com>
2023-10-11migration/rdma: Make qemu_rdma_buffer_mergeable() return boolMarkus Armbruster1-10/+10
qemu_rdma_buffer_mergeable() is semantically a predicate. It returns int 0 or 1. Return bool instead, and fix the function name's spelling. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-15-armbru@redhat.com>
2023-10-11migration/rdma: Drop qemu_rdma_search_ram_block() error handlingMarkus Armbruster1-16/+8
qemu_rdma_search_ram_block() can't fail. Return void, and drop the unreachable error handling. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-14-armbru@redhat.com>
2023-10-11migration/rdma: Drop rdma_add_block() error handlingMarkus Armbruster1-21/+9
rdma_add_block() can't fail. Return void, and drop the unreachable error handling. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-13-armbru@redhat.com>
2023-10-11migration/rdma: Eliminate error_propagate()Markus Armbruster1-12/+7
When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-12-armbru@redhat.com>
2023-10-11migration/rdma: Put @errp parameter lastMarkus Armbruster1-3/+4
include/qapi/error.h demands: * - Functions that use Error to report errors have an Error **errp * parameter. It should be the last parameter, except for functions * taking variable arguments. qemu_rdma_connect() does not conform. Clean it up. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-11-armbru@redhat.com>
2023-10-11migration/rdma: Fix qemu_rdma_accept() to return failure on errorsMarkus Armbruster1-7/+12
qemu_rdma_accept() returns 0 in some cases even when it didn't complete its job due to errors. Impact is not obvious. I figure the caller will soon fail again with a misleading error message. Fix it to return -1 on any failure. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-10-armbru@redhat.com>
2023-10-11migration/rdma: Give qio_channel_rdma_source_funcs internal linkageMarkus Armbruster1-1/+1
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-9-armbru@redhat.com>
2023-10-11migration/rdma: Clean up two more harmless signed vs. unsigned issuesMarkus Armbruster1-6/+5
qemu_rdma_exchange_get_response() compares int parameter @expecting with uint32_t head->type. Actual arguments are non-negative enumeration constants, RDMAControlHeader uint32_t member type, or qemu_rdma_exchange_recv() int parameter expecting. Actual arguments for the latter are non-negative enumeration constants. Change both parameters to uint32_t. In qio_channel_rdma_readv(), loop control variable @i is ssize_t, and counts from 0 up to @niov, which is size_t. Change @i to size_t. While there, make qio_channel_rdma_readv() and qio_channel_rdma_writev() more consistent: change the former's @done to ssize_t, and delete the latter's useless initialization of @len. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-8-armbru@redhat.com>
2023-10-11migration/rdma: Fix unwanted integer truncationMarkus Armbruster1-7/+7
qio_channel_rdma_readv() assigns the size_t value of qemu_rdma_fill() to an int variable before it adds it to @done / subtracts it from @want, both size_t. Truncation when qemu_rdma_fill() copies more than INT_MAX bytes. Seems vanishingly unlikely, but needs fixing all the same. Fixes: 6ddd2d76ca6f (migration: convert RDMA to use QIOChannel interface) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-7-armbru@redhat.com>
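A small sketch of the fixed pattern, assuming hypothetical helpers (demo_fill() and demo_read_all() are not QEMU functions): the byte count stays in a size_t from the point it is produced to the point it is added to the running total, so a count above INT_MAX is never squeezed through an int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for qemu_rdma_fill(): reports bytes copied. */
static size_t demo_fill(size_t available, size_t want)
{
    return available < want ? available : want;
}

/*
 * Drains up to 'want' bytes from a list of chunk sizes. The per-chunk
 * length is kept in a size_t; storing it in an int first is the
 * truncation this commit removes.
 */
static size_t demo_read_all(const size_t *chunks, size_t nchunks, size_t want)
{
    size_t done = 0;

    assert(chunks != NULL || nchunks == 0);
    for (size_t i = 0; i < nchunks && want > 0; i++) {
        size_t len = demo_fill(chunks[i], want);   /* size_t, not int */

        done += len;
        want -= len;
    }
    return done;
}
```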
2023-10-11migration/rdma: Consistently use uint64_t for work request IDsMarkus Armbruster1-3/+4
We use int instead of uint64_t in a few places. Change them to uint64_t. This cleans up a comparison of signed qemu_rdma_block_for_wrid() parameter @wrid_requested with unsigned @wr_id. Harmless, because the actual arguments are non-negative enumeration constants. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-6-armbru@redhat.com>
2023-10-11migration/rdma: Drop fragile wr_id formattingMarkus Armbruster1-25/+7
wrid_desc[] uses 4001 pointers to map four integer values to strings. print_wrid() accesses wrid_desc[] out of bounds when passed a negative argument. It returns null for values 2..1999 and 2001..3999. qemu_rdma_poll() and qemu_rdma_block_for_wrid() print wrid_desc[wr_id] and pass print_wrid(wr_id) to tracepoints. Could conceivably crash trying to format a null string. I believe access out of bounds is not possible. Not worth cleaning up. Dumb down to show just numeric wr_id. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-5-armbru@redhat.com>
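"Show just numeric wr_id" can be sketched as below. The helper name demo_format_wrid() and the message text are assumptions made for illustration; the point is only that formatting the number directly needs no lookup table and cannot hand a null string to a formatter.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats a work request ID as a plain number.
 * Returns what snprintf() returns, i.e. the length the full text needs.
 */
static int demo_format_wrid(char *buf, size_t len, uint64_t wr_id)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "wr_id %" PRIu64, wr_id);
}
```

Any value is printable this way, including the ones in the ranges 2..1999 and 2001..3999 for which the old table held null pointers.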
2023-10-11migration/rdma: Clean up rdma_delete_block()'s return typeMarkus Armbruster1-3/+1
rdma_delete_block() always returns 0, which its only caller ignores. Return void instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-4-armbru@redhat.com>
2023-10-11migration/rdma: Clean up qemu_rdma_data_init()'s return typeMarkus Armbruster1-1/+1
qemu_rdma_data_init() return type is void *. It actually returns RDMAContext *, and all its callers assign the value to an RDMAContext *. Unclean. Return RDMAContext * instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-3-armbru@redhat.com>
2023-10-11migration/rdma: Clean up qemu_rdma_poll()'s return typeMarkus Armbruster1-2/+2
qemu_rdma_poll()'s return type is uint64_t, even though it returns 0, -1, or @ret, which is int. Its callers assign the return value to int variables, then check whether it's negative. Unclean. Return int instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-2-armbru@redhat.com>
2023-10-04migration/rdma: zore out head.repeat to make the error more clearLi Zhijian1-1/+1
Previously, we got a confusing error that complains about RDMAControlHeader.repeat: qemu-system-x86_64: rdma: Too many requests in this message (3638950032).Bailing. Actually, it's caused by an unexpected RDMAControlHeader.type. After this patch, the error becomes: qemu-system-x86_64: Unknown control message QEMU FILE Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230926100103.201564-2-lizhijian@fujitsu.com>
2023-10-02Merge tag 'migration-20231002-pull-request' of https://gitlab.com/juan.quintela/qemu into stagingStefan Hajnoczi1-30/+34
Migration Pull request (20231002)
In this migration pull request:
- Refactor repeated call of yank_unregister_instance (tejus)
- More migration-test changes
Please, apply.
# gpg: Signature made Mon 02 Oct 2023 08:20:14 EDT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* tag 'migration-20231002-pull-request' of https://gitlab.com/juan.quintela/qemu:
  migration/rdma: Simplify the function that saves a page
  migration: Remove unused qemu_file_credit_transfer()
  migration/rdma: Don't use imaginary transfers
  migration/rdma: Remove QEMUFile parameter when not used
  migration/RDMA: It is accounting for zero/normal pages in two places
  migration: Don't abuse qemu_file transferred for RDMA
  migration: Use qemu_file_transferred_noflush() for block migration.
  migration: Refactor repeated call of yank_unregister_instance
  migration-test: simplify shmem_opts handling
  migration-test: dirtylimit checks for x86_64 arch before
  migration-test: Add bootfile_create/delete() functions
  migration-test: bootpath is the same for all tests and for all archs
  migration-test: Create kvm_opts
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-09-29migration/rdma: Simplify the function that saves a pageJuan Quintela1-16/+3
When we send a page through QEMUFile hooks (RDMA) there are three possibilities:
- We are not using RDMA. return RAM_SAVE_CONTROL_DELAYED and control_save_page() returns false to let anything else proceed.
- There is an error but we are using RDMA. Then we return a negative value, control_save_page() needs to return true.
- Everything goes well and RDMA starts sending the page asynchronously. It returns RAM_SAVE_CONTROL_DELAYED and we need to return 1 for ram_save_page_legacy.
Clear? I know, I know, the interface is as bad as it gets. I think that now it is a bit clearer, but this needs to be done some other way. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-16-quintela@redhat.com>
2023-09-29migration/rdma: Remove QEMUFile parameter when not usedJuan Quintela1-12/+11
Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-13-quintela@redhat.com>
2023-09-29migration: Don't abuse qemu_file transferred for RDMAJuan Quintela1-2/+20
Just create a variable for it, the same way that multifd does. This way it is safe to use from other threads, etc. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-11-quintela@redhat.com>
2023-09-29migration: Clean up local variable shadowingMarkus Armbruster1-3/+5
Local variables shadowing other local variables or parameters make the code needlessly hard to understand. Tracked down with -Wshadow=local. Clean up: delete inner declarations when they are actually redundant, else rename variables. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Message-ID: <20230921121312.1301864-3-armbru@redhat.com>
2023-09-29migration/rdma: Fix save_page method to fail on polling errorMarkus Armbruster1-2/+4
qemu_rdma_save_page() reports polling error with error_report(), then succeeds anyway. This is because the variable holding the polling status *shadows* the variable the function returns. The latter remains zero. Broken since day one, and duplicated more recently. Fixes: 2da776db4846 (rdma: core logic) Fixes: b390afd8c50b (migration/rdma: Fix out of order wrid) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Message-ID: <20230921121312.1301864-2-armbru@redhat.com>
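The bug class described above is easy to reproduce in isolation. The sketch below uses hypothetical names (demo_poll(), demo_save_page_shadowed(), demo_save_page_fixed()); it is not the rdma.c code, only the same shape: an inner declaration shadows the variable the function returns, so the failure never reaches the caller.

```c
#include <assert.h>

/* Hypothetical polling step: returns -1 on failure, 0 on success. */
static int demo_poll(int fail)
{
    assert(fail == 0 || fail == 1);
    return fail ? -1 : 0;
}

/* Buggy shape: the inner 'ret' shadows the one that is returned. */
static int demo_save_page_shadowed(int fail)
{
    int ret = 0;
    int reported = 0;

    {
        int ret = demo_poll(fail);  /* shadows the outer ret */

        if (ret < 0) {
            reported++;             /* error is noticed here ... */
        }
    }
    (void)reported;
    return ret;                     /* ... but the outer ret is still 0 */
}

/* Fixed shape: assign to the outer variable and fail on error. */
static int demo_save_page_fixed(int fail)
{
    int ret = demo_poll(fail);

    if (ret < 0) {
        return ret;
    }
    return 0;
}
```

Compiling the buggy variant with GCC's -Wshadow=local, the option mentioned in the neighbouring "Clean up local variable shadowing" commit, flags the inner declaration.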
2023-09-07io: follow coroutine AioContext in qio_channel_yield()Stefan Hajnoczi1-12/+13
The ongoing QEMU multi-queue block layer effort makes it possible for multiple threads to process I/O in parallel. The nbd block driver is not compatible with the multi-queue block layer yet because QIOChannel cannot be used easily from coroutines running in multiple threads. This series changes the QIOChannel API to make that possible. In the current API, calling qio_channel_attach_aio_context() sets the AioContext where qio_channel_yield() installs an fd handler prior to yielding: qio_channel_attach_aio_context(ioc, my_ctx); ... qio_channel_yield(ioc); // my_ctx is used here ... qio_channel_detach_aio_context(ioc); This API design has limitations: reading and writing must be done in the same AioContext and moving between AioContexts involves a cumbersome sequence of API calls that is not suitable for doing on a per-request basis. There is no fundamental reason why a QIOChannel needs to run within the same AioContext every time qio_channel_yield() is called. QIOChannel only uses the AioContext while inside qio_channel_yield(). The rest of the time, QIOChannel is independent of any AioContext. In the new API, qio_channel_yield() queries the AioContext from the current coroutine using qemu_coroutine_get_aio_context(). There is no need to explicitly attach/detach AioContexts anymore and qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone. One coroutine can read from the QIOChannel while another coroutine writes from a different AioContext. This API change allows the nbd block driver to use QIOChannel from any thread. It's important to keep in mind that the block driver already synchronizes QIOChannel access and ensures that two coroutines never read simultaneously or write simultaneously. This patch updates all users of qio_channel_attach_aio_context() to the new API. Most conversions are simple, but vhost-user-server requires a new qemu_coroutine_yield() call to quiesce the vu_client_trip() coroutine when not attached to any AioContext. 
While the API has become simpler, there is one wart: QIOChannel has a special case for the iohandler AioContext (used for handlers that must not run in nested event loops). I didn't find an elegant way to preserve that behavior, so I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true|false) for opting in to the new AioContext model. By default QIOChannel uses the iohandler AioHandler. Code that formerly called qio_channel_attach_aio_context() now calls qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is created. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20230830224802.493686-5-stefanha@redhat.com> [eblake: also fix migration/rdma.c] Signed-off-by: Eric Blake <eblake@redhat.com>
2023-07-26migration/rdma: Split qemu_fopen_rdma() into input/output functionsJuan Quintela1-20/+19
This is how everything else in QEMUFile is structured. As a bonus, there are three fewer lines of code. Reviewed-by: Peter Xu <peterx@redhat.com> Message-ID: <20230530183941.7223-17-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-30aio: remove aio_disable_external() APIStefan Hajnoczi1-8/+8
All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handlers that were registered is_external=true is therefore dead code. Remove aio_disable_external(), aio_enable_external(), and the is_external arguments to aio_set_fd_handler() and aio_set_event_notifier(). The entire test-fdmon-epoll test is removed because its sole purpose was testing aio_disable_external(). Parts of this patch were generated using the following coccinelle (https://coccinelle.lip6.fr/) semantic patch: @@ expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque; @@ - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque) + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque) @@ expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; @@ - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready) + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready) Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230516190238.8401-21-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-18migration: split migration_incoming_coVladimir Sementsov-Ogievskiy1-3/+2
Originally, migration_incoming_co was introduced by 25d0c16f625feb3b6 "migration: Switch to COLO process after finishing loadvm" to be able to enter from COLO code to one specific yield point, added by 25d0c16f625feb3b6. Later in 923709896b1b0 "migration: poll the cm event for destination qemu" we reused this variable to wake the migration incoming coroutine from RDMA code. That was a doubtful idea. Entering coroutines is a very fragile thing: you should be absolutely sure which yield point you are going to enter. I don't know how safe it is to enter during qemu_loadvm_state(), which I think is what RDMA wants to do. But for sure RDMA shouldn't enter the special COLO-related yield point. Likewise, COLO code doesn't want to enter during qemu_loadvm_state(); it wants to enter its own specific yield point. Also, when in 8e48ac95865ac97d "COLO: Add block replication into colo process" we added the bdrv_invalidate_cache_all() call (now it's called activate_all()), it became possible to enter the migration incoming coroutine during that call, which is wrong too. So, let's make these things separate and disjoint: loadvm_co for RDMA, non-NULL during qemu_loadvm_state(), and colo_incoming_co for COLO, non-NULL only around the specific yield. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515130640.46035-3-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-05migration/rdma: Check for postcopy soonerJuan Quintela1-12/+12
It makes no sense to first check whether there is an rdma error and then do nothing in the postcopy stage. Change it so we check that we are in postcopy before doing anything. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230504114443.23891-6-quintela@redhat.com>
2023-05-05migration/rdma: We can calculate the rioc from the QEMUFileJuan Quintela1-3/+3
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230504114443.23891-4-quintela@redhat.com>
2023-05-05migration/rdma: Don't pass the QIOChannelRDMA as an opaqueJuan Quintela1-3/+3
We can calculate it from the QEMUFile like the caller. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230503131847.11603-6-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-03migration/rdma: Unfold last user of acct_update_position()Juan Quintela1-1/+3
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de>
2023-05-03migration/rdma: Split the zero page case from acct_update_positionJuan Quintela1-2/+5
Now that we have atomic counters, we can do it on the place that we need it, no need to do it inside ram.c. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de>
2023-04-24migration: Create migrate_rdma_pin_all() functionJuan Quintela1-3/+3
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> --- Fixed missing space after comma (fabiano)
2023-04-24migration: Move migrate_use_return() to options.cJuan Quintela1-3/+3
While we are there, we rename the function to migrate_return_path() to be consistent with all other capabilities. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-04-24migration: rename enabled_capabilities to capabilitiesJuan Quintela1-2/+2
It is clear from the context what that means, and such a long name, together with the extra long names of the capabilities, makes it very difficult to stay inside the 80 column limit. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-03-16migration/rdma: Remove deprecated variable rdma_return_pathLi Zhijian1-2/+1
It's no longer needed since commit 44bcfd45e98 ("migration/rdma: destination: create the return patch after the first accept") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-03-16migration/rdma: Fix return-path caseDr. David Alan Gilbert1-3/+5
The RDMA code has return-path handling code, but it's only enabled if postcopy is enabled; if the 'return-path' migration capability is enabled, the return path is NOT set up but the core migration code still tries to use it and breaks. Enable the RDMA return path if either postcopy or the return-path capability is enabled. bz: https://bugzilla.redhat.com/show_bug.cgi?id=2063615 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-06io: Add support for MSG_PEEK for socket channelmanish.mishra1-0/+1
MSG_PEEK peeks at the channel. The data is treated as unread and the next read shall still return this data. This support is currently added only for the socket class. An extra parameter 'flags' is added to io_readv calls to pass extra read flags like MSG_PEEK. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: manish.mishra <manish.mishra@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-02-06migration/rdma: fix return value for qio_channel_rdma_{readv,writev}Fiona Ebner1-5/+10
upon errors. As the documentation in include/io/channel.h states, only -1 and QIO_CHANNEL_ERR_BLOCK should be returned upon error. Other values have the potential to confuse the call sites. error_setg is used rather than error_setg_errno, because there are certain code paths where -1 (as a non-errno) is propagated up (e.g. starting from qemu_rdma_block_for_wrid or qemu_rdma_post_recv_control) all the way to qio_channel_rdma_{readv,writev}. Similar to a216ec85b7 ("migration/channel-block: fix return value for qio_channel_block_{readv,writev}"). Suggested-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2022-06-23migration: remove the QEMUFileOps abstractionDaniel P. Berrangé1-3/+2
Now that all QEMUFile callbacks are removed, the entire concept can be deleted. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-06-22migration: stop passing 'opaque' parameter to QEMUFile hooksDaniel P. Berrangé1-9/+10
The only user of the hooks is RDMA which provides a QIOChannel backed impl of QEMUFile. It can thus use the qemu_file_get_ioc() method. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-06-22migration: remove unreachble RDMA code in save_hook implDaniel P. Berrangé1-99/+21
The QEMUFile 'save_hook' callback has a 'size_t size' parameter. The RDMA impl of this has logic that takes different actions depending on whether the value is zero or non-zero. It has commented out logic that would have taken further actions if the value was negative. The only place where the 'save_hook' callback is invoked is the ram_control_save_page() method, which passes 'size' through from its caller. The only caller of this method is in turn control_save_page(). This method unconditionally passes the 'TARGET_PAGE_SIZE' constant for the 'size' parameter. IOW, the only scenario for 'size' that can execute in the qemu_rdma_save_page method is 'size > 0'. The remaining code has been unreachable since RDMA support was first introduced 9 years ago. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-06-22migration: Remove RDMA_UNREGISTRATION_EXAMPLEJuan Quintela1-41/+0
Nobody has ever shown up to unregister individual pages, and another set of patches written by Daniel P. Berrangé <berrange@redhat.com> just removed the qemu_rdma_signal_unregister() function needed here. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-05-16QIOChannel: Add flags on io_writev and introduce io_flush callbackLeonardo Bras1-0/+1
Add flags to io_writev and introduce io_flush as an optional callback to QIOChannelClass, allowing the implementation of zero copy writes by subclasses. How to use them: - Write data using qio_channel_writev*(..., QIO_CHANNEL_WRITE_FLAG_ZERO_COPY), - Wait for write completion with qio_channel_flush(). Notes: As some zero copy write implementations work asynchronously, it's recommended to keep the write buffer untouched until qio_channel_flush() returns, to avoid the risk of sending an updated buffer instead of the buffer state at the time of the write. As the io_flush callback is optional, if a subclass does not implement it, then: - io_flush will return 0 without changing anything. Also, some functions like qio_channel_writev_full_all() were adapted to receive a flag parameter. That allows code to be shared between zero copy and non-zero copy writev, and also makes new flags easier to implement. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20220513062836.965425-3-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-03-02migration/rdma: set the REUSEADDR option for destinationJack Wang1-0/+7
We hit the following error while testing the RDMA transport: in case of a migration error, the mgmt daemon picks one migration port, incoming rdma:[::]:8089: RDMA ERROR: Error: could not rdma_bind_addr. It then tries another, -incoming rdma:[::]:8103; sometimes that works, sometimes it needs another try with a different port number. Set the REUSEADDR option for the destination. This allows the address to be reused, so that rdma_bind_addr no longer errors out. Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Message-Id: <20220208085640.19702-1-jinpu.wang@ionos.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Fixed up some tabs
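The idea behind the fix is to set the address-reuse option on the listening endpoint before binding. As a rough analogy only, the sketch below does this with an ordinary TCP socket and SO_REUSEADDR; it is not the RDMA code from the commit, which applies the option to the RDMA listening ID instead. The helper name listen_reuseaddr() and the choice of the loopback address are assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a TCP listening socket on the loopback
 * address with SO_REUSEADDR set before bind(), so that a restarted
 * listener is less likely to fail with "address already in use".
 * Pass port 0 to let the kernel pick a free port.
 * Returns the listening fd, or -1 on failure.
 */
static int listen_reuseaddr(uint16_t port)
{
    struct sockaddr_in addr;
    int reuse = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0) {
        return -1;
    }

    /* The option must be set before bind() to have any effect. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse)) < 0) {
        goto fail;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        goto fail;
    }
    if (listen(fd, 1) < 0) {
        goto fail;
    }
    return fd;

fail:
    close(fd);
    return -1;
}
```

The caller owns the returned descriptor and should close() it when done. Whether the RDMA connection manager's reuse option behaves identically to SO_REUSEADDR in every corner case is not something the commit message states, so treat the parallel as illustrative.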