aboutsummaryrefslogtreecommitdiff
path: root/block/replication.c
AgeCommit message (Collapse)AuthorFilesLines
2024-12-20include: Rename sysemu/ -> system/Philippe Mathieu-Daudé1-1/+1
Headers in include/sysemu/ are not only related to system *emulation*, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>
2024-05-28qapi: blockdev-backup: add discard-source parameterVladimir Sementsov-Ogievskiy1-2/+2
Add a parameter that enables discard-after-copy. That is mostly useful in "push backup with fleecing" scheme, when source is snapshot-access format driver node, based on copy-before-write filter snapshot-access API: [guest] [snapshot-access] ~~ blockdev-backup ~~> [backup target] | | | root | file v v [copy-before-write] | | | file | target v v [active disk] [temp.img] In this case discard-after-copy does two things: - discard data in temp.img to save disk space - avoid further copy-before-write operation in discarded area Note that we have to declare WRITE permission on source in copy-before-write filter, for discard to work. Still we can't take it unconditionally, as it will break normal backup from RO source. So, we have to add a parameter and pass it thorough bdrv_open flags. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Fiona Ebner <f.ebner@proxmox.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20240313152822.626493-5-vsementsov@yandex-team.ru> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2023-12-21block: remove AioContext lockingStefan Hajnoczi1-55/+3
This is the big patch that removes aio_context_acquire()/aio_context_release() from the block layer and affected block layer users. There isn't a clean way to split this patch and the reviewers are likely the same group of people, so I decided to do it in one patch. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Message-ID: <20231205182011.1976568-7-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-12-21graph-lock: remove AioContext lockingStefan Hajnoczi1-7/+7
Stop acquiring/releasing the AioContext lock in bdrv_graph_wrlock()/bdrv_graph_unlock() since the lock no longer has any effect. The distinction between bdrv_graph_wrunlock() and bdrv_graph_wrunlock_ctx() becomes meaningless and they can be collapsed into one function. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231205182011.1976568-6-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-21block: Fix deadlocks in bdrv_graph_wrunlock()Kevin Wolf1-5/+5
bdrv_graph_wrunlock() calls aio_poll(), which may run callbacks that have a nested event loop. Nested event loops can depend on other iothreads making progress, so in order to allow them to make progress it must not hold the AioContext lock of another thread while calling aio_poll(). This introduces a @bs parameter to bdrv_graph_wrunlock() whose AioContext is temporarily dropped (which matches bdrv_graph_wrlock()), and a bdrv_graph_wrunlock_ctx() that can be used if the BlockDriverState doesn't necessarily exist any more when unlocking. This also requires a change to bdrv_schedule_unref(), which was relying on the incorrectly taken lock. It needs to take the lock itself now. While this is a separate bug, it can't be fixed a separate patch because otherwise the intermediate state would either deadlock or try to release a lock that we don't even hold. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231115172012.112727-3-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [kwolf: Fixed up bdrv_schedule_unref()] Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-08block: Protect bs->file with graph_lockKevin Wolf1-1/+4
Almost all functions that access bs->file already take the graph lock now. Add locking to the remaining users and finally annotate the struct field itself as protected by the graph lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-25-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-08block: Protect bs->backing with graph_lockKevin Wolf1-1/+6
Almost all functions that access bs->backing already take the graph lock now. Add locking to the remaining users and finally annotate the struct field itself as protected by the graph lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-18-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12block: Protect bs->children with graph_lockKevin Wolf1-1/+2
Almost all functions that access the child links already take the graph lock now. Add locking to the remaining users and finally annotate the struct field itself as protected by the graph lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230929145157.45443-22-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12qcow2: Mark qcow2_signal_corruption() and callers GRAPH_RDLOCKKevin Wolf1-1/+7
This adds GRAPH_RDLOCK annotations to declare that callers of qcow2_signal_corruption() need to hold a reader lock for the graph because it calls bdrv_get_node_name(), which accesses the parents list of a node. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230929145157.45443-15-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCKKevin Wolf1-2/+8
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_first_blk() and bdrv_is_root_node() need to hold a reader lock for the graph. These functions are the only functions in block-backend.c that access the parent list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230929145157.45443-5-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12block: convert more bdrv_is_allocated* and bdrv_block_status* calls to ↵Paolo Bonzini1-4/+4
coroutine versions Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230904100306.156197-5-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-20block: Mark bdrv_unref_child() GRAPH_WRLOCKKevin Wolf1-0/+3
Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_unref_child(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-21-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-20block: Mark bdrv_attach_child() GRAPH_WRLOCKKevin Wolf1-0/+6
Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_attach_child_common(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-13-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11block: remove has_variable_length from filtersPaolo Bonzini1-1/+0
Filters automatically get has_variable_length from their underlying BlockDriverState. There is no need to mark them as variable-length in the BlockDriver. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230407153303.391121-3-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23block: Mark bdrv_co_refresh_total_sectors() and callers GRAPH_RDLOCKKevin Wolf1-1/+2
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_refresh_total_sectors() need to hold a reader lock for the graph. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-24-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23block: Mark public read/write functions GRAPH_RDLOCKKevin Wolf1-9/+6
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_pread*/pwrite*() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-12-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01block: Convert bdrv_refresh_total_sectors() to co_wrapper_mixedEmanuele Giuseppe Esposito1-3/+3
BlockDriver->bdrv_getlength is categorized as IO callback, and it currently doesn't run in a coroutine. We should let it take a graph rdlock since the callback traverses the block nodes graph, which however is only possible in a coroutine. Therefore turn it into a co_wrapper to move the actual function into a coroutine where the lock can be taken. Because now this function creates a new coroutine and polls, we need to take the AioContext lock where it is missing, for the only reason that internally co_wrapper calls AIO_WAIT_WHILE and it expects to release the AioContext lock. This is especially messy when a co_wrapper creates a coroutine and polls in bdrv_open_driver, because this function has so many callers in so many context that it can easily lead to deadlocks. Therefore the new rule for bdrv_open_driver is that the caller must always hold the AioContext lock of the given bs (except if it is a coroutine), because the function calls bdrv_refresh_total_sectors() which is now a co_wrapper. Once the rwlock is ultimated and placed in every place it needs to be, we will poll using AIO_WAIT_WHILE_UNLOCKED and remove the AioContext lock. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230113204212.359076-7-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15block: Drain individual nodes during reopenKevin Wolf1-6/+0
bdrv_reopen() and friends use subtree drains as a lazy way of covering all the nodes they touch. Turns out that this lazy way is a lot more complicated than just draining the nodes individually, even not accounting for the additional complexity in the drain mechanism itself. Simplify the code by switching to draining the individual nodes that are already managed in the BlockReopenQueue anyway. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221118174110.55183-8-kwolf@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-30Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into stagingStefan Hajnoczi1-5/+3
Block layer patches - Cleanup bs->backing and bs->file handling - Refactor bdrv_try_set_aio_context using transactions - Changes for improved coroutine_fn consistency - vhost-user-blk: fix the resize crash - io_uring: Use of io_uring_register_ring_fd() led to breakage, revert - vvfat: Fix some problems with r/w mode - Code cleanup - MAINTAINERS: Fold "Block QAPI, monitor, ..." into "Block layer core" # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmNazhIRHGt3b2xmQHJl # ZGhhdC5jb20ACgkQfwmycsiPL9ZyTw/8Dfck/SuxfyeLlnQItkjaV4cnqWOU8vHs # 9x0KhlptCs+HXdF/3iicpA0lHojn7mNnbdFGjPRY4E0LriQv91TQ5ycdEmrseFPf # sgeQlgdKCVU/pHjZ2wYarm2pE43Cx85a5xuufmw+7w49dNNZn14l4t+DgviuClVM # nuVaogfZFbYyetre+Qd2TgLl+gJ+0d4o7Zs5lSWLrT8t0L9AGkcWPA7Nrbl6loIE # dOautV4G7jLjuMiCeJZOGcnuRVe3gCQ5rCGBFzzH4DUtz4BmiYx4hd3LMEsP0PMM # CrsfDZS04Ztybl9M7TmJuwkAm1gx1JDMOuJuh18lbJocIOBvhkKKxY2wI5LIdZVI # ZntmU36RowkX+GGu/PYpYyMjBDClJppZCl7vnjyLYsVt6r0Vu6SmlHpJhcRYabhe # 96Kv1LXH9A6+ogKPU3Layw6JGjg01GNr1ALuT7PO3pGto/JshmOuBEJJDucoF84M # 5AfxFCohMROVldwblA6M0eKnlQBgtr5BvtgbV54BBo88VlFJgDJFQn7R09cTFUEo # UwaJoS+nIaiZ0bQQVZhZloVppUaTdVJojzfVRCZZctga96/tu1HSFnGLnbEFpUN3 # KOf+XnVNS6Ro+nPSDf9bMjbIom2JicGFfV+6yMgIoxY/d5UA2dTZfefil4TAlSod # 6PsTgg+jrm8= # =/Fw0 # -----END PGP SIGNATURE----- # gpg: Signature made Thu 27 Oct 2022 14:29:38 EDT # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (58 commits) block/block-backend: blk_set_enable_write_cache is IO_CODE monitor: switch to *_co_* functions vmdk: switch to *_co_* functions vhdx: switch to *_co_* functions vdi: switch to *_co_* functions qed: switch to *_co_* functions qcow2: switch to *_co_* functions qcow: switch to *_co_* functions parallels: switch to *_co_* functions mirror: switch to *_co_* functions block: switch to *_co_* functions commit: switch to *_co_* functions vmdk: manually add more coroutine_fn annotations qcow2: manually add more coroutine_fn annotations qcow: manually add more coroutine_fn annotations blkdebug: add missing coroutine_fn annotation for indirect-called functions qcow2: add coroutine_fn annotation for indirect-called functions block: add missing coroutine_fn annotation to BlockDriverState callbacks coroutine-io: add missing coroutine_fn annotation to prototypes coroutine-lock: add missing coroutine_fn annotation to prototypes ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-27block: introduce bdrv_open_file_child() helperVladimir Sementsov-Ogievskiy1-5/+3
Almost all drivers call bdrv_open_child() similarly. Let's create a helper for this. The only not updated drivers that call bdrv_open_child() to set bs->file are raw-format and snapshot-access: raw-format sometimes want to have filtered child but don't set drv->is_filter to true. snapshot-access wants only DATA | PRIMARY Possibly we should implement drv->is_filter_func() handler, to consider raw-format as filter when it works as filter.. But it's another story. Note also, that we decrease assignments to bs->file in code: it helps us restrict modifying this field in further commit. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220726201134.924743-3-vsementsov@yandex-team.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-26block: add BDRV_REQ_REGISTERED_BUF request flagStefan Hajnoczi1-1/+0
Block drivers may optimize I/O requests accessing buffers previously registered with bdrv_register_buf(). Checking whether all elements of a request's QEMUIOVector are within previously registered buffers is expensive, so we need a hint from the user to avoid costly checks. Add a BDRV_REQ_REGISTERED_BUF request flag to indicate that all QEMUIOVector elements in an I/O request are known to be within previously registered buffers. Always pass the flag through to driver read/write functions. There is little harm in passing the flag to a driver that does not use it. Passing the flag to drivers avoids changes across many block drivers. Filter drivers would need to explicitly support the flag and pass through to their children when the children support it. That's a lot of code changes and it's hard to remember to do that everywhere, leading to silent reduced performance when the flag is accidentally dropped. The only problematic scenario with the approach in this patch is when a driver passes the flag through to internal I/O requests that don't use the same I/O buffer. In that case the hint may be set when it should actually be clear. This is a rare case though so the risk is low. Some drivers have assert(!flags), which no longer works when BDRV_REQ_REGISTERED_BUF is passed in. These assertions aren't very useful anyway since the functions are called almost exclusively by bdrv_driver_preadv/pwritev() so if we get flags handling right there then the assertion is not needed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20221013185908.1297568-7-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-07job.c: enable job lock/unlock and remove Aiocontext locksEmanuele Giuseppe Esposito1-0/+2
Change the job_{lock/unlock} and macros to use job_mutex. Now that they are not nop anymore, remove the aiocontext to avoid deadlocks. Therefore: - when possible, remove completely the aiocontext lock/unlock pair - if it is used by some other function too, reduce the locking section as much as possible, leaving the job API outside. - change AIO_WAIT_WHILE in AIO_WAIT_WHILE_UNLOCKED, since we are not using the aiocontext lock anymore The only functions that still need the aiocontext lock are: - the JobDriver callbacks, already documented in job.h - job_cancel_sync() in replication.c is called with aio_context_lock taken, but now job is using AIO_WAIT_WHILE_UNLOCKED so we need to release the lock. Reduce the locking section to only cover the callback invocation and document the functions that take the AioContext lock, to avoid taking it twice. Also remove real_job_{lock/unlock}, as they are replaced by the public functions. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220926093214.506243-19-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-07jobs: protect job.aio_context with BQL and job_mutexEmanuele Giuseppe Esposito1-0/+1
In order to make it thread safe, implement a "fake rwlock", where we allow reads under BQL *or* job_mutex held, but writes only under BQL *and* job_mutex. The only write we have is in child_job_set_aio_ctx, which always happens under drain (so the job is paused). For this reason, introduce job_set_aio_context and make sure that the context is set under BQL, job_mutex and drain. Also make sure all other places where the aiocontext is read are protected. The reads in commit.c and mirror.c are actually safe, because always done under BQL. Note: at this stage, job_{lock/unlock} and job lock guard macros are *nop*. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220926093214.506243-14-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-07job: @force parameter for job_cancel_sync()Hanna Reitz1-2/+2
Callers should be able to specify whether they want job_cancel_sync() to force-cancel the job or not. In fact, almost all invocations do not care about consistency of the result and just want the job to terminate as soon as possible, so they should pass force=true. The replication block driver is the exception, specifically the active commit job it runs. As for job_cancel_sync_all(), all callers want it to force-cancel all jobs, because that is the point of it: To cancel all remaining jobs as quickly as possible (generally on process termination). So make it invoke job_cancel_sync() with force=true. This changes some iotest outputs, because quitting qemu while a mirror job is active will now lead to it being cancelled instead of completed, which is what we want. (Cancelling a READY mirror job with force=false may take an indefinite amount of time, which we do not want when quitting. If users want consistent results, they must have all jobs be done before they quit qemu.) Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462 Signed-off-by: Hanna Reitz <hreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006151940.214590-6-hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-07-20replication: Remove workaroundLukas Straub1-11/+1
Remove the workaround introduced in commit 6ecbc6c52672db5c13805735ca02784879ce8285 "replication: Avoid blk_make_empty() on read-only child". It is not needed anymore since s->hidden_disk is guaranteed to be writable when secondary_do_checkpoint() runs. Because replication_start(), _do_checkpoint() and _stop() are only called by COLO migration code and COLO-migration activates all disks via bdrv_invalidate_cache_all() before it calls these functions. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <d3acfad43879e9f376bffa7dd797ae74d0a7c81a.1626619393.git.lukasstraub2@web.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-20replication: Properly attach childrenLukas Straub1-3/+27
The replication driver needs access to the children block-nodes of it's child so it can issue bdrv_make_empty() and bdrv_co_pwritev() to manage the replication. However, it does this by directly copying the BdrvChilds, which is wrong. Fix this by properly attaching the block-nodes with bdrv_attach_child() and requesting the required permissions. This ultimatively fixes a potential crash in replication_co_writev(), because it may write to s->secondary_disk if it is in state BLOCK_REPLICATION_FAILOVER_FAILED, without requesting write permissions first. And now the workaround in secondary_do_checkpoint() can be removed. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <5d0539d729afb8072d0d7cde977c5066285591b4.1626619393.git.lukasstraub2@web.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-20replication: Reduce usage of s->hidden_disk and s->secondary_diskLukas Straub1-17/+28
In preparation for the next patch, initialize s->hidden_disk and s->secondary_disk later and replace access to them with local variables in the places where they aren't initialized yet. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <1eb9dc179267207d9c7eccaeb30761758e32e9ab.1626619393.git.lukasstraub2@web.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-20replication: Remove s->active_diskLukas Straub1-17/+17
s->active_disk is bs->file. Remove it and use local variables instead. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <2534f867ea9be5b666dfce19744b7d4e2b96c976.1626619393.git.lukasstraub2@web.de> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-09block: Acquire AioContexts during bdrv_reopen_multiple()Kevin Wolf1-0/+7
As the BlockReopenQueue can contain nodes in multiple AioContexts, only one of which may be locked when AIO_WAIT_WHILE() can be called, we can't let the caller lock the right contexts. Instead, individually lock the AioContext of a single node when iterating the queue. Reintroduce bdrv_reopen() as a wrapper for reopening a single node that drains the node and temporarily drops the AioContext lock for bdrv_reopen_multiple(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210708114709.206487-4-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-05-26replication: move include out of root directoryPaolo Bonzini1-1/+1
The replication.h file is included from migration/colo.c and tests/unit/test-replication.c, so it should be in include/. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-01-26qapi: backup: add max-chunk and max-workers to x-perf structVladimir Sementsov-Ogievskiy1-1/+1
Add new parameters to configure future backup features. The patch doesn't introduce aio backup requests (so we actually have only one worker) neither requests larger than one cluster. Still, formally we satisfy these maximums anyway, so add the parameters now, to facilitate further patch which will really change backup job behavior. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20210116214705.822267-11-vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-01-26qapi: backup: add perf.use-copy-range parameterVladimir Sementsov-Ogievskiy1-0/+2
Experiments show, that copy_range is not always making things faster. So, to make experimentation simpler, let's add a parameter. Some more perf parameters will be added soon, so here is a new struct. For now, add new backup qmp parameter with x- prefix for the following reasons: - We are going to add more performance parameters, some will be related to the whole block-copy process, some only to background copying in backup (ignored for copy-before-write operations). - On the other hand, we are going to use block-copy interface in other block jobs, which will need performance options as well.. And it should be the same structure or at least somehow related. So, there are too much unclean things about how the interface and now we need the new options mostly for testing. Let's keep them experimental for a while. In do_backup_common() new x-perf parameter handled in a way to make further options addition simpler. We add use-copy-range with default=true, and we'll change the default in further patch, after moving backup to use block-copy. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20210116214705.822267-2-vsementsov@virtuozzo.com> [mreitz: s/5\.2/6.0/] Signed-off-by: Max Reitz <mreitz@redhat.com>
2020-07-10error: Reduce unnecessary error propagationMarkus Armbruster1-2/+1
When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away, even when we need to keep error_propagate() for other error paths. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-38-armbru@redhat.com>
2020-07-10error: Eliminate error_propagate() manuallyMarkus Armbruster1-3/+1
When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. The previous two commits did that for sufficiently simple cases with Coccinelle. Do it for several more manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-37-armbru@redhat.com>
2020-07-10error: Avoid unnecessary error_propagate() after error_setg()Markus Armbruster1-6/+5
Replace error_setg(&err, ...); error_propagate(errp, err); by error_setg(errp, ...); Related pattern: if (...) { error_setg(&err, ...); goto out; } ... out: error_propagate(errp, err); return; When all paths to label out are that way, replace by if (...) { error_setg(errp, ...); return; } and delete the label along with the error_propagate(). When we have at most one other path that actually needs to propagate, and maybe one at the end that where propagation is unnecessary, e.g. foo(..., &err); if (err) { goto out; } ... bar(..., &err); out: error_propagate(errp, err); return; move the error_propagate() to where it's needed, like if (...) { foo(..., &err); error_propagate(errp, err); return; } ... bar(..., errp); return; and transform the error_setg() as above. In some places, the transformation results in obviously unnecessary error_propagate(). The next few commits will eliminate them. Bonus: the elimination of gotos will make later patches in this series easier to review. Candidates for conversion tracked down with this Coccinelle script: @@ identifier err, errp; expression list args; @@ - error_setg(&err, args); + error_setg(errp, args); ... when != err error_propagate(errp, err); Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-34-armbru@redhat.com>
2020-07-10qemu-option: Use returned bool to check for failureMarkus Armbruster1-2/+1
The previous commit enables conversion of foo(..., &err); if (err) { ... } to if (!foo(..., &err)) { ... } for QemuOpts functions that now return true / false on success / error. Coccinelle script: @@ identifier fun = { opts_do_parse, parse_option_bool, parse_option_number, parse_option_size, qemu_opt_parse, qemu_opt_rename, qemu_opt_set, qemu_opt_set_bool, qemu_opt_set_number, qemu_opts_absorb_qdict, qemu_opts_do_parse, qemu_opts_from_qdict_entry, qemu_opts_set, qemu_opts_validate }; expression list args, args2; typedef Error; Error *err; @@ - fun(args, &err, args2); - if (err) + if (!fun(args, &err, args2)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-15-armbru@redhat.com> [Conflict with commit 0b6786a9c1 "block/amend: refactor qcow2 amend options" resolved by rerunning Coccinelle on master's version]
2020-05-18block: Drop @child_class from bdrv_child_perm()Max Reitz1-1/+0
Implementations should decide the necessary permissions based on @role. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-35-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-05-18block: Make filter drivers use child_of_bdsMax Reitz1-1/+2
Note that some filters have secondary children, namely blkverify (the image to be verified) and blklogwrites (the log). This patch does not touch those children. Note that for blkverify, the filtered child should not be format-probed. While there is nothing enforcing this here, in practice, it will not be: blkverify implements .bdrv_file_open. The block layer ensures (and in fact, asserts) that BDRV_O_PROTOCOL is set for every BDS whose driver implements .bdrv_file_open. This flag will then be bequeathed to blkverify's children, and they will thus (by default) not be probed either. ("By default" refers to the fact that blkverify's other child (the non-filtered one) will have BDRV_O_PROTOCOL force-unset, because that is what happens for all non-filtered children of non-format drivers.) Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-27-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-05-18block: Pass BdrvChildRole to bdrv_child_perm()Max Reitz1-0/+1
For now, all callers pass 0 and no callee evaluates this value. Later patches will change both. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-7-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-05-18block: Add BdrvChildRole to BdrvChildMax Reitz1-1/+1
For now, it is always set to 0. Later patches in this series will ensure that all callers pass an appropriate combination of flags. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-6-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-05-18block: Rename BdrvChildRole to BdrvChildClassMax Reitz1-1/+1
This structure nearly only contains parent callbacks for child state changes. It cannot really reflect a child's role, because different roles may overlap (as we will see when real roles are introduced), and because parents can have custom callbacks even when the child fulfills a standard role. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200513110544.176672-4-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-05-18block: Use bdrv_make_empty() where possibleMax Reitz1-2/+1
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200429141126.85159-3-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-05-18replication: Avoid blk_make_empty() on read-only childKevin Wolf1-2/+11
This is just a bandaid to keep tests/test-replication working after bdrv_make_empty() starts to assert that we're not trying to call it on a read-only child. For the real solution in the future, replication should not steal the BdrvChild from its backing file (this is never correct to do!), but instead have its own child node references, with the appropriate permissions. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-05-18block/replication.c: Avoid cancelling the job twiceLukas Straub1-0/+2
If qemu in colo secondary mode is stopped, it crashes because s->backup_job is canceled twice: First with job_cancel_sync_all() in qemu_cleanup() and then in replication_stop(). Fix this by assigning NULL to s->backup_job when the job completes so replication_stop() and replication_do_checkpoint() won't touch the job. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <20200511090801.7ed5d8f3@luklap> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-29various: Remove suspicious '\' character outside of #define in C codePhilippe Mathieu-Daudé1-2/+2
Fixes the following coccinelle warnings: $ spatch --sp-file --verbose-parsing ... \ scripts/coccinelle/remove_local_err.cocci ... SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5213 SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5261 SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:166 SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:167 SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:169 SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:170 SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:171 SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:172 SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:173 SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5787 SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5789 SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5800 SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5801 SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5802 SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5804 SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5805 SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5806 SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:6329 SUSPICIOUS: a \ character appears outside of a #define at ./hw/sd/sdhci.c:1133 SUSPICIOUS: a \ character appears outside of a #define at ./hw/scsi/scsi-disk.c:3081 SUSPICIOUS: a \ character appears outside of a #define at ./hw/net/virtio-net.c:1529 SUSPICIOUS: a \ character appears outside of a #define at ./hw/riscv/sifive_u.c:468 SUSPICIOUS: a \ character appears outside of a #define at ./dump/dump.c:1895 SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2209 SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2215 SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2221 SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2222 SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:172 SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:173 Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20200412223619.11284-2-f4bug@amsat.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-04-07replication: assert we own context before job_cancel_syncStefan Reiter1-1/+4
job_cancel_sync requires the job's lock to be held, all other callers already do this (replication_stop, drive_backup_abort, blockdev_backup_abort, job_cancel_sync_all, cancel_common). In this case we're in a BlockDriver handler, so we already have a lock, just assert that it is the same as the one used for the commit_job. Signed-off-by: Stefan Reiter <s.reiter@proxmox.com> Message-Id: <20200407115651.69472-3-s.reiter@proxmox.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-03block/replication.c: Ignore requests after failoverLukas Straub1-1/+34
After failover the Secondary side of replication shouldn't change state, because it now functions as our primary disk. In replication_start, replication_do_checkpoint, replication_stop, ignore the request if current state is BLOCK_REPLICATION_DONE (sucessful failover) or BLOCK_REPLICATION_FAILOVER (failover in progres i.e. currently merging active and hidden images into the base image). Signed-off-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Acked-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-02-18block: Remove bdrv_recurse_is_first_non_filter()Max Reitz1-7/+0
It no longer has any users. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200218103454.296704-11-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-10-10block/backup: use backup-top instead of write notifiersVladimir Sementsov-Ogievskiy1-1/+1
Drop write notifiers and use filter node instead. = Changes = 1. Add filter-node-name argument for backup qmp api. We have to do it in this commit, as 257 needs to be fixed. 2. There are no more write notifiers here, so is_write_notifier parameter is dropped from block-copy paths. 3. To sync with in-flight requests at job finish we now have drained removing of the filter, we don't need rw-lock. 4. Block-copy is now using BdrvChildren instead of BlockBackends 5. As backup-top owns these children, we also move block-copy state into backup-top's ownership. = Iotest changes = 56: op-blocker doesn't shoot now, as we set it on source, but then check on filter, when trying to start second backup. To keep the test we instead can catch another collision: both jobs will get 'drive0' job-id, as job-id parameter is unspecified. To prevent interleaving with file-posix locks (as they are dependent on config) let's use another target for second backup. Also, it's obvious now that we'd like to drop this op-blocker at all and add a test-case for two backups from one node (to different destinations) actually works. But not in these series. 141: Output changed: prepatch, "Node is in use" comes from bdrv_has_blk check inside qmp_blockdev_del. But we've dropped block-copy blk objects, so no more blk objects on source bs (job blk is on backup-top filter bs). New message is from op-blocker, which is the next check in qmp_blockdev_add. 257: The test wants to emulate guest write during backup. They should go to filter node, not to original source node, of course. Therefore we need to specify filter node name and use it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191001131409.14202-6-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-08-16block/backup: Add mirror sync mode 'bitmap'John Snow1-1/+1
We don't need or want a new sync mode for simple differences in semantics. Create a new mode simply named "BITMAP" that is designed to make use of the new Bitmap Sync Mode field. Because the only bitmap sync mode is 'on-success', this adds no new functionality to the backup job (yet). The old incremental backup mode is maintained as a syntactic sugar for sync=bitmap, mode=on-success. Add all of the plumbing necessary to support this new instruction. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190709232550.10724-6-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>