path: root/block/io.c
2023-06-05  block: Collapse padded I/O vecs exceeding IOV_MAX  (Hanna Czenczek, 1 file, -15/+151)
When processing vectored guest requests that are not aligned to the storage request alignment, we pad them by adding head and/or tail buffers for a read-modify-write cycle.

The guest can submit I/O vectors up to IOV_MAX (1024) in length, but with this padding, the vector can exceed that limit. As of 4c002cef0e9abe7135d7916c51abce47f7fc1ee2 ("util/iov: make qemu_iovec_init_extended() honest"), we refuse to pad vectors beyond the limit, instead returning an error to the guest. To the guest, this appears as a random I/O error.

We should not return an I/O error to the guest when it issued a perfectly valid request.

Before 4c002cef0e9abe7135d7916c51abce47f7fc1ee2, we just made the vector longer than IOV_MAX, which generally seems to work (because the guest assumes a smaller alignment than we really have, file-posix's raw_co_prw() will generally see bdrv_qiov_is_aligned() return false, and so emulate the request, so that the IOV_MAX does not matter). However, that does not seem exactly great.

I see two ways to fix this problem:
1. We split such long requests into two requests.
2. We join some elements of the vector into new buffers to make it shorter.

I am wary of (1), because it seems like it may have unintended side effects. (2) on the other hand seems relatively simple to implement, with hopefully few side effects, so this patch does that.

To do this, the use of qemu_iovec_init_extended() in bdrv_pad_request() is effectively replaced by the new function bdrv_create_padded_qiov(), which not only wraps the request IOV with padding head/tail, but also ensures that the resulting vector will not have more than IOV_MAX elements. Putting that functionality into qemu_iovec_init_extended() is infeasible because it requires allocating a bounce buffer; doing so would require many more parameters (buffer alignment, how to initialize the buffer, and out parameters like the buffer, its length, and the original elements), which is not reasonable.

Conversely, it is not difficult to move qemu_iovec_init_extended()'s functionality into bdrv_create_padded_qiov() by using public qemu_iovec_* functions, so that is what this patch does.

Because bdrv_pad_request() was the only "serious" user of qemu_iovec_init_extended(), the next patch will remove the latter function, so the functionality is not implemented twice.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2141964
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230411173418.19549-3-hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
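To illustrate the collapsing strategy, a minimal standalone sketch follows. It is not bdrv_create_padded_qiov() itself: the function name, the merge-the-tail policy and the write-only copy direction are illustrative assumptions, but it shows how a run of vector elements can be merged into one bounce buffer so that head/tail padding still fits under IOV_MAX.

    #include <limits.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/uio.h>

    #ifndef IOV_MAX
    #define IOV_MAX 1024
    #endif

    /*
     * Illustrative only: merge trailing iovec elements into one bounce buffer
     * so that the element count plus `extra` padding entries stays within
     * IOV_MAX. For a write, the bounce buffer is filled from the original
     * elements; a read would scatter the data back after completion.
     */
    static int collapse_iov(struct iovec *iov, int *niov, int extra,
                            struct iovec *bounce)
    {
        int limit = IOV_MAX - extra;
        size_t len = 0, off = 0;
        char *buf;

        if (*niov <= limit) {
            return 0;                       /* already short enough */
        }

        for (int i = limit - 1; i < *niov; i++) {
            len += iov[i].iov_len;
        }
        buf = malloc(len);
        if (!buf) {
            return -1;
        }
        for (int i = limit - 1; i < *niov; i++) {
            memcpy(buf + off, iov[i].iov_base, iov[i].iov_len);   /* write case */
            off += iov[i].iov_len;
        }

        bounce->iov_base = buf;
        bounce->iov_len = len;
        iov[limit - 1] = *bounce;           /* one entry replaces the merged run */
        *niov = limit;                      /* head/tail padding can now be added */
        return 0;
    }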
2023-06-01  block: remove bdrv_co_io_plug() API  (Stefan Hajnoczi, 1 file, -37/+0)
No block driver implements .bdrv_co_io_plug() anymore. Get rid of the function pointers. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-7-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-30  aio: remove aio_disable_external() API  (Stefan Hajnoczi, 1 file, -2/+0)
All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handlers that were registered is_external=true is therefore dead code. Remove aio_disable_external(), aio_enable_external(), and the is_external arguments to aio_set_fd_handler() and aio_set_event_notifier().

The entire test-fdmon-epoll test is removed because its sole purpose was testing aio_disable_external().

Parts of this patch were generated using the following coccinelle (https://coccinelle.lip6.fr/) semantic patch:

  @@
  expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque;
  @@
  - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque)
  + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque)

  @@
  expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready;
  @@
  - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready)
  + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready)

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-21-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30  block: drain from main loop thread in bdrv_co_yield_to_drain()  (Stefan Hajnoczi, 1 file, -4/+10)
For simplicity, always run BlockDevOps .drained_begin/end/poll() callbacks in the main loop thread. This makes it easier to implement the callbacks and avoids extra locks. Move the function pointer declarations from the I/O Code section to the Global State section for BlockDevOps, BdrvChildClass, and BlockDriver. Narrow IO_OR_GS_CODE() to GLOBAL_STATE_CODE() where appropriate. The test-bdrv-drain test case calls bdrv_drain() from an IOThread. This is now only allowed from coroutine context, so update the test case to run in a coroutine. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-11-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-15  block: introduce zone append write for zoned devices  (Sam Li, 1 file, -0/+27)
A zone append command is a write operation that specifies the first logical block of a zone as the write position. When writing to a zoned block device using zone append, the byte offset of the call may point at any position within the zone to which the data is being appended. Upon completion the device will respond with the position where the data has been written in the zone. Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20230508051510.177850-3-faithilikerun@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
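As a rough illustration of the zone-append contract described above (hypothetical structures and names, not the QEMU or kernel API): the caller's offset only has to identify the zone, and the completion reports the position the device actually chose.

    #include <stdint.h>

    /* Hypothetical model of one zone; illustration only. */
    struct zone {
        uint64_t start;       /* first logical byte of the zone */
        uint64_t capacity;    /* usable bytes in the zone */
        uint64_t wp;          /* write pointer, relative to start */
    };

    /*
     * Append len bytes to the zone containing offset. On success,
     * *written_pos receives the absolute position where the data landed,
     * which is what a zone append completion reports to the caller.
     */
    static int zone_append(struct zone *z, uint64_t offset, uint64_t len,
                           uint64_t *written_pos)
    {
        if (offset < z->start || offset >= z->start + z->capacity) {
            return -1;                    /* offset must fall inside the zone */
        }
        if (z->wp + len > z->capacity) {
            return -1;                    /* zone full */
        }
        *written_pos = z->start + z->wp;  /* device-chosen write position */
        z->wp += len;                     /* data always goes to the write pointer */
        return 0;
    }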
2023-05-15  block/block-backend: add block layer APIs resembling Linux ZonedBlockDevice ioctls  (Sam Li, 1 file, -0/+41)
Add zoned device option to host_device BlockDriver. It will be presented only for zoned host block devices. By adding zone management operations to the host_block_device BlockDriver, users can use the new block layer APIs including Report Zone and four zone management operations (open, close, finish, reset, reset_all).

Qemu-io uses the new APIs to perform zoned storage commands of the device: zone_report(zrp), zone_open(zo), zone_close(zc), zone_reset(zrs), zone_finish(zf).

For example, to test zone_report, use the following command:

  $ ./build/qemu-io --image-opts -n driver=host_device,filename=/dev/nullb0 -c "zrp offset nr_zones"

Signed-off-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20230508045533.175575-4-faithilikerun@gmail.com
Message-id: 20230324090605.28361-4-faithilikerun@gmail.com
[Adjust commit message prefix as suggested by Philippe Mathieu-Daudé <philmd@linaro.org> and remove spurious ret = -errno in raw_co_zone_mgmt(). --Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-10  block: Mark bdrv_refresh_limits() and callers GRAPH_RDLOCK  (Kevin Wolf, 1 file, -1/+0)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_refresh_limits() need to hold a reader lock for the graph because it accesses the children list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-21-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-10  block: Mark bdrv_co_get_info() and callers GRAPH_RDLOCK  (Emanuele Giuseppe Esposito, 1 file, -6/+5)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_get_info() need to hold a reader lock for the graph. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-15-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-25  block: convert bdrv_drain_all_begin() to AIO_WAIT_WHILE_UNLOCKED()  (Stefan Hajnoczi, 1 file, -1/+1)
Since the AioContext argument was already NULL, AIO_WAIT_WHILE() was never going to unlock the AioContext. Therefore it is possible to replace AIO_WAIT_WHILE() with AIO_WAIT_WHILE_UNLOCKED(). Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230309190855.414275-5-stefanha@redhat.com> Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11  block: remove has_variable_length from BlockDriver  (Paolo Bonzini, 1 file, -2/+0)
Fill in the field in BlockLimits directly for host devices, and copy it from there for the raw format. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230407153303.391121-5-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-04-11  block: move has_variable_length to BlockLimits  (Paolo Bonzini, 1 file, -0/+6)
At the protocol level, has_variable_length only needs to be true in the very special case of host CD-ROM drives, so that they do not need an explicit monitor command to read the new size when a disc is loaded in the tray.

However, at the format level has_variable_length has to be true for all raw blockdevs and for all filters, even though in practice the length depends on the underlying file and thus will not change except in the case of host CD-ROM drives.

As a first step towards computing an accurate value of has_variable_length, add the value into the BlockLimits structure and initialize the field from the BlockDriver.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230407153303.391121-2-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
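A compressed sketch of the initialization step this commit describes (struct and field names follow the commit message, but the surrounding code is simplified and not the actual QEMU implementation):

    #include <stdbool.h>

    /* Simplified stand-ins for the real QEMU structures. */
    struct BlockDriverStub {
        bool has_variable_length;   /* static property of the driver */
    };

    struct BlockLimitsStub {
        bool has_variable_length;   /* per-device value kept with the limits */
    };

    static void refresh_limits_stub(struct BlockLimitsStub *bl,
                                    const struct BlockDriverStub *drv)
    {
        /*
         * First step: seed the per-device value from the driver default.
         * A host CD-ROM driver can later refine it in its own limits
         * callback, while raw blockdevs and filters inherit it.
         */
        bl->has_variable_length = drv->has_variable_length;
    }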
2023-02-23  block: Mark bdrv_(un)register_buf() GRAPH_RDLOCK  (Kevin Wolf, 1 file, -4/+10)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_register_buf() and bdrv_unregister_buf() need to hold a reader lock for the graph. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-21-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark bdrv_co_io_(un)plug() and callers GRAPH_RDLOCK  (Kevin Wolf, 1 file, -0/+2)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_io_plug() and bdrv_co_io_unplug() need to hold a reader lock for the graph. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-18-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark preadv_snapshot/snapshot_block_status GRAPH_RDLOCK  (Kevin Wolf, 1 file, -0/+2)
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-16-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark bdrv_co_copy_range() GRAPH_RDLOCK  (Emanuele Giuseppe Esposito, 1 file, -2/+5)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_copy_range() need to hold a reader lock for the graph. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-15-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark bdrv_co_do_pwrite_zeroes() GRAPH_RDLOCK  (Kevin Wolf, 1 file, -4/+3)
All callers are already GRAPH_RDLOCK, so just add the annotation and remove assume_graph_lock(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-14-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark bdrv_co_pwrite_sync() and callers GRAPH_RDLOCK  (Kevin Wolf, 1 file, -2/+1)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_pwrite_sync() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-13-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark public read/write functions GRAPH_RDLOCK  (Kevin Wolf, 1 file, -4/+3)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_pread*/pwrite*() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-12-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark read/write in block/io.c GRAPH_RDLOCK  (Kevin Wolf, 1 file, -35/+31)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_driver_*() need to hold a reader lock for the graph. It doesn't add the annotation to public functions yet. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-11-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark bdrv_co_pwrite_zeroes() and callers GRAPH_RDLOCK  (Kevin Wolf, 1 file, -3/+9)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_pwrite_zeroes() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-10-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark bdrv_co_pdiscard() and callers GRAPH_RDLOCK  (Emanuele Giuseppe Esposito, 1 file, -0/+2)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_pdiscard() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-9-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark bdrv_co_flush() and callers GRAPH_RDLOCK  (Emanuele Giuseppe Esposito, 1 file, -0/+7)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_flush() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-8-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark bdrv_co_ioctl() and callers GRAPH_RDLOCK  (Kevin Wolf, 1 file, -0/+1)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_ioctl() need to hold a reader lock for the graph. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-6-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark bdrv_co_block_status() and callers GRAPH_RDLOCK  (Kevin Wolf, 1 file, -5/+8)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_block_status() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-5-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23  block: Mark bdrv_co_truncate() and callers GRAPH_RDLOCK  (Kevin Wolf, 1 file, -0/+1)
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_truncate() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-4-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-09  block: fix detect-zeroes= with BDRV_REQ_REGISTERED_BUF  (Stefan Hajnoczi, 1 file, -0/+3)
When a write request is converted into a write zeroes request by the detect-zeroes= feature, it is no longer associated with an I/O buffer. The BDRV_REQ_REGISTERED_BUF flag doesn't make sense without an I/O buffer and must be cleared because bdrv_co_do_pwrite_zeroes() fails with -EINVAL when it's set.

Fiona Ebner <f.ebner@proxmox.com> bisected and diagnosed this QEMU 7.2 regression where writes containing zeroes to a blockdev with discard=unmap,detect-zeroes=unmap fail.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1404
Fixes: e8b6535533be ("block: add BDRV_REQ_REGISTERED_BUF request flag")
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230207203719.242926-2-stefanha@redhat.com>
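The fix boils down to dropping BDRV_REQ_REGISTERED_BUF at the point where detect-zeroes= rewrites the write into a write-zeroes request. A reduced sketch of that decision (flag names match the commit message; the numeric values and helper are illustrative, not the block.h definitions):

    /* Illustrative flag values; the real definitions live in QEMU's headers. */
    enum {
        BDRV_REQ_MAY_UNMAP      = 1 << 2,
        BDRV_REQ_REGISTERED_BUF = 1 << 3,
    };

    static int convert_to_write_zeroes_flags(int flags, int unmap_requested)
    {
        /*
         * The converted request carries no I/O buffer any more, so the
         * registered-buffer hint must be dropped; otherwise the
         * write-zeroes path rejects the request with -EINVAL.
         */
        flags &= ~BDRV_REQ_REGISTERED_BUF;
        if (unmap_requested) {
            flags |= BDRV_REQ_MAY_UNMAP;   /* detect-zeroes=unmap behaviour */
        }
        return flags;
    }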
2023-02-01  block: Rename bdrv_load/save_vmstate() to bdrv_co_load/save_vmstate()  (Emanuele Giuseppe Esposito, 1 file, -4/+4)
Since these functions always run in coroutine context, adjust their name to include "_co_", just like all other BlockDriver callbacks. No functional change intended. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230113204212.359076-15-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01  block: Convert bdrv_debug_event() to co_wrapper_mixed  (Emanuele Giuseppe Esposito, 1 file, -11/+11)
bdrv_debug_event() is categorized as an I/O function, and it currently doesn't run in a coroutine. We should let it take a graph rdlock since it traverses the block nodes graph, which however is only possible in a coroutine. Therefore turn it into a co_wrapper_mixed to move the actual function into a coroutine where the lock can be taken. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230113204212.359076-14-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01  block: Convert bdrv_get_info() to co_wrapper_mixed  (Emanuele Giuseppe Esposito, 1 file, -4/+4)
bdrv_get_info() is categorized as an I/O function, and it currently doesn't run in a coroutine. We should let it take a graph rdlock since it traverses the block nodes graph, which however is only possible in a coroutine. Therefore turn it into a co_wrapper to move the actual function into a coroutine where the lock can be taken. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230113204212.359076-11-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01  block: use bdrv_co_refresh_total_sectors when possible  (Emanuele Giuseppe Esposito, 1 file, -2/+2)
In some places we are sure we are always running in a coroutine, therefore it's useless to call the generated_co_wrapper, instead call directly the _co_ function. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230113204212.359076-9-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01  block: Rename refresh_total_sectors to bdrv_refresh_total_sectors  (Emanuele Giuseppe Esposito, 1 file, -3/+5)
The name is not good, not the least because we are going to convert this to a generated co_wrapper, which adds a _co infix after the first part of the name. No functional change intended. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230113204212.359076-6-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01  block: Convert bdrv_is_inserted() to co_wrapper  (Emanuele Giuseppe Esposito, 1 file, -6/+6)
bdrv_is_inserted() is categorized as an I/O function, and it currently doesn't run in a coroutine. We should let it take a graph rdlock since it traverses the block nodes graph, which however is only possible in a coroutine. Therefore turn it into a co_wrapper to move the actual function into a coroutine where the lock can be taken.

At the same time, also add blk_is_inserted() as co_wrapper_mixed, since it is called in both coroutine and non-coroutine contexts.

Because this function now creates a new coroutine and polls, we need to take the AioContext lock where it is missing, for the only reason that internally c_w_mixed_bdrv_rdlock calls AIO_WAIT_WHILE and it expects to release the AioContext lock.

Once the rwlock is finalized and placed everywhere it needs to be, we will poll using AIO_WAIT_WHILE_UNLOCKED and remove the AioContext lock.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230113204212.359076-5-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01  block: Convert bdrv_io_unplug() to co_wrapper  (Emanuele Giuseppe Esposito, 1 file, -4/+4)
BlockDriver->bdrv_io_unplug is categorized as IO callback, and it currently doesn't run in a coroutine. We should let it take a graph rdlock since the callback traverses the block nodes graph, which however is only possible in a coroutine. The only caller of this function is blk_io_unplug(), therefore make blk_io_unplug() a co_wrapper, so that we're always running in a coroutine where the lock can be taken. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230113204212.359076-4-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01  block: Convert bdrv_io_plug() to co_wrapper  (Emanuele Giuseppe Esposito, 1 file, -4/+4)
BlockDriver->bdrv_io_plug is categorized as IO callback, and it currently doesn't run in a coroutine. We should let it take a graph rdlock since the callback traverses the block nodes graph, which however is only possible in a coroutine. The only caller of this function is blk_io_plug(), therefore make blk_io_plug() a co_wrapper, so that we're always running in a coroutine where the lock can be taken. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230113204212.359076-3-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-01-20  include/block: Untangle inclusion loops  (Markus Armbruster, 1 file, -0/+1)
We have two inclusion loops:

  block/block.h -> block/block-global-state.h -> block/block-common.h -> block/blockjob.h -> block/block.h
  block/block.h -> block/block-io.h -> block/block-common.h -> block/blockjob.h -> block/block.h

I believe these go back to Emanuele's reorganization of the block API, merged a few months ago in commit d7e2fe4aac8.

Fortunately, breaking them is merely a matter of deleting unnecessary includes from headers, and adding them back in places where they are now missing.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221221133551.3967339-2-armbru@redhat.com>
2022-12-21  block/io: Check for replay-enabled in bdrv_drain_all_begin()  (Peter Maydell, 1 file, -0/+9)
In commit da0bd74434 we refactored bdrv_drain_all_begin() to pull out the non-polling part into bdrv_drain_all_begin_nopoll(). This change broke record-and-replay, because the "return early if replay enabled" check is now in the sub-function bdrv_drain_all_begin_nopoll(), and so it only causes us to return from that function, and not from the calling bdrv_drain_all_begin(). Fix the regression by checking whether replay is enabled in both functions.

The breakage and fix can be tested via 'make check-avocado': the

  tests/avocado/reverse_debugging.py:ReverseDebugging_X86_64.test_x86_64_pc
  tests/avocado/reverse_debugging.py:ReverseDebugging_AArch64.test_aarch64_virt

tests were both broken by this.

Fixes: da0bd744344adb1f285 ("block: Factor out bdrv_drain_all_begin_nopoll()")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20221220174638.2156308-1-peter.maydell@linaro.org
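A reduced sketch of the shape of the fix (replay_events_enabled() stands in for QEMU's record/replay query, stubbed out here so the snippet is self-contained; the drain bodies are elided): the early-return guard has to live in the outer, polling function as well, not only in the _nopoll helper it calls.

    #include <stdbool.h>

    static bool replay_events_enabled(void)
    {
        return false;   /* stub for the real record/replay query */
    }

    static void drain_all_begin_nopoll(void)
    {
        if (replay_events_enabled()) {
            return;     /* guard in the helper ... */
        }
        /* ... quiesce all nodes without polling ... */
    }

    static void drain_all_begin(void)
    {
        if (replay_events_enabled()) {
            return;     /* ... and in the caller, otherwise we would still
                         * fall through to the polling part below */
        }
        drain_all_begin_nopoll();
        /* ... poll until in-flight requests have completed ... */
    }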
2022-12-15  block: GRAPH_RDLOCK for functions only called by co_wrappers  (Kevin Wolf, 1 file, -0/+2)
The generated coroutine wrappers already take care to take the lock in the non-coroutine path, and assume that the lock is already taken in the coroutine path. The only thing we need to do for the wrapped function is adding the GRAPH_RDLOCK annotation. Doing so also allows us to mark the corresponding callbacks in BlockDriver as GRAPH_RDLOCK_PTR. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221207131838.239125-19-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15  block: Factor out bdrv_drain_all_begin_nopoll()  (Kevin Wolf, 1 file, -6/+13)
Provide a separate function that just quiesces the users of a node to prevent new requests from coming in, but without waiting for the already in-flight I/O to complete. This function can be used in contexts where polling is not allowed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221207131838.239125-2-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
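A schematic view of that split (illustrative types and helper names, not the real block-layer code): the _nopoll variant only flips nodes into the quiesced state, while the full variant additionally waits for requests that were already in flight.

    struct node_state {
        int quiesce_counter;   /* > 0: new requests are held back */
        int in_flight;         /* requests already submitted */
    };

    /* Safe in contexts where polling is forbidden: only stops new I/O. */
    static void drain_all_begin_nopoll_sketch(struct node_state *nodes, int n)
    {
        for (int i = 0; i < n; i++) {
            nodes[i].quiesce_counter++;
        }
    }

    /* The full variant layers polling on top of the quiesce step. */
    static void drain_all_begin_sketch(struct node_state *nodes, int n,
                                       void (*poll_once)(void))
    {
        drain_all_begin_nopoll_sketch(nodes, n);
        for (int i = 0; i < n; i++) {
            while (nodes[i].in_flight > 0) {
                poll_once();   /* run the event loop until requests settle */
            }
        }
    }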
2022-12-15  block-io: introduce coroutine_fn duplicates for bdrv_common_block_status_above callers  (Emanuele Giuseppe Esposito, 1 file, -3/+55)
bdrv_common_block_status_above() is a g_c_w, and it is being called by many "wrapper" functions like bdrv_is_allocated(), bdrv_is_allocated_above() and bdrv_block_status_above().

Because we want to eventually split the coroutine from the non-coroutine case in g_c_w, create duplicate wrappers that take care of directly calling the same coroutine functions called in the g_c_w.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20221128142337.657646-2-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
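The pattern can be sketched with invented minimal signatures (none of this is the actual QEMU code): a coroutine-only duplicate calls the shared coroutine implementation directly, whereas the generated wrapper would otherwise create a coroutine and poll for it when invoked outside coroutine context.

    /* Stub standing in for the shared coroutine implementation. */
    static int co_common_block_status(int want_zero, long offset, long bytes,
                                      long *pnum)
    {
        (void)want_zero;
        (void)offset;
        *pnum = bytes;      /* stub: report the whole range as allocated */
        return 1;
    }

    /*
     * Coroutine-only duplicate of a public wrapper such as bdrv_is_allocated():
     * callers that already run in a coroutine skip the generated wrapper and
     * its coroutine-creation/polling overhead.
     */
    static int co_is_allocated(long offset, long bytes, long *pnum)
    {
        return co_common_block_status(0 /* want_zero */, offset, bytes, pnum);
    }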
2022-12-15  block: Remove poll parameter from bdrv_parent_drained_begin_single()  (Kevin Wolf, 1 file, -6/+2)
All callers of bdrv_parent_drained_begin_single() pass poll=false now, so we don't need the parameter any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221118174110.55183-16-kwolf@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15  block: Don't poll in bdrv_replace_child_noperm()  (Kevin Wolf, 1 file, -1/+1)
In order to make sure that bdrv_replace_child_noperm() doesn't have to poll any more, get rid of the bdrv_parent_drained_begin_single() call. This is possible now because we can require that the parent is already drained through the child in question when the function is called and we don't call the parent drain callbacks more than once. The additional drain calls needed in callers cause the test case to run its code in the drain handler too early (bdrv_attach_child() drains now), so modify it to only enable the code after the test setup has completed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221118174110.55183-15-kwolf@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15  block: Drop out of coroutine in bdrv_do_drained_begin_quiesce()  (Kevin Wolf, 1 file, -13/+12)
The next patch adds a parent drain to bdrv_attach_child_common(), which shouldn't be, but is currently called from coroutines in some cases (e.g. .bdrv_co_create implementations generally open new nodes). Therefore, the assertion that we're not in a coroutine doesn't hold true any more.

We could just remove the assertion because there is nothing in the function that should be in conflict with running in a coroutine, but just to be on the safe side, we can reverse the caller relationship between bdrv_do_drained_begin() and bdrv_do_drained_begin_quiesce() so that the latter also just drops out of coroutine context and we can still be certain in the future that any drain code doesn't run in coroutines.

As a nice side effect, the structure of bdrv_do_drained_begin() is now symmetrical with bdrv_do_drained_end().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221118174110.55183-14-kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15  block: Remove ignore_bds_parents parameter from drain_begin/end  (Kevin Wolf, 1 file, -36/+22)
ignore_bds_parents is now ignored during drain_begin and drain_end, so we can just remove it there. It is still a valid optimisation for drain_all in bdrv_drained_poll(), so leave it around there. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221118174110.55183-13-kwolf@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15  block: Call drain callbacks only once  (Kevin Wolf, 1 file, -12/+18)
We only need to call both the BlockDriver's callback and the parent callbacks when going from undrained to drained or vice versa. A second drain section doesn't make a difference for the driver or the parent, they weren't supposed to send new requests before and after the second drain.

One thing that gets in the way is the 'ignore_bds_parents' parameter in bdrv_do_drained_begin_quiesce() and bdrv_do_drained_end(): It means that bdrv_drain_all_begin() increases bs->quiesce_counter, but does not quiesce the parent through BdrvChildClass callbacks. If an additional drain section is started now, bs->quiesce_counter will be non-zero, but we would still need to quiesce the parent through BdrvChildClass in order to keep things consistent (and unquiesce it on the matching bdrv_drained_end(), even though the counter would not reach 0 yet as long as the bdrv_drain_all() section is still active).

Instead of keeping track of this, let's just get rid of the parameter. It was introduced in commit 6cd5c9d7b2d as an optimisation so that during bdrv_drain_all(), we wouldn't recursively drain all parents up to the root for each node, resulting in quadratic complexity. As it happens, calling the callbacks only once solves the same problem, so as of this patch, we'll still have O(n) complexity and ignore_bds_parents is not needed any more.

This patch only ignores the 'ignore_bds_parents' parameter. It will be removed in a separate patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221118174110.55183-12-kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
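The "only once" rule amounts to acting on the 0 -> 1 and 1 -> 0 transitions of the quiesce counter. A stripped-down sketch (the counter name follows the commit message, everything else is illustrative):

    struct node {
        int quiesce_counter;
        void (*begin_cb)(struct node *);   /* driver/parent drained_begin */
        void (*end_cb)(struct node *);     /* driver/parent drained_end */
    };

    static void node_drained_begin(struct node *bs)
    {
        /* Nested drain sections only bump the counter; callbacks fire once. */
        if (bs->quiesce_counter++ == 0) {
            bs->begin_cb(bs);
        }
    }

    static void node_drained_end(struct node *bs)
    {
        /* Notify the driver and parents only when the last section ends. */
        if (--bs->quiesce_counter == 0) {
            bs->end_cb(bs);
        }
    }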
2022-12-15  block: Remove subtree drains  (Kevin Wolf, 1 file, -95/+26)
Subtree drains are not used any more. Remove them. After this, BdrvChildClass.attach/detach() don't poll any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20221118174110.55183-11-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15  block: Inline bdrv_drain_invoke()  (Kevin Wolf, 1 file, -17/+6)
bdrv_drain_invoke() has now two entirely separate cases that share no code any more and are selected depending on a bool parameter. Each case has only one caller. Just inline the function. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20221118174110.55183-6-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15  block: Remove drained_end_counter  (Kevin Wolf, 1 file, -74/+24)
drained_end_counter is unused now, nobody changes its value any more. It can be removed. In cases where we had two almost identical functions that only differed in whether the caller passes drained_end_counter, or whether they would poll for a local drained_end_counter to reach 0, these become a single function. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20221118174110.55183-5-kwolf@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15  block: Revert .bdrv_drained_begin/end to non-coroutine_fn  (Kevin Wolf, 1 file, -42/+7)
Polling during bdrv_drained_end() can be problematic (and in the future, we may get cases for bdrv_drained_begin() where polling is forbidden, and we don't care about already in-flight requests, but just want to prevent new requests from arriving).

The .bdrv_drained_begin/end callbacks running in a coroutine is the only reason why we have to do this polling, so make them non-coroutine callbacks again. None of the callers actually yield any more.

This means that bdrv_drained_end() effectively doesn't poll any more, even if AIO_WAIT_WHILE() loops are still there (their condition is false from the beginning). This is generally not a problem, but in test-bdrv-drain, some additional explicit aio_poll() calls need to be added because the test case wants to verify the final state after BHs have executed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221118174110.55183-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-11-10  block: Start/end drain on correct AioContext  (Hanna Reitz, 1 file, -2/+4)
bdrv_parent_drained_{begin,end}_single() are supposed to operate on the parent, not on the child, so they should not attempt to get the context to poll from the child but the parent instead. BDRV_POLL_WHILE(c->bs) does get the context from the child, so we should replace it with AIO_WAIT_WHILE() on the parent's context instead.

This problem becomes apparent when bdrv_replace_child_noperm() invokes bdrv_parent_drained_end_single() after removing a child from a subgraph that is in an I/O thread. By the time bdrv_parent_drained_end_single() is called, child->bs is NULL, and so BDRV_POLL_WHILE(c->bs, ...) will poll the main loop instead of the I/O thread; but anything that bdrv_parent_drained_end_single_no_poll() may have scheduled is going to want to run in the I/O thread, but because we poll the main loop, the I/O thread is never unpaused, and nothing is run, resulting in a deadlock.

Closes: https://gitlab.com/qemu-project/qemu/-/issues/1215
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221107151321.211175-4-hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-30  Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging  (Stefan Hajnoczi, 1 file, -4/+4)
Block layer patches

- Cleanup bs->backing and bs->file handling
- Refactor bdrv_try_set_aio_context using transactions
- Changes for improved coroutine_fn consistency
- vhost-user-blk: fix the resize crash
- io_uring: Use of io_uring_register_ring_fd() led to breakage, revert
- vvfat: Fix some problems with r/w mode
- Code cleanup
- MAINTAINERS: Fold "Block QAPI, monitor, ..." into "Block layer core"

# gpg: Signature made Thu 27 Oct 2022 14:29:38 EDT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6

* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (58 commits)
  block/block-backend: blk_set_enable_write_cache is IO_CODE
  monitor: switch to *_co_* functions
  vmdk: switch to *_co_* functions
  vhdx: switch to *_co_* functions
  vdi: switch to *_co_* functions
  qed: switch to *_co_* functions
  qcow2: switch to *_co_* functions
  qcow: switch to *_co_* functions
  parallels: switch to *_co_* functions
  mirror: switch to *_co_* functions
  block: switch to *_co_* functions
  commit: switch to *_co_* functions
  vmdk: manually add more coroutine_fn annotations
  qcow2: manually add more coroutine_fn annotations
  qcow: manually add more coroutine_fn annotations
  blkdebug: add missing coroutine_fn annotation for indirect-called functions
  qcow2: add coroutine_fn annotation for indirect-called functions
  block: add missing coroutine_fn annotation to BlockDriverState callbacks
  coroutine-io: add missing coroutine_fn annotation to prototypes
  coroutine-lock: add missing coroutine_fn annotation to prototypes
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>