aboutsummaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)AuthorFilesLines
2022-10-30Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into stagingStefan Hajnoczi42-371/+372
Block layer patches - Cleanup bs->backing and bs->file handling - Refactor bdrv_try_set_aio_context using transactions - Changes for improved coroutine_fn consistency - vhost-user-blk: fix the resize crash - io_uring: Use of io_uring_register_ring_fd() led to breakage, revert - vvfat: Fix some problems with r/w mode - Code cleanup - MAINTAINERS: Fold "Block QAPI, monitor, ..." into "Block layer core" # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmNazhIRHGt3b2xmQHJl # ZGhhdC5jb20ACgkQfwmycsiPL9ZyTw/8Dfck/SuxfyeLlnQItkjaV4cnqWOU8vHs # 9x0KhlptCs+HXdF/3iicpA0lHojn7mNnbdFGjPRY4E0LriQv91TQ5ycdEmrseFPf # sgeQlgdKCVU/pHjZ2wYarm2pE43Cx85a5xuufmw+7w49dNNZn14l4t+DgviuClVM # nuVaogfZFbYyetre+Qd2TgLl+gJ+0d4o7Zs5lSWLrT8t0L9AGkcWPA7Nrbl6loIE # dOautV4G7jLjuMiCeJZOGcnuRVe3gCQ5rCGBFzzH4DUtz4BmiYx4hd3LMEsP0PMM # CrsfDZS04Ztybl9M7TmJuwkAm1gx1JDMOuJuh18lbJocIOBvhkKKxY2wI5LIdZVI # ZntmU36RowkX+GGu/PYpYyMjBDClJppZCl7vnjyLYsVt6r0Vu6SmlHpJhcRYabhe # 96Kv1LXH9A6+ogKPU3Layw6JGjg01GNr1ALuT7PO3pGto/JshmOuBEJJDucoF84M # 5AfxFCohMROVldwblA6M0eKnlQBgtr5BvtgbV54BBo88VlFJgDJFQn7R09cTFUEo # UwaJoS+nIaiZ0bQQVZhZloVppUaTdVJojzfVRCZZctga96/tu1HSFnGLnbEFpUN3 # KOf+XnVNS6Ro+nPSDf9bMjbIom2JicGFfV+6yMgIoxY/d5UA2dTZfefil4TAlSod # 6PsTgg+jrm8= # =/Fw0 # -----END PGP SIGNATURE----- # gpg: Signature made Thu 27 Oct 2022 14:29:38 EDT # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (58 commits) block/block-backend: blk_set_enable_write_cache is IO_CODE monitor: switch to *_co_* functions vmdk: switch to *_co_* functions vhdx: switch to *_co_* functions vdi: switch to *_co_* functions qed: switch to *_co_* functions qcow2: switch to *_co_* functions qcow: switch to *_co_* functions parallels: switch to *_co_* functions mirror: switch to *_co_* functions block: switch to *_co_* functions commit: switch to *_co_* functions vmdk: manually add more coroutine_fn annotations qcow2: manually add more coroutine_fn annotations qcow: manually add more coroutine_fn annotations blkdebug: add missing coroutine_fn annotation for indirect-called functions qcow2: add coroutine_fn annotation for indirect-called functions block: add missing coroutine_fn annotation to BlockDriverState callbacks coroutine-io: add missing coroutine_fn annotation to prototypes coroutine-lock: add missing coroutine_fn annotation to prototypes ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-27block/block-backend: blk_set_enable_write_cache is IO_CODEEmanuele Giuseppe Esposito1-1/+1
blk_set_enable_write_cache() is defined as GLOBAL_STATE_CODE but can be invoked from iothreads when handling scsi requests. This triggers an assertion failure: 0x00007fd6c3515ce1 in raise () from /lib/x86_64-linux-gnu/libc.so.6 0x00007fd6c34ff537 in abort () from /lib/x86_64-linux-gnu/libc.so.6 0x00007fd6c34ff40f in ?? () from /lib/x86_64-linux-gnu/libc.so.6 0x00007fd6c350e662 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6 0x000056149e2cea03 in blk_set_enable_write_cache (wce=true, blk=0x5614a01c27f0) at ../src/block/block-backend.c:1949 0x000056149e2d0a67 in blk_set_enable_write_cache (blk=0x5614a01c27f0, wce=<optimized out>) at ../src/block/block-backend.c:1951 0x000056149dfe9c59 in scsi_disk_apply_mode_select (p=0x7fd6b400c00e "\004", page=<optimized out>, s=<optimized out>) at ../src/hw/scsi/scsi-disk.c:1520 mode_select_pages (change=true, len=18, p=0x7fd6b400c00e "\004", r=0x7fd6b4001ff0) at ../src/hw/scsi/scsi-disk.c:1570 scsi_disk_emulate_mode_select (inbuf=<optimized out>, r=0x7fd6b4001ff0) at ../src/hw/scsi/scsi-disk.c:1640 scsi_disk_emulate_write_data (req=0x7fd6b4001ff0) at ../src/hw/scsi/scsi-disk.c:1934 0x000056149e18ff16 in virtio_scsi_handle_cmd_req_submit (req=<optimized out>, req=<optimized out>, s=0x5614a12f16b0) at ../src/hw/scsi/virtio-scsi.c:719 virtio_scsi_handle_cmd_vq (vq=0x7fd6bab92140, s=0x5614a12f16b0) at ../src/hw/scsi/virtio-scsi.c:761 virtio_scsi_handle_cmd (vq=<optimized out>, vdev=<optimized out>) at ../src/hw/scsi/virtio-scsi.c:775 virtio_scsi_handle_cmd (vdev=0x5614a12f16b0, vq=0x7fd6bab92140) at ../src/hw/scsi/virtio-scsi.c:765 0x000056149e1a8aa6 in virtio_queue_notify_vq (vq=0x7fd6bab92140) at ../src/hw/virtio/virtio.c:2365 0x000056149e3ccea5 in aio_dispatch_handler (ctx=ctx@entry=0x5614a01babe0, node=<optimized out>) at ../src/util/aio-posix.c:369 0x000056149e3cd868 in aio_dispatch_ready_handlers (ready_list=0x7fd6c09b2680, ctx=0x5614a01babe0) at ../src/util/aio-posix.c:399 aio_poll (ctx=0x5614a01babe0, blocking=blocking@entry=true) at ../src/util/aio-posix.c:713 0x000056149e2a7796 in iothread_run (opaque=opaque@entry=0x56149ffde500) at ../src/iothread.c:67 0x000056149e3d0859 in qemu_thread_start (args=0x7fd6c09b26f0) at ../src/util/qemu-thread-posix.c:504 0x00007fd6c36b9ea7 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 0x00007fd6c35d9aef in clone () from /lib/x86_64-linux-gnu/libc.so.6 Changing GLOBAL_STATE_CODE in IO_CODE is allowed, since GSC callers are allowed to call IO_CODE. Resolves: #1272 Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20221027072726.2681500-1-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Tested-by: Antoine Damhet <antoine.damhet@shadow.tech> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27vmdk: switch to *_co_* functionsAlberto Faria1-27/+27
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-24-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27vhdx: switch to *_co_* functionsAlberto Faria1-4/+4
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-23-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27vdi: switch to *_co_* functionsAlberto Faria1-8/+9
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-22-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27qed: switch to *_co_* functionsAlberto Faria2-7/+7
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-21-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27qcow2: switch to *_co_* functionsAlberto Faria4-24/+24
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-20-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27qcow: switch to *_co_* functionsAlberto Faria1-22/+23
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-19-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27parallels: switch to *_co_* functionsAlberto Faria1-14/+14
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-18-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27mirror: switch to *_co_* functionsAlberto Faria1-2/+2
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-17-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: switch to *_co_* functionsAlberto Faria1-2/+2
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-16-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27commit: switch to *_co_* functionsAlberto Faria1-1/+1
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-15-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27vmdk: manually add more coroutine_fn annotationsPaolo Bonzini1-17/+17
The validity of these was double-checked with Alberto Faria's static analyzer. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-14-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27qcow2: manually add more coroutine_fn annotationsPaolo Bonzini4-24/+27
The validity of these was double-checked with Alberto Faria's static analyzer. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-13-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27qcow: manually add more coroutine_fn annotationsPaolo Bonzini1-6/+9
get_cluster_offset() and decompress_cluster() are only called from the read and write paths. The validity of these was double-checked with Alberto Faria's static analyzer. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-12-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27blkdebug: add missing coroutine_fn annotation for indirect-called functionsPaolo Bonzini1-1/+1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-11-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27qcow2: add coroutine_fn annotation for indirect-called functionsAlberto Faria1-4/+4
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-10-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: add missing coroutine_fn annotation to BlockDriverState callbacksAlberto Faria1-7/+7
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-9-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27ssh: add missing coroutine_fn annotationAlberto Faria1-3/+3
ssh_write is only called from ssh_co_writev. Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-5-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27monitor: add missing coroutine_fn annotationAlberto Faria1-1/+1
hmp_block_resize and hmp_screendump are defined as a ".coroutine = true" command, so they must be coroutine_fn. Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-4-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: remove incorrect coroutine_fn annotationAlberto Faria1-2/+2
Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-3-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27backup: remove incorrect coroutine_fn annotationAlberto Faria1-1/+1
The .set_speed callback is not called from coroutine. Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-2-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block/nfs: Fix 32-bit Windows buildBin Meng1-0/+8
libnfs.h declares nfs_fstat() as the following for win32: int nfs_fstat(struct nfs_context *nfs, struct nfsfh *nfsfh, struct __stat64 *st); The 'st' parameter should be of type 'struct __stat64'. The codes happen to build successfully for 64-bit Windows, but it does not build for 32-bit Windows. Fixes: 6542aa9c75bc ("block: add native support for NFS") Fixes: 18a8056e0bc7 ("block/nfs: cache allocated filesize for read-only files") Signed-off-by: Bin Meng <bin.meng@windriver.com> Message-Id: <20220908132817.1831008-6-bmeng.cn@gmail.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: remove bdrv_try_set_aio_context and replace it with ↵Emanuele Giuseppe Esposito1-1/+1
bdrv_try_change_aio_context No functional change intended. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221025084952.2139888-11-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: rename bdrv_child_try_change_aio_context in bdrv_try_change_aio_contextEmanuele Giuseppe Esposito1-2/+1
No functional changes intended. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221025084952.2139888-10-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: remove all unused ->can_set_aio_ctx and ->set_aio_ctx callbacksEmanuele Giuseppe Esposito1-33/+0
Together with all _can_set_ and _set_ APIs, as they are not needed anymore. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221025084952.2139888-9-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: use the new _change_ API instead of _can_set_ and _set_Emanuele Giuseppe Esposito1-2/+6
Replace all direct usage of ->can_set_aio_ctx and ->set_aio_ctx, and call bdrv_child_try_change_aio_context() in bdrv_try_set_aio_context(), the main function called through the whole block layer. From this point onwards, ->can_set_aio_ctx and ->set_aio_ctx won't be used anymore. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221025084952.2139888-8-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block-backend: implement .change_aio_ctx in child_rootEmanuele Giuseppe Esposito1-0/+52
blk_root_change_aio_ctx() is very similar to blk_root_can_set_aio_ctx(), but implements a new transaction so that if all check pass, the new transaction's .commit will take care of changing the BlockBackend AioContext. blk_root_set_aio_ctx_commit() is the same as blk_root_set_aio_ctx(). Note: bdrv_child_try_change_aio_context() is not called by anyone at this point. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221025084952.2139888-7-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block/snapshot: drop indirection around bdrv_snapshot_fallback_ptrVladimir Sementsov-Ogievskiy1-22/+16
Now the indirection is not actually used, we can safely reduce it to simple pointer. For consistency do a bit of refactoring to get rid of _ptr suffixes that become meaningless. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220726201134.924743-15-vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: Manipulate bs->file / bs->backing pointers in .attach/.detachVladimir Sementsov-Ogievskiy3-6/+5
bs->file and bs->backing are a kind of duplication of part of bs->children. But very useful diplication, so let's not drop them at all:) We should manage bs->file and bs->backing in same place, where we manage bs->children, to keep them in sync. Moreover, generic io paths are unprepared to BdrvChild without a bs, so it's double good to clear bs->file / bs->backing when we detach the child. Detach is simple: if we detach bs->file or bs->backing child, just set corresponding field to NULL. Attach is a bit more complicated. But we still can precisely detect should we set one of bs->file / bs->backing or not: - if role is BDRV_CHILD_COW, we definitely deal with bs->backing - else, if role is BDRV_CHILD_FILTERED (it must be also BDRV_CHILD_PRIMARY), it's a filtered child. Use bs->drv->filtered_child_is_backing to chose the pointer field to modify. - else, if role is BDRV_CHILD_PRIMARY, we deal with bs->file - in all other cases, it's neither bs->backing nor bs->file. It's some other child and we shouldn't care OK. This change brings one more good thing: we can (and should) get rid of all indirect pointers in the block-graph-change transactions: bdrv_attach_child_common() stores BdrvChild** into transaction to clear it on abort. bdrv_attach_child_common() has two callers: bdrv_attach_child_noperm() just pass-through this feature, bdrv_root_attach_child() doesn't need the feature. Look at bdrv_attach_child_noperm() callers: - bdrv_attach_child() doesn't need the feature - bdrv_set_file_or_backing_noperm() uses the feature to manage bs->file and bs->backing, we don't want it anymore - bdrv_append() uses the feature to manage bs->backing, again we don't want it anymore So, we should drop this stuff! Great! We could probably keep BdrvChild** argument to keep the int return value, but it seems not worth the complexity. Finally, we now set .file / .backing automatically in generic code and want to restring setting them by hand outside of .attach/.detach. So, this patch cleanups all remaining places where they were set. To find such places I use: git grep '\->file =' git grep '\->backing =' git grep '&.*\<backing\>' git grep '&.*\<file\>' Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220726201134.924743-14-vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block/snapshot: stress that we fallback to primary childVladimir Sementsov-Ogievskiy1-20/+10
Actually what we chose is a primary child. Let's stress it in the code. We are going to drop indirect pointer logic here in future. Actually this commit simplifies the future work: we drop use of indirection in the assertion now. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220726201134.924743-9-vsementsov@yandex-team.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block/blklogwrites: don't care to remove bs->file child on failureVladimir Sementsov-Ogievskiy1-4/+0
We don't need to remove bs->file, generic layer takes care of it. No other driver cares to remove bs->file on failure by hand. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220726201134.924743-4-vsementsov@yandex-team.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: introduce bdrv_open_file_child() helperVladimir Sementsov-Ogievskiy22-101/+71
Almost all drivers call bdrv_open_child() similarly. Let's create a helper for this. The only not updated drivers that call bdrv_open_child() to set bs->file are raw-format and snapshot-access: raw-format sometimes want to have filtered child but don't set drv->is_filter to true. snapshot-access wants only DATA | PRIMARY Possibly we should implement drv->is_filter_func() handler, to consider raw-format as filter when it works as filter.. But it's another story. Note also, that we decrease assignments to bs->file in code: it helps us restrict modifying this field in further commit. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220726201134.924743-3-vsementsov@yandex-team.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: BlockDriver: add .filtered_child_is_backing fieldVladimir Sementsov-Ogievskiy2-0/+2
Unfortunately not all filters use .file child as filtered child. Two exclusions are mirror_top and commit_top. Happily they both are private filters. Bad thing is that this inconsistency is observable through qmp commands query-block / query-named-block-nodes. So, could we just change mirror_top and commit_top to use file child as all other filter driver is an open question. Probably, we could do that with some kind of deprecation period, but how to warn users during it? For now, let's just add a field so we can distinguish them in generic code, it will be used in further commits. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220726201134.924743-2-vsementsov@yandex-team.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block/io_uring: revert "Use io_uring_register_ring_fd() to skip fd operations"Sam Li1-12/+1
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1193 The commit "Use io_uring_register_ring_fd() to skip fd operations" broke when booting a guest with iothread and io_uring. That is because the io_uring_register_ring_fd() call is made from the main thread instead of IOThread where io_uring_submit() is called. It can not be guaranteed to register the ring fd in the correct thread or unregister the same ring fd if the IOThread is disabled. This optimization is not critical so we will revert previous commit. This reverts commit e2848bc574fe2715c694bf8fe9a1ba7f78a1125a and 77e3f038af1764983087e3551a0fde9951952c4d. Cc: qemu-stable@nongnu.org Signed-off-by: Sam Li <faithilikerun@gmail.com> Message-Id: <20220924144815.5591-1-faithilikerun@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Dario Faggioli <dfaggioli@suse.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27vvfat: allow spaces in file namesHervé Poussineau1-1/+1
In R/W mode, files with spaces were never created on host side. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1176 Fixes: c79e243ed67683d6d06692bd7040f7394da178b0 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221010175511.3414357-3-hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27vvfat: allow some writes to bootsectorHervé Poussineau1-1/+25
'reserved1' field in bootsector is used to mark volume dirty, or need to verify. Allow writes to bootsector which only changes the 'reserved1' field. This fixes I/O errors on Windows guests. Resolves: https://bugs.launchpad.net/qemu/+bug/1889421 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Message-Id: <20221010175511.3414357-2-hpoussin@reactos.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27block: Refactor get_tmp_filename()Bin Meng1-4/+3
At present there are two callers of get_tmp_filename() and they are inconsistent. One does: /* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */ char *tmp_filename = g_malloc0(PATH_MAX + 1); ... ret = get_tmp_filename(tmp_filename, PATH_MAX + 1); while the other does: s->qcow_filename = g_malloc(PATH_MAX); ret = get_tmp_filename(s->qcow_filename, PATH_MAX); As we can see different 'size' arguments are passed. There are also platform specific implementations inside the function, and the use of snprintf is really undesirable. The function name is also misleading. It creates a temporary file, not just a filename. Refactor this routine by changing its name and signature to: char *create_tmp_file(Error **errp) and use g_get_tmp_dir() / g_mkstemp() for a consistent implementation. While we are here, add some comments to mention that /var/tmp is preferred over /tmp on non-win32 hosts. Signed-off-by: Bin Meng <bin.meng@windriver.com> Message-Id: <20221010040432.3380478-2-bin.meng@windriver.com> [kwolf: Fixed incorrect errno negation and iotest 051] Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-26blkio: implement BDRV_REQ_REGISTERED_BUF optimizationStefan Hajnoczi1-3/+180
Avoid bounce buffers when QEMUIOVector elements are within previously registered bdrv_register_buf() buffers. The idea is that emulated storage controllers will register guest RAM using bdrv_register_buf() and set the BDRV_REQ_REGISTERED_BUF on I/O requests. Therefore no blkio_map_mem_region() calls are necessary in the performance-critical I/O code path. This optimization doesn't apply if the I/O buffer is internally allocated by QEMU (e.g. qcow2 metadata). There we still take the slow path because BDRV_REQ_REGISTERED_BUF is not set. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20221013185908.1297568-13-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26block: add BlockRAMRegistrarStefan Hajnoczi2-0/+59
Emulated devices and other BlockBackend users wishing to take advantage of blk_register_buf() all have the same repetitive job: register RAMBlocks with the BlockBackend using RAMBlockNotifier. Add a BlockRAMRegistrar API to do this. A later commit will use this from hw/block/virtio-blk.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20221013185908.1297568-10-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26block: return errors from bdrv_register_buf()Stefan Hajnoczi3-14/+42
Registering an I/O buffer is only a performance optimization hint but it is still necessary to return errors when it fails. Later patches will need to detect errors when registering buffers but an immediate advantage is that error_report() calls are no longer needed in block driver .bdrv_register_buf() functions. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20221013185908.1297568-8-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26block: add BDRV_REQ_REGISTERED_BUF request flagStefan Hajnoczi14-37/+46
Block drivers may optimize I/O requests accessing buffers previously registered with bdrv_register_buf(). Checking whether all elements of a request's QEMUIOVector are within previously registered buffers is expensive, so we need a hint from the user to avoid costly checks. Add a BDRV_REQ_REGISTERED_BUF request flag to indicate that all QEMUIOVector elements in an I/O request are known to be within previously registered buffers. Always pass the flag through to driver read/write functions. There is little harm in passing the flag to a driver that does not use it. Passing the flag to drivers avoids changes across many block drivers. Filter drivers would need to explicitly support the flag and pass through to their children when the children support it. That's a lot of code changes and it's hard to remember to do that everywhere, leading to silent reduced performance when the flag is accidentally dropped. The only problematic scenario with the approach in this patch is when a driver passes the flag through to internal I/O requests that don't use the same I/O buffer. In that case the hint may be set when it should actually be clear. This is a rare case though so the risk is low. Some drivers have assert(!flags), which no longer works when BDRV_REQ_REGISTERED_BUF is passed in. These assertions aren't very useful anyway since the functions are called almost exclusively by bdrv_driver_preadv/pwritev() so if we get flags handling right there then the assertion is not needed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20221013185908.1297568-7-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26block: pass size to bdrv_unregister_buf()Stefan Hajnoczi3-6/+6
The only implementor of bdrv_register_buf() is block/nvme.c, where the size is not needed when unregistering a buffer. This is because util/vfio-helpers.c can look up mappings by address. Future block drivers that implement bdrv_register_buf() may not be able to do their job given only the buffer address. Add a size argument to bdrv_unregister_buf(). Also document the assumptions about bdrv_register_buf()/bdrv_unregister_buf() calls. The same <host, size> values that were given to bdrv_register_buf() must be given to bdrv_unregister_buf(). gcc 11.2.1 emits a spurious warning that img_bench()'s buf_size local variable might be uninitialized, so it's necessary to silence the compiler. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20221013185908.1297568-5-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26blkio: add libblkio block driverStefan Hajnoczi2-0/+832
libblkio (https://gitlab.com/libblkio/libblkio/) is a library for high-performance disk I/O. It currently supports io_uring, virtio-blk-vhost-user, and virtio-blk-vhost-vdpa with additional drivers under development. One of the reasons for developing libblkio is that other applications besides QEMU can use it. This will be particularly useful for virtio-blk-vhost-user which applications may wish to use for connecting to qemu-storage-daemon. libblkio also gives us an opportunity to develop in Rust behind a C API that is easy to consume from QEMU. This commit adds io_uring, nvme-io_uring, virtio-blk-vhost-user, and virtio-blk-vhost-vdpa BlockDrivers to QEMU using libblkio. It will be easy to add other libblkio drivers since they will share the majority of code. For now I/O buffers are copied through bounce buffers if the libblkio driver requires it. Later commits add an optimization for pre-registering guest RAM to avoid bounce buffers. The syntax is: --blockdev io_uring,node-name=drive0,filename=test.img,readonly=on|off,cache.direct=on|off --blockdev nvme-io_uring,node-name=drive0,filename=/dev/ng0n1,readonly=on|off,cache.direct=on --blockdev virtio-blk-vhost-vdpa,node-name=drive0,path=/dev/vdpa...,readonly=on|off,cache.direct=on --blockdev virtio-blk-vhost-user,node-name=drive0,path=vhost-user-blk.sock,readonly=on|off,cache.direct=on Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20221013185908.1297568-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-07file-posix: Remove unused s->discard_zeroesKevin Wolf1-9/+0
The field is unused (only ever set, but never read) since commit ac9185603. Additionally, the commit message of commit 34fa110e already explained earlier why it's unreliable. Remove it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220923142838.91043-1-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-07job.c: enable job lock/unlock and remove Aiocontext locksEmanuele Giuseppe Esposito1-0/+2
Change the job_{lock/unlock} and macros to use job_mutex. Now that they are not nop anymore, remove the aiocontext to avoid deadlocks. Therefore: - when possible, remove completely the aiocontext lock/unlock pair - if it is used by some other function too, reduce the locking section as much as possible, leaving the job API outside. - change AIO_WAIT_WHILE in AIO_WAIT_WHILE_UNLOCKED, since we are not using the aiocontext lock anymore The only functions that still need the aiocontext lock are: - the JobDriver callbacks, already documented in job.h - job_cancel_sync() in replication.c is called with aio_context_lock taken, but now job is using AIO_WAIT_WHILE_UNLOCKED so we need to release the lock. Reduce the locking section to only cover the callback invocation and document the functions that take the AioContext lock, to avoid taking it twice. Also remove real_job_{lock/unlock}, as they are replaced by the public functions. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220926093214.506243-19-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-07blockjob: protect iostatus field in BlockJob structEmanuele Giuseppe Esposito1-1/+5
iostatus is the only field (together with .job) that needs protection using the job mutex. It is set in the main loop (GLOBAL_STATE functions) but read in I/O code (block_job_error_action). In order to protect it, change block_job_iostatus_set_err to block_job_iostatus_set_err_locked(), always called under job lock. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20220926093214.506243-17-eesposit@redhat.com> [kwolf: Fixed up type of iostatus] Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-07jobs: protect job.aio_context with BQL and job_mutexEmanuele Giuseppe Esposito1-0/+1
In order to make it thread safe, implement a "fake rwlock", where we allow reads under BQL *or* job_mutex held, but writes only under BQL *and* job_mutex. The only write we have is in child_job_set_aio_ctx, which always happens under drain (so the job is paused). For this reason, introduce job_set_aio_context and make sure that the context is set under BQL, job_mutex and drain. Also make sure all other places where the aiocontext is read are protected. The reads in commit.c and mirror.c are actually safe, because always done under BQL. Note: at this stage, job_{lock/unlock} and job lock guard macros are *nop*. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220926093214.506243-14-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-07block/mirror.c: use of job helpers in driversEmanuele Giuseppe Esposito1-4/+9
Once job lock is used and aiocontext is removed, mirror has to perform job operations under the same critical section, Note: at this stage, job_{lock/unlock} and job lock guard macros are *nop*. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20220926093214.506243-11-eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-07quorum: Remove unnecessary forward declarationKevin Wolf1-2/+0
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20221006122607.162769-1-kwolf@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>