aboutsummaryrefslogtreecommitdiff
path: root/block/qcow2.c
AgeCommit message (Collapse)AuthorFilesLines
2018-06-15block: Factor out qobject_input_visitor_new_flat_confused()Markus Armbruster1-5/+2
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-06-15block: Clean up a misuse of qobject_to() in .bdrv_co_create_opts()Markus Armbruster1-5/+4
The following pattern occurs in the .bdrv_co_create_opts() methods of parallels, qcow, qcow2, qed, vhdx and vpc: qobj = qdict_crumple_for_keyval_qiv(qdict, errp); qobject_unref(qdict); qdict = qobject_to(QDict, qobj); if (qdict == NULL) { ret = -EINVAL; goto done; } v = qobject_input_visitor_new_keyval(QOBJECT(qdict)); [...] ret = 0; done: qobject_unref(qdict); [...] return ret; If qobject_to() fails, we return failure without setting errp. That's wrong. As far as I can tell, it cannot fail here. Clean it up anyway, by removing the useless conversion. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-06-15block: Fix -blockdev for certain non-string scalarsMarkus Armbruster1-1/+1
Configuration flows through the block subsystem in a rather peculiar way. Configuration made with -drive enters it as QemuOpts. Configuration made with -blockdev / blockdev-add enters it as QAPI type BlockdevOptions. The block subsystem uses QDict, QemuOpts and QAPI types internally. The precise flow is next to impossible to explain (I tried for this commit message, but gave up after wasting several hours). What I can explain is a flaw in the BlockDriver interface that leads to this bug: $ qemu-system-x86_64 -blockdev node-name=n1,driver=nfs,server.type=inet,server.host=localhost,path=/foo/bar,user=1234 qemu-system-x86_64: -blockdev node-name=n1,driver=nfs,server.type=inet,server.host=localhost,path=/foo/bar,user=1234: Internal error: parameter user invalid QMP blockdev-add is broken the same way. Here's what happens. The block layer passes configuration represented as flat QDict (with dotted keys) to BlockDriver methods .bdrv_file_open(). The QDict's members are typed according to the QAPI schema. nfs_file_open() converts it to QAPI type BlockdevOptionsNfs, with qdict_crumple() and a qobject input visitor. This visitor comes in two flavors. The plain flavor requires scalars to be typed according to the QAPI schema. That's the case here. The keyval flavor requires string scalars. That's not the case here. nfs_file_open() uses the latter, and promptly falls apart for members @user, @group, @tcp-syn-count, @readahead-size, @page-cache-size, @debug. Switching to the plain flavor would fix -blockdev, but break -drive, because there the scalars arrive in nfs_file_open() as strings. The proper fix would be to replace the QDict by QAPI type BlockdevOptions in the BlockDriver interface. Sadly, that's beyond my reach right now. Next best would be to fix the block layer to always pass correctly typed QDicts to the BlockDriver methods. Also beyond my reach. What I can do is throw another hack onto the pile: have nfs_file_open() convert all members to string, so use of the keyval flavor actually works, by replacing qdict_crumple() by new function qdict_crumple_for_keyval_qiv(). The pattern "pass result of qdict_crumple() to qobject_input_visitor_new_keyval()" occurs several times more: * qemu_rbd_open() Same issue as nfs_file_open(), but since BlockdevOptionsRbd has only string members, its only a latent bug. Fix it anyway. * parallels_co_create_opts(), qcow_co_create_opts(), qcow2_co_create_opts(), bdrv_qed_co_create_opts(), sd_co_create_opts(), vhdx_co_create_opts(), vpc_co_create_opts() These work, because they create the QDict with qemu_opts_to_qdict_filtered(), which creates only string scalars. The function sports a TODO comment asking for better typing; that's going to be fun. Use qdict_crumple_for_keyval_qiv() to be safe. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-06-15block: Add block-specific QDict headerMax Reitz1-0/+1
There are numerous QDict functions that have been introduced for and are used only by the block layer. Move their declarations into an own header file to reflect that. While qdict_extract_subqdict() is in fact used outside of the block layer (in util/qemu-config.c), it is still a function related very closely to how the block layer works with nested QDicts, namely by sometimes flattening them. Therefore, its declaration is put into this header as well and util/qemu-config.c includes it with a comment stating exactly which function it needs. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20180509165530.29561-7-mreitz@redhat.com> [Copyright note tweaked, superfluous includes dropped] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-06-11qcow2: Do not mark inactive images corruptMax Reitz1-1/+1
When signaling a corruption on a read-only image, qcow2 already makes fatal events non-fatal (i.e., they will not result in the image being closed, and the image header's corrupt flag will not be set). This is necessary because we cannot set the corrupt flag on read-only images, and it is possible because further corruption of read-only images is impossible. Inactive images are effectively read-only, too, so we should do the same for them. bdrv_is_writable() can tell us whether an image can actually be written to, so use its result instead of !bs->read_only. (Otherwise, the assert(!(bs->open_flags & BDRV_O_INACTIVE)) in bdrv_co_pwritev() will fail, crashing qemu.) Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180606193702.7113-3-mreitz@redhat.com Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-06-11block: Add Error parameter to bdrv_amend_optionsMax Reitz1-31/+41
Looking at the qcow2 code that is riddled with error_report() calls, this is really how it should have been from the start. Along the way, turn the target_version/current_version comparisons at the beginning of qcow2_downgrade() into assertions (the caller has to make sure these conditions are met), and rephrase the error message on using compat=1.1 to get refcount widths other than 16 bits. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180509210023.20283-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-06-04Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into ↵Peter Maydell1-30/+199
staging Pull request * Copy offloading for qemu-img convert (iSCSI, raw, and qcow2) If the underlying storage supports copy offloading, qemu-img convert will use it instead of performing reads and writes. This avoids data transfers and thus frees up storage bandwidth for other purposes. SCSI EXTENDED COPY and Linux copy_file_range(2) are used to implement this optimization. * Drop spurious "WARNING: I\/O thread spun for 1000 iterations" warning # gpg: Signature made Mon 04 Jun 2018 12:20:08 BST # gpg: using RSA key 9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: main-loop: drop spin_counter qemu-img: Convert with copy offloading block-backend: Add blk_co_copy_range iscsi: Implement copy offloading iscsi: Create and use iscsi_co_wait_for_task iscsi: Query and save device designator when opening file-posix: Implement bdrv_co_copy_range qcow2: Implement copy offloading raw: Implement copy offloading raw: Check byte range uniformly block: Introduce API for copy offloading Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-04Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell1-2/+2
acpi, vhost, misc: fixes, features vDPA support, fix to vhost blk RO bit handling, some include path cleanups, NFIT ACPI table. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Fri 01 Jun 2018 17:25:19 BST # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (31 commits) vhost-blk: turn on pre-defined RO feature bit ACPI testing: test NFIT platform capabilities nvdimm, acpi: support NFIT platform capabilities tests/.gitignore: add entry for generated file arch_init: sort architectures ui: use local path for local headers qga: use local path for local headers colo: use local path for local headers migration: use local path for local headers usb: use local path for local headers sd: fix up include vhost-scsi: drop an unused include ppc: use local path for local headers rocker: drop an unused include e1000e: use local path for local headers ioapic: fix up includes ide: use local path for local headers display: use local path for local headers trace: use local path for local headers migration: drop an unused include ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-01qcow2: Implement copy offloadingFam Zheng1-30/+199
The two callbacks are implemented quite similarly to the read/write functions: bdrv_co_copy_range_from maps for read and calls into bs->file or bs->backing depending on the allocation status; bdrv_co_copy_range_to maps for write and calls into bs->file. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 20180601092648.24614-5-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-05-31block: use local path for local headersMichael S. Tsirkin1-2/+2
When pulling in headers that are in the same directory as the C file (as opposed to one in include/), we should use its relative path, without a directory. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-05-29qcow2: Fix Coverity warning when calculating the refcount cache sizeAlberto Garcia1-3/+2
MIN_REFCOUNT_CACHE_SIZE is 4 and the cluster size is guaranteed to be at most 2MB, so the minimum refcount cache size (in bytes) is always going to fit in a 32-bit integer. Coverity doesn't know that, and since we're storing the result in a uint64_t (*refcount_cache_size) it thinks that we need the 64 bits and that we probably want to do a 64-bit multiplication to prevent the result from being truncated. This is a false positive in this case, but it's a fair warning. We could do a 64-bit multiplication to get rid of it, but since we know that a 32-bit variable is enough to store this value let's simply reuse min_refcount_cache, make it a normal int and stop doing casts. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-05-15qcow2: Give the refcount cache the minimum possible size by defaultAlberto Garcia1-12/+19
The L2 and refcount caches have default sizes that can be overridden using the l2-cache-size and refcount-cache-size (an additional parameter named cache-size sets the combined size of both caches). Unless forced by one of the aforementioned parameters, QEMU will set the unspecified sizes so that the L2 cache is 4 times larger than the refcount cache. This is based on the premise that the refcount metadata needs to be only a fourth of the L2 metadata to cover the same amount of disk space. This is incorrect for two reasons: a) The amount of disk covered by an L2 table depends solely on the cluster size, but in the case of a refcount block it depends on the cluster size *and* the width of each refcount entry. The 4/1 ratio is only valid with 16-bit entries (the default). b) When we talk about disk space and L2 tables we are talking about guest space (L2 tables map guest clusters to host clusters), whereas refcount blocks are used for host clusters (including L1/L2 tables and the refcount blocks themselves). On a fully populated (and uncompressed) qcow2 file, image size > virtual size so there are more refcount entries than L2 entries. Problem (a) could be fixed by adjusting the algorithm to take into account the refcount entry width. Problem (b) could be fixed by increasing a bit the refcount cache size to account for the clusters used for qcow2 metadata. However this patch takes a completely different approach and instead of keeping a ratio between both cache sizes it assigns as much as possible to the L2 cache and the remainder to the refcount cache. The reason is that L2 tables are used for every single I/O request from the guest and the effect of increasing the cache is significant and clearly measurable. Refcount blocks are however only used for cluster allocation and internal snapshots and in practice are accessed sequentially in most cases, so the effect of increasing the cache is negligible (even when doing random writes from the guest). So, make the refcount cache as small as possible unless the user explicitly asks for a larger one. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 9695182c2eb11b77cb319689a1ebaa4e7c9d6591.1523968389.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-05-04qobject: Replace qobject_incref/QINCREF qobject_decref/QDECREFMarc-André Lureau1-4/+4
Now that we can safely call QOBJECT() on QObject * as well as its subtypes, we can have macros qobject_ref() / qobject_unref() that work everywhere instead of having to use QINCREF() / QDECREF() for QObject and qobject_incref() / qobject_decref() for its subtypes. The replacement is mechanical, except I broke a long line, and added a cast in monitor_qmp_cleanup_req_queue_locked(). Unlike qobject_decref(), qobject_unref() doesn't accept void *. Note that the new macros evaluate their argument exactly once, thus no need to shout them. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180419150145.24795-4-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Rebased, semantic conflict resolved, commit message improved] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2018-04-16qcow2: try load bitmaps only onceVladimir Sementsov-Ogievskiy1-8/+8
Checking reopen by existence of some bitmaps is wrong, as it may be some other bitmaps, or on the other hand, user may remove bitmaps. This criteria is bad. To simplify things and make behavior more predictable let's just add a flag to remember, that we've already tried to load bitmaps on open and do not want do it again. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20180411122606.367301-2-vsementsov@virtuozzo.com [mreitz: Changed comment wording according to Eric Blake's suggestion] Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-26qcow2: fix bitmaps loading when bitmaps already existVladimir Sementsov-Ogievskiy1-1/+16
On reopen with existing bitmaps, instead of loading bitmaps, lets reopen them if needed. This also fixes bitmaps migration through shared storage. Consider the case. Persistent bitmaps are stored on bdrv_inactivate. Then, on destination process_incoming_migration_bh() calls bdrv_invalidate_cache_all() which leads to qcow2_load_autoloading_dirty_bitmaps() which fails if bitmaps are already loaded on destination start. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20180320170521.32152-3-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-19qapi: Replace qobject_to_X(o) by qobject_to(X, o)Max Reitz1-1/+1
This patch was generated using the following Coccinelle script: @@ expression Obj; @@ ( - qobject_to_qnum(Obj) + qobject_to(QNum, Obj) | - qobject_to_qstring(Obj) + qobject_to(QString, Obj) | - qobject_to_qdict(Obj) + qobject_to(QDict, Obj) | - qobject_to_qlist(Obj) + qobject_to(QList, Obj) | - qobject_to_qbool(Obj) + qobject_to(QBool, Obj) ) and a bit of manual fix-up for overly long lines and three places in tests/check-qjson.c that Coccinelle did not find. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20180224154033.29559-4-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: swap order from qobject_to(o, X), rebase to master, also a fix to latent false-positive compiler complaint about hw/i386/acpi-build.c] Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-09block: x-blockdev-create QMP commandKevin Wolf1-0/+1
This adds a synchronous x-blockdev-create QMP command that can create qcow2 images on a given node name. We don't want to block while creating an image, so this is not the final interface in all aspects, but BlockdevCreateOptionsQcow2 and .bdrv_co_create() are what they actually might look like in the end. In any case, this should be good enough to test whether we interpret BlockdevCreateOptions as we should. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09qcow2: Use visitor for options in qcow2_create()Kevin Wolf1-140/+78
Instead of manually creating the BlockdevCreateOptions object, use a visitor to parse the given options into the QAPI object. This involves translation from the old command line syntax to the syntax mandated by the QAPI schema. Option names are still checked against qcow2_create_opts, so only the old option names are allowed on the command line, even if they are translated in qcow2_create(). In contrast, new option values are optionally recognised besides the old values: 'compat' accepts 'v2'/'v3' as an alias for '0.10'/'1.1', and 'encrypt.format' accepts 'qcow' as an alias for 'aes' now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09qcow2: Handle full/falloc preallocation in qcow2_co_create()Kevin Wolf1-9/+19
Once qcow2_co_create() can be called directly on an already existing node, we must provide the 'full' and 'falloc' preallocation modes outside of creating the image on the protocol layer. Fortunately, we have preallocated truncate now which can provide this functionality. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09qcow2: Use QCryptoBlockCreateOptions in qcow2_co_create()Kevin Wolf1-17/+45
Instead of passing the encryption format name and the QemuOpts down, use the QCryptoBlockCreateOptions contained in BlockdevCreateOptions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09qcow2: Use BlockdevRef in qcow2_co_create()Kevin Wolf1-14/+25
Instead of passing a separate BlockDriverState* into qcow2_co_create(), make use of the BlockdevRef that is included in BlockdevCreateOptions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09qcow2: Pass BlockdevCreateOptions to qcow2_co_create()Kevin Wolf1-38/+151
All of the simple options are now passed to qcow2_co_create() in a BlockdevCreateOptions object. Still missing: node-name and the encryption options. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09qcow2: Let qcow2_create() handle protocol layerKevin Wolf1-26/+38
Currently, qcow2_create() only parses the QemuOpts and then calls qcow2_co_create() for the actual image creation, which includes both the creation of the actual file on the file system and writing a valid empty qcow2 image into that file. The plan is that qcow2_co_create() becomes the function that implements the functionality for a future 'blockdev-create' QMP command, which only creates the qcow2 layer on an already opened file node. This is a first step towards that goal: Let's move out anything that deals with the protocol layer from qcow2_co_create() into qcow2_create(). This means that qcow2_co_create() doesn't need a file name any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09qcow2: Rename qcow2_co_create2() to qcow2_co_create()Kevin Wolf1-8/+8
The functions originally known as qcow2_create() and qcow2_create2() are now called qcow2_co_create_opts() and qcow2_co_create(), which matches the names of the BlockDriver callbacks that they will implement at the end of this patch series. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09qcow2: Generalize validate_table_offset() into qcow2_validate_table()Alberto Garcia1-47/+30
This function checks that the offset and size of a table are valid. While the offset checks are fine, the size check is too generic, since it only verifies that the total size in bytes fits in a 64-bit integer. In practice all tables used in qcow2 have much smaller size limits, so the size needs to be checked again for each table using its actual limit. This patch generalizes this function by allowing the caller to specify the maximum size for that table. In addition to that it allows passing an Error variable. The function is also renamed and made public since we're going to use it in other parts of the code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09block: convert bdrv_check callback to coroutine_fnPaolo Bonzini1-4/+19
Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-8-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09block: convert bdrv_invalidate_cache callback to coroutine_fnPaolo Bonzini1-2/+5
QED's bdrv_invalidate_cache implementation would like to reuse functions that acquire/release the metadata locks. Call it from coroutine context to simplify the logic. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-6-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09qcow2: make qcow2_do_open a coroutine_fnPaolo Bonzini1-5/+41
It is called from qcow2_invalidate_cache in coroutine context (incoming migration runs in a coroutine), so it's cleaner if metadata is always loaded from a coroutine. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-4-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09qcow2: introduce qcow2_write_caches and qcow2_flush_cachesPaolo Bonzini1-16/+4
They will be used to avoid recursively taking s->lock during bdrv_open or bdrv_check. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-7-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-06Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell1-28/+32
Block layer patches # gpg: Signature made Mon 05 Mar 2018 17:45:51 GMT # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (38 commits) block: Fix NULL dereference on empty drive error qcow2: Replace align_offset() with ROUND_UP() block/ssh: Add basic .bdrv_truncate() block/ssh: Make ssh_grow_file() blocking block/ssh: Pull ssh_grow_file() from ssh_create() qemu-img: Make resize error message more general qcow2: make qcow2_co_create2() a coroutine_fn block: rename .bdrv_create() to .bdrv_co_create_opts() Revert "IDE: Do not flush empty CDROM drives" block: test blk_aio_flush() with blk->root == NULL block: add BlockBackend->in_flight counter block: extract AIO_WAIT_WHILE() from BlockDriverState aio: rename aio_context_in_iothread() to in_aio_context_home_thread() docs: document how to use the l2-cache-entry-size parameter specs/qcow2: Fix documentation of the compressed cluster descriptor iotest 033: add misaligned write-zeroes test via truncate block: fix write with zero flag set and iovector provided block: Drop unused .bdrv_co_get_block_status() vvfat: Switch to .bdrv_co_block_status() vpc: Switch to .bdrv_co_block_status() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # include/block/block.h
2018-03-02Include less of the generated modular QAPI headersMarkus Armbruster1-2/+1
In my "build everything" tree, a change to the types in qapi-schema.json triggers a recompile of about 4800 out of 5100 objects. The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h, qapi-types.h. Each of these headers still includes all its shards. Reduce compile time by including just the shards we actually need. To illustrate the benefits: adding a type to qapi/migration.json now recompiles some 2300 instead of 4800 objects. The next commit will improve it further. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-24-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-02Include qapi/qmp/qerror.h exactly where neededMarkus Armbruster1-1/+0
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-2-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-02qcow2: Replace align_offset() with ROUND_UP()Alberto Garcia1-7/+7
The align_offset() function is equivalent to the ROUND_UP() macro so there's no need to use the former. The ROUND_UP() name is also a bit more explicit. This patch uses ROUND_UP() instead of the slower QEMU_ALIGN_UP() because align_offset() already requires that the second parameter is a power of two. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180215131008.5153-1-berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-02qcow2: make qcow2_co_create2() a coroutine_fnStefan Hajnoczi1-8/+9
qcow2_create2() calls qemu_co_mutex_lock(). Only a coroutine_fn may call another coroutine_fn. In fact, qcow2_create2 is always called from coroutine context. Rename the function to add the "co" moniker and add coroutine_fn. Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170705102231.20711-3-stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-02block: rename .bdrv_create() to .bdrv_co_create_opts()Stefan Hajnoczi1-2/+3
BlockDriver->bdrv_create() has been called from coroutine context since commit 5b7e1542cfa41a281af9629d31cef03704d976e6 ("block: make bdrv_create adopt coroutine"). Make this explicit by renaming to .bdrv_co_create_opts() and add the coroutine_fn annotation. This makes it obvious to block driver authors that they may yield, use CoMutex, or other coroutine_fn APIs. bdrv_co_create is reserved for the QAPI-based version that Kevin is working on. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170705102231.20711-2-stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-02qcow2: Switch to .bdrv_co_block_status()Eric Blake1-11/+13
We are gradually moving away from sector-based interfaces, towards byte-based. Update the qcow2 driver accordingly. For now, we are ignoring the 'want_zero' hint. However, it should be relatively straightforward to honor the hint as a way to return larger *pnum values when we have consecutive clusters with the same data/zero status but which differ only in having non-consecutive mappings. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-02-13qcow2: Allow configuring the L2 slice sizeAlberto Garcia1-7/+27
Now that the code is ready to handle L2 slices we can finally add an option to allow configuring their size. An L2 slice is the portion of an L2 table that is read by the qcow2 cache. Until now the cache was always reading full L2 tables, and since the L2 table size is equal to the cluster size this was not very efficient with large clusters. Here's a more detailed explanation of why it makes sense to have smaller cache entries in order to load L2 data: https://lists.gnu.org/archive/html/qemu-block/2017-09/msg00635.html This patch introduces a new command-line option to the qcow2 driver named l2-cache-entry-size (cf. l2-cache-size). The cache entry size has the same restrictions as the cluster size: it must be a power of two and it has the same range of allowed values, with the additional requirement that it must not be larger than the cluster size. The L2 cache entry size (L2 slice size) remains equal to the cluster size for now by default, so this feature must be explicitly enabled. Although my tests show that 4KB slices consistently improve performance and give the best results, let's wait and make more tests with different cluster sizes before deciding on an optimal default. Now that the cache entry size is not necessarily equal to the cluster size we need to reflect that in the MIN_L2_CACHE_SIZE documentation. That minimum value is a requirement of the COW algorithm: we need to read two L2 slices (and not two L2 tables) in order to do COW, see l2_allocate() for the actual code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: c73e5611ff4a9ec5d20de68a6c289553a13d2354.1517840877.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-02-13qcow2: Update qcow2_truncate() to support L2 slicesAlberto Garcia1-3/+3
The qcow2_truncate() code is mostly independent from whether we're using L2 slices or full L2 tables, but in full and falloc preallocation modes new L2 tables are allocated using qcow2_alloc_cluster_link_l2(). Therefore the code needs to be modified to ensure that all nb_clusters that are processed in each call can be allocated with just one L2 slice. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 1fd7d272b5e7b66254a090b74cf2bed1cc334c0e.1517840877.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-02-13qcow2: Add l2_slice_size field to BDRVQcow2StateAlberto Garcia1-0/+3
The BDRVQcow2State structure contains an l2_size field, which stores the number of 64-bit entries in an L2 table. For efficiency reasons we want to be able to load slices instead of full L2 tables, so we need to know how many entries an L2 slice can hold. An L2 slice is the portion of an L2 table that is loaded by the qcow2 cache. At the moment that cache can only load complete tables, therefore an L2 slice has the same size as an L2 table (one cluster) and l2_size == l2_slice_size. Later we'll allow smaller slices, but until then we have to use this new l2_slice_size field to make the rest of the code ready for that. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: adb048595f9fb5dfb110c802a8b3c3be3b937f37.1517840877.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-02-13qcow2: Remove BDS parameter from qcow2_cache_clean_unused()Alberto Garcia1-2/+2
This function was only using the BlockDriverState parameter to pass it to qcow2_cache_table_release(). This is no longer necessary so this parameter can be removed. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: b74f17591af52f201de0ea3a3b2dd0a81932334d.1517840876.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-02-13qcow2: Remove BDS parameter from qcow2_cache_destroy()Alberto Garcia1-8/+8
This function was never using the BlockDriverState parameter so it can be safely removed. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 49c74fe8b3aead9056e61a85b145ce787d06262b.1517840876.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-02-13block: maintain persistent disabled bitmapsVladimir Sementsov-Ogievskiy1-1/+1
To maintain load/store disabled bitmap there is new approach: - deprecate @autoload flag of block-dirty-bitmap-add, make it ignored - store enabled bitmaps as "auto" to qcow2 - store disabled bitmaps without "auto" flag to qcow2 - on qcow2 open load "auto" bitmaps as enabled and others as disabled (except in_use bitmaps) Also, adjust iotests 165 and 176 appropriately. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20180202160752.143796-1-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-02-09block: Simplify bdrv_can_write_zeroes_with_unmap()Eric Blake1-2/+1
We don't need the can_write_zeroes_with_unmap field in BlockDriverInfo, because it is redundant information with supported_zero_flags & BDRV_REQ_MAY_UNMAP. Note that BlockDriverInfo and supported_zero_flags are both per-device settings, rather than global state about the driver as a whole, which means one or both of these bits of information can already be conditional. Let's audit how they were set: crypto: always setting can_write_ to false is pointless (the struct starts life zero-initialized), no use of supported_ nbd: just recently fixed to set can_write_ if supported_ includes MAY_UNMAP (thus this commit effectively reverts bca80059e and solves the problem mentioned there in a more global way) file-posix, iscsi, qcow2: can_write_ is conditional, while supported_ was unconditional; but passing MAY_UNMAP would fail with ENOTSUP if the condition wasn't met qed: can_write_ is unconditional, but pwrite_zeroes lacks support for MAY_UNMAP and supported_ is not set. Perhaps support can be added later (since it would be similar to qcow2), but for now claiming false is no real loss all other drivers: can_write_ is not set, and supported_ is either unset or a passthrough Simplify the code by moving the conditional into supported_zero_flags for all drivers, then dropping the now-unused BDI field. For callers that relied on bdrv_can_write_zeroes_with_unmap(), we return the same per-device settings for drivers that had conditions (no observable change in behavior there); and can now return true (instead of false) for drivers that support passthrough (for example, the commit driver) which gives those drivers the same fix as nbd just got in bca80059e. For callers that relied on supported_zero_flags, we now have a few more places that can avoid a wasted call to pwrite_zeroes() that will just fail with ENOTSUP. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20180126193439.20219-1-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-02-09Include qapi/qmp/qbool.h exactly where neededMarkus Armbruster1-1/+0
Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-15-armbru@redhat.com>
2018-02-09Eliminate qapi/qmp/types.hMarkus Armbruster1-1/+2
qapi/qmp/types.h is a convenience header to include a number of qapi/qmp/ headers. Since we rarely need all of the headers qapi/qmp/types.h includes, we bypass it most of the time. Most of the places that use it don't need all the headers, either. Include the necessary headers directly, and drop qapi/qmp/types.h. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-9-armbru@redhat.com>
2018-02-09Include qapi/error.h exactly where neededMarkus Armbruster1-0/+2
This cleanup makes the number of objects depending on qapi/error.h drop from 1910 (out of 4743) to 1612 in my "build everything" tree. While there, separate #include from file comment with a blank line, and drop a useless comment on why qemu/osdep.h is included first. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-5-armbru@redhat.com> [Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
2018-01-23qcow2: No persistent dirty bitmaps for compat=0.10Max Reitz1-3/+11
Persistent dirty bitmaps require a properly functioning autoclear_features field, or we cannot track when an unsupporting program might overwrite them. Therefore, we cannot support them for compat=0.10 images. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171123020832.8165-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-12-22qcow2: get rid of qcow2_backing_read1 routineEdgar Kaziakhmedov1-43/+8
Since bdrv_co_preadv does all neccessary checks including reading after the end of the backing file, avoid duplication of verification before bdrv_co_preadv call. Signed-off-by: Edgar Kaziakhmedov <edgar.kaziakhmedov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-11-17qcow2: check_errors are fatalMax Reitz1-1/+4
When trying to repair a dirty image, qcow2_check() may apparently succeed (no really fatal error occurred that would prevent the check from continuing), but if check_errors in the result object is non-zero, we cannot trust the image to be usable. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728639 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-2-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17qcow2: reject unaligned offsets in write compressedAnton Nefedov1-0/+4
Misaligned compressed write is not supported. Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Message-id: 1510654613-47868-2-git-send-email-anton.nefedov@virtuozzo.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>