Age | Commit message (Collapse) | Author | Files | Lines |
|
Enhance the test to demonstrate existing less-than-stellar behavior of
qemu-img with a qcow2 image containing an inconsistent bitmap: we
don't diagnose the problem until after copying the entire image (a
potentially long time), and when we do diagnose the failure, we still
end up leaving an empty bitmap in the destination. This mess will be
cleaned up in the next patch.
While at it, rename the test now that we support useful iotest names,
and fix a missing newline in the error message thus exposed.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210709153951.2801666-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Nir Soffer <nsoffer@redhat.com>
|
|
When there are multiple queues attached to the same AIO context,
some requests may experience high latency, since in the worst case
the AIO engine queue is only flushed when it is full (MAX_EVENTS) or
there are no more queues plugged.
Commit 2558cb8dd4 ("linux-aio: increasing MAX_EVENTS to a larger
hardcoded value") changed MAX_EVENTS from 128 to 1024, to increase
the number of in-flight requests. But this change also increased
the potential maximum batch to 1024 elements.
When there is a single queue attached to the AIO context, the issue
is mitigated from laio_io_unplug() that will flush the queue every
time is invoked since there can't be others queue plugged.
Let's use the new `aio-max-batch` IOThread parameter to mitigate
this issue, limiting the number of requests in a batch.
We also define a default value (32): this value is obtained running
some benchmarks and it represents a good tradeoff between the latency
increase while a request is queued and the cost of the io_submit(2)
system call.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20210721094211.69853-4-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
When invoking block-export-add with some iothread and
fixed-iothread=false, and changing the node's iothread fails, the error
is supposed to be ignored.
However, it is still stored in *errp, which is wrong. If a second error
occurs, the "*errp must be NULL" assertion in error_setv() fails:
qemu-system-x86_64: ../util/error.c:59: error_setv: Assertion
`*errp == NULL' failed.
So if fixed-iothread=false, we should ignore the error by passing NULL
to bdrv_try_set_aio_context().
Fixes: f51d23c80af73c95e0ce703ad06a300f1b3d63ef
("block/export: add iothread and fixed-iothread options")
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210624083825.29224-2-mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Most probably this fake backing child doesn't work anyway (see notes
about it in a8a4d15c1c34d).
Still, since 25f78d9e2de528473d52 drivers are required to set
.supports_backing if they want to call bdrv_set_backing_hd, so now
vvfat just doesn't work because of this check.
Let's finally drop this fake backing file.
Fixes: 25f78d9e2de528473d52acfcf7acdfb64e3453d4
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210715124853.13335-1-vsementsov@virtuozzo.com>
Tested-by: John Arbuckle <programmingkidx@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Remove the workaround introduced in commit
6ecbc6c52672db5c13805735ca02784879ce8285
"replication: Avoid blk_make_empty() on read-only child".
It is not needed anymore since s->hidden_disk is guaranteed to be
writable when secondary_do_checkpoint() runs. Because replication_start(),
_do_checkpoint() and _stop() are only called by COLO migration code
and COLO-migration activates all disks via bdrv_invalidate_cache_all()
before it calls these functions.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <d3acfad43879e9f376bffa7dd797ae74d0a7c81a.1626619393.git.lukasstraub2@web.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
The replication driver needs access to the children block-nodes of
it's child so it can issue bdrv_make_empty() and bdrv_co_pwritev()
to manage the replication. However, it does this by directly copying
the BdrvChilds, which is wrong.
Fix this by properly attaching the block-nodes with
bdrv_attach_child() and requesting the required permissions.
This ultimatively fixes a potential crash in replication_co_writev(),
because it may write to s->secondary_disk if it is in state
BLOCK_REPLICATION_FAILOVER_FAILED, without requesting write
permissions first. And now the workaround in
secondary_do_checkpoint() can be removed.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <5d0539d729afb8072d0d7cde977c5066285591b4.1626619393.git.lukasstraub2@web.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
In preparation for the next patch, initialize s->hidden_disk and
s->secondary_disk later and replace access to them with local variables
in the places where they aren't initialized yet.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <1eb9dc179267207d9c7eccaeb30761758e32e9ab.1626619393.git.lukasstraub2@web.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
s->active_disk is bs->file. Remove it and use local variables instead.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <2534f867ea9be5b666dfce19744b7d4e2b96c976.1626619393.git.lukasstraub2@web.de>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
It's possible that requests start to wait each other in
mirror_wait_on_conflicts(). To avoid it let's use same technique as in
block/io.c in bdrv_wait_serialising_requests_locked() /
bdrv_find_conflicting_request(): don't wait on intersecting request if
it is already waiting for some other request.
For details of the dead-lock look at testIntersectingActiveIO()
test-case which we actually fixing now.
Fixes: d06107ade0ce74dc39739bac80de84b51ec18546
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210702211636.228981-4-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
This field is unused, but it very helpful for debugging.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210702211636.228981-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
First, categorize the structure fields to identify what needs
to be protected and what doesn't.
We essentially need to protect only .state, and the 3 lists in
BDRVBlkdebugState.
Then, add the lock and mark the functions accordingly.
Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210614082931.24925-7-eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
There seems to be no benefit in using a field. Replace it with a local
variable, and move the state update before the yields.
The state update has do be done before the yields because now using
a local variable does not allow the new updated state to be visible
by the other yields.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210614082931.24925-6-eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
That would be unsafe in case a rule other than the current one
is removed while the coroutine has yielded.
Keep FOREACH_SAFE because suspend_request deletes the current rule.
After this patch, *all* matching rules are deleted before suspending
the coroutine, rather than just one.
This doesn't affect the existing testcases.
Use actions_count to see how many yield to issue.
Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210614082931.24925-5-eesposit@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
Add a counter for each action that a rule can trigger.
This is mainly used to keep track of how many coroutine_yield()
we need to perform after processing all rules in the list.
Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210614082931.24925-4-eesposit@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
We want to move qemu_coroutine_yield() after the loop on rules,
because QLIST_FOREACH_SAFE is wrong if the rule list is modified
while the coroutine has yielded. Therefore move the suspended
request to the heap and clean it up from the remove side.
All that is left is for blkdebug_debug_event to handle the
yielding.
Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210614082931.24925-3-eesposit@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
Extract to a separate function. Do not rely on FOREACH_SAFE, which is
only "safe" if the *current* node is removed---not if another node is
removed. Instead, just walk the entire list from the beginning when
asked to resume all suspended requests with a given tag.
Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210614082931.24925-2-eesposit@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
Although unlikely, qemu might hang in nbd_send_request().
Allow recovery in this case by registering the yank function before
calling it.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Message-Id: <20210704000730.1befb596@gecko.fritz.box>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
staging
* More SVM fixes (Lara)
* Module annotation database (Gerd)
* Memory leak fixes (myself)
* Build fixes (myself)
* --with-devices-* support (Alex)
# gpg: Signature made Fri 09 Jul 2021 17:23:52 BST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini-gitlab/tags/for-upstream: (48 commits)
meson: Use input/output for entitlements target
configure: allow the selection of alternate config in the build
configs: rename default-configs to configs and reorganise
hw/arm: move CONFIG_V7M out of default-devices
hw/arm: add dependency on OR_IRQ for XLNX_VERSAL
meson: Introduce target-specific Kconfig
meson: switch function tests from compilation to linking
vl: fix leak of qdict_crumple return value
target/i386: fix exceptions for MOV to DR
target/i386: Added DR6 and DR7 consistency checks
target/i386: Added MSRPM and IOPM size check
monitor/tcg: move tcg hmp commands to accel/tcg, register them dynamically
usb: build usb-host as module
monitor/usb: register 'info usbhost' dynamically
usb: drop usb_host_dev_is_scsi_storage hook
monitor: allow register hmp commands
accel: build tcg modular
accel: add tcg module annotations
accel: build qtest modular
accel: add qtest module annotations
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Block layer patches
- Make blockdev-reopen stable
- Remove deprecated qemu-img backing file without format
- rbd: Convert to coroutines and add write zeroes support
- rbd: Updated MAINTAINERS
- export/fuse: Allow other users access to the export
- vhost-user: Fix backends without multiqueue support
- Fix drive-backup transaction endless drained section
# gpg: Signature made Fri 09 Jul 2021 13:49:22 BST
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (28 commits)
block: Make blockdev-reopen stable API
iotests: Test reopening multiple devices at the same time
block: Support multiple reopening with x-blockdev-reopen
block: Acquire AioContexts during bdrv_reopen_multiple()
block: Add bdrv_reopen_queue_free()
qcow2: Fix dangling pointer after reopen for 'file'
qemu-img: Improve error for rebase without backing format
qemu-img: Require -F with -b backing image
qcow2: Prohibit backing file changes in 'qemu-img amend'
blockdev: fix drive-backup transaction endless drained section
vhost-user: Fix backends without multiqueue support
MAINTAINERS: add block/rbd.c reviewer
block/rbd: fix type of task->complete
iotests/fuse-allow-other: Test allow-other
iotests/308: Test +w on read-only FUSE exports
export/fuse: Let permissions be adjustable
export/fuse: Give SET_ATTR_SIZE its own branch
export/fuse: Add allow-other option
export/fuse: Pass default_permissions for mount
util/uri: do not check argument of uri_free()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-14-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
While most libraries do not need a CONFIG_* symbol because the
"when:" clauses are enough, some do. Add them back or stop
using them if possible.
In the case of libpmem, the statement to add the CONFIG_* symbol
was still in configure, but could not be triggered because it
checked for "no" instead of "disabled" (and it would be wrong anyway
since the test for the library has not been done yet).
Reported-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Fixes: 587d59d6cc ("configure, meson: convert virgl detection to meson", 2021-07-06)
Fixes: 83ef16821a ("configure, meson: convert libdaxctl detection to meson", 2021-07-06)
Fixes: e36e8c70f6 ("configure, meson: convert libpmem detection to meson", 2021-07-06)
Fixes: 53c22b68e3 ("configure, meson: convert liburing detection to meson", 2021-07-06)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
As the BlockReopenQueue can contain nodes in multiple AioContexts, only
one of which may be locked when AIO_WAIT_WHILE() can be called, we can't
let the caller lock the right contexts. Instead, individually lock the
AioContext of a single node when iterating the queue.
Reintroduce bdrv_reopen() as a wrapper for reopening a single node that
drains the node and temporarily drops the AioContext lock for
bdrv_reopen_multiple().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210708114709.206487-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Without an external data file, s->data_file is a second pointer with the
same value as bs->file. When changing bs->file to a different BdrvChild
and freeing the old BdrvChild, s->data_file must also be updated,
otherwise it points to freed memory and causes crashes.
This problem was caught by iotests case 245.
Fixes: df2b7086f169239ebad5d150efa29c9bb6d4f820
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210708114709.206487-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
This was deprecated back in bc5ee6da7 (qcow2: Deprecate use of
qemu-img amend to change backing file), and no one in the meantime has
given any reasons why it should be supported. Time to make change
attempts a hard error (but for convenience, specifying the _same_
backing chain is not forbidden). Update a couple of iotests to match.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210503213600.569128-2-eblake@redhat.com>
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
task->complete is a bool not an integer.
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <20210707180449.32665-1-pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Allow changing the file mode, UID, and GID through SETATTR.
Without allow_other, UID and GID are not allowed to be changed, because
it would not make sense. Also, changing group or others' permissions
is not allowed either.
For read-only exports, +w cannot be set.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210625142317.271673-5-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
In order to support changing other attributes than the file size in
fuse_setattr(), we have to give each its own independent branch. This
also applies to the only attribute we do support right now.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210625142317.271673-4-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Without the allow_other mount option, no user (not even root) but the
one who started qemu/the storage daemon can access the export. Allow
users to configure the export such that such accesses are possible.
While allow_other is probably what users want, we cannot make it an
unconditional default, because passing it is only possible (for non-root
users) if the global fuse.conf configuration file allows it. Thus, the
default is an 'auto' mode, in which we first try with allow_other, and
then fall back to without.
FuseExport.allow_other reports whether allow_other was actually used as
a mount option or not. Currently, this information is not used, but a
future patch will let this field decide whether e.g. an export's UID and
GID can be changed through chmod.
One notable thing about 'auto' mode is that libfuse may print error
messages directly to stderr, and so may fusermount (which it executes).
Our export code cannot really filter or hide them. Therefore, if 'auto'
fails its first attempt and has to fall back, fusermount will print an
error message that mounting with allow_other failed.
This behavior necessitates a change to iotest 308, namely we need to
filter out this error message (because if the first attempt at mounting
with allow_other succeeds, there will be no such message).
Furthermore, common.rc's _make_test_img should use allow-other=off for
FUSE exports, because iotests generally do not need to access images
from other users, so allow-other=on or allow-other=auto have no
advantage. OTOH, allow-other=on will not work on systems where
user_allow_other is disabled, and with allow-other=auto, we get said
error message that we would need to filter out again. Just disabling
allow-other is simplest.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210625142317.271673-3-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
We do not do any permission checks in fuse_open(), so let the kernel do
them. We already let fuse_getattr() report the proper UNIX permissions,
so this should work the way we want.
This causes a change in 308's reference output, because now opening a
non-writable export with O_RDWR fails already, instead of only actually
attempting to write to it. (That is an improvement.)
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210625142317.271673-2-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
uri_free() checks if its argument is NULL in uri_clean() and g_free().
There is no need to check the argument before the call.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Message-Id: <20210629063602.4239-1-xypron.glpk@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
librbd supports 1 byte alignment for all aio operations.
Currently, there is no API call to query limits from the Ceph
ObjectStore backend. So drop the bdrv_refresh_limits completely
until there is such an API call.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-7-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
This patch wittingly sets BDRV_REQ_NO_FALLBACK and silently ignores
BDRV_REQ_MAY_UNMAP for older librbd versions.
The rationale for this is as follows (citing Ilya Dryomov current RBD
maintainer):
---8<---
a) remove the BDRV_REQ_MAY_UNMAP check in qemu_rbd_co_pwrite_zeroes()
and as a consequence always unmap if librbd is too old
It's not clear what qemu's expectation is but in general Write
Zeroes is allowed to unmap. The only guarantee is that subsequent
reads return zeroes, everything else is a hint. This is how it is
specified in the kernel and in the NVMe spec.
In particular, block/nvme.c implements it as follows:
if (flags & BDRV_REQ_MAY_UNMAP) {
cdw12 |= (1 << 25);
}
This sets the Deallocate bit. But if it's not set, the device may
still deallocate:
"""
If the Deallocate bit (CDW12.DEAC) is set to '1' in a Write Zeroes
command, and the namespace supports clearing all bytes to 0h in the
values read (e.g., bits 2:0 in the DLFEAT field are set to 001b)
from a deallocated logical block and its metadata (excluding
protection information), then for each specified logical block, the
controller:
- should deallocate that logical block;
...
If the Deallocate bit is cleared to '0' in a Write Zeroes command,
and the namespace supports clearing all bytes to 0h in the values
read (e.g., bits 2:0 in the DLFEAT field are set to 001b) from
a deallocated logical block and its metadata (excluding protection
information), then, for each specified logical block, the
controller:
- may deallocate that logical block;
"""
https://nvmexpress.org/wp-content/uploads/NVM-Express-NVM-Command-Set-Specification-2021.06.02-Ratified-1.pdf
b) set BDRV_REQ_NO_FALLBACK in supported_zero_flags
Again, it's not clear what qemu expects here, but without it we end
up in a ridiculous situation where specifying the "don't allow slow
fallback" switch immediately fails all efficient zeroing requests on
a device where Write Zeroes is always efficient:
$ qemu-io -c 'help write' | grep -- '-[zun]'
-n, -- with -z, don't allow slow fallback
-u, -- with -z, allow unmapping
-z, -- write zeroes using blk_co_pwrite_zeroes
$ qemu-io -f rbd -c 'write -z -u -n 0 1M' rbd:foo/bar
write failed: Operation not supported
--->8---
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-6-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-5-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
While at it just call rbd_get_size and avoid rbd_image_info_t.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-4-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-3-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Ceph Luminous (version 12.2.z) is almost 4 years old at this point.
Bump the requirement to get rid of the ifdef'ry in the code.
Qemu 6.1 dropped the support for RHEL-7 which was the last supported
OS that required an older librbd.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-2-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Starting from ceph Pacific, RBD has built-in support for image-level encryption.
Currently supported formats are LUKS version 1 and 2.
There are 2 new relevant librbd APIs for controlling encryption, both expect an
open image context:
rbd_encryption_format: formats an image (i.e. writes the LUKS header)
rbd_encryption_load: loads encryptor/decryptor to the image IO stack
This commit extends the qemu rbd driver API to support the above.
Signed-off-by: Or Ozeri <oro@il.ibm.com>
Message-Id: <20210627114635.39326-1-oro@il.ibm.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-id: 20210705130458.97642-3-akihiko.odaki@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
This commit introduces "punch hole" operation and optimizes transfer
block size for macOS.
Thanks to Konstantin Nazarov for detailed analysis of a flaw in an
old version of this change:
https://gist.github.com/akihikodaki/87df4149e7ca87f18dc56807ec5a1bc5#gistcomment-3654667
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-id: 20210705130458.97642-1-akihiko.odaki@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Block layer patches
- Supporting changing 'file' in x-blockdev-reopen
- ssh: add support for sha256 host key fingerprints
- vhost-user-blk: Implement reconnection during realize
- introduce QEMU_AUTO_VFREE
- Don't require password of encrypted backing file for image creation
- Code cleanups
# gpg: Signature made Wed 30 Jun 2021 17:00:55 BST
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (24 commits)
vhost-user-blk: Implement reconnection during realize
vhost-user-blk: Factor out vhost_user_blk_realize_connect()
vhost: Distinguish errors in vhost_dev_get_config()
vhost-user-blk: Add Error parameter to vhost_user_blk_start()
vhost: Return 0/-errno in vhost_dev_init()
vhost: Distinguish errors in vhost_backend_init()
vhost: Add Error parameter to vhost_dev_init()
block/ssh: add support for sha256 host key fingerprints
block/commit: use QEMU_AUTO_VFREE
introduce QEMU_AUTO_VFREE
iotests: Test replacing files with x-blockdev-reopen
block: Allow changing bs->file on reopen
block: BDRVReopenState: drop replace_backing_bs field
block: move supports_backing check to bdrv_set_file_or_backing_noperm()
block: bdrv_reopen_parse_backing(): simplify handling implicit filters
block: bdrv_reopen_parse_backing(): don't check frozen child
block: bdrv_reopen_parse_backing(): don't check aio context
block: introduce bdrv_set_file_or_backing_noperm()
block: introduce bdrv_remove_file_or_backing_child()
block: comment graph-modifying function not updating permissions
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Currently the SSH block driver supports MD5 and SHA1 for host key
fingerprints. This is a cryptographically sensitive operation and
so these hash algorithms are inadequate by modern standards. This
adds support for SHA256 which has been supported in libssh since
the 0.8.1 release.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210622115156.138458-1-berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
|
|
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210628121133.193984-3-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
No need to start a tracked request that will always fail. The choice
to check read-only after bdrv_inc_in_flight() predates 1bc5f09f2e
(block: Use tracked request for truncate), but waiting for serializing
requests can make the effect more noticeable.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210609163034.997943-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
staging
* Some Meson test conversions
* KVM dirty page ring buffer fix
* KVM TSC scaling support
* Fixes for SG_IO with /dev/sdX devices
* (Non)support for host devices on iOS
* -smp cleanups
# gpg: Signature made Fri 25 Jun 2021 15:16:18 BST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini-gitlab/tags/for-upstream: (28 commits)
machine: reject -smp dies!=1 for non-PC machines
machine: pass QAPI struct to mc->smp_parse
machine: add error propagation to mc->smp_parse
machine: move common smp_parse code to caller
machine: move dies from X86MachineState to CpuTopology
file-posix: handle EINTR during ioctl
block: detect DKIOCGETBLOCKCOUNT/SIZE before use
block: try BSD disk size ioctls one after another
block: check for sys/disk.h
block: feature detection for host block support
file-posix: try BLKSECTGET on block devices too, do not round to power of 2
block: add max_hw_transfer to BlockLimits
block-backend: align max_transfer to request alignment
osdep: provide ROUND_DOWN macro
scsi-generic: pass max_segments via max_iov field in BlockLimits
file-posix: fix max_iov for /dev/sg devices
KVM: Fix dirty ring mmap incorrect size due to renaming accident
configure, meson: convert libusbredir detection to meson
configure, meson: convert libcacard detection to meson
configure, meson: convert libusb detection to meson
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
into staging
block: Make block-copy API thread-safe
# gpg: Signature made Fri 25 Jun 2021 13:40:24 BST
# gpg: using RSA key 8B9C26CDB2FD147C880E86A1561F24C1F19F79FB
# gpg: Good signature from "Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8B9C 26CD B2FD 147C 880E 86A1 561F 24C1 F19F 79FB
* remotes/vsementsov/tags/pull-jobs-2021-06-25:
block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
block-copy: add CoMutex lock
block-copy: move progress_set_remaining in block_copy_task_end
block-copy: streamline choice of copy_range vs. read/write
block-copy: small refactor in block_copy_task_entry and block_copy_common
co-shared-resource: protect with a mutex
progressmeter: protect with a mutex
blockjob: let ratelimit handle a speed of 0
block-copy: let ratelimit handle a speed of 0
ratelimit: treat zero speed as unlimited
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
By adding acquire/release pairs, we ensure that .ret and .error_is_read
fields are written by block_copy_dirty_clusters before .finished is true,
and that they are read by API user after .finished is true.
The atomic here are necessary because the fields are concurrently modified
in coroutines, and read outside.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-6-eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
|
|
Group various structures fields, to better understand what we need to
protect with a lock and what doesn't need it.
Then, add a CoMutex to protect concurrent access of block-copy
data structures. This mutex also protects .copy_bitmap, because its thread-safe
API does not prevent it from assigning two tasks to the same
bitmap region.
Exceptions to the lock:
- .sleep_state is handled in the series "coroutine: new sleep/wake API"
and thus here left as TODO.
- .finished, .cancelled and reads to .ret and .error_is_read will be
protected in the following patch, because are used also outside
coroutines.
- .skip_unallocated is atomic. Including it under the mutex would
increase the critical sections and make them also much more complex.
We can have it as atomic since it is only written from outside and
read by block-copy coroutines.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-5-eesposit@redhat.com>
[vsementsov: fix typo in comment]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
|
|
Moving this function in task_end ensures to update the progress
anyways, even if there is an error.
It also helps in next patch, allowing task_end to have only
one critical section.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210624072043.180494-4-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
|
|
Put the logic to determine the copy size in a separate function, so
that there is a simple state machine for the possible methods of
copying data from one BlockDriverState to the other.
Use .method instead of .copy_range as in-out argument, and
include also .zeroes as an additional copy method.
While at it, store the common computation of block_copy_max_transfer
into a new field of BlockCopyState, and make sure that we always
obey max_transfer; that's more efficient even for the
COPY_RANGE_READ_WRITE case.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210624072043.180494-3-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
|