diff options
author | Max Reitz <mreitz@redhat.com> | 2019-11-01 16:25:10 +0100 |
---|---|---|
committer | Max Reitz <mreitz@redhat.com> | 2019-11-04 09:33:51 +0100 |
commit | 292d06b925b2787ee6f2430996b95651cae42fce (patch) | |
tree | 5551cfcad705086a27aff3ec5c123ac32934d85d /block | |
parent | c28107e9e55b11cd35cf3dc2505e3e69d10dcf13 (diff) | |
download | qemu-292d06b925b2787ee6f2430996b95651cae42fce.zip qemu-292d06b925b2787ee6f2430996b95651cae42fce.tar.gz qemu-292d06b925b2787ee6f2430996b95651cae42fce.tar.bz2 |
block/file-posix: Let post-EOF fallocate serialize
The XFS kernel driver has a bug that may cause data corruption for qcow2
images as of qemu commit c8bb23cbdbe32f. We can work around it by
treating post-EOF fallocates as serializing up until infinity (INT64_MAX
in practice).
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191101152510.11719-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Diffstat (limited to 'block')
-rw-r--r-- | block/file-posix.c | 36 |
1 files changed, 36 insertions, 0 deletions
diff --git a/block/file-posix.c b/block/file-posix.c index 0b7e904..1f0f61a 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -2721,6 +2721,42 @@ raw_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes, RawPosixAIOData acb; ThreadPoolFunc *handler; +#ifdef CONFIG_FALLOCATE + if (offset + bytes > bs->total_sectors * BDRV_SECTOR_SIZE) { + BdrvTrackedRequest *req; + uint64_t end; + + /* + * This is a workaround for a bug in the Linux XFS driver, + * where writes submitted through the AIO interface will be + * discarded if they happen beyond a concurrently running + * fallocate() that increases the file length (i.e., both the + * write and the fallocate() happen beyond the EOF). + * + * To work around it, we extend the tracked request for this + * zero write until INT64_MAX (effectively infinity), and mark + * it as serializing. + * + * We have to enable this workaround for all filesystems and + * AIO modes (not just XFS with aio=native), because for + * remote filesystems we do not know the host configuration. + */ + + req = bdrv_co_get_self_request(bs); + assert(req); + assert(req->type == BDRV_TRACKED_WRITE); + assert(req->offset <= offset); + assert(req->offset + req->bytes >= offset + bytes); + + end = INT64_MAX & -(uint64_t)bs->bl.request_alignment; + req->bytes = end - req->offset; + req->overlap_bytes = req->bytes; + + bdrv_mark_request_serialising(req, bs->bl.request_alignment); + bdrv_wait_serialising_requests(req); + } +#endif + acb = (RawPosixAIOData) { .bs = bs, .aio_fildes = s->fd, |