aboutsummaryrefslogtreecommitdiff
path: root/block/raw-posix.c
AgeCommit message (Collapse)AuthorFilesLines
2015-02-06block/raw-posix: refactor handle_aiocb_write_zeroes a bitDenis V. Lunev1-19/+27
move code dealing with a block device to a separate function. This will allow to implement additional processing for ordinary files. Please note, that xfs_code has been moved before checking for s->has_write_zeroes as xfs_write_zeroes does not touch this flag inside. This makes code a bit more consistent. CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Peter Lieven <pl@kamp.de> CC: Fam Zheng <famz@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06block/raw-posix: create do_fallocate helperDenis V. Lunev1-8/+14
The pattern do { if (fallocate(s->fd, mode, offset, len) == 0) { return 0; } } while (errno == EINTR); ret = translate_err(-errno); will be commonly useful in next patches. Create helper for it. CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Peter Lieven <pl@kamp.de> CC: Fam Zheng <famz@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06block/raw-posix: create translate_err helper to merge errno valuesDenis V. Lunev1-6/+13
actually the code if (ret == -ENODEV || ret == -ENOSYS || ret == -EOPNOTSUPP || ret == -ENOTTY) { ret = -ENOTSUP; } is present twice and will be added a couple more times. Create helper for this. CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Peter Lieven <pl@kamp.de> CC: Fam Zheng <famz@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-12-10block/raw-posix: Fix ret in raw_open_common()Max Reitz1-0/+1
The return value must be negative on error; there is one place in raw_open_common() where errp is set, but ret remains 0. Fix it. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-12-10block: Make essential BlockDriver objects publicMax Reitz1-1/+1
There are some block drivers which are essential to QEMU and may not be removed: These are raw, file and qcow2 (as the default non-raw format). Make their BlockDriver objects public so they can be directly referenced throughout the block layer without needing to call bdrv_find_format() and having to deal with an error at runtime, while the real problem occurred during linking (where raw, file or qcow2 were not linked into qemu). Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-12-10block: do not use get_clock()Paolo Bonzini1-4/+4
Use the external qemu-timer API instead. No one else should be calling cpu_get_clock(), get_clock() and get_clock_realtime() directly; they are internal functions and they should be confined to qemu-timer.c and cpus.c (where the icount implementation resides). All accesses should go through qemu_clock_get_ns. Cc: kwolf@redhat.com Cc: stefanha@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1417010463-3527-2-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-18block/raw-posix: Catch fsync() errorsMax Reitz1-1/+6
fsync() may fail, and that case should be handled. Reported-by: László Érsek <lersek@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-18block/raw-posix: Only sync after successful preallocationMax Reitz1-1/+3
The loop which filled the file with zeroes may have been left early due to an error. In that case, the fsync() should be skipped. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-18block/raw-posix: Fix preallocating write() loopMax Reitz1-1/+1
write() may write less bytes than requested; in this case, the number of bytes written is returned. This is the byte count we should be subtracting from the number of bytes still to be written, and not the byte count we requested to write. Reported-by: László Érsek <lersek@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-11-18raw-posix: The SEEK_HOLE code is flawed, rewrite itMarkus Armbruster1-26/+85
On systems where SEEK_HOLE in a trailing hole seeks to EOF (Solaris, but not Linux), try_seek_hole() reports trailing data instead. Additionally, unlikely lseek() failures are treated badly: * When SEEK_HOLE fails, try_seek_hole() reports trailing data. For -ENXIO, there's in fact a trailing hole. Can happen only when something truncated the file since we opened it. * When SEEK_HOLE succeeds, SEEK_DATA fails, and SEEK_END succeeds, then try_seek_hole() reports a trailing hole. This is okay only when SEEK_DATA failed with -ENXIO (which means the non-trailing hole found by SEEK_HOLE has since become trailing somehow). For other failures (unlikely), it's wrong. * When SEEK_HOLE succeeds, SEEK_DATA fails, SEEK_END fails (unlikely), then try_seek_hole() reports bogus data [-1,start), which its caller raw_co_get_block_status() turns into zero sectors of data. Could theoretically lead to infinite loops in code that attempts to scan data vs. hole forward. Rewrite from scratch, with very careful comments. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-11-18raw-posix: SEEK_HOLE suffices, get rid of FIEMAPMarkus Armbruster1-59/+4
Commit 5500316 (May 2012) implemented raw_co_is_allocated() as follows: 1. If defined(CONFIG_FIEMAP), use the FS_IOC_FIEMAP ioctl 2. Else if defined(SEEK_HOLE) && defined(SEEK_DATA), use lseek() 3. Else pretend there are no holes Later on, raw_co_is_allocated() was generalized to raw_co_get_block_status(). Commit 4f11aa8 (May 2014) changed it to try the three methods in order until success, because "there may be implementations which support [SEEK_HOLE/SEEK_DATA] but not [FIEMAP] (e.g., NFSv4.2) as well as vice versa." Unfortunately, we used FIEMAP incorrectly: we lacked FIEMAP_FLAG_SYNC. Commit 38c4d0a (Sep 2014) added it. Because that's a significant speed hit, the next commit 7c159037 put SEEK_HOLE/SEEK_DATA first. As you see, the obvious use of FIEMAP is wrong, and the correct use is slow. I guess this puts it somewhere between -7 "The obvious use is wrong" and -10 "It's impossible to get right" on Rusty Russel's Hard to Misuse scale[*]. "Fortunately", the FIEMAP code is used only when * SEEK_HOLE/SEEK_DATA aren't defined, but CONFIG_FIEMAP is Uncommon. SEEK_HOLE had no XFS implementation between 2011 (when it was introduced for ext4 and btrfs) and 2012. * SEEK_HOLE/SEEK_DATA and CONFIG_FIEMAP are defined, but lseek() fails Unlikely. Thus, the FIEMAP code executes rarely. Makes it a nice hidey-hole for bugs. Worse, bugs hiding there can theoretically bite even on a host that has SEEK_HOLE/SEEK_DATA. I don't want to worry about this crap, not even theoretically. Get rid of it. [*] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-11-18raw-posix: Fix comment for raw_co_get_block_status()Markus Armbruster1-3/+1
Missed in commit 705be72. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-11-03raw-posix: raw_co_get_block_status() return valueMax Reitz1-14/+14
Instead of generating the full return value thrice in try_fiemap(), try_seek_hole() and as a fall-back in raw_co_get_block_status() itself, generate the value only in raw_co_get_block_status(). While at it, also remove the pnum parameter from try_fiemap() and try_seek_hole(). Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1414148280-17949-3-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03raw-posix: Fix raw_co_get_block_status() after EOFMax Reitz1-4/+10
As its comment states, raw_co_get_block_status() should unconditionally return 0 and set *pnum to 0 for after EOF. An assertion after lseek(..., SEEK_HOLE) tried to catch this case by asserting that errno != -ENXIO (which would indicate a position after the EOF); but it should be errno != ENXIO instead. Regardless of that, there should be no such assertion at all. If bdrv_getlength() returned an outdated value and the image has been resized outside of qemu, lseek() will return with errno == ENXIO. Just return that value as an error then. Setting *pnum to 0 and returning 0 should not be done here, as in that case we should update the device length as well. So, from qemu's perspective, the file has not been resized; it's just that there was an error querying sectors beyond a certain point (the actual file size). Additionally, nb_sectors should be clamped against the image end. This was probably not an issue if FIEMAP or SEEK_HOLE/SEEK_DATA worked, but the fallback did not take this case into account. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1414148280-17949-2-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-23block: char devices on FreeBSD are not behind a pagerRoger Pau Monne1-5/+21
Introduce a new flag to mark devices that require requests to be aligned and replace the usage of BDRV_O_NOCACHE and O_DIRECT with this flag when appropriate. If a character device is used as a backend on a FreeBSD host set this flag unconditionally. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2014-10-20block: Rename BlockDriverCompletionFunc to BlockCompletionFuncMarkus Armbruster1-8/+8
I'll use it with block backends shortly, and the name is going to fit badly there. It's a block layer thing anyway, not just a block driver thing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20block: Rename BlockDriverAIOCB* to BlockAIOCB*Markus Armbruster1-8/+8
I'll use BlockDriverAIOCB with block backends shortly, and the name is going to fit badly there. It's a block layer thing anyway, not just a block driver thing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20block/raw-posix: use seek_hole ahead of fiemapTony Breeds1-2/+2
try_fiemap() uses FIEMAP_FLAG_SYNC which has a significant performance impact. Prefer seek_hole() over fiemap() to avoid this impact where possible. seek_hole is more widely used and, arguably, has potential to be optimised in the kernel. Reported-By: Michael Steffens <michael_steffens@posteo.de> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: Pádraig Brady <pbrady@redhat.com> Cc: Eric Blake <eblake@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20block/raw-posix: Fix disk corruption in try_fiemapTony Breeds1-1/+1
Using fiemap without FIEMAP_FLAG_SYNC is a known corrupter. Add the FIEMAP_FLAG_SYNC flag to the FS_IOC_FIEMAP ioctl. This has the downside of significantly reducing performance. Reported-By: Michael Steffens <michael_steffens@posteo.de> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: Pádraig Brady <pbrady@redhat.com> Cc: Eric Blake <eblake@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-29raw-posix: Fix build without posix_fallocate()Kevin Wolf1-4/+14
Check for the presence of posix_fallocate() in configure and only compile in support for PREALLOC_MODE_FALLOC when it's there. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-12raw-posix: Add falloc and full preallocation optionHu Tao1-19/+73
This patch adds a new option preallocation for raw format, and implements falloc and full preallocation. Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12block: don't convert file size to sector sizeHu Tao1-6/+6
and avoid converting it back later. Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoît Canet <benoit.canet@nodalink.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12block: round up file size to nearest sectorHu Tao1-4/+4
Currently the file size requested by user is rounded down to nearest sector, causing the actual file size could be a bit less than the size user requested. Since some formats (like qcow2) record virtual disk size in bytes, this can make the last few bytes cannot be accessed. This patch fixes it by rounding up file size to nearest sector so that the actual file size is no less than the requested file size. Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-22raw-posix: fix O_DIRECT short readsStefan Hajnoczi1-0/+9
The following O_DIRECT read from a <512 byte file fails: $ truncate -s 320 test.img $ qemu-io -n -c 'read -P 0 0 512' test.img qemu-io: can't open device test.img: Could not read image for determining its format: Invalid argument Note that qemu-io completes successfully without the -n (O_DIRECT) option. This patch fixes qemu-iotests ./check -nocache -vmdk 059. Cc: qemu-stable@nongnu.org Suggested-by: Kevin Wolf <kwolf@redhat.com> Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20block: Use g_new() & friends where that makes obvious senseMarkus Armbruster1-1/+1
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. Patch created with Coccinelle, with two manual changes on top: * Add const to bdrv_iterate_format() to keep the types straight * Convert the allocation in bdrv_drop_intermediate(), which Coccinelle inexplicably misses Coccinelle semantic patch: @@ type T; @@ -g_malloc(sizeof(T)) +g_new(T, 1) @@ type T; @@ -g_try_malloc(sizeof(T)) +g_try_new(T, 1) @@ type T; @@ -g_malloc0(sizeof(T)) +g_new0(T, 1) @@ type T; @@ -g_try_malloc0(sizeof(T)) +g_try_new0(T, 1) @@ type T; expression n; @@ -g_malloc(sizeof(T) * (n)) +g_new(T, n) @@ type T; expression n; @@ -g_try_malloc(sizeof(T) * (n)) +g_try_new(T, n) @@ type T; expression n; @@ -g_malloc0(sizeof(T) * (n)) +g_new0(T, n) @@ type T; expression n; @@ -g_try_malloc0(sizeof(T) * (n)) +g_try_new0(T, n) @@ type T; expression p, n; @@ -g_realloc(p, sizeof(T) * (n)) +g_renew(T, p, n) @@ type T; expression p, n; @@ -g_try_realloc(p, sizeof(T) * (n)) +g_try_renew(T, p, n) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-15raw-posix: Handle failure for potentially large allocationsKevin Wolf1-1/+5
Some code in the block layer makes potentially huge allocations. Failure is not completely unexpected there, so avoid aborting qemu and handle out-of-memory situations gracefully. This patch addresses the allocations in the raw-posix block driver. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18raw-posix: Fail gracefully if no working alignment is foundKevin Wolf1-8/+27
If qemu couldn't find out what O_DIRECT alignment to use with a given file, it would run into assert(bdrv_opt_mem_align(bs) != 0); in block.c and confuse users. This adds a more descriptive error message for such cases. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18block: Add Error argument to bdrv_refresh_limits()Kevin Wolf1-3/+1
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-14block: Assert qiov length matches request lengthKevin Wolf1-4/+11
At least raw-posix relies on this because it can allocate bounce buffers based on the request length, but access it using all of the qiov entries later. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-07-07linux-aio: implement io plug, unplug and flush io queueMing Lei1-0/+45
This patch implements .bdrv_io_plug, .bdrv_io_unplug and .bdrv_flush_io_queue callbacks for linux-aio Block Drivers, so that submitting I/O as a batch can be supported on linux-aio. [Unprocessed requests are completed with -EIO instead of a bogus ret value. --Stefan] Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-07raw-posix: Fix raw_getlength() to always return -errno on errorMarkus Armbruster1-6/+22
We got a merry mix of -1 and -errno here. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-01qemu-img create: add 'nocow' optionChunyan Liu1-0/+25
Add 'nocow' option so that users could have a chance to set NOCOW flag to newly created files. It's useful on btrfs file system to enhance performance. Btrfs has low performance when hosting VM images, even more when the guest in those VM are also using btrfs as file system. One way to mitigate this bad performance is to turn off COW attributes on VM files. Generally, there are two ways to turn off NOCOW on btrfs: a) by mounting fs with nodatacow, then all newly created files will be NOCOW. b) per file. Add the NOCOW file attribute. It could only be done to empty or new files. This patch tries the second way, according to the option, it could add NOCOW per file. For most block drivers, since the create file step is in raw-posix.c, so we can do setting NOCOW flag ioctl in raw-posix.c only. But there are some exceptions, like block/vpc.c and block/vdi.c, they are creating file by calling qemu_open directly. For them, do the same setting NOCOW flag ioctl work in them separately. [Fixed up 082.out due to the new 'nocow' creation option --Stefan] Signed-off-by: Chunyan Liu <cyliu@suse.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16cleanup QEMUOptionParameterChunyan Liu1-5/+5
Now that all backend drivers are using QemuOpts, remove all QEMUOptionParameter related codes. Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Signed-off-by: Chunyan Liu <cyliu@suse.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16raw-posix.c: replace QEMUOptionParameter with QemuOptsChunyan Liu1-32/+27
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Signed-off-by: Chunyan Liu <cyliu@suse.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04raw-posix: drop raw_get_aio_fd() since it is no longer usedStefan Hajnoczi1-34/+0
virtio-blk data-plane now uses the QEMU block layer for I/O. We do not need raw_get_aio_fd() anymore. It was a layering violation anyway, so let's get rid of it. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04block/linux-aio: fix memory and fd leakStefan Hajnoczi1-0/+5
Hot unplugging -drive aio=native,file=test.img,format=raw images leaves the Linux AIO event notifier and struct qemu_laio_state allocated. Luckily nothing will use the event notifier after the BlockDriverState has been closed so the handler function is never called. It's still worth fixing this resource leak. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04block/raw-posix: implement .bdrv_detach/attach_aio_context()Stefan Hajnoczi1-0/+43
Drop the assumption that we're using the main AioContext for Linux AIO. Convert the Linux AIO event notifier to use aio_set_event_notifier(). The .bdrv_detach/attach_aio_context() interfaces also need to be implemented to move the event notifier handler from the old to the new AioContext. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-30block/raw-posix.c: Avoid nonstandard LONG_LONG_MAXPeter Maydell1-1/+1
In the MacOSX specific code in raw-posix.c we use the define LONG_LONG_MAX. This is actually a non-standard pre-C99 define; switch to using the standard LLONG_MAX instead. This apparently fixes a compilation failure with certain compiler/OS versions (though it is unclear which). Reported-by: Peter Bartoli <peter@bartoli.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-09block/raw-posix: Try both FIEMAP and SEEK_HOLEMax Reitz1-50/+77
The current version of raw-posix always uses ioctl(FS_IOC_FIEMAP) if FIEMAP is available; lseek with SEEK_HOLE/SEEK_DATA are not even compiled in in this case. However, there may be implementations which support the latter but not the former (e.g., NFSv4.2) as well as vice versa. To cover both cases, try FIEMAP first (as this will return -ENOTSUP if not supported instead of returning a failsafe value (everything allocated as a single extent)) and if that does not work, fall back to SEEK_HOLE/SEEK_DATA. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-30block: Unlink temporary files in raw-posix/win32Kevin Wolf1-1/+4
Instead of having unlink() calls in the generic block layer, where we aren't even guarateed to have a file name, move them to those block drivers that are actually used and that always have a filename. Gets us rid of some #ifdefs as well. The patch also converts bs->is_temporary to a new BDRV_O_TEMPORARY open flag so that it is inherited in the protocol layer and the raw-posix and raw-win32 drivers can unlink the file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-03-13block/raw-posix: Strip protocol prefix on creationMax Reitz1-0/+12
The hdev_create() implementation in block/raw-posix.c is used by the "host_device", "host_cdrom" and "host_floppy" protocol block drivers together. Thus, any of the associated prefixes may occur and exactly one should should be stripped, if it does (thus, "host_device:host_cdrom:/dev/cdrom" is not shortened to "/dev/cdrom"). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13block/raw-posix: bdrv_parse_filename() for cdromMax Reitz1-0/+15
The "host_cdrom" protocol drivers should strip the "host_cdrom:" prefix from filenames if present. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13block/raw-posix: bdrv_parse_filename() for floppyMax Reitz1-0/+10
The "host_floppy" protocol driver should strip the "host_floppy:" prefix from filenames if present. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13block/raw-posix: bdrv_parse_filename() for hdevMax Reitz1-0/+10
The "host_device" protocol driver should strip the "host_device:" prefix from filenames if present. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-06block/raw-posix: Strip "file:" prefix on creationMax Reitz1-0/+2
The bdrv_create() implementation of the block/raw-posix "file" protocol driver should strip the "file:" prefix from filenames if present. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-06block/raw-posix: Implement bdrv_parse_filename()Max Reitz1-0/+12
The "file" protocol driver should strip the "file:" prefix from filenames if present. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-17Use error_is_set() only when necessaryMarkus Armbruster1-6/+6
error_is_set(&var) is the same as var != NULL, but it takes whole-program analysis to figure that out. Unnecessarily hard for optimizers, static checkers, and human readers. Dumb it down to obvious. Gets rid of several dozen Coverity false positives. Note that the obvious form is already used in many places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-01-24raw: Probe required direct I/O alignmentPaolo Bonzini1-17/+85
Add a bs->request_alignment field that contains the required offset/length alignment for I/O requests and fill it in the raw block drivers. Use ioctls if possible, else see what alignment it takes for O_DIRECT to succeed. While at it, also expose the memory alignment requirements, which may be (and in practice are) different from the disk alignment requirements. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-06qemu-option: Remove qemu_opts_create_nofailPeter Crosthwaite1-1/+1
This is a boiler-plate _nofail variant of qemu_opts_create. Remove and use error_abort in call sites. null/0 arguments needs to be added for the id and fail_if_exists fields in affected callsites due to argument inconsistency between the normal and no_fail variants. Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-12-03raw-posix: add support for write_zeroes on XFS and block devicesPaolo Bonzini1-12/+72
The code is similar to the implementation of discard and write_zeroes with UNMAP. However, failure must be propagated up to block.c. The stale page cache problem can be reproduced as follows: # modprobe scsi-debug lbpws=1 lbprz=1 # ./qemu-io /dev/sdXX qemu-io> write -P 0xcc 0 2M qemu-io> write -z 0 1M qemu-io> read -P 0x00 0 512 Pattern verification failed at offset 0, 512 bytes qemu-io> read -v 0 512 00000000: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................ ... # ./qemu-io --cache=none /dev/sdXX qemu-io> write -P 0xcc 0 2M qemu-io> write -z 0 1M qemu-io> read -P 0x00 0 512 qemu-io> read -v 0 512 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ ... And similarly with discard instead of "write -z". Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>