aboutsummaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)AuthorFilesLines
2010-11-07Fix win32 buildBlue Swirl1-1/+1
Fix a return value change missed by 205ef7961f781496366e0a93a4ec621ad3724bd7. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-11-04block: avoid a warning on 64 bit hosts with long as int64_tBlue Swirl1-2/+2
When building on a 64 bit host which uses 'long' for int64_t, GCC emits a warning: CC block/blkverify.o /src/qemu/block/blkverify.c: In function `blkverify_verify_readv': /src/qemu/block/blkverify.c:304: warning: long long int format, long unsigned int arg (arg 3) Rework a77cffe7e916f4dd28f2048982ea2e0d98143b11 to avoid the warning. Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-11-04qcow2: Invalidate cache after failed readKevin Wolf2-0/+2
The cache content may be destroyed after a failed read, better not use it any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2010-11-04vpc: Implement bdrv_flushKevin Wolf1-8/+13
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-11-04block: Allow bdrv_flush to return errorsKevin Wolf10-19/+26
This changes bdrv_flush to return 0 on success and -errno in case of failure. It's a requirement for implementing proper error handle in users of bdrv_flush. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2010-10-26Merge remote branch 'kwolf/for-anthony' into stagingAnthony Liguori5-198/+146
2010-10-23qemu-timer: move commonly used timer code to qemu-timer-commonBlue Swirl1-6/+6
Move timer init functions to a new file, qemu-timer-common.c. Make other critical timer functions inlined to preserve performance in qemu-timer.c, also move muldiv64() (used by the inline functions) to qemu-timer.h. Adjust block/raw-posix.c and simpletrace.c to use get_clock() directly. Remove a similar/duplicate definition in qemu-tool.c. Adjust hw/omap_clk.c to include qemu-timer.h because muldiv64() is used there. After this change, tracing can be used also for user code and simpletrace on Win32. Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-10-22block: Use GCC_FMT_ATTR and fix a format errorStefan Weil1-2/+3
Adding the gcc format attribute detects a format bug which is fixed here. v2: Don't use type cast. BDRV_SECTOR_SIZE is unsigned long long, so %lld should be the correct format specifier. Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-10-22Copy snapshots out of QCOW2 diskedison3-0/+33
In order to backup snapshots, created from QCOW2 iamge, we want to copy snapshots out of QCOW2 disk to a seperate storage. The following patch adds a new option in "qemu-img": qemu-img convert -f qcow2 -O qcow2 -s snapshot_name src_img bck_img. Right now, it only supports to copy the full snapshot, delta snapshot is on the way. Changes from V1: all the comments from Kevin are addressed: Add read-only checking Fix coding style Change the name from bdrv_snapshot_load to bdrv_snapshot_load_tmp Signed-off-by: Disheng Su <edison@cloud.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-10-22qcow2: Remove old image creation functionKevin Wolf1-224/+0
They have been #ifdef'd out by the previous patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-10-22qcow2: Simplify image creationKevin Wolf1-1/+132
Instead of doing lots of magic for setting up initial refcount blocks and stuff create a minimal (inconsistent) image, open it and initialize the rest with regular qcow2 functions. This is a complete rewrite of the image creation function. The old implementating is #ifdef'd out and will be removed by the next patch (removing it here would have made the diff unreadable because diff tries to find similarities when it's really a rewrite) Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-10-22qcow2: Support exact L1 table growthStefan Hajnoczi4-12/+19
The L1 table grow operation includes a size calculation that bumps up the new L1 table size in order to anticipate the size needs of vmstate data. This helps reduce the number of times that the L1 table has to be grown when vmstate data is appended. This size overhead is not necessary during image creation, bdrv_truncate(), or snapshot goto operations. In fact, existing qemu-iotests that exercise table growth are no longer able to trigger it because image creation preallocates an L1 table that is too large after changes to qcow_create2(). This patch keeps the size calculation but also adds exact growth for callers that do not want to inflate the L1 table size unnecessarily. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-10-13block: avoid a write only variableBlue Swirl1-0/+1
Compiling with GCC 4.6.0 20100925 produced a warning: /src/qemu/block/qcow2-refcount.c: In function 'update_refcount': /src/qemu/block/qcow2-refcount.c:552:13: error: variable 'dummy' set but not used [-Werror=unused-but-set-variable] Fix by adding a dummy cast so that the result is not unused. Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-10-03block/vvfat: Fix compiler warning in debug codeStefan Weil1-1/+0
Fix this compiler warning: ./block/vvfat.c:2285: error: comparison of unsigned expression >= 0 is always true Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-09-22block-verify: fix 32-bit buildAnthony Liguori1-1/+1
Reported-by: Peter Lemenkov <lemenkov@gmail.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-09-21blkverify: Add block driver for verifying I/OStefan Hajnoczi1-0/+382
The blkverify block driver makes investigating image format data corruption much easier. A raw image initialized with the same contents as the test image (e.g. qcow2 file) must be provided. The raw image mirrors read/write operations and is used to verify that data read from the test image is correct. See docs/blkverify.txt for more information. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21qcow2: Avoid bounce buffers for AIO write requestsKevin Wolf1-23/+18
qcow2 used to use bounce buffers for any AIO requests. This does not only imply unnecessary copying, but also unbounded allocations which should be avoided. This patch removes bounce buffers from the normal AIO write path. Encrypted images continue to use a bounce buffer, however with constant size. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21qcow2: Avoid bounce buffers for AIO read requestsKevin Wolf3-30/+68
qcow2 used to use bounce buffers for any AIO requests. This does not only imply unnecessary copying, but also unbounded allocations which should be avoided. This patch removes bounce buffers from the normal AIO read path, and constrains them to a constant size for encrypted images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21qcow2: Get rid of additional sync on COWKevin Wolf1-2/+8
We always have a sync for the refcount update when a new cluster is allocated. If we move this past the COW, we can save an additional sync. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21qcow2: Move sync out of qcow2_alloc_clustersKevin Wolf3-2/+7
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21qcow2: Move sync out of update_refcountKevin Wolf1-2/+11
Note that the flush is omitted intentionally in qcow2_free_clusters. If anything, we can leak clusters here if we lose the writes. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21qcow2: Move sync out of write_refcount_block_entriesKevin Wolf1-1/+3
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21nbd: correctly manage default portLaurent Vivier1-2/+0
block/nbd.c: use default port number when none is specified qemu-nbd.c: use IANA-assigned port number: 10809 Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21raw-posix: handle > 512 byte alignment correctlyChristoph Hellwig1-33/+46
Replace the hardcoded handling of 512 byte alignment with bs->buffer_alignment to handle larger sector size devices correctly. Note that we can not rely on it to be initialize in bdrv_open, so deal with the worst case there. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21vvfat: Use cache=unsafeKevin Wolf1-4/+10
The qcow file used for write support in vvfat is a temporary file, so we can use cache=unsafe there. Without this, write support is just too slow to be of any use. Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
2010-09-21vvfat: Fix double free for opening the image rwKevin Wolf1-3/+4
Allocation and deallocation of bs->opaque is not in the control of a block driver. Therefore it should not set bs->opaque to a data structure used by another bs, or closing the image will lead to a double free. Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
2010-09-21vvfat: Fix segfault on write to read-only diskKevin Wolf1-0/+5
vvfat tries to set the readonly flag in its open function, but nowadays this is overwritted with the readonly=... command line option. Check in bdrv_write if the vvfat was opened read-only and return an error in this case. Without this check, vvfat tries to access the qcow bs, which is NULL without enabled write support. Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
2010-09-18blkdebug: fix enum comparisonBlue Swirl1-3/+1
The signedness of enum types depend on the compiler implementation. Therefore the check for negative values may or may not be meaningful. Fix by explicitly casting to a signed integer. Since the values are also checked earlier against event_names table, this is an internal error. Change the 'if' to 'assert'. This also avoids a warning with GCC flag -Wtype-limits. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-09-08Revert "Make default invocation of block drivers safer (v3)"Anthony Liguori1-130/+0
This reverts commit 79368c81bf8cf93864d7afc88b81b05d8f0a2c90. Conflicts: block.c I haven't been able to come up with a solution yet for the corruption caused by unaligned requests from the IDE disk so revert until a solution can be written. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-09-08qcow2: Remove unnecessary flush after L2 writeKevin Wolf1-4/+12
When a new cluster was allocated, we only need a flush after the write to the L2 table if it was a COW and we need to decrease the refcounts of the old clusters. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-08raw-posix: improve detection of scsi-generic devicesBernhard Kohl1-2/+8
Allow symbolic links which point to /dev/sgX devices. Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-08raw-posix: Don't use file name for host_cdrom detection on LinuxKevin Wolf1-3/+0
On Linux, we have code to detect CD-ROMs using an ioctl. We shouldn't lose anything but false positives by removing the check for a /dev/cd* path. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-30nbd: Introduce NBD named exports.Laurent Vivier1-19/+49
This patch allows to connect Qemu using NBD protocol to an nbd-server using named exports. For instance, if on the host "isoserver", in /etc/nbd-server/config, you have: [generic] [debian-500-ppc-netinst] exportname = /ISO/debian-500-powerpc-netinst.iso [Fedora-10-ppc-netinst] exportname = /ISO/Fedora-10-ppc-netinst.iso You can connect to it, using: qemu -cdrom nbd:isoserver:exportname=debian-500-ppc-netinst qemu -cdrom nbd:isoserver:exportname=Fedora-10-ppc-netinst NOTE: you need at least nbd-server 2.9.18 Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-30vvfat: fat_chksum(): fix access above array boundsLoïc Minier1-1/+1
Signed-off-by: Loïc Minier <loic.minier@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-30sheepdog: remove unnecessary includesIzumi Tsutsui1-10/+0
"qemu_socket.h" includes all necessary files and including <netinet/tcp.h> without <netinet/in.h> could cause errors on some systems. Signed-off-by: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-03block: Fix bdrv_has_zero_initKevin Wolf3-4/+21
Assuming that any image on a block device is not properly zero-initialized is actually wrong: Only raw images have this problem. Any other image format shouldn't care about it, they initialize everything properly themselves. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-25block: Replace u_int8_t, u_int16_t, u_int32_t, u_int64_t by standard int typesStefan Weil1-1/+1
There is no need to have a second set of integral types. Replace them by the standard types from stdint.h. Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-07-15Make default invocation of block drivers safer (v3)Anthony Liguori1-0/+130
CVE-2008-2004 described a vulnerability in QEMU whereas a malicious user could trick the block probing code into accessing arbitrary files in a guest. To mitigate this, we added an explicit format parameter to -drive which disabling block probing. Fast forward to today, and the vast majority of users do not use this parameter. libvirt does not use this by default nor does virt-manager. Most users want block probing so we should try to make it safer. This patch adds some logic to the raw device which attempts to detect a write operation to the beginning of a raw device. If the first 4 bytes happen to match an image file that has a backing file that we support, it scrubs the signature to all zeros. If a user specifies an explicit format parameter, this behavior is disabled. I contend that while a legitimate guest could write such a signature to the header, we would behave incorrectly anyway upon the next invocation of QEMU. This simply changes the incorrect behavior to not involve a security vulnerability. I've tested this pretty extensively both in the positive and negative case. I'm not 100% confident in the block layer's ability to deal with zero sized writes particularly with respect to the aio functions so some additional eyes would be appreciated. Even in the case of a single sector write, we have to make sure to invoked the completion from a bottom half so just removing the zero sized write is not an option. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-07sheepdog: fix compile error on systems without TCP_CORKMORITA Kazutaka1-1/+1
WIN32 is not only the system which doesn't have TCP_CORK (e.g. OS X). Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-07-06block: add sheepdog driver for distributed storage supportMORITA Kazutaka1-0/+2036
Sheepdog is a distributed storage system for QEMU. It provides highly available block level storage volumes to VMs like Amazon EBS. This patch adds a qemu block driver for Sheepdog. Sheepdog features are: - No node in the cluster is special (no metadata node, no control node, etc) - Linear scalability in performance and capacity - No single point of failure - Autonomous management (zero configuration) - Useful volume management support such as snapshot and cloning - Thin provisioning - Autonomous load balancing The more details are available at the project site: http://www.osrg.net/sheepdog/ Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-06raw-posix: Fix test for host CD-ROMMarkus Armbruster1-11/+6
raw_pread_aligned() retries up to two times if the block device backs a virtual CD-ROM (a drive with media=cdrom and if=ide, scsi, xen or none). This makes no sense. Whether retrying reads can correct read errors can only depend on what we're reading, not on how the result gets used. We need to check what whether we're reading from a physical CD-ROM or floppy here. I doubt retrying is useful even then. Left for another day. Impact: * Virtual CD-ROM backed by host_cdrom behaves the same. * Virtual CD-ROM backed by file or host_device no longer retries. * A drive backed by host_cdrom now retries even if it's not a virtual CD-ROM. * Any drive backed by host_floppy now retries. While there, clean up gratuitous use of goto. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-06qcow2/vdi: Change check to distinguish error casesKevin Wolf4-63/+73
This distinguishes between harmless leaks and real corruption. Hopefully users better understand what qemu-img check wants to tell them. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02blkdebug: Initialize state as 1Kevin Wolf1-0/+3
state = 0 in rules means that the rule is valid for any state. Therefore it's impossible to have a rule that works only in the initial state. This changes the initial state from 0 to 1 to make this possible. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02blkdebug: Free QemuOpts after having read the configKevin Wolf1-0/+2
Forgetting to free them means that the next instance inherits all rules and gets its own rules only additionally. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02blkdebug: Fix set_state_opts definitionKevin Wolf1-1/+1
The list head was initialized to point to the wrong list, so all actions ended up being handled as inject-error even if they were set-state in fact. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02qcow2: Fix error handling during metadata preallocationKevin Wolf1-6/+9
People were wondering why qemu-img check failed after they tried to preallocate a large qcow2 file and ran out of disk space. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22qcow2: Don't try to check tables that couldn't be loadedKevin Wolf1-0/+1
Trying to check them leads to a second error message which is more confusing than helpful: Can't get refcount for cluster 0: Invalid argument ERROR cluster 0 refcount=-22 reference=1 Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22qcow2: Fix qemu-img check segfault on corrupted imagesKevin Wolf1-3/+11
With corrupted images, we can easily get an cluster index that exceeds the array size of the temporary refcount table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22vpc: Use bdrv_(p)write_sync for metadata writesKevin Wolf1-4/+5
Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22vmdk: Use bdrv_(p)write_sync for metadata writesKevin Wolf1-5/+5
Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>