aboutsummaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)AuthorFilesLines
2012-09-28stream: add on-error argumentPaolo Bonzini1-1/+27
This patch adds support for error management to streaming. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-28iostatus: move BlockdevOnError declaration to QAPIPaolo Bonzini1-7/+7
This will let block-stream reuse the enum. Places that used the enums are renamed accordingly. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-28block: move job APIs to separate filesPaolo Bonzini3-2/+5
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-28block: add live block commit functionalityJeff Cody2-0/+268
This adds the live commit coroutine. This iteration focuses on the commit only below the active layer, and not the active layer itself. The behaviour is similar to block streaming; the sectors are walked through, and anything that exists above 'base' is committed back down into base. At the end, intermediate images are deleted, and the chain stitched together. Images are restored to their original open flags upon completion. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-28block: Support GlusterFS as a QEMU block backend.Bharata B Rao2-0/+625
This patch adds gluster as the new block backend in QEMU. This gives QEMU the ability to boot VM images from gluster volumes. Its already possible to boot from VM images on gluster volumes using FUSE mount, but this patchset provides the ability to boot VM images from gluster volumes by by-passing the FUSE layer in gluster. This is made possible by using libgfapi routines to perform IO on gluster volumes directly. VM Image on gluster volume is specified like this: file=gluster[+transport]://[server[:port]]/volname/image[?socket=...] 'gluster' is the protocol. 'transport' specifies the transport type used to connect to gluster management daemon (glusterd). Valid transport types are tcp, unix and rdma. If a transport type isn't specified, then tcp type is assumed. 'server' specifies the server where the volume file specification for the given volume resides. This can be either hostname, ipv4 address or ipv6 address. ipv6 address needs to be within square brackets [ ]. If transport type is 'unix', then 'server' field should not be specifed. The 'socket' field needs to be populated with the path to unix domain socket. 'port' is the port number on which glusterd is listening. This is optional and if not specified, QEMU will send 0 which will make gluster to use the default port. If the transport type is unix, then 'port' should not be specified. 'volname' is the name of the gluster volume which contains the VM image. 'image' is the path to the actual VM image that resides on gluster volume. Examples: file=gluster://1.2.3.4/testvol/a.img file=gluster+tcp://1.2.3.4/testvol/a.img file=gluster+tcp://1.2.3.4:24007/testvol/dir/a.img file=gluster+tcp://[1:2:3:4:5:6:7:8]/testvol/dir/a.img file=gluster+tcp://[1:2:3:4:5:6:7:8]:24007/testvol/dir/a.img file=gluster+tcp://server.domain.com:24007/testvol/dir/a.img file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket file=gluster+rdma://1.2.3.4:24007/testvol/a.img Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-25Merge remote-tracking branch 'kwolf/for-anthony' into stagingAnthony Liguori12-91/+286
* kwolf/for-anthony: block: remove keep_read_only flag from BlockDriverState struct block: convert bdrv_commit() to use bdrv_reopen() block: vpc image file reopen block: vdi image file reopen block: vmdk image file reopen block: qcow image file reopen block: qcow2 image file reopen block: qed image file reopen block: raw image file reopen block: raw-posix image file reopen block: purge s->aligned_buf and s->aligned_buf_size from raw-posix.c block: use BDRV_O_NOCACHE instead of s->aligned_buf in raw-posix.c block: do not parse BDRV_O_CACHE_WB in block drivers block: move open flag parsing in raw block drivers to helper functions block: move aio initialization into a helper function block: Framework for reopening files safely block: make bdrv_set_enable_write_cache() modify open_flags block: correctly set the keep_read_only flag blockdev: preserve readonly and snapshot states across media changes
2012-09-24block: vpc image file reopenJeff Cody1-0/+7
There is currently nothing that needs to be done for VPC image file reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: vdi image file reopenJeff Cody1-0/+7
There is currently nothing that needs to be done for VDI reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: vmdk image file reopenJeff Cody1-0/+35
This patch supports reopen for VMDK image files. VMDK extents are added to the existing reopen queue, so that the transactional model of reopen is maintained with multiple image files. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: qcow image file reopenJeff Cody1-0/+10
These are the stubs for the file reopen drivers for the qcow format. There is currently nothing that needs to be done by the qcow driver in reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: qcow2 image file reopenJeff Cody1-0/+10
These are the stubs for the file reopen drivers for the qcow2 format. There is currently nothing that needs to be done by the qcow2 driver in reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: qed image file reopenJeff Cody1-0/+9
These are the stubs for the file reopen drivers for the qed format. There is currently nothing that needs to be done by the qed driver in reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: raw image file reopenJeff Cody1-0/+10
These are the stubs for the file reopen drivers for the raw format. There is currently nothing that needs to be done by the raw driver in reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: raw-posix image file reopenJeff Cody1-0/+114
This is derived from the Supriya Kannery's reopen patches. This contains the raw-posix driver changes for the bdrv_reopen_* functions. All changes are staged into a temporary scratch buffer during the prepare() stage, and copied over to the live structure during commit(). Upon abort(), all changes are abandoned, and the live structures are unmodified. The _prepare() will create an extra fd - either by means of a dup, if possible, or opening a new fd if not (for instance, access control changes). Upon _commit(), the original fd is closed and the new fd is used. Upon _abort(), the duplicate/new fd is closed. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: purge s->aligned_buf and s->aligned_buf_size from raw-posix.cJeff Cody1-20/+1
The aligned_buf pointer and aligned_buf size are no longer used in raw_posix.c, so remove all references to them. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: use BDRV_O_NOCACHE instead of s->aligned_buf in raw-posix.cJeff Cody1-1/+1
Rather than check for a non-NULL aligned_buf to determine if raw_aio_submit needs to check for alignment, check for the presence of BDRV_O_NOCACHE in the bs->open_flags. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: do not parse BDRV_O_CACHE_WB in block driversJeff Cody5-24/+6
Block drivers should ignore BDRV_O_CACHE_WB in .bdrv_open flags, and in the bs->open_flags. This patch removes the code, leaving the behaviour behind as if BDRV_O_CACHE_WB was set. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: move open flag parsing in raw block drivers to helper functionsJeff Cody2-34/+47
Code motion, to move parsing of open flags into a helper function. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: move aio initialization into a helper functionJeff Cody1-18/+35
Move AIO initialization for raw-posix block driver into a helper function. In addition to just code motion, the aio_ctx pointer is checked for NULL, prior to calling laio_init(), to make sure laio_init() is only run once. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-21iSCSI: We dont need to explicitely call qemu_notify_event() any moreRonnie Sahlberg1-6/+0
We no longer need to explicitely call qemu_notify_event() any more since this is now done automatically any time the filehandles we listen to change. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-09-21iSCSI: We need to support SG_IO also from iscsi_ioctl()Ronnie Sahlberg1-0/+17
We need to support SG_IO from the synchronous iscsi_ioctl() since scsi-block uses this to do an INQ to the device to discover its properties This patch makes scsi-block work with iscsi. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-09-12vdi: Fix warning from clangStefan Weil1-13/+12
ccc-analyzer reports these warnings: block/vdi.c:704:13: warning: Dereference of null pointer bmap[i] = VDI_UNALLOCATED; ^ block/vdi.c:702:13: warning: Dereference of null pointer bmap[i] = i; ^ Moving some code into the if block fixes this. It also avoids calling function write with 0 bytes of data. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-12block/curl: Fix wrong free statementStefan Weil1-2/+1
Report from smatch: block/curl.c:546 curl_close(21) info: redundant null check on s->url calling free() The check was redundant, and free was also wrong because the memory was allocated using g_strdup. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-12sheepdog: fix savevm and loadvmMORITA Kazutaka1-1/+2
This patch sets data to be sent to Sheepdog correctly and fixes savevm and loadvm operations on a Sheepdog image. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-31Merge remote-tracking branch 'kwolf/for-anthony' into stagingAnthony Liguori2-0/+17
* kwolf/for-anthony: qemu-iotests: add backing file smaller than image test case stream: complete early if end of backing file is reached qed: refuse unaligned zero writes with a backing file
2012-08-29stream: complete early if end of backing file is reachedStefan Hajnoczi1-0/+6
It is possible to create an image that is larger than its backing file. Reading beyond the end of the backing file produces zeroes if no writes have been made to those sectors in the image file. This patch finishes streaming early when the end of the backing file is reached. Without this patch the block job hangs and continually tries to stream the first sectors beyond the end of the backing file. To reproduce the hung block job bug: $ qemu-img create -f qcow2 backing.qcow2 128M $ qemu-img create -f qcow2 -o backing_file=backing.qcow2 image.qcow2 6G $ qemu -drive if=virtio,cache=none,file=image.qcow2 (qemu) block_stream virtio0 (qemu) info block-jobs The qemu-iotests 030 streaming test still passes. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-29qed: refuse unaligned zero writes with a backing fileStefan Hajnoczi1-0/+11
Zero writes have cluster granularity in QED. Therefore they can only be used to zero entire clusters. If the zero write request leaves sectors untouched, zeroing the entire cluster would obscure the backing file. Instead return -ENOTSUP, which is handled by block.c:bdrv_co_do_write_zeroes() and falls back to a regular write. The qemu-iotests 034 test cases covers this scenario. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-28iscsi: Set number of blocks to 0 for blank CDROM devicesRonnie Sahlberg1-1/+6
The number of blocks of the device is used to compute the device size in bdrv_getlength()/iscsi_getlength(). For MMC devices, the ReturnedLogicalBlockAddress in the READCAPACITY10 has a special meaning when it is 0. In this case it does not mean that LBA 0 is the last accessible LBA, and thus the device has 1 readable block, but instead it means that the disc is blank and there are no readable blocks. This change ensures that when the iSCSI LUN is loaded with a blank DVD-R disk or similar that bdrv_getlength() will return the correct size of the device as 0 bytes. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2012-08-22Merge remote-tracking branch 'bonzini/scsi-next' into stagingAnthony Liguori1-59/+63
* bonzini/scsi-next: virtio-scsi: add backwards-compatibility properties for 1.1 and earlier machines iscsi: fix races between task completion and abort iscsi: simplify iscsi_schedule_bh iscsi: move iscsi_schedule_bh and iscsi_readv_writev_bh_cb Revert "iscsi: Fix NULL dereferences / races between task completion and abort"
2012-08-22Merge remote-tracking branch 'kwolf/for-anthony' into stagingAnthony Liguori1-1/+57
* kwolf/for-anthony: virtio-blk: hide VIRTIO_BLK_F_CONFIG_WCE from old machine types Documentation: Warn against qemu-img on active image vmdk: Read footer for streamOptimized images vmdk: Fix header structure Conflicts: hw/virtio-blk.c
2012-08-22sheepdog: don't leak socket file descriptor upon connection failureJim Meyering1-0/+1
Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-08-20iscsi: fix races between task completion and abortPaolo Bonzini1-29/+30
This patch fixes two main issues with block/iscsi.c: 1) iscsi_task_mgmt_abort_task_async calls iscsi_scsi_task_cancel which was also directly called in iscsi_aio_cancel 2) a race between task completion and task abortion could happen cause the scsi_free_scsi_task were done before iscsi_schedule_bh has finished. To fix this, all the freeing of IscsiTasks and releasing of the AIOCBs is centralized in iscsi_bh_cb, independent of whether the SCSI command has completed or was cancelled. 3) iscsi_aio_cancel was not synchronously waiting for the end of the command. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-08-20iscsi: simplify iscsi_schedule_bhPaolo Bonzini1-15/+9
It is always used with the same callback, remove the argument. And its return value is never used, assume allocation succeeds. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-08-20iscsi: move iscsi_schedule_bh and iscsi_readv_writev_bh_cbPaolo Bonzini1-28/+28
Put these functions at the beginning, to avoid forward references in the next patches. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-08-20Revert "iscsi: Fix NULL dereferences / races between task completion and abort"Paolo Bonzini1-23/+32
This reverts commit 64e69e80920d82df3fa679bc41b13770d2f99360. The commit returned immediately from iscsi_aio_cancel, risking corruption in case the following happens: guest qemu target ========================================================================= send write 1 --------> send write 1 --------> cancel write 1 ------> cancel write 1 ------> <------------------ cancellation processed send write 2 --------> send write 2 --------> <---------------- completed write 2 <------------------ completed write 2 <---------------- completed write 1 <---------------- cancellation not done Here, the guest would see write 2 superseding write 1, when in fact the outcome could have been the opposite. The right behavior is to return only after the target says whether the cancellation was done or not, and it will be implemented by the next three patches. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-08-17vmdk: Read footer for streamOptimized imagesKevin Wolf1-0/+56
The footer takes precedence over the header when it exists. It contains the real grain directory offset that is missing in the header. Without this patch, streamOptimized images with a footer cannot be read. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Reviewed-by: Jeff Cody <jcody@redhat.com>
2012-08-17vmdk: Fix header structureKevin Wolf1-1/+1
Commit bb45ded9 swapped gd_offset and rgd_offset. This is wrong. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-15iscsi: Fix NULL dereferences / races between task completion and abortStefan Priebe1-32/+23
Signed-off-by: Stefan Priebe <s.priebe@profihost.ag> Acked-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-15block: Convert close calls to qemu_closeCorey Bryant5-22/+22
This patch converts all block layer close calls, that correspond to qemu_open calls, to qemu_close. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-15block: Convert open calls to qemu_openCorey Bryant6-28/+26
This patch converts all block layer open calls to qemu_open. Note that this adds the O_CLOEXEC flag to the changed open paths when the O_CLOEXEC macro is defined. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-15block: Prevent detection of /dev/fdset/ as floppyCorey Bryant1-1/+3
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-11Merge remote-tracking branch 'kwolf/for-anthony' into stagingAnthony Liguori4-24/+50
* kwolf/for-anthony: qemu-iotests: skip 039 with ./check -nocache block: add BLOCK_O_CHECK for qemu-img check qcow2: mark image clean after repair succeeds qed: mark image clean after repair succeeds blockdev: flip default cache mode from writethrough to writeback virtio-blk: disable write cache if not negotiated virtio-blk: support VIRTIO_BLK_F_CONFIG_WCE qemu-iotests: Save some sed processes ahci: Fix sglist memleak in ahci_dma_rw_buf() ahci: Fix ahci cdrom read corruptions for reads > 128k virtio-blk: fix use-after-free while handling scsi commands
2012-08-11Merge remote-tracking branch 'bonzini/scsi-next' into stagingAnthony Liguori1-30/+29
* bonzini/scsi-next: scsi-disk: add support for the UNMAP command scsi-disk: improve out-of-range LBA detection for WRITE SAME scsi-disk: more assertions and resets for aiocb virtio-scsi: do not compare 32-bit QEMU tags against 64-bit virtio-scsi tags iscsi: Pick default initiator-name based on the name of the VM iscsi: reorganize code for parse_initiator_name iscsi: do not leak initiator_name
2012-08-10block: add BLOCK_O_CHECK for qemu-img checkStefan Hajnoczi2-3/+3
Image formats with a dirty bit, like qed and qcow2, repair dirty image files upon open with BDRV_O_RDWR. Performing automatic repair when qemu-img check runs is not ideal because the bdrv_open() call repairs the image before the actual bdrv_check() call from qemu-img.c. Fix this "double repair" since it leads to confusing output from qemu-img check. Tell the block driver that this image is being opened just for bdrv_check(). This skips automatic repair and qemu-img.c can invoke it manually with bdrv_check(). Update the golden output for qemu-iotests 039 to reflect the new qemu-img check output. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10qcow2: mark image clean after repair succeedsStefan Hajnoczi1-13/+15
The dirty bit is cleared after image repair succeeds in qcow2_open(). Move this into qcow2_check() so that all callers benefit from this behavior when fix mode is enabled. This is necessary so qemu-img check can call .bdrv_check() and mark the image clean. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10qed: mark image clean after repair succeedsStefan Hajnoczi3-8/+32
The dirty bit is cleared after image repair succeeds in qed_open(). Move this into qed_check() so that all callers benefit from this behavior when fix=true. This is necessary so qemu-img check can call .bdrv_check() and mark the image clean. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-09iscsi: Pick default initiator-name based on the name of the VMRonnie Sahlberg1-1/+4
This patch updates the iscsi layer to automatically pick a 'unique' initiator-name based on the name of the vm in case the user has not set an explicit iqn-name to use. Create a new function qemu_get_vm_name() that returns the name of the VM, if specified. This way we can thus create default names to use as the initiator name based on the guest session. If the VM is not named via the '-name' command line argument, the iscsi initiator-name used wiull simply be iqn.2008-11.org.linux-kvm If a name for the VM was specified with the '-name' option, iscsi will use a default initiatorname of iqn.2008-11.org.linux-kvm:<name> These names are just the default iscsi initiator name that qemu will generate/use only when the user has not set an explicit initiator name via the commandlines or config files. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2012-08-08iscsi: reorganize code for parse_initiator_namePaolo Bonzini1-12/+9
Merge the occurrences of the "iqn.2008-11.org.linux-kvm" string to avoid duplication. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-08-08iscsi: do not leak initiator_namePaolo Bonzini1-17/+16
The argument of iscsi_create_context is never freed by libiscsi, which in fact calls strdup on it. Avoid a leak. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-08-06qcow2: implement lazy refcountsStefan Hajnoczi3-5/+86
Lazy refcounts is a performance optimization for qcow2 that postpones refcount metadata updates and instead marks the image dirty. In the case of crash or power failure the image will be left in a dirty state and repaired next time it is opened. Reducing metadata I/O is important for cache=writethrough and cache=directsync because these modes guarantee that data is on disk after each write (hence we cannot take advantage of caching updates in RAM). Refcount metadata is not needed for guest->file block address translation and therefore does not need to be on-disk at the time of write completion - this is the motivation behind the lazy refcount optimization. The lazy refcount optimization must be enabled at image creation time: qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>