aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2015-04-28block: add BdrvDirtyBitmap documentationJohn Snow1-5/+5
Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-14-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: Add dirty bitmap status field in query-blockJohn Snow2-1/+5
Add the "frozen" status booleans, to inform clients when a bitmap is occupied doing a task. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-13-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: add block-dirty-bitmap-clearJohn Snow5-0/+86
Add bdrv_clear_dirty_bitmap and a matching QMP command, qmp_block_dirty_bitmap_clear that enables a user to reset the bitmap attached to a drive. This allows us to reset a bitmap in the event of a full drive backup. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-12-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: Add support of "dirty-bitmap" sync mode for drive-backupJohn Snow9-33/+181
For "dirty-bitmap" sync mode, the block job will iterate through the given dirty bitmap to decide if a sector needs backup (backup all the dirty clusters and skip clean ones), just as allocation conditions of "top" sync mode. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-11-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: Add bitmap successorsJohn Snow4-1/+121
A bitmap successor is an anonymous BdrvDirtyBitmap that is intended to be created just prior to a sensitive operation (e.g. Incremental Backup) that can either succeed or fail, but during the course of which we still want a bitmap tracking writes. On creating a successor, we "freeze" the parent bitmap which prevents its deletion, enabling, anonymization, or creating a bitmap with the same name. On success, the parent bitmap can "abdicate" responsibility to the successor, which will inherit its name. The successor will have been tracking writes during the course of the backup operation. The parent will be safely deleted. On failure, we can "reclaim" the successor from the parent, unifying them such that the resulting bitmap describes all writes occurring since the last successful backup, for instance. Reclamation will thaw the parent, but not explicitly re-enable it. BdrvDirtyBitmap operations that target a single bitmap are protected by assertions that the bitmap is not frozen and/or disabled. BdrvDirtyBitmap operations that target a group of bitmaps, such as bdrv_{set,reset}_dirty will ignore frozen/disabled drives with a conditional instead. Internal functions that enable/disable dirty bitmaps have assertions added to them to prevent modifying frozen bitmaps. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-10-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: Add bitmap disabled statusJohn Snow2-0/+28
Add a status indicating the enabled/disabled state of the bitmap. A bitmap is by default enabled, but you can lock the bitmap into a read-only state by setting disabled = true. A previous version of this patch added a QMP interface for changing the state of the bitmap, but it has since been removed for now until a use case emerges where this state must be revealed to the user. The disabled state WILL be used internally for bitmap migration and bitmap persistence. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-9-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28hbitmap: add hbitmap_mergeJohn Snow2-0/+46
We add a bitmap merge operation to assist in error cases where we wish to combine two bitmaps together. This is algorithmically O(bits) provided HBITMAP_LEVELS remains constant. For a full bitmap on a 64bit machine: sum(bits/64^k, k, 0, HBITMAP_LEVELS) ~= 1.01587 * bits We may be able to improve running speed for particularly sparse bitmaps by using iterators, but the running time for dense maps will be worse. We present the simpler solution first, and we can refine it later if needed. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-8-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28hbitmap: cache array lengthsJohn Snow1-0/+4
As a convenience: between incremental backups, bitmap migrations and bitmap persistence we seem to need to recalculate these a lot. Because the lengths are a little bit-twiddly, let's just solidly cache them and be done with it. Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-7-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: Introduce bdrv_dirty_bitmap_granularity()John Snow2-2/+7
This returns the granularity (in bytes) of dirty bitmap, which matches the QMP interface and the existing query interface. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-6-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-removeJohn Snow6-9/+250
The new command pair is added to manage a user created dirty bitmap. The dirty bitmap's name is mandatory and must be unique for the same device, but different devices can have bitmaps with the same names. The granularity is an optional field. If it is not specified, we will choose a default granularity based on the cluster size if available, clamped to between 4K and 64K to mirror how the 'mirror' code was already choosing granularity. If we do not have cluster size info available, we choose 64K. This code has been factored out into a helper shared with block/mirror. This patch also introduces the 'block_dirty_bitmap_lookup' helper, which takes a device name and a dirty bitmap name and validates the lookup, returning NULL and setting errp if there is a problem with either field. This helper will be re-used in future patches in this series. The types added to block-core.json will be re-used in future patches in this series, see: 'qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}' Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-5-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: Ensure consistent granularity typeJohn Snow5-10/+11
We treat this field with a variety of different types everywhere in the code. Now it's just uint32_t. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-4-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qapi: Add optional field "name" to block dirty bitmapFam Zheng5-5/+42
This field will be set for user created dirty bitmap. Also pass in an error pointer to bdrv_create_dirty_bitmap, so when a name is already taken on this BDS, it can report an error message. This is not global check, two BDSes can have dirty bitmap with a common name. Implemented bdrv_find_dirty_bitmap to find a dirty bitmap by name, will be used later when other QMP commands want to reference dirty bitmap by name. Add bdrv_dirty_bitmap_make_anon. This unsets the name of dirty bitmap. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-3-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28docs: incremental backup documentationJohn Snow1-0/+352
Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-2-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/iscsi: use the allocationmap also if cache.direct=onPeter Lieven1-1/+1
the allocationmap has only a hint character. The driver always double checks that blocks marked unallocated in the cache are still unallocated before taking the fast path and return zeroes. So using the allocationmap is migration safe and can also be enabled with cache.direct=on. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-10-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/iscsi: bump year in copyright noticePeter Lieven1-1/+1
Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-9-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/iscsi: handle SCSI_STATUS_TASK_SET_FULLPeter Lieven1-2/+5
a target may issue a SCSI_STATUS_TASK_SET_FULL status if there is more than one "BUSY" command queued already. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-8-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/iscsi: increase retry countPeter Lieven1-1/+1
The idea is that a command is retried in a BUSY condition up a time of approx. 60 seconds before it is failed. This should be far higher than any command timeout in the guest. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-7-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/iscsi: optimize WRITE10/16 if cache.writeback is not setPeter Lieven1-3/+15
SCSI allowes to tell the target to not return from a write command if the date is not written to the disk. Use this so called FUA bit if it is supported to optimize WRITE commands if writeback is not allowed. In this case qemu always issues a WRITE followed by a FLUSH. This is 2 round trip times. If we set the FUA bit we can ignore the following FLUSH. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-6-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/iscsi: store DPOFUA bit from the modesense commandPeter Lieven1-0/+3
Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-5-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/iscsi: rename iscsi_write_protected and let it return voidPeter Lieven1-5/+5
Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-4-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/iscsi: change all iscsilun properties from uint8_t to boolPeter Lieven1-7/+7
Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-3-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/iscsi: do not forget to logout from targetPeter Lieven1-0/+6
We actually were always impolitely dropping the connection and not cleanly logging out. CC: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-2-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: fill in the image field in BlockDeviceInfoAlberto Garcia5-26/+35
The image field in BlockDeviceInfo is supposed to contain an ImageInfo object. However that is being filled in by bdrv_query_info(), not by bdrv_block_device_info(), which is where BlockDeviceInfo is actually created. Anyone calling bdrv_block_device_info() directly will get a null image field. As a consequence of this, the HMP command 'info block -n -v' crashes QEMU. This patch moves the code that fills in that field from bdrv_query_info() to bdrv_block_device_info(). Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 1429271563-3765-1-git-send-email-berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28Revert "hmp: fix crash in 'info block -n -v'"Stefan Hajnoczi1-2/+1
This reverts commit 638b8366200130cc7cf7a026630bc6bfb63b0c4c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: add 'node-name' field to BLOCK_IMAGE_CORRUPTEDAlberto Garcia3-16/+30
Since this event can occur in nodes that cannot have a device name associated, include also a field with the node name. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 147cec5b3594f4bec0cb41c98afe5fcbfb67567c.1428485266.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: use bdrv_get_device_or_node_name() in error messagesAlberto Garcia12-50/+46
There are several error messages that identify a BlockDriverState by its device name. However those errors can be produced in nodes that don't have a device name associated. In those cases we should use bdrv_get_device_or_node_name() to fall back to the node name and produce a more meaningful message. The messages are also updated to use the more generic term 'node' instead of 'device'. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 9823a1f0514fdb0692e92868661c38a9e00a12d6.1428485266.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: add bdrv_get_device_or_node_name()Alberto Garcia3-4/+11
This function gets the device name associated with a BlockDriverState, or its node name if the device name is empty. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 4fa30aa8d61d9052ce266fd5429a59a14e941255.1428485266.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: document block-stream in qmp-commands.hxStefan Hajnoczi1-0/+37
The 'block-stream' QMP command is documented in block-core.json but not qmp-commands.hx. Add a summary of the command to qmp-commands.hx (similar to the documentation for 'block-commit'). Reported-by: Kashyap Chamarthy <kchamart@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1429094622-26218-1-git-send-email-stefanha@redhat.com Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28m25p80: fix s->blk usage before assignmentStefan Hajnoczi1-1/+3
Delay the call to blk_blockalign() until s->blk has been assigned. This never caused a crash because blk_blockalign(NULL, size) defaults to 4096 alignment but it's technically incorrect. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1429091024-25098-1-git-send-email-stefanha@redhat.com Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28m25p80: add missing blk_attach_dev_nofailPaolo Bonzini1-0/+1
Of the block devices that poked into -drive options via drive_get_next, m25p80 was the only one who also did not attach itself to the BlockBackend. Since sd does it, and all other devices go through a "drive" property, with this change all block backends attached to the guest will have a non-NULL result for blk_get_attached_dev(). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Message-id: 1429025387-11077-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28virtio_blk: comment fixMichael S. Tsirkin1-2/+6
update virtio blk header from latest linux, include comment fixups. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-id: 1428854036-12806-1-git-send-email-mst@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: avoid unnecessary bottom halvesPaolo Bonzini1-9/+34
bdrv_aio_* APIs can use coroutines to achieve asynchronicity. However, the coroutine may terminate without having yielded back to the caller (for example because of something that invokes a nested event loop, or because the coroutine is doing nothing at all). In this case, the bdrv_aio_* API must delay the completion to the next iteration of the main loop, because bdrv_aio_* will never invoke the callback before returning. This can be done with a bottom half, and indeed bdrv_aio_* is always using one for simplicity. It is possible to gain some performance (~3%) by avoiding this in the common case. A new field in the BlockAIOCBCoroutine struct is set to true until the first time the corotine has yielded to its creator, and completion goes through a new function bdrv_co_complete. If the flag is false, bdrv_co_complete invokes the callback immediately. If it is true, the caller will notice that the coroutine has completed and schedule the bottom half itself. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427524638-28157-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28blockjob: Update function name in commentsFam Zheng2-2/+2
Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1428069921-2957-5-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qemu-iotests: Test that "stop" doesn't drain block jobsFam Zheng3-0/+92
Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1428069921-2957-4-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: Pause block jobs in bdrv_drain_allFam Zheng1-0/+20
This is necessary to suppress more IO requests from being generated from block job coroutines. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1428069921-2957-3-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28blockjob: Allow nested pauseFam Zheng4-14/+41
This patch changes block_job_pause to increase the pause counter and block_job_resume to decrease it. The counter will allow calling block_job_pause/block_job_resume unconditionally on a job when we need to suspend the IO temporarily. From now on, each block_job_resume must be paired with a block_job_pause to keep the counter balanced. The user pause from QMP or HMP will only trigger block_job_pause once until it's resumed, this is achieved by adding a user_paused flag in BlockJob. One occurrence of block_job_resume in mirror_complete is replaced with block_job_enter which does what is necessary. In block_job_cancel, the cancel flag is good enough to instruct coroutines to quit loop, so use block_job_enter to replace the unpaired block_job_resume. Upon block job IO error, user is notified about the entering to the pause state, so this pause belongs to user pause, set the flag accordingly and expect a matching QMP resume. [Extended doc comments as suggested by Paolo Bonzini <pbonzini@redhat.com>. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1428069921-2957-2-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28MAINTAINERS: Add Fam Zheng as Null block driver maintainerFam Zheng1-0/+6
Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427852740-24315-4-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/null: Support reopenFam Zheng1-0/+8
Reopen is used in block-commit. With this always-succeed operation, it is now possible to test committing to a null drive, by specifying "null-aio://" or "null-co://" as the backing image when creating the qcow2 image. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427852740-24315-3-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block/null: Latency simulation by adding new option "latency-ns"Fam Zheng2-7/+56
Aio context switch should just work because the requests will be drained, so the scheduled timer(s) on the old context will be freed. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427852740-24315-2-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28scripts: add 'qemu coroutine' command to qemu-gdb.pyStefan Hajnoczi1-0/+75
The 'qemu coroutine <coroutine-address>' GDB command prints the backtrace for a CoroutineUContext. This is useful for peeking inside yielded coroutines that are waiting for file descriptor events, timers, etc. For example: $ gdb tests/test-coroutine (gdb) b test_yield (gdb) r (gdb) b qemu_coroutine_enter (gdb) c (gdb) c Continuing. Breakpoint 2, qemu_coroutine_enter (co=0x555555c66520, opaque=0x0) at qemu-coroutine.c:103 103 { (gdb) source scripts/qemu-gdb.py (gdb) qemu coroutine 0x555555c66520 #0 0x000055555557a740 in qemu_coroutine_switch (from_=<optimized out>, to_=0x7ffff7f90a70, action=COROUTINE_YIELD) at coroutine-ucontext.c:177 #1 0x0000555555566af9 in yield_5_times (opaque=0x7fffffffdbb7) at tests/test-coroutine.c:107 #2 0x000055555557a7aa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:80 #3 0x00007ffff08de000 in __start_context () at /lib64/libc.so.6 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427409754-8556-1-git-send-email-stefanha@redhat.com Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28thread-pool: clean up thread_pool_completion_bh()Stefan Hajnoczi1-8/+6
This patch simplifies thread_pool_completion_bh(). The function first checks elem->state: if (elem->state != THREAD_DONE) { continue; } It then goes on to check elem->state == THREAD_DONE although we already know this must be the case. The QLIST_REMOVE() is duplicated down both branches of an if-else statement so that can be lifted out as well. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1427992762-10126-1-git-send-email-stefanha@redhat.com Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28vhdx: Fix zero-fill iov lengthKevin Wolf1-2/+2
Fix the length of the zero-fill for the back, which was accidentally using the same value as for the front. This is caught by qemu-iotests 033. For consistency, change the code for the front as well to use the length stored in the iov (it is the same value, copied four lines above). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Jeff Cody <jcody@redhat.com>
2015-04-28blkdebug: Add bdrv_truncate()Kevin Wolf1-0/+6
This is, amongst others, required for qemu-iotests 033 to run as intended on VHDX, which uses explicit bdrv_truncate() calls to bs->file when allocating new blocks. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>
2015-04-28qemu-iotests: Some qemu-img convert testsKevin Wolf3-0/+433
This adds a regression test for some problems that the qemu-img convert rewrite just fixed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-04-28qemu-img convert: Rewrite copying logicKevin Wolf1-206/+310
The implementation of qemu-img convert is (a) messy, (b) buggy, and (c) less efficient than possible. The changes required to beat some sense into it are massive enough that incremental changes would only make my and the reviewers' life harder. So throw it away and reimplement it from scratch. Let me give some examples what I mean by messy, buggy and inefficient: (a) The copying logic of qemu-img convert has two separate branches for compressed and normal target images, which roughly do the same - except for a little code that handles actual differences between compressed and uncompressed images, and much more code that implements just a different set of optimisations and bugs. This is unnecessary code duplication, and makes the code for compressed output (unsurprisingly) suffer from bitrot. The code for uncompressed ouput is run twice to count the the total length for the progress bar. In the first run it just takes a shortcut and runs only half the loop, and when it's done, it toggles a boolean, jumps out of the loop with a backwards goto and starts over. Works, but pretty is something different. (b) Converting while keeping a backing file (-B option) is broken in several ways. This includes not writing to the image file if the input has zero clusters or data filled with zeros (ignoring that the backing file will be visible instead). It also doesn't correctly limit every iteration of the copy loop to sectors of the same status so that too many sectors may be copied to in the target image. For -B this gives an unexpected result, for other images it just does more work than necessary. Conversion with a compressed target completely ignores any target backing file. (c) qemu-img convert skips reading and writing an area if it knows from metadata that copying isn't needed (except for the bug mentioned above that ignores a status change in some cases). It does, however, read from the source even if it knows that it will read zeros, and then search for non-zero bytes in the read buffer, if it's possible that a write might be needed. This reimplementation of the copying core reorganises the code to remove the duplication and have a much more obvious code flow, by essentially splitting the copy iteration loop into three parts: 1. Find the number of contiguous sectors of the same status at the current offset (This can also be called in a separate loop before the copying loop in order to determine the total sectors for the progress bar.) 2. Read sectors. If the status implies that there is no data there to read (zero or unallocated cluster), don't do anything. 3. Write sectors depending on the status. If it's data, write it. If we want the backing file to be visible (with -B), don't write it. If it's zeroed, skip it if you can, otherwise use bdrv_write_zeroes() to optimise the write at least where possible. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-04-28block-backend: Expose bdrv_write_zeroes()Kevin Wolf2-0/+13
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-04-28iothread: release iothread around aio_pollPaolo Bonzini3-24/+14
This is the first step towards having fine-grained critical sections in dataplane threads, which resolves lock ordering problems between address_space_* functions (which need the BQL when doing MMIO, even after we complete RCU-based dispatch) and the AioContext. Because AioContext does not use contention callbacks anymore, the unit test has to be changed. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-4-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28AioContext: acquire/release AioContext during aio_pollPaolo Bonzini3-6/+24
This is the first step in pushing down acquire/release, and will let rfifolock drop the contention callback feature. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-3-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28aio-posix: move pollfds to thread-local storagePaolo Bonzini3-26/+57
By using thread-local storage, aio_poll can stop using global data during g_poll_ns. This will make it possible to drop callbacks from rfifolock. [Moved npfd = 0 assignment to end of walking_handlers region as suggested by Paolo. This resolves the assert(npfd == 0) assertion failure in pollfds_cleanup(). --Stefan] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-2-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: Switch to host monotonic clock for IO throttlingFam Zheng1-1/+8
Currently, throttle timers won't make any progress when VCPU is not running, which would stall the request queue in utils, qtest, vm suspending, and live migration, without special handling. Block jobs are confusingly inconsistent between with and without throttling: if user sets a bps limit, stops the vm, then start a block job, the block job will not make any progress; in contrary, if user unsets the bps limit, or if it's not set, the block job will run normally. After this patch, with the host clock, even if the VCPUs are stopped, the throttle queues will be processed. This patch also enables potential to add throttle to bdrv_drain_all. Currently all requests are drained immediately. In other words whenever it is called, IO throttling goes ineffective (examples: system reset, migration and many block job operations.). This is a loophole that guest could exploit. If we use the host clock, we can later just trust the nested poll. This could be done on top. Note that for qemu-iotests case 093, which uses qtest, we still keep vm clock so the script can control the clock stepping in order to be deterministic. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1427268446-6426-1-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>