author | Peter Maydell <peter.maydell@linaro.org> | 2017-07-18 20:29:36 +0100 |
---|---|---|
committer | Peter Maydell <peter.maydell@linaro.org> | 2017-07-18 20:29:36 +0100 |
commit | f9dada2baabb639feb988b3a564df7a06d214e18 | |
tree | 0ff304a5dbd747bd6ec1a8a74d3e81a0ec093761 | |
parent | 20df6c7689eb89ab48fa5d5766d5c724c818148e | |
parent | 8508eee740c78d1465e25dad7c3e06137485dfbc | |
Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Tue 18 Jul 2017 05:15:03 BST
# gpg: using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg: aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg: aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057
* remotes/cody/tags/block-pull-request:
live-block-ops.txt: Rename, rewrite, and improve it
bitmaps.md: Convert to rST; move it into 'interop' dir
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-rw-r--r-- | docs/devel/bitmaps.md | 505 | ||||
-rw-r--r-- | docs/interop/bitmaps.rst | 555 | ||||
-rw-r--r-- | docs/interop/live-block-operations.rst | 1088 | ||||
-rw-r--r-- | docs/live-block-ops.txt | 72 |
4 files changed, 1643 insertions, 577 deletions
diff --git a/docs/devel/bitmaps.md b/docs/devel/bitmaps.md deleted file mode 100644 index a2e8d51..0000000 --- a/docs/devel/bitmaps.md +++ /dev/null @@ -1,505 +0,0 @@ -<!-- -Copyright 2015 John Snow <jsnow@redhat.com> and Red Hat, Inc. -All rights reserved. - -This file is licensed via The FreeBSD Documentation License, the full text of -which is included at the end of this document. ---> - -# Dirty Bitmaps and Incremental Backup - -* Dirty Bitmaps are objects that track which data needs to be backed up for the - next incremental backup. - -* Dirty bitmaps can be created at any time and attached to any node - (not just complete drives.) - -## Dirty Bitmap Names - -* A dirty bitmap's name is unique to the node, but bitmaps attached to different - nodes can share the same name. - -* Dirty bitmaps created for internal use by QEMU may be anonymous and have no - name, but any user-created bitmaps may not be. There can be any number of - anonymous bitmaps per node. - -* The name of a user-created bitmap must not be empty (""). - -## Bitmap Modes - -* A Bitmap can be "frozen," which means that it is currently in-use by a backup - operation and cannot be deleted, renamed, written to, reset, - etc. - -* The normal operating mode for a bitmap is "active." - -## Basic QMP Usage - -### Supported Commands ### - -* block-dirty-bitmap-add -* block-dirty-bitmap-remove -* block-dirty-bitmap-clear - -### Creation - -* To create a new bitmap, enabled, on the drive with id=drive0: - -```json -{ "execute": "block-dirty-bitmap-add", - "arguments": { - "node": "drive0", - "name": "bitmap0" - } -} -``` - -* This bitmap will have a default granularity that matches the cluster size of - its associated drive, if available, clamped to between [4KiB, 64KiB]. - The current default for qcow2 is 64KiB. 
- -* To create a new bitmap that tracks changes in 32KiB segments: - -```json -{ "execute": "block-dirty-bitmap-add", - "arguments": { - "node": "drive0", - "name": "bitmap0", - "granularity": 32768 - } -} -``` - -### Deletion - -* Bitmaps that are frozen cannot be deleted. - -* Deleting the bitmap does not impact any other bitmaps attached to the same - node, nor does it affect any backups already created from this node. - -* Because bitmaps are only unique to the node to which they are attached, - you must specify the node/drive name here, too. - -```json -{ "execute": "block-dirty-bitmap-remove", - "arguments": { - "node": "drive0", - "name": "bitmap0" - } -} -``` - -### Resetting - -* Resetting a bitmap will clear all information it holds. - -* An incremental backup created from an empty bitmap will copy no data, - as if nothing has changed. - -```json -{ "execute": "block-dirty-bitmap-clear", - "arguments": { - "node": "drive0", - "name": "bitmap0" - } -} -``` - -## Transactions - -### Justification - -Bitmaps can be safely modified when the VM is paused or halted by using -the basic QMP commands. For instance, you might perform the following actions: - -1. Boot the VM in a paused state. -2. Create a full drive backup of drive0. -3. Create a new bitmap attached to drive0. -4. Resume execution of the VM. -5. Incremental backups are ready to be created. - -At this point, the bitmap and drive backup would be correctly in sync, -and incremental backups made from this point forward would be correctly aligned -to the full drive backup. - -This is not particularly useful if we decide we want to start incremental -backups after the VM has been running for a while, for which we will need to -perform actions such as the following: - -1. Boot the VM and begin execution. -2. Using a single transaction, perform the following operations: - * Create bitmap0. - * Create a full drive backup of drive0. -3. Incremental backups are now ready to be created. 
- -### Supported Bitmap Transactions - -* block-dirty-bitmap-add -* block-dirty-bitmap-clear - -The usages are identical to their respective QMP commands, but see below -for examples. - -### Example: New Incremental Backup - -As outlined in the justification, perhaps we want to create a new incremental -backup chain attached to a drive. - -```json -{ "execute": "transaction", - "arguments": { - "actions": [ - {"type": "block-dirty-bitmap-add", - "data": {"node": "drive0", "name": "bitmap0"} }, - {"type": "drive-backup", - "data": {"device": "drive0", "target": "/path/to/full_backup.img", - "sync": "full", "format": "qcow2"} } - ] - } -} -``` - -### Example: New Incremental Backup Anchor Point - -Maybe we just want to create a new full backup with an existing bitmap and -want to reset the bitmap to track the new chain. - -```json -{ "execute": "transaction", - "arguments": { - "actions": [ - {"type": "block-dirty-bitmap-clear", - "data": {"node": "drive0", "name": "bitmap0"} }, - {"type": "drive-backup", - "data": {"device": "drive0", "target": "/path/to/new_full_backup.img", - "sync": "full", "format": "qcow2"} } - ] - } -} -``` - -## Incremental Backups - -The star of the show. - -**Nota Bene!** Only incremental backups of entire drives are supported for now. -So despite the fact that you can attach a bitmap to any arbitrary node, they are -only currently useful when attached to the root node. This is because -drive-backup only supports drives/devices instead of arbitrary nodes. - -### Example: First Incremental Backup - -1. Create a full backup and sync it to the dirty bitmap, as in the transactional -examples above; or with the VM offline, manually create a full copy and then -create a new bitmap before the VM begins execution. - - * Let's assume the full backup is named 'full_backup.img'. - * Let's assume the bitmap you created is 'bitmap0' attached to 'drive0'. - -2. 
Create a destination image for the incremental backup that utilizes the -full backup as a backing image. - - * Let's assume it is named 'incremental.0.img'. - - ```sh - # qemu-img create -f qcow2 incremental.0.img -b full_backup.img -F qcow2 - ``` - -3. Issue the incremental backup command: - - ```json - { "execute": "drive-backup", - "arguments": { - "device": "drive0", - "bitmap": "bitmap0", - "target": "incremental.0.img", - "format": "qcow2", - "sync": "incremental", - "mode": "existing" - } - } - ``` - -### Example: Second Incremental Backup - -1. Create a new destination image for the incremental backup that points to the - previous one, e.g.: 'incremental.1.img' - - ```sh - # qemu-img create -f qcow2 incremental.1.img -b incremental.0.img -F qcow2 - ``` - -2. Issue a new incremental backup command. The only difference here is that we - have changed the target image below. - - ```json - { "execute": "drive-backup", - "arguments": { - "device": "drive0", - "bitmap": "bitmap0", - "target": "incremental.1.img", - "format": "qcow2", - "sync": "incremental", - "mode": "existing" - } - } - ``` - -## Errors - -* In the event of an error that occurs after a backup job is successfully - launched, either by a direct QMP command or a QMP transaction, the user - will receive a BLOCK_JOB_COMPLETE event with a failure message, accompanied - by a BLOCK_JOB_ERROR event. - -* In the case of an event being cancelled, the user will receive a - BLOCK_JOB_CANCELLED event instead of a pair of COMPLETE and ERROR events. - -* In either case, the incremental backup data contained within the bitmap is - safely rolled back, and the data within the bitmap is not lost. The image - file created for the failed attempt can be safely deleted. - -* Once the underlying problem is fixed (e.g. more storage space is freed up), - you can simply retry the incremental backup command with the same bitmap. - -### Example - -1. 
Create a target image: - - ```sh - # qemu-img create -f qcow2 incremental.0.img -b full_backup.img -F qcow2 - ``` - -2. Attempt to create an incremental backup via QMP: - - ```json - { "execute": "drive-backup", - "arguments": { - "device": "drive0", - "bitmap": "bitmap0", - "target": "incremental.0.img", - "format": "qcow2", - "sync": "incremental", - "mode": "existing" - } - } - ``` - -3. Receive an event notifying us of failure: - - ```json - { "timestamp": { "seconds": 1424709442, "microseconds": 844524 }, - "data": { "speed": 0, "offset": 0, "len": 67108864, - "error": "No space left on device", - "device": "drive1", "type": "backup" }, - "event": "BLOCK_JOB_COMPLETED" } - ``` - -4. Delete the failed incremental, and re-create the image. - - ```sh - # rm incremental.0.img - # qemu-img create -f qcow2 incremental.0.img -b full_backup.img -F qcow2 - ``` - -5. Retry the command after fixing the underlying problem, - such as freeing up space on the backup volume: - - ```json - { "execute": "drive-backup", - "arguments": { - "device": "drive0", - "bitmap": "bitmap0", - "target": "incremental.0.img", - "format": "qcow2", - "sync": "incremental", - "mode": "existing" - } - } - ``` - -6. Receive confirmation that the job completed successfully: - - ```json - { "timestamp": { "seconds": 1424709668, "microseconds": 526525 }, - "data": { "device": "drive1", "type": "backup", - "speed": 0, "len": 67108864, "offset": 67108864}, - "event": "BLOCK_JOB_COMPLETED" } - ``` - -### Partial Transactional Failures - -* Sometimes, a transaction will succeed in launching and return success, - but then later the backup jobs themselves may fail. It is possible that - a management application may have to deal with a partial backup failure - after a successful transaction. - -* If multiple backup jobs are specified in a single transaction, when one of - them fails, it will not interact with the other backup jobs in any way. 
- -* The job(s) that succeeded will clear the dirty bitmap associated with the - operation, but the job(s) that failed will not. It is not "safe" to delete - any incremental backups that were created successfully in this scenario, - even though others failed. - -#### Example - -* QMP example highlighting two backup jobs: - - ```json - { "execute": "transaction", - "arguments": { - "actions": [ - { "type": "drive-backup", - "data": { "device": "drive0", "bitmap": "bitmap0", - "format": "qcow2", "mode": "existing", - "sync": "incremental", "target": "d0-incr-1.qcow2" } }, - { "type": "drive-backup", - "data": { "device": "drive1", "bitmap": "bitmap1", - "format": "qcow2", "mode": "existing", - "sync": "incremental", "target": "d1-incr-1.qcow2" } }, - ] - } - } - ``` - -* QMP example response, highlighting one success and one failure: - * Acknowledgement that the Transaction was accepted and jobs were launched: - ```json - { "return": {} } - ``` - - * Later, QEMU sends notice that the first job was completed: - ```json - { "timestamp": { "seconds": 1447192343, "microseconds": 615698 }, - "data": { "device": "drive0", "type": "backup", - "speed": 0, "len": 67108864, "offset": 67108864 }, - "event": "BLOCK_JOB_COMPLETED" - } - ``` - - * Later yet, QEMU sends notice that the second job has failed: - ```json - { "timestamp": { "seconds": 1447192399, "microseconds": 683015 }, - "data": { "device": "drive1", "action": "report", - "operation": "read" }, - "event": "BLOCK_JOB_ERROR" } - ``` - - ```json - { "timestamp": { "seconds": 1447192399, "microseconds": 685853 }, - "data": { "speed": 0, "offset": 0, "len": 67108864, - "error": "Input/output error", - "device": "drive1", "type": "backup" }, - "event": "BLOCK_JOB_COMPLETED" } - -* In the above example, "d0-incr-1.qcow2" is valid and must be kept, - but "d1-incr-1.qcow2" is invalid and should be deleted. 
If a VM-wide - incremental backup of all drives at a point-in-time is to be made, - new backups for both drives will need to be made, taking into account - that a new incremental backup for drive0 needs to be based on top of - "d0-incr-1.qcow2." - -### Grouped Completion Mode - -* While jobs launched by transactions normally complete or fail on their own, - it is possible to instruct them to complete or fail together as a group. - -* QMP transactions take an optional properties structure that can affect - the semantics of the transaction. - -* The "completion-mode" transaction property can be either "individual" - which is the default, legacy behavior described above, or "grouped," - a new behavior detailed below. - -* Delayed Completion: In grouped completion mode, no jobs will report - success until all jobs are ready to report success. - -* Grouped failure: If any job fails in grouped completion mode, all remaining - jobs will be cancelled. Any incremental backups will restore their dirty - bitmap objects as if no backup command was ever issued. - - * Regardless of if QEMU reports a particular incremental backup job as - CANCELLED or as an ERROR, the in-memory bitmap will be restored. 
- -#### Example - -* Here's the same example scenario from above with the new property: - - ```json - { "execute": "transaction", - "arguments": { - "actions": [ - { "type": "drive-backup", - "data": { "device": "drive0", "bitmap": "bitmap0", - "format": "qcow2", "mode": "existing", - "sync": "incremental", "target": "d0-incr-1.qcow2" } }, - { "type": "drive-backup", - "data": { "device": "drive1", "bitmap": "bitmap1", - "format": "qcow2", "mode": "existing", - "sync": "incremental", "target": "d1-incr-1.qcow2" } }, - ], - "properties": { - "completion-mode": "grouped" - } - } - } - ``` - -* QMP example response, highlighting a failure for drive2: - * Acknowledgement that the Transaction was accepted and jobs were launched: - ```json - { "return": {} } - ``` - - * Later, QEMU sends notice that the second job has errored out, - but that the first job was also cancelled: - ```json - { "timestamp": { "seconds": 1447193702, "microseconds": 632377 }, - "data": { "device": "drive1", "action": "report", - "operation": "read" }, - "event": "BLOCK_JOB_ERROR" } - ``` - - ```json - { "timestamp": { "seconds": 1447193702, "microseconds": 640074 }, - "data": { "speed": 0, "offset": 0, "len": 67108864, - "error": "Input/output error", - "device": "drive1", "type": "backup" }, - "event": "BLOCK_JOB_COMPLETED" } - ``` - - ```json - { "timestamp": { "seconds": 1447193702, "microseconds": 640163 }, - "data": { "device": "drive0", "type": "backup", "speed": 0, - "len": 67108864, "offset": 16777216 }, - "event": "BLOCK_JOB_CANCELLED" } - ``` - -<!-- -The FreeBSD Documentation License - -Redistribution and use in source (Markdown) and 'compiled' forms (SGML, HTML, -PDF, PostScript, RTF and so forth) with or without modification, are permitted -provided that the following conditions are met: - -Redistributions of source code (Markdown) must retain the above copyright -notice, this list of conditions and the following disclaimer of this file -unmodified. 
- -Redistributions in compiled form (transformed to other DTDs, converted to PDF, -PostScript, RTF and other formats) must reproduce the above copyright notice, -this list of conditions and the following disclaimer in the documentation and/or -other materials provided with the distribution. - -THIS DOCUMENTATION IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE -DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE -FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL -DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR -SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER -CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, -OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF -THIS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ---> diff --git a/docs/interop/bitmaps.rst b/docs/interop/bitmaps.rst new file mode 100644 index 0000000..7bcfe7f --- /dev/null +++ b/docs/interop/bitmaps.rst @@ -0,0 +1,555 @@ +.. + Copyright 2015 John Snow <jsnow@redhat.com> and Red Hat, Inc. + All rights reserved. + + This file is licensed via The FreeBSD Documentation License, the full + text of which is included at the end of this document. + +==================================== +Dirty Bitmaps and Incremental Backup +==================================== + +- Dirty Bitmaps are objects that track which data needs to be backed up + for the next incremental backup. + +- Dirty bitmaps can be created at any time and attached to any node + (not just complete drives). + +.. contents:: + +Dirty Bitmap Names +------------------ + +- A dirty bitmap's name is unique to the node, but bitmaps attached to + different nodes can share the same name. 
+ +- Dirty bitmaps created for internal use by QEMU may be anonymous and + have no name, but any user-created bitmaps must have a name. There + can be any number of anonymous bitmaps per node. + +- The name of a user-created bitmap must not be empty (""). + +Bitmap Modes +------------ + +- A bitmap can be "frozen," which means that it is currently in-use by + a backup operation and cannot be deleted, renamed, written to, reset, + etc. + +- The normal operating mode for a bitmap is "active." + +Basic QMP Usage +--------------- + +Supported Commands +~~~~~~~~~~~~~~~~~~ + +- ``block-dirty-bitmap-add`` +- ``block-dirty-bitmap-remove`` +- ``block-dirty-bitmap-clear`` + +Creation +~~~~~~~~ + +- To create a new bitmap, enabled, on the drive with id=drive0: + +.. code:: json + + { "execute": "block-dirty-bitmap-add", + "arguments": { + "node": "drive0", + "name": "bitmap0" + } + } + +- This bitmap will have a default granularity that matches the cluster + size of its associated drive, if available, clamped to between [4KiB, + 64KiB]. The current default for qcow2 is 64KiB. + +- To create a new bitmap that tracks changes in 32KiB segments: + +.. code:: json + + { "execute": "block-dirty-bitmap-add", + "arguments": { + "node": "drive0", + "name": "bitmap0", + "granularity": 32768 + } + } + +Deletion +~~~~~~~~ + +- Bitmaps that are frozen cannot be deleted. + +- Deleting the bitmap does not impact any other bitmaps attached to the + same node, nor does it affect any backups already created from this + node. + +- Because bitmaps are only unique to the node to which they are + attached, you must specify the node/drive name here, too. + +.. code:: json + + { "execute": "block-dirty-bitmap-remove", + "arguments": { + "node": "drive0", + "name": "bitmap0" + } + } + +Resetting +~~~~~~~~~ + +- Resetting a bitmap will clear all information it holds. + +- An incremental backup created from an empty bitmap will copy no data, + as if nothing has changed. + +.. 
code:: json + + { "execute": "block-dirty-bitmap-clear", + "arguments": { + "node": "drive0", + "name": "bitmap0" + } + } + +Transactions +------------ + +Justification +~~~~~~~~~~~~~ + +Bitmaps can be safely modified when the VM is paused or halted by using +the basic QMP commands. For instance, you might perform the following +actions: + +1. Boot the VM in a paused state. +2. Create a full drive backup of drive0. +3. Create a new bitmap attached to drive0. +4. Resume execution of the VM. +5. Incremental backups are ready to be created. + +At this point, the bitmap and drive backup would be correctly in sync, +and incremental backups made from this point forward would be correctly +aligned to the full drive backup. + +This is not particularly useful if we decide we want to start +incremental backups after the VM has been running for a while, for which +we will need to perform actions such as the following: + +1. Boot the VM and begin execution. +2. Using a single transaction, perform the following operations: + + - Create ``bitmap0``. + - Create a full drive backup of ``drive0``. + +3. Incremental backups are now ready to be created. + +Supported Bitmap Transactions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- ``block-dirty-bitmap-add`` +- ``block-dirty-bitmap-clear`` + +The usages are identical to their respective QMP commands, but see below +for examples. + +Example: New Incremental Backup +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As outlined in the justification, perhaps we want to create a new +incremental backup chain attached to a drive. + +.. 
code:: json + + { "execute": "transaction", + "arguments": { + "actions": [ + {"type": "block-dirty-bitmap-add", + "data": {"node": "drive0", "name": "bitmap0"} }, + {"type": "drive-backup", + "data": {"device": "drive0", "target": "/path/to/full_backup.img", + "sync": "full", "format": "qcow2"} } + ] + } + } + +Example: New Incremental Backup Anchor Point +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Maybe we just want to create a new full backup with an existing bitmap +and want to reset the bitmap to track the new chain. + +.. code:: json + + { "execute": "transaction", + "arguments": { + "actions": [ + {"type": "block-dirty-bitmap-clear", + "data": {"node": "drive0", "name": "bitmap0"} }, + {"type": "drive-backup", + "data": {"device": "drive0", "target": "/path/to/new_full_backup.img", + "sync": "full", "format": "qcow2"} } + ] + } + } + +Incremental Backups +------------------- + +The star of the show. + +**Nota Bene!** Only incremental backups of entire drives are supported +for now. So despite the fact that you can attach a bitmap to any +arbitrary node, they are only currently useful when attached to the root +node. This is because drive-backup only supports drives/devices instead +of arbitrary nodes. + +Example: First Incremental Backup +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +1. Create a full backup and sync it to the dirty bitmap, as in the + transactional examples above; or with the VM offline, manually create + a full copy and then create a new bitmap before the VM begins + execution. + + - Let's assume the full backup is named ``full_backup.img``. + - Let's assume the bitmap you created is ``bitmap0`` attached to + ``drive0``. + +2. Create a destination image for the incremental backup that utilizes + the full backup as a backing image. + + - Let's assume the new incremental image is named + ``incremental.0.img``. + + .. code:: bash + + $ qemu-img create -f qcow2 incremental.0.img -b full_backup.img -F qcow2 + +3. 
Issue the incremental backup command: + + .. code:: json + + { "execute": "drive-backup", + "arguments": { + "device": "drive0", + "bitmap": "bitmap0", + "target": "incremental.0.img", + "format": "qcow2", + "sync": "incremental", + "mode": "existing" + } + } + +Example: Second Incremental Backup +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +1. Create a new destination image for the incremental backup that points + to the previous one, e.g.: ``incremental.1.img`` + + .. code:: bash + + $ qemu-img create -f qcow2 incremental.1.img -b incremental.0.img -F qcow2 + +2. Issue a new incremental backup command. The only difference here is + that we have changed the target image below. + + .. code:: json + + { "execute": "drive-backup", + "arguments": { + "device": "drive0", + "bitmap": "bitmap0", + "target": "incremental.1.img", + "format": "qcow2", + "sync": "incremental", + "mode": "existing" + } + } + +Errors +------ + +- In the event of an error that occurs after a backup job is + successfully launched, either by a direct QMP command or a QMP + transaction, the user will receive a ``BLOCK_JOB_COMPLETE`` event with + a failure message, accompanied by a ``BLOCK_JOB_ERROR`` event. + +- In the case of an event being cancelled, the user will receive a + ``BLOCK_JOB_CANCELLED`` event instead of a pair of COMPLETE and ERROR + events. + +- In either case, the incremental backup data contained within the + bitmap is safely rolled back, and the data within the bitmap is not + lost. The image file created for the failed attempt can be safely + deleted. + +- Once the underlying problem is fixed (e.g. more storage space is + freed up), you can simply retry the incremental backup command with + the same bitmap. + +Example +~~~~~~~ + +1. Create a target image: + + .. code:: bash + + $ qemu-img create -f qcow2 incremental.0.img -b full_backup.img -F qcow2 + +2. Attempt to create an incremental backup via QMP: + + .. 
code:: json + + { "execute": "drive-backup", + "arguments": { + "device": "drive0", + "bitmap": "bitmap0", + "target": "incremental.0.img", + "format": "qcow2", + "sync": "incremental", + "mode": "existing" + } + } + +3. Receive an event notifying us of failure: + + .. code:: json + + { "timestamp": { "seconds": 1424709442, "microseconds": 844524 }, + "data": { "speed": 0, "offset": 0, "len": 67108864, + "error": "No space left on device", + "device": "drive1", "type": "backup" }, + "event": "BLOCK_JOB_COMPLETED" } + +4. Delete the failed incremental, and re-create the image. + + .. code:: bash + + $ rm incremental.0.img + $ qemu-img create -f qcow2 incremental.0.img -b full_backup.img -F qcow2 + +5. Retry the command after fixing the underlying problem, such as + freeing up space on the backup volume: + + .. code:: json + + { "execute": "drive-backup", + "arguments": { + "device": "drive0", + "bitmap": "bitmap0", + "target": "incremental.0.img", + "format": "qcow2", + "sync": "incremental", + "mode": "existing" + } + } + +6. Receive confirmation that the job completed successfully: + + .. code:: json + + { "timestamp": { "seconds": 1424709668, "microseconds": 526525 }, + "data": { "device": "drive1", "type": "backup", + "speed": 0, "len": 67108864, "offset": 67108864}, + "event": "BLOCK_JOB_COMPLETED" } + +Partial Transactional Failures +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- Sometimes, a transaction will succeed in launching and return + success, but then later the backup jobs themselves may fail. It is + possible that a management application may have to deal with a + partial backup failure after a successful transaction. + +- If multiple backup jobs are specified in a single transaction, when + one of them fails, it will not interact with the other backup jobs in + any way. + +- The job(s) that succeeded will clear the dirty bitmap associated with + the operation, but the job(s) that failed will not. 
It is not "safe" + to delete any incremental backups that were created successfully in + this scenario, even though others failed. + +Example +^^^^^^^ + +- QMP example highlighting two backup jobs: + + .. code:: json + + { "execute": "transaction", + "arguments": { + "actions": [ + { "type": "drive-backup", + "data": { "device": "drive0", "bitmap": "bitmap0", + "format": "qcow2", "mode": "existing", + "sync": "incremental", "target": "d0-incr-1.qcow2" } }, + { "type": "drive-backup", + "data": { "device": "drive1", "bitmap": "bitmap1", + "format": "qcow2", "mode": "existing", + "sync": "incremental", "target": "d1-incr-1.qcow2" } }, + ] + } + } + +- QMP example response, highlighting one success and one failure: + + - Acknowledgement that the Transaction was accepted and jobs were + launched: + + .. code:: json + + { "return": {} } + + - Later, QEMU sends notice that the first job was completed: + + .. code:: json + + { "timestamp": { "seconds": 1447192343, "microseconds": 615698 }, + "data": { "device": "drive0", "type": "backup", + "speed": 0, "len": 67108864, "offset": 67108864 }, + "event": "BLOCK_JOB_COMPLETED" + } + + - Later yet, QEMU sends notice that the second job has failed: + + .. code:: json + + { "timestamp": { "seconds": 1447192399, "microseconds": 683015 }, + "data": { "device": "drive1", "action": "report", + "operation": "read" }, + "event": "BLOCK_JOB_ERROR" } + + .. code:: json + + { "timestamp": { "seconds": 1447192399, "microseconds": + 685853 }, "data": { "speed": 0, "offset": 0, "len": 67108864, + "error": "Input/output error", "device": "drive1", "type": + "backup" }, "event": "BLOCK_JOB_COMPLETED" } + +- In the above example, ``d0-incr-1.qcow2`` is valid and must be kept, + but ``d1-incr-1.qcow2`` is invalid and should be deleted. 
If a VM-wide + incremental backup of all drives at a point-in-time is to be made, + new backups for both drives will need to be made, taking into account + that a new incremental backup for drive0 needs to be based on top of + ``d0-incr-1.qcow2``. + +Grouped Completion Mode +~~~~~~~~~~~~~~~~~~~~~~~ + +- While jobs launched by transactions normally complete or fail on + their own, it is possible to instruct them to complete or fail + together as a group. + +- QMP transactions take an optional properties structure that can + affect the semantics of the transaction. + +- The "completion-mode" transaction property can be either "individual" + which is the default, legacy behavior described above, or "grouped," + a new behavior detailed below. + +- Delayed Completion: In grouped completion mode, no jobs will report + success until all jobs are ready to report success. + +- Grouped failure: If any job fails in grouped completion mode, all + remaining jobs will be cancelled. Any incremental backups will + restore their dirty bitmap objects as if no backup command was ever + issued. + + - Regardless of if QEMU reports a particular incremental backup job + as CANCELLED or as an ERROR, the in-memory bitmap will be + restored. + +Example +^^^^^^^ + +- Here's the same example scenario from above with the new property: + + .. code:: json + + { "execute": "transaction", + "arguments": { + "actions": [ + { "type": "drive-backup", + "data": { "device": "drive0", "bitmap": "bitmap0", + "format": "qcow2", "mode": "existing", + "sync": "incremental", "target": "d0-incr-1.qcow2" } }, + { "type": "drive-backup", + "data": { "device": "drive1", "bitmap": "bitmap1", + "format": "qcow2", "mode": "existing", + "sync": "incremental", "target": "d1-incr-1.qcow2" } }, + ], + "properties": { + "completion-mode": "grouped" + } + } + } + +- QMP example response, highlighting a failure for ``drive2``: + + - Acknowledgement that the Transaction was accepted and jobs were + launched: + + .. 
code:: json + + { "return": {} } + + - Later, QEMU sends notice that the second job has errored out, but + that the first job was also cancelled: + + .. code:: json + + { "timestamp": { "seconds": 1447193702, "microseconds": 632377 }, + "data": { "device": "drive1", "action": "report", + "operation": "read" }, + "event": "BLOCK_JOB_ERROR" } + + .. code:: json + + { "timestamp": { "seconds": 1447193702, "microseconds": 640074 }, + "data": { "speed": 0, "offset": 0, "len": 67108864, + "error": "Input/output error", + "device": "drive1", "type": "backup" }, + "event": "BLOCK_JOB_COMPLETED" } + + .. code:: json + + { "timestamp": { "seconds": 1447193702, "microseconds": 640163 }, + "data": { "device": "drive0", "type": "backup", "speed": 0, + "len": 67108864, "offset": 16777216 }, + "event": "BLOCK_JOB_CANCELLED" } + +.. raw:: html + + <!-- + The FreeBSD Documentation License + + Redistribution and use in source (Markdown) and 'compiled' forms (SGML, HTML, + PDF, PostScript, RTF and so forth) with or without modification, are permitted + provided that the following conditions are met: + + Redistributions of source code (Markdown) must retain the above copyright + notice, this list of conditions and the following disclaimer of this file + unmodified. + + Redistributions in compiled form (transformed to other DTDs, converted to PDF, + PostScript, RTF and other formats) must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation and/or + other materials provided with the distribution. + + THIS DOCUMENTATION IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE + FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + THIS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + --> diff --git a/docs/interop/live-block-operations.rst b/docs/interop/live-block-operations.rst new file mode 100644 index 0000000..5f01797 --- /dev/null +++ b/docs/interop/live-block-operations.rst @@ -0,0 +1,1088 @@ +.. + Copyright (C) 2017 Red Hat Inc. + + This work is licensed under the terms of the GNU GPL, version 2 or + later. See the COPYING file in the top-level directory. + +============================ +Live Block Device Operations +============================ + +QEMU Block Layer currently (as of QEMU 2.9) supports four major kinds of +live block device jobs -- stream, commit, mirror, and backup. These can +be used to manipulate disk image chains to accomplish certain tasks, +namely: live copy data from backing files into overlays; shorten long +disk image chains by merging data from overlays into backing files; live +synchronize data from a disk image chain (including current active disk) +to another target image; and point-in-time (and incremental) backups of +a block device. Below is a description of the said block (QMP) +primitives, and some (non-exhaustive list of) examples to illustrate +their use. + +.. note:: + The file ``qapi/block-core.json`` in the QEMU source tree has the + canonical QEMU API (QAPI) schema documentation for the QMP + primitives discussed here. + +.. todo (kashyapc):: Remove the ".. contents::" directive when Sphinx is + integrated. + +.. 
contents::
+
+Disk image backing chain notation
+---------------------------------
+
+A simple disk image chain. (This can be created live using QMP
+``blockdev-snapshot-sync``, or offline via ``qemu-img``)::
+
+                   (Live QEMU)
+                        |
+                        .
+                        V
+
+            [A] <----- [B]
+
+    (backing file)    (overlay)
+
+The arrow can be read as: Image [A] is the backing file of disk image
+[B]. Live QEMU is currently writing to image [B]; consequently, it
+is also referred to as the "active layer".
+
+There are two kinds of terminology that are common when referring to
+files in a disk image backing chain:
+
+(1) Directional: 'base' and 'top'. Given the simple disk image chain
+    above, image [A] can be referred to as 'base', and image [B] as
+    'top'. (This terminology can be seen in the QAPI schema file,
+    block-core.json.)
+
+(2) Relational: 'backing file' and 'overlay'. Again, taking the same
+    simple disk image chain from the above, disk image [A] is referred
+    to as the backing file, and image [B] as overlay.
+
+    Throughout this document, we will use the relational terminology.
+
+.. important::
+    The overlay files can generally be any format that supports a
+    backing file, although QCOW2 is the preferred format and the one
+    used in this document.
+
+
+Brief overview of live block QMP primitives
+-------------------------------------------
+
+The following are the four different kinds of live block operations that
+QEMU block layer supports.
+
+(1) ``block-stream``: Live copy of data from backing files into overlay
+    files.
+
+    .. note:: Once the 'stream' operation has finished, three things to
+              note:
+
+                (a) QEMU rewrites the backing chain to remove
+                    reference to the now-streamed and redundant backing
+                    file;
+
+                (b) the streamed file *itself* won't be removed by QEMU,
+                    and must be explicitly discarded by the user;
+
+                (c) the streamed file remains valid -- i.e. further
+                    overlays can be created based on it. Refer to the
+                    ``block-stream`` section further below for more
+                    details.
+
+(2) ``block-commit``: Live merge of data from overlay files into backing
+    files (with the optional goal of removing the overlay file from the
+    chain). Since QEMU 2.0, this includes "active ``block-commit``"
+    (i.e. merge the current active layer into the base image).
+
+    .. note:: Once the 'commit' operation has finished, there are three
+              things to note here as well:
+
+                (a) QEMU rewrites the backing chain to remove reference
+                    to now-redundant overlay images that have been
+                    committed into a backing file;
+
+                (b) the committed file *itself* won't be removed by QEMU
+                    -- it ought to be manually removed;
+
+                (c) however, unlike in the case of ``block-stream``, the
+                    intermediate images will be rendered invalid -- i.e.
+                    no further overlays can be created based on
+                    them. Refer to the ``block-commit`` section further
+                    below for more details.
+
+(3) ``drive-mirror`` (and ``blockdev-mirror``): Synchronize a running
+    disk to another image.
+
+(4) ``drive-backup`` (and ``blockdev-backup``): Point-in-time (live) copy
+    of a block device to a destination.
+
+
+.. _`Interacting with a QEMU instance`:
+
+Interacting with a QEMU instance
+--------------------------------
+
+To show some example command-line invocations, we will use the
+following invocation of QEMU, with a QMP server running over a UNIX
+socket::
+
+    $ ./x86_64-softmmu/qemu-system-x86_64 -display none -nodefconfig \
+        -M q35 -nodefaults -m 512 \
+        -blockdev node-name=node-A,driver=qcow2,file.driver=file,file.node-name=file,file.filename=./a.qcow2 \
+        -device virtio-blk,drive=node-A,id=virtio0 \
+        -monitor stdio -qmp unix:/tmp/qmp-sock,server,nowait
+
+The ``-blockdev`` command-line option, used above, is available from
+QEMU 2.9 onwards. In the above invocation, notice the ``node-name``
+parameter that is used to refer to the disk image a.qcow2 ('node-A') --
+this is a cleaner way to refer to a disk image (as opposed to referring
+to it by spelling out file paths).
So, we will continue to designate a +``node-name`` to each further disk image created (either via +``blockdev-snapshot-sync``, or ``blockdev-add``) as part of the disk +image chain, and continue to refer to the disks using their +``node-name`` (where possible, because ``block-commit`` does not yet, as +of QEMU 2.9, accept ``node-name`` parameter) when performing various +block operations. + +To interact with the QEMU instance launched above, we will use the +``qmp-shell`` utility (located at: ``qemu/scripts/qmp``, as part of the +QEMU source directory), which takes key-value pairs for QMP commands. +Invoke it as below (which will also print out the complete raw JSON +syntax for reference -- examples in the following sections):: + + $ ./qmp-shell -v -p /tmp/qmp-sock + (QEMU) + +.. note:: + In the event we have to repeat a certain QMP command, we will: for + the first occurrence of it, show the ``qmp-shell`` invocation, *and* + the corresponding raw JSON QMP syntax; but for subsequent + invocations, present just the ``qmp-shell`` syntax, and omit the + equivalent JSON output. + + +Example disk image chain +------------------------ + +We will use the below disk image chain (and occasionally spelling it +out where appropriate) when discussing various primitives:: + + [A] <-- [B] <-- [C] <-- [D] + +Where [A] is the original base image; [B] and [C] are intermediate +overlay images; image [D] is the active layer -- i.e. live QEMU is +writing to it. (The rule of thumb is: live QEMU will always be pointing +to the rightmost image in a disk image chain.) 
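As a rough illustration of the ``qmp-shell`` key-value syntax described above, the following Python sketch turns a key-value command line into the corresponding raw QMP JSON. It is a simplified stand-in, not the actual ``qmp-shell`` implementation: the helper name ``kv_to_qmp`` is hypothetical, dotted (nested) keys are not handled, and values are assumed to contain no spaces.

```python
import json


def kv_to_qmp(line):
    """Convert 'command key=value ...' into a QMP command dictionary.

    Hypothetical helper for illustration only; values must not contain
    spaces, and nested (dotted) keys are not supported.
    """
    command, *pairs = line.split()
    arguments = {}
    for pair in pairs:
        key, _, raw = pair.partition("=")
        try:
            # Values that are valid JSON (objects, numbers, true/false)
            # are decoded into the matching Python type.
            arguments[key] = json.loads(raw)
        except json.JSONDecodeError:
            # Anything else, e.g. node-A or b.qcow2, stays a plain string.
            arguments[key] = raw
    return {"execute": command, "arguments": arguments}


if __name__ == "__main__":
    cmd = kv_to_qmp(
        "blockdev-snapshot-sync node-name=node-A snapshot-file=b.qcow2 "
        "snapshot-node-name=node-B format=qcow2"
    )
    print(json.dumps(cmd, indent=4))
```

Note that a bare number such as ``49153`` would be decoded as an integer by this sketch; whether a given QMP argument expects a string or a number is defined by the QAPI schema, which the sketch does not consult.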
+
+The example disk image chain can be created by invoking
+``blockdev-snapshot-sync`` commands as follows (which shows the
+creation of overlay image [B]) using the ``qmp-shell`` (our invocation
+also prints the raw JSON invocation of it)::
+
+    (QEMU) blockdev-snapshot-sync node-name=node-A snapshot-file=b.qcow2 snapshot-node-name=node-B format=qcow2
+    {
+        "execute": "blockdev-snapshot-sync",
+        "arguments": {
+            "node-name": "node-A",
+            "snapshot-file": "b.qcow2",
+            "format": "qcow2",
+            "snapshot-node-name": "node-B"
+        }
+    }
+
+Here, "node-A" is the name QEMU internally uses to refer to the base
+image [A] -- it is the backing file, based on which the overlay image,
+[B], is created.
+
+To create the rest of the overlay images, [C], and [D] (omitting the raw
+JSON output for brevity)::
+
+    (QEMU) blockdev-snapshot-sync node-name=node-B snapshot-file=c.qcow2 snapshot-node-name=node-C format=qcow2
+    (QEMU) blockdev-snapshot-sync node-name=node-C snapshot-file=d.qcow2 snapshot-node-name=node-D format=qcow2
+
+
+A note on points-in-time vs file names
+--------------------------------------
+
+In our disk image chain::
+
+    [A] <-- [B] <-- [C] <-- [D]
+
+We have *three* points in time and an active layer:
+
+- Point 1: Guest state when [B] was created is contained in file [A]
+- Point 2: Guest state when [C] was created is contained in [A] + [B]
+- Point 3: Guest state when [D] was created is contained in
+  [A] + [B] + [C]
+- Active layer: Current guest state is contained in [A] + [B] + [C] +
+  [D]
+
+Therefore, be careful with naming choices:
+
+- Naming a file after the time it is created is misleading -- the
+  guest data for that point in time is *not* contained in that file
+  (as explained earlier)
+- Rather, think of files as a *delta* from the backing file
+
+
+Live block streaming --- ``block-stream``
+-----------------------------------------
+
+The ``block-stream`` command allows you to live copy data from backing
+files into overlay images.
+ +Given our original example disk image chain from earlier:: + + [A] <-- [B] <-- [C] <-- [D] + +The disk image chain can be shortened in one of the following different +ways (not an exhaustive list). + +.. _`Case-1`: + +(1) Merge everything into the active layer: I.e. copy all contents from + the base image, [A], and overlay images, [B] and [C], into [D], + *while* the guest is running. The resulting chain will be a + standalone image, [D] -- with contents from [A], [B] and [C] merged + into it (where live QEMU writes go to):: + + [D] + +.. _`Case-2`: + +(2) Taking the same example disk image chain mentioned earlier, merge + only images [B] and [C] into [D], the active layer. The result will + be contents of images [B] and [C] will be copied into [D], and the + backing file pointer of image [D] will be adjusted to point to image + [A]. The resulting chain will be:: + + [A] <-- [D] + +.. _`Case-3`: + +(3) Intermediate streaming (available since QEMU 2.8): Starting afresh + with the original example disk image chain, with a total of four + images, it is possible to copy contents from image [B] into image + [C]. Once the copy is finished, image [B] can now be (optionally) + discarded; and the backing file pointer of image [C] will be + adjusted to point to [A]. I.e. 
after performing "intermediate
+    streaming" of [B] into [C], the resulting image chain will be (where
+    live QEMU is writing to [D])::
+
+        [A] <-- [C] <-- [D]
+
+
+QMP invocation for ``block-stream``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For `Case-1`_, to merge contents of all the backing files into the
+active layer, where 'node-D' is the current active image (by default
+``block-stream`` will flatten the entire chain); ``qmp-shell`` (and its
+corresponding JSON output)::
+
+    (QEMU) block-stream device=node-D job-id=job0
+    {
+        "execute": "block-stream",
+        "arguments": {
+            "device": "node-D",
+            "job-id": "job0"
+        }
+    }
+
+For `Case-2`_, merge contents of the images [B] and [C] into [D], where
+image [D] ends up referring to image [A] as its backing file::
+
+    (QEMU) block-stream device=node-D base-node=node-A job-id=job0
+
+And for `Case-3`_, of "intermediate streaming", merge contents of
+image [B] into [C], where [C] ends up referring to [A] as its backing
+image::
+
+    (QEMU) block-stream device=node-C base-node=node-A job-id=job0
+
+Progress of a ``block-stream`` operation can be monitored via the QMP
+command::
+
+    (QEMU) query-block-jobs
+    {
+        "execute": "query-block-jobs",
+        "arguments": {}
+    }
+
+
+Once the ``block-stream`` operation has completed, QEMU will emit an
+event, ``BLOCK_JOB_COMPLETED``. The intermediate overlays remain valid,
+and can now be (optionally) discarded, or retained to create further
+overlays based on them. Finally, the ``block-stream`` jobs can be
+restarted at any time.
+
+
+Live block commit --- ``block-commit``
+--------------------------------------
+
+The ``block-commit`` command lets you merge live data from overlay
+images into backing file(s). Since QEMU 2.0, this includes "live active
+commit" (i.e. it is possible to merge the "active layer", the right-most
+image in a disk image chain where live QEMU will be writing to, into the
+base image). This is analogous to ``block-stream``, but in the opposite
+direction.
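Before moving on to the commit cases, the three ``block-stream`` cases from the previous section can be modelled with a small Python sketch. It describes only the effect on the backing chain (no data is copied and no QEMU is involved). The function name ``stream`` and the list-based chain model are assumptions made for illustration; ``device`` and ``base`` loosely mirror the ``device`` and ``base-node`` arguments of the QMP command.

```python
def stream(chain, device, base=None):
    """Return the backing chain after streaming into `device`.

    `chain` lists images from the base (left) to the active layer (right).
    Images between `base` (exclusive) and `device` (exclusive) are merged
    into `device` and drop out of the chain; without `base`, everything
    below `device` is merged.
    """
    dev_idx = chain.index(device)
    # Number of images at the bottom of the chain that are left untouched.
    keep = chain.index(base) + 1 if base is not None else 0
    if keep > dev_idx:
        raise ValueError("base must be below device in the chain")
    return chain[:keep] + chain[dev_idx:]


if __name__ == "__main__":
    chain = ["A", "B", "C", "D"]
    print(stream(chain, "D"))            # Case 1: flatten everything into [D]
    print(stream(chain, "D", base="A"))  # Case 2: merge [B] and [C] into [D]
    print(stream(chain, "C", base="A"))  # Case 3: intermediate streaming
```

The model deliberately says nothing about what happens to the streamed files on disk; as noted above, QEMU does not remove them.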
+
+Again, starting afresh with our example disk image chain, where live
+QEMU is writing to the right-most image in the chain, [D]::
+
+    [A] <-- [B] <-- [C] <-- [D]
+
+The disk image chain can be shortened in one of the following ways:
+
+.. _`block-commit_Case-1`:
+
+(1) Commit content from only image [B] into image [A]. The resulting
+    chain is the following, where image [C] is adjusted to point at [A]
+    as its new backing file::
+
+        [A] <-- [C] <-- [D]
+
+(2) Commit content from images [B] and [C] into image [A]. The
+    resulting chain, where image [D] is adjusted to point to image [A]
+    as its new backing file::
+
+        [A] <-- [D]
+
+.. _`block-commit_Case-3`:
+
+(3) Commit content from images [B], [C], and the active layer [D] into
+    image [A]. The resulting chain (in this case, a consolidated single
+    image)::
+
+        [A]
+
+(4) Commit content from only image [C] into image [B]. The
+    resulting chain::
+
+        [A] <-- [B] <-- [D]
+
+(5) Commit content from image [C] and the active layer [D] into image
+    [B]. The resulting chain::
+
+        [A] <-- [B]
+
+
+QMP invocation for ``block-commit``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For :ref:`Case-1 <block-commit_Case-1>`, to merge contents only from
+image [B] into image [A], the invocation is as follows::
+
+    (QEMU) block-commit device=node-D base=a.qcow2 top=b.qcow2 job-id=job0
+    {
+        "execute": "block-commit",
+        "arguments": {
+            "device": "node-D",
+            "job-id": "job0",
+            "top": "b.qcow2",
+            "base": "a.qcow2"
+        }
+    }
+
+Once the above ``block-commit`` operation has completed, a
+``BLOCK_JOB_COMPLETED`` event will be issued, and no further action is
+required. As the end result, the backing file of image [C] is adjusted
+to point to image [A], and the original 4-image chain will end up being
+transformed to::
+
+    [A] <-- [C] <-- [D]
+
+.. note::
+    The intermediate image [B] is invalid (as in: no further
+    overlays based on it can be created).
+
+    Reasoning: An intermediate image after a 'stream' operation still
+    represents that old point-in-time, and may be valid in that context.
+    However, an intermediate image after a 'commit' operation no longer
+    represents any point-in-time, and is invalid in any context.
+
+
+However, :ref:`Case-3 <block-commit_Case-3>` (also called: "active
+``block-commit``") is a *two-phase* operation: In the first phase, the
+content from the active overlay, along with the intermediate overlays,
+is copied into the backing file (also called the base image). In the
+second phase, the said backing file is made the current active image
+by issuing the command ``block-job-complete``. Optionally,
+the ``block-commit`` operation can be cancelled by issuing the command
+``block-job-cancel``, but be careful when doing this.
+
+Once the first phase of the ``block-commit`` operation has completed,
+the event ``BLOCK_JOB_READY`` will be emitted, signalling that the
+synchronization has finished. Now the job can be gracefully completed
+by issuing the command ``block-job-complete`` -- until such a command
+is issued, the 'commit' operation remains active.
+
+The following is the flow for :ref:`Case-3 <block-commit_Case-3>` to
+convert a disk image chain such as this::
+
+    [A] <-- [B] <-- [C] <-- [D]
+
+Into::
+
+    [A]
+
+Where content from all the subsequent overlays, [B], and [C], including
+the active layer, [D] (which is where live QEMU is performing all its
+current writes), is committed back to [A].
+
+Start the "active ``block-commit``" operation::
+
+    (QEMU) block-commit device=node-D base=a.qcow2 top=d.qcow2 job-id=job0
+    {
+        "execute": "block-commit",
+        "arguments": {
+            "device": "node-D",
+            "job-id": "job0",
+            "top": "d.qcow2",
+            "base": "a.qcow2"
+        }
+    }
+
+
+Once the synchronization has completed, the event ``BLOCK_JOB_READY`` will
+be emitted.
+
+Then, optionally query for the status of the active block operations.
+We can see the 'commit' job is now ready to be completed, as indicated
+by the line *"ready": true*::
+
+    (QEMU) query-block-jobs
+    {
+        "execute": "query-block-jobs",
+        "arguments": {}
+    }
+    {
+        "return": [
+            {
+                "busy": false,
+                "type": "commit",
+                "len": 1376256,
+                "paused": false,
+                "ready": true,
+                "io-status": "ok",
+                "offset": 1376256,
+                "device": "job0",
+                "speed": 0
+            }
+        ]
+    }
+
+Gracefully complete the 'commit' block device job::
+
+    (QEMU) block-job-complete device=job0
+    {
+        "execute": "block-job-complete",
+        "arguments": {
+            "device": "job0"
+        }
+    }
+    {
+        "return": {}
+    }
+
+Finally, once the above job is completed, an event
+``BLOCK_JOB_COMPLETED`` will be emitted.
+
+.. note::
+    The invocation for the rest of the cases (2, 4, and 5), discussed in
+    the previous section, is omitted for brevity.
+
+
+Live disk synchronization --- ``drive-mirror`` and ``blockdev-mirror``
+----------------------------------------------------------------------
+
+Synchronize a running disk image chain (all or part of it) to a target
+image.
+
+Again, given our familiar disk image chain::
+
+    [A] <-- [B] <-- [C] <-- [D]
+
+The ``drive-mirror`` command (and its newer equivalent
+``blockdev-mirror``) allows you to copy data from the entire chain into
+a single target image (which can be located on a different host).
+
+Once a 'mirror' job has started, there are two possible actions while a
+``drive-mirror`` job is active:
+
+(1) Issuing the command ``block-job-cancel`` after it emits the event
+    ``BLOCK_JOB_CANCELLED``: will (after completing synchronization of
+    the content from the disk image chain to the target image, [E])
+    create a point-in-time (which is at the time of *triggering* the
+    cancel command) copy, contained in image [E], of the entire disk
+    image chain (or only the top-most image, depending on the ``sync``
+    mode).
+
+(2) Issuing the command ``block-job-complete`` after it emits the event
+    ``BLOCK_JOB_COMPLETED``: will, after completing synchronization of
+    the content, adjust the guest device (i.e. live QEMU) to point to
+    the target image, causing all the new writes from this point on
+    to happen there. One use case for this is live storage migration.
+
+About synchronization modes: The synchronization mode determines
+*which* part of the disk image chain will be copied to the target.
+Currently, there are four different kinds:
+
+(1) ``full`` -- Synchronize the content of the entire disk image chain
+    to the target
+
+(2) ``top`` -- Synchronize only the contents of the top-most disk image
+    in the chain to the target
+
+(3) ``none`` -- Synchronize only the new writes from this point on.
+
+    .. note:: In the case of ``drive-backup`` (or ``blockdev-backup``),
+              the behavior of ``none`` synchronization mode is different.
+              Normally, a ``backup`` job consists of two parts: Anything
+              that is overwritten by the guest is first copied out to
+              the backup, and in the background the whole image is
+              copied from start to end. With ``sync=none``, it's only
+              the first part.
+
+(4) ``incremental`` -- Synchronize content that is described by the
+    dirty bitmap
+
+.. note::
+    Refer to the :doc:`bitmaps` document in the QEMU source
+    tree to learn about the detailed workings of the ``incremental``
+    synchronization mode.
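The four synchronization modes listed above can be summarised with a small Python sketch that reports which images of a backing chain have their existing content copied to the target. The helper ``images_copied`` is hypothetical and models only the description in this section, not QEMU's implementation. For ``incremental`` the answer depends on a dirty bitmap rather than on the chain, so the sketch returns ``None`` there.

```python
def images_copied(chain, sync):
    """Return the images whose existing content is copied to the target.

    `chain` lists images from the base (left) to the active layer (right).
    """
    if sync == "full":
        return list(chain)   # the entire disk image chain
    if sync == "top":
        return chain[-1:]    # only the top-most image
    if sync == "none":
        return []            # nothing existing; only new writes are synchronized
    if sync == "incremental":
        return None          # determined by the dirty bitmap, not by the chain
    raise ValueError(f"unknown sync mode: {sync!r}")


if __name__ == "__main__":
    chain = ["A", "B", "C", "D"]
    for mode in ("full", "top", "none", "incremental"):
        print(mode, images_copied(chain, mode))
```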
+
+
+QMP invocation for ``drive-mirror``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To copy the contents of the entire disk image chain, from [A] all the
+way to [D], to a new target (``drive-mirror`` will create the destination
+file, if it doesn't already exist), call it [E]::
+
+    (QEMU) drive-mirror device=node-D target=e.qcow2 sync=full job-id=job0
+    {
+        "execute": "drive-mirror",
+        "arguments": {
+            "device": "node-D",
+            "job-id": "job0",
+            "target": "e.qcow2",
+            "sync": "full"
+        }
+    }
+
+The ``"sync": "full"``, from the above, means: copy the *entire* chain
+to the destination.
+
+Following the above, querying for active block jobs will show that a
+'mirror' job is "ready" to be completed (and QEMU will also emit an
+event, ``BLOCK_JOB_READY``)::
+
+    (QEMU) query-block-jobs
+    {
+        "execute": "query-block-jobs",
+        "arguments": {}
+    }
+    {
+        "return": [
+            {
+                "busy": false,
+                "type": "mirror",
+                "len": 21757952,
+                "paused": false,
+                "ready": true,
+                "io-status": "ok",
+                "offset": 21757952,
+                "device": "job0",
+                "speed": 0
+            }
+        ]
+    }
+
+And, as noted in the previous section, there are two possible actions
+at this point:
+
+(a) Create a point-in-time snapshot by ending the synchronization. The
+    point-in-time is at the time of *ending* the sync. (The result of
+    the following being: the target image, [E], will be populated with
+    content from the entire chain, [A] to [D])::
+
+        (QEMU) block-job-cancel device=job0
+        {
+            "execute": "block-job-cancel",
+            "arguments": {
+                "device": "job0"
+            }
+        }
+
+(b) Or, complete the operation and pivot the live QEMU to the target
+    copy::
+
+        (QEMU) block-job-complete device=job0
+
+In either of the above cases, if you once again run the
+``query-block-jobs`` command, there should not be any active block
+operation.
+
+Comparing 'commit' and 'mirror': In both cases, the overlay images
+can be discarded.
However, with 'commit', the *existing* base image
+will be modified (by updating it with contents from overlays); while in
+the case of 'mirror', a *new* target image is populated with the data
+from the disk image chain.
+
+
+QMP invocation for live storage migration with ``drive-mirror`` + NBD
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Live storage migration (without shared storage setup) is one of the most
+common use-cases that takes advantage of the ``drive-mirror`` primitive
+and QEMU's built-in Network Block Device (NBD) server. Here's a quick
+walk-through of this setup.
+
+Given the disk image chain::
+
+    [A] <-- [B] <-- [C] <-- [D]
+
+Instead of copying content from the entire chain, synchronize *only* the
+contents of the *top*-most disk image (i.e. the active layer), [D], to a
+target, say, [TargetDisk].
+
+.. important::
+    The destination host must already have the contents of the backing
+    chain, involving images [A], [B], and [C], visible via other means
+    -- whether by ``cp``, ``rsync``, or by some storage array-specific
+    command.
+
+Sometimes, this is also referred to as "shallow copy" -- because only
+the "active layer", and not the rest of the image chain, is copied to
+the destination.
+
+.. note::
+    In this example, for the sake of simplicity, we'll be using the same
+    ``localhost`` as both source and destination.
+
+As noted earlier, on the destination host the contents of the backing
+chain -- from images [A] to [C] -- are already expected to exist in some
+form (e.g. in a file called ``Contents-of-A-B-C.qcow2``).
Now, on the +destination host, let's create a target overlay image (with the image +``Contents-of-A-B-C.qcow2`` as its backing file), to which the contents +of image [D] (from the source QEMU) will be mirrored to:: + + $ qemu-img create -f qcow2 -b ./Contents-of-A-B-C.qcow2 \ + -F qcow2 ./target-disk.qcow2 + +And start the destination QEMU (we already have the source QEMU running +-- discussed in the section: `Interacting with a QEMU instance`_) +instance, with the following invocation. (As noted earlier, for +simplicity's sake, the destination QEMU is started on the same host, but +it could be located elsewhere):: + + $ ./x86_64-softmmu/qemu-system-x86_64 -display none -nodefconfig \ + -M q35 -nodefaults -m 512 \ + -blockdev node-name=node-TargetDisk,driver=qcow2,file.driver=file,file.node-name=file,file.filename=./target-disk.qcow2 \ + -device virtio-blk,drive=node-TargetDisk,id=virtio0 \ + -S -monitor stdio -qmp unix:./qmp-sock2,server,nowait \ + -incoming tcp:localhost:6666 + +Given the disk image chain on source QEMU:: + + [A] <-- [B] <-- [C] <-- [D] + +On the destination host, it is expected that the contents of the chain +``[A] <-- [B] <-- [C]`` are *already* present, and therefore copy *only* +the content of image [D]. 
+
+(1) [On *destination* QEMU] As part of the first step, start the
+    built-in NBD server on a given host (here ``::``, the IPv6 wildcard
+    address, i.e. listen on all interfaces) and port::
+
+        (QEMU) nbd-server-start addr={"type":"inet","data":{"host":"::","port":"49153"}}
+        {
+            "execute": "nbd-server-start",
+            "arguments": {
+                "addr": {
+                    "data": {
+                        "host": "::",
+                        "port": "49153"
+                    },
+                    "type": "inet"
+                }
+            }
+        }
+
+(2) [On *destination* QEMU] And export the destination disk image using
+    QEMU's built-in NBD server::
+
+        (QEMU) nbd-server-add device=node-TargetDisk writable=true
+        {
+            "execute": "nbd-server-add",
+            "arguments": {
+                "device": "node-TargetDisk",
+                "writable": true
+            }
+        }
+
+(3) [On *source* QEMU] Then, invoke ``drive-mirror`` with
+    ``mode=existing`` (meaning: synchronize to a pre-created, therefore
+    'existing', file on the target host), and with the synchronization
+    mode as 'top' (``"sync": "top"``)::
+
+        (QEMU) drive-mirror device=node-D target=nbd:localhost:49153:exportname=node-TargetDisk sync=top mode=existing job-id=job0
+        {
+            "execute": "drive-mirror",
+            "arguments": {
+                "device": "node-D",
+                "mode": "existing",
+                "job-id": "job0",
+                "target": "nbd:localhost:49153:exportname=node-TargetDisk",
+                "sync": "top"
+            }
+        }
+
+(4) [On *source* QEMU] Once ``drive-mirror`` has copied all the data, and
+    the event ``BLOCK_JOB_READY`` is emitted, issue ``block-job-cancel``
+    to gracefully end the synchronization, from source QEMU::
+
+        (QEMU) block-job-cancel device=job0
+        {
+            "execute": "block-job-cancel",
+            "arguments": {
+                "device": "job0"
+            }
+        }
+
+(5) [On *destination* QEMU] Then, stop the NBD server::
+
+        (QEMU) nbd-server-stop
+        {
+            "execute": "nbd-server-stop",
+            "arguments": {}
+        }
+
+(6) [On *destination* QEMU] Finally, resume the guest vCPUs by issuing the
+    QMP command ``cont``::
+
+        (QEMU) cont
+        {
+            "execute": "cont",
+            "arguments": {}
+        }
+
+.. note::
+    Higher-level libraries (e.g.
libvirt) automate the entire above
+    process (although note that libvirt does not allow same-host
+    migrations to localhost for other reasons).
+
+
+Notes on ``blockdev-mirror``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``blockdev-mirror`` command is equivalent in core functionality to
+``drive-mirror``, except that it operates at node-level in a Block
+Driver State (BDS) graph.
+
+Also: for ``blockdev-mirror``, the 'target' image must be explicitly
+created (using ``qemu-img``) and attached to live QEMU via
+``blockdev-add``, which assigns a name to the to-be-created target node.
+
+E.g. the sequence of actions to create a point-in-time backup of an
+entire disk image chain, to a target, using ``blockdev-mirror`` would be:
+
+(0) Create the QCOW2 overlays, to arrive at a backing chain of desired
+    depth
+
+(1) Create the target image (using ``qemu-img``), say, ``e.qcow2``
+
+(2) Attach the above created file (``e.qcow2``), run-time, using
+    ``blockdev-add`` to QEMU
+
+(3) Perform ``blockdev-mirror`` (use ``"sync": "full"`` to copy the
+    entire chain to the target). And notice the event
+    ``BLOCK_JOB_READY``
+
+(4) Optionally, query for active block jobs, there should be a 'mirror'
+    job ready to be completed
+
+(5) Gracefully complete the 'mirror' block device job, and notice the
+    event ``BLOCK_JOB_COMPLETED``
+
+(6) Shut down the guest by issuing the QMP ``quit`` command so that
+    caches are flushed
+
+(7) Then, finally, compare the contents of the disk image chain, and
+    the target copy with ``qemu-img compare``. You should notice:
+    "Images are identical"
+
+
+QMP invocation for ``blockdev-mirror``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Given the disk image chain::
+
+    [A] <-- [B] <-- [C] <-- [D]
+
+To copy the contents of the entire disk image chain, from [A] all the
+way to [D], to a new target, call it [E], the following is the flow.
+
+Create the overlay images, [B], [C], and [D]::
+
+    (QEMU) blockdev-snapshot-sync node-name=node-A snapshot-file=b.qcow2 snapshot-node-name=node-B format=qcow2
+    (QEMU) blockdev-snapshot-sync node-name=node-B snapshot-file=c.qcow2 snapshot-node-name=node-C format=qcow2
+    (QEMU) blockdev-snapshot-sync node-name=node-C snapshot-file=d.qcow2 snapshot-node-name=node-D format=qcow2
+
+Create the target image, [E]::
+
+    $ qemu-img create -f qcow2 e.qcow2 39M
+
+Add the above created target image to QEMU, via ``blockdev-add``::
+
+    (QEMU) blockdev-add driver=qcow2 node-name=node-E file={"driver":"file","filename":"e.qcow2"}
+    {
+        "execute": "blockdev-add",
+        "arguments": {
+            "node-name": "node-E",
+            "driver": "qcow2",
+            "file": {
+                "driver": "file",
+                "filename": "e.qcow2"
+            }
+        }
+    }
+
+Perform ``blockdev-mirror``, and notice the event ``BLOCK_JOB_READY``::
+
+    (QEMU) blockdev-mirror device=node-D target=node-E sync=full job-id=job0
+    {
+        "execute": "blockdev-mirror",
+        "arguments": {
+            "device": "node-D",
+            "job-id": "job0",
+            "target": "node-E",
+            "sync": "full"
+        }
+    }
+
+Query for active block jobs, there should be a 'mirror' job ready::
+
+    (QEMU) query-block-jobs
+    {
+        "execute": "query-block-jobs",
+        "arguments": {}
+    }
+    {
+        "return": [
+            {
+                "busy": false,
+                "type": "mirror",
+                "len": 21561344,
+                "paused": false,
+                "ready": true,
+                "io-status": "ok",
+                "offset": 21561344,
+                "device": "job0",
+                "speed": 0
+            }
+        ]
+    }
+
+Gracefully complete the block device job operation, and notice the
+event ``BLOCK_JOB_COMPLETED``::
+
+    (QEMU) block-job-complete device=job0
+    {
+        "execute": "block-job-complete",
+        "arguments": {
+            "device": "job0"
+        }
+    }
+    {
+        "return": {}
+    }
+
+Shut down the guest, by issuing the ``quit`` QMP command::
+
+    (QEMU) quit
+    {
+        "execute": "quit",
+        "arguments": {}
+    }
+
+
+Live disk backup --- ``drive-backup`` and ``blockdev-backup``
+-------------------------------------------------------------
+
+The ``drive-backup`` (and its newer equivalent
``blockdev-backup``) allows +you to create a point-in-time snapshot. + +In this case, the point-in-time is when you *start* the ``drive-backup`` +(or its newer equivalent ``blockdev-backup``) command. + + +QMP invocation for ``drive-backup`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Yet again, starting afresh with our example disk image chain:: + + [A] <-- [B] <-- [C] <-- [D] + +To create a target image [E], with content populated from image [A] to +[D], from the above chain, the following is the syntax. (If the target +image does not exist, ``drive-backup`` will create it):: + + (QEMU) drive-backup device=node-D sync=full target=e.qcow2 job-id=job0 + { + "execute": "drive-backup", + "arguments": { + "device": "node-D", + "job-id": "job0", + "sync": "full", + "target": "e.qcow2" + } + } + +Once the above ``drive-backup`` has completed, a ``BLOCK_JOB_COMPLETED`` event +will be issued, indicating the live block device job operation has +completed, and no further action is required. + + +Notes on ``blockdev-backup`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``blockdev-backup`` command is equivalent in functionality to +``drive-backup``, except that it operates at node-level in a Block Driver +State (BDS) graph. + +E.g. the sequence of actions to create a point-in-time backup +of an entire disk image chain, to a target, using ``blockdev-backup`` +would be: + +(0) Create the QCOW2 overlays, to arrive at a backing chain of desired + depth + +(1) Create the target image (using ``qemu-img``), say, ``e.qcow2`` + +(2) Attach the above created file (``e.qcow2``), run-time, using + ``blockdev-add`` to QEMU + +(3) Perform ``blockdev-backup`` (use ``"sync": "full"`` to copy the + entire chain to the target). And notice the event + ``BLOCK_JOB_COMPLETED`` + +(4) Shutdown the guest, by issuing the QMP ``quit`` command, so that + caches are flushed + +(5) Then, finally, compare the contents of the disk image chain, and + the target copy with ``qemu-img compare``. 
+    You should notice:
+    "Images are identical"
+
+The following section shows an example QMP invocation for
+``blockdev-backup``.
+
+QMP invocation for ``blockdev-backup``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Given a disk image chain of depth 1 where image [B] is the active
+overlay (live QEMU is writing to it)::
+
+    [A] <-- [B]
+
+The following is the procedure to copy the content from the entire chain
+to a target image (say, [E]), which has the full content from [A] and
+[B].
+
+Create the overlay [B]::
+
+    (QEMU) blockdev-snapshot-sync node-name=node-A snapshot-file=b.qcow2 snapshot-node-name=node-B format=qcow2
+    {
+        "execute": "blockdev-snapshot-sync",
+        "arguments": {
+            "node-name": "node-A",
+            "snapshot-file": "b.qcow2",
+            "format": "qcow2",
+            "snapshot-node-name": "node-B"
+        }
+    }
+
+
+Create a target image that will contain the copy::
+
+    $ qemu-img create -f qcow2 e.qcow2 39M
+
+Then add it to QEMU via ``blockdev-add``::
+
+    (QEMU) blockdev-add driver=qcow2 node-name=node-E file={"driver":"file","filename":"e.qcow2"}
+    {
+        "execute": "blockdev-add",
+        "arguments": {
+            "node-name": "node-E",
+            "driver": "qcow2",
+            "file": {
+                "driver": "file",
+                "filename": "e.qcow2"
+            }
+        }
+    }
+
+Then invoke ``blockdev-backup`` to copy the contents from the entire
+image chain, consisting of images [A] and [B], to the target image
+'e.qcow2'::
+
+    (QEMU) blockdev-backup device=node-B target=node-E sync=full job-id=job0
+    {
+        "execute": "blockdev-backup",
+        "arguments": {
+            "device": "node-B",
+            "job-id": "job0",
+            "target": "node-E",
+            "sync": "full"
+        }
+    }
+
+Once the above 'backup' operation has completed, the event
+``BLOCK_JOB_COMPLETED`` will be emitted, signalling successful
+completion.
+
+Next, query for any active block device jobs (there should be none)::
+
+    (QEMU) query-block-jobs
+    {
+        "execute": "query-block-jobs",
+        "arguments": {}
+    }
+
+Shut down the guest::
+
+    (QEMU) quit
+    {
+        "execute": "quit",
+        "arguments": {}
+    }
+    {
+        "return": {}
+    }
+
+.. note::
+    The above step is really important; if forgotten, an error, "Failed
+    to get shared "write" lock on e.qcow2", will be thrown when you do
+    ``qemu-img compare`` to verify the integrity of the disk image
+    with the backup content.
+
+
+The end result will be the image 'e.qcow2' containing a
+point-in-time backup of the disk image chain -- i.e. contents from
+images [A] and [B] at the time the ``blockdev-backup`` command was
+initiated.
+
+One way to confirm that the backup disk image has content identical to
+that of the disk image chain is to compare the backup with the contents
+of the chain; you should see "Images are identical". (NB: this is assuming
+QEMU was launched with the ``-S`` option, which will not start the CPUs at
+guest boot up)::
+
+    $ qemu-img compare b.qcow2 e.qcow2
+    Warning: Image size mismatch!
+    Images are identical.
+
+NOTE: The "Warning: Image size mismatch!" is expected, as we created the
+target image (e.qcow2) with 39M size.
diff --git a/docs/live-block-ops.txt b/docs/live-block-ops.txt
deleted file mode 100644
index 2211d14..0000000
--- a/docs/live-block-ops.txt
+++ /dev/null
@@ -1,72 +0,0 @@
-LIVE BLOCK OPERATIONS
-=====================
-
-High level description of live block operations. Note these are not
-supported for use with the raw format at the moment.
-
-Note also that this document is incomplete and it currently only
-covers the 'stream' operation. Other operations supported by QEMU such
-as 'commit', 'mirror' and 'backup' are not described here yet. Please
-refer to the qapi/block-core.json file for an overview of those.
-
-Snapshot live merge
-===================
-
-Given a snapshot chain, described in this document in the following
-format:
-
-[A] <- [B] <- [C] <- [D] <- [E]
-
-Where the rightmost object ([E] in the example) described is the current
-image which the guest OS has write access to. To the left of it is its base
-image, and so on accordingly until the leftmost image, which has no
-base.
-
-The snapshot live merge operation transforms such a chain into a
-smaller one with fewer elements, such as this transformation relative
-to the first example:
-
-[A] <- [E]
-
-Data is copied in the right direction with destination being the
-rightmost image, but any other intermediate image can be specified
-instead. In this example data is copied from [C] into [D], so [D] can
-be backed by [B]:
-
-[A] <- [B] <- [D] <- [E]
-
-The operation is implemented in QEMU through image streaming facilities.
-
-The basic idea is to execute 'block_stream virtio0' while the guest is
-running. Progress can be monitored using 'info block-jobs'. When the
-streaming operation completes it raises a QMP event. 'block_stream'
-copies data from the backing file(s) into the active image. When finished,
-it adjusts the backing file pointer.
-
-The 'base' parameter specifies an image which data need not be
-streamed from. This image will be used as the backing file for the
-destination image when the operation is finished.
-
-In the first example above, the command would be:
-
-(qemu) block_stream virtio0 file-A.img
-
-In order to specify a destination image different from the active
-(rightmost) one we can use its node name instead.
-
-In the second example above, the command would be:
-
-(qemu) block_stream node-D file-B.img
-
-Live block copy
-===============
-
-To copy an in use image to another destination in the filesystem, one
-should create a live snapshot in the desired destination, then stream
-into that image. Example:
-
-(qemu) snapshot_blkdev ide0-hd0 /new-path/disk.img qcow2
-
-(qemu) block_stream ide0-hd0
-
-
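As a reading aid for the ``blockdev-backup`` walkthrough added in docs/interop/live-block-operations.rst, the QMP payloads it sends can be sketched in Python. This is only an illustration under stated assumptions: ``qmp_command`` and ``backup_sequence`` are hypothetical helper names, not part of QEMU or any QEMU library, and the sketch only builds the JSON messages. It does not open a QMP socket, negotiate capabilities, or wait for ``BLOCK_JOB_COMPLETED`` before ``quit``, all of which a real client would need to do.

```python
import json


def qmp_command(execute, arguments=None):
    """Build one QMP message as a dict (hypothetical helper, not a QEMU API)."""
    return {"execute": execute, "arguments": arguments or {}}


def backup_sequence(source_node, target_node, target_file, job_id="job0"):
    """Return the QMP messages from the walkthrough, in the order they are sent.

    The target image is assumed to exist already (created with qemu-img).
    """
    return [
        # Attach the pre-created target image to QEMU as a named node.
        qmp_command("blockdev-add", {
            "node-name": target_node,
            "driver": "qcow2",
            "file": {"driver": "file", "filename": target_file},
        }),
        # Copy the whole chain under source_node into the target node.
        qmp_command("blockdev-backup", {
            "device": source_node,
            "job-id": job_id,
            "target": target_node,
            "sync": "full",
        }),
        # After the job has completed, no active jobs should be listed.
        qmp_command("query-block-jobs"),
        # Shut down QEMU so that qemu-img compare can open the images.
        qmp_command("quit"),
    ]


if __name__ == "__main__":
    for message in backup_sequence("node-B", "node-E", "e.qcow2"):
        print(json.dumps(message))
```

The node names, file name, and job ID passed in the usage lines are the ones used in the walkthrough; anything else here is an assumption made for illustration.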