aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-06-26qed: Make qed_aio_read_data() synchronousKevin Wolf1-3/+5
Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove callback from qed_write_table()Kevin Wolf3-45/+22
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove GenericCBKevin Wolf3-45/+1
The GenericCB infrastructure isn't used any more. Remove it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Make qed_write_table() synchronousKevin Wolf1-54/+30
Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove callback from qed_write_header()Kevin Wolf1-20/+12
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Make qed_write_header() synchronousKevin Wolf1-47/+29
Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove callback from qed_copy_from_backing_file()Kevin Wolf1-34/+23
With this change, qed_aio_write_prefill() and qed_aio_write_postfill() collapse into a single function. This is reflected by a rename of the combined function to qed_aio_write_cow(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Make qed_copy_from_backing_file() synchronousKevin Wolf1-49/+29
Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Make qed_read_backing_file() synchronousKevin Wolf1-14/+18
Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove callback from qed_find_cluster()Kevin Wolf3-32/+35
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove callback from qed_read_l2_table()Kevin Wolf3-76/+36
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove callback from qed_read_table()Kevin Wolf1-54/+25
Instead of passing the return value to a callback, return it to the caller so that the callback can be inlined there. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Make qed_read_table() synchronousKevin Wolf1-38/+18
Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Use bottom half to resume waiting requestsKevin Wolf1-3/+10
The qed driver serialises allocating write requests. When the active allocation is finished, the AIO callback is called, but after this, the next allocating request is immediately processed instead of leaving the coroutine. Resuming another allocation request in the same request coroutine means that the request now runs in the wrong coroutine. The following is one of the possible effects of this: The completed request will generally reenter its request coroutine in a bottom half, expecting that it completes the request in bdrv_driver_pwritev(). However, if the second request actually yielded before leaving the coroutine, the reused request coroutine is in an entirely different place and is reentered prematurely. Not a good idea. Let's make sure that we exit the coroutine after completing the first request by resuming the next allocating request only with a bottom half. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qcow2: Use offset_into_cluster() and offset_to_l2_index()Alberto Garcia2-3/+3
We already have functions for doing these calculations, so let's use them instead of doing everything by hand. This makes the code a bit more readable. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qcow2: Merge the writing of the COW regions with the guest dataAlberto Garcia3-20/+91
If the guest tries to write data that results on the allocation of a new cluster, instead of writing the guest data first and then the data from the COW regions, write everything together using one single I/O operation. This can improve the write performance by 25% or more, depending on several factors such as the media type, the cluster size and the I/O request size. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qcow2: Pass a QEMUIOVector to do_perform_cow_{read,write}()Alberto Garcia1-27/+24
Instead of passing a single buffer pointer to do_perform_cow_write(), pass a QEMUIOVector. This will allow us to merge the write requests for the COW regions and the actual data into a single one. Although do_perform_cow_read() does not strictly need to change its API, we're doing it here as well for consistency. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qcow2: Allow reading both COW regions with only one requestAlberto Garcia1-13/+38
Reading both COW regions requires two separate requests, but it's perfectly possible to merge them and perform only one. This generally improves performance, particularly on rotating disk drives. The downside is that the data in the middle region is read but discarded. This patch takes a conservative approach and only merges reads when the size of the middle region is <= 16KB. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qcow2: Split do_perform_cow() into _read(), _encrypt() and _write()Alberto Garcia1-30/+87
This patch splits do_perform_cow() into three separate functions to read, encrypt and write the COW regions. perform_cow() can now read both regions first, then encrypt them and finally write them to disk. The memory allocation is also done in this function now, using one single buffer large enough to hold both regions. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qcow2: Make perform_cow() call do_perform_cow() twiceAlberto Garcia1-14/+22
Instead of calling perform_cow() twice with a different COW region each time, call it just once and make perform_cow() handle both regions. This patch simply moves code around. The next one will do the actual reordering of the COW operations. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qcow2: Use unsigned int for both members of Qcow2COWRegionAlberto Garcia2-4/+4
Qcow2COWRegion has two attributes: - The offset of the COW region from the start of the first cluster touched by the I/O request. Since it's always going to be positive and the maximum request size is at most INT_MAX, we can use a regular unsigned int to store this offset. - The size of the COW region in bytes. This is guaranteed to be >= 0, so we should use an unsigned type instead. In x86_64 this reduces the size of Qcow2COWRegion from 16 to 8 bytes. It will also help keep some assertions simpler now that we know that there are no negative numbers. The prototype of do_perform_cow() is also updated to reflect these changes. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qcow2: Remove unused Error variable in do_perform_cow()Alberto Garcia1-3/+1
We are using the return value of qcow2_encrypt_sectors() to detect problems but we are throwing away the returned Error since we have no way to report it to the user. Therefore we can simply get rid of the local Error variable and pass NULL instead. Alternatively we could try to figure out a way to pass the original error instead of simply returning -EIO, but that would be more invasive, so let's keep the current approach. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26nvme: Add support for Read Data and Write Data in CMBs.Stephen Bates2-26/+58
Add the ability for the NVMe model to support both the RDS and WDS modes in the Controller Memory Buffer. Although not currently supported in the upstreamed Linux kernel a fork with support exists [1] and user-space test programs that build on this also exist [2]. Useful for testing CMB functionality in preperation for real CMB enabled NVMe devices (coming soon). [1] https://github.com/sbates130272/linux-p2pmem [2] https://github.com/sbates130272/p2pmem-test Signed-off-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qemu-iotests: 068: test iothread modeStefan Hajnoczi2-10/+24
Perform the savevm/loadvm test with both iothread on and off. This covers the recently found savevm/loadvm hang when iothread is enabled. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qemu-iotests: 068: use -drive/-device instead of -hdaStefan Hajnoczi1-1/+6
The legacy -hda option does not support -drive/-device parameters. They will be required by the next patch that extends this test case. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qemu-iotests: 068: extract _qemu() functionStefan Hajnoczi1-6/+9
Avoid duplicating the QEMU command-line. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26migration: hold AioContext lock for loadvm qemu_fclose()Stefan Hajnoczi1-1/+1
migration_incoming_state_destroy() uses qemu_fclose() on the vmstate file. Make sure to call it inside an AioContext acquire/release region. This fixes an 'qemu: qemu_mutex_unlock: Operation not permitted' abort in loadvm. This patch closes the vmstate file before ending the drained region. Previously we closed the vmstate file after ending the drained region. The order does not matter. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26throttle: Update throttle-groups.c documentationAlberto Garcia1-1/+1
There used to be throttle_timers_{detach,attach}_aio_context() calls in bdrv_set_aio_context(), but since 7ca7f0f6db1fedd28d490795d778cf239 they are now in blk_set_aio_context(). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26doc: Document driver-specific -blockdev optionsKevin Wolf1-1/+114
This documents the driver-specific options for the raw, qcow2 and file block drivers for the man page. For everything else, we refer to the QAPI documentation. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-06-26doc: Document generic -blockdev optionsKevin Wolf1-29/+79
This adds documentation for the -blockdev options that apply to all nodes independent of the block driver used. All options that are shared by -blockdev and -drive are now explained in the section for -blockdev. The documentation of -drive mentions that all -blockdev options are accepted as well. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-06-26migration: use bdrv_drain_all_begin/end() instead bdrv_drain_all()Stefan Hajnoczi1-3/+15
blk/bdrv_drain_all() only takes effect for a single instant and then resumes block jobs, guest devices, and other external clients like the NBD server. This can be handy when performing a synchronous drain before terminating the program, for example. Monitor commands usually need to quiesce I/O across an entire code region so blk/bdrv_drain_all() is not suitable. They must use bdrv_drain_all_begin/end() to mark the region. This prevents new I/O requests from slipping in or worse - block jobs completing and modifying the graph. I audited other blk/bdrv_drain_all() callers but did not find anything that needs a similar fix. This patch fixes the savevm/loadvm commands. Although I haven't encountered a read world issue this makes the code safer. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26migration: avoid recursive AioContext locking in save_vmstate()Stefan Hajnoczi1-1/+11
AioContext was designed to allow nested acquire/release calls. It uses a recursive mutex so callers don't need to worry about nesting...or so we thought. BDRV_POLL_WHILE() is used to wait for block I/O requests. It releases the AioContext temporarily around aio_poll(). This gives IOThreads a chance to acquire the AioContext to process I/O completions. It turns out that recursive locking and BDRV_POLL_WHILE() don't mix. BDRV_POLL_WHILE() only releases the AioContext once, so the IOThread will not be able to acquire the AioContext if it was acquired multiple times. Instead of trying to release AioContext n times in BDRV_POLL_WHILE(), this patch simply avoids nested locking in save_vmstate(). It's the simplest fix and we should step back to consider the big picture with all the recent changes to block layer threading. This patch is the final fix to solve 'savevm' hanging with -object iothread. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()Stefan Hajnoczi1-3/+1
Calling aio_poll() directly may have been fine previously, but this is the future, man! The difference between an aio_poll() loop and BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext around aio_poll(). This allows the IOThread to run fd handlers or BHs to complete the request. Failure to release the AioContext causes deadlocks. Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object iothread. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26block: count bdrv_co_rw_vmstate() requestsStefan Hajnoczi1-5/+12
Call bdrv_inc/dec_in_flight() for vmstate reads/writes. This seems unnecessary at first glance because vmstate reads/writes are done synchronously while the guest is stopped. But we need the bdrv_wakeup() in bdrv_dec_in_flight() so the main loop sees request completion. Besides, it's cleaner to count vmstate reads/writes like ordinary read/write requests. The bdrv_wakeup() partially fixes a 'savevm' hang with -object iothread. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26qemu-iotests: Test exiting qemu with running jobKevin Wolf3-0/+266
When qemu is exited, all running jobs should be cancelled successfully. This adds a test for this for all types of block jobs that currently exist in qemu. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-06-26qemu-iotests: Allow starting new qemu after cleanupKevin Wolf1-0/+3
After _cleanup_qemu(), test cases should be able to start the next qemu process and call _cleanup_qemu() for that one as well. For this to work cleanly, we need to improve the cleanup so that the second invocation doesn't try to kill the qemu instances from the first invocation a second time (which would result in error messages). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-06-26commit: Fix completion with extra referenceKevin Wolf1-0/+7
commit_complete() can't assume that after its block_job_completed() the job is actually immediately freed; someone else may still be holding references. In this case, the op blockers on the intermediate nodes make the graph reconfiguration in the completion code fail. Call block_job_remove_all_bdrv() manually so that we know for sure that any blockers on intermediate nodes are given up. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-06-23Merge remote-tracking branch 'remotes/rth/tags/pull-s390-20170623' into stagingPeter Maydell7-89/+411
Queued target/s390x patches # gpg: Signature made Fri 23 Jun 2017 17:18:24 BST # gpg: using RSA key 0xAD1270CC4DD0279B # gpg: Good signature from "Richard Henderson <rth7680@gmail.com>" # gpg: aka "Richard Henderson <rth@redhat.com>" # gpg: aka "Richard Henderson <rth@twiddle.net>" # Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC 16A4 AD12 70CC 4DD0 279B * remotes/rth/tags/pull-s390-20170623: target/s390x: Implement idte instruction target/s390x: Improve heuristic for ipte target/s390x: Indicate and check for local tlb clearing target/s390x: Clean up TB flag bits target/s390x: Finish implementing ETF2-ENH target/s390x: Mark STFLE_49 facility as available target/s390x: Implement processor-assist insn target/s390x: Implement execution-hint insns target/s390x: Mark STFLE_53 facility as available target/s390x: Implement load-and-zero-rightmost-byte insns target/s390x: Implement load-on-condition-2 insns target/s390x: Mark FPSEH facility as available target/s390x: implement mvcos instruction target/s390x: change PSW_SHIFT_KEY target/s390x: Map existing FAC_* names to S390_FEAT_* names Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-23target/s390x: Implement idte instructionDavid Hildenbrand5-0/+70
Let's keep it very simple for now and flush the complete tlb, we currently can't find the right entries in our tlb, we would have to store the used tables for each element. As we now fully implement the DAT-enhancement facility, we can allow to enable it for the qemu CPU model. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170622094151.28633-4-david@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Improve heuristic for ipteDavid Hildenbrand1-9/+16
If only the page index is set, most likely we don't have a valid virtual address. Let's do a full tlb flush for that case. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170622094151.28633-3-david@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Indicate and check for local tlb clearingDavid Hildenbrand3-3/+6
Let's allow to enable it for the qemu cpu model and correctly emulate it. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170622094151.28633-2-david@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Clean up TB flag bitsRichard Henderson2-23/+17
Most of the PSW bits that were being copied into TB->flags are not relevant to translation. Removing those that are unnecessary reduces the amount of translation required. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Finish implementing ETF2-ENHRichard Henderson2-3/+13
Missed the proper alignment in TRTO/TRTT, and ignoring the M3 field for all TRXX insns without ETF2-ENH. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Mark STFLE_49 facility as availableRichard Henderson1-0/+1
This facility bit includes execution-hint, load-and-trap, miscellaneous-instruction-extensions and processor-assist. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Implement processor-assist insnRichard Henderson2-0/+4
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Implement execution-hint insnsRichard Henderson2-1/+13
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Mark STFLE_53 facility as availableRichard Henderson1-0/+1
This facility bit includes load-on-condition-2 and load-and-zero-rightmost-byte. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Implement load-and-zero-rightmost-byte insnsRichard Henderson2-0/+12
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Implement load-on-condition-2 insnsRichard Henderson3-3/+25
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-23target/s390x: Mark FPSEH facility as availableRichard Henderson1-0/+1
This facility bit includes DFP-rounding, FPR-GR-transfer, FPS-sign-handling, and IEEE-exception-simulation. We do support all of these. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>