aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2011-08-25block: explicit I/O accountingChristoph Hellwig12-50/+171
Decouple the I/O accounting from bdrv_aio_readv/writev/flush and make the hardware models call directly into the accounting helpers. This means: - we do not count internal requests from image formats in addition to guest originating I/O - we do not double count I/O ops if the device model handles it chunk wise - we only account I/O once it actuall is done - can extent I/O accounting to synchronous or coroutine I/O easily - implement I/O latency tracking easily (see the next patch) I've conveted the existing device model callers to the new model, device models that are using synchronous I/O and weren't accounted before haven't been updated yet. Also scsi hasn't been converted to the end-to-end accounting as I want to defer that after the pending scsi layer overhaul. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-25qcow2: remove unused qcow2_create_refcount_update functionFrediano Ziglio2-20/+0
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-25qcow2: use always stderr for debuggingFrediano Ziglio2-3/+3
let all DEBUG_ALLOC2 printf goes to stderr Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-24sheepdog: use coroutinesMORITA Kazutaka1-57/+93
This makes the sheepdog block driver support bdrv_co_readv/writev instead of bdrv_aio_readv/writev. With this patch, Sheepdog network I/O becomes fully asynchronous. The block driver yields back when send/recv returns EAGAIN, and is resumed when the sheepdog network connection is ready for the operation. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: remove memory leakFrediano Ziglio1-0/+2
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: Removed QCowAIOCB entirelyFrediano Ziglio1-127/+80
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: reindent and use while before the big jumpFrediano Ziglio1-137/+135
prepare to remove read/write callbacks Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: remove common from QCowAIOCBFrediano Ziglio1-4/+4
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: remove cluster_offset from QCowAIOCBFrediano Ziglio1-11/+11
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: remove l2meta from QCowAIOCBFrediano Ziglio1-7/+8
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: removed cur_nr_sectors field in QCowAIOCBFrediano Ziglio1-55/+43
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: Removed unused AIOCB fieldsFrediano Ziglio1-7/+3
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow: remove old #undefined codeFrediano Ziglio1-63/+0
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow: Remove QCowAIOCBFrediano Ziglio1-168/+123
Embed qcow_aio_read_cb into qcow_co_readv and qcow_aio_write_cb into qcow_co_writev Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow: move some blocks of code to avoid useless variable initializationFrediano Ziglio1-27/+26
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow: QCowAIOCB field cleanupFrediano Ziglio1-72/+65
remove unused field from this structure and put some of them in qcow_aio_read_cb and qcow_aio_write_cb Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow/qcow2: Allocate QCowAIOCB structure using stackFrediano Ziglio2-63/+27
instead of calling qemi_aio_get use stack Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23posix-aio-compat: fix latency issuesAvi Kivity1-2/+42
In certain circumstances, posix-aio-compat can incur a lot of latency: - threads are created by vcpu threads, so if vcpu affinity is set, aio threads inherit vcpu affinity. This can cause many aio threads to compete for one cpu. - we can create up to max_threads (64) aio threads in one go; since a pthread_create can take around 30μs, we have up to 2ms of cpu time under a global lock. Fix by: - moving thread creation to the main thread, so we inherit the main thread's affinity instead of the vcpu thread's affinity. - if a thread is currently being created, and we need to create yet another thread, let thread being born create the new thread, reducing the amount of time we spend under the main thread. - drop the local lock while creating a thread (we may still hold the global mutex, though) Note this doesn't eliminate latency completely; scheduler artifacts or lack of host cpu resources can still cause it. We may want pre-allocated threads when this cannot be tolerated. Thanks to Uli Obergfell of Red Hat for his excellent analysis and suggestions. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23block: include flush requests in info blockstatsChristoph Hellwig3-5/+20
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23block/curl: Handle failed reads gracefully.Nicholas Thomas1-1/+19
Current behaviour if a read fails is for the acb to not get finished. This causes an infinite loop in bdrv_read_em (block.c). The read failure never gets reported to the guest and if the error condition clears, the process never recovers. With this patch, when curl reports a failure we finish the acb as a failure. This results in the guest receiving an I/O error (rather than the read hanging indefinitely) and if the error condition subsequently clears, retries work as expected. The simplest test is to put an ISO on a web server you have control over and open it with qemu-io. Then move the ISO out of the way and attempt to read some data - you should see behaviour matching the above. Signed-off-by: Nick Thomas <nick@bytemark.co.uk> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qemu-img: print error codes when convert failsStefan Hajnoczi1-5/+8
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow: initialize coroutine mutexScott Wood1-0/+2
commit 52b8eb60132b27ad53476490e9d7579003390cfa added a mutex, but never initialized it. This caused a segfault. Reported-by: Alexander Graf <agraf@suse.de> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: fix typo in documentation for qcow2_get_cluster_offset()Devin Nakamura1-2/+2
Documentation states the num is measured in clusters, but its actually measured in sectors Signed-off-by: Devin Nakamura <devin122@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qemu-img: Use qemu_blockalignKevin Wolf1-6/+6
Now that you can use cache=none for the output file in qemu-img, we should properly align our buffers so that raw-posix doesn't have to use its (smaller) bounce buffer. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-08-23qcow2: Fix DEBUG_* compilationPhilipp Hahn2-4/+17
By introducing BlockDriverState compiling qcow2 with DEBUG_ALLOC and DEBUG_EXT defined got broken. Define a BdrvCheckResult structure locally which is now needed as the second argument. Also fix qcow2_read_extensions() needing BDRVQcowState. Signed-off-by: Philipp Hahn <hahn@univention.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23block: add cache=directsync parameter to -driveStefan Hajnoczi4-6/+14
This patch adds -drive cache=directsync for O_DIRECT | O_SYNC host file I/O with no disk write cache presented to the guest. This mode is useful when guests may not be sending flushes when appropriate and therefore leave data at risk in case of power failure. When cache=directsync is used, write operations are only completed to the guest when data is safely on disk. This new mode is like cache=writethrough but it bypasses the host page cache. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23block: parse cache mode flags in a single placeStefan Hajnoczi4-36/+32
This patch introduces bdrv_parse_cache_flags() which sets open flags given a cache mode. Previously this was duplicated in blockdev.c and qemu-img.c. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23coroutine: Add CoRwlock supportAneesh Kumar K.V2-0/+76
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-22xilinx: removed microbalze_pic_init from xilinx.hPeter A. G. Crosthwaite5-4/+12
This is a microblaze target specific function that belongs outside of xilinx.h (which is a collection of target independent device model instantiator functions) Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2011-08-22xilinx.h: Added missing includesPeter A. G. Crosthwaite1-0/+2
Added some missing #includes for this file. Previously this file relied on its clients to pre-include its dependencies. Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2011-08-22sdl: Don't release input on mouse mode change in full-screen modeJan Kiszka1-1/+3
While in full-screen mode, the input focus naturally belongs to the SDL window. Avoid dropping it when switching from absolute to relative mouse mode. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22Replace qemu_system_cond with VCPU stop mechanismJan Kiszka1-14/+6
We can express the VCPU thread wakeup with the stop mechanism, saving both qemu_system_ready and the qemu_system_cond. For KVM threads, we can just enter the main loop as long as the thread is stopped. The central TCG thread is better held back before the loop as there can be side effects of the services called even when all CPUs are stopped. Creating VCPUs in stopped state will also be required for proper CPU hotplugging support. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22vga: Drop some unused fieldsJan Kiszka1-2/+0
Memory region refactorings obsoleted them. CC: Avi Kivity <avi@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22vga: Use linear mapping + dirty logging in chain 4 memory access modeJan Kiszka2-1/+52
Most VGA memory access modes require MMIO handling as they demand weird logic to get a byte from or into the video RAM. However, there is one exception: chain 4 mode with all memory planes enabled for writing. This mode actually allows lineary mapping, which can then be combined with dirty logging to accelerate KVM. This patch accelerates specifically VBE accesses like they are used by grub in graphical mode. Not only the standard VGA adapter benefits from this, also vmware and spice in VGA mode. CC: Gerd Hoffmann <kraxel@redhat.com> CC: Avi Kivity <avi@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22vmware-vga: Eliminate vga_dirty_log_restartJan Kiszka3-9/+0
After the conversion to the new Memory API, vga_dirty_log_restart became seriously pointless. Remove it from vmware-vga and and then finally drop the service. CC: Andrzej Zaborowski <balrogg@gmail.com> CC: Avi Kivity <avi@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22vmware-vga: Remove dead DIRECT_VRAM modeJan Kiszka1-142/+30
The code was disabled since day 1 of vmware-vga, and now it does not even build anymore. Time for a cleanup. CC: Andrzej Zaborowski <balrogg@gmail.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22vmware-vga: Disable verbose modeJan Kiszka1-1/+1
Elimiates 'vmsvga_value_write: guest runs Linux.' messages from the console. CC: Andrzej Zaborowski <balrogg@gmail.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22vmware-vga: Register reset serviceJan Kiszka1-31/+35
Fixes cold reset in vmware graphic modes. We need to split up the reset function for this purpose, breaking out init-once bits. Cc: Andrzej Zaborowski <balrogg@gmail.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22ioapic: Implement polarityJan Kiszka1-0/+3
If the polarity bit is set in the redirection table, the input level simply has to inverted as it is low active in this case. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22target-i386: Remove unused polarity arguments from APIC APIJan Kiszka4-21/+13
Polarity of external interrupts needs to be handled in the IOAPIC. Passing it to the APIC is pointless. So remove all these arguments. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22Do not kick vcpus in TCG modeJan Kiszka1-1/+1
In TCG mode, iothread and vcpus run in lock-step. So it's pointless to send a signal from qemu_cpu_kick to the vcpu thread - if we got here, the receiver already left the vcpu loop. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22Poll main loop after I/O events were receivedJan Kiszka2-5/+9
Polling until select returns empty fdsets helps to reduce the switches between iothread and vcpus. The benefit of this patch is best visible when running an SMP guest on an SMP host in emulation mode. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22Do not drop global mutex for polled main loop runsJan Kiszka1-2/+8
If we call select without a timeout, it's more efficient to keep the global mutex locked as we may otherwise just play ping pong with a vcpu thread contending for it. This is particularly important for TCG mode where we run in lock-step with the vcpu thread. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22Merge remote-tracking branch 'qemu-kvm/memory/core' into stagingAnthony Liguori1-2/+2
2011-08-22microblaze-user: Deliver SIGFPE on div by zeroEdgar E. Iglesias1-0/+7
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2011-08-22memory: Fix old_portio vs non-zero offsetRichard Henderson1-2/+2
The legacy functions that we're wrapping expect that offset to be included in the register. Indeed, they generally expect the absolute address and then mask off the "high" bits. The FDC is the first converted device with a non-zero offset. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-08-22memory: temporarily suppress the subregion collision warningAnthony Liguori1-0/+2
After 312b4234, the APIC and PCI devices are colliding with each other. This is harmless in practice because the APIC accesses are special cased and never make there way onto the bus. Avi is working on a proper fix, but until that's ready, avoid printing the warning. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22440fx: fix PAM, PCI holesAvi Kivity4-46/+115
The current implementation of PAM and the PCI holes is broken in several ways: - PCI BARs are not restricted to the PCI hole (a BAR may hide memory) - PCI devices do not respect PAM (if a PCI device maps a region while PAM maps the region to RAM, the request will be honored) This patch fixes things by introducing a pci address space, and using memory region aliases to represent PAM regions, SMRAM, and PCI holes. The memory hierarchy looks something like system_memory | +--- low memory alias (0-0xe0000000) | | | +-- ram@0 | +--- high memory alias (0x100000000-EOM) | | | +-- ram@0xe0000000 | +--- pci hole alias (end of low memory-0x100000000) | | | +-- pci@end-of-low-memory | | +--- pam[n] (0xc0000-0xc3fff etc) (when set to pci, priority 1) | | | +-- pci@0xc4000 etc | +--- smram (0xa0000-0xbffff) (when set to pci/vga, priority 1) | +-- pci@0xa0000 etc ram (simple ram region) pci | +--- BARn | +--- VGA 0xa0000-0xbffff | +--- ROMs Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22vga: drop get_system_memory() from vga devices and derivativesAvi Kivity11-34/+37
Instead, use the bus accessors, or get the address space directly from the board constructor. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22pci: add pci_address_space()Avi Kivity2-0/+6
Returns the PCI address space. Useful for bridges that can obscure part of the PCI address space. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>