aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-07-03tests: Add unit tests for the VM Generation ID featureBen Warren2-0/+205
The following tests are implemented: * test that a GUID passed in by command line is propagated to the guest. Read the GUID from guest memory * test that the "auto" argument to the GUID generates a valid GUID, as seen by the guest. * test that a GUID passed in can be queried from the monitor This patch is loosely based on a previous patch from: Gal Hammer <ghammer@redhat.com> and Igor Mammedov <imammedo@redhat.com> Signed-off-by: Ben Warren <ben@skyportsystems.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03vhost-user: unregister slave req handler at cleanup timeMaxime Coquelin1-0/+1
If the backend sends a request just before closing the socket, the aio dispatcher might schedule its reading after the vhost device has been cleaned, leading to a NULL pointer dereference in slave_read(); vhost_user_cleanup() already closes the socket but it is not enough, the handler has to be unregistered. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03vhost: ensure vhost_ops are set before calling iotlb callbackMaxime Coquelin1-2/+8
This patch fixes a crash that happens when vhost-user iommu support is enabled and vhost-user socket is closed. When it happens, if an IOTLB invalidation notification is sent by the IOMMU, vhost_ops's NULL pointer is dereferenced. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03intel_iommu: fix migration breakage on mr switchPeter Xu1-0/+15
Migration is broken after the vfio integration work: qemu-kvm: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address qemu-kvm: Failed to load ich9_ahci:ahci qemu-kvm: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci' qemu-kvm: load of migration failed: Operation not permitted The problem is that vfio work introduced dynamic memory region switching (actually it is also used for future PT mode), and this memory region layout is not properly delivered to destination when migration happens. Solution is to rebuild the layout in post_load. Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1459906 Fixes: 558e0024 ("intel_iommu: allow dynamic switch of IOMMU region") Reviewed-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03hw/acpi: remove dead acpi codeAleksandr Bezzubikov1-10/+0
Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03fw_cfg: move setting of FW_CFG_VERSION_DMA bit to fw_cfg_init1()Mark Cave-Ayland1-9/+7
The setting of the FW_CFG_VERSION_DMA bit is the same across both the TYPE_FW_CFG_MEM and TYPE_FW_CFG_IO devices, so unify the logic in fw_cfg_init1(). Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Gabriel Somlo <somlo@cmu.edu>
2017-07-03fw_cfg: don't map the fw_cfg IO ports in fw_cfg_io_realize()Mark Cave-Ayland1-8/+8
As indicated by Laszlo it is a QOM bug for the realize() method to actually map the device. Set up the IO regions within fw_cfg_io_realize() and defer the mapping with sysbus_add_io() to the caller, as already done in fw_cfg_init_mem_wide(). This makes the iobase and dma_iobase properties now obsolete so they can be removed. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Gabriel Somlo <somlo@cmu.edu>
2017-07-03i386/kvm/pci-assign: Use errp directly rather than local_errMao Zhongyi1-15/+7
In assigned_device_pci_cap_init(), first, error messages are filled to a local_err variable, then through error_propagate() pass to the parameter of errp. It leads to cumbersome code. In order to avoid the extra local_err and error_propagate(), drop it and use errp instead. Cc: pbonzini@redhat.com Cc: rth@twiddle.net Cc: ehabkost@redhat.com Cc: mst@redhat.com Cc: armbru@redhat.com Cc: marcel@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03i386/kvm/pci-assign: Fix return type of verify_irqchip_kernel()Mao Zhongyi1-12/+6
When the function no success value to transmit, it usually make the function return void. It has turned out not to be a success, because it means that the extra local_err variable and error_propagate() will be needed. It leads to cumbersome code, therefore, transmit success/ failure in the return value is worth. So fix the return type to avoid it. Cc: pbonzini@redhat.com Cc: rth@twiddle.net Cc: ehabkost@redhat.com Cc: mst@redhat.com Cc: armbru@redhat.com Cc: marcel@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03pci: Convert shpc_init() to ErrorMao Zhongyi5-22/+20
In order to propagate error message better, convert shpc_init() to Error also convert the pci_bridge_dev_initfn() to realize. Cc: mst@redhat.com Cc: marcel@redhat.com Cc: armbru@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03pci: Convert to realizeMao Zhongyi8-50/+56
Convert i82801b11, io3130_upstream, io3130_downstream and pcie_root_port devices to realize. Cc: mst@redhat.com Cc: marcel@redhat.com Cc: armbru@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03pci: Replace pci_add_capability2() with pci_add_capability()Mao Zhongyi7-34/+15
After the patch 'Make errp the last parameter of pci_add_capability()', pci_add_capability() and pci_add_capability2() now do exactly the same. So drop the wrapper pci_add_capability() of pci_add_capability2(), then replace the pci_add_capability2() with pci_add_capability() everywhere. Cc: pbonzini@redhat.com Cc: rth@twiddle.net Cc: ehabkost@redhat.com Cc: mst@redhat.com Cc: dmitry@daynix.com Cc: jasowang@redhat.com Cc: marcel@redhat.com Cc: alex.williamson@redhat.com Cc: armbru@redhat.com Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03pci: Make errp the last parameter of pci_add_capability()Mao Zhongyi12-42/+94
Add Error argument for pci_add_capability() to leverage the errp to pass info on errors. This way is helpful for its callers to make a better error handling when moving to 'realize'. Cc: pbonzini@redhat.com Cc: rth@twiddle.net Cc: ehabkost@redhat.com Cc: mst@redhat.com Cc: jasowang@redhat.com Cc: marcel@redhat.com Cc: alex.williamson@redhat.com Cc: armbru@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03pci: Fix the wrong assertion.Mao Zhongyi4-4/+4
pci_add_capability returns a strictly positive value on success, correct asserts. Cc: dmitry@daynix.com Cc: jasowang@redhat.com Cc: kraxel@redhat.com Cc: alex.williamson@redhat.com Cc: armbru@redhat.com Cc: marcel@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03pci: Add comment for pci_add_capability2()Mao Zhongyi1-0/+6
Comments for pci_add_capability2() to explain the return value. This may help to make a correct return value check for its callers. Cc: mst@redhat.com Cc: marcel@redhat.com Cc: armbru@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03pci: Clean up error checking in pci_add_capability()Mao Zhongyi1-5/+1
On success, pci_add_capability2() returns a positive value. On failure, it sets an error and return a negative value. pci_add_capability() laboriously checks this behavior. No other caller does. Drop the checks from pci_add_capability(). Cc: mst@redhat.com Cc: marcel@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03intel_iommu: relax iq tail check on VTD_GCMD_QIE enableLadi Prosek2-15/+20
The VT-d spec (section 6.5.2) prescribes software to zero the Invalidation Queue Tail Register before enabling the VTD_GCMD_QIE Global Command Register bit. Windows Server 2012 R2 and possibly other older Windows versions violate the protocol and set a non-zero queue tail first, which in effect makes them crash early on boot with -device intel-iommu,intremap=on. This commit relaxes the check and instead of failing to enable VTD_GCMD_QIE with vtd_err_qi_enable, it behaves as if the tail register was set just after enabling VTD_GCMD_QIE (see vtd_handle_iqt_write). Signed-off-by: Ladi Prosek <lprosek@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03hw/pci-bridge/dec: Classify the DEC PCI bridge as bridge deviceThomas Huth1-0/+2
This way the bridge shows up in the correct section of the "-device help" text. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2017-07-03virtio-net: enable configurable tx queue sizeWei Wang2-2/+31
This patch enables the virtio-net tx queue size to be configurable between 256 (the default queue size) and 1024 by the user when the vhost-user backend is used. Currently, the maximum tx queue size for other backends is 512 due to the following limitations: - QEMU backend: the QEMU backend implementation in some cases may send 1024+1 iovs to writev. - Vhost_net backend: there are possibilities that the guest sends a vring_desc of memory which crosses a MemoryRegion thereby generating more than 1024 iovs after translation from guest-physical address in the backend. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170603' into stagingPeter Maydell5-26/+32
Queued TCG patches # gpg: Signature made Fri 30 Jun 2017 20:03:53 BST # gpg: using RSA key 0xAD1270CC4DD0279B # gpg: Good signature from "Richard Henderson <rth7680@gmail.com>" # gpg: aka "Richard Henderson <rth@redhat.com>" # gpg: aka "Richard Henderson <rth@twiddle.net>" # Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC 16A4 AD12 70CC 4DD0 279B * remotes/rth/tags/pull-tcg-20170603: tcg: consistently access cpu->tb_jmp_cache atomically gen-icount: use tcg_ctx.tcg_env instead of cpu_env gen-icount: add missing inline to gen_tb_end Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-30tcg: consistently access cpu->tb_jmp_cache atomicallyEmilio G. Cota4-21/+25
Some code paths can lead to atomic accesses racing with memset() on cpu->tb_jmp_cache, which can result in torn reads/writes and is undefined behaviour in C11. These torn accesses are unlikely to show up as bugs, but from code inspection they seem possible. For example, tb_phys_invalidate does: /* remove the TB from the hash list */ h = tb_jmp_cache_hash_func(tb->pc); CPU_FOREACH(cpu) { if (atomic_read(&cpu->tb_jmp_cache[h]) == tb) { atomic_set(&cpu->tb_jmp_cache[h], NULL); } } Here atomic_set might race with a concurrent memset (such as the ones scheduled via "unsafe" async work, e.g. tlb_flush_page) and therefore we might end up with a torn pointer (or who knows what, because we are under undefined behaviour). This patch converts parallel accesses to cpu->tb_jmp_cache to use atomic primitives, thereby bringing these accesses back to defined behaviour. The price to pay is to potentially execute more instructions when clearing cpu->tb_jmp_cache, but given how infrequently they happen and the small size of the cache, the performance impact I have measured is within noise range when booting debian-arm. Note that under "safe async" work (e.g. do_tb_flush) we could use memset because no other vcpus are running. However I'm keeping these accesses atomic as well to keep things simple and to avoid confusing analysis tools such as ThreadSanitizer. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1497486973-25845-1-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-30gen-icount: use tcg_ctx.tcg_env instead of cpu_envEmilio G. Cota1-4/+6
We are relying on cpu_env being defined as a global, yet most targets (i.e. all but arm/a64) have it defined as a local variable. Luckily all of them use the same "cpu_env" name, but really compilation shouldn't break if the name of that local variable changed. Fix it by using tcg_ctx.tcg_env, which all targets set in their translate_init function. This change also helps paving the way for the upcoming "translation loop common to all targets" work. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1497639397-19453-3-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-30gen-icount: add missing inline to gen_tb_endEmilio G. Cota1-1/+1
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1497639397-19453-2-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-30Merge remote-tracking branch 'remotes/famz/tags/block-pull-request' into stagingPeter Maydell4-19/+41
# gpg: Signature made Fri 30 Jun 2017 15:08:45 BST # gpg: using RSA key 0xCA35624C6A9171C6 # gpg: Good signature from "Fam Zheng <famz@redhat.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6 * remotes/famz/tags/block-pull-request: block: Exploit BDRV_BLOCK_EOF for larger zero blocks block: Add BDRV_BLOCK_EOF to bdrv_get_block_status() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-30Merge remote-tracking branch ↵Peter Maydell5-123/+619
'remotes/vivier/tags/m68k-for-2.10-pull-request' into staging # gpg: Signature made Fri 30 Jun 2017 13:30:44 BST # gpg: using RSA key 0xF30C38BD3F2FBE3C # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" # gpg: aka "Laurent Vivier <laurent@vivier.eu>" # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier/tags/m68k-for-2.10-pull-request: target/m68k: add fmovem target/m68k: add explicit single and double precision operations (part 2) target/m68k: add fsglmul and fsgldiv softfloat: define floatx80_round() target/m68k: add explicit single and double precision operations target/m68k: add fmovecr target/m68k: add fscc. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-30block: Exploit BDRV_BLOCK_EOF for larger zero blocksEric Blake3-15/+28
When we have a BDS with unallocated clusters, but asking the status of its underlying bs->file or backing layer encounters an end-of-file condition, we know that the rest of the unallocated area will read as zeroes. However, pre-patch, this required two separate calls to bdrv_get_block_status(), as the first call stops at the point where the underlying file ends. Thanks to BDRV_BLOCK_EOF, we can now widen the results of the primary status if the secondary status already includes BDRV_BLOCK_ZERO. In turn, this fixes a TODO mentioned in iotest 154, where we can now see that all sectors in a partial cluster at the end of a file read as zero when coupling the shorter backing file's status along with our knowledge that the remaining sectors came from an unallocated cluster. Also, note that the loop in bdrv_co_get_block_status_above() had an inefficent exit: in cases where the active layer sets BDRV_BLOCK_ZERO but does NOT set BDRV_BLOCK_ALLOCATED (namely, where we know we read zeroes merely because our unallocated clusters lie beyond the backing file's shorter length), we still ended up probing the backing layer even though we already had a good answer. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170505021500.19315-3-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-30block: Add BDRV_BLOCK_EOF to bdrv_get_block_status()Eric Blake2-4/+13
Just as the block layer already sets BDRV_BLOCK_ALLOCATED as a shortcut for subsequent operations, there are also some optimizations that are made easier if we can quickly tell that *pnum will advance us to the end of a file, via a new BDRV_BLOCK_EOF which gets set by the block layer. This just plumbs up the new bit; subsequent patches will make use of it. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170505021500.19315-2-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-30Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into ↵Peter Maydell8-26/+101
staging # gpg: Signature made Fri 30 Jun 2017 12:46:17 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: virtio-pci: use ioeventfd even when KVM is disabled tests: fix virtio-net-test ISR dependence tests: fix virtio-blk-test ISR dependence tests: fix virtio-scsi-test ISR dependence libqos: add virtio used ring support libqos: fix typo in virtio.h QVirtQueue->used comment virtio-blk: trace vdev so devices can be distinguished Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-30Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170630' ↵Peter Maydell18-308/+749
into staging ppc patch queue 2017-06-30 * More DRC cleanups, these now actually fix a few bugs * Properly implements the openpic timers (they now count and generate interrupts) * Fixes for XICS migration * Fixes for migration of POWER9 RPT guests * The last of the compatibility mode rework # gpg: Signature made Fri 30 Jun 2017 10:52:25 BST # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.10-20170630: (21 commits) spapr: Clean up DRC set_isolation_state() path spapr: Clean up DRC set_allocation_state path spapr: Make DRC reset force DRC into known state spapr: Split DRC release from DRC detach spapr: Eliminate DRC 'signalled' state variable spapr: Start hotplugged PCI devices in ISOLATED state target-ppc: Enable open-pic timers to count and generate interrupts hw/ppc/spapr.c: consecutive 'spapr->patb_entry = 0' statements spapr: prevent QEMU crash when CPU realization fails target/ppc: Proper cleanup when ppc_cpu_realizefn fails spapr: fix migration of ICPState objects from/to older QEMU xics: directly register ICPState objects to vmstate target/ppc: Fix return value in tcg radix mmu fault handler target/ppc/excp_helper: Take BQL before calling cpu_interrupt() spapr: Fix migration of Radix guests spapr: Add a "no HPT" encoding to HTAB migration stream ppc: Rework CPU compatibility testing across migration pseries: Reset CPU compatibility mode pseries: Move CPU compatibility property to machine qapi: add explicit null to string input and output visitors ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-30virtio-pci: use ioeventfd even when KVM is disabledStefan Hajnoczi1-1/+1
Old kvm.ko versions only supported a tiny number of ioeventfds so virtio-pci avoids ioeventfds when kvm_has_many_ioeventfds() returns 0. Do not check kvm_has_many_ioeventfds() when KVM is disabled since it always returns 0. Since commit 8c56c1a592b5092d91da8d8943c17777d6462a6f ("memory: emulate ioeventfd") it has been possible to use ioeventfds in qtest or TCG mode. This patch makes -device virtio-blk-pci,iothread=iothread0 work even when KVM is disabled. I have tested that virtio-blk-pci works under TCG both with and without iothread. This patch fixes qemu-iotests 068, which was accidentally merged early despite the dependency on ioeventfd. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170628184724.21378-7-stefanha@redhat.com Message-id: 20170615163813.7255-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-30tests: fix virtio-net-test ISR dependenceStefan Hajnoczi1-3/+3
Use the new used ring APIs instead of assuming ISR being set means the request has completed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170628184724.21378-6-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-30tests: fix virtio-blk-test ISR dependenceStefan Hajnoczi1-10/+17
Use the new used ring APIs instead of assuming ISR being set means the request has completed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170628184724.21378-5-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-30tests: fix virtio-scsi-test ISR dependenceStefan Hajnoczi1-1/+1
Use the new used ring APIs instead of assuming ISR being set means the request has completed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170628184724.21378-4-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-30libqos: add virtio used ring supportStefan Hajnoczi2-0/+66
Existing tests do not touch the virtqueue used ring. Instead they poll the virtqueue ISR register and peek into their request's device-specific status field. It turns out that the virtqueue ISR register can be set to 1 more than once for a single notification (see commit 83d768b5640946b7da55ce8335509df297e2c7cd "virtio: set ISR on dataplane notifications"). This causes problems for tests that assume a 1:1 correspondence between the ISR being 1 and request completion. Peeking at device-specific status fields is also problematic if the device has no field that can be abused for EINPROGRESS polling semantics. This is the case if all the field's values may be set by the device; there's no magic constant left for polling. It's time to process the used ring for completed requests, just like a real virtio guest driver. This patch adds the necessary APIs. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170628184724.21378-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-30libqos: fix typo in virtio.h QVirtQueue->used commentStefan Hajnoczi1-1/+1
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170628184724.21378-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-30spapr: Clean up DRC set_isolation_state() pathDavid Gibson2-46/+105
There are substantial differences in the various paths through set_isolation_state(), both for setting to ISOLATED versus UNISOLATED state and for logical versus physical DRCs. So, split the set_isolation_state() method into isolate() and unisolate() methods, and give it different implementations for the two DRC types. Factor some minimal common checks, including for valid indicator values (which we weren't previously checking) into rtas_set_isolation_state(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30spapr: Clean up DRC set_allocation_state pathDavid Gibson2-30/+45
The allocation-state indicator should only actually be implemented for "logical" DRCs, not physical ones. Factor a check for this, and also for valid indicator state values into rtas_set_allocation_state(). Because they don't exist for physical DRCs, there's no reason that we'd ever want more than one method implementation, so it can just be a plain function. In addition, the setting to USABLE and setting to UNUSABLE paths in set_allocation_state() don't actually have much in common. So, split the method separate functions for each parameter value (drc_set_usable() and drc_set_unusable()). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30spapr: Make DRC reset force DRC into known stateDavid Gibson2-34/+16
The reset handler for DRCs attempts several state transitions which are subject to various checks and restrictions. But at reset time we know there is no guest, so we can ignore most of the usual sequencing rules and just set the DRC back to a known state. In fact, it's safer to do so. The existing code also has several redundant checks for drc->awaiting_release inside a block which has already tested that. This patch removes those and sets the DRC to a fixed initial state based only on whether a device is currently plugged or not. With DRCs correctly reset to a state based on device presence, we don't need to force state transitions as cold plugged devices are processed. This allows us to remove all the callers of the set_*_state() methods from outside spapr_drc.c. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30spapr: Split DRC release from DRC detachDavid Gibson1-22/+27
spapr_drc_detach() is called when qemu generic code requests a device be unplugged. It makes a number of tests, which could well delay further action until later, before actually detach the device from the DRC. This splits out the part which actually removes the device from the DRC into spapr_drc_release(). This will be useful for further cleanups. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30spapr: Eliminate DRC 'signalled' state variableDavid Gibson3-56/+1
The 'signalled' field in the DRC appears to be entirely a torturous workaround for the fact that PCI devices were started in UNISOLATED state for unclear reasons. 1) 'signalled' is already meaningless for logical (so far, all non PCI) DRCs. It's always set to true (at least at any point it might be tested), and can't be assigned any real meaning due to the way signalling works for logical DRCs. 2) For PCI DRCs, the only time signalled would be false is when non-zero functions of a multifunction device are hotplugged, followed by function zero (the other way around is explicitly not permitted). In that case the secondary function DRCs are attached, but the notification isn't sent to the guest until function 0 is plugged. 3) signalled being false is used to allow a DRC detach to switch mode back to ISOLATED state, which allows a secondary function to be hotplugged then unplugged with function 0 never inserted. Without this a secondary function starting in UNISOLATED state couldn't be detached again without function 0 being inserted, all the functions configured by the guest, then sent back to ISOLATED state. 4) But now that PCI DRCs start in ISOLATED state, there's nothing to be done. If the guest doesn't get the notification, it won't switch the device to UNISOLATED state, so nothing prevents it from being unplugged. If the guest does move it to UNISOLATED state without the signal (due to a manual drmgr call, for instance) then it really isn't safe to unplug it. So, this patch removes the signalled variable and all code related to it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30spapr: Start hotplugged PCI devices in ISOLATED stateDavid Gibson1-10/+0
PCI DRCs, and only PCI DRCs, are immediately moved to UNISOLATED isolation state once the device is attached. This has been there from the initial implementation, and it's not clear why. The state diagram in PAPR 13.4 suggests PCI devices should start in ISOLATED state until the guest moves them into UNISOLATED, and the code in the guest-side drmgr tool seems to work that way too. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org>
2017-06-30target-ppc: Enable open-pic timers to count and generate interruptsAaron Larson1-6/+111
Previously QEMU open-pic implemented the 4 open-pic timers including all timer registers, but the timers did not "count" or generate any interrupts. The patch makes the timers both count and generate interrupts. The timer clock frequency is fixed at 25MHZ. -- Responding to V2 patch comments. - Simplify clock frequency logic and commentary. - Remove camelCase variables. - Timer objects now created at init rather than lazily. Signed-off-by: Aaron Larson <alarson@ddci.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-30hw/ppc/spapr.c: consecutive 'spapr->patb_entry = 0' statementsDaniel Henrique Barboza1-1/+0
In ppc_spapr_reset(), if the guest is using HPT, the code was executing: } else { spapr->patb_entry = 0; spapr_setup_hpt_and_vrma(spapr); } And, at the end of spapr_setup_hpt_and_vrma: /* We're setting up a hash table, so that means we're not radix */ spapr->patb_entry = 0; Resulting in spapr->patb_entry being assigned to 0 twice in a row. Given that 'spapr_setup_hpt_and_vrma' is also called inside 'spapr_check_setup_free_hpt' of spapr_hcall.c, this trivial patch removes the 'patb_entry = 0' assignment from the 'else' clause inside ppc_spapr_reset to avoid this behavior. Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-30spapr: prevent QEMU crash when CPU realization failsBharata B Rao1-3/+4
ICPState objects were being allocated before CPU thread realization. However commit 9ed656631d73 (xics: setup cpu at realize time) reversed it by allocating ICPState objects after CPU thread is realized. But it didn't take care to fix the error path because of which we observe a SIGSEGV when CPU thread realization fails during cold/hotplug. Fix this by ensuring that we do object_unparent() of ICPState object only in case when is was created earlier. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-30target/ppc: Proper cleanup when ppc_cpu_realizefn failsBharata B Rao1-4/+8
If ppc_cpu_realizefn() fails after cpu_exec_realizefn() has been called, we will have to undo whatever cpu_exec_realizefn() did by explicitly calling cpu_exec_unrealizeffn() which is currently missing. Failure to do this proper cleanup will result in CPU which was never fully realized to linger on the cpus list causing SIGSEGV later (for eg when running "info cpus"). Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-30spapr: fix migration of ICPState objects from/to older QEMUGreg Kurz2-2/+87
Commit 5bc8d26de20c ("spapr: allocate the ICPState object from under sPAPRCPUCore") moved ICPState objects from the machine to CPU cores. This is an improvement since we no longer allocate ICPState objects that will never be used. But it has the side-effect of breaking migration of older machine types from older QEMU versions. This patch allows spapr to register dummy "icp/server" entries to vmstate. These entries use a dedicated VMStateDescription that can swallow and discard state of an incoming migration stream, and that don't send anything on outgoing migration. As for real ICPState objects, the instance_id is the cpu_index of the corresponding vCPU, which happens to be equal to the generated instance_id of older machine types. The machine can unregister/register these entries when CPUs are dynamically plugged/unplugged. This is only available for pseries-2.9 and older machines, thanks to a compat property. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-30xics: directly register ICPState objects to vmstateGreg Kurz1-1/+4
The ICPState objects are currently registered to vmstate as qdev objects. Their instance ids are hence computed automatically in the migration code, and thus depends on the order the CPU cores were plugged. If the destination had its CPU cores plugged in a different order than the source, then ICPState objects will have different instance_ids and load the wrong state. Since CPU objects have a reliable cpu_index which is already used as instance_id in vmstate, let's use it for ICPState as well. Please note that this doesn't break migration. Older machine types used to allocate and realize all ICPState objects at machine init time, for the whole lifetime of the machine. The qdev instance ids are thus 0,1,2... nr_servers and happen to map to the vCPU indexes. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-30target/ppc: Fix return value in tcg radix mmu fault handlerSuraj Jitindar Singh1-1/+1
The mmu fault handler should return 0 if it was able to successfully handle the fault and a positive value otherwise. Currently the tcg radix mmu fault handler will return 1 after successfully handling a fault in virtual mode. This is incorrect so fix it so that it returns 0 in this case. The handler already correctly returns 0 when a fault was handled in real mode and 1 if an interrupt was generated. Fixes: d5fee0bbe68d ("target/ppc: Implement ISA V3.00 radix page fault handler") Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-30target/ppc/excp_helper: Take BQL before calling cpu_interrupt()Thomas Huth1-0/+3
Since the introduction of MTTCG, using the msgsnd instruction abort()s if being called without holding the BQL. So let's protect that part of the code now with qemu_mutex_lock_iothread(). Buglink: https://bugs.launchpad.net/qemu/+bug/1694998 Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-30spapr: Fix migration of Radix guestsBharata B Rao1-0/+12
Fix migration of radix guests by ensuring that we issue KVM_PPC_CONFIGURE_V3_MMU for radix case post migration. Reported-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>