aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-11-22block: Pass unaligned discard requests to driversEric Blake1-13/+32
Discard is advisory, so rounding the requests to alignment boundaries is never semantically wrong from the data that the guest sees. But at least the Dell Equallogic iSCSI SANs has an interesting property that its advertised discard alignment is 15M, yet documents that discarding a sequence of 1M slices will eventually result in the 15M page being marked as discarded, and it is possible to observe which pages have been discarded. Between commits 9f1963b and b8d0a980, we converted the block layer to a byte-based interface that ultimately ignores any unaligned head or tail based on the driver's advertised discard granularity, which means that qemu 2.7 refuses to pass any discard request smaller than 15M down to the Dell Equallogic hardware. This is a slight regression in behavior compared to earlier qemu, where a guest executing discards in power-of-2 chunks used to be able to get every page discarded, but is now left with various pages still allocated because the guest requests did not align with the hardware's 15M pages. Since the SCSI specification says nothing about a minimum discard granularity, and only documents the preferred alignment, it is best if the block layer gives the driver every bit of information about discard requests, rather than rounding it to alignment boundaries early. Rework the block layer discard algorithm to mirror the write zero algorithm: always peel off any unaligned head or tail and manage that in isolation, then do the bulk of the request on an aligned boundary. The fallback when the driver returns -ENOTSUP for an unaligned request is to silently ignore that portion of the discard request; but for devices that can pass the partial request all the way down to hardware, this can result in the hardware coalescing requests and discarding aligned pages after all. Reported by: Peter Lieven <pl@kamp.de> CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-11-22block: Return -ENOTSUP rather than assert on unaligned discardsEric Blake3-3/+11
Right now, the block layer rounds discard requests, so that individual drivers are able to assert that discard requests will never be unaligned. But there are some ISCSI devices that track and coalesce multiple unaligned requests, turning it into an actual discard if the requests eventually cover an entire page, which implies that it is better to always pass discard requests as low down the stack as possible. In isolation, this patch has no semantic effect, since the block layer currently never passes an unaligned request through. But the block layer already has code that silently ignores drivers that return -ENOTSUP for a discard request that cannot be honored (as well as drivers that return 0 even when nothing was done). But the next patch will update the block layer to fragment discard requests, so that clients are guaranteed that they are either dealing with an unaligned head or tail, or an aligned core, making it similar to the block layer semantics of write zero fragmentation. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-11-22block: Let write zeroes fallback work even with small max_transferEric Blake1-5/+8
Commit 443668ca rewrote the write_zeroes logic to guarantee that an unaligned request never crosses a cluster boundary. But in the rewrite, the new code assumed that at most one iteration would be needed to get to an alignment boundary. However, it is easy to trigger an assertion failure: the Linux kernel limits loopback devices to advertise a max_transfer of only 64k. Any operation that requires falling back to writes rather than more efficient zeroing must obey max_transfer during that fallback, which means an unaligned head may require multiple iterations of the write fallbacks before reaching the aligned boundaries, when layering a format with clusters larger than 64k atop the protocol of file access to a loopback device. Test case: $ qemu-img create -f qcow2 -o cluster_size=1M file 10M $ losetup /dev/loop2 /path/to/file $ qemu-io -f qcow2 /dev/loop2 qemu-io> w 7m 1k qemu-io> w -z 8003584 2093056 In fairness to Denis (as the original listed author of the culprit commit), the faulty logic for at most one iteration is probably all my fault in reworking his idea. But the solution is to restore what was in place prior to that commit: when dealing with an unaligned head or tail, iterate as many times as necessary while fragmenting the operation at max_transfer boundaries. Reported-by: Ed Swierk <eswierk@skyportsystems.com> CC: qemu-stable@nongnu.org CC: Denis V. Lunev <den@openvz.org> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-11-22qcow2: Inform block layer about discard boundariesEric Blake1-0/+1
At the qcow2 layer, discard is only possible on a per-cluster basis; at the moment, qcow2 silently rounds any unaligned requests to this granularity. However, an upcoming patch will fix a regression in the block layer ignoring too much of an unaligned discard request, by changing the block layer to break up a discard request at alignment boundaries; for that to work, the block layer must know about our limits. However, we can't go one step further by changing qcow2_discard_clusters() to assert that requests are always aligned, since that helper function is reached on paths outside of the block layer. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-11-22Fix FreeBSD (10.x) build after 7dc9ae43Ed Maste2-0/+3
Include sys/user.h for declaration of 'struct kinfo_proc'. Add -lutil to qemu-ga link for kinfo_getproc. Signed-off-by: Ed Maste <emaste@freebsd.org> Message-id: 1479778365-11315-1-git-send-email-emaste@freebsd.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-22Merge remote-tracking branch 'jtc/tags/block-pull-request' into stagingStefan Hajnoczi1-1/+2
# gpg: Signature made Mon 21 Nov 2016 10:12:43 PM GMT # gpg: using RSA key 0xBDBE7B27C0DE3057 # gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>" # gpg: aka "Jeffrey Cody <jeff@codyprime.org>" # gpg: aka "Jeffrey Cody <codyprime@gmail.com>" # Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057 * jtc/tags/block-pull-request: gluster: Fix use after free in glfs_clear_preopened() Message-id: 1479766499-29972-1-git-send-email-jcody@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-21gluster: Fix use after free in glfs_clear_preopened()Kevin Wolf1-1/+2
This fixes a use-after-free bug introduced in commit 6349c154. We need to use QLIST_FOREACH_SAFE() when freeing elements in the loop. Spotted by Coverity. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1479378608-11962-1-git-send-email-kwolf@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-21Merge remote-tracking branch 'sstabellini/tags/xen-20161108-tag' into stagingStefan Hajnoczi2-6/+4
Xen 2016/11/08 # gpg: Signature made Tue 08 Nov 2016 07:48:12 PM GMT # gpg: using RSA key 0x894F8F4870E1AE90 # gpg: Good signature from "Stefano Stabellini <sstabellini@kernel.org>" # gpg: aka "Stefano Stabellini <stefano.stabellini@eu.citrix.com>" # Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90 * sstabellini/tags/xen-20161108-tag: xen: Fix xenpv machine initialisation Message-id: alpine.DEB.2.10.1611081150170.3491@sstabellini-ThinkPad-X260 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-21Merge remote-tracking branch 'mst/tags/for_upstream' into stagingStefan Hajnoczi15-57/+151
virtio, vhost, pc: fixes Most notably this fixes a regression with vhost introduced by the pull before last. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Fri 18 Nov 2016 03:51:55 PM GMT # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * mst/tags/for_upstream: acpi: Use apic_id_limit when calculating legacy ACPI table size ipmi: fix qemu crash while migrating with ipmi ivshmem: Fix 64 bit memory bar configuration virtio: set ISR on dataplane notifications virtio: access ISR atomically virtio: introduce grab/release_ioeventfd to fix vhost virtio-crypto: fix virtio_queue_set_notification() race Message-id: 1479484366-7977-1-git-send-email-mst@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-18acpi: Use apic_id_limit when calculating legacy ACPI table sizeEduardo Habkost1-1/+1
The code that calculates the legacy ACPI table size for migration compatibility uses max_cpus when calculating legacy_aml_len (the size of the DSDT and SSDT tables). However, the SSDT grows according to APIC ID limit, not max_cpus. The bug is not triggered very often because of the 4k alignment on the table size. But it can be triggered if you are unlucky enough to cross a 4k boundary. Change the legacy_aml_len calculation to use apic_id_limit, to calculate the right size. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-18ipmi: fix qemu crash while migrating with ipmiZhuangYanying1-4/+2
Qemu crash in the source side while migrating, after starting ipmi service inside vm. ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 \ -drive file=/work/suse/suse11_sp3_64_vt,format=raw,if=none,id=drive-virtio-disk0,cache=none \ -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 \ -vnc :99 -monitor vc -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-kcs,bmc=bmc0,ioport=0xca2 Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7ffec4268700 (LWP 7657)] __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757 (gdb) bt #0 __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757 #1 0x00005555559ef775 in memcpy (__len=3, __src=0xc1421c, __dest=<optimized out>) at /usr/include/bits/string3.h:51 #2 qemu_put_buffer (f=0x555557a97690, buf=0xc1421c <Address 0xc1421c out of bounds>, size=3) at migration/qemu-file.c:346 #3 0x00005555559eef66 in vmstate_save_state (f=f@entry=0x555557a97690, vmsd=0x555555f8a5a0 <vmstate_ISAIPMIKCSDevice>, opaque=0x555557231160, vmdesc=vmdesc@entry=0x55555798cc40) at migration/vmstate.c:333 #4 0x00005555557cfe45 in vmstate_save (f=f@entry=0x555557a97690, se=se@entry=0x555557231de0, vmdesc=vmdesc@entry=0x55555798cc40) at /mnt/sdb/zyy/qemu/migration/savevm.c:720 #5 0x00005555557d2be7 in qemu_savevm_state_complete_precopy (f=0x555557a97690, iterable_only=iterable_only@entry=false) at /mnt/sdb/zyy/qemu/migration/savevm.c:1128 #6 0x00005555559ea102 in migration_completion (start_time=<synthetic pointer>, old_vm_running=<synthetic pointer>, current_active_state=<optimized out>, s=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1707 #7 migration_thread (opaque=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1855 #8 0x00007ffff3900dc5 in start_thread (arg=0x7ffec4268700) at pthread_create.c:308 #9 0x00007fffefc6c71d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-18ivshmem: Fix 64 bit memory bar configurationZhuang Yanying1-1/+3
Device ivshmem property use64=0 is designed to make the device expose a 32 bit shared memory BAR instead of 64 bit one. The default is a 64 bit BAR, except pc-1.2 and older retain a 32 bit BAR. A 32 bit BAR can support only up to 1 GiB of shared memory. This worked as designed until commit 5400c02 accidentally flipped its sense: since then, we misinterpret use64=0 as use64=1 and vice versa. Worse, the default got flipped as well. Devices ivshmem-plain and ivshmem-doorbell are not affected. Fix by restoring the test of IVShmemState member not_legacy_32bit that got messed up in commit 5400c02. Also update its initialization for devices ivhsmem-plain and ivshmem-doorbell. Without that, they'd regress to 32 bit BARs. Cc: qemu-stable@nongnu.org Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-11-18virtio: set ISR on dataplane notificationsPaolo Bonzini7-22/+32
Dataplane has been omitting forever the step of setting ISR when an interrupt is raised. This caused little breakage, because the specification actually says that ISR may not be updated in MSI mode. Some versions of the Windows drivers however didn't clear MSI mode correctly, and proceeded using polling mode (using ISR, not the used ring index!) for crashdump and hibernation. If it were just crashdump and hibernation it would not be a big deal, but recent releases of Windows do not really shut down, but rather log out and hibernate to make the next startup faster. Hence, this manifested as a more serious hang during shutdown with e.g. Windows 8.1 and virtio-win 1.8.0 RPMs. Newer versions fixed this, while older versions do not use MSI at all. The failure has always been there for virtio dataplane, but it became visible after commits 9ffe337 ("virtio-blk: always use dataplane path if ioeventfd is active", 2016-10-30) and ad07cd6 ("virtio-scsi: always use dataplane path if ioeventfd is active", 2016-10-30) made virtio-blk and virtio-scsi always use the dataplane code under KVM. The good news therefore is that it was not a bug in the patches---they were doing exactly what they were meant for, i.e. shake out remaining dataplane bugs. The fix is not hard, so it's worth arranging for the broken drivers. The virtio_should_notify+event_notifier_set pair that is common to virtio-blk and virtio-scsi dataplane is replaced with a new public function virtio_notify_irqfd that also sets ISR. The irqfd emulation code now need not set ISR anymore, so virtio_irq is removed. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-18virtio: access ISR atomicallyPaolo Bonzini3-14/+23
This will be needed once dataplane will be able to set it outside the big QEMU lock. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-18virtio: introduce grab/release_ioeventfd to fix vhostPaolo Bonzini5-18/+86
Following the recent refactoring of virtio notifiers [1], more specifically the patch ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to start/stop ioeventfd") that uses virtio_bus_set_host_notifier [2] by default, core virtio code requires 'ioeventfd_started' to be set to true/false when the host notifiers are configured. When vhost is stopped and started, however, there is a stop followed by another start. Since ioeventfd_started was never set to true, the 'stop' operation triggered by virtio_bus_set_host_notifier() will not result in a call to virtio_pci_ioeventfd_assign(assign=false). This leaves the memory regions with stale notifiers and results on the next start triggering the following assertion: kvm_mem_ioeventfd_add: error adding ioeventfd: File exists Aborted This patch reintroduces (hopefully in a cleaner way) the concept that was present with ioeventfd_disabled before the refactoring. When ioeventfd_grabbed>0, ioeventfd_started tracks whether ioeventfd should be enabled or not, but ioeventfd is actually not started at all until vhost releases the host notifiers. [1] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07748.html [2] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07760.html Reported-by: Felipe Franciosi <felipe@nutanix.com> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Reported-by: Alex Williamson <alex.williamson@redhat.com> Fixes: ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to start/stop ioeventfd") Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-18Merge remote-tracking branch 'public/tags/tracing-pull-request' into stagingStefan Hajnoczi1-1/+1
# gpg: Signature made Fri 18 Nov 2016 03:01:22 PM GMT # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * public/tags/tracing-pull-request: trace: fix generated code build break Message-id: 1479481289-2479-1-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-18virtio-crypto: fix virtio_queue_set_notification() raceStefan Hajnoczi1-2/+11
We must check for new virtqueue buffers after re-enabling notifications. This prevents the race condition where the guest added buffers just after we stopped popping the virtqueue but before we re-enabled notifications. I think the virtio-crypto code was based on virtio-net but this crucial detail was missed. virtio-net does not have the race condition because it processes the virtqueue one more time after re-enabling notifications. Cc: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com>
2016-11-18Merge remote-tracking branch 'remotes/elmarco/tags/ivshmem-pull-request' ↵Stefan Hajnoczi1-1/+3
into staging * remotes/elmarco/tags/ivshmem-pull-request: ivshmem: Fix 64 bit memory bar configuration Message-id: 20161117152613.18578-1-marcandre.lureau@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-18Merge remote-tracking branch 'rth/tags/pull-axp-20161117' into stagingStefan Hajnoczi3-2/+4
Update alpha palcode for smp # gpg: Signature made Thu 17 Nov 2016 02:57:29 PM GMT # gpg: using RSA key 0xAD1270CC4DD0279B # gpg: Good signature from "Richard Henderson <rth7680@gmail.com>" # gpg: aka "Richard Henderson <rth@redhat.com>" # gpg: aka "Richard Henderson <rth@twiddle.net>" # Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC 16A4 AD12 70CC 4DD0 279B * rth/tags/pull-axp-20161117: target-alpha: Log cpuid with -d int target-alpha: Update palcode for smp Message-id: 1479394965-11254-1-git-send-email-rth@twiddle.net Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-18trace: fix generated code build breakGreg Kurz1-1/+1
If the QEMU source dir is /var/tmp/aaa-qemu-clone and the build dir is /var/tmp/qemu-aio-poll-v2 Then I get an error as: trace/generated-tracers.c:15950:13: error: invalid suffix "_trace_events" on integer constant TraceEvent *2_trace_events[] = { ^ trace/generated-tracers.c:15950:13: error: expected identifier or ‘(’ before numeric constant trace/generated-tracers.c: In function ‘trace_2_register_events’: trace/generated-tracers.c:17949:32: error: invalid suffix "_trace_events" on integer constant trace_event_register_group(2_trace_events); ^ make: *** [trace/generated-tracers.o] Error 1 This patch fixes the issue. Reported-by: Fam Zheng <famz@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org> Tested-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-18Merge remote-tracking branch 'vivier/tags/trivial-patches-pull-request' into ↵Stefan Hajnoczi1-3/+5
staging # gpg: Signature made Thu 17 Nov 2016 10:18:58 AM GMT # gpg: using RSA key 0xF30C38BD3F2FBE3C # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" # gpg: aka "Laurent Vivier <laurent@vivier.eu>" # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * vivier/tags/trivial-patches-pull-request: qapi-schema: clarify 'colo' state for MigrationStatus Message-id: 1479378016-19022-1-git-send-email-laurent@vivier.eu Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-17target-alpha: Log cpuid with -d intRichard Henderson1-2/+4
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-11-17target-alpha: Update palcode for smpRichard Henderson2-0/+0
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-11-17ivshmem: Fix 64 bit memory bar configurationZhuang Yanying1-1/+3
Device ivshmem property use64=0 is designed to make the device expose a 32 bit shared memory BAR instead of 64 bit one. The default is a 64 bit BAR, except pc-1.2 and older retain a 32 bit BAR. A 32 bit BAR can support only up to 1 GiB of shared memory. This worked as designed until commit 5400c02 accidentally flipped its sense: since then, we misinterpret use64=0 as use64=1 and vice versa. Worse, the default got flipped as well. Devices ivshmem-plain and ivshmem-doorbell are not affected. Fix by restoring the test of IVShmemState member not_legacy_32bit that got messed up in commit 5400c02. Also update its initialization for devices ivhsmem-plain and ivshmem-doorbell. Without that, they'd regress to 32 bit BARs. Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1479385863-7648-1-git-send-email-ann.zhuangyanying@huawei.com>
2016-11-17qapi-schema: clarify 'colo' state for MigrationStatuszhanghailiang1-3/+5
VM can not get into colo state unless users enable 'x-colo' capability for migration, Here it is necessary to clarify this. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Message-Id: <1478072652-9884-1-git-send-email-zhang.zhanghailiang@huawei.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2016-11-16pc: fix FW_CFG_NB_CPUS to account for -device added CPUsIgor Mammedov2-17/+29
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1479301481-197333-1-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-11-16fw_cfg: move FW_CFG_NB_CPUS out of fw_cfg_init1()Igor Mammedov7-2/+9
PC will use this field in other way, so move it outside the common code so PC could set a different value, i.e. all CPUs regardless of where they are coming from (-smp X | -device cpu...). It's quick and dirty hack as it could be implemented in more generic way in MashineClass. But do it in simple way since only PC is affected so far. Later we can generalize it when another affected target gets support for -device cpu. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1479212236-183810-3-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-11-16Revert "pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs"Igor Mammedov2-31/+15
This reverts commit 080ac219cc7d9c55adf925c3545b7450055ad625. Legacy FW_CFG_NB_CPUS will be reused instead of 'etc/boot-cpus' fw_cfg file since it does the same and there is no point to maintaing duplicate guest ABI, if it can be helped. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1479212236-183810-2-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-11-15Update version for v2.8.0-rc0 releasev2.8.0-rc0Stefan Hajnoczi1-1/+1
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-15Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingStefan Hajnoczi28-307/+586
virtio, vhost, pc, pci: documentation, fixes and cleanups Lots of fixes all over the place. Unfortunately, this does not yet fix a regression with vhost introduced by the last pull, the issue is typically this error: kvm_mem_ioeventfd_add: error adding ioeventfd: File exists followed by QEMU aborting. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> * remotes/mst/tags/for_upstream: (28 commits) docs: add PCIe devices placement guidelines virtio: drop virtio_queue_get_ring_{size,addr}() vhost: drop legacy vring layout bits vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout nvdimm acpi: introduce NVDIMM_DSM_MEMORY_SIZE nvdimm acpi: use aml_name_decl to define named object nvdimm acpi: rename nvdimm_dsm_reserved_root nvdimm acpi: fix two comments nvdimm acpi: define DSM return codes nvdimm acpi: rename nvdimm_acpi_hotplug nvdimm acpi: cleanup nvdimm_build_fit nvdimm acpi: rename nvdimm_plugged_device_list docs: improve the doc of Read FIT method nvdimm acpi: clean up nvdimm_build_acpi pc: memhp: stop handling nvdimm hotplug in pc_dimm_unplug pc: memhp: move nvdimm hotplug out of memory hotplug nvdimm acpi: drop the lock of fit buffer qdev: hotplug: drop HotplugHandler.post_plug callback vhost: migration blocker only if shared log is used virtio-net: mark VIRTIO_NET_F_GSO as legacy ... Message-id: 1479237527-11846-1-git-send-email-mst@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-15Merge remote-tracking branch 'ehabkost/tags/machine-pull-request' into stagingStefan Hajnoczi1-4/+10
qdev: Fix assert in PCI address property when used by vfio-pci # gpg: Signature made Tue 15 Nov 2016 06:27:18 PM GMT # gpg: using RSA key 0x2807936F984DC5A6 # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * ehabkost/tags/machine-pull-request: qdev: Fix assert in PCI address property when used by vfio-pci Message-id: 1479234540-3192-1-git-send-email-ehabkost@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-15qdev: Fix assert in PCI address property when used by vfio-pciDaniel Oram1-4/+10
Allow the PCIHostDeviceAddress structure to work as the host property in vfio-pci when it has it's default value of all fields set to ~0. In this form the property indicates a non-existant device but given the field bit sizes gets asserted as excess (and invalid) precision overflows the string buffer. The BDF of an invalid device "FFFF:FF:FF.F" is returned instead. Signed-off-by: Daniel Oram <daniel.oram@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Message-Id: <71f06765c4ba16dcd71cbf78e877619948f04ed9.1478777270.git.daniel.oram@gmail.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-11-15Merge remote-tracking branch 'public/tags/block-pull-request' into stagingStefan Hajnoczi1-0/+3
# gpg: Signature made Tue 15 Nov 2016 03:42:29 PM GMT # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * public/tags/block-pull-request: test-replication: fix leaks Message-id: 1479224556-19367-1-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-15test-replication: fix leaksMarc-André Lureau1-0/+3
ASAN spotted: SUMMARY: AddressSanitizer: 301990288 byte(s) leaked in 33 allocation(s). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20161109104547.23861-1-marcandre.lureau@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-15docs: add PCIe devices placement guidelinesMarcel Apfelbaum1-0/+310
Proposes best practices on how to use PCI Express/PCI device in PCI Express based machines and explain the reasoning behind them. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15virtio: drop virtio_queue_get_ring_{size,addr}()Greg Kurz2-13/+0
These are not used anymore. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15vhost: drop legacy vring layout bitsGreg Kurz2-16/+0
The legacy vring layout is not used anymore as we use the separate mappings even for legacy devices. This patch simply removes it. This also fixes a bug with virtio 1 devices when the vring descriptor table is mapped at a higher address than the used vring because the following function may return an insanely great value: hwaddr virtio_queue_get_ring_size(VirtIODevice *vdev, int n) { return vdev->vq[n].vring.used - vdev->vq[n].vring.desc + virtio_queue_get_used_size(vdev, n); } and the mapping fails. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layoutGreg Kurz2-19/+64
With virtio 1, the vring layout is split in 3 separate regions of contiguous memory for the descriptor table, the available ring and the used ring, as opposed with legacy virtio which uses a single region. In case of memory re-mapping, the code ensures it doesn't affect the vring mapping. This is done in vhost_verify_ring_mappings() which assumes the device is legacy. This patch changes vhost_verify_ring_mappings() to check the mappings of each part of the vring separately. This works for legacy mappings as well. Cc: qemu-stable@nongnu.org Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15nvdimm acpi: introduce NVDIMM_DSM_MEMORY_SIZEXiao Guangrong1-13/+17
and use it to replace the raw number Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15nvdimm acpi: use aml_name_decl to define named objectXiao Guangrong1-4/+2
to make the code more clearer Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15nvdimm acpi: rename nvdimm_dsm_reserved_rootXiao Guangrong1-3/+4
Rename it to nvdimm_dsm_handle_reserved_root_method Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15nvdimm acpi: fix two commentsXiao Guangrong1-2/+2
fixed the English issue and code-style issue Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15nvdimm acpi: define DSM return codesXiao Guangrong1-19/+27
and use these codes to refine the code Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15nvdimm acpi: rename nvdimm_acpi_hotplugXiao Guangrong3-3/+3
Rename it to nvdimm_plug() Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15nvdimm acpi: cleanup nvdimm_build_fitXiao Guangrong1-3/+1
inline buf_size to refine the code a bit Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15nvdimm acpi: rename nvdimm_plugged_device_listXiao Guangrong1-9/+8
Its behavior has been changed as the nvdimm device which is being realized also will be handled in this function, so rename it to reflect the fact Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15docs: improve the doc of Read FIT methodXiao Guangrong1-49/+47
Improve the description and clearly document the length field Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15nvdimm acpi: clean up nvdimm_build_acpiXiao Guangrong1-14/+16
To make the code more clearer, we 1) check ram_slots first, and build ssdt & nfit only when it is available 2) use nvdimm_get_plugged_device_list() to check if there is nvdimm device plugged Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15pc: memhp: stop handling nvdimm hotplug in pc_dimm_unplugXiao Guangrong1-6/+0
as it is never called when nvdimm hotplug happens Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-11-15pc: memhp: move nvdimm hotplug out of memory hotplugXiao Guangrong8-29/+34
as they use completely different way to handle hotplug event Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>