aboutsummaryrefslogtreecommitdiff
path: root/hw/virtio/virtio.c
AgeCommit message (Collapse)AuthorFilesLines
2022-01-28Remove unnecessary minimum_version_id_old fieldsPeter Maydell1-1/+0
The migration code will not look at a VMStateDescription's minimum_version_id_old field unless that VMSD has set the load_state_old field to something non-NULL. (The purpose of minimum_version_id_old is to specify what migration version is needed for the code in the function pointed to by load_state_old to be able to handle it on incoming migration.) We have exactly one VMSD which still has a load_state_old, in the PPC CPU; every other VMSD which sets minimum_version_id_old is doing so unnecessarily. Delete all the unnecessary ones. Commit created with: sed -i '/\.minimum_version_id_old/d' $(git grep -l '\.minimum_version_id_old') with the one legitimate use then hand-edited back in. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> --- It missed vmstate_ppc_cpu.
2022-01-12virtio: unify dataplane and non-dataplane ->handle_output()Stefan Hajnoczi1-17/+17
Now that virtio-blk and virtio-scsi are ready, get rid of the handle_aio_output() callback. It's no longer needed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-7-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12virtio: use ->handle_output() instead of ->handle_aio_output()Stefan Hajnoczi1-30/+3
The difference between ->handle_output() and ->handle_aio_output() was that ->handle_aio_output() returned a bool return value indicating progress. This was needed by the old polling API but now that the bool return value is gone, the two functions can be unified. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-6-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12virtio: get rid of VirtIOHandleAIOOutputStefan Hajnoczi1-8/+4
The virtqueue host notifier API virtio_queue_aio_set_host_notifier_handler() polls the virtqueue for new buffers. AioContext previously required a bool progress return value indicating whether an event was handled or not. This is no longer necessary because the AioContext polling API has been split into a poll check function and an event handler function. The event handler is only run when we know there is work to do, so it doesn't return bool. The VirtIOHandleAIOOutput function signature is now the same as VirtIOHandleOutput. Get rid of the bool return value. Further simplifications will be made for virtio-blk and virtio-scsi in the next patch. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12aio-posix: split poll check from ready handlerStefan Hajnoczi1-6/+10
Adaptive polling measures the execution time of the polling check plus handlers called when a polled event becomes ready. Handlers can take a significant amount of time, making it look like polling was running for a long time when in fact the event handler was running for a long time. For example, on Linux the io_submit(2) syscall invoked when a virtio-blk device's virtqueue becomes ready can take 10s of microseconds. This can exceed the default polling interval (32 microseconds) and cause adaptive polling to stop polling. By excluding the handler's execution time from the polling check we make the adaptive polling calculation more accurate. As a result, the event loop now stays in polling mode where previously it would have fallen back to file descriptor monitoring. The following data was collected with virtio-blk num-queues=2 event_idx=off using an IOThread. Before: 168k IOPS, IOThread syscalls: 9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0) = 16 9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8) = 8 9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8) = 8 9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3 9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0) = 32 174k IOPS (+3.6%), IOThread syscalls: 9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0) = 32 9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8) = 8 9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8) = 8 9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50) = 32 Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because the IOThread stays in polling mode instead of falling back to file descriptor monitoring. As usual, polling is not implemented on Windows so this patch ignores the new io_poll_read() callback in aio-win32.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-2-stefanha@redhat.com [Fixed up aio_set_event_notifier() calls in tests/unit/test-fdmon-epoll.c added after this series was queued. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-10Revert "virtio: add support for configure interrupt"Michael S. Tsirkin1-29/+0
This reverts commit 081f864f56307551f59c5e934e3f30a7290d0faa. Fixes: 081f864f56 ("virtio: add support for configure interrupt") Cc: "Cindy Lu" <lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07virtio: signal after wrapping packed used_idxStefan Hajnoczi1-0/+1
Packed Virtqueues wrap used_idx instead of letting it run freely like Split Virtqueues do. If the used ring wraps more than once there is no way to compare vq->signalled_used and vq->used_idx in virtio_packed_should_notify() since they are modulo vq->vring.num. This causes the device to stop sending used buffer notifications when when virtio_packed_should_notify() is called less than once each time around the used ring. It is possible to trigger this with virtio-blk's dataplane notify_guest_bh() irq coalescing optimization. The call to virtio_notify_irqfd() (and virtio_packed_should_notify()) is deferred to a BH. If the guest driver is polling it can complete and submit more requests before the BH executes, causing the used ring to wrap more than once. The result is that the virtio-blk device ceases to raise interrupts and I/O hangs. Cc: Tiwei Bie <tiwei.bie@intel.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211130134510.267382-1-stefanha@redhat.com> Fixes: 86044b24e865fb9596ed77a4d0f3af8b90a088a1 ("virtio: basic packed virtqueue support") Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06virtio: add support for configure interruptCindy Lu1-0/+29
Add the functions to support the configure interrupt in virtio The function virtio_config_guest_notifier_read will notify the guest if there is an configure interrupt. The function virtio_config_set_guest_notifier_fd_handler is to set the fd hander for the notifier Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-7-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-12-30dma: Let dma_memory_map() take MemTxAttrs argumentPhilippe Mathieu-Daudé1-2/+4
Let devices specify transaction attributes when calling dma_memory_map(). Patch created mechanically using spatch with this script: @@ expression E1, E2, E3, E4; @@ - dma_memory_map(E1, E2, E3, E4) + dma_memory_map(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211223115554.3155328-7-philmd@redhat.com>
2021-11-15virtio: use virtio accessor to access packed eventJason Wang1-9/+4
We used to access packed descriptor event and off_wrap via address_space_{write|read}_cached(). When we hit the cache, memcpy() is used which is not atomic which may lead a wrong value to be read or wrote. This patch fixes this by switching to use virito_{stw|lduw}_phys_cached() to make sure the access is atomic. Fixes: 683f7665679c1 ("virtio: event suppression support for packed ring") Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20211111063854.29060-2-jasowang@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-15virtio: use virtio accessor to access packed descriptor flagsJason Wang1-7/+4
We used to access packed descriptor flags via address_space_{write|read}_cached(). When we hit the cache, memcpy() is used which is not an atomic operation which may lead a wrong value is read or wrote. So this patch switches to use virito_{stw|lduw}_phys_cached() to make sure the aceess is atomic. Fixes: 86044b24e865f ("virtio: basic packed virtqueue support") Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20211111063854.29060-1-jasowang@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-05hw/virtio: Have virtqueue_get_avail_bytes() pass caches arg to calleesPhilippe Mathieu-Daudé1-17/+12
Both virtqueue_packed_get_avail_bytes() and virtqueue_split_get_avail_bytes() access the region cache, but their caller also does. Simplify by having virtqueue_get_avail_bytes calling both with RCU lock held, and passing the caches as argument. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210906104318.1569967-4-philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-10-05hw/virtio: Acquire RCU read lock in virtqueue_packed_drop_all()Philippe Mathieu-Daudé1-0/+2
vring_get_region_caches() must be called with the RCU read lock acquired. virtqueue_packed_drop_all() does not, and uses the 'caches' pointer. Fix that by using the RCU_READ_LOCK_GUARD() macro. Reported-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210906104318.1569967-3-philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-09-30memory: Name all the memory listenersPeter Xu1-0/+1
Provide a name field for all the memory listeners. It can be used to identify which memory listener is which. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210817013553.30584-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-04hw/virtio: Remove NULL check in virtio_free_region_cache()Philippe Mathieu-Daudé1-4/+2
virtio_free_region_cache() is called within call_rcu(), always with a non-NULL argument. Ensure new code keep it that way by replacing the NULL check by an assertion. Add a comment this function is called within call_rcu(). Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210826172658.2116840-3-philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-04hw/virtio: Document virtio_queue_packed_empty_rcu is called within RCUPhilippe Mathieu-Daudé1-0/+1
While virtio_queue_packed_empty_rcu() uses the '_rcu' suffix, it is not obvious it is called within rcu_read_lock(). All other functions from this file called with the RCU locked have a comment describing it. Document this one similarly for consistency. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210826172658.2116840-2-philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-07-09hw/virtio: Document *_should_notify() are called within rcu_read_lock()Philippe Mathieu-Daudé1-0/+2
Such comments make reviewing this file somehow easier. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20210523094040.3516968-1-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-07-02virtio: Clarify MR transaction optimizationGreg Kurz1-0/+16
The device model batching its ioeventfds in a single MR transaction is an optimization. Clarify this in virtio-scsi, virtio-blk and generic virtio code. Also clarify that the transaction must commit before closing ioeventfds so that no one is tempted to merge the loops in the start functions error path and in the stop functions. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <162125799728.1394228.339855768563326832.stgit@bahia.lan> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-26cpu: Introduce cpu_virtio_is_big_endian()Philippe Mathieu-Daudé1-3/+1
Introduce the cpu_virtio_is_big_endian() generic helper to avoid calling CPUClass internal virtio_is_big_endian() one. Similarly to commit bf7663c4bd8 ("cpu: introduce CPUClass::virtio_is_big_endian()"), we keep 'virtio' in the method name to hint this handler shouldn't be called anywhere but from the virtio code. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210517105140.1062037-8-f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-05-16Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell1-1/+1
pc,pci,virtio: bugfixes, improvements Fixes all over the place. Faster boot for virtio. ioeventfd support for mmio. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Fri 14 May 2021 15:27:13 BST # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: Fix build with 64 bits time_t vhost-vdpa: Make vhost_vdpa_get_device_id() static hw/virtio: enable ioeventfd configuring for mmio hw/smbios: support for type 41 (onboard devices extended information) checkpatch: Fix use of uninitialized value virtio-scsi: Configure all host notifiers in a single MR transaction virtio-scsi: Set host notifiers and callbacks separately virtio-blk: Configure all host notifiers in a single MR transaction virtio-blk: Fix rollback path in virtio_blk_data_plane_start() pc-dimm: remove unnecessary get_vmstate_memory_region() method amd_iommu: fix wrong MMIO operations virtio-net: Constify VirtIOFeature feature_sizes[] virtio-blk: Constify VirtIOFeature feature_sizes[] hw/virtio: Pass virtio_feature_get_config_size() a const argument x86: acpi: use offset instead of pointer when using build_header() amd_iommu: Fix pte_override_page_mask() Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # hw/arm/virt.c
2021-05-14hw/virtio: Pass virtio_feature_get_config_size() a const argumentPhilippe Mathieu-Daudé1-1/+1
The VirtIOFeature structure isn't modified, mark it const. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210511104157.2880306-2-philmd@redhat.com>
2021-05-02Do not include exec/address-spaces.h if it's not really necessaryThomas Huth1-1/+0
Stop including exec/address-spaces.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-5-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-03-09sysemu: Let VMChangeStateHandler take boolean 'running' argumentPhilippe Mathieu-Daudé1-1/+1
The 'running' argument from VMChangeStateHandler does not require other value than 0 / 1. Make it a plain boolean. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20210111152020.1422021-3-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-02-05virtio: Add corresponding memory_listener_unregister to unrealizeEugenio Pérez1-1/+1
Address space is destroyed without proper removal of its listeners with current code. They are expected to be removed in virtio_device_instance_finalize [1], but qemu calls it through object_deinit, after address_space_destroy call through device_set_realized [2]. Move it to virtio_device_unrealize, called before device_set_realized [3] and making it symmetric with memory_listener_register in virtio_device_realize. v2: Delete no-op call of virtio_device_instance_finalize. Add backtraces. [1] #0 virtio_device_instance_finalize (obj=0x555557de5120) at /home/qemu/include/hw/virtio/virtio.h:71 #1 0x0000555555b703c9 in object_deinit (type=0x555556639860, obj=<optimized out>) at ../qom/object.c:671 #2 object_finalize (data=0x555557de5120) at ../qom/object.c:685 #3 object_unref (objptr=0x555557de5120) at ../qom/object.c:1184 #4 0x0000555555b4de9d in bus_free_bus_child (kid=0x555557df0660) at ../hw/core/qdev.c:55 #5 0x0000555555c65003 in call_rcu_thread (opaque=opaque@entry=0x0) at ../util/rcu.c:281 Queued by: #0 bus_remove_child (bus=0x555557de5098, child=child@entry=0x555557de5120) at ../hw/core/qdev.c:60 #1 0x0000555555b4ee31 in device_unparent (obj=<optimized out>) at ../hw/core/qdev.c:984 #2 0x0000555555b70465 in object_finalize_child_property ( obj=<optimized out>, name=<optimized out>, opaque=0x555557de5120) at ../qom/object.c:1725 #3 0x0000555555b6fa17 in object_property_del_child ( child=0x555557de5120, obj=0x555557ddcf90) at ../qom/object.c:645 #4 object_unparent (obj=0x555557de5120) at ../qom/object.c:664 #5 0x0000555555b4c071 in bus_unparent (obj=<optimized out>) at ../hw/core/bus.c:147 #6 0x0000555555b70465 in object_finalize_child_property ( obj=<optimized out>, name=<optimized out>, opaque=0x555557de5098) at ../qom/object.c:1725 #7 0x0000555555b6fa17 in object_property_del_child ( child=0x555557de5098, obj=0x555557ddcf90) at ../qom/object.c:645 #8 object_unparent (obj=0x555557de5098) at ../qom/object.c:664 #9 0x0000555555b4ee19 in device_unparent (obj=<optimized out>) at ../hw/core/qdev.c:981 #10 0x0000555555b70465 in object_finalize_child_property ( obj=<optimized out>, name=<optimized out>, opaque=0x555557ddcf90) at ../qom/object.c:1725 #11 0x0000555555b6fa17 in object_property_del_child ( child=0x555557ddcf90, obj=0x55555685da10) at ../qom/object.c:645 #12 object_unparent (obj=0x555557ddcf90) at ../qom/object.c:664 #13 0x00005555558dc331 in pci_for_each_device_under_bus ( opaque=<optimized out>, fn=<optimized out>, bus=<optimized out>) at ../hw/pci/pci.c:1654 [2] Optimizer omits pci_qdev_unrealize, called by device_set_realized, and do_pci_unregister_device, called by pci_qdev_unrealize and caller of address_space_destroy. #0 address_space_destroy (as=0x555557ddd1b8) at ../softmmu/memory.c:2840 #1 0x0000555555b4fc53 in device_set_realized (obj=0x555557ddcf90, value=<optimized out>, errp=0x7fffeea8f1e0) at ../hw/core/qdev.c:850 #2 0x0000555555b6eaa6 in property_set_bool (obj=0x555557ddcf90, v=<optimized out>, name=<optimized out>, opaque=0x555556650ba0, errp=0x7fffeea8f1e0) at ../qom/object.c:2255 #3 0x0000555555b70e07 in object_property_set ( obj=obj@entry=0x555557ddcf90, name=name@entry=0x555555db99df "realized", v=v@entry=0x7fffe46b7500, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/object.c:1400 #4 0x0000555555b73c5f in object_property_set_qobject ( obj=obj@entry=0x555557ddcf90, name=name@entry=0x555555db99df "realized", value=value@entry=0x7fffe44f6180, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/qom-qobject.c:28 #5 0x0000555555b71044 in object_property_set_bool ( obj=0x555557ddcf90, name=0x555555db99df "realized", value=<optimized out>, errp=0x5555565bbf38 <error_abort>) at ../qom/object.c:1470 #6 0x0000555555921cb7 in pcie_unplug_device (bus=<optimized out>, dev=0x555557ddcf90, opaque=<optimized out>) at /home/qemu/include/hw/qdev-core.h:17 #7 0x00005555558dc331 in pci_for_each_device_under_bus ( opaque=<optimized out>, fn=<optimized out>, bus=<optimized out>) at ../hw/pci/pci.c:1654 [3] #0 virtio_device_unrealize (dev=0x555557de5120) at ../hw/virtio/virtio.c:3680 #1 0x0000555555b4fc63 in device_set_realized (obj=0x555557de5120, value=<optimized out>, errp=0x7fffee28df90) at ../hw/core/qdev.c:850 #2 0x0000555555b6eab6 in property_set_bool (obj=0x555557de5120, v=<optimized out>, name=<optimized out>, opaque=0x555556650ba0, errp=0x7fffee28df90) at ../qom/object.c:2255 #3 0x0000555555b70e17 in object_property_set ( obj=obj@entry=0x555557de5120, name=name@entry=0x555555db99ff "realized", v=v@entry=0x7ffdd8035040, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/object.c:1400 #4 0x0000555555b73c6f in object_property_set_qobject ( obj=obj@entry=0x555557de5120, name=name@entry=0x555555db99ff "realized", value=value@entry=0x7ffdd8035020, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/qom-qobject.c:28 #5 0x0000555555b71054 in object_property_set_bool ( obj=0x555557de5120, name=name@entry=0x555555db99ff "realized", value=value@entry=false, errp=0x5555565bbf38 <error_abort>) at ../qom/object.c:1470 #6 0x0000555555b4edc5 in qdev_unrealize (dev=<optimized out>) at ../hw/core/qdev.c:403 #7 0x0000555555b4c2a9 in bus_set_realized (obj=<optimized out>, value=<optimized out>, errp=<optimized out>) at ../hw/core/bus.c:204 #8 0x0000555555b6eab6 in property_set_bool (obj=0x555557de5098, v=<optimized out>, name=<optimized out>, opaque=0x555557df04c0, errp=0x7fffee28e0a0) at ../qom/object.c:2255 #9 0x0000555555b70e17 in object_property_set ( obj=obj@entry=0x555557de5098, name=name@entry=0x555555db99ff "realized", v=v@entry=0x7ffdd8034f50, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/object.c:1400 #10 0x0000555555b73c6f in object_property_set_qobject ( obj=obj@entry=0x555557de5098, name=name@entry=0x555555db99ff "realized", value=value@entry=0x7ffdd8020630, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/qom-qobject.c:28 #11 0x0000555555b71054 in object_property_set_bool ( obj=obj@entry=0x555557de5098, name=name@entry=0x555555db99ff "realized", value=value@entry=false, errp=0x5555565bbf38 <error_abort>) at ../qom/object.c:1470 #12 0x0000555555b4c725 in qbus_unrealize ( bus=bus@entry=0x555557de5098) at ../hw/core/bus.c:178 #13 0x0000555555b4fc00 in device_set_realized (obj=0x555557ddcf90, value=<optimized out>, errp=0x7fffee28e1e0) at ../hw/core/qdev.c:844 #14 0x0000555555b6eab6 in property_set_bool (obj=0x555557ddcf90, v=<optimized out>, name=<optimized out>, opaque=0x555556650ba0, errp=0x7fffee28e1e0) at ../qom/object.c:2255 #15 0x0000555555b70e17 in object_property_set ( obj=obj@entry=0x555557ddcf90, name=name@entry=0x555555db99ff "realized", v=v@entry=0x7ffdd8020560, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/object.c:1400 #16 0x0000555555b73c6f in object_property_set_qobject ( obj=obj@entry=0x555557ddcf90, name=name@entry=0x555555db99ff "realized", value=value@entry=0x7ffdd8020540, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/qom-qobject.c:28 #17 0x0000555555b71054 in object_property_set_bool ( obj=0x555557ddcf90, name=0x555555db99ff "realized", value=<optimized out>, errp=0x5555565bbf38 <error_abort>) at ../qom/object.c:1470 #18 0x0000555555921cb7 in pcie_unplug_device (bus=<optimized out>, dev=0x555557ddcf90, opaque=<optimized out>) at /home/qemu/include/hw/qdev-core.h:17 #19 0x00005555558dc331 in pci_for_each_device_under_bus ( opaque=<optimized out>, fn=<optimized out>, bus=<optimized out>) at ../hw/pci/pci.c:1654 Fixes: c611c76417f ("virtio: add MemoryListener to cache ring translations") Buglink: https://bugs.launchpad.net/qemu/+bug/1912846 Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20210125192505.390554-1-eperezma@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-19migration: Replace migration's JSON writer by the general oneMarkus Armbruster1-2/+2
Commit 8118f0950f "migration: Append JSON description of migration stream" needs a JSON writer. The existing qobject_to_json() wasn't a good fit, because it requires building a QObject to convert. Instead, migration got its very own JSON writer, in commit 190c882ce2 "QJSON: Add JSON writer". It tacitly limits numbers to int64_t, and strings contents to characters that don't need escaping, unlike qobject_to_json(). The previous commit factored the JSON writer out of qobject_to_json(). Replace migration's JSON writer by it. Cc: Juan Quintela <quintela@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20201211171152.146877-17-armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-12-08virtio: reset device on bad guest index in virtio_load()John Levon1-6/+9
If we find a queue with an inconsistent guest index value, explicitly mark the device as needing a reset - and broken - via virtio_error(). There's at least one driver implementation - the virtio-win NetKVM driver - that is able to handle a VIRTIO_CONFIG_S_NEEDS_RESET notification and successfully restore the device to a working state. Other implementations do not correctly handle this, but as the VQ is not in a functional state anyway, this is still worth doing. Signed-off-by: John Levon <john.levon@nutanix.com> Message-Id: <20201120185103.GA442386@sent> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-10-30virtio: skip guest index check on device loadFelipe Franciosi1-6/+7
QEMU must be careful when loading device state off migration streams to prevent a malicious source from exploiting the emulator. Overdoing these checks has the side effect of allowing a guest to "pin itself" in cloud environments by messing with state which is entirely in its control. Similarly to what f3081539 achieved in usb_device_post_load(), this commit removes such a check from virtio_load(). Worth noting, the result of a load without this check is the same as if a guest enables a VQ with invalid indexes to begin with. That is, the virtual device is set in a broken state (by the datapath handler) and must be reset. Signed-off-by: Felipe Franciosi <felipe@nutanix.com> Message-Id: <20201028134643.110698-1-felipe@nutanix.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-09-29virtio: update MemoryRegionCaches when guest set bad featuresLi Qiang1-9/+8
Current the 'virtio_set_features' only update the 'MemorRegionCaches' when the 'virtio_set_features_nocheck' return '0' which means it is not bad features. However the guest can still trigger the access of the used vring after set bad features. In this situation it will cause assert failure in 'ADDRESS_SPACE_ST_CACHED'. Buglink: https://bugs.launchpad.net/qemu/+bug/1890333 Fixes: db812c4073c7 ("virtio: update MemoryRegionCaches when guest negotiates features") Reported-by: Alexander Bulekov <alxndr@bu.edu> Signed-off-by: Li Qiang <liq3ea@163.com> Message-Id: <20200919082706.6703-1-liq3ea@163.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-09-29virtio: skip legacy support check on machine types less than 5.1Stefano Garzarella1-0/+7
Commit 9b3a35ec82 ("virtio: verify that legacy support is not accidentally on") added a check that returns an error if legacy support is on, but the device does not support legacy. Unfortunately some devices were wrongly declared legacy capable even if they were not (e.g vhost-vsock). To avoid migration issues, we add a virtio-device property (x-disable-legacy-check) to skip the legacy error, printing a warning instead, for machine types < 5.1. Cc: qemu-stable@nongnu.org Fixes: 9b3a35ec82 ("virtio: verify that legacy support is not accidentally on") Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Suggested-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20200921122506.82515-2-sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-09-23qemu/atomic.h: rename atomic_ to qatomic_Stefan Hajnoczi1-8/+8
clang's C11 atomic_fetch_*() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int *' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic\(64\)\?_[a-z0-9_]\+' include/qemu/atomic.h | \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>
2020-07-27virtio-pci: fix virtio_pci_queue_enabled()Laurent Vivier1-1/+6
In legacy mode, virtio_pci_queue_enabled() falls back to virtio_queue_enabled() to know if the queue is enabled. But virtio_queue_enabled() calls again virtio_pci_queue_enabled() if k->queue_enabled is set. This ends in a crash after a stack overflow. The problem can be reproduced with "-device virtio-net-pci,disable-legacy=off,disable-modern=true -net tap,vhost=on" And a look to the backtrace is very explicit: ... #4 0x000000010029a438 in virtio_queue_enabled () #5 0x0000000100497a9c in virtio_pci_queue_enabled () ... #130902 0x000000010029a460 in virtio_queue_enabled () #130903 0x0000000100497a9c in virtio_pci_queue_enabled () #130904 0x000000010029a460 in virtio_queue_enabled () #130905 0x0000000100454a20 in vhost_net_start () ... This patch fixes the problem by introducing a new function for the legacy case and calls it from virtio_pci_queue_enabled(). It also calls it from virtio_queue_enabled() to avoid code duplication. Fixes: f19bcdfedd53 ("virtio-pci: implement queue_enabled method") Cc: Jason Wang <jasowang@redhat.com> Cc: Cindy Lu <lulu@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20200727153319.43716-1-lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-22virtio: list legacy-capable devicesCornelia Huck1-0/+25
Several types of virtio devices had already been around before the virtio standard was specified. These devices support virtio in legacy (and transitional) mode. Devices that have been added in the virtio standard are considered non-transitional (i.e. with no support for legacy virtio). Provide a helper function so virtio transports can figure that out easily. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200707105446.677966-2-cohuck@redhat.com> Cc: qemu-stable@nongnu.org Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-03virtio-bus: introduce queue_enabled methodJason Wang1-0/+6
This patch introduces queue_enabled() method which allows the transport to implement its own way to report whether or not a queue is enabled. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20200701145538.22333-4-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2020-06-15qom: Less verbose object_initialize_child()Markus Armbruster1-2/+3
All users of object_initialize_child() pass the obvious child size argument. Almost all pass &error_abort and no properties. Tiresome. Rename object_initialize_child() to object_initialize_child_with_props() to free the name. New convenience wrapper object_initialize_child() automates the size argument, and passes &error_abort and no properties. Rename object_initialize_childv() to object_initialize_child_with_propsv() for consistency. Convert callers with this Coccinelle script: @@ expression parent, propname, type; expression child, size; symbol error_abort; @@ - object_initialize_child(parent, propname, OBJECT(child), size, type, &error_abort, NULL) + object_initialize_child(parent, propname, child, size, type, &error_abort, NULL) @@ expression parent, propname, type; expression child; symbol error_abort; @@ - object_initialize_child(parent, propname, child, sizeof(*child), type, &error_abort, NULL) + object_initialize_child(parent, propname, child, type) @@ expression parent, propname, type; expression child; symbol error_abort; @@ - object_initialize_child(parent, propname, &child, sizeof(child), type, &error_abort, NULL) + object_initialize_child(parent, propname, &child, type) @@ expression parent, propname, type; expression child, size, err; expression list props; @@ - object_initialize_child(parent, propname, child, size, type, err, props) + object_initialize_child_with_props(parent, propname, child, size, type, err, props) Note that Coccinelle chokes on ARMSSE typedef vs. macro in hw/arm/armsse.c. Worked around by temporarily renaming the macro for the spatch run. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> [Rebased: machine opentitan is new (commit fe0fe4735e7)] Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-37-armbru@redhat.com>
2020-05-15qdev: Unrealize must not failMarkus Armbruster1-8/+3
Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>
2020-03-16misc: Replace zero-length arrays with flexible array member (automatic)Philippe Mathieu-Daudé1-2/+2
Description copied from Linux kernel commit from Gustavo A. R. Silva (see [3]): --v-- description start --v-- The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member [1], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being unadvertenly introduced [2] to the Linux codebase from now on. --^-- description end --^-- Do the similar housekeeping in the QEMU codebase (which uses C99 since commit 7be41675f7cb). All these instances of code were found with the help of the following Coccinelle script: @@ identifier s, m, a; type t, T; @@ struct s { ... t m; - T a[0]; + T a[]; }; @@ identifier s, m, a; type t, T; @@ struct s { ... t m; - T a[0]; + T a[]; } QEMU_PACKED; [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76497732932f [3] https://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux.git/commit/?id=17642a2fbd2c1 Inspired-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-27Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell1-8/+91
virtio, pc: fixes, features New virtio iommu. Unrealize memory leaks. In-band kick/call support. Bugfixes, documentation all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 27 Feb 2020 08:46:33 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (30 commits) Fixed assert in vhost_user_set_mem_table_postcopy vhost-user: only set slave channel for first vq acpi: cpuhp: document CPHP_GET_CPU_ID_CMD command libvhost-user: implement in-band notifications docs: vhost-user: add in-band kick/call messages libvhost-user: handle NOFD flag in call/kick/err better libvhost-user-glib: use g_main_context_get_thread_default() libvhost-user-glib: fix VugDev main fd cleanup libvhost-user: implement VHOST_USER_PROTOCOL_F_REPLY_ACK MAINTAINERS: add virtio-iommu related files hw/arm/virt: Add the virtio-iommu device tree mappings virtio-iommu-pci: Add virtio iommu pci support virtio-iommu: Support migration virtio-iommu: Implement fault reporting virtio-iommu: Implement translate virtio-iommu: Implement map/unmap virtio-iommu: Implement attach/detach command virtio-iommu: Decode the command payload virtio-iommu: Add skeleton virtio: gracefully handle invalid region caches ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-27virtio: gracefully handle invalid region cachesStefan Hajnoczi1-8/+91
The virtqueue code sets up MemoryRegionCaches to access the virtqueue guest RAM data structures. The code currently assumes that VRingMemoryRegionCaches is initialized before device emulation code accesses the virtqueue. An assertion will fail in vring_get_region_caches() when this is not true. Device fuzzing found a case where this assumption is false (see below). Virtqueue guest RAM addresses can also be changed from a vCPU thread while an IOThread is accessing the virtqueue. This breaks the same assumption but this time the caches could become invalid partway through the virtqueue code. The code fetches the caches RCU pointer multiple times so we will need to validate the pointer every time it is fetched. Add checks each time we call vring_get_region_caches() and treat invalid caches as a nop: memory stores are ignored and memory reads return 0. The fuzz test failure is as follows: $ qemu -M pc -device virtio-blk-pci,id=drv0,drive=drive0,addr=4.0 \ -drive if=none,id=drive0,file=null-co://,format=raw,auto-read-only=off \ -drive if=none,id=drive1,file=null-co://,file.read-zeroes=on,format=raw \ -display none \ -qtest stdio endianness outl 0xcf8 0x80002020 outl 0xcfc 0xe0000000 outl 0xcf8 0x80002004 outw 0xcfc 0x7 write 0xe0000000 0x24 0x00ffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffab5cffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffab0000000001 inb 0x4 writew 0xe000001c 0x1 write 0xe0000014 0x1 0x0d The following error message is produced: qemu-system-x86_64: /home/stefanha/qemu/hw/virtio/virtio.c:286: vring_get_region_caches: Assertion `caches != NULL' failed. The backtrace looks like this: #0 0x00007ffff5520625 in raise () at /lib64/libc.so.6 #1 0x00007ffff55098d9 in abort () at /lib64/libc.so.6 #2 0x00007ffff55097a9 in _nl_load_domain.cold () at /lib64/libc.so.6 #3 0x00007ffff5518a66 in annobin_assert.c_end () at /lib64/libc.so.6 #4 0x00005555559073da in vring_get_region_caches (vq=<optimized out>) at qemu/hw/virtio/virtio.c:286 #5 vring_get_region_caches (vq=<optimized out>) at qemu/hw/virtio/virtio.c:283 #6 0x000055555590818d in vring_used_flags_set_bit (mask=1, vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:398 #7 virtio_queue_split_set_notification (enable=0, vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:398 #8 virtio_queue_set_notification (vq=vq@entry=0x5555575ceea0, enable=enable@entry=0) at qemu/hw/virtio/virtio.c:451 #9 0x0000555555908512 in virtio_queue_set_notification (vq=vq@entry=0x5555575ceea0, enable=enable@entry=0) at qemu/hw/virtio/virtio.c:444 #10 0x00005555558c697a in virtio_blk_handle_vq (s=0x5555575c57e0, vq=0x5555575ceea0) at qemu/hw/block/virtio-blk.c:775 #11 0x0000555555907836 in virtio_queue_notify_aio_vq (vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:2244 #12 0x0000555555cb5dd7 in aio_dispatch_handlers (ctx=ctx@entry=0x55555671a420) at util/aio-posix.c:429 #13 0x0000555555cb67a8 in aio_dispatch (ctx=0x55555671a420) at util/aio-posix.c:460 #14 0x0000555555cb307e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260 #15 0x00007ffff7bbc510 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0 #16 0x0000555555cb5848 in glib_pollfds_poll () at util/main-loop.c:219 #17 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 #18 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 #19 0x00005555559b20c9 in main_loop () at vl.c:1683 #20 0x0000555555838115 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4441 Reported-by: Alexander Bulekov <alxndr@bu.edu> Cc: Michael Tsirkin <mst@redhat.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200207104619.164892-1-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-02-20hw/virtio: Let virtqueue_map_iovec() use a boolean 'is_write' argumentPhilippe Mathieu-Daudé1-3/+4
The 'is_write' argument is either 0 or 1. Convert it to a boolean type. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2020-01-24qdev: set properties with device_class_set_props()Marc-André Lureau1-1/+1
The following patch will need to handle properties registration during class_init time. Let's use a device_class_set_props() setter. spatch --macro-file scripts/cocci-macro-file.h --sp-file ./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place --dir . @@ typedef DeviceClass; DeviceClass *d; expression val; @@ - d->props = val + device_class_set_props(d, val) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-06virtio: reset region cache when on queue deletionYuri Benditovich1-0/+1
https://bugzilla.redhat.com/show_bug.cgi?id=1708480 Fix leak of region reference that prevents complete device deletion on hot unplug. Cc: qemu-stable@nongnu.org Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Message-Id: <20191226043649.14481-2-yuri.benditovich@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-01-05virtio: don't enable notifications during pollingStefan Hajnoczi1-6/+6
Virtqueue notifications are not necessary during polling, so we disable them. This allows the guest driver to avoid MMIO vmexits. Unfortunately the virtio-blk and virtio-scsi handler functions re-enable notifications, defeating this optimization. Fix virtio-blk and virtio-scsi emulation so they leave notifications disabled. The key thing to remember for correctness is that polling always checks one last time after ending its loop, therefore it's safe to lose the race when re-enabling notifications at the end of polling. There is a measurable performance improvement of 5-10% with the null-co block driver. Real-life storage configurations will see a smaller improvement because the MMIO vmexit overhead contributes less to latency. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20191209210957.65087-1-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-01-05virtio-pci: disable vring processing when bus-mastering is disabledMichael Roth1-7/+28
Currently the SLOF firmware for pseries guests will disable/re-enable a PCI device multiple times via IO/MEM/MASTER bits of PCI_COMMAND register after the initial probe/feature negotiation, as it tends to work with a single device at a time at various stages like probing and running block/network bootloaders without doing a full reset in-between. In QEMU, when PCI_COMMAND_MASTER is disabled we disable the corresponding IOMMU memory region, so DMA accesses (including to vring fields like idx/flags) will no longer undergo the necessary translation. Normally we wouldn't expect this to happen since it would be misbehavior on the driver side to continue driving DMA requests. However, in the case of pseries, with iommu_platform=on, we trigger the following sequence when tearing down the virtio-blk dataplane ioeventfd in response to the guest unsetting PCI_COMMAND_MASTER: #2 0x0000555555922651 in virtqueue_map_desc (vdev=vdev@entry=0x555556dbcfb0, p_num_sg=p_num_sg@entry=0x7fffe657e1a8, addr=addr@entry=0x7fffe657e240, iov=iov@entry=0x7fffe6580240, max_num_sg=max_num_sg@entry=1024, is_write=is_write@entry=false, pa=0, sz=0) at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:757 #3 0x0000555555922a89 in virtqueue_pop (vq=vq@entry=0x555556dc8660, sz=sz@entry=184) at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:950 #4 0x00005555558d3eca in virtio_blk_get_request (vq=0x555556dc8660, s=0x555556dbcfb0) at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:255 #5 0x00005555558d3eca in virtio_blk_handle_vq (s=0x555556dbcfb0, vq=0x555556dc8660) at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:776 #6 0x000055555591dd66 in virtio_queue_notify_aio_vq (vq=vq@entry=0x555556dc8660) at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1550 #7 0x000055555591ecef in virtio_queue_notify_aio_vq (vq=0x555556dc8660) at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1546 #8 0x000055555591ecef in virtio_queue_host_notifier_aio_poll (opaque=0x555556dc86c8) at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:2527 #9 0x0000555555d02164 in run_poll_handlers_once (ctx=ctx@entry=0x55555688bfc0, timeout=timeout@entry=0x7fffe65844a8) at /home/mdroth/w/qemu.git/util/aio-posix.c:520 #10 0x0000555555d02d1b in try_poll_mode (timeout=0x7fffe65844a8, ctx=0x55555688bfc0) at /home/mdroth/w/qemu.git/util/aio-posix.c:607 #11 0x0000555555d02d1b in aio_poll (ctx=ctx@entry=0x55555688bfc0, blocking=blocking@entry=true) at /home/mdroth/w/qemu.git/util/aio-posix.c:639 #12 0x0000555555d0004d in aio_wait_bh_oneshot (ctx=0x55555688bfc0, cb=cb@entry=0x5555558d5130 <virtio_blk_data_plane_stop_bh>, opaque=opaque@entry=0x555556de86f0) at /home/mdroth/w/qemu.git/util/aio-wait.c:71 #13 0x00005555558d59bf in virtio_blk_data_plane_stop (vdev=<optimized out>) at /home/mdroth/w/qemu.git/hw/block/dataplane/virtio-blk.c:288 #14 0x0000555555b906a1 in virtio_bus_stop_ioeventfd (bus=bus@entry=0x555556dbcf38) at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:245 #15 0x0000555555b90dbb in virtio_bus_stop_ioeventfd (bus=bus@entry=0x555556dbcf38) at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:237 #16 0x0000555555b92a8e in virtio_pci_stop_ioeventfd (proxy=0x555556db4e40) at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:292 #17 0x0000555555b92a8e in virtio_write_config (pci_dev=0x555556db4e40, address=<optimized out>, val=1048832, len=<optimized out>) at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:613 I.e. the calling code is only scheduling a one-shot BH for virtio_blk_data_plane_stop_bh, but somehow we end up trying to process an additional virtqueue entry before we get there. This is likely due to the following check in virtio_queue_host_notifier_aio_poll: static bool virtio_queue_host_notifier_aio_poll(void *opaque) { EventNotifier *n = opaque; VirtQueue *vq = container_of(n, VirtQueue, host_notifier); bool progress; if (!vq->vring.desc || virtio_queue_empty(vq)) { return false; } progress = virtio_queue_notify_aio_vq(vq); namely the call to virtio_queue_empty(). In this case, since no new requests have actually been issued, shadow_avail_idx == last_avail_idx, so we actually try to access the vring via vring_avail_idx() to get the latest non-shadowed idx: int virtio_queue_empty(VirtQueue *vq) { bool empty; ... if (vq->shadow_avail_idx != vq->last_avail_idx) { return 0; } rcu_read_lock(); empty = vring_avail_idx(vq) == vq->last_avail_idx; rcu_read_unlock(); return empty; but since the IOMMU region has been disabled we get a bogus value (0 usually), which causes virtio_queue_empty() to falsely report that there are entries to be processed, which causes errors such as: "virtio: zero sized buffers are not allowed" or "virtio-blk missing headers" and puts the device in an error state. This patch works around the issue by introducing virtio_set_disabled(), which sets a 'disabled' flag to bypass checks like virtio_queue_empty() when bus-mastering is disabled. Since we'd check this flag at all the same sites as vdev->broken, we replace those checks with an inline function which checks for either vdev->broken or vdev->disabled. The 'disabled' flag is only migrated when set, which should be fairly rare, but to maintain migration compatibility we disable it's use for older machine types. Users requiring the use of the flag in conjunction with older machine types can set it explicitly as a virtio-device option. NOTES: - This leaves some other oddities in play, like the fact that DRIVER_OK also gets unset in response to bus-mastering being disabled, but not restored (however the device seems to continue working) - Similarly, we disable the host notifier via virtio_bus_stop_ioeventfd(), which seems to move the handling out of virtio-blk dataplane and back into the main IO thread, and it ends up staying there till a reset (but otherwise continues working normally) Cc: David Gibson <david@gibson.dropbear.id.au>, Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Message-Id: <20191120005003.27035-1-mdroth@linux.vnet.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-01-05virtio: make virtio_delete_queue idempotentMichael S. Tsirkin1-0/+1
Let's make sure calling this twice is harmless - no known instances, but seems safer. Suggested-by: Pan Nengyuan <pannengyuan@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-01-05virtio: add ability to delete vq through a pointerMichael S. Tsirkin1-5/+10
Devices tend to maintain vq pointers, allow deleting them trough a vq pointer. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
2019-11-06virtio: notify virtqueue via host notifier when availableStefan Hajnoczi1-1/+8
Host notifiers are used in several cases: 1. Traditional ioeventfd where virtqueue notifications are handled in the main loop thread. 2. IOThreads (aio_handle_output) where virtqueue notifications are handled in an IOThread AioContext. 3. vhost where virtqueue notifications are handled by kernel vhost or a vhost-user device backend. Most virtqueue notifications from the guest use the ioeventfd mechanism, but there are corner cases where QEMU code calls virtio_queue_notify(). This currently honors the host notifier for the IOThreads aio_handle_output case, but not for the vhost case. The result is that vhost does not receive virtqueue notifications from QEMU when virtio_queue_notify() is called. This patch extends virtio_queue_notify() to set the host notifier whenever it is enabled instead of calling the vq->(aio_)handle_output() function directly. We track the host notifier state for each virtqueue separately since some devices may use it only for certain virtqueues. This fixes the vhost case although it does add a trip through the eventfd for the traditional ioeventfd case. I don't think it's worth adding a fast path for the traditional ioeventfd case because calling virtio_queue_notify() is rare when ioeventfd is enabled. Reported-by: Felipe Franciosi <felipe@nutanix.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20191105140946.165584-1-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-29virtio: Use auto rcu_read macrosDr. David Alan Gilbert1-42/+23
Use RCU_READ_LOCK_GUARD and WITH_RCU_READ_LOCK_GUARD to replace the manual rcu_read_(un)lock calls. I think the only change is virtio_load which was missing unlocks in error paths; those end up being fatal errors so it's not that important anyway. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20191028161109.60205-1-dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-29Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵Peter Maydell1-0/+7
staging # gpg: Signature made Tue 29 Oct 2019 02:33:36 GMT # gpg: using RSA key EF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: COLO-compare: Fix incorrect `if` logic virtio-net: prevent offloads reset on migration virtio: new post_load hook net: add tulip (dec21143) driver Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-29virtio: new post_load hookMichael S. Tsirkin1-0/+7
Post load hook in virtio vmsd is called early while device is processed, and when VirtIODevice core isn't fully initialized. Most device specific code isn't ready to deal with a device in such state, and behaves weirdly. Add a new post_load hook in a device class instead. Devices should use this unless they specifically want to verify the migration stream as it's processed, e.g. for bounds checking. Cc: qemu-stable@nongnu.org Suggested-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Mikhail Sennikovsky <mikhail.sennikovskii@cloud.ionos.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-10-25virtio: drop unused virtio_device_stop_ioeventfd() functionStefan Hajnoczi1-8/+0
virtio_device_stop_ioeventfd() has not been used since commit 310837de6c1e0badfd736b1b316b1698c53120a7 ("virtio: introduce grab/release_ioeventfd to fix vhost") in 2016. Nowadays ioeventfd is stopped implicitly by the virtio transport when lifecycle events such as the VM pausing or device unplug occur. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20191021150343.30742-1-stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>