aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-05-16vhost-user-scsi: avoid unlink(NULL) with fd passingStefan Hajnoczi1-1/+4
Commit 747421e949fc1eb3ba66b5fcccdb7ba051918241 ("Implements Backend Program conventions for vhost-user-scsi") introduced fd-passing support as part of implementing the vhost-user backend program conventions. When fd passing is used the UNIX domain socket path is NULL and we must not call unlink(2). The unlink(2) call is necessary when the listen socket, lsock, was created successfully since that means the UNIX domain socket is visible in the file system. Fixes: Coverity CID 1488353 Fixes: 747421e949fc1eb3ba66b5fcccdb7ba051918241 ("Implements Backend Program conventions for vhost-user-scsi") Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220516155701.1789638-1-stefanha@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16virtio-net: don't handle mq request in userspace handler for vhost-vdpaSi-Wei Liu1-0/+13
virtio_queue_host_notifier_read() tends to read pending event left behind on ioeventfd in the vhost_net_stop() path, and attempts to handle outstanding kicks from userspace vq handler. However, in the ctrl_vq handler, virtio_net_handle_mq() has a recursive call into virtio_net_set_status(), which may lead to segmentation fault as shown in below stack trace: 0 0x000055f800df1780 in qdev_get_parent_bus (dev=0x0) at ../hw/core/qdev.c:376 1 0x000055f800c68ad8 in virtio_bus_device_iommu_enabled (vdev=vdev@entry=0x0) at ../hw/virtio/virtio-bus.c:331 2 0x000055f800d70d7f in vhost_memory_unmap (dev=<optimized out>) at ../hw/virtio/vhost.c:318 3 0x000055f800d70d7f in vhost_memory_unmap (dev=<optimized out>, buffer=0x7fc19bec5240, len=2052, is_write=1, access_len=2052) at ../hw/virtio/vhost.c:336 4 0x000055f800d71867 in vhost_virtqueue_stop (dev=dev@entry=0x55f8037ccc30, vdev=vdev@entry=0x55f8044ec590, vq=0x55f8037cceb0, idx=0) at ../hw/virtio/vhost.c:1241 5 0x000055f800d7406c in vhost_dev_stop (hdev=hdev@entry=0x55f8037ccc30, vdev=vdev@entry=0x55f8044ec590) at ../hw/virtio/vhost.c:1839 6 0x000055f800bf00a7 in vhost_net_stop_one (net=0x55f8037ccc30, dev=0x55f8044ec590) at ../hw/net/vhost_net.c:315 7 0x000055f800bf0678 in vhost_net_stop (dev=dev@entry=0x55f8044ec590, ncs=0x55f80452bae0, data_queue_pairs=data_queue_pairs@entry=7, cvq=cvq@entry=1) at ../hw/net/vhost_net.c:423 8 0x000055f800d4e628 in virtio_net_set_status (status=<optimized out>, n=0x55f8044ec590) at ../hw/net/virtio-net.c:296 9 0x000055f800d4e628 in virtio_net_set_status (vdev=vdev@entry=0x55f8044ec590, status=15 '\017') at ../hw/net/virtio-net.c:370 10 0x000055f800d534d8 in virtio_net_handle_ctrl (iov_cnt=<optimized out>, iov=<optimized out>, cmd=0 '\000', n=0x55f8044ec590) at ../hw/net/virtio-net.c:1408 11 0x000055f800d534d8 in virtio_net_handle_ctrl (vdev=0x55f8044ec590, vq=0x7fc1a7e888d0) at ../hw/net/virtio-net.c:1452 12 0x000055f800d69f37 in virtio_queue_host_notifier_read (vq=0x7fc1a7e888d0) at ../hw/virtio/virtio.c:2331 13 0x000055f800d69f37 in virtio_queue_host_notifier_read (n=n@entry=0x7fc1a7e8894c) at ../hw/virtio/virtio.c:3575 14 0x000055f800c688e6 in virtio_bus_cleanup_host_notifier (bus=<optimized out>, n=n@entry=14) at ../hw/virtio/virtio-bus.c:312 15 0x000055f800d73106 in vhost_dev_disable_notifiers (hdev=hdev@entry=0x55f8035b51b0, vdev=vdev@entry=0x55f8044ec590) at ../../../include/hw/virtio/virtio-bus.h:35 16 0x000055f800bf00b2 in vhost_net_stop_one (net=0x55f8035b51b0, dev=0x55f8044ec590) at ../hw/net/vhost_net.c:316 17 0x000055f800bf0678 in vhost_net_stop (dev=dev@entry=0x55f8044ec590, ncs=0x55f80452bae0, data_queue_pairs=data_queue_pairs@entry=7, cvq=cvq@entry=1) at ../hw/net/vhost_net.c:423 18 0x000055f800d4e628 in virtio_net_set_status (status=<optimized out>, n=0x55f8044ec590) at ../hw/net/virtio-net.c:296 19 0x000055f800d4e628 in virtio_net_set_status (vdev=0x55f8044ec590, status=15 '\017') at ../hw/net/virtio-net.c:370 20 0x000055f800d6c4b2 in virtio_set_status (vdev=0x55f8044ec590, val=<optimized out>) at ../hw/virtio/virtio.c:1945 21 0x000055f800d11d9d in vm_state_notify (running=running@entry=false, state=state@entry=RUN_STATE_SHUTDOWN) at ../softmmu/runstate.c:333 22 0x000055f800d04e7a in do_vm_stop (state=state@entry=RUN_STATE_SHUTDOWN, send_stop=send_stop@entry=false) at ../softmmu/cpus.c:262 23 0x000055f800d04e99 in vm_shutdown () at ../softmmu/cpus.c:280 24 0x000055f800d126af in qemu_cleanup () at ../softmmu/runstate.c:812 25 0x000055f800ad5b13 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:51 For now, temporarily disable handling MQ request from the ctrl_vq userspace hanlder to avoid the recursive virtio_net_set_status() call. Some rework is needed to allow changing the number of queues without going through a full virtio_net_set_status cycle, particularly for vhost-vdpa backend. This patch will need to be reverted as soon as future patches of having the change of #queues handled in userspace is merged. Fixes: 402378407db ("vhost-vdpa: multiqueue support") Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <1651890498-24478-8-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16vhost-vdpa: change name and polarity for vhost_vdpa_one_time_request()Si-Wei Liu1-8/+15
The name vhost_vdpa_one_time_request() was confusing. No matter whatever it returns, its typical occurrence had always been at requests that only need to be applied once. And the name didn't suggest what it actually checks for. Change it to vhost_vdpa_first_dev() with polarity flipped for better readibility of code. That way it is able to reflect what the check is really about. This call is applicable to request which performs operation only once, before queues are set up, and usually at the beginning of the caller function. Document the requirement for it in place. Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Message-Id: <1651890498-24478-7-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-16vhost-vdpa: backend feature should set only onceSi-Wei Liu1-1/+1
The vhost_vdpa_one_time_request() branch in vhost_vdpa_set_backend_cap() incorrectly sends down ioctls on vhost_dev with non-zero index. This may end up with multiple VHOST_SET_BACKEND_FEATURES ioctl calls sent down on the vhost-vdpa fd that is shared between all these vhost_dev's. To fix it, send down ioctl only once via the first vhost_dev with index 0. Toggle the polarity of the vhost_vdpa_one_time_request() test should do the trick. Fixes: 4d191cfdc7de ("vhost-vdpa: classify one time request") Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <1651890498-24478-6-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16vhost-net: fix improper cleanup in vhost_net_startSi-Wei Liu1-1/+3
vhost_net_start() missed a corresponding stop_one() upon error from vhost_set_vring_enable(). While at it, make the error handling for err_start more robust. No real issue was found due to this though. Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <1651890498-24478-5-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16vhost-vdpa: fix improper cleanup in net_init_vhost_vdpaSi-Wei Liu1-1/+3
... such that no memory leaks on dangling net clients in case of error. Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <1651890498-24478-4-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpaSi-Wei Liu1-2/+31
With MQ enabled vdpa device and non-MQ supporting guest e.g. booting vdpa with mq=on over OVMF of single vqp, below assert failure is seen: ../hw/virtio/vhost-vdpa.c:560: vhost_vdpa_get_vq_index: Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed. 0 0x00007f8ce3ff3387 in raise () at /lib64/libc.so.6 1 0x00007f8ce3ff4a78 in abort () at /lib64/libc.so.6 2 0x00007f8ce3fec1a6 in __assert_fail_base () at /lib64/libc.so.6 3 0x00007f8ce3fec252 in () at /lib64/libc.so.6 4 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:563 5 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:558 6 0x0000558f52d7329a in vhost_virtqueue_mask (hdev=0x558f55c01800, vdev=0x558f568f91f0, n=2, mask=<optimized out>) at ../hw/virtio/vhost.c:1557 7 0x0000558f52c6b89a in virtio_pci_set_guest_notifier (d=d@entry=0x558f568f0f60, n=n@entry=2, assign=assign@entry=true, with_irqfd=with_irqfd@entry=false) at ../hw/virtio/virtio-pci.c:974 8 0x0000558f52c6c0d8 in virtio_pci_set_guest_notifiers (d=0x558f568f0f60, nvqs=3, assign=true) at ../hw/virtio/virtio-pci.c:1019 9 0x0000558f52bf091d in vhost_net_start (dev=dev@entry=0x558f568f91f0, ncs=0x558f56937cd0, data_queue_pairs=data_queue_pairs@entry=1, cvq=cvq@entry=1) at ../hw/net/vhost_net.c:361 10 0x0000558f52d4e5e7 in virtio_net_set_status (status=<optimized out>, n=0x558f568f91f0) at ../hw/net/virtio-net.c:289 11 0x0000558f52d4e5e7 in virtio_net_set_status (vdev=0x558f568f91f0, status=15 '\017') at ../hw/net/virtio-net.c:370 12 0x0000558f52d6c4b2 in virtio_set_status (vdev=vdev@entry=0x558f568f91f0, val=val@entry=15 '\017') at ../hw/virtio/virtio.c:1945 13 0x0000558f52c69eff in virtio_pci_common_write (opaque=0x558f568f0f60, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at ../hw/virtio/virtio-pci.c:1292 14 0x0000558f52d15d6e in memory_region_write_accessor (mr=0x558f568f19d0, addr=20, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, attrs=...) at ../softmmu/memory.c:492 15 0x0000558f52d127de in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7f8cdbffe748, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x558f52d15cf0 <memory_region_write_accessor>, mr=0x558f568f19d0, attrs=...) at ../softmmu/memory.c:554 16 0x0000558f52d157ef in memory_region_dispatch_write (mr=mr@entry=0x558f568f19d0, addr=20, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...) at ../softmmu/memory.c:1504 17 0x0000558f52d078e7 in flatview_write_continue (fv=fv@entry=0x7f8accbc3b90, addr=addr@entry=103079215124, attrs=..., ptr=ptr@entry=0x7f8ce6300028, len=len@entry=1, addr1=<optimized out>, l=<optimized out>, mr=0x558f568f19d0) at /home/opc/qemu-upstream/include/qemu/host-utils.h:165 18 0x0000558f52d07b06 in flatview_write (fv=0x7f8accbc3b90, addr=103079215124, attrs=..., buf=0x7f8ce6300028, len=1) at ../softmmu/physmem.c:2822 19 0x0000558f52d0b36b in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>) at ../softmmu/physmem.c:2914 20 0x0000558f52d0b3da in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=..., attrs@entry=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>, is_write=<optimized out>) at ../softmmu/physmem.c:2924 21 0x0000558f52dced09 in kvm_cpu_exec (cpu=cpu@entry=0x558f55c2da60) at ../accel/kvm/kvm-all.c:2903 22 0x0000558f52dcfabd in kvm_vcpu_thread_fn (arg=arg@entry=0x558f55c2da60) at ../accel/kvm/kvm-accel-ops.c:49 23 0x0000558f52f9f04a in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:556 24 0x00007f8ce4392ea5 in start_thread () at /lib64/libpthread.so.0 25 0x00007f8ce40bb9fd in clone () at /lib64/libc.so.6 The cause for the assert failure is due to that the vhost_dev index for the ctrl vq was not aligned with actual one in use by the guest. Upon multiqueue feature negotiation in virtio_net_set_multiqueue(), if guest doesn't support multiqueue, the guest vq layout would shrink to a single queue pair, consisting of 3 vqs in total (rx, tx and ctrl). This results in ctrl_vq taking a different vhost_dev group index than the default. We can map vq to the correct vhost_dev group by checking if MQ is supported by guest and successfully negotiated. Since the MQ feature is only present along with CTRL_VQ, we ensure the index 2 is only meant for the control vq while MQ is not supported by guest. Fixes: 22288fe ("virtio-net: vhost control virtqueue support") Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <1651890498-24478-3-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16virtio-net: setup vhost_dev and notifiers for cvq only when feature is ↵Si-Wei Liu1-1/+2
negotiated When the control virtqueue feature is absent or not negotiated, vhost_net_start() still tries to set up vhost_dev and install vhost notifiers for the control virtqueue, which results in erroneous ioctl calls with incorrect queue index sending down to driver. Do that only when needed. Fixes: 22288fe ("virtio-net: vhost control virtqueue support") Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <1651890498-24478-2-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16hw/i386/amd_iommu: Fix IOMMU event log encoding errorsWei Huang1-10/+14
Coverity issues several UNINIT warnings against amd_iommu.c [1]. This patch fixes them by clearing evt before encoding. On top of it, this patch changes the event log size to 16 bytes per IOMMU specification, and fixes the event log entry format in amdvi_encode_event(). [1] CID 1487116/1487200/1487190/1487232/1487115/1487258 Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Wei Huang <wei.huang2@amd.com> Message-Id: <20220422055146.3312226-1-wei.huang2@amd.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16hw/i386: Make pic a property of common x86 base machine typeXiaoyao Li6-30/+34
Legacy PIC (8259) cannot be supported for TDX guests since TDX module doesn't allow directly interrupt injection. Using posted interrupts for the PIC is not a viable option as the guest BIOS/kernel will not do EOI for PIC IRQs, i.e. will leave the vIRR bit set. Make PIC the property of common x86 machine type. Hence all x86 machines, including microvm, can disable it. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Sergio Lopez <slp@redhat.com> Message-Id: <20220310122811.807794-3-xiaoyao.li@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16hw/i386: Make pit a property of common x86 base machine typeXiaoyao Li6-51/+31
Both pc and microvm have pit property individually. Let's just make it the property of common x86 base machine type. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Sergio Lopez <slp@redhat.com> Message-Id: <20220310122811.807794-2-xiaoyao.li@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16include/hw/pci/pcie_host: Correct PCIE_MMCFG_SIZE_MAXFrancisco Iglesias1-1/+1
According to 7.2.2 in [1] bit 27 is the last bit that can be part of the bus number, this makes the ECAM max size equal to '1 << 28'. This patch restores back this value into the PCIE_MMCFG_SIZE_MAX define (which was changed in commit 58d5b22bbd5 ("ppc4xx: Add device models found in PPC440 core SoCs")). [1] PCI Express® Base Specification Revision 5.0 Version 1.0 Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com> Message-Id: <20220411221836.17699-3-frasse.iglesias@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16include/hw/pci/pcie_host: Correct PCIE_MMCFG_BUS_MASKFrancisco Iglesias1-2/+2
According to [1] address bits 27 - 20 are mapped to the bus number (the TLPs bus number field is 8 bits). Below is the formula taken from Table 7-1 in [1]. " Memory Address | PCI Express Configuration Space A[(20+n-1):20] | Bus Number, 1 ≤ n ≤ 8 " [1] PCI Express® Base Specification Revision 5.0 Version 1.0 Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com> Message-Id: <20220411221836.17699-2-frasse.iglesias@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16docs/vhost-user: Clarifications for VHOST_USER_ADD/REM_MEM_REGKevin Wolf1-0/+16
The specification for VHOST_USER_ADD/REM_MEM_REG messages is unclear in several points, which has led to clients having incompatible implementations. This changes the specification to be more explicit about them: * VHOST_USER_ADD_MEM_REG is not specified as receiving a file descriptor, though it obviously does need to do so. All implementations agree on this one, fix the specification. * VHOST_USER_REM_MEM_REG is not specified as receiving a file descriptor either, and it also has no reason to do so. rust-vmm does not send file descriptors for removing a memory region (in agreement with the specification), libvhost-user and QEMU do (which is a bug), though libvhost-user doesn't actually make any use of it. Change the specification so that for compatibility QEMU's behaviour becomes legal, even if discouraged, but rust-vmm's behaviour becomes the explicitly recommended mode of operation. * VHOST_USER_ADD_MEM_REG doesn't have a documented return value, which is the desired behaviour in the non-postcopy case. It also implemented like this in QEMU and rust-vmm, though libvhost-user is buggy and sometimes sends an unexpected reply. This will be fixed in a separate patch. However, in postcopy mode it does reply like VHOST_USER_SET_MEM_TABLE. This behaviour is shared between libvhost-user and QEMU; rust-vmm doesn't implement postcopy mode yet. Mention it explicitly in the spec. * The specification doesn't mention how VHOST_USER_REM_MEM_REG identifies the memory region to be removed. Change it to describe the existing behaviour of libvhost-user (guest address, user address and size must match). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220407133657.155281-2-kwolf@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-05-16vhost-user: more master/slave thingsMichael S. Tsirkin1-1/+1
we switched to front-end/back-end, but newer patches reintroduced old language. Fix this up. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16virtio: add vhost support for virtio devicesJonah Palmer12-1/+76
This patch adds a get_vhost() callback function for VirtIODevices that returns the device's corresponding vhost_dev structure, if the vhost device is running. This patch also adds a vhost_started flag for VirtIODevices. Previously, a VirtIODevice wouldn't be able to tell if its corresponding vhost device was active or not. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <1648819405-25696-3-git-send-email-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16virtio: drop name parameter for virtio_init()Jonah Palmer24-43/+77
This patch drops the name parameter for the virtio_init function. The pair between the numeric device ID and the string device ID (name) of a virtio device already exists, but not in a way that lets us map between them. This patch lets us do this and removes the need for the name parameter in the virtio_init function. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <1648819405-25696-2-git-send-email-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16virtio/vhost-user: dynamically assign VhostUserHostNotifiersAlex Bennée3-18/+108
At a couple of hundred bytes per notifier allocating one for every potential queue is very wasteful as most devices only have a few queues. Instead of having this handled statically dynamically assign them and track in a GPtrArray. [AJB: it's hard to trigger the vhost notifiers code, I assume as it requires a KVM guest with appropriate backend] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-14-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16hw/virtio/vhost-user: don't suppress F_CONFIG when supportedAlex Bennée3-13/+35
Previously we would silently suppress VHOST_USER_PROTOCOL_F_CONFIG during the protocol negotiation if the QEMU stub hadn't implemented the vhost_dev_config_notifier. However this isn't the only way we can handle config messages, the existing vdc->get/set_config can do this as well. Lightly re-factor the code to check for both potential methods and instead of silently squashing the feature error out. It is unlikely that a vhost-user backend expecting to handle CONFIG messages will behave correctly if they never get sent. Fixes: 1c3e5a2617 ("vhost-user: back SET/GET_CONFIG requests with a protocol feature") Cc: Maxime Coquelin <maxime.coquelin@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-13-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16include/hw: start documenting the vhost APIAlex Bennée1-10/+122
While trying to get my head around the nest of interactions for vhost devices I though I could start by documenting the key API functions. This patch documents the main API hooks for creating and starting a vhost device as well as how the configuration changes are handled. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220321153037.3622127-11-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16docs/devel: start documenting writing VirtIO devicesAlex Bennée2-0/+215
While writing my own VirtIO devices I've gotten confused with how things are structured and what sort of shared infrastructure there is. If we can document how everything is supposed to work we can then maybe start cleaning up inconsistencies in the code. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220309164929.19395-1-alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-10-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16libvhost-user: expose vu_request_to_stringAlex Bennée2-1/+10
This is useful for more human readable debug messages in vhost-user programs. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-9-alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16vhost-user.rst: add clarifying language about protocol negotiationAlex Bennée1-2/+16
Make the language about feature negotiation explicitly clear about the handling of the VHOST_USER_F_PROTOCOL_FEATURES feature bit. Try and avoid the sort of bug introduced in vhost.rs REPLY_ACK processing: https://github.com/rust-vmm/vhost/pull/24 Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Cc: Jiang Liu <gerry@linux.alibaba.com> Message-Id: <20210226111619.21178-1-alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-8-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16docs: vhost-user: replace master/slave with front-end/back-endPaolo Bonzini2-180/+180
This matches the nomenclature that is generally used. Also commonly used is client/server, but it is not as clear because sometimes the front-end exposes a passive (server) socket that the back-end connects to. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210226143413.188046-4-pbonzini@redhat.com> Message-Id: <20220321153037.3622127-7-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16docs: vhost-user: rewrite section on ring state machinePaolo Bonzini1-25/+21
This section is using the word "back-end" to refer to the "slave's back-end", and talking about the "client" for what the rest of the document calls the "slave". Rework it to free the use of the term "back-end", which in the next patch will replace "slave". Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210226143413.188046-3-pbonzini@redhat.com> Message-Id: <20220321153037.3622127-6-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16docs: vhost-user: clean up request/reply descriptionPaolo Bonzini1-68/+95
It is not necessary to mention which side is sending/receiving each payload; it is more interesting to say which is the request and which is the reply. This also matches what vhost-user-gpu.rst already does. While at it, ensure that all messages list both the request and the reply payload. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210226143413.188046-2-pbonzini@redhat.com> Message-Id: <20220321153037.3622127-5-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-05-16hw/virtio: add vhost_user_[read|write] trace pointsAlex Bennée2-0/+6
These are useful when trying to debug the initial vhost-user negotiation, especially when it hard to get logging from the low level library on the other side. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-4-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-05-16virtio-pci: add notification trace pointsAlex Bennée2-1/+9
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200925125147.26943-6-alex.bennee@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-3-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16hw/virtio: move virtio-pci.h into shared include spaceAlex Bennée21-20/+20
This allows other device classes that will be exposed via PCI to be able to do so in the appropriate hw/ directory. I resisted the temptation to re-order headers to be more aesthetically pleasing. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200925125147.26943-4-alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-2-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16vhost_net: Print feature masks in hexIlya Maximets1-2/+2
"0x200000000" is much more readable than "8589934592". The change saves one step (conversion) while debugging. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Message-Id: <20220318140440.596019-1-i.maximets@ovn.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-05-16intel-iommu: update iq_dw during post loadJason Wang1-6/+15
We need to update iq_dw according to the DMA_IRQ_REG during post load. Otherwise we may get wrong IOTLB invalidation descriptor after migration. Fixes: fb43cf739e ("intel_iommu: scalable mode emulation") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220317080522.14621-2-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2022-05-16intel-iommu: update root_scalable before switching as during post_loadJason Wang1-7/+7
We need check whether passthrough is enabled during vtd_switch_address_space() by checking the context entries. This requires the root_scalable to be set correctly otherwise we may try to check legacy rsvd bits instead of scalable ones. Fixing this by updating root_scalable before switching the address spaces during post_load. Fixes: fb43cf739e ("intel_iommu: scalable mode emulation") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220317080522.14621-1-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2022-05-16intel-iommu: block output address in interrupt address rangeJason Wang2-1/+30
According to vtd spec v3.3 3.14: """ Software must not program paging-structure entries to remap any address to the interrupt address range. Untranslated requests and translation requests that result in an address in the interrupt range will be blocked with condition code LGN.4 or SGN.8. """ This patch blocks the request that result in interrupt address range. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220210092815.45174-2-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2022-05-16intel-iommu: remove VTD_FR_RESERVED_ERRJason Wang2-11/+0
This fault reason is not used and is duplicated with SPT.2 condition code. So let's remove it. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220210092815.45174-1-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2022-05-16intel_iommu: Fix irqchip / X2APIC configuration checksDavid Woodhouse1-6/+1
We don't need to check kvm_enable_x2apic(). It's perfectly OK to support interrupt remapping even if we can't address CPUs above 254. Kind of pointless, but still functional. The check on kvm_enable_x2apic() needs to happen *anyway* in order to allow CPUs above 254 even without an IOMMU, so allow that to happen elsewhere. However, we do require the *split* irqchip in order to rewrite I/OAPIC destinations. So fix that check while we're here. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20220314142544.150555-4-dwmw2@infradead.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16intel_iommu: Only allow interrupt remapping to be enabled if it's supportedDavid Woodhouse1-1/+3
We should probably check if we were meant to be exposing IR, before letting the guest turn the IRE bit on. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220314142544.150555-3-dwmw2@infradead.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16intel_iommu: Support IR-only mode without DMA translationDavid Woodhouse2-4/+11
By setting none of the SAGAW bits we can indicate to a guest that DMA translation isn't supported. Tested by booting Windows 10, as well as Linux guests with the fix at https://git.kernel.org/torvalds/c/c40aaaac10 Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220314142544.150555-2-dwmw2@infradead.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16target/i386: Fix sanity check on max APIC ID / X2APIC enablementDavid Woodhouse3-9/+17
The check on x86ms->apic_id_limit in pc_machine_done() had two problems. Firstly, we need KVM to support the X2APIC API in order to allow IRQ delivery to APICs >= 255. So we need to call/check kvm_enable_x2apic(), which was done elsewhere in *some* cases but not all. Secondly, microvm needs the same check. So move it from pc_machine_done() to x86_cpus_init() where it will work for both. The check in kvm_cpu_instance_init() is now redundant and can be dropped. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20220314142544.150555-1-dwmw2@infradead.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13vhost: Fix element in vhost_svq_add failureEugenio Pérez1-0/+8
Coverity rightly reports that is not free in that case. Fixes: Coverity CID 1487559 Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding") Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220512175747.142058-7-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13hw/virtio: Replace g_memdup() by g_memdup2()Philippe Mathieu-Daudé2-4/+5
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538 The old API took the size of the memory to duplicate as a guint, whereas most memory functions take memory sizes as a gsize. This made it easy to accidentally pass a gsize to g_memdup(). For large values, that would lead to a silent truncation of the size from 64 to 32 bits, and result in a heap area being returned which is significantly smaller than what the caller expects. This can likely be exploited in various modules to cause a heap buffer overflow. Replace g_memdup() by the safer g_memdup2() wrapper. Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20220512175747.142058-6-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13vdpa: Fix index calculus at vhost_vdpa_svqs_startEugenio Pérez1-1/+1
With the introduction of MQ the index of the vq needs to be calculated with the device model vq_index. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220512175747.142058-5-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13vdpa: Fix bad index calculus at vhost_vdpa_get_vring_baseEugenio Pérez1-2/+2
Fixes: 6d0b222666 ("vdpa: Adapt vhost_vdpa_get_vring_base to SVQ") Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220512175747.142058-4-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13vhost: Fix device's used descriptor dequeueEugenio Pérez1-2/+15
Only the first one of them were properly enqueued back. Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding") Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220512175747.142058-3-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13vhost: Track descriptor chain in private at SVQEugenio Pérez2-5/+13
The device could have access to modify them, and it definitely have access when we implement packed vq. Harden SVQ maintaining a private copy of the descriptor chain. Other fields like buffer addresses are already maintained sepparatedly. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220512175747.142058-2-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13docs/cxl: Add initial Compute eXpress Link (CXL) documentation.Jonathan Cameron2-0/+303
Provide an introduction to the main components of a CXL system, with detailed explanation of memory interleaving, example command lines and kernel configuration. This was a challenging document to write due to the need to extract only that subset of CXL information which is relevant to either users of QEMU emulation of CXL or to those interested in the implementation. Much of CXL is concerned with specific elements of the protocol, management of memory pooling etc which is simply not relevant to what is currently planned for CXL emulation in QEMU. All comments welcome Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20220429144110.25167-43-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13qtest/cxl: Add more complex test cases with CFMWsBen Widawsky1-3/+5
Add CXL Fixed Memory Windows to the CXL tests. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20220429144110.25167-40-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13tests/acpi: Add tables for CXL emulation.Jonathan Cameron3-2/+0
Tables that differ from normal Q35 tables when running the CXL test. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20220429144110.25167-39-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13qtests/bios-tables-test: Add a test for CXL emulation.Jonathan Cameron1-0/+44
The DSDT includes several CXL specific elements and the CEDT table is only present if we enable CXL. The test exercises all current functionality with several CFMWS, CHBS structures in CEDT and ACPI0016/ACPI00017 and _OSC entries in DSDT. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20220429144110.25167-38-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13tests/acpi: q35: Allow addition of a CXL test.Jonathan Cameron3-0/+2
Add exceptions for the DSDT and the new CEDT tables specific to a new CXL test in the following patch. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20220429144110.25167-37-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-13i386/pc: Enable CXL fixed memory windowsJonathan Cameron1-1/+30
Add the CFMWs memory regions to the memorymap and adjust the PCI window to avoid hitting the same memory. Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com> Message-Id: <20220429144110.25167-36-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>