Age | Commit message (Collapse) | Author | Files | Lines |
|
VirtIOPMEMClass's parent is VirtioDeviceClass rather than VirtIODevice.
This isn't catastrophic only because sizeof(VirtIODevice) >
sizeof(VirtioDeviceClass).
Fixes: 5f503cd9f388 ("virtio-pmem: add virtio device")
Closes: https://lists.gnu.org/archive/html/qemu-devel/2025-06/msg00586.html
Reported-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-ID: <20250606092406.229833-3-zhenzhong.duan@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
Parent of VirtIOMEMClass is VirtioDeviceClass rather than VirtIODevice.
This isn't catastrophic only because sizeof(VirtIODevice) >
sizeof(VirtioDeviceClass).
Fixes: 910b25766b33 ("virtio-mem: Paravirtualized memory hot(un)plug")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250606092406.229833-2-zhenzhong.duan@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
into staging
virtio,pci,pc: features, fixes, tests
vhost will now no longer set a call notifier if unused
some work towards loongarch testing based on bios-tables-test
some core pci work for SVM support in vtd
vhost vdpa init has been optimized for response time to QMP
A couple more fixes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmg97ZUPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpRBsH/0Fx4NNMaynXmVOgV1rMFirTydhQG5NSdeJv
# i1RHd25Rne/RXH0CL71UPuOPADWh6bv9iZTg6RU6g7TwI8K9v3M0R71RlPLh1Lh1
# x7fifWNSNXVi18fM9/j+mIg7I2Ye0AaqveezRJWGzqoOxQKKlVI2xspKZBCCkygd
# i2tgtR1ORB6+ji6wVoTDPlL42X5Jef5MUT3XOcRR5biHm0JfqxxQKVM83mD+5yMI
# 0YqjT2BVRzo5rGN7mSuf7tQ50xI6I0wI1+eoWeKHRbg08f709M8TZRDKuVh24Evg
# 9WnIhKLTzRVdCNLNbw9h9EhxoANpWCyvmnn6GCfkJui40necFHY=
# =0lO6
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 02 Jun 2025 14:29:41 EDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (26 commits)
hw/i386/pc_piix: Fix RTC ISA IRQ wiring of isapc machine
vdpa: move memory listener register to vhost_vdpa_init
vdpa: move iova_tree allocation to net_vhost_vdpa_init
vdpa: reorder listener assignment
vdpa: add listener_registered
vdpa: set backend capabilities at vhost_vdpa_init
vdpa: reorder vhost_vdpa_set_backend_cap
vdpa: check for iova tree initialized at net_client_start
vhost: Don't set vring call if guest notifier is unused
tests/qtest/bios-tables-test: Use MiB macro rather hardcode value
tests/data/uefi-boot-images: Add ISO image for LoongArch system
uefi-test-tools:: Add LoongArch64 support
pci: Add a PCI-level API for PRI
pci: Add a pci-level API for ATS
pci: Add a pci-level initialization function for IOMMU notifiers
memory: Store user data pointer in the IOMMU notifiers
pci: Add an API to get IOMMU's min page size and virtual address width
pci: Cache the bus mastering status in the device
pcie: Helper functions to check to check if PRI is enabled
pcie: Add a helper to declare the PRI capability for a pcie device
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
As we are moving to keep the mapping through all the vdpa device life
instead of resetting it at VirtIO reset, we need to move all its
dependencies to the initialization too. In particular devices with
x-svq=on need a valid iova_tree from the beginning.
Simplify the code also consolidating the two creation points: the first
data vq in case of SVQ active and CVQ start in case only CVQ uses it.
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-7-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Check if the listener has been registered or not, so it needs to be
registered again at start.
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20250522145839.59974-5-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
VIRTIO_PCI_FLAG_DISABLE_PCIE was only used by the hw_compat_2_4[]
array, via the 'x-disable-pcie=false' property. We removed all
machines using that array, lets remove all the code around
VIRTIO_PCI_FLAG_DISABLE_PCIE (see commit 9a4c0e220d8 for similar
VIRTIO_PCI_FLAG_* enum removal).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250512083948.39294-9-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
|
|
VIRTIO_PCI_FLAG_MIGRATE_EXTRA was only used by the
hw_compat_2_4[] array, via the 'migrate-extra=true'
property. We removed all machines using that array,
lets remove all the code around VIRTIO_PCI_FLAG_MIGRATE_EXTRA.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20250512083948.39294-8-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
|
|
Live migration should be terminated if the vhost-user backend crashes
before the migration completes.
Specifically, since the vhost device will be stopped when VM is stopped
before the end of the live migration, in current implementation if the
backend crashes, vhost-user device set_status() won't return failure,
live migration won't perceive the disconnection between QEMU and the
backend.
When the VM is migrated to the destination, the inflight IO will be
resubmitted, and if the IO was completed out of order before, it will
cause IO error.
To fix this issue:
1. Add the return value to set_status() for VirtioDeviceClass.
a. For the vhost-user device, return failure when the backend crashes.
b. For other virtio devices, always return 0.
2. Return failure if vhost_dev_stop() failed for vhost-user device.
If QEMU loses connection with the vhost-user backend, virtio set_status()
can return failure to the upper layer, migration_completion() can handle
the error, terminate the live migration, and restore the VM, so that
inflight IO can be completed normally.
Signed-off-by: Haoqian He <haoqian.he@smartx.com>
Message-Id: <20250416024729.3289157-4-haoqian.he@smartx.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This patch captures the error of vhost_virtqueue_stop() in vhost_dev_stop()
and returns the error upward.
Specifically, if QEMU is disconnected from the vhost backend, some actions
in vhost_dev_stop() will fail, such as sending vhost-user messages to the
backend (GET_VRING_BASE, SET_VRING_ENABLE) and vhost_reset_status.
Considering that both set_vring_enable and vhost_reset_status require setting
the specific virtio feature bit, we can capture vhost_virtqueue_stop()'s
error to indicate that QEMU has lost connection with the backend.
This patch is the pre patch for 'vhost-user: return failure if backend crashes
when live migration', which makes the live migration aware of the loss of
connection with the vhost-user backend and aborts the live migration.
Signed-off-by: Haoqian He <haoqian.he@smartx.com>
Message-Id: <20250416024729.3289157-3-haoqian.he@smartx.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Allow user to attach SR-IOV VF to a virtio-pci PF.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250314-sriov-v9-6-57dae8ae3ab5@daynix.com>
Tested-by: Yui Washizu <yui.washidu@gmail.com>
Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250424194905.82506-6-philmd@linaro.org>
|
|
Mechanical change using gsed, then style manually adapted
to pass checkpatch.pl script.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250424194905.82506-4-philmd@linaro.org>
|
|
Convert the existing includes with
sed -i ,exec/memory.h,system/memory.h,g
Move the include within cpu-all.h into a !CONFIG_USER_ONLY block.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Previously the ctrl virtqueue was handled in the AioContext where SCSI
requests are processed. When IOThread Virtqueue Mapping was added things
become more complicated because SCSI requests could run in other
AioContexts.
Simplify by handling the ctrl virtqueue in the main loop where reset
operations can be performed. Note that BHs are still used canceling SCSI
requests in their AioContexts but at least the mean loop activity
doesn't need BHs anymore.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250311132616.1049687-13-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Allow virtio-scsi virtqueues to be assigned to different IOThreads. This
makes it possible to take advantage of host multi-queue block layer
scalability by assigning virtqueues that have affinity with vCPUs to
different IOThreads that have affinity with host CPUs. The same feature
was introduced for virtio-blk in the past:
https://developers.redhat.com/articles/2024/09/05/scaling-virtio-blk-disk-io-iothread-virtqueue-mapping
Here are fio randread 4k iodepth=64 results from a 4 vCPU guest with an
Intel P4800X SSD:
iothreads IOPS
------------------------------
1 189576
2 312698
4 346744
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250311132616.1049687-12-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
[kwolf: Updated 051 output, virtio-scsi can now use any iothread]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
The code that builds an array of AioContext pointers indexed by the
virtqueue is not specific to virtio-blk. virtio-scsi will need to do the
same thing, so extract the functions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311132616.1049687-11-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
The block layer can invoke the resize callback from any AioContext that
is processing requests. The virtqueue is already protected but the
events_dropped field also needs to be protected against races. Cover it
using the event virtqueue lock because it is closely associated with
accesses to the virtqueue.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311132616.1049687-7-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Virtqueues are not thread-safe. Until now this was not a major issue
since all virtqueue processing happened in the same thread. The ctrl
queue's Task Management Function (TMF) requests sometimes need the main
loop, so a BH was used to schedule the virtqueue completion back in the
thread that has virtqueue access.
When IOThread Virtqueue Mapping is introduced in later commits, event
and ctrl virtqueue accesses from other threads will become necessary.
Introduce an optional per-virtqueue lock so the event and ctrl
virtqueues can be protected in the commits that follow.
The addition of the ctrl virtqueue lock makes
virtio_scsi_complete_req_from_main_loop() and its BH unnecessary.
Instead, take the ctrl virtqueue lock from the main loop thread.
The cmd virtqueue does not have a lock because the entirety of SCSI
command processing happens in one thread. Only one thread accesses the
cmd virtqueue and a lock is unnecessary.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311132616.1049687-6-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
|
Apple has its own virtio-blk PCI device ID where it deviates from the
official virtio-pci spec slightly: It puts a new "apple type"
field at a static offset in config space and introduces a new barrier
command.
This patch first creates a mechanism for virtio-blk downstream classes to
handle unknown commands. It then creates such a downstream class and a new
vmapple-virtio-blk-pci class which support the additional apple type config
identifier as well as the barrier command.
The 'aux' or 'root' device type are selected using the 'variant' property.
Signed-off-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20241223221645.29911-13-phil@philjordan.eu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
When a machine is first booted, all virtio balloon stats are initialized
to their default value -1 (18446744073709551615 when represented as
unsigned).
They remain that way while the firmware is loading, and early phase of
guest OS boot, until the virtio-balloon driver is activated. Thereafter
the reported stats reflect the guest OS activity.
When a machine reset is performed, however, the virtio-balloon stats are
left unchanged by QEMU, despite the guest OS no longer updating them,
nor indeed even still existing.
IOW, the mgmt app keeps getting stale stats until the guest OS starts
once more and loads the virtio-balloon driver (if ever). At that point
the app will see a discontinuity in the reported values as they sudden
jump from the stale value to the new value. This jump is indigituishable
from a valid data update.
While there is an "last-updated" field to report on the freshness of
the stats, that does not unambiguously tell the mgmt app whether the
stats are still conceptually relevant to the current running workload.
It is more conceptually useful to reset the stats to their default
values on machine reset, given that the previous guest workload the
stats reflect no longer exists. The mgmt app can now clearly identify
that there are is no stats information available from the current
executing workload.
The 'last-updated' time is also reset back to 0.
IOW, on every machine reset, the virtio stats are in the same clean
state they were when the macine first powered on.
A functional test is added to validate this behaviour with a real
world guest OS.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20250204094202.2183262-1-berrange@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Migration state transfer interface is only used by vhost-user-fs,
so the interface needs to be defined only when vhost is built.
But I need to use this interface with virtio-net and vhost is not always
enabled, and to avoid undefined reference error during build, define stub
functions for vhost_supports_device_state(), vhost_save_backend_state() and
vhost_load_backend_state().
Cc: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20250115135044.799698-2-lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Add the VIRTIO_GPU_F_RESOURCE_UUID feature to enable the assignment
of resources UUIDs for export to other virtio devices.
Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Message-Id: <20241007070013.3350752-1-dbassey@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
staging
Hi,
"Host Memory Backends" and "Memory devices" queue ("mem"):
- Fixup handling of virtio-mem unplug during system resets, as
preparation for s390x support (especially kdump in the Linux guest)
- virtio-mem support for s390x
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmdnFD4RHGRhdmlkQHJl
# ZGhhdC5jb20ACgkQTd4Q9wD/g1rWBBAAp7WkYaNAjRy1PgpjNZ3z1gUJc/vk+skJ
# xVgGodA8txrJOFpNrbTyfhrdLs2TV4oWDvB/zrZRRtuxvur3O1EhFd9k6EqXuydr
# 0FunvLvVJwRHfEZycjN4aacQMRH3CJw07OaTzexeSl5UR/6w5PRofwUK4HX7W/Ka
# arqomGa3OJrs1+WgkV0Qcn4vh9HLRVv3iNC2Xo4W1wOCr1Du9zSPn9oC7zOQ0EO4
# ZC//7QsdkNRjUX/yMXMkhlSXx3b/RmRg2DBrxo7BZXg27VwGu4uHxL4LRBZiB2A7
# V9MqFOcVKzPMkXKTRjrgZ0vXQx1MPJ6WprEihMzMpYU6DrpA7KN/l8Ca8H24B2ln
# h7+bmkDsHVVcWovE9ii/9cMRfws6uWXXg3KoA8RQ8IbX1tU02lblw2uHhXEzcoge
# npqp/Z5LAiKVMetEnNnLH5thjut5PAEjuqD00cmZAMy4DNngLX2bGSdzMeVBkDMa
# 78ehLGRplm3t7ibUfaZaMKe6UD9tFrcD6XKsvUTXXHNbYO8ynbx58WOxSZmY98zU
# n3JNQRqtXYjBVlH3Dqm47vOTZHgOzFv3raa8BmSLpcBDeTXCTcUIl20s77dGw/vT
# r5YNCMN7O4YPFKUoRK9604QTgw6qlYaRTQlJD09usprGqVylb6gQtfZZuZkYDMp8
# sEI77QHsePA=
# =HDxr
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 21 Dec 2024 14:17:18 EST
# gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg: issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
# gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
# gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A
* tag 'mem-2024-12-21' of https://github.com/davidhildenbrand/qemu:
s390x: virtio-mem support
s390x/virtio-ccw: add support for virtio based memory devices
s390x: remember the maximum page size
s390x/pv: prepare for memory devices
s390x/s390-virtio-ccw: prepare for memory devices
s390x/s390-skeys: prepare for memory devices
s390x/s390-stattrib-kvm: prepare for memory devices and sparse memory layouts
s390x/s390-hypercall: introduce DIAG500 STORAGE_LIMIT
s390x: introduce s390_get_memory_limit()
s390x/s390-virtio-ccw: move setting the maximum guest size from sclp to machine code
s390x: rename s390-virtio-hcall* to s390-hypercall*
s390x/s390-virtio-hcall: prepare for more diag500 hypercalls
s390x/s390-virtio-hcall: remove hypercall registration mechanism
s390x/s390-virtio-ccw: don't crash on weird RAM sizes
virtio-mem: unplug memory only during system resets, not device resets
Conflicts:
- hw/s390x/s390-stattrib-kvm.c
sysemu/ -> system/ header rename conflict.
- hw/s390x/virtio-ccw-mem.c
Make Property array const and removed DEFINE_PROP_END_OF_LIST() to
conform to the latest conventions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
We recently converted from the LegacyReset to the new reset framework
in commit c009a311e939 ("virtio-mem: Use new Resettable framework instead
of LegacyReset") to be able to use the ResetType to filter out wakeup
resets.
However, this change had an undesired implications: as we override the
Resettable interface methods in VirtIOMEMClass, the reset handler will
not only get called during system resets (i.e., qemu_devices_reset())
but also during any direct or indirect device rests (e.g.,
device_cold_reset()).
Further, we might now receive two reset callbacks during
qemu_devices_reset(), first when reset by a parent and later when reset
directly.
The memory state of virtio-mem devices is rather special: it's supposed to
be persistent/unchanged during most resets (similar to resetting a hard
disk will not destroy the data), unless actually cold-resetting the whole
system (different to a hard disk where a reboot will not destroy the data):
ripping out system RAM is something guest OSes don't particularly enjoy,
but we want to detect when rebooting to an OS that does not support
virtio-mem and wouldn't be able to detect+use the memory -- and we want
to force-defragment hotplugged memory to also shrink the usable device
memory region. So we rally want to catch system resets to do that.
On supported targets (e.g., x86), getting a cold reset on the
device/parent triggers is not that easy (but looks like PCI code
might trigger it), so this implication went unnoticed.
However, with upcoming s390x support it is problematic: during
kdump, s390x triggers a subsystem reset, ending up in
s390_machine_reset() and calling only subsystem_reset() instead of
qemu_devices_reset() -- because it's not a full system reset.
In subsystem_reset(), s390x performs a device_cold_reset() of any
TYPE_VIRTUAL_CSS_BRIDGE device, which ends up resetting all children,
including the virtio-mem device. Consequently, we wrongly detect a system
reset and unplug all device memory, resulting in hotplugged memory not
getting included in the crash dump -- undesired.
We really must not mess with hotplugged memory state during simple
device resets. To fix, create+register a new reset object that will only
get triggered during qemu_devices_reset() calls, but not during any other
resets as it is logically not the child of any other object.
Message-ID: <20241025104103.342188-1-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Juraj Marcin <jmarcin@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.
Files renamed manually then mechanical change using sed tool.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
|
|
Call virtio_net_set_multiqueue() to add queues before loading their
states. Otherwise the loaded queues will not have handlers and elements
in them will not be processed.
Cc: qemu-stable@nongnu.org
Fixes: 8c49756825da ("virtio-net: Add only one queue pair when realizing")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
|
|
In virtio-net.c we assume that the IP length field in the packet is
aligned, and we copy its address into a uint16_t* in the
VirtioNetRscUnit struct which we then dereference later. This isn't
a safe assumption; it will also result in compilation failures if we
mark the ip_header struct as QEMU_PACKED because the compiler will
not let you take the address of an unaligned struct field.
Make the ip_plen field in VirtioNetRscUnit a void*, and make all the
places where we read or write through that pointer instead use some
new accessor functions read_unit_ip_len() and write_unit_ip_len()
which account for the pointer being potentially unaligned and also do
the network-byte-order conversion we were previously using htons() to
perform.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241114141619.806652-2-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
Coverity reports (CID 1564769, 1564770) that we potentially overflow
by doing some 32x32 multiplies for something that ends up in a 64 bit
value. Fix this by first using stride for all lines and casting input
to uint64_t to ensure a 64 bit multiply is used.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Message-ID: <20241111230040.68470-3-alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
There are two identical sequences of a code doing the same thing that
raise warnings with Coverity. Before fixing those issues lets factor
out the common code into a helper function we can share.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Message-ID: <20241111230040.68470-2-alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
into staging
virtio,pc,pci: features, fixes, cleanups
CXL now can use Generic Port Affinity Structures.
CXL now allows control of link speed and width
vhost-user-blk now supports live resize, by means of
a new device-sync-config command
amd iommu now supports interrupt remapping
pcie devices now report extended tag field support
intel_iommu dropped support for Transient Mapping, to match VTD spec
arch agnostic ACPI infrastructure for vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmcpNqUPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp/2oH/0qO33prhDa48J5mqT9NuJzzYwp5QHKF9Zjv
# fDAplMUEmfxZIEgJchcyDWPYTGX2geT4pCFhRWioZMIR/0JyzrFgSwsk1kL88cMh
# 46gzhNVD6ybyPJ7O0Zq3GLy5jo7rlw/n+fFxKAuRCzcbK/fmH8gNC+RwW1IP64Na
# HDczYilHUhnO7yKZFQzQNQVbK4BckrG1bu0Fcx0EMUQBf4V6x7GLOrT+3hkKYcr6
# +DG5DmUmv20or/FXnu2Ye+MzR8Ebx6JVK3A3sXEE4Ns2CCzK9QLzeeyc2aU13jWN
# OpZ6WcKF8HqYprIwnSsMTxhPcq0/c7TvrGrazVwna5RUBMyjjvc=
# =zSX4
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Nov 2024 21:03:33 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (65 commits)
intel_iommu: Add missed reserved bit check for IEC descriptor
intel_iommu: Add missed sanity check for 256-bit invalidation queue
intel_iommu: Send IQE event when setting reserved bit in IQT_TAIL
hw/acpi: Update GED with vCPU Hotplug VMSD for migration
tests/qtest/bios-tables-test: Update DSDT golden masters for x86/{pc,q35}
hw/acpi: Update ACPI `_STA` method with QOM vCPU ACPI Hotplug states
qtest: allow ACPI DSDT Table changes
hw/acpi: Make CPUs ACPI `presence` conditional during vCPU hot-unplug
hw/pci: Add parenthesis to PCI_BUILD_BDF macro
hw/cxl: Ensure there is enough data to read the input header in cmd_get_physical_port_state()
hw/cxl: Ensure there is enough data for the header in cmd_ccls_set_lsa()
hw/cxl: Check that writes do not go beyond end of target attributes
hw/cxl: Ensuring enough data to read parameters in cmd_tunnel_management_cmd()
hw/cxl: Avoid accesses beyond the end of cel_log.
hw/cxl: Check the length of data requested fits in get_log()
hw/cxl: Check enough data in cmd_firmware_update_transfer()
hw/cxl: Check input length is large enough in cmd_events_clear_records()
hw/cxl: Check input includes at least the header in cmd_features_set_feature()
hw/cxl: Check size of input data to dynamic capacity mailbox commands
hw/cxl/cxl-mailbox-util: Fix output buffer index update when retrieving DC extents
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
During the hot-unplugging of vhost-user-net type network cards,
the vhost_user_cleanup function may add the same rcu node to
the rcu linked list. The function call in this case is as follows:
vhost_user_cleanup
->vhost_user_host_notifier_remove
->call_rcu(n, vhost_user_host_notifier_free, rcu);
->g_free_rcu(n, rcu);
When this happens, QEMU will abort in try_dequeue:
if (head == &dummy && qatomic_mb_read(&tail) == &dummy.next) {
abort();
}
backtrace is as follows:
0 __pthread_kill_implementation () at /usr/lib64/libc.so.6
1 raise () at /usr/lib64/libc.so.6
2 abort () at /usr/lib64/libc.so.6
3 try_dequeue () at ../util/rcu.c:235
4 call_rcu_thread (0) at ../util/rcu.c:288
5 qemu_thread_start (0) at ../util/qemu-thread-posix.c:541
6 start_thread () at /usr/lib64/libc.so.6
7 clone3 () at /usr/lib64/libc.so.6
The reason for the abort is that adding two identical nodes to
the rcu linked list will cause the rcu linked list to become a ring,
but when the dummy node is added after the two identical nodes,
the ring is opened. But only one node is added to list with
rcu_call_count added twice. This will cause rcu try_dequeue abort.
This happens when n->addr != 0. In some scenarios, this does happen.
For example, this situation will occur when using a 32-queue DPU
vhost-user-net type network card for hot-unplug testing, because
VhostUserHostNotifier->addr will be cleared during the processing of
VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG. However,it is asynchronous,
so we cannot guarantee that VhostUserHostNotifier->addr is zero in
vhost_user_cleanup. Therefore, it is necessary to merge g_free_rcu
and vhost_user_host_notifier_free into one rcu node.
Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers")
Signed-off-by: yaozhenguo <yaozhenguo@jd.com>
Message-Id: <20241011102913.45582-1-yaozhenguo@jd.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
As shown below, if a virtio PCI device is attached under a pci-bridge, the MR
of VirtIOPCIRegion does not belong to any address space. So memory_region_find
cannot be used to search for this MR.
Introduce the virtio-pci and pci_bridge address spaces to solve this problem.
Before:
memory-region: pci_bridge_pci
0000000000000000-ffffffffffffffff (prio 0, i/o): pci_bridge_pci
00000000fe840000-00000000fe840fff (prio 1, i/o): virtio-net-pci-msix
00000000fe840000-00000000fe84003f (prio 0, i/o): msix-table
00000000fe840800-00000000fe840807 (prio 0, i/o): msix-pba
0000380000000000-0000380000003fff (prio 1, i/o): virtio-pci
0000380000000000-0000380000000fff (prio 0, i/o): virtio-pci-common-virtio-net
0000380000001000-0000380000001fff (prio 0, i/o): virtio-pci-isr-virtio-net
0000380000002000-0000380000002fff (prio 0, i/o): virtio-pci-device-virtio-net
0000380000003000-0000380000003fff (prio 0, i/o): virtio-pci-notify-virtio-net
After:
address-space: virtio-pci-cfg-mem-as
0000380000000000-0000380000003fff (prio 1, i/o): virtio-pci
0000380000000000-0000380000000fff (prio 0, i/o): virtio-pci-common-virtio-net
0000380000001000-0000380000001fff (prio 0, i/o): virtio-pci-isr-virtio-net
0000380000002000-0000380000002fff (prio 0, i/o): virtio-pci-device-virtio-net
0000380000003000-0000380000003fff (prio 0, i/o): virtio-pci-notify-virtio-net
address-space: pci_bridge_pci_mem
0000000000000000-ffffffffffffffff (prio 0, i/o): pci_bridge_pci
00000000fe840000-00000000fe840fff (prio 1, i/o): virtio-net-pci-msix
00000000fe840000-00000000fe84003f (prio 0, i/o): msix-table
00000000fe840800-00000000fe840807 (prio 0, i/o): msix-pba
0000380000000000-0000380000003fff (prio 1, i/o): virtio-pci
0000380000000000-0000380000000fff (prio 0, i/o): virtio-pci-common-virtio-net
0000380000001000-0000380000001fff (prio 0, i/o): virtio-pci-isr-virtio-net
0000380000002000-0000380000002fff (prio 0, i/o): virtio-pci-device-virtio-net
0000380000003000-0000380000003fff (prio 0, i/o): virtio-pci-notify-virtio-net
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2576
Fixes: ffa8a3e3b2e6 ("virtio-pci: Add lookup subregion of VirtIOPCIRegion MR")
Co-developed-by: Zuo Boqun <zuoboqun@baidu.com>
Signed-off-by: Zuo Boqun <zuoboqun@baidu.com>
Co-developed-by: Wang Liang <wangliang44@baidu.com>
Signed-off-by: Wang Liang <wangliang44@baidu.com>
Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
Message-Id: <20241030131324.34144-1-gaoshiyuan@baidu.com>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
* target/i386: new feature bits for AMD processors
* target/i386/tcg: improvements around flag handling
* target/i386: add AVX10 support
* target/i386: add GraniteRapids-v2 model
* dockerfiles: add libcbor
* New nitro-enclave machine type
* qom: cleanups to object_new
* configure: detect 64-bit MIPS for rust
* configure: deprecate 32-bit MIPS
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmcjvkQUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPIKgf/etNpO2T+eLFtWN/Qd5eopBXqNd9k
# KmeK9EgW9lqx2IPGNen33O+uKpb/TsMmubSsSF+YxTp7pmkc8+71f3rBMaIAD02r
# /paHSMVw0+f12DAFQz1jdvGihR7Mew0wcF/UdEt737y6vEmPxLTyYG3Gfa4NSZwT
# /V5jTOIcfUN/UEjNgIp6NTuOEESKmlqt22pfMapgkwMlAJYeeJU2X9eGYE86wJbq
# ZSXNgK3jL9wGT2XKa3e+OKzHfFpSkrB0JbQbdico9pefnBokN/hTeeUJ81wBAc7u
# i00W1CEQVJ5lhBc121d4AWMp83ME6HijJUOTMmJbFIONPsITFPHK1CAkng==
# =D4nR
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 31 Oct 2024 17:28:36 GMT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream-i386' of https://gitlab.com/bonzini/qemu: (49 commits)
target/i386: Introduce GraniteRapids-v2 model
target/i386: Add AVX512 state when AVX10 is supported
target/i386: Add feature dependencies for AVX10
target/i386: add CPUID.24 features for AVX10
target/i386: add AVX10 feature and AVX10 version property
target/i386: return bool from x86_cpu_filter_features
target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits
target/i386: cpu: set correct supported XCR0 features for TCG
target/i386: use + to put flags together
target/i386: use higher-precision arithmetic to compute CF
target/i386: use compiler builtin to compute PF
target/i386: make flag variables unsigned
target/i386: add a note about gen_jcc1
target/i386: add a few more trivial CCPrepare cases
target/i386: optimize TEST+Jxx sequences
target/i386: optimize computation of ZF from CC_OP_DYNAMIC
target/i386: Wrap cc_op_live with a validity check
target/i386: Introduce cc_op_size
target/i386: Rearrange CCOp
target/i386: remove CC_OP_CLR
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Nitro Secure Module (NSM)[1] device is used in AWS Nitro Enclaves[2]
for stripped down TPM functionality like cryptographic attestation.
The requests to and responses from NSM device are CBOR[3] encoded.
This commit adds support for NSM device in QEMU. Although related to
AWS Nitro Enclaves, the virito-nsm device is independent and can be
used in other machine types as well. The libcbor[4] library has been
used for the CBOR encoding and decoding functionalities.
[1] https://lists.oasis-open.org/archives/virtio-comment/202310/msg00387.html
[2] https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave.html
[3] http://cbor.io/
[4] https://libcbor.readthedocs.io/en/latest/
Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-3-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Request Venus when initializing VirGL and if venus=true flag is set for
virtio-gpu-gl device.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Message-Id: <20241024210311.118220-14-dmitry.osipenko@collabora.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
virtio_gpu_virgl_get_num_capsets will return "num_capsets", but we can't
assume that capset_index 1 is always VIRGL2 once we'll support more capsets,
like Venus and DRM capsets. Register capsets dynamically to avoid that problem.
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Message-Id: <20241024210311.118220-13-dmitry.osipenko@collabora.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
Support BLOB resources creation, mapping, unmapping and set-scanout by
calling the new stable virglrenderer 0.10 interface. Only enabled when
available and via the blob config. E.g. -device virtio-vga-gl,blob=true
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com> # added set_scanout_blob
Signed-off-by: Xenia Ragiadakou <xenia.ragiadakou@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Message-Id: <20241024210311.118220-12-dmitry.osipenko@collabora.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
virtio_gpu_virgl_init() may fail, leading to a further Qemu crash
because Qemu assumes it never fails. Check virtio_gpu_virgl_init()
return code and don't execute virtio commands on error. Failed
virtio_gpu_virgl_init() will result in a timed out virtio commands
for a guest OS.
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Message-Id: <20241024210311.118220-5-dmitry.osipenko@collabora.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
Move print_stats timer to VirtIOGPUGL for consistency with
cmdq_resume_bh and fence_poll that are used only by GL device.
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Message-Id: <20241024210311.118220-4-dmitry.osipenko@collabora.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
Move fence_poll timer to VirtIOGPUGL for consistency with cmdq_resume_bh
that are used only by GL device.
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Message-Id: <20241024210311.118220-3-dmitry.osipenko@collabora.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
|
|
Use a common shareable type for win32 & unix, and helper functions.
This simplify the code as it avoids a lot of #ifdef'ery.
Note: if it helps review, commits could be reordered to introduce the
common type before introducing shareable memory for unix.
Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20241008125028.1177932-19-marcandre.lureau@redhat.com>
|
|
Similar to what was done in commit 9462ff46 ("virtio-gpu/win32: allocate
shareable 2d resources/images") for win32, allocate resource memory with
memfd, so the associated display surface memory can be shared with a
different process.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20241008125028.1177932-18-marcandre.lureau@redhat.com>
|
|
vhost_dev_load_inflight and vhost_dev_save_inflight have been
unused since they were added in 2019 by:
5ad204bf2a ("vhost-user: Support transferring inflight buffer between qemu and backend")
Remove them, and their helper vhost_dev_resize_inflight.
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
|
|
staging
Hi,
"Host Memory Backends" and "Memory devices" queue ("mem"):
- Kconfig fix for virtio-based memory devices
- virtio-mem support for suspend+wake-up with plugged memory
- hostmem fix when specifying "merge=off"
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmbyikMRHGRhdmlkQHJl
# ZGhhdC5jb20ACgkQTd4Q9wD/g1q6MBAAitNST73Shc+j327WvRLHQDkzkAlIYm+M
# E8NqtDiV11h7A0eNVu+5BkY/ejtY0Fduae3nxIkrHjK20eHHpiNPUp3hBNIhkKs3
# vlSaU8FLGdt58CteMGcLYsP2E32WNNTckaFGwGjDmyUEfk+Gug4r/rJAZXDfuuLV
# 083I0/MuUF+ozPA0c2MrOwhoBPerg3a5aflVpbgPwGNrT9BHMjo62Q5QzG3U7mxr
# HnlLAScSXsYg2z+d5XLXkKLAiZ4C7UN4vfUAOZwqkfs7IFUTtFO/ev6e7VZI747n
# XhAqOAKzLqPu7tBPZJIC6jwZAUIv5yM0/v5qhVvVVdu7H0ZMtSCXyvCVtnT25Rsn
# yiA+XvCOb7yQ3hRbBIi60IzjNYfWbvw+oTVIDfXkG35TeNf4ZdjWtAiUmw9s5U9Q
# z0tINsD7VlSkbh5h3PkFw1+xagIuJAVkp673HHTtQsg+xgYK2ur5jhhWJdJlnpzA
# 77CAu07UaqU39ssnC2zeGG1eNRA4uzjwQtREzqH2jMfkw/7UuUeXMF+v/fEuLn6w
# JneSMq/a0gmD42HNae0Y40cn2Akfj6+wFu1rW3djF8F6TeLUSssQhbQSHCMwGoOg
# qX7O/3SeSRzlnp3Zyx9Tr7s+BkMz0EGGDe17GQwTQUX2t5wR5iXoGqpKZgOBA8En
# 6uUIcjBUckc=
# =PExj
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 24 Sep 2024 10:45:39 BST
# gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg: issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [marginal]
# gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
# gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown]
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A
* tag 'mem-2024-09-24' of https://github.com/davidhildenbrand/qemu:
hostmem: Apply merge property after the memory region is initialized
virtio-mem: Add support for suspend+wake-up with plugged memory
virtio-mem: Use new Resettable framework instead of LegacyReset
reset: Add RESET_TYPE_WAKEUP
reset: Use ResetType for qemu_devices_reset() and MachineClass::reset()
virtio: kconfig: memory devices are PCI only
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
LegacyReset does not pass ResetType to the reset callback method, which
the new Resettable framework uses. Due to this, virtio-mem cannot use
the new RESET_TYPE_WAKEUP to skip the reset during wake-up from a
suspended state.
This patch adds overrides Resettable interface methods in VirtIOMEMClass
to use the new Resettable framework and replaces
qemu_[un]register_reset() calls with qemu_[un]register_resettable().
Message-ID: <20240904103722.946194-4-jmarcin@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
|
|
The 'GPL-2.0+' license identifier has been deprecated since license
list version 2.0rc2 [1] and replaced by the 'GPL-2.0-or-later' [2]
tag.
[1] https://spdx.org/licenses/GPL-2.0+.html
[2] https://spdx.org/licenses/GPL-2.0-or-later.html
Mechanical patch running:
$ sed -i -e s/GPL-2.0+/GPL-2.0-or-later/ \
$(git grep -lP 'SPDX-License-Identifier: \W+GPL-2.0\+[ $]' \
| egrep -v '^linux-headers|^include/standard-headers')
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
|
|
This allows the vhost_net device which has multiple virtqueues to batch
the setup of all its host notifiers. This significantly reduces the
vhost_net device starting and stoping time, e.g. the time spend
on enabling notifiers reduce from 630ms to 75ms and the time spend on
disabling notifiers reduce from 441ms to 45ms for a VM with 192 vCPUs
and 15 vhost-user-net devices (64vq per device) in our case.
Signed-off-by: zuoboqun <zuoboqun@baidu.com>
Message-Id: <20240816070835.8309-1-zuoboqun@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Historically, .get_vhost() was probably only called when
vdev->vhost_started is true. However, we now decidedly want to call it
also when vhost_started is false, specifically so we can issue a reset
to the vhost back-end while device operation is stopped.
Some .get_vhost() implementations dereference some pointers (or return
offsets from them) that are probably guaranteed to be non-NULL when
vhost_started is true, but not necessarily otherwise. This patch makes
all such implementations check all such pointers, returning NULL if any
is NULL.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20240723163941.48775-2-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Patch 06b12970174 ("virtio-net: fix network stall under load")
added double-check to test whether the available buffer size
can satisfy the request or not, in case the guest has added
some buffers to the avail ring simultaneously after the first
check. It will be lucky if the available buffer size becomes
okay after the double-check, then the host can send the packet
to the guest. If the buffer size still can't satisfy the request,
even if the guest has added some buffers, viritio-net would
stall at the host side forever.
The patch enables notification and checks whether the guest has
added some buffers since last check of available buffers when
the available buffers are insufficient. If no buffer is added,
return false, else recheck the available buffers in the loop.
If the available buffers are sufficient, disable notification
and return true.
Changes:
1. Change the return type of virtqueue_get_avail_bytes() from void
to int, it returns an opaque that represents the shadow_avail_idx
of the virtqueue on success, else -1 on error.
2. Add a new API: virtio_queue_enable_notification_and_check(),
it takes an opaque as input arg which is returned from
virtqueue_get_avail_bytes(). It enables notification firstly,
then checks whether the guest has added some buffers since
last check of available buffers or not by virtio_queue_poll(),
return ture if yes.
The patch also reverts patch "06b12970174".
The case below can reproduce the stall.
Guest 0
+--------+
| iperf |
---------------> | server |
Host | +--------+
+--------+ | ...
| iperf |----
| client |---- Guest n
+--------+ | +--------+
| | iperf |
---------------> | server |
+--------+
Boot many guests from qemu with virtio network:
qemu ... -netdev tap,id=net_x \
-device virtio-net-pci-non-transitional,\
iommu_platform=on,mac=xx:xx:xx:xx:xx:xx,netdev=net_x
Each guest acts as iperf server with commands below:
iperf3 -s -D -i 10 -p 8001
iperf3 -s -D -i 10 -p 8002
The host as iperf client:
iperf3 -c guest_IP -p 8001 -i 30 -w 256k -P 20 -t 40000
iperf3 -c guest_IP -p 8002 -i 30 -w 256k -P 20 -t 40000
After some time, the host loses connection to the guest,
the guest can send packet to the host, but can't receive
packet from the host.
It's more likely to happen if SWIOTLB is enabled in the guest,
allocating and freeing bounce buffer takes some CPU ticks,
copying from/to bounce buffer takes more CPU ticks, compared
with that there is no bounce buffer in the guest.
Once the rate of producing packets from the host approximates
the rate of receiveing packets in the guest, the guest would
loop in NAPI.
receive packets ---
| |
v |
free buf virtnet_poll
| |
v |
add buf to avail ring ---
|
| need kick the host?
| NAPI continues
v
receive packets ---
| |
v |
free buf virtnet_poll
| |
v |
add buf to avail ring ---
|
v
... ...
On the other hand, the host fetches free buf from avail
ring, if the buf in the avail ring is not enough, the
host notifies the guest the event by writing the avail
idx read from avail ring to the event idx of used ring,
then the host goes to sleep, waiting for the kick signal
from the guest.
Once the guest finds the host is waiting for kick singal
(in virtqueue_kick_prepare_split()), it kicks the host.
The host may stall forever at the sequences below:
Host Guest
------------ -----------
fetch buf, send packet receive packet ---
... ... |
fetch buf, send packet add buf |
... add buf virtnet_poll
buf not enough avail idx-> add buf |
read avail idx add buf |
add buf ---
receive packet ---
write event idx ... |
wait for kick add buf virtnet_poll
... |
---
no more packet, exit NAPI
In the first loop of NAPI above, indicated in the range of
virtnet_poll above, the host is sending packets while the
guest is receiving packets and adding buffers.
step 1: The buf is not enough, for example, a big packet
needs 5 buf, but the available buf count is 3.
The host read current avail idx.
step 2: The guest adds some buf, then checks whether the
host is waiting for kick signal, not at this time.
The used ring is not empty, the guest continues
the second loop of NAPI.
step 3: The host writes the avail idx read from avail
ring to used ring as event idx via
virtio_queue_set_notification(q->rx_vq, 1).
step 4: At the end of the second loop of NAPI, recheck
whether kick is needed, as the event idx in the
used ring written by the host is beyound the
range of kick condition, the guest will not
send kick signal to the host.
Fixes: 06b12970174 ("virtio-net: fix network stall under load")
Cc: qemu-stable@nongnu.org
Signed-off-by: Wencheng Yang <east.moutain.yang@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
|
|
This reverts commit 3f868ffb0bae0c4feafabe34a371cded57fe3806.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|