Age | Commit message (Collapse) | Author | Files | Lines |
|
into staging
virtio,pci,pc: features, fixes
users can now control VM bit in smbios.
vhost-user-device is now user-createable.
intel_iommu now supports PRI
virtio-net now supports GSO over UDP tunnel
ghes now supports error injection
amd iommu now supports dma remapping for vfio
better error messages for virtio
small fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmji0s0PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpuH4H/09h70IqAWZGHIWKGmmGGtdKOj3g54KuI0Ss
# mGECEsHvvBexOy670Qy8jdgXfaW4UuNui8BiOnJnGsBX8Y0dy+/yZori3KhkXkaY
# D57Ap9agkpHem7Vw0zgNsAF2bzDdlzTiQ6ns5oDnSq8yt82onCb5WGkWTGkPs/jL
# Gf8Jv+Ddcpt5SU4/hHPYC8pUhl7z4xPOOyl0Qp1GG21Pxf5v4sGFcWuGGB7UEPSQ
# MjZeoM0rSnLDtNg18sGwD5RPLQs13TbtgsVwijI79c3w3rcSpPNhGR5OWkdRCIYF
# 8A0Nhq0Yfo0ogTht7yt1QNPf/ktJkuoBuGVirvpDaix2tCBECes=
# =Zvq/
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 05 Oct 2025 01:19:25 PM PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [unknown]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (75 commits)
virtio: improve virtqueue mapping error messages
pci: Fix wrong parameter passing to pci_device_get_iommu_bus_devfn()
intel_iommu: Simplify caching mode check with VFIO device
intel_iommu: Enable Enhanced Set Root Table Pointer Support (ESRTPS)
vdpa-dev: add get_vhost() callback for vhost-vdpa device
amd_iommu: HATDis/HATS=11 support
intel-iommu: Move dma_translation to x86-iommu
amd_iommu: Refactor amdvi_page_walk() to use common code for page walk
amd_iommu: Do not assume passthrough translation when DTE[TV]=0
amd_iommu: Toggle address translation mode on devtab entry invalidation
amd_iommu: Add dma-remap property to AMD vIOMMU device
amd_iommu: Set all address spaces to use passthrough mode on reset
amd_iommu: Toggle memory regions based on address translation mode
amd_iommu: Invalidate address translations on INVALIDATE_IOMMU_ALL
amd_iommu: Add replay callback
amd_iommu: Unmap all address spaces under the AMD IOMMU on reset
amd_iommu: Use iova_tree records to determine large page size on UNMAP
amd_iommu: Sync shadow page tables on page invalidation
amd_iommu: Add basic structure to support IOMMU notifier updates
amd_iommu: Add a page walker to sync shadow page tables on invalidation
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
The 2nd parameter of pci_device_get_iommu_bus_devfn() about root PCIBus
backed by an IOMMU for the PCI device, the 3rd is about aliased PCIBus
of the PCI device.
Meanwhile the 3rd and 4th parameters are optional, pass NULL if they
are not needed.
Reviewed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250929034206.439266-4-zhenzhong.duan@intel.com>
Fixes: a849ff5d6f ("pci: Add a pci-level initialization function for IOMMU notifiers")
Fixes: f0f37daf8e ("pci: Add a PCI-level API for PRI")
Fixes: e9b457500a ("pci: Add a pci-level API for ATS")
Fixes: 042cbc9aec ("pci: Add an API to get IOMMU's min page size and virtual address width")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Starting with commit cab1398a60eb, SR-IOV VFs are realized as soon as
pcie_sriov_pf_init() is called. Because pcie_sriov_pf_init() must be
called before pcie_sriov_pf_init_vf_bar(), the VF BARs types won't be
known when the VF realize function calls pcie_sriov_vf_register_bar().
This breaks the memory regions of the VFs (for instance with igbvf):
$ lspci
...
Region 0: Memory at 281a00000 (64-bit, prefetchable) [virtual] [size=16K]
Region 3: Memory at 281a20000 (64-bit, prefetchable) [virtual] [size=16K]
$ info mtree
...
address-space: pci_bridge_pci_mem
0000000000000000-ffffffffffffffff (prio 0, i/o): pci_bridge_pci
0000000081a00000-0000000081a03fff (prio 1, i/o): igbvf-mmio
0000000081a20000-0000000081a23fff (prio 1, i/o): igbvf-msix
and causes MMIO accesses to fail:
Invalid write at addr 0x281A01520, size 4, region '(null)', reason: rejected
Invalid read at addr 0x281A00C40, size 4, region '(null)', reason: rejected
To fix this, VF BARs are now registered with pci_register_bar() which
has a type parameter and pcie_sriov_vf_register_bar() is removed.
Fixes: cab1398a60eb ("pcie_sriov: Reuse SR-IOV VF device instances")
Signed-off-by: Damien Bergamini <damien.bergamini@eviden.com>
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20250901151314.1038020-1-clement.mathieu--drif@eviden.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This commit removes the redundant vmstate_save_state_with_err()
function.
Previously, commit 969298f9d7 introduced vmstate_save_state_with_err()
to handle error propagation, while vmstate_save_state() existed for
non-error scenarios.
This is because there were code paths where vmstate_save_state_v()
(called internally by vmstate_save_state) did not explicitly set
errors on failure.
This change unifies error handling by
- updating vmstate_save_state() to accept an Error **errp argument.
- vmstate_save_state_v() ensures errors are set directly within the errp
object, eliminating the need for two separate functions.
All calls to vmstate_save_state_with_err() are replaced with
vmstate_save_state(). This simplifies the API and improves code
maintainability.
vmstate_save_state() that only calls vmstate_save_state_v(),
by inference, also has errors set in errp in case of failure.
The errors are reported using error_report_err().
If we want the function to exit on error, then &error_fatal is
passed.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-24-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that vmstate_load_state() must report an error
in errp, in case of failure.
The errors are temporarily reported using error_report_err().
This is removed in the subsequent patches in this series,
when we are actually able to propagate the error to the calling
function using errp. Whereas, if we want the function to exit on
error, then error_fatal is passed.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-2-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
Currently, pci_setup_iommu() registers IOMMU ops for a given PCIBus.
However, when retrieving IOMMU ops for a device using
pci_device_get_iommu_bus_devfn(), the function checks the parent_dev
and fetches IOMMU ops from the parent device, even if the current
bus does not have any associated IOMMU ops.
This behavior works for now because QEMU's IOMMU implementations are
globally scoped, and host bridges rely on the bypass_iommu property
to skip IOMMU translation when needed.
However, this model will break with the soon to be introduced
arm-smmuv3 device, which allows users to associate the IOMMU
with a specific PCIe root complex (e.g., the default pcie.0
or a pxb-pcie root complex).
For example, consider the following setup with multiple root
complexes:
-device arm-smmuv3,primary-bus=pcie.0,id=smmuv3.0 \
...
-device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
-device pcie-root-port,id=pcie.port1,bus=pcie.1 \
-device virtio-net-pci,bus=pcie.port1
In Qemu, pxb-pcie acts as a special root complex whose parent is
effectively the default root complex(pcie.0). Hence, though pcie.1
has no associated SMMUv3 as per above, pci_device_get_iommu_bus_devfn()
will incorrectly return the IOMMU ops from pcie.0 due to the fallback
via parent_dev.
To fix this, introduce a new helper pci_setup_iommu_per_bus() that
explicitly sets the new iommu_per_bus field in the PCIBus structure.
This helper will be used in a subsequent patch that adds support for
the new arm-smmuv3 device.
Update pci_device_get_iommu_bus_devfn() to use iommu_per_bus when
determining the correct IOMMU ops, ensuring accurate behavior for
per-bus IOMMUs.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nathan Chen <nathanc@nvidia.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
Reviewed-by: Donald Dutile <ddutile@redhat.com>
Message-id: 20250829082543.7680-7-skolothumtho@nvidia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Since there is no pch_gbe emulation, we could be using func other
than 0 when adding new devices to specific boards.
Signed-off-by: Chao-ying Fu <cfu@mips.com>
Signed-off-by: Djordje Todorovic <djordje.todorovic@htecgroup.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250901102850.1172983-13-djordje.todorovic@htecgroup.com>
[PMD: Compare with null character ('\0'), not '0']
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
Do not reset a vfio-pci device during CPR.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749576403-25355-1-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
The vring call fd is set even when the guest does not use MSI-X (e.g., in the
case of virtio PMD), leading to unnecessary CPU overhead for processing
interrupts.
The commit 96a3d98d2c("vhost: don't set vring call if no vector") optimized the
case where MSI-X is enabled but the queue vector is unset. However, there's an
additional case where the guest uses INTx and the INTx_DISABLED bit in the PCI
config is set, meaning that no interrupt notifier will actually be used.
In such cases, the vring call fd should also be cleared to avoid redundant
interrupt handling.
Fixes: 96a3d98d2c("vhost: don't set vring call if no vector")
Reported-by: Zhiyuan Yuan <yuanzhiyuan@chinatelecom.cn>
Signed-off-by: Jidong Xia <xiajd@chinatelecom.cn>
Signed-off-by: Huaitong Han <hanht2@chinatelecom.cn>
Message-Id: <20250522100548.212740-1-hanht2@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
A device can send a PRI request to the IOMMU using pci_pri_request_page.
The PRI response is sent back using the notifier managed with
pci_pri_register_notifier and pci_pri_unregister_notifier.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Co-authored-by: Ethan Milon <ethan.milon@eviden.com>
Message-Id: <20250520071823.764266-12-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Devices implementing ATS can send translation requests using
pci_ats_request_translation. The invalidation events are sent
back to the device using the iommu notifier managed with
pci_iommu_register_iotlb_notifier / pci_iommu_unregister_iotlb_notifier.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Co-authored-by: Ethan Milon <ethan.milon@eviden.com>
Message-Id: <20250520071823.764266-11-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This is meant to be used by ATS-capable devices.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-10-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This kind of information is needed by devices implementing ATS in order
to initialize their translation cache.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-8-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
The cached is_master value is necessary to know if a device is
allowed to issue ATS/PRI requests or not as these operations do not go
through the master_enable memory region.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250520071823.764266-7-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
into staging
virtio,pci,pc: fixes, features
vhost-scsi now supports scsi hotplug
cxl gained a bag of new operations, motably media operations
virtio-net now supports SR-IOV emulation
pci-testdev now supports backing memory bar with host memory
amd iommu now supports migration
fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmgkg0UPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpcDIH+wbrq7DzG+BVOraYtmD69BQCzYszby1mAWry
# 2OUYuAx9Oh+DsAwbzwbBdh9+SmJoi1oJ/d8rzSK328hdDrpCaPmc7bcBdAWJ3YcB
# bGNPyJ+9eJLRXtlceGIhfAOMLIB0ugXGkHLQ61zlVCTg4Xwnj7/dQp2tAQ1BkTwW
# Azc7ujBoJOBF3WVpa1Pqw0t1m3K74bwanOlkIg/JUWXk27sgP2YMnyrcpOu9Iz1T
# VazgobyHo5y15V0wvd05w4Bk7cJSHwgW+y3DtgTtIffetIaAbSRgl3Pl5Ic1yKcX
# ofg9aDFN6m0S8tv4WgFc+rT3Xaa/aPue9awjD5sEEldRasWKKNo=
# =847R
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 14 May 2025 07:49:25 EDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (27 commits)
hw/i386/amd_iommu: Allow migration when explicitly create the AMDVI-PCI device
hw/i386/amd_iommu: Isolate AMDVI-PCI from amd-iommu device to allow full control over the PCI device creation
intel_iommu: Take locks when looking for and creating address spaces
intel_iommu: Use BQL_LOCK_GUARD to manage cleanup automatically
virtio: Move virtio_reset()
virtio: Call set_features during reset
vhost-scsi: support VIRTIO_SCSI_F_HOTPLUG
vhost-user: return failure if backend crash when live migration
vhost: return failure if stop virtqueue failed in vhost_dev_stop
system/runstate: add VM state change cb with return value
pci-testdev.c: Add membar-backed option for backing membar
pcie_sriov: Make a PCI device with user-created VF ARI-capable
docs: Document composable SR-IOV device
virtio-net: Implement SR-IOV VF
virtio-pci: Implement SR-IOV PF
pcie_sriov: Allow user to create SR-IOV device
pcie_sriov: Check PCI Express for SR-IOV PF
pcie_sriov: Ensure PF and VF are mutually exclusive
hw/pci: Fix SR-IOV VF number calculation
hw/pci: Do not add ROM BAR for SR-IOV VF
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
A user can create a SR-IOV device by specifying the PF with the
sriov-pf property of the VFs. The VFs must be added before the PF.
A user-creatable VF must have PCIDeviceClass::sriov_vf_user_creatable
set. Such a VF cannot refer to the PF because it is created before the
PF.
A PF that user-creatable VFs can be attached calls
pcie_sriov_pf_init_from_user_created_vfs() during realization and
pcie_sriov_pf_exit() when exiting.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250314-sriov-v9-5-57dae8ae3ab5@daynix.com>
Tested-by: Yui Washizu <yui.washidu@gmail.com>
Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
pci_config_get_bar_addr() had a division by vf_stride. vf_stride needs
to be non-zero when there are multiple VFs, but the specification does
not prohibit to make it zero when there is only one VF.
Do not perform the division for the first VF to avoid division by zero.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250314-sriov-v9-2-57dae8ae3ab5@daynix.com>
Tested-by: Yui Washizu <yui.washidu@gmail.com>
Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
A SR-IOV VF cannot have a ROM BAR.
Co-developed-by: Yui Washizu <yui.washidu@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250314-sriov-v9-1-57dae8ae3ab5@daynix.com>
Tested-by: Yui Washizu <yui.washidu@gmail.com>
Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Remove PCI_DPRINTF() macro and use trace events instead.
Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
|
|
Mechanical change using:
$ sed -i -E 's/\(InterfaceInfo.?\[/\(const InterfaceInfo\[/g' \
$(git grep -lE '\(InterfaceInfo.?\[\]\)')
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250424194905.82506-7-philmd@linaro.org>
|
|
Mechanical change using gsed, then style manually adapted
to pass checkpatch.pl script.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250424194905.82506-4-philmd@linaro.org>
|
|
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250424194905.82506-3-philmd@linaro.org>
|
|
vfio queue:
* Added property documentation
* Added Minor fixes
* Implemented basic PCI PM capability backing
* Promoted new IGD maintainer
* Deprecated vfio-plaform
* Extended VFIO migration with multifd support
# -----BEGIN PGP SIGNATURE-----
#
# iQIyBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmfJrZoACgkQUaNDx8/7
# 7KFE2A/0Dmief9u/dDJIKGIDa+iawcf4hu8iX4v5pB0DlGniT3rgK8WMGnhDpPxq
# Q4wsKfo+JJ2q6msInrT7Ckqyydu9nQztI3vwmfMuWxLhTMyH28K96ptwPqIZBjOx
# rPTEXfnVX4W3tpn1+48S+vefWVa/gkBkIvv7RpK18rMBXv1kDeyOvc/d2dbAt7ft
# zJc4f8gH3jfQzGwmnYVZU1yPrZN7p6zhYR/AD3RQOY97swgZIEyYxXhOuTPiCuEC
# zC+2AMKi9nmnCG6x/mnk7l2yJXSlv7lJdqcjYZhJ9EOIYfiUGTREYIgQbARcafE/
# 4KSg2QR35BoUd4YrmEWxXJCRf3XnyWXDY36dDKVhC0OHng1F/U44HuL4QxwoTIay
# s1SP/DHcvDiPAewVTvdgt7Iwfn9xGhcQO2pkrxBoNLB5JYwW+R6mG7WXeDv1o3GT
# QosTu1fXZezQqFd4v6+q5iRNS2KtBZLTspwAmVdywEFUs+ZLBRlC+bodYlinZw6B
# Yl/z0LfAEh4J55QmX2espbp8MH1+mALuW2H2tgSGSrTBX1nwxZFI5veFzPepgF2S
# eTx69BMjiNMwzIjq1T7e9NpDCceiW0fXDu7IK1MzYhqg1nM9lX9AidhFTeiF2DB2
# EPb3ljy/8fyxcPKa1T9X47hQaSjbMwofaO8Snoh0q0jokY246Q==
# =hIBw
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 06 Mar 2025 22:13:46 HKT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [full]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [full]
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20250306' of https://github.com/legoater/qemu: (42 commits)
hw/core/machine: Add compat for x-migration-multifd-transfer VFIO property
vfio/migration: Make x-migration-multifd-transfer VFIO property mutable
vfio/migration: Add x-migration-multifd-transfer VFIO property
vfio/migration: Multifd device state transfer support - send side
vfio/migration: Multifd device state transfer support - config loading support
migration/qemu-file: Define g_autoptr() cleanup function for QEMUFile
vfio/migration: Multifd device state transfer support - load thread
vfio/migration: Multifd device state transfer support - received buffers queuing
vfio/migration: Setup and cleanup multifd transfer in these general methods
vfio/migration: Multifd setup/cleanup functions and associated VFIOMultifd
vfio/migration: Multifd device state transfer - add support checking function
vfio/migration: Multifd device state transfer support - basic types
vfio/migration: Move migration channel flags to vfio-common.h header file
vfio/migration: Add vfio_add_bytes_transferred()
vfio/migration: Convert bytes_transferred counter to atomic
vfio/migration: Add load_device_config_state_start trace event
migration: Add save_live_complete_precopy_thread handler
migration/multifd: Add multifd_device_state_supported()
migration/multifd: Make MultiFDSendData a struct
migration/multifd: Device state transfer support - send side
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
PropertyInfo member @name becomes ObjectProperty member @type, while
Property member @name becomes ObjectProperty member @name. Rename the
former.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250227085601.4140852-4-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[One missed instance of @type fixed]
|
|
The memory and IO BARs for devices are only accessible in the D0 power
state. In other power states the PCI spec defines that the device
responds to TLPs and messages with an Unsupported Request response.
To approximate this behavior, consider the BARs as unmapped when the
device is not in the D0 power state. This makes the BARs inaccessible
and has the additional bonus for vfio-pci that we don't attempt to DMA
map BARs for devices in a non-D0 power state.
To support this, an interface is added for devices to register the PM
capability, which allows central tracking to enforce valid transitions
and unmap BARs in non-D0 states.
NB. We currently have device models (eepro100 and pcie_pci_bridge)
that register a PM capability but do not set wmask to enable writes to
the power state field. In order to maintain migration compatibility,
this new helper does not manage the wmask to enable guest writes to
initiate a power state change. The contents and write access of the
PM capability are still managed by the caller.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-2-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
|
|
Nothing should be doing this, but it doesn't get caught by
pci_register_bar(). Add an assertion to prevent misuse.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20250117172842.406338-3-npiggin@gmail.com>
Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
pcie_sriov doesn't have code to restore its state after migration, but
igb, which uses pcie_sriov, naively claimed its migration capability.
Add code to register VFs after migration and fix igb migration.
Fixes: 3a977deebe6b ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-11-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Disable SR-IOV VF devices by reusing code to power down PCI devices
instead of removing them when the guest requests to disable VFs. This
allows to realize devices and report VF realization errors at PF
realization time.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-8-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
The renamed state will not only represent powering state of PFs, but
also represent SR-IOV VF enablement in the future.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250109-reuse-v19-1-f541e82ca5f7@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
vfio_pci_size_rom() distinguishes whether rombar is explicitly set to 1
by checking dev->opts, bypassing the QOM property infrastructure.
Use -1 as the default value for rombar to tell if the user explicitly
set it to 1. The property is also converted from unsigned to signed.
-1 is signed so it is safe to give it a new meaning. The values in
[2 ^ 31, 2 ^ 32) become invalid, but nobody should have typed these
values by chance.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250104-reuse-v18-13-c349eafd8673@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
|
|
Accel & Exec patch queue
- Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander)
- Add '-d invalid_mem' logging option (Zoltan)
- Create QOM containers explicitly (Peter)
- Rename sysemu/ -> system/ (Philippe)
- Re-orderning of include/exec/ headers (Philippe)
Move a lot of declarations from these legacy mixed bag headers:
. "exec/cpu-all.h"
. "exec/cpu-common.h"
. "exec/cpu-defs.h"
. "exec/exec-all.h"
. "exec/translate-all"
to these more specific ones:
. "exec/page-protection.h"
. "exec/translation-block.h"
. "user/cpu_loop.h"
. "user/guest-host.h"
. "user/page-protection.h"
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t
# wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt
# KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K
# A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8
# 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe///
# 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r
# xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl
# VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay
# ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP
# 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd
# +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6
# x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo=
# =cjz8
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 20 Dec 2024 11:45:20 EST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits)
util/qemu-timer: fix indentation
meson: Do not define CONFIG_DEVICES on user emulation
system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header
system/numa: Remove unnecessary 'exec/cpu-common.h' header
hw/xen: Remove unnecessary 'exec/cpu-common.h' header
target/mips: Drop left-over comment about Jazz machine
target/mips: Remove tswap() calls in semihosting uhi_fstat_cb()
target/xtensa: Remove tswap() calls in semihosting simcall() helper
accel/tcg: Un-inline translator_is_same_page()
accel/tcg: Include missing 'exec/translation-block.h' header
accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h'
accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h'
qemu/coroutine: Include missing 'qemu/atomic.h' header
exec/translation-block: Include missing 'qemu/atomic.h' header
accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h'
exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined
target/sparc: Move sparc_restore_state_to_opc() to cpu.c
target/sparc: Uninline cpu_get_tb_cpu_state()
target/loongarch: Declare loongarch_cpu_dump_state() locally
user: Move various declarations out of 'exec/exec-all.h'
...
Conflicts:
hw/char/riscv_htif.c
hw/intc/riscv_aplic.c
target/s390x/cpu.c
Apply sysemu header path changes to not in the pull request.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.
Files renamed manually then mechanical change using sed tool.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
|
|
Now that all of the Property arrays are counted, we can remove
the terminator object from each array. Update the assertions
in device_class_set_props to match.
With struct Property being 88 bytes, this was a rather large
form of terminator. Saves 30k from qemu-system-aarch64.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://lore.kernel.org/r/20241218134251.4724-21-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
pci_bus_add_fw_cfg_extra_pci_roots() calls the fw_cfg
API with PCI bus specific arguments.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20241206181352.6836-5-philmd@linaro.org>
|
|
The FW_CFG_DATA_GENERATOR interface allows any object to
produce a blob of data consumable by the fw_cfg device.
Implement that for PCI bus objects.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20241213133352.10915-5-philmd@linaro.org>
|
|
>From what I read PCI has 32 transactions, PCI Express devices can handle
256 with Extended tag enabled (spec mentions also larger values but I
lack PCIe knowledge).
QEMU leaves 'Extended tag field' with 0 as value:
Capabilities: [e0] Express (v1) Root Complex Integrated Endpoint, IntMsgNum 0
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+ FLReset- TEE-IO-
SBSA ACS has test 824 which checks for PCIe device capabilities. BSA
specification [1] (SBSA is on top of BSA) in section F.3.2 lists
expected values for Device Capabilities Register:
Device Capabilities Register Requirement
Role based error reporting RCEC and RCiEP: Hardwired to 1
Endpoint L0s acceptable latency RCEC and RCiEP: Hardwired to 0
L1 acceptable latency RCEC and RCiEP: Hardwired to 0
Captured slot power limit scale RCEC and RCiEP: Hardwired to 0
Captured slot power limit value RCEC and RCiEP: Hardwired to 0
Max payload size value must be compliant with PCIe spec
Phantom functions RCEC and RCiEP: Recommendation is to
hardwire this bit to 0.
Extended tag field Hardwired to 1
1. https://developer.arm.com/documentation/den0094/c/
This change enables Extended tag field. All versioned platforms should
have it disabled for older versions (tested with Arm/virt).
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-Id: <20241023113820.486017-1-marcin.juszkiewicz@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Using a property allows us to hide the internal details of the PCI device
from the code to build a SRAT Generic Initiator Affinity Structure with
PCI Device Handle.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240916171017.1841767-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
When the devfn is already assigned in the command line, the
do_pci_register_device() may verify if the function 0 is already occupied.
However, when devfn < 0, the verification is skipped because it is part of
the last "else if".
For instance, suppose there is already a device at addr=00.00 of a port.
-device pcie-root-port,bus=pcie.0,chassis=115,id=port01,addr=0e.00 \
-device virtio-net-pci,bus=port01,id=vnet01,addr=00.00 \
When 'addr' is specified for the 2nd device, the hotplug is denied.
(qemu) device_add virtio-net-pci,bus=port01,id=vnet02,addr=01.00
Error: PCI: slot 0 function 0 already occupied by virtio-net-pci, new func virtio-net-pci cannot be exposed to guest.
When 'addr' is automatically assigned, the hotplug is not denied. This is
because the verification is skipped.
(qemu) device_add virtio-net-pci,bus=port01,id=vnet02
warning: PCI: slot 1 is not valid for virtio-net-pci, parent device only allows plugging into slot 0.
Fix the issue by moving the verification into an independent 'if'
statement.
Fixes: 3f1e1478db2d ("enable multi-function hot-add")
Reported-by: Aswin Unnikrishnan <aswin.u.unnikrishnan@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20240708041056.54504-1-dongli.zhang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
When DMA memory can't be directly accessed, as is the case when
running the device model in a separate process without shareable DMA
file descriptors, bounce buffering is used.
It is not uncommon for device models to request mapping of several DMA
regions at the same time. Examples include:
* net devices, e.g. when transmitting a packet that is split across
several TX descriptors (observed with igb)
* USB host controllers, when handling a packet with multiple data TRBs
(observed with xhci)
Previously, qemu only provided a single bounce buffer per AddressSpace
and would fail DMA map requests while the buffer was already in use. In
turn, this would cause DMA failures that ultimately manifest as hardware
errors from the guest perspective.
This change allocates DMA bounce buffers dynamically instead of
supporting only a single buffer. Thus, multiple DMA mappings work
correctly also when RAM can't be mmap()-ed.
The total bounce buffer allocation size is limited individually for each
AddressSpace. The default limit is 4096 bytes, matching the previous
maximum buffer size. A new x-max-bounce-buffer-size parameter is
provided to configure the limit for PCI devices.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240819135455.2957406-1-mnissler@rivosinc.com
Signed-off-by: Peter Xu <peterx@redhat.com>
|
|
This reverts commit 6a31b219a5338564f3978251c79f96f689e037da.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This reverts commit 139610ae67f6ecf92127bb7bf53ac6265b459ec8.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This reverts commit 107a64b9a360cf5ca046852bc03334f7a9f22aef.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This reverts commit ca6dd3aef8a103138c99788bcba8195d4905ddc5.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This reverts commit 122173a5830f7757f8a94a3b1559582f312e140b.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
A user can create a SR-IOV device by specifying the PF with the
sriov-pf property of the VFs. The VFs must be added before the PF.
A user-creatable VF must have PCIDeviceClass::sriov_vf_user_creatable
set. Such a VF cannot refer to the PF because it is created before the
PF.
A PF that user-creatable VFs can be attached calls
pcie_sriov_pf_init_from_user_created_vfs() during realization and
pcie_sriov_pf_exit() when exiting.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240715-sriov-v5-5-3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
pci_config_get_bar_addr() had a division by vf_stride. vf_stride needs
to be non-zero when there are multiple VFs, but the specification does
not prohibit to make it zero when there is only one VF.
Do not perform the division for the first VF to avoid division by zero.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240715-sriov-v5-2-3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
romsize is an uint32_t variable. Specifying -1 as an uint32_t value is
obscure way to denote UINT32_MAX.
Worse, if int is wider than 32-bit, it will change the behavior of a
construct like the following:
romsize = -1;
if (romsize != -1) {
...
}
When -1 is assigned to romsize, -1 will be implicitly casted into
uint32_t, resulting in UINT32_MAX. On contrary, when evaluating
romsize != -1, romsize will be casted into int, and it will be a
comparison of UINT32_MAX and -1, and result in false.
Replace -1 with UINT32_MAX for statements involving the variable to
clarify the intent and prevent potential breakage.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240627-reuse-v10-10-7ca0b8ed3d9f@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
pcie_sriov doesn't have code to restore its state after migration, but
igb, which uses pcie_sriov, naively claimed its migration capability.
Add code to register VFs after migration and fix igb migration.
Fixes: 3a977deebe6b ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240627-reuse-v10-9-7ca0b8ed3d9f@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Disable SR-IOV VF devices by reusing code to power down PCI devices
instead of removing them when the guest requests to disable VFs. This
allows to realize devices and report VF realization errors at PF
realization time.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240627-reuse-v10-6-7ca0b8ed3d9f@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|