diff options
author | Peter Xu <peterx@redhat.com> | 2023-01-09 14:37:27 -0500 |
---|---|---|
committer | Michael S. Tsirkin <mst@redhat.com> | 2023-01-27 11:47:02 -0500 |
commit | 8a7c606016d283a1716290c657f6f45bc7c4d817 (patch) | |
tree | 824a507afcc15711d5dab06cb9b1bd65218e13f2 | |
parent | bad9c5a5166fd5e3a892b7b0477cf2f4bd3a959a (diff) | |
download | qemu-8a7c606016d283a1716290c657f6f45bc7c4d817.zip qemu-8a7c606016d283a1716290c657f6f45bc7c4d817.tar.gz qemu-8a7c606016d283a1716290c657f6f45bc7c4d817.tar.bz2 |
intel-iommu: Document iova_tree
It seems not super clear on when iova_tree is used, and why. Add a rich
comment above iova_tree to track why we needed the iova_tree, and when we
need it.
Also comment for the map/unmap messages, on how they're used and
implications (e.g. unmap can be larger than the mapped ranges).
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230109193727.1360190-1-peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-rw-r--r-- | include/exec/memory.h | 26 | ||||
-rw-r--r-- | include/hw/i386/intel_iommu.h | 38 |
2 files changed, 63 insertions, 1 deletions
diff --git a/include/exec/memory.h b/include/exec/memory.h index c37ffdb..2e602a2 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -129,6 +129,32 @@ struct IOMMUTLBEntry { /* * Bitmap for different IOMMUNotifier capabilities. Each notifier can * register with one or multiple IOMMU Notifier capability bit(s). + * + * Normally there're two use cases for the notifiers: + * + * (1) When the device needs accurate synchronizations of the vIOMMU page + * tables, it needs to register with both MAP|UNMAP notifies (which + * is defined as IOMMU_NOTIFIER_IOTLB_EVENTS below). + * + * Regarding to accurate synchronization, it's when the notified + * device maintains a shadow page table and must be notified on each + * guest MAP (page table entry creation) and UNMAP (invalidation) + * events (e.g. VFIO). Both notifications must be accurate so that + * the shadow page table is fully in sync with the guest view. + * + * (2) When the device doesn't need accurate synchronizations of the + * vIOMMU page tables, it needs to register only with UNMAP or + * DEVIOTLB_UNMAP notifies. + * + * It's when the device maintains a cache of IOMMU translations + * (IOTLB) and is able to fill that cache by requesting translations + * from the vIOMMU through a protocol similar to ATS (Address + * Translation Service). + * + * Note that in this mode the vIOMMU will not maintain a shadowed + * page table for the address space, and the UNMAP messages can cover + * more than the pages that used to get mapped. The IOMMU notifiee + * should be able to take care of over-sized invalidations. */ typedef enum { IOMMU_NOTIFIER_NONE = 0, diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 46d973e..89dcbc5 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -109,7 +109,43 @@ struct VTDAddressSpace { QLIST_ENTRY(VTDAddressSpace) next; /* Superset of notifier flags that this address space has */ IOMMUNotifierFlag notifier_flags; - IOVATree *iova_tree; /* Traces mapped IOVA ranges */ + /* + * @iova_tree traces mapped IOVA ranges. + * + * The tree is not needed if no MAP notifier is registered with current + * VTD address space, because all guest invalidate commands can be + * directly passed to the IOMMU UNMAP notifiers without any further + * reshuffling. + * + * The tree OTOH is required for MAP typed iommu notifiers for a few + * reasons. + * + * Firstly, there's no way to identify whether an PSI (Page Selective + * Invalidations) or DSI (Domain Selective Invalidations) event is an + * MAP or UNMAP event within the message itself. Without having prior + * knowledge of existing state vIOMMU doesn't know whether it should + * notify MAP or UNMAP for a PSI message it received when caching mode + * is enabled (for MAP notifiers). + * + * Secondly, PSI messages received from guest driver can be enlarged in + * range, covers but not limited to what the guest driver wanted to + * invalidate. When the range to invalidates gets bigger than the + * limit of a PSI message, it can even become a DSI which will + * invalidate the whole domain. If the vIOMMU directly notifies the + * registered device with the unmodified range, it may confuse the + * registered drivers (e.g. vfio-pci) on either: + * + * (1) Trying to map the same region more than once (for + * VFIO_IOMMU_MAP_DMA, -EEXIST will trigger), or, + * + * (2) Trying to UNMAP a range that is still partially mapped. + * + * That accuracy is not required for UNMAP-only notifiers, but it is a + * must-to-have for notifiers registered with MAP events, because the + * vIOMMU needs to make sure the shadow page table is always in sync + * with the guest IOMMU pgtables for a device. + */ + IOVATree *iova_tree; }; struct VTDIOTLBEntry { |