path: root/include
Age Commit message Author Files Lines
2022-02-04ignore writes to RO MSI-X registers (#642)Thanos Makatos1-8/+11
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-12-09allow DMA funcs to be called in quiesced state (#635)Thanos Makatos1-5/+36
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-11-30introduce device quiesce callback (#609)Thanos Makatos1-32/+62
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-10-29fix vfu_run_ctx() docs (#616)John Levon1-1/+0
We were incorrectly claiming we'd return EAGAIN, but now we'd return 0. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-10-05make migration state callback optionally asynchronous (#608)Thanos Makatos1-0/+22
Some devices need the migration state callback to be asynchronous. The simplest way to implement this is to require the callback to return -1 and set errno to EBUSY, to stop processing any new messages in the meantime (vfu_run_ctx() returns -1 and sets errno to EBUSY), and to give the user a way to complete the migration (vfu_migr_done). Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
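The return convention described in this commit message can be sketched as follows. This is a minimal illustration, not libvfio-user code: `struct example_dev` and `example_migr_state_cb()` are hypothetical names, and the real callback signature may differ. Only the "-1 plus errno set to EBUSY" convention is taken from the message.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state; not part of libvfio-user. */
struct example_dev {
    bool flush_in_progress;   /* device is still draining outstanding I/O */
};

/*
 * Hypothetical migration state callback.
 * Returns 0 if the transition completed synchronously, or -1 with errno
 * set to EBUSY if the device will complete it later (at which point the
 * user is expected to signal completion, e.g. via vfu_migr_done).
 */
int example_migr_state_cb(struct example_dev *dev)
{
    if (dev->flush_in_progress) {
        errno = EBUSY;
        return -1;
    }
    return 0;
}
```

A caller would treat -1 with errno equal to EBUSY as "pending" rather than as a failure.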
2021-09-27clarify LIBVFIO_USER_FLAG_ATTACH_NB behavior (#603)John Levon1-0/+4
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-09-08initial ioeventfd support (#601)JAKelly102-0/+61
Provide initial support for handling VFIO_USER_DEVICE_GET_REGION_IO_FDS, along with a new vfu_create_ioeventfd() API. Reviewed-by: John Levon <john.levon@nutanix.com>
2021-08-27Add support for VFIO_DMA_UNMAP_FLAG_ALL flag (#600)Swapnil Ingle1-0/+3
* Add support for VFIO_DMA_UNMAP_FLAG_ALL flag Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-08-18improve API docs a little bit (#587)John Levon1-2/+9
Clarify a couple of minor things in the API documentation and README. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-08-17fix dma_{map,unmap}_sg() array handling (#586)John Levon1-9/+9
Multiple places in dma_map_sg() and dma_unmap_sg() were dereferencing sg[0] instead of the correct index. Take the opportunity to improve the doc comments at the same time. Reported-by: Changpeng Liu <changpeng.liu@intel.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
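The class of bug described here (reading element 0 on every iteration instead of element i) can be illustrated with a small sketch. `struct example_sg` and `example_sg_total_len()` are hypothetical and do not reflect the real dma_sg_t layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scatter-gather entry; the real dma_sg_t differs. */
struct example_sg {
    uint64_t addr;
    uint64_t len;
};

/* Sums the length of every entry in the array. */
uint64_t example_sg_total_len(const struct example_sg *sg, size_t cnt)
{
    uint64_t total = 0;

    for (size_t i = 0; i < cnt; i++) {
        /* Correct: index with i. The buggy form would read sg[0].len. */
        total += sg[i].len;
    }
    return total;
}
```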
2021-07-14check for valid vfu_setup_region() flags (#579)John Levon1-5/+8
Complain about a region that isn't readable *or* writable, or any unknown flags. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
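A flag check of this kind might look like the sketch below. The `EX_FLAG_*` constants and `example_check_region_flags()` are hypothetical stand-ins, and the choice of EINVAL is an assumption; the commit message does not say which error is reported.

```c
#include <errno.h>

/* Hypothetical flag values, for illustration only. */
#define EX_FLAG_READ   (1u << 0)
#define EX_FLAG_WRITE  (1u << 1)
#define EX_FLAG_MEM    (1u << 2)
#define EX_FLAG_MASK   (EX_FLAG_READ | EX_FLAG_WRITE | EX_FLAG_MEM)

/* Returns 0 if flags are acceptable, -1 with errno set otherwise. */
int example_check_region_flags(unsigned int flags)
{
    /* Reject any bit outside the known set. */
    if ((flags & ~EX_FLAG_MASK) != 0) {
        errno = EINVAL;   /* assumed error code */
        return -1;
    }
    /* A region must be readable or writable (or both). */
    if ((flags & (EX_FLAG_READ | EX_FLAG_WRITE)) == 0) {
        errno = EINVAL;   /* assumed error code */
        return -1;
    }
    return 0;
}
```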
2021-07-13add VFU_REGION_FLAG_ALWAYS_CB to receive callback always (#583)Jag Raman1-0/+3
2021-07-12basic write support for PXLC, PXSC, PXRS, and PXSC2 (#575)Thanos Makatos1-4/+18
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-06-30return process request count in vfu_run_ctx() (#574)John Levon1-1/+2
Consumers such as SPDK would like to know if any actual work was done. Modify the API to support this. Also, clean up some stale mocking we no longer use. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-06-18superficially handle Device Control 2 and Link Control 2 (#568)Thanos Makatos1-4/+16
* superficially handle Device Control 2 and Link Control 2 Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-06-09clear dirty pages bitmap after getting dirty pages but keep mapped segments dirty (#551)Thanos Makatos1-12/+19
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-06-09drop mappable flag from DMA map (#553)Thanos Makatos2-3/+1
The flags field belongs to VFIO and it's not a good idea to reuse as new VFIO flags can break things. Instead, we derive whether or not a region is mappable if a file descriptor is passed. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-06-02replace max_msg_size with max_data_xfer_size (#541)John Levon1-0/+2
The previously specified max_msg_size had one major issue: it implied a (way too small) limit on the size of dirty bitmaps that could be requested by a client, and as a result a hard limit on memory region size. It seemed awkward to attempt to split up an unmap request instead. Instead, let most requests and replies be limited by their "natural" limits; for example, the number of booleans in VFIO_USER_SET_IRQS is limited by MSI-X count. For the requests that solicit or provide data - that is, VFIO_USER_DMA_READ/WRITE and VFIO_USER_REGION_READ/WRITE - we negotiate a new max_data_xfer_size value. These are much easier to split up into separate requests at the client side so should not present an implementation problem. For our server, chunking is implemented in vfu_dma_read/vfu_dma_write(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
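Splitting a data transfer into pieces no larger than a negotiated maximum can be sketched as below. This is not the vfu_dma_read()/vfu_dma_write() implementation: `example_chunked_xfer()`, `example_xfer_fn`, and `example_count_chunks()` are hypothetical names, and `max_xfer` stands in for the negotiated max_data_xfer_size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-chunk transfer function; returns 0 on success. */
typedef int (*example_xfer_fn)(uint64_t addr, size_t len, void *arg);

/*
 * Issues one xfer() call per chunk of at most max_xfer bytes.
 * Returns 0 on success, -1 if max_xfer is 0 or a chunk fails.
 */
int example_chunked_xfer(uint64_t addr, size_t len, size_t max_xfer,
                         example_xfer_fn xfer, void *arg)
{
    if (max_xfer == 0) {
        return -1;
    }
    while (len > 0) {
        size_t chunk = len < max_xfer ? len : max_xfer;

        if (xfer(addr, chunk, arg) != 0) {
            return -1;
        }
        addr += chunk;
        len -= chunk;
    }
    return 0;
}

/* Example callback: counts how many chunks were issued. */
int example_count_chunks(uint64_t addr, size_t len, void *arg)
{
    (void)addr;
    (void)len;
    (*(size_t *)arg)++;
    return 0;
}
```

For instance, a 10-byte transfer with a 4-byte limit is issued as chunks of 4, 4, and 2 bytes.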
2021-06-02clean up migration register definitions (#550)John Levon2-67/+49
We should explicitly define the expected migration register contents for API users who aren't using the callbacks. Clean up some related lint. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-06-01limit max DMA region size (#545)John Levon1-2/+0
Since the dirty bitmap in message replies is allocated based upon the maximum size of an individual region, add a limit (somewhat arbitrarily 8TiB, which is a bitmap size of 256MiB). Add a couple of basic tests on the two DMA limits. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
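The arithmetic behind the 8TiB / 256MiB figures can be checked with a short sketch. It assumes a 4KiB page size and one bitmap bit per page, neither of which the commit message states; the `EX_*` constants and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_MAX_DMA_REGION_SIZE (8ULL << 40)   /* 8 TiB, from the commit message */
#define EX_PAGE_SIZE           4096ULL        /* assumed page size */

/* Bytes of bitmap needed at one bit per page (size assumed page-aligned). */
uint64_t example_bitmap_bytes(uint64_t region_size)
{
    return region_size / EX_PAGE_SIZE / 8;
}

/* True if a region of this size is within the limit. */
bool example_region_size_ok(uint64_t region_size)
{
    return region_size <= EX_MAX_DMA_REGION_SIZE;
}
```

Under these assumptions, 8 TiB is 2^43 bytes, or 2^31 pages, which needs 2^28 bytes (256 MiB) of bitmap.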
2021-06-01fixes for VFIO_USER_DIRTY_PAGES (#537)John Levon1-10/+12
- we should only accept one range, not multiple ones
- clearly define and implement argsz behaviour
- we need to check if migration is configured
- add proper test coverage; move existing testing to python
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-28restore argsz for DMA map/unmap (#523)Thanos Makatos1-8/+19
use DMA map/unmap format similar to VFIO's Using a DMA map/unmap format similar to VFIO's (vfio_iommu_type1_dma_map / vfio_iommu_type1_dma_unmap) makes it easier to adapt to future changes. Consequently we also honor the passed argsz. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-27Fix struct pxcap (#534)Swapnil Ingle1-3/+4
* Added missing reserved bits and renamed per to rer, following the naming in the NVMe spec * Add pxcap capability in lspci test Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-27Cleanup and fix structs padding (#532)Swapnil Ingle4-91/+91
With bitfields, the compiler may size the struct according to the declared data type and add extra padding. Instead, use the data type closest in size to the total width of the bitfields. This patch also uses the uint* types consistently throughout. Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
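The padding effect can be seen in a small sketch. Both structs are hypothetical, not taken from the library headers. Bitfields of type uint8_t are implementation-defined in ISO C; the sizes noted here are what GCC and Clang typically produce on common Linux ABIs and should be treated as an assumption elsewhere.

```c
#include <stdint.h>

/* 8 bits of payload declared on a 32-bit type: typically occupies 4 bytes. */
struct example_wide {
    unsigned int a:4;
    unsigned int b:4;
};

/* The same 8 bits declared on an 8-bit type: typically occupies 1 byte. */
struct example_narrow {
    uint8_t a:4;
    uint8_t b:4;
};

/* A compile-time size check guards a register layout against regressions. */
_Static_assert(sizeof(struct example_narrow) == 1,
               "example_narrow must be exactly one byte");
```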
2021-05-26support VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (#521)Thanos Makatos1-0/+3
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-26don't support multiple DMA regions per map/unmap (#520)Thanos Makatos1-12/+13
We're dropping this behavior from the spec. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-25more spec updates (#491)John Levon1-7/+7
update spec to v0.9.1 Changes include:
- reply message includes the command number
- split out message definitions into request/reply sections, and skip the repeated standard header definitions
- lots of markup fixes
- re-organization for clarity
- further documentation of argsz
- remove VFIO_USER_VM_INTERRUPT until we have a working implementation
- dirty page tracking is optional
- fix implementations to match the spec
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-25Handle support of PCI FLR capability (#517)Swapnil Ingle1-1/+7
* Handle support of PCI FLR capability If device supports FLR cap then call vfu_reset_cb_t when FLR is initiated by client. Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-24fix region offset handling (#485)John Levon1-11/+2
The specification states that the region offset given in the region info should be used as the "offset" when mmap()ing the region from the client side. However, the library instead implemented a fixed offset scheme similar to that of vfio - and no clients actually set up the file like that. Instead, let servers define their own offsets, and pass them through to clients as is. It's up to the server to decide how its backing file or files is organized. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-20migration: various dirty page tracking fixes (#457)Thanos Makatos1-12/+14
- document how to use a vfio-user device with libvirt
- document how to use SPDK's nvmf/vfio-user target with libvirt
- replace vfio_bitmap with vfio_user_bitmap and vfio_iommu_type1_dirty_bitmap_get with vfio_user_bitmap_range
- fix bug for calculating number of pages needed for dirty page bitmap
- align number of bytes for dirty page bitmap to QWORD
- add debug messages around dirty page tracking
- only support flags=0 when doing DMA unmap
- set device state to running after reset
- allow region read/write even if device is in stopped state
- allow transitioning from stopped/stop-and-copy state to running state
- fix unit tests
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
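The page-count and QWORD-alignment items in the list above can be sketched as follows. `example_dirty_bitmap_size()` is a hypothetical helper, not the library's function. It assumes one bit per page, a non-zero page size, and inputs small enough that the additions do not overflow.

```c
#include <stdint.h>

/*
 * Size in bytes of a dirty page bitmap for a region, rounded up to a
 * multiple of 8 bytes (one QWORD).
 */
uint64_t example_dirty_bitmap_size(uint64_t region_size, uint64_t page_size)
{
    /* Round up so that a partial trailing page still gets a bit. */
    uint64_t pages = (region_size + page_size - 1) / page_size;
    /* One bit per page, rounded up to whole bytes. */
    uint64_t bytes = (pages + 7) / 8;

    /* Align the byte count up to the next multiple of 8. */
    return (bytes + 7) & ~UINT64_C(7);
}
```

With 4KiB pages, 65 pages need 9 bytes of bits, which is padded to 16 bytes.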
2021-05-14Fix dma read write count (#497)Swapnil Ingle1-1/+1
* spec: Fixed DMA_READ/WRITE data count. The DMA region size is at most a uint64_t, so the DMA_READ/WRITE data count is now defined as uint64_t. * Fix vfu_dma_read/write() as per spec changes Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-14dma: Use correct len type (#479)Swapnil Ingle1-1/+1
* dma: Use correct len type. vfio_iommu_type1_dirty_bitmap_get.size is of type __u64, but dma_controller_dirty_page_get() receives it as int; it should be u64. Also added a unit test for overflow of the length passed to dma_controller_dirty_page_get(). Fixes: #477 Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-14Revert "Merge pull request #493 from swapnili/use-linux-vfio-header"John Levon1-0/+61
This reverts commit 250aedb026ba557fc4fae6ff301b3b1dfd953c7e, reversing changes made to 71f8b30557d3635336aec06c084188370ed5e248.
2021-05-11Use defines from linux-headers/linux/vfio.hSwapnil Ingle1-61/+0
Instead of keeping a local copy, use the defines from linux-headers/linux/vfio.h, as QEMU does. Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-05-10fix dma unregister callback during region removal (#464)John Levon1-2/+6
There are two issues with the unregister callback: - we were requiring the callback to be set when removing a region, but it's only required if a consumer wants to map regions - when we removed all regions (for example, on a reset), we weren't triggering the callback Signed-off-by: John Levon <john.levon@nutanix.com> swapnil code review add assert Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-05-04stop using struct vfio_device_info (#456)John Levon1-0/+11
This struct from vfio.h has grown larger in newer Linux versions; this breaks older clients, as now the server would require the larger size. Replace with our own definition. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-23correct PM capability definition (#452)John Levon1-11/+19
the static size assert for the PMCS register was checking the wrong struct; however, the struct was nonetheless 4 bytes long, due to uint bitfields. This accidentally meant the containing struct pmcap was the correct size (the alignment attribute makes no difference). After fixing struct pmcs, we'll include the additional two bytes defined in the PCI PM specification, Section 3.2. These are "optional", but as elsewhere, we'll require them when adding the capability. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-15vfu_ctx_create(): validate flags argument (#442)John Levon1-1/+1
In addition, return ENOTSUP for unknown device types, and add some unit tests. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-14migration: use ERROR_INT() (#432)John Levon1-1/+5
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-13drop use of __u* types (#438)John Levon1-6/+6
As we are now pure userspace, there is no need for us to use non-standard integer types. This leaves the copied defines from Linux's vfio.h alone, however. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-13dma: use ERROR_INT()John Levon1-3/+3
The first in a series excising the use of the "return -errno" idiom. This is a non-standard usage, and in userspace, we have "errno" for delivering side-band error values. As there have been multiple bugs from not using standard error return methods like -1+errno or NULL+errno, let's do that. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
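The "-1 plus errno" convention this commit moves towards can be sketched as below. `example_error_int()` is a hypothetical stand-in for an ERROR_INT()-style helper, not the library's actual macro definition, and `example_parse_len()` exists only to show a caller.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an ERROR_INT()-style helper. */
int example_error_int(int err)
{
    errno = err;
    return -1;
}

/*
 * Validates a length and stores it in *out.
 * Returns 0 on success, or -1 with errno set (never "-errno").
 */
int example_parse_len(long len, size_t *out)
{
    if (out == NULL || len < 0) {
        return example_error_int(EINVAL);
    }
    *out = (size_t)len;
    return 0;
}
```

Callers then check for -1 and read errno, the same way they would for a POSIX call.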
2021-04-07clean up newlines in logs (#423)John Levon1-1/+5
vfu_log() and err() should not take newlines. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-07mark vfu_log() with format attribute (#426)John Levon1-1/+2
Fix up all resulting fallout. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-07correct type for dma_sg_t::dma_addr (#425)John Levon1-1/+1
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-06call reset callback on losing client connection (#419)John Levon1-7/+27
Give API users an opportunity to clean up when a client disconnects from the vfio-user socket. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-06implement short read/write, EOF handling (#415)John Levon1-3/+7
Report any short reads to callers as ECONNRESET, which is the closest we can meaningfully get right now. This also fixes get_next_command(), which previously wasn't checking for short reads at all. When we fail to send or recv from the socket due to the client disappearing in some manner, call into vfu_reset_ctx() to clean up the connection fd, allowing a subsequent vfu_attach_ctx() to work. If we get 0 bytes from recv[msg](), this is reported by the transport as ENOMSG, and is a normal EOF condition. We can also get ECONNRESET: this can happen when we've written unacknowledged data to the socket, the client side socket is closed, and we try a subsequent read. Finally, we can get a short read or write. Our handling of these still has issues, but for now we'll presume this means the client has gone too. It may in fact be due to a client bug - if it failed to write enough data - but right now, we can't easily tell that. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
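The mapping from receive results to error codes described above can be sketched as a small classifier. `example_check_recv()` is hypothetical and only mirrors the rules stated in the commit message: zero bytes means EOF and is reported as ENOMSG, and a short read is reported as ECONNRESET.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/*
 * Interprets the return value of a recv()-style call that was expected
 * to deliver `expected` bytes. Returns 0 on a full read, -1 otherwise.
 */
int example_check_recv(ssize_t ret, size_t expected)
{
    if (ret < 0) {
        return -1;              /* errno was already set by the failed call */
    }
    if (ret == 0) {
        errno = ENOMSG;         /* normal EOF: the client went away */
        return -1;
    }
    if ((size_t)ret < expected) {
        errno = ECONNRESET;     /* short read: presume the client is gone */
        return -1;
    }
    return 0;
}
```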
2021-03-31rework DMA callbacks (#396)John Levon1-32/+86
This fixes a number of issues with how DMA is handled, based on some changes by Thanos Makatos:
- rename callbacks to register/unregister, as there is not necessarily any mapping
- provide the (large) page-aligned mapped start and size, the page size used, as well as the protection flags: some API users need these
- for convenience, provide the virtual address separately that corresponds to the mapped region
- we should only require a DMA controller to use vfu_addr_to_sg(), not an unregister callback
- the callbacks should return errno not -errno
- region removal was incorrectly updating the region array
- various other cleanups and clarifications
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-03-24_dma_addr_sg_split(): set errno when not found (#402)John Levon1-1/+1
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-03-09remove vfu_irq_message() (#389)John Levon1-16/+0
This sends a message to a vfio-user client to trigger an IRQ, instead of writing to an eventfd. However, this isn't necessary on the cases we care about, where eventfds *are* available. Furthermore, this isn't something an API user should need to know about: if we ever care, the better way to do this is to make vfu_irq_trigger() automatically use a message if an eventfd isn't available. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-25don't redefine migration defines (#373)Thanos Makatos1-5/+8
RHEL kernels have some of the migration work backported Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>