aboutsummaryrefslogtreecommitdiff
path: root/lib/libvfio-user.c
AgeCommit message (Collapse)AuthorFilesLines
2021-06-18fix print (#567)Thanos Makatos1-2/+2
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-06-09clear dirty pages bitmap after getting dirty pages but keep mapped segments ↵Thanos Makatos1-53/+57
dirty (#551) Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-06-09drop mappable flag from DMA map (#553)Thanos Makatos1-6/+7
The flags field belongs to VFIO and it's not a good idea to reuse as new VFIO flags can break things. Instead, we derive whether or not a region is mappable if a file descriptor is passed. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-06-02replace max_msg_size with max_data_xfer_size (#541)John Levon1-56/+89
The previously specified max_msg_size had one major issue: it implied a (way too small) limit on the size of dirty bitmaps that could be requested by a client, and as a result a hard limit on memory region size. It seemed awkward to attempt to split up an unmap request instead. Instead, let most requests and replies be limited by their "natural" limits; for example, the number of booleans in VFIO_USER_SET_IRQS is limited by MSI-X count. For the requests that solicit or provide data - that is, VFIO_USER_DMA_READ/WRITE and VFIO_USER_REGION_READ/WRITE - we negotiate a new max_data_xfer_size value. These are much easier to split up into separate requests at the client side so should not present an implementation problem. For our server, chunking is implemented in vfu_dma_read/vfu_dma_write(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-06-01limit max DMA region size (#545)John Levon1-4/+3
Since the dirty bitmap in message replies is allocated based upon the maximum size of an individual region, add a limit (somewhat arbitrarily 8TiB, which is a bitmap size of 256MiB). Add a couple of basic tests on the two DMA limits. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-06-01fixes for VFIO_USER_DIRTY_PAGES (#537)John Levon1-53/+71
- we should only accept one range, not multiple ones - clearly define and implement argsz behaviour - we need to check if migration is configured - add proper test coverage; move existing testing to python Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-28don't return non-zero caps for zero-sized regions (#543)Thanos Makatos1-3/+5
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-28hide non-ABI symbols (#538)John Levon1-19/+19
Default to hidden visibility to remove non-public symbols from API users (and improve performance a little). Every public function gets an EXPORT annotation. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-28restore argsz for DMA map/unmap (#523)Thanos Makatos1-39/+68
use DMA map/unmap format similar to VFIO's Using a DMA map/unmap format similar to VFIO's (vfio_iommu_type1_dma_map / vfio_iommu_type1_dma_unmap) makes it easier to adapt to future changes. Consequently we also honor the passed argsz. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanitx.com>
2021-05-26support VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (#521)Thanos Makatos1-5/+32
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-26improve request header handlingJohn Levon1-10/+14
We should require a non-empty payload for every command type except VFIO_USER_DEVICE_RESET. We should also reply to the caller with such failures. Add some testing for is_valid_header(), and move the fd handling test over to it too. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-26don't support multiple DMA regions per map/unmap (#520)Thanos Makatos1-67/+79
We're dropping this behavior from the spec. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-25more spec updates (#491)John Levon1-11/+11
update spec to v0.9.1 Changes include: - reply message includes the command number - split out message definitions into request/reply sections, and skip the repeated standard header definitions - lots of markup fixes - re-organization for clarity - further documentation of argsz - remove VFIO_USER_VM_INTERRUPT until we have a working implementation - dirty page tracking is optional - fix implementations to match the spec Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-24fix region offset handling (#485)John Levon1-16/+4
The specification states that the region offset given in the region info should be used as the "offset" when mmap()ing the region from the client side. However, the library instead implemented a fixed offset scheme similar to that of vfio - and no clients actually set up the file like that. Instead, let servers define their own offsets, and pass them through to clients as is. It's up to the server to decide how its backing file or files is organized. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-20migration: various dirty page tracking fixes (#457)Thanos Makatos1-21/+42
- document how to use a vfio-user device with libvirt - document how to use SPDK's nvmf/vfio-user target with libvirt - replace vfio_bitmap with vfio_user_bitmap and vfio_iommu_type1_dirty_bitmap_get with vfio_user_bitmap_range - fix bug for calculating number of pages needed for dirty page bitmap - align number of bytes for dirty page bitmap to QWORD - add debug messages around dirty page tracking - only support flags=0 when doing DMA unmap - set device state to running after reset - allow region read/write even if device is in stopped state - allow transitioning from stopped/stop-and-copy state to running state - fix unit tests Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-20python tests: add VFIO_USER_DEVICE_GET_REGION_INFO (#471)John Levon1-1/+1
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-15python tests: add VFIO_USER_DEVICE_GET_INFO (#454)John Levon1-1/+1
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-14Fix dma read write count (#497)Swapnil Ingle1-2/+2
* spec: Fixed DMA_READ/WRITE data count DMA region size is maxed to uint64_t. Updated DMA_READ/WRITE data count to be defined as uint64_t. * Fix vfu_dma_read/write() as per spec changes Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-11some specification updates (#465)John Levon1-2/+2
Make a few specification updates after review by Stefan Hajnoczi. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-10fix dma unregister callback during region removal (#464)John Levon1-4/+5
There are two issues with the unregister callback: - we were requiring the callback to be set when removing a region, but it's only required if a consumer wants to map regions - when we removed all regions (for example, on a reset), we weren't triggering the callback Signed-off-by: John Levon <john.levon@nutanix.com> swapnil code review add assert Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-05-07don't close outgoing message fds (#472)John Levon1-6/+0
We made a silly mistake in free_msg(): the file descriptors we send out in message replies should never be closed. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-05-04stop using struct vfio_device_info (#456)John Levon1-2/+2
This struct from vfio.h has grown larger in newer Linux versions; this breaks older clients, as now the server would require the larger size. Replace with our own definition. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-04refactor message handling path (#376)John Levon1-347/+314
Capture message handling inside a new vfu_msg_t private structure and pass that around to the handlers. This provides no functional change, but greatly simplifies and cleans up that path, especially around fd and iovec handling. As part of fixing up the unit tests, start using global variables to reduce the amount of boiler-plate. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-23handle_region_access(): fix error-path log message (#451)John Levon1-2/+2
This was an error handling message that was missed when converting from -errno to -1 return style. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-04-19vfu_realize_ctx(): fix default PCI config space region (#445)John Levon1-7/+5
Fix check for an un-configured PCI config space region (the previous method was not accounting for the initialized ->fd). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-15remove vfu_get_region_info() (#444)John Levon1-7/+0
This is only used internally, and not really useful. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-04-15vfu_ctx_create(): validate flags argument (#442)John Levon1-2/+4
In addition, return ENOTSUP for unknown device types, and add some unit tests. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-14libvfio-user.c: use ERROR_INT() (#433)John Levon1-105/+99
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-14hard-code migration region index (#441)John Levon1-17/+10
Now we are confident we are OK with a hard-coded VFU_PCI_DEV_MIGR_REGION_IDX value, there's no need for us to track .migr_reg any more, either in the client or internally. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-14migration: use ERROR_INT() (#432)John Levon1-0/+3
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-13tran_sock: use ERROR_INT() (#431)John Levon1-35/+42
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-13pci: use ERROR_INT() (#430)John Levon1-0/+3
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-13irq.c: use ERROR_INT() (#429)John Levon1-0/+4
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-13dma: use ERROR_INT()John Levon1-6/+13
The first in a series excising the use of the "return -errno" idiom. This is a non-standard usage, and in userspace, we have "errno" for delivering side-band error values. As there have been multiple bugs from not using standard error return methods like -1+errno or NULL+errno, let's do that. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-07mark vfu_log() with format attribute (#426)John Levon1-7/+7
Fix up all resulting fallout. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-06call reset callback on losing client connection (#419)John Levon1-3/+5
Give API users an opportunity to clean up when a client disconnects from the vfio-user socket. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-06vfu_reset_ctx(): tear down DMA and IRQs (#418)John Levon1-0/+8
When we lose the client connection, the IRQ and DMA region state is no longer valid; clean them up. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-06implement short read/write, EOF handling (#415)John Levon1-21/+70
Report any short reads to callers as ECONNRESET, which is the closest we can meaningfully get right now. This also fixes get_next_command(), which previously wasn't checking for short reads at all. When we fail to send or recv from the socket due to the client disappearing in some manner, call into vfu_reset_ctx() to clean up the connection fd, allowing a subsequent vfu_attach_ctx() to work. If we get 0 bytes from recv[msg](), this is reported by the transport as ENOMSG, and is a normal EOF condition. We can also get ECONNRESET: this can happen when we've written unacknowledged data to the socket, the client side socket is closed, and we try a subsequent read. Finally, we can get a short read or write. Our handling of these still has issues, but for now we'll presume this means the client has gone too. It may in fact be due to a client bug - if it failed to write enough data - but right now, we can't easily tell that. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-03-31rework DMA callbacks (#396)John Levon1-54/+53
This fixes a number of issues with how DMA is handled, based on some changes by Thanos Makatos: - rename callbacks to register/unregister, as there is not necessarily any mapping - provide the (large) page-aligned mapped start and size, the page size used, as well as the protection flags: some API users need these - for convenience, provide the virtual address separately that corresponds to the mapped region - we should only require a DMA controller to use vfu_addr_to_sg(), not an unregister callback - the callbacks should return errno not -errno - region removal was incorrectly updating the region array - various other cleanups and clarifications Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-03-25re-work unit test mocking (#400)John Levon1-24/+13
Instead of trying to use the linker's --wrap, which just led to more problems when we want to call the real function, we'll add two defines, MOCK_DEFINE() and MOCK_DECLARE(), that behave differently when building the unit tests, such that all wrapped functions are picked up from test/mocks.c instead, regardless of compilation unit. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-03-23globally define _GNU_SOURCE (#401)John Levon1-1/+0
This avoids any issues with multiple definitions when passing CFLAGS in. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-03-10fix IRQ disable path (#386)John Levon1-6/+3
Properly fix IRQ disabling: Allow count == 0 to mean "disable all IRQS of the given type". On our side, disabling an IRQ means forgetting about the eventfd that was previously passed over the socket. Allow individual IRQs to be disabled, by means of a VFIO_IRQ_SET_DATA_EVENTFD message with no file descriptors passed. In vfio, this is done via setting "-1" in the fd slots; which isn't possible via auxiliary data. Thus, only one IRQ can be disabled a a time in vfio-user. Clean up "->type": this is never set, so wasn't having any effect. Follow up changes will likely re-introduce this in some form. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-03-01don't call user's unmap_dma callback when removing DMA region (#370)Thanos Makatos1-8/+5
Plus unit tests. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reported-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-02-18use sizeof() consistently (#351)John Levon1-20/+20
The most common way we have written this is as "sizeof()"; use this form consistently. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-18unit test exec_command and friends w.r.t. migration device state (#346)Thanos Makatos1-17/+36
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-02-17add unit tests for handle dirty pages w/o DMA (#348)Thanos Makatos1-1/+3
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-02-16exec_command: free out structs on failure (#345)John Levon1-0/+4
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-16fix DEVICE_GET_INFO specification and handling (#344)John Levon1-11/+16
The specification for DEVICE_GET_INFO differed from the implementation. After some discussion, fix the spec such that the struct should be passed in with ->argsz set. As it happened, the implementation was also wrong: we weren't actually checking the incoming ->argsz for validation, but we should. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-15add vfu_get_poll_fd() (#322)John Levon1-9/+18
Library users can use this to sleep on either a newly-attached socket client, or a new message. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-15make file descriptors private to the transport (#321)John Levon1-3/+1
General code has no business knowing about the socket file descriptors. vfu_attach_ctx() is changed to not return the file descriptor; we'll re-expose a suitable file descriptor in a follow-up Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>