path: root/lib/private.h
Age  Commit message  Author  Files  Lines
2022-11-22  allow shadow memory offset per shadow ioeventfd (#703)  Thanos Makatos  1  -1/+2
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-10-04  fix compilation for i386 and ppc64 (#709)  Thanos Makatos  1  -0/+4
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reported-by: Eduardo Lima <eblima@gmail.com>
2022-07-04  support for shadow ioeventfd (#698)  Thanos Makatos  1  -0/+1
When an ioeventfd is written to, KVM discards the value since it has no memory to write it to, and simply kicks the eventfd. This is a problem for devices such as NVMe controllers that need the value (e.g. doorbells on BAR0). This patch allows the vfio-user server to pass a file descriptor that can be mmap'ed, so that KVM can write the ioeventfd value to this _shadow_ memory instead of discarding it. This shadow memory is not exposed to the guest. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Change-Id: Iad849c94076ffa5988e034c8bf7ec312d01f095f
2022-06-07  irq: inform device of IRQ mask & unmask via callback (#694)  Jag Raman  1  -0/+1
Client masks or unmasks a device IRQ using the VFIO_USER_DEVICE_SET_IRQS message. Inform the device of such changes to the IRQ state. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-04-21  support AFL++ fuzzing (#623)  John Levon  1  -25/+1
To support fuzzing with AFL++, add a "pipe" transport that reads from stdin and outputs to stdout: this is the most convenient way of doing fuzzing. Add some docs on how to run a fuzzing session. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-04-19  use struct iovec for grouping buffer and length (#658)  Thanos Makatos  1  -14/+8
This makes it tidier and easier to pass the buffer and length to a function, instead of passing the whole msg. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-03-07  check for allowed operations in quiesce state (#647)  Thanos Makatos  1  -0/+11
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-03-02  improve region access debugging (#653)  John Levon  1  -3/+0
Many region accesses of interest are of normal register sizes; sniff the region access size, and report the read/written value if possible. Clean up dump_buffer() now, as it's not of much use. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
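The size-sniffing idea can be sketched as follows. This is a minimal illustration, not the library's code: the helper name `sniff_value` is hypothetical, and the value is decoded in host byte order, which is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: if the access has a normal register size
 * (1, 2, 4 or 8 bytes), decode it into *val so it can be logged.
 * Returns false for any other size, in which case nothing is reported.
 */
static bool
sniff_value(const void *buf, size_t count, uint64_t *val)
{
    uint8_t v8;
    uint16_t v16;
    uint32_t v32;

    assert(buf != NULL && val != NULL);

    switch (count) {
    case 1:
        memcpy(&v8, buf, sizeof(v8));
        *val = v8;
        return true;
    case 2:
        memcpy(&v16, buf, sizeof(v16));
        *val = v16;
        return true;
    case 4:
        memcpy(&v32, buf, sizeof(v32));
        *val = v32;
        return true;
    case 8:
        memcpy(val, buf, sizeof(*val));
        return true;
    default:
        return false; /* not a register-sized access */
    }
}
```

A caller would log the decoded value when the function returns true and fall back to logging only the offset and length otherwise.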
2021-12-01  refactor process_request() (#633)  John Levon  1  -3/+0
Instead of process_request() having a dual role, split into get_request() and handle_request(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-11-30  introduce device quiesce callback (#609)  Thanos Makatos  1  -4/+27
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-10-05  make migration state callback optionally asynchronous (#608)  Thanos Makatos  1  -0/+2
Some devices need the migration state callback to be asynchronous. The simplest way to implement this is to require the callback to return -1 and set errno to EBUSY, to not process any other new messages (vfu_ctx_run returns -1 and sets errno to EBUSY), and to provide a way for the user to complete migration (vfu_migr_done). Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-09-08  initial ioeventfd support (#601)  JAKelly10  1  -0/+12
Provide initial support for handling VFIO_USER_DEVICE_GET_REGION_IO_FDS, along with a new vfu_create_ioeventfd() API. Reviewed-by: John Levon <john.levon@nutanix.com>
2021-06-30  return process request count in vfu_run_ctx() (#574)  John Levon  1  -2/+0
Consumers such as SPDK would like to know if any actual work was done. Modify the API to support this. Also, clean up some stale mocking we no longer use. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-06-02  replace max_msg_size with max_data_xfer_size (#541)  John Levon  1  -1/+10
The previously specified max_msg_size had one major issue: it implied a (way too small) limit on the size of dirty bitmaps that could be requested by a client, and as a result a hard limit on memory region size. It seemed awkward to attempt to split up an unmap request instead. Instead, let most requests and replies be limited by their "natural" limits; for example, the number of booleans in VFIO_USER_SET_IRQS is limited by MSI-X count. For the requests that solicit or provide data - that is, VFIO_USER_DMA_READ/WRITE and VFIO_USER_REGION_READ/WRITE - we negotiate a new max_data_xfer_size value. These are much easier to split up into separate requests at the client side so should not present an implementation problem. For our server, chunking is implemented in vfu_dma_read/vfu_dma_write(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
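The chunking described above can be sketched roughly as below. This is an illustration under stated assumptions, not the code in vfu_dma_read/vfu_dma_write(): `chunked_xfer`, `xfer_fn_t` and `count_bytes` are hypothetical names, and the callback stands in for sending one request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: transfers one chunk; returns 0 on success, -1 on error. */
typedef int (xfer_fn_t)(size_t offset, size_t len, void *priv);

/*
 * Split a transfer of `total` bytes into chunks no larger than `max_xfer`
 * (the negotiated max_data_xfer_size). Returns the number of chunks issued,
 * or -1 as soon as one chunk fails.
 */
static int
chunked_xfer(size_t total, size_t max_xfer, xfer_fn_t *fn, void *priv)
{
    size_t done = 0;
    int chunks = 0;

    assert(max_xfer > 0 && fn != NULL);

    while (done < total) {
        size_t len = total - done;

        if (len > max_xfer) {
            len = max_xfer;
        }
        if (fn(done, len, priv) != 0) {
            return -1;
        }
        done += len;
        chunks++;
    }
    return chunks;
}

/* Example callback: just accumulate the number of bytes seen. */
static int
count_bytes(size_t offset, size_t len, void *priv)
{
    (void)offset;
    *(size_t *)priv += len;
    return 0;
}
```

With a limit of 4096 bytes, a 10000-byte transfer is issued as chunks of 4096, 4096 and 1808 bytes.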
2021-06-01  limit max DMA region size (#545)  John Levon  1  -1/+9
Since the dirty bitmap in message replies is allocated based upon the maximum size of an individual region, add a limit (somewhat arbitrarily 8TiB, which is a bitmap size of 256MiB). Add a couple of basic tests on the two DMA limits. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
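The relationship between the 8TiB limit and the 256MiB bitmap can be checked with a short sketch. It assumes one dirty bit per 4KiB page (the page size is not stated in the commit message), and `dirty_bitmap_bytes` is a hypothetical helper rather than a library function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes needed for a dirty bitmap that holds
 * one bit per page of a region of the given size.
 */
static uint64_t
dirty_bitmap_bytes(uint64_t region_size, uint64_t page_size)
{
    uint64_t pages;

    assert(page_size != 0);

    pages = (region_size + page_size - 1) / page_size; /* round up to whole pages */
    return (pages + 7) / 8;                            /* eight pages per byte */
}
```

For an 8TiB region and 4KiB pages this gives 2^31 pages, hence 2^28 bytes, which is 256MiB.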
2021-06-01  fixes for VFIO_USER_DIRTY_PAGES (#537)  John Levon  1  -6/+0
- we should only accept one range, not multiple ones
- clearly define and implement argsz behaviour
- we need to check if migration is configured
- add proper test coverage; move existing testing to python
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-28  restore argsz for DMA map/unmap (#523)  Thanos Makatos  1  -2/+2
use DMA map/unmap format similar to VFIO's
Using a DMA map/unmap format similar to VFIO's (vfio_iommu_type1_dma_map / vfio_iommu_type1_dma_unmap) makes it easier to adapt to future changes. Consequently we also honor the passed argsz. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-26  improve request header handling  John Levon  1  -7/+6
We should require a non-empty payload for every command type except VFIO_USER_DEVICE_RESET. We should also reply to the caller with such failures. Add some testing for is_valid_header(), and move the fd handling test over to it too. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-26  don't support multiple DMA regions per map/unmap (#520)  Thanos Makatos  1  -1/+6
We're dropping this behavior from the spec. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-24  fix region offset handling (#485)  John Levon  1  -0/+2
The specification states that the region offset given in the region info should be used as the "offset" when mmap()ing the region from the client side. However, the library instead implemented a fixed offset scheme similar to that of vfio, and no clients actually set up the file like that. Instead, let servers define their own offsets, and pass them through to clients as is. It's up to the server to decide how its backing file or files are organized. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-20  migration: various dirty page tracking fixes (#457)  Thanos Makatos  1  -0/+7
- document how to use a vfio-user device with libvirt
- document how to use SPDK's nvmf/vfio-user target with libvirt
- replace vfio_bitmap with vfio_user_bitmap and vfio_iommu_type1_dirty_bitmap_get with vfio_user_bitmap_range
- fix bug for calculating number of pages needed for dirty page bitmap
- align number of bytes for dirty page bitmap to QWORD
- add debug messages around dirty page tracking
- only support flags=0 when doing DMA unmap
- set device state to running after reset
- allow region read/write even if device is in stopped state
- allow transitioning from stopped/stop-and-copy state to running state
- fix unit tests
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-05-15  python tests: add VFIO_USER_DEVICE_GET_INFO (#454)  John Levon  1  -3/+0
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-05-04  refactor message handling path (#376)  John Levon  1  -58/+54
Capture message handling inside a new vfu_msg_t private structure and pass that around to the handlers. This provides no functional change, but greatly simplifies and cleans up that path, especially around fd and iovec handling. As part of fixing up the unit tests, start using global variables to reduce the amount of boiler-plate. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-15  remove vfu_get_region_info() (#444)  John Levon  1  -3/+0
This is only used internally, and not really useful. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-04-14  libvfio-user.c: use ERROR_INT() (#433)  John Levon  1  -1/+1
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-14  hard-code migration region index (#441)  John Levon  1  -1/+0
Now we are confident we are OK with a hard-coded VFU_PCI_DEV_MIGR_REGION_IDX value, there's no need for us to track .migr_reg any more, either in the client or internally. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-04-13  dma: use ERROR_INT()  John Levon  1  -2/+6
The first in a series excising the use of the "return -errno" idiom. This is a non-standard usage, and in userspace, we have "errno" for delivering side-band error values. As there have been multiple bugs from not using standard error return methods like -1+errno or NULL+errno, let's do that. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
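The difference between the two conventions can be illustrated with a small sketch. The helpers `error_int`, `error_ptr` and `check_len` are hypothetical stand-ins written for this example; they are not claimed to match the library's ERROR_INT() definition.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper for the "-1 + errno" idiom. */
static int
error_int(int err)
{
    assert(err > 0); /* errno values are positive */
    errno = err;
    return -1;
}

/* Hypothetical helper for the "NULL + errno" idiom. */
static void *
error_ptr(int err)
{
    assert(err > 0);
    errno = err;
    return NULL;
}

/*
 * Hypothetical example function: instead of "return -EINVAL", it
 * returns -1 and reports the reason through errno.
 */
static int
check_len(size_t len, size_t max)
{
    if (len == 0 || len > max) {
        return error_int(EINVAL);
    }
    return 0;
}
```

Callers then test for -1 (or NULL) and read errno, exactly as they would for a standard libc call, so a negative return value is never mistaken for an error code.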
2021-03-31  rework DMA callbacks (#396)  John Levon  1  -9/+27
This fixes a number of issues with how DMA is handled, based on some changes by Thanos Makatos:
- rename callbacks to register/unregister, as there is not necessarily any mapping
- provide the (large) page-aligned mapped start and size, the page size used, as well as the protection flags: some API users need these
- for convenience, provide the virtual address separately that corresponds to the mapped region
- we should only require a DMA controller to use vfu_addr_to_sg(), not an unregister callback
- the callbacks should return errno not -errno
- region removal was incorrectly updating the region array
- various other cleanups and clarifications
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-03-25  re-work unit test mocking (#400)  John Levon  1  -12/+12
Instead of trying to use the linker's --wrap, which just led to more problems when we want to call the real function, we'll add two defines, MOCK_DEFINE() and MOCK_DECLARE(), that behave differently when building the unit tests, such that all wrapped functions are picked up from test/mocks.c instead, regardless of compilation unit. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-03-23  add -Wmissing-declarations (#399)  John Levon  1  -23/+18
This is used by SPDK, and it's generally useful. This also uncovered some issues in the test mocking. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-03-10  fix IRQ disable path (#386)  John Levon  1  -9/+5
Properly fix IRQ disabling: Allow count == 0 to mean "disable all IRQs of the given type". On our side, disabling an IRQ means forgetting about the eventfd that was previously passed over the socket. Allow individual IRQs to be disabled, by means of a VFIO_IRQ_SET_DATA_EVENTFD message with no file descriptors passed. In vfio, this is done via setting "-1" in the fd slots, which isn't possible via auxiliary data. Thus, only one IRQ can be disabled at a time in vfio-user. Clean up "->type": this is never set, so wasn't having any effect. Follow-up changes will likely re-introduce this in some form. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-18  unit test exec_command and friends w.r.t. migration device state (#346)  Thanos Makatos  1  -0/+6
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-02-16  fix DEVICE_GET_INFO specification and handling (#344)  John Levon  1  -1/+2
The specification for DEVICE_GET_INFO differed from the implementation. After some discussion, fix the spec such that the struct should be passed in with ->argsz set. As it happened, the implementation was also wrong: we weren't actually checking the incoming ->argsz for validation, but we should. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-15  add vfu_get_poll_fd() (#322)  John Levon  1  -0/+3
Library users can use this to sleep on either a newly-attached socket client, or a new message. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-15  make file descriptors private to the transport (#321)  John Levon  1  -2/+1
General code has no business knowing about the socket file descriptors. vfu_attach_ctx() is changed to not return the file descriptor; we'll re-expose a suitable file descriptor in a follow-up. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-11  move exec_command socket handling into the transport (#320)  John Levon  1  -0/+3
Also clean up some code surrounding this. In particular, don't play games with modifying the message header. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-11  tiny rename of vfu_ctx_t::trans -> tran (#315)  John Levon  1  -1/+1
This matches the tran_* namespace better. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-11  introduce transport reply() handler (#313)  John Levon  1  -0/+4
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-10  API error return converged to one func (#325)  swapnili  1  -1/+8
* API error return converged to one func
Use ERROR_INT() or ERROR_PTR() to return errors from APIs. This way, if we want to change the behaviour later, we will just need to update these functions. Also fixed some error return cases and comments. Reviewed-by: John Levon <john.levon@nutanix.com>
2021-02-09  introduce transport send_msg() handler (#314)  John Levon  1  -0/+6
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-02-04  close listening socket in vfu_destroy_ctx() (#299)  John Levon  1  -4/+7
We were forgetting to close vfu_ctx->fd; add a tran callback for this. While we're there, clean up the tran callbacks somewhat. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-01-20  support extended capabilities (#226)  John Levon  1  -0/+2
Provide initial support for extended capabilities, and implement handlers for the Device Serial Number and Vendor-Specific capabilities. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-01-20  add whole-region mmap area for vfu_setup_region() (#225)  John Levon  1  -6/+2
If an fd is provided, automatically add a mmap area covering the entire region unless an mmap_areas array is provided. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
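The defaulting rule can be sketched as below. This is only a simplified model: `struct region`, `MAX_AREAS` and `setup_mmap_areas` are hypothetical, and representing each area as a `struct iovec` whose `iov_base` holds the offset within the region is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

#define MAX_AREAS 4 /* arbitrary limit for this sketch */

/* Hypothetical, simplified region description. */
struct region {
    size_t size;
    int fd; /* -1 if the region is not mappable */
    struct iovec mmap_areas[MAX_AREAS];
    size_t nr_mmap_areas;
};

/*
 * If the region has an fd and the caller supplies no mmap areas, add a
 * single area covering the whole region; otherwise copy the caller's areas.
 */
static void
setup_mmap_areas(struct region *reg, const struct iovec *areas, size_t nr)
{
    assert(reg != NULL && nr <= MAX_AREAS);

    reg->nr_mmap_areas = 0;
    if (reg->fd == -1) {
        return; /* nothing can be mapped */
    }
    if (areas == NULL || nr == 0) {
        reg->mmap_areas[0].iov_base = NULL;      /* offset 0 in the region */
        reg->mmap_areas[0].iov_len = reg->size;  /* covers the entire region */
        reg->nr_mmap_areas = 1;
        return;
    }
    memcpy(reg->mmap_areas, areas, nr * sizeof(*areas));
    reg->nr_mmap_areas = nr;
}
```

The point of the default is that callers with a fully mappable region no longer need to build a one-element array by hand.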
2021-01-20  re-work API for adding capabilities (#200)  John Levon  1  -17/+9
Allow capabilities to be added individually, including extended capabilities, and those to be handled via the region callback. As a side effect, rework config space accesses to handle reads that straddle capabilities and non-standard areas and use callbacks as needed. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-01-07  re-work access handling (#220)  John Levon  1  -26/+3
Various cleanups and fixes to handling of region accesses, including:
- there should be no reason for us to split accesses into 1/2/4/8 byte accesses: in general, the client will already be doing that, and if not, there's no particular reason we should be the ones to split up such larger accesses.
- use a callback for PCI config space reads and writes if one is provided (needs more work for capabilities)
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-01-04  move PCI-specific code to pci.c (#219)  John Levon  1  -2/+21
It's still pretty entangled, but move the bulk of the non-cap PCI code over to pci.c. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2020-12-15  send file descriptors for sparse areas in get region info (#201)  Thanos Makatos  1  -7/+4
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
2020-12-14  return region capabilities a la VFIO (#187)  Thanos Makatos  1  -0/+3
This patch returns region capabilities the same way VFIO does: if argsz is not large enough then it returns only region info and sets argsz to what it should be in order to fit the capabilities, the client then retries with a large enough argsz. The protocol specification has been updated as well. Plus unit tests. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
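The argsz handshake can be modelled with a small sketch. Everything here is a simplified stand-in: `struct region_info`, `CAPS_SIZE` and `get_region_info` are hypothetical and do not reproduce the real VFIO or vfio-user structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified region info header. */
struct region_info {
    uint32_t argsz; /* in: size of caller's buffer; out: size needed */
    uint32_t flags;
    uint64_t size;
};

#define CAPS_SIZE 32u /* assumed size of the capability data */

/*
 * Server side: always write the base info with argsz set to the size
 * needed for info plus capabilities. Capabilities are appended only when
 * the caller's buffer is large enough. Returns the bytes written to buf.
 */
static uint32_t
get_region_info(void *buf, uint32_t argsz)
{
    struct region_info info = { .flags = 0, .size = 4096 };
    uint32_t needed = (uint32_t)sizeof(info) + CAPS_SIZE;
    uint32_t written = (uint32_t)sizeof(info);

    assert(buf != NULL && argsz >= sizeof(info));

    info.argsz = needed;
    memcpy(buf, &info, sizeof(info));
    if (argsz >= needed) {
        /* placeholder bytes standing in for real capability data */
        memset((char *)buf + sizeof(info), 0xab, CAPS_SIZE);
        written = needed;
    }
    return written;
}
```

A client first calls with a buffer that only fits the header, reads back the required argsz, and then retries with a buffer of that size to obtain the capabilities as well.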
2020-12-14  add unit test for device get info (#192)  Thanos Makatos  1  -0/+4
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
2020-12-08  Misc fixes for vfu_ctx_try_attach() and vfu_realize_ctx() (#175)  swapnili  1  -1/+1
Misc changes for vfu_ctx_try_attach()
* Rename to vfu_attach_ctx()
* Removed call to vfu_realize_ctx(), should be called separately
* Now vfu_attach_ctx() must also be called for blocking ctx.
Misc changes for vfu_realize_ctx()
* Made calling vfu_realize_ctx() mandatory
* vfu_ctx_drive() and vfu_poll_ctx() return EINVAL if the device is not realized.
* Renamed vfu_ctx->ready to vfu_ctx->realized
Added unit test for vfu_attach_ctx() and vfu_realize_ctx()
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>