Age | Commit message (Collapse) | Author | Files | Lines |
|
We were accidentally calling VFIO_USER_DIRTY_PAGES twice.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Add a little more coverage of our validation, and correct a small typo.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Leon <john.levon@nutanix.com>
|
|
DMA regions not mapped by the server are not dirty tracked (the client must
track changes via handling VFIO_USER_DMA_WRITE), but we weren't correctly
enforcing this, which could segfault when ->dirty_bitmap was NULL.
Found via AFL++.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
We weren't checking for a too-large ->argsz for this command.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
There were two issues with unmap request validation when the dirty bitmap flag was set:
- we weren't checking ->argsz against the maximum transfer size, allowing a client
to trigger unbounded allocations
- we needed to check for overflow when calculating the requested message out size
Found via AFL++.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
AFL++ found this, though we already knew about it, so fix it by comparing
against a saturating addition. This was the only instance of client-controlled
potential overflow I noticed.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
As clients control ->client_max_fds, we should return an error, not assert, if
we can't represent a region's mmap_areas.
Found via AFL++.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
|
|
* Fix reply of VFIO_USER_DEVICE_GET_REGION_INFO
Set VFIO_REGION_INFO_FLAG_CAPS flag only if caps are part of the reply.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
These extra options make tracking uninitilized values easier. They make
Valgrind run slower so we need to increase the timeouts in the CI.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Aside from general style goodness, this found a couple of accidental
re-definitions, so it's worth taking the pain now.
Also, only run rstlint as part of pre-push.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
|
|
Some devices need the migration state callback to be asynchronous. The simplest way to implement this is to require from the callback to return -1 and set errno to EBUSY, not process any other new messages (vfu_ctx_run returns -1 and sets errno to EBUSY), and provide a way to the user to complete migration (vfu_migr_done).
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
|
|
If a region is not set up, asking for its iofds should fail with EINVAL.
Co-authored-by: John Levon <john.levon@nutanix.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
An unmappable region should still allow io fds, as they are orthogonal.
Co-authored-by: John Levon <john.levon@nutanix.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Provide initial support for handling VFIO_USER_DEVICE_GET_REGION_IO_FDS, along with a new vfu_create_ioeventfd() API.
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
* Add support for VFIO_DMA_UNMAP_FLAG_ALL flag
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
* initial dma_unmap test
Signed-off-by: John Levon <john.levon@nutanix.com>
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
* Fix err path of handle_dma_unmap()
Set msg->out_size before successful return. Otherwise in case of error
reply path we may endup setting iovecs[1].iov_len with invalid
iovecs[1].iov_base in tran_sock_reply()
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
* pytests for vfu_dma_{map, unmap}_sg
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
* dma: cleanup dma_{map,unmap}_sg
Instead of using index to traverse sg and iovec, better to use it as
pointers. It's more readable and less prone from coding mistakes.
Also adding unit tests for the same.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Complain about a region that isn't readable *or* writable, or any unknown flags.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Verify that the return value from vfu_run_ctx() is what we're expecting, rather
than just driving on in case of error.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
|
|
Consumers such as SPDK would like to know if any actual work was done. Modify
the API to support this. Also, clean up some stale mocking we no longer use.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
* superficially handle Device Control 2 and Link Control 2
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
_dma_addr_sg_split() is supposed to return back sg's if the requested
dma addr spans across regions.
Also adding unit tests to cover these case.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Most tests need to send a request, process it, then retrieve the reply.
Add a utility function to avoid lots of tedious boilerplate.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
dirty (#551)
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
We were accidentally hitting the wrong error condition for one of these tests,
and hence not properly covering the intended failure path.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
The flags field belongs to VFIO and it's not a good idea to reuse as new
VFIO flags can break things. Instead, we derive whether or not a region
is mappable if a file descriptor is passed.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
The previously specified max_msg_size had one major issue: it implied a (way too
small) limit on the size of dirty bitmaps that could be requested by a client,
and as a result a hard limit on memory region size. It seemed awkward to attempt
to split up an unmap request instead.
Instead, let most requests and replies be limited by their "natural" limits; for
example, the number of booleans in VFIO_USER_SET_IRQS is limited by MSI-X count.
For the requests that solicit or provide data - that is, VFIO_USER_DMA_READ/WRITE
and VFIO_USER_REGION_READ/WRITE - we negotiate a new max_data_xfer_size value.
These are much easier to split up into separate requests at the client side
so should not present an implementation problem. For our server, chunking is
implemented in vfu_dma_read/vfu_dma_write().
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
We should explicitly define the expected migration register contents for API
users who aren't using the callbacks. Clean up some related lint.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Since the dirty bitmap in message replies is allocated based upon the maximum
size of an individual region, add a limit (somewhat arbitrarily 8TiB, which is a
bitmap size of 256MiB). Add a couple of basic tests on the two DMA limits.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
- we should only accept one range, not multiple ones
- clearly define and implement argsz behaviour
- we need to check if migration is configured
- add proper test coverage; move existing testing to python
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
use DMA map/unmap format similar to VFIO's
Using a DMA map/unmap format similar to VFIO's (vfio_iommu_type1_dma_map / vfio_iommu_type1_dma_unmap) makes it easier to adapt to future changes. Consequently we also honor the passed argsz.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanitx.com>
|
|
* Added missing reserved bits and renamed per to rer nameing as the
nvme specs
* Add pxcap capability in lspci test
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
We should require a non-empty payload for every command type except
VFIO_USER_DEVICE_RESET.
We should also reply to the caller with such failures.
Add some testing for is_valid_header(), and move the fd handling test over to it
too.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
We're dropping this behavior from the spec.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
As suggested by Thanos, replace hard-coded struct.pack() with ctypes.Structure
objects so it's much clearer what the data being constructed is.
Replace some hard-coded unpacking with code based on ctypes::from_buffer_copy(),
but in a form that allows us to "pop" a structure from a longer buffer.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
update spec to v0.9.1
Changes include:
- reply message includes the command number
- split out message definitions into request/reply sections, and
skip the repeated standard header definitions
- lots of markup fixes
- re-organization for clarity
- further documentation of argsz
- remove VFIO_USER_VM_INTERRUPT until we have a working implementation
- dirty page tracking is optional
- fix implementations to match the spec
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
* Handle support of PCI FLR capability
If device supports FLR cap then call vfu_reset_cb_t when FLR is
initiated by client.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
The specification states that the region offset given in the region info should
be used as the "offset" when mmap()ing the region from the client side. However,
the library instead implemented a fixed offset scheme similar to that of vfio -
and no clients actually set up the file like that.
Instead, let servers define their own offsets, and pass them through to clients
as is. It's up to the server to decide how its backing file or files is
organized.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
- document how to use a vfio-user device with libvirt
- document how to use SPDK's nvmf/vfio-user target with libvirt
- replace vfio_bitmap with vfio_user_bitmap and vfio_iommu_type1_dirty_bitmap_get with vfio_user_bitmap_range
- fix bug for calculating number of pages needed for dirty page bitmap
- align number of bytes for dirty page bitmap to QWORD
- add debug messages around dirty page tracking
- only support flags=0 when doing DMA unmap
- set device state to running after reset
- allow region read/write even if device is in stopped state
- allow transitioning from stopped/stop-and-copy state to running state
- fix unit tests
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|