Age | Commit message (Collapse) | Author | Files | Lines |
|
* Support 64 bits and prefetchable BARs
Add two new flags for lib user to request 64bits and/or prefetchable
BARs.
Tested with a vfio-user client patched QEMU.
Signed-off-by: Jérémy Fanguède <jfanguede@kalrayinc.com>
|
|
Our previous fuzzing attempts missed this incorrect range check, but
SPDK's fuzzing did catch it. Make the check using a saturating add so
that we account for overflow.
Fixes issue #790.
Reported-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
|
|
When performing DMA via VFIO-user commands over the socket,
vfu_dma_transfer breaks large requests into chunks according to the
client's maximum data transfer size negotiated at connection setup time.
This change fixes the calculation of the chunk size for the case where
the last chunk is less than the maximum transfer size.
Unfortunately, the existing test didn't catch this due to the request
size being a multiple of that maximum data transfer size. Adjust the
test to make the last chunk size a true remainder.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
|
|
clang-tidy static analysis identified a zero-sized allocation in the
case that no ioregionfds had been configured. Fix this issue and add a
test for it.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Add migration support to the test setup, and complete some additional
testing for the migration JSON capability.
Signed-off-by: John Levon <john.levon@nutanix.com>
|
|
This commit adapts the vfio-user protocol specification and the libvfio-user
implementation to v2 of the VFIO live migration interface, as used in the kernel
and QEMU.
The differences between v1 and v2 are discussed in this email thread [1], and we
slightly differ from upstream VFIO v2 in that instead of transferring data over
a new FD, we use the existing UNIX socket with new commands
VFIO_USER_MIG_DATA_READ/WRITE. We also don't yet use P2P states.
The updated spec was submitted to qemu-devel [2].
[1] https://lore.kernel.org/all/20220130160826.32449-9-yishaih@nvidia.com/
[2] https://lore.kernel.org/all/20230718094150.110183-1-william.henderson@nutanix.com/
Signed-off-by: William Henderson <william.henderson@nutanix.com>
|
|
Use separate socket for server->client commands
This change adds support for a separate socket to carry commands in the
server-to-client direction. It has proven problematic to send commands
in both directions over a single socket, since matching replies to
commands can become non-trivial when both sides send commands at the same
time and adds significant complexity. See issue #279 for details.
To set up the reverse communication channel, the client indicates
support for it via a new capability flag in the version message. The
server will then create a fresh pair of sockets and pass one end to the
client in its version reply. When the server wishes to send commands to
the client at a later point, it now uses its end of the new socket pair
rather than the main socket. Corresponding replies are also passed back
over the new socket pair.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
|
|
Thus far, the client end of the socket is the only piece of client state
tracked in tests, for which a global `socket` variable has been used. In
preparation to add more state, replace the `socket` global with a
`client` global object that groups all client state.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Thus far, the python test code has only ever sent messages of type
commands to the server and processed the corresponding replies. For the
twin-socket feature, the tests will exercise flows where DMA access
commands must be received, processed, and replied to by the client.
This change refactors the message handling python test code to provide
functions to handle server-to-client commands, reusing existing code as
appropriate.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
It turns out that the bit field will not yield the desired / specified
bit layout on big-endian systems, see issue #768 for details. Thus,
replace the bit field with constants for the individual fields and use
bit masking when accessing the flags field.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
The newer flake8 version in the arch linux job of the pull request
workflow fails due to:
E721 do not compare types, for exact checks use `is` / `is not`, for instance checks use `isinstance()`
Both `__eq__` functions now use `is not` instead of `!=` for the type
initial check.
Signed-off-by: Sandro-Alessio Gierens <sandro@gierens.de>
|
|
Signed-off-by: Florian Freudiger <25648113+FlorianFreudiger@users.noreply.github.com>
|
|
This adds the expected output to the lspci test I get on my Arch with kernel
version 6.1.44-lts and pciutils version 3.10.0.
Signed-off-by: Sandro-Alessio Gierens <sandro@gierens.de>
|
|
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
When handle_device_set_irqs set err irq/req irq, fd will be filled
in vfu_ctx->irqs->efds[] rather than vfu_ctx->irqs->err_efd or
vfu_ctx->irqs->req_efd. This patch adds irq index judgment before
filling in fd to make sure fd is filled in the correct place.
Signed-off-by: Miao Li <miao.li@intel.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
test_device_is_stopped_and_copying points the global
vfu_ctx structure to a local stack-allocated data
structure. This is fine while the function is
executing, but newer gcc complains that the
pointer is left there after it returns.
So clear the pointer to NULL before returning.
Fixes issue #734.
Reported-by: Kamil Godzwon <kamilx.godzwon@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
|
|
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
A reset callback is allowed to call functions disallowed in quiescent
state. However, the FLR reset path neglected to account for this
properly, causing an incorrect assert to be triggered if, for example,
vfu_sgl_put() is called. To fix this, make sure all reset paths go
through call_reset_cb().
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
`egrep` has been deprecated in GNU grep since 2007,
and since 3.8 it emits obsolescence warnings:
https://git.savannah.gnu.org/cgit/grep.git/commit/?id=a9515624709865d480e3142fd959bccd1c9372d1
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
|
|
vfu_pci_init() sets the size of the PCI config space but not the flags;
vfu_realize_ctx() won't initialize the flags since the size if already
set. vfu_pci_init() must initialize flags as well.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Use misspell-fixer if available, and correct the small number of errors
it found. Rather than trying to install into the CI, run it directly from a
github action.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
This is out of spec.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reported-by: Eduardo Lima <eblima@gmail.com>
|
|
fixes #660
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
|
|
This test is flaky: there is some kind of race that causes the test to
hang. Now we are run as part of qemu CI, we need to disable this by
default, until we can find time to fix the test.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
|
|
When an ioeventfd is written to, KVM discards the value since it has no
memory to write it to, and simply kicks the eventfd. This a problem for
devices such a NVMe controllers that need the value (e.g. doorbells on
BAR0). This patch allows the vfio-user server to pass a file descriptor
that can be mmap'ed and KVM can write the ioeventfd value to this
_shadow_ memory instead of discarding it. This shadow memory is not
exposed to the guest.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Change-Id: Iad849c94076ffa5988e034c8bf7ec312d01f095f
|
|
There is a typo in the arguments for vfu_dev_irq_state_cb_t - fix it in
this patch.
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Client masks or unmasks a device IRQ using the
VFIO_USER_DEVICE_SET_IRQS message. Inform the device of such changes to
the IRQ state.
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Use atomic operations to allow concurrent bitmap updates with
VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP operations.
Dirtying clients can race against each other, so we must use atomic or
when marking dirty: we do this byte-by-byte.
When reading the dirty bitmap, we must be careful to not race and lose
any set bits within the same byte. If we miss an update, we'll catch it
the next time around, presuming that before the final pass we'll have
quiesced all I/O.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Harmonize and rename the vfu_*sg() APIs to better reflect their functionality:
in our case, there is no mapping happening as part of these calls, they are
merely housekeeping for range splitting, dirty tracking, and so on.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
->maps existed so that if a consumer does vfu_map_sg() and then we are asked to
enable dirty page tracking, we won't mark those pages as dirty, and will hence
potentially lose data.
Now that we require quiesce and the use of either vfu_unmap_sg() or
vfu_sg_mark_dirty(), there's no need to have this list any more.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
The reference count is unused, and not atomically handled, remove it.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Move SG dirtying to vfu_unmap_sg(): as we don't want to track SGs
ourselves, doing this in vfu_map_sg() is no longer the right place.
Note that the lack of tracking implies that any SGs must be unmapped
before the final stop and copy phase. To avoid the need for this, add
vfu_mark_sg_dirty(): this allows a consumer to mark a region as dirty
explicitly without needing to unmap it. Currently it's the same as
vfu_unmap_sg(), but that's an implementation detail.
Note this still marks current maps after a get operation; that will
change subsequently.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
If we require a quiesce for these calls, we can be sure that it will not race
with any usage of vfu_*_sg() calls, as a first step towards concurrency.
This is not ideal for VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP, which can
potentially be called multiple times during pre-copy phase, but that's something
we can fix later.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Rename VFIO_DEVICE_STATE_XXXX defines as VFIO_DEVICE_STATE_V1_XXXX.
Upstream renamed these variable to be of the XXXX_V1_XXXX format and
switched an enum for VFIO_DEVICE_STATE_XXXX.
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
test-lspci.sh: some test platforms don't include the lspci command, as
such skip this test if lspci is not found
test-linkage.sh: specify the source and build root paths of the subproject
instead of the root paths of the master project
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Yet another static analyzer pass, this one is used by SPDK, and as it
did detect some minor issues, it's worth running.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
Now that Meson is functional, support for building with CMake is
removed so that there is only one build system to maintain.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
|
|
The Meson build system used by many other virt projects (QEMU, libvirt
and others) is easier to understand & maintain rules for than cmake,
guiding towards best practice.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
|
|
The test currently hardwires a location based on where cmake
creates binaries. Pass in an explicit location via LIBVFIO_SO_DIR
env variable, to override this hardwired default.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
|
|
Rather than assuming the location of the client and server binaries,
allowing passing in explicit paths.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
|
|
Rather than assuming the location of the lspci binary, allowing
passing in an explicit path.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
|
|
Add a cheesy test for identifying functions in the public header that
are not exported.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
The dma_sg_size() method is listed in libvfio-user.h but the symbol
is marked private in the ELF library.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
To support fuzzing with AFL++, add a "pipe" transport that reads from stdin and
outputs to stdout: this is the most convenient way of doing fuzzing.
Add some docs on how to run a fuzzing session.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
This make it tidier and easier to pass to function the buffer and
length, instead of passing the whole msg.
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
|
|
Catch valgrind issues earlier with less noise.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
|
|
We explicitly identify the quiesce EBUSY case for msg(), letting us simplify the
handling of expected errno.
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
|