path: root/lib
Age | Commit message | Author | Files | Lines
2024-03-29 | Support 64 bits and prefetchable BARs (#792) | jfgd | 1 | -0/+11
Add two new flags that let library users request 64-bit and/or prefetchable BARs. Tested with a QEMU patched to include a vfio-user client. Signed-off-by: Jérémy Fanguède <jfanguede@kalrayinc.com>
2024-03-21 | correct IRQ range check (#791) | John Levon | 2 | -3/+19
Our previous fuzzing attempts missed this incorrect range check, but SPDK's fuzzing did catch it. Make the check using a saturating add so that we account for overflow. Fixes issue #790. Reported-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Signed-off-by: John Levon <john.levon@nutanix.com>
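The saturating-add idea described above can be sketched as follows. This is a minimal illustration, not the library's actual code: the helper names `satadd_u32` and `irq_range_valid` and the 32-bit types are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: add two 32-bit values, clamping at UINT32_MAX
 * instead of wrapping around on overflow. */
static inline uint32_t
satadd_u32(uint32_t a, uint32_t b)
{
    uint64_t sum = (uint64_t)a + b;

    return sum > UINT32_MAX ? UINT32_MAX : (uint32_t)sum;
}

/* Hypothetical check: does [start, start + count) fit below max?
 * Assumes max < UINT32_MAX, so a saturated sum always fails the check. */
static inline bool
irq_range_valid(uint32_t start, uint32_t count, uint32_t max)
{
    return satadd_u32(start, count) <= max;
}
```

With a plain `start + count`, a request such as `start = 2, count = UINT32_MAX` wraps to 1 and would pass a `<= max` comparison; the saturated sum does not.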
2024-01-24 | Fix DMA message size calculation (#788) | Mattias Nissler | 1 | -4/+5
When performing DMA via VFIO-user commands over the socket, vfu_dma_transfer breaks large requests into chunks according to the client's maximum data transfer size negotiated at connection setup time. This change fixes the calculation of the chunk size for the case where the last chunk is less than the maximum transfer size. Unfortunately, the existing test didn't catch this due to the request size being a multiple of that maximum data transfer size. Adjust the test to make the last chunk size a true remainder. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
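As an illustration of the chunking described above, here is a small sketch in which the last chunk is a true remainder. The function names and the `max_xfer` parameter are hypothetical and do not reflect the internals of `vfu_dma_transfer`.

```c
#include <stddef.h>

/* Hypothetical helper: size of the next chunk, given the bytes still to
 * transfer and the negotiated maximum data transfer size. */
static inline size_t
next_chunk_size(size_t remaining, size_t max_xfer)
{
    return remaining < max_xfer ? remaining : max_xfer;
}

/* Walk a request of `total` bytes in chunks; returns the number of chunks
 * and stores the size of the final one in *last_chunk. */
static inline size_t
count_chunks(size_t total, size_t max_xfer, size_t *last_chunk)
{
    size_t remaining = total;
    size_t n = 0;

    *last_chunk = 0;
    if (max_xfer == 0) {
        return 0; /* nothing can be transferred */
    }
    while (remaining > 0) {
        size_t chunk = next_chunk_size(remaining, max_xfer);

        *last_chunk = chunk;
        remaining -= chunk;
        n++;
    }
    return n;
}
```

A test that only uses sizes that are exact multiples of `max_xfer` never exercises the remainder branch, which matches the gap in coverage the commit message mentions.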
2023-10-02 | replace bcopy() with memcpy() (#786) | John Levon | 2 | -4/+5
For some unclear reason, clang-tidy believes bcopy() is insecure. Regardless, it is deprecated, so replace usages with memcpy(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2023-10-02 | fix VFIO_USER_DEVICE_GET_REGION_IO_FDS allocation (#785) | John Levon | 1 | -4/+9
clang-tidy static analysis identified a zero-sized allocation in the case that no ioregionfds had been configured. Fix this issue and add a test for it. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2023-09-18 | fix: minor memory bugs #784 | William Henderson | 1 | -8/+2
Fixes the following Coverity reports:

*** CID 417161: Memory - corruptions (ARRAY_VS_SINGLETON)
/samples/server.c: 438 in migration_write_data()
    432     }
    433
    434     /* write to bar0, if any */
    435     if (write_end > server_data->bar1_size) {
    436         length_in_bar0 = write_end - write_start;
    437         write_start -= server_data->bar1_size;
    CID 417161: Memory - corruptions (ARRAY_VS_SINGLETON)
    Using "&server_data->bar0" as an array. This might corrupt or misinterpret adjacent memory locations.
    438         memcpy(&server_data->bar0 + write_start, buf + length_in_bar1,
    439                length_in_bar0);
    440     }
    441
    442     server_data->migration.bytes_transferred += bytes_written;
    443

*** CID 417160: Memory - corruptions (ARRAY_VS_SINGLETON)
/samples/server.c: 394 in migration_read_data()
    388     }
    389
    390     /* read bar0, if any */
    391     if (read_end > server_data->bar1_size) {
    392         length_in_bar0 = read_end - read_start;
    393         read_start -= server_data->bar1_size;
    CID 417160: Memory - corruptions (ARRAY_VS_SINGLETON)
    Using "&server_data->bar0" as an array. This might corrupt or misinterpret adjacent memory locations.
    394         memcpy(buf + length_in_bar1, &server_data->bar0 + read_start,
    395                length_in_bar0);
    396     }
    397
    398     server_data->migration.bytes_transferred += bytes_read;
    399

*** CID 417159: Possible Control flow issues (DEADCODE)
/lib/libvfio-user.c: 121 in dev_get_caps()
    115
    116     header = (struct vfio_info_cap_header*)(vfio_reg + 1);
    117
    118     if (vfu_reg->mmap_areas != NULL) {
    119         int i, nr_mmap_areas = vfu_reg->nr_mmap_areas;
    120         if (type != NULL) {
    CID 417159: Possible Control flow issues (DEADCODE)
    Execution cannot reach this statement: "type->header.next = vfio_re...".
    121             type->header.next = vfio_reg->cap_offset + sizeof(struct vfio_region_info_cap_type);
    122             sparse = (struct vfio_region_info_cap_sparse_mmap*)(type + 1);
    123         } else {
    124             vfio_reg->cap_offset = sizeof(struct vfio_region_info);
    125             sparse = (struct vfio_region_info_cap_sparse_mmap*)header;
    126         }

Signed-off-by: William Henderson <william.henderson@nutanix.com>
2023-09-15 | adapt to VFIO live migration v2 (#782) | William Henderson | 8 | -703/+783
This commit adapts the vfio-user protocol specification and the libvfio-user implementation to v2 of the VFIO live migration interface, as used in the kernel and QEMU. The differences between v1 and v2 are discussed in this email thread [1], and we slightly differ from upstream VFIO v2 in that instead of transferring data over a new FD, we use the existing UNIX socket with new commands VFIO_USER_MIG_DATA_READ/WRITE. We also don't yet use P2P states. The updated spec was submitted to qemu-devel [2]. [1] https://lore.kernel.org/all/20220130160826.32449-9-yishaih@nvidia.com/ [2] https://lore.kernel.org/all/20230718094150.110183-1-william.henderson@nutanix.com/ Signed-off-by: William Henderson <william.henderson@nutanix.com>
2023-09-15 | Pass server->client command over a separate socket pair (#762) | Mattias Nissler | 4 | -16/+117
Use a separate socket for server->client commands. This change adds support for a separate socket to carry commands in the server-to-client direction. Sending commands in both directions over a single socket has proven problematic: matching replies to commands becomes non-trivial when both sides send commands at the same time, which adds significant complexity. See issue #279 for details. To set up the reverse communication channel, the client indicates support for it via a new capability flag in the version message. The server then creates a fresh pair of sockets and passes one end to the client in its version reply. When the server later wishes to send commands to the client, it uses its end of the new socket pair rather than the main socket. Corresponding replies are also passed back over the new socket pair. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
2023-08-31 | Construct server capabilities using json-c (#771) | Mattias Nissler | 1 | -25/+106
String formatting is hitting its limits: Adding another field is difficult given that we already branch on whether migration is enabled. This change constructs a JSON-C object instead so we can add what we need and serialize to a string afterwards. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2023-08-30 | Replace protocol header flags bit field with mask (#773) | Mattias Nissler | 3 | -14/+14
It turns out that the bit field will not yield the desired / specified bit layout on big-endian systems, see issue #768 for details. Thus, replace the bit field with constants for the individual fields and use bit masking when accessing the flags field. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
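The difference between a bit field and explicit masks can be sketched as below. The constant names (`HDR_F_*`) and the exact bit positions are assumptions for illustration only; consult the protocol specification and the library headers for the real definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout: low 4 bits carry the message type, followed by
 * single-bit flags. Masks operate on the integer value, so the result does
 * not depend on how a compiler orders bit-field members. */
#define HDR_F_TYPE_MASK   0xfu
#define HDR_F_TYPE_CMD    0x0u
#define HDR_F_TYPE_REPLY  0x1u
#define HDR_F_NO_REPLY    (1u << 4)
#define HDR_F_ERROR       (1u << 5)

static inline uint32_t
hdr_type(uint32_t flags)
{
    return flags & HDR_F_TYPE_MASK;
}

static inline bool
hdr_is_error(uint32_t flags)
{
    return (flags & HDR_F_ERROR) != 0;
}

/* Build reply flags from a command's flags: replace the type field and
 * optionally set the error bit, leaving the other bits untouched. */
static inline uint32_t
hdr_make_reply(uint32_t cmd_flags, bool error)
{
    uint32_t flags = (cmd_flags & ~HDR_F_TYPE_MASK) | HDR_F_TYPE_REPLY;

    if (error) {
        flags |= HDR_F_ERROR;
    }
    return flags;
}
```

Masking fixes which bits of the integer mean what; converting that integer to and from the wire byte order remains a separate step.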
2023-08-23 | fix: incorrect number of dirty pages printed (#766) | William Henderson | 1 | -1/+1
The `log_dirty_bitmap` function in `dma.c` would output the wrong number of dirty pages due to the `char` of the bitmap being sign-extended when implicitly being converted to `unsigned int` for `__builtin_popcount`. By adding an intermediate cast to `uint8_t` we avoid this incorrect behaviour. See https://github.com/nutanix/libvfio-user/pull/746#discussion_r1297173318. Signed-off-by: William Henderson <william.henderson@nutanix.com>
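The sign-extension pitfall can be reproduced with a short sketch. The function name `count_dirty` is hypothetical; `__builtin_popcount` is the GCC/Clang builtin named in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Count set bits in a bitmap stored as plain char. Where char is signed,
 * a byte such as 0xff converts to unsigned int as 0xffffffff, so popcount
 * would report 32 bits instead of 8. The intermediate uint8_t cast keeps
 * only the low 8 bits before the implicit widening. */
static inline unsigned
count_dirty(const char *bitmap, size_t size)
{
    unsigned n = 0;

    for (size_t i = 0; i < size; i++) {
        n += (unsigned)__builtin_popcount((uint8_t)bitmap[i]);
    }
    return n;
}
```

Declaring the bitmap as `uint8_t *` in the first place would avoid the cast, at the cost of changing the type wherever the bitmap is passed around.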
2023-08-15 | Introduce close_safely helper function (#763) | Mattias Nissler | 6 | -63/+49
The helper function centralizes some extra checks and diligence desired by most current code paths but so far applied inconsistently. This includes bypassing the close call when the file descriptor is already -1, resetting the file descriptor variable to -1 after closing, and preserving errno. All calls to close are replaced by close_safely. Some warning log output is lost as a result, but it does not seem to have been very useful, given that Linux always closes the file descriptor anyway. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
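A helper with the three properties listed above might look like the following. This is a sketch written from the description only; the pointer-taking signature is an assumption, and the library's own definition may differ.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Sketch of a close helper: skip descriptors that are already -1, reset
 * the caller's variable to -1 after closing, and preserve errno so that
 * cleanup code does not clobber an earlier error. */
static inline void
close_safely(int *fd)
{
    int saved_errno = errno;

    if (fd != NULL && *fd != -1) {
        (void)close(*fd);
        *fd = -1;
    }
    errno = saved_errno;
}
```

Because the variable is reset to -1, calling the helper twice on the same descriptor is harmless, which simplifies error paths that share cleanup code.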
2023-08-15 | Allow adding MSI capability via vfu_pci_add_capability (#758) | Florian Freudiger | 1 | -0/+67
Signed-off-by: Florian Freudiger <25648113+FlorianFreudiger@users.noreply.github.com>
2023-08-08 | Fix MSI-X capability write logging opposite status (#759) | Florian Freudiger | 1 | -3/+3
Signed-off-by: Florian Freudiger <25648113+FlorianFreudiger@users.noreply.github.com>
2023-07-03 | Fix address calculation for message-based DMA (#740) | Mattias Nissler | 1 | -1/+1
The correct DMA address is formed by adding base and offset - the latter was accidentally missing. Change the server example to read and write blocks at non-zero offsets, such that `test-client-server.sh` exercises offset handling. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
2023-06-08 | fix err/req irq fd issue (#731) | limiao-intel | 1 | -17/+46
When handle_device_set_irqs sets the err or req IRQ, the fd is stored in vfu_ctx->irqs->efds[] rather than in vfu_ctx->irqs->err_efd or vfu_ctx->irqs->req_efd. This patch checks the IRQ index before storing the fd, to make sure it is stored in the correct place. Signed-off-by: Miao Li <miao.li@intel.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2023-01-04 | allow -1 file descriptor for ioregionfd (#727) | Thanos Makatos | 1 | -1/+14
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2023-01-03 | fix FLR reset callback (#729) | John Levon | 3 | -19/+27
A reset callback is allowed to call functions disallowed in quiescent state. However, the FLR reset path neglected to account for this properly, causing an incorrect assert to be triggered if, for example, vfu_sgl_put() is called. To fix this, make sure all reset paths go through call_reset_cb(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-11-22 | vfu_pci_init: initialize PCI config space flags (#724) | Thanos Makatos | 1 | -0/+1
vfu_pci_init() sets the size of the PCI config space but not the flags; vfu_realize_ctx() won't initialize the flags since the size is already set. vfu_pci_init() must initialize the flags as well. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-11-22 | add debugging to handle_device_get_region_io_fds (#723) | Thanos Makatos | 1 | -0/+4
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-11-22 | allow shadow memory offset per shadow ioeventfd (#703) | Thanos Makatos | 2 | -8/+10
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-11-16 | check spelling (#720) | John Levon | 1 | -1/+1
Use misspell-fixer if available, and correct the small number of errors it found. Rather than trying to install into the CI, run it directly from a github action. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-10-05 | add some unlikely (#717) | Thanos Makatos | 1 | -13/+16
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-10-05 | only call debug_region_access if in debug mode (#716) | Thanos Makatos | 1 | -1/+9
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-10-05 | don't duplicate FD in get region info (#715) | Thanos Makatos | 1 | -9/+3
This is out of spec. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-10-04 | fix compilation for i386 and ppc64 (#709) | Thanos Makatos | 10 | -63/+102
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reported-by: Eduardo Lima <eblima@gmail.com>
2022-08-18 | make SGL error-checking DEBUG-only (#706) | John Levon | 1 | -3/+11
As vfu_addr_to_sgl() and co are on the hot path, compile out these sanity checks for non-DEBUG builds. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2022-08-18 | avoid vfu_log() in SGL hot path (#705) | John Levon | 1 | -0/+10
Even though in non-debug, we don't actually log anything here, even assembling the arguments to vfu_log() has a performance impact. Hide them behind a DEBUG_SGL define - even in a DEBUG build, they are particularly noisy and low-value. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
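One common way to keep even the argument evaluation off a hot path is to compile the logging call out entirely. The sketch below uses a hypothetical `sgl_debug` macro and `fprintf` in place of vfu_log(); only the `DEBUG_SGL` define comes from the commit message.

```c
#include <stdio.h>

/* When DEBUG_SGL is not defined, the macro expands to an empty statement,
 * so its arguments are never evaluated. */
#ifdef DEBUG_SGL
#define sgl_debug(...) fprintf(stderr, __VA_ARGS__)
#else
#define sgl_debug(...) do { } while (0)
#endif

static int expensive_calls;

/* Stand-in for a costly argument (hypothetical): counts its own calls. */
static inline int
expensive_arg(void)
{
    expensive_calls++;
    return 42;
}

/* Hot-path function: the debug line costs nothing unless DEBUG_SGL is set. */
static inline int
hot_path(int x)
{
    sgl_debug("x=%d extra=%d\n", x, expensive_arg());
    return x + 1;
}
```

Building with `-DDEBUG_SGL` turns the messages back on without touching the call sites.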
2022-08-08 | delete socket on vfu_ctx_destroy (#702) | Thanos Makatos | 2 | -5/+9
fixes #660 Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2022-07-04 | support for shadow ioeventfd (#698) | Thanos Makatos | 2 | -3/+22
When an ioeventfd is written to, KVM discards the value since it has no memory to write it to, and simply kicks the eventfd. This is a problem for devices such as NVMe controllers that need the value (e.g. doorbells on BAR0). This patch allows the vfio-user server to pass a file descriptor that can be mmap'ed, so that KVM can write the ioeventfd value to this _shadow_ memory instead of discarding it. This shadow memory is not exposed to the guest. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Change-Id: Iad849c94076ffa5988e034c8bf7ec312d01f095f
2022-06-09 | report function in quiesce_check_allowed() (#693) | John Levon | 1 | -7/+10
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2022-06-07 | irq: inform device of IRQ mask & unmask via callback (#694) | Jag Raman | 3 | -2/+44
Client masks or unmasks a device IRQ using the VFIO_USER_DEVICE_SET_IRQS message. Inform the device of such changes to the IRQ state. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-05-30 | allow all LOG_* levels (#691) | John Levon | 1 | -2/+1
While libvfio-user doesn't use them all, at least SPDK was expecting to be able to set LOG_NOTICE level, and silently failing. There's no reason we can't support any valid syslog level. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2022-05-30 | allow concurrent dirty bitmap get (#677) | John Levon | 3 | -15/+87
Use atomic operations to allow concurrent bitmap updates with VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP operations. Dirtying clients can race against each other, so we must use an atomic OR when marking dirty: we do this byte-by-byte. When reading the dirty bitmap, we must be careful not to race and lose any set bits within the same byte. If we miss an update, we'll catch it the next time around, presuming that before the final pass we'll have quiesced all I/O. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
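The byte-by-byte scheme can be sketched with C11 atomics. This is an illustration under assumptions: the function names are hypothetical, and the library may use compiler builtins rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Mark one page dirty with an atomic OR on the byte that holds its bit,
 * so concurrent writers to the same byte cannot lose each other's bits. */
static inline void
mark_dirty(_Atomic uint8_t *bitmap, size_t page)
{
    atomic_fetch_or(&bitmap[page / 8], (uint8_t)(1u << (page % 8)));
}

/* Read and clear the bitmap one byte at a time. The atomic exchange
 * returns the old byte and stores zero in a single step, so a bit set
 * concurrently lands either in this snapshot or in the next one. */
static inline void
get_and_clear(_Atomic uint8_t *bitmap, uint8_t *out, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        out[i] = atomic_exchange(&bitmap[i], 0);
    }
}
```

A separate load followed by a store of zero would open a window in which a newly set bit is overwritten; the exchange closes that window.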
2022-05-27 | re-work SGL API (#675) | John Levon | 3 | -52/+61
Harmonize and rename the vfu_*sg() APIs to better reflect their functionality: in our case, there is no mapping happening as part of these calls, they are merely housekeeping for range splitting, dirty tracking, and so on. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-27 | remove maps list from DMA controller (#674) | John Levon | 2 | -42/+7
->maps existed so that if a consumer does vfu_map_sg() and then we are asked to enable dirty page tracking, we won't mark those pages as dirty, and will hence potentially lose data. Now that we require quiesce and the use of either vfu_unmap_sg() or vfu_sg_mark_dirty(), there's no need to have this list any more. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-27 | remove refcnt from region (#673) | John Levon | 2 | -5/+0
The reference count is unused and not atomically handled; remove it. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-27 | re-work SG dirty tracking (#672) | John Levon | 2 | -8/+52
Move SG dirtying to vfu_unmap_sg(): as we don't want to track SGs ourselves, doing this in vfu_map_sg() is no longer the right place. Note that the lack of tracking implies that any SGs must be unmapped before the final stop and copy phase. To avoid the need for this, add vfu_mark_sg_dirty(): this allows a consumer to mark a region as dirty explicitly without needing to unmap it. Currently it's the same as vfu_unmap_sg(), but that's an implementation detail. Note this still marks current maps after a get operation; that will change subsequently. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-27 | require quiesce for VFIO_USER_DIRTY_PAGES (#671) | John Levon | 1 | -0/+3
If we require a quiesce for these calls, we can be sure that it will not race with any usage of vfu_*_sg() calls, as a first step towards concurrency. This is not ideal for VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP, which can potentially be called multiple times during pre-copy phase, but that's something we can fix later. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-23 | libvfio-user.h: sync VFIO_DEVICE_STATE_XXXX definitions with upstream (#690) | Jag Raman | 3 | -40/+40
Rename VFIO_DEVICE_STATE_XXXX defines as VFIO_DEVICE_STATE_V1_XXXX. Upstream renamed these variables to the XXXX_V1_XXXX format and switched to an enum for VFIO_DEVICE_STATE_XXXX. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-12 | run scan-build in CI (#680) | John Levon | 2 | -1/+5
Yet another static analyzer pass, this one is used by SPDK, and as it did detect some minor issues, it's worth running. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-09 | build: delete CMake build rules | Daniel P. Berrangé | 1 | -89/+0
Now that Meson is functional, support for building with CMake is removed so that there is only one build system to maintain. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-05-09 | build: introduce Meson build file rules | Daniel P. Berrangé | 1 | -0/+47
The Meson build system used by many other virt projects (QEMU, libvirt and others) is easier to understand & maintain rules for than cmake, guiding towards best practice. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-04-28 | lib: export dma_sg_size symbol in library (#664) | Daniel Berrangé | 1 | -1/+1
The dma_sg_size() method is listed in libvfio-user.h but the symbol is marked private in the ELF library. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-04-21 | fix a small coverity complaint (#663) | John Levon | 1 | -6/+0
The complaint was:

    259     if (ret != 0) {
    >>> CID 392380: Possible Control flow issues (DEADCODE)
    >>> Execution cannot reach this statement: "free(tp);".
    260         free(tp);
    261         return ERROR_INT(ret);
    262     }

Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2022-04-21 | support AFL++ fuzzing (#623) | John Levon | 9 | -340/+1002
To support fuzzing with AFL++, add a "pipe" transport that reads from stdin and outputs to stdout: this is the most convenient way of doing fuzzing. Add some docs on how to run a fuzzing session. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-04-19 | use struct iovec for grouping buffer and length (#658) | Thanos Makatos | 4 | -124/+118
This makes it tidier and easier to pass the buffer and length to functions, instead of passing the whole msg. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-03-07 | check for allowed operations in quiesce state (#647) | Thanos Makatos | 4 | -0/+60
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-03-02 | improve region access debugging (#653) | John Levon | 3 | -60/+49
Many region accesses of interest are of normal register sizes; sniff the region access size, and report the read/written value if possible. Clean up dump_buffer() now, as it's not of much use. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-02-25 | clarify when logging when device changes migration state (#649) | Thanos Makatos | 1 | -1/+6
This makes reading logs easier. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>