aboutsummaryrefslogtreecommitdiff
path: root/lib/libvfio-user.c
AgeCommit message (Collapse)AuthorFilesLines
2024-03-29Support 64 bits and prefetchable BARs (#792)jfgd1-0/+11
* Support 64 bits and prefetchable BARs Add two new flags for lib user to request 64bits and/or prefetchable BARs. Tested with a vfio-user client patched QEMU. Signed-off-by: Jérémy Fanguède <jfanguede@kalrayinc.com>
2024-01-24Fix DMA message size calculation (#788)Mattias Nissler1-4/+5
When performing DMA via VFIO-user commands over the socket, vfu_dma_transfer breaks large requests into chunks according to the client's maximum data transfer size negotiated at connection setup time. This change fixes the calculation of the chunk size for the case where the last chunk is less than the maximum transfer size. Unfortunately, the existing test didn't catch this due to the request size being a multiple of that maximum data transfer size. Adjust the test to make the last chunk size a true remainder. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
2023-10-02fix VFIO_USER_DEVICE_GET_REGION_IO_FDS allocation (#785)John Levon1-4/+9
clang-tidy static analysis identified a zero-sized allocation in the case that no ioregionfds had been configured. Fix this issue and add a test for it. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2023-09-18fix: minor memory bugs #784 William Henderson1-8/+2
Fixes the following Coverity reports: ________________________________________________________________________________________________________ *** CID 417161: Memory - corruptions (ARRAY_VS_SINGLETON) /samples/server.c: 438 in migration_write_data() 432 } 433 434 /* write to bar0, if any */ 435 if (write_end > server_data->bar1_size) { 436 length_in_bar0 = write_end - write_start; 437 write_start -= server_data->bar1_size; CID 417161: Memory - corruptions (ARRAY_VS_SINGLETON) Using "&server_data->bar0" as an array. This might corrupt or misinterpret adjacent memory locations. 438 memcpy(&server_data->bar0 + write_start, buf + length_in_bar1, 439 length_in_bar0); 440 } 441 442 server_data->migration.bytes_transferred += bytes_written; 443 ________________________________________________________________________________________________________ *** CID 417160: Memory - corruptions (ARRAY_VS_SINGLETON) /samples/server.c: 394 in migration_read_data() 388 } 389 390 /* read bar0, if any */ 391 if (read_end > server_data->bar1_size) { 392 length_in_bar0 = read_end - read_start; 393 read_start -= server_data->bar1_size; CID 417160: Memory - corruptions (ARRAY_VS_SINGLETON) Using "&server_data->bar0" as an array. This might corrupt or misinterpret adjacent memory locations. 394 memcpy(buf + length_in_bar1, &server_data->bar0 + read_start, 395 length_in_bar0); 396 } 397 398 server_data->migration.bytes_transferred += bytes_read; 399 ________________________________________________________________________________________________________ *** CID 417159: Possible Control flow issues (DEADCODE) /lib/libvfio-user.c: 121 in dev_get_caps() 115 116 header = (struct vfio_info_cap_header*)(vfio_reg + 1); 117 118 if (vfu_reg->mmap_areas != NULL) { 119 int i, nr_mmap_areas = vfu_reg->nr_mmap_areas; 120 if (type != NULL) { CID 417159: Possible Control flow issues (DEADCODE) Execution cannot reach this statement: "type->header.next = vfio_re...". 121 type->header.next = vfio_reg->cap_offset + sizeof(struct vfio_region_info_cap_type); 122 sparse = (struct vfio_region_info_cap_sparse_mmap*)(type + 1); 123 } else { 124 vfio_reg->cap_offset = sizeof(struct vfio_region_info); 125 sparse = (struct vfio_region_info_cap_sparse_mmap*)header; 126 } Signed-off-by: William Henderson <william.henderson@nutanix.com>
2023-09-15adapt to VFIO live migration v2 (#782)William Henderson1-195/+311
This commit adapts the vfio-user protocol specification and the libvfio-user implementation to v2 of the VFIO live migration interface, as used in the kernel and QEMU. The differences between v1 and v2 are discussed in this email thread [1], and we slightly differ from upstream VFIO v2 in that instead of transferring data over a new FD, we use the existing UNIX socket with new commands VFIO_USER_MIG_DATA_READ/WRITE. We also don't yet use P2P states. The updated spec was submitted to qemu-devel [2]. [1] https://lore.kernel.org/all/20220130160826.32449-9-yishaih@nvidia.com/ [2] https://lore.kernel.org/all/20230718094150.110183-1-william.henderson@nutanix.com/ Signed-off-by: William Henderson <william.henderson@nutanix.com>
2023-08-30Replace protocol header flags bit field with mask (#773)Mattias Nissler1-2/+2
It turns out that the bit field will not yield the desired / specified bit layout on big-endian systems, see issue #768 for details. Thus, replace the bit field with constants for the individual fields and use bit masking when accessing the flags field. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2023-08-15Introduce close_safely helper function (#763)Mattias Nissler1-15/+8
The helper function centralizes some extra checks and diligence desired by many/most current code paths but currently inconsistently applied. This includes bypassing the close call when the file descriptor is -1 already, resetting the file descriptor variable to -1 after closing, and preserving errno. All calls to close are replaced by close_safely. Some warning log output is lost over this, but it doesn't seem like this was very useful anyways given that Linux always closes the file descriptor anyways. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
2023-07-03Fix address calculation for message-based DMA (#740)Mattias Nissler1-1/+1
The correct DMA address is formed by adding base and offset - the latter was accidentally missing. Change the server example to read and write blocks at non-zero offsets, such that `test-client-server.sh` exercises offset handling. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
2023-01-04allow -1 file descriptor for ioregionfd (#727)Thanos Makatos1-1/+14
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2023-01-03fix FLR reset callback (#729)John Levon1-16/+24
A reset callback is allowed to call functions disallowed in quiescent state. However, the FLR reset path neglected to account for this properly, causing an incorrect assert to be triggered if, for example, vfu_sgl_put() is called. To fix this, make sure all reset paths go through call_reset_cb(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-11-22add debugging to handle_device_get_region_io_fds (#723)Thanos Makatos1-0/+4
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-11-22allow shadow memory offset per shadow ioeventfd (#703)Thanos Makatos1-7/+8
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-11-16check spelling (#720)John Levon1-1/+1
Use misspell-fixer if available, and correct the small number of errors it found. Rather than trying to install into the CI, run it directly from a github action. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-10-05add some unlikely (#717)Thanos Makatos1-13/+16
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-10-05only call debug_region_access if in debug mode (#716)Thanos Makatos1-1/+9
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-10-05don't duplicate FD in get region info (#715)Thanos Makatos1-9/+3
This is out of spec. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-10-04fix compilation for i386 and ppc64 (#709)Thanos Makatos1-31/+51
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reported-by: Eduardo Lima <eblima@gmail.com>
2022-08-18make SGL error-checking DEBUG-only (#706)John Levon1-3/+11
As vfu_addr_to_sgl() and co are on the hot path, compile out these sanity checks for non-DEBUG builds. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2022-08-08delete socket on vfu_ctx_destroy (#702)Thanos Makatos1-1/+1
fixes #660 Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2022-07-04support for shadow ioeventfd (#698)Thanos Makatos1-3/+21
When an ioeventfd is written to, KVM discards the value since it has no memory to write it to, and simply kicks the eventfd. This a problem for devices such a NVMe controllers that need the value (e.g. doorbells on BAR0). This patch allows the vfio-user server to pass a file descriptor that can be mmap'ed and KVM can write the ioeventfd value to this _shadow_ memory instead of discarding it. This shadow memory is not exposed to the guest. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Change-Id: Iad849c94076ffa5988e034c8bf7ec312d01f095f
2022-06-09report function in quiesce_check_allowed() (#693)John Levon1-7/+10
Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2022-06-07irq: inform device of IRQ mask & unmask via callback (#694)Jag Raman1-0/+16
Client masks or unmasks a device IRQ using the VFIO_USER_DEVICE_SET_IRQS message. Inform the device of such changes to the IRQ state. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-05-30allow all LOG_* levels (#691)John Levon1-2/+1
While libvfio-user doesn't use them all, at least SPDK was expecting to be able to set LOG_NOTICE level, and silently failing. There's no reason we can't support any valid syslog level. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2022-05-30allow concurrent dirty bitmap get (#677)John Levon1-2/+9
Use atomic operations to allow concurrent bitmap updates with VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP operations. Dirtying clients can race against each other, so we must use atomic or when marking dirty: we do this byte-by-byte. When reading the dirty bitmap, we must be careful to not race and lose any set bits within the same byte. If we miss an update, we'll catch it the next time around, presuming that before the final pass we'll have quiesced all I/O. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-27re-work SGL API (#675)John Levon1-22/+27
Harmonize and rename the vfu_*sg() APIs to better reflect their functionality: in our case, there is no mapping happening as part of these calls, they are merely housekeeping for range splitting, dirty tracking, and so on. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-27re-work SG dirty tracking (#672)John Levon1-2/+15
Move SG dirtying to vfu_unmap_sg(): as we don't want to track SGs ourselves, doing this in vfu_map_sg() is no longer the right place. Note that the lack of tracking implies that any SGs must be unmapped before the final stop and copy phase. To avoid the need for this, add vfu_mark_sg_dirty(): this allows a consumer to mark a region as dirty explicitly without needing to unmap it. Currently it's the same as vfu_unmap_sg(), but that's an implementation detail. Note this still marks current maps after a get operation; that will change subsequently. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-27require quiesce for VFIO_USER_DIRTY_PAGES (#671)John Levon1-0/+3
If we require a quiesce for these calls, we can be sure that it will not race with any usage of vfu_*_sg() calls, as a first step towards concurrency. This is not ideal for VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP, which can potentially be called multiple times during pre-copy phase, but that's something we can fix later. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-23libvfio-user.h: sync VFIO_DEVICE_STATE_XXXX definitions with upstream (#690)Jag Raman1-1/+1
Rename VFIO_DEVICE_STATE_XXXX defines as VFIO_DEVICE_STATE_V1_XXXX. Upstream renamed these variable to be of the XXXX_V1_XXXX format and switched an enum for VFIO_DEVICE_STATE_XXXX. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-05-12run scan-build in CI (#680)John Levon1-1/+1
Yet another static analyzer pass, this one is used by SPDK, and as it did detect some minor issues, it's worth running. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-04-21support AFL++ fuzzing (#623)John Levon1-1/+14
To support fuzzing with AFL++, add a "pipe" transport that reads from stdin and outputs to stdout: this is the most convenient way of doing fuzzing. Add some docs on how to run a fuzzing session. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-04-19use struct iovec for grouping buffer and length (#658)Thanos Makatos1-79/+79
This make it tidier and easier to pass to function the buffer and length, instead of passing the whole msg. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-03-07check for allowed operations in quiesce state (#647)Thanos Makatos1-0/+41
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-03-02improve region access debugging (#653)John Levon1-52/+49
Many region accesses of interest are of normal register sizes; sniff the region access size, and report the read/written value if possible. Clean up dump_buffer() now, as it's not of much use. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2022-02-24additional logging around dirty pages (#650)Thanos Makatos1-0/+6
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2022-01-17small get_request_header() cleanup (#637)John Levon1-9/+2
LGTM.com reports "Comparison is always true because ret <= -1.", and it's indeed correct (but harmless). Clean this up. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-12-22fix coverity issues (#636)John Levon1-0/+1
Fix three remaining low priority coverity issues; they do not represent bugs. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-12-09allow DMA funcs to be called in quiesced state (#635)Thanos Makatos1-2/+0
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-12-01refactor process_request() (#633)John Levon1-174/+200
Instead of process_request() having a dual role, split into get_request() and handle_request(). Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-11-30introduce device quiesce callback (#609)Thanos Makatos1-71/+181
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Leon <john.levon@nutanix.com>
2021-11-24check ->argsz for VFIO_USER_DIRTY_PAGES (#628)John Levon1-5/+12
We weren't checking for a too-large ->argsz for this command. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-11-24fix dma unmap validation (#626)John Levon1-40/+57
There were two issues with unmap request validation when the dirty bitmap flag was set: - we weren't checking ->argsz against the maximum transfer size, allowing a client to trigger unbounded allocations - we needed to check for overflow when calculating the requested message out size Found via AFL++. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-11-24is_valid_region_access(): account for wrap-around (#624)John Levon1-2/+1
AFL++ found this, though we already knew about it, so fix it by comparing against a saturating addition. This was the only instance of client-controlled potential overflow I noticed. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-11-12don't assert in dev_get_caps() (#621)John Levon1-5/+6
As clients control ->client_max_fds, we should return an error, not assert, if we can't represent a region's mmap_areas. Found via AFL++. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-11-11Fix reply of VFIO_USER_DEVICE_GET_REGION_INFO (#619)Swapnil Ingle1-4/+8
* Fix reply of VFIO_USER_DEVICE_GET_REGION_INFO Set VFIO_REGION_INFO_FLAG_CAPS flag only if caps are part of the reply. Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-10-05make migration state callback optionally asynchronous (#608)Thanos Makatos1-14/+62
Some devices need the migration state callback to be asynchronous. The simplest way to implement this is to require from the callback to return -1 and set errno to EBUSY, not process any other new messages (vfu_ctx_run returns -1 and sets errno to EBUSY), and provide a way to the user to complete migration (vfu_migr_done). Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com> Reviewed-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
2021-09-28disable VFIO_USER_DEVICE_GET_REGION_IO_FDS on invalid region (#607)John Levon1-0/+5
If a region is not set up, asking for its iofds should fail with EINVAL. Co-authored-by: John Levon <john.levon@nutanix.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-09-28fix issue with get_region_io_fds when fd = -1 (#606)John Levon1-5/+1
An unmappable region should still allow io fds, as they are orthogonal. Co-authored-by: John Levon <john.levon@nutanix.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
2021-09-08initial ioeventfd support (#601)JAKelly101-1/+181
Provide initial support for handling VFIO_USER_DEVICE_GET_REGION_IO_FDS, along with a new vfu_create_ioeventfd() API. Reviewed-by: John Levon <john.levon@nutanix.com>
2021-08-27Add support for VFIO_DMA_UNMAP_FLAG_ALL flag (#600)Swapnil Ingle1-1/+24
* Add support for VFIO_DMA_UNMAP_FLAG_ALL flag Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>
2021-08-26Fix err path of handle_dma_unmap() (#597)Swapnil Ingle1-5/+5
* initial dma_unmap test Signed-off-by: John Levon <john.levon@nutanix.com> Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> * Fix err path of handle_dma_unmap() Set msg->out_size before successful return. Otherwise in case of error reply path we may endup setting iovecs[1].iov_len with invalid iovecs[1].iov_base in tran_sock_reply() Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com> Reviewed-by: John Levon <john.levon@nutanix.com>