aboutsummaryrefslogtreecommitdiff
path: root/util
AgeCommit message (Collapse)AuthorFilesLines
2022-02-14util: adjust coroutine pool size to virtio block queueHiroki Narukawa1-4/+16
Coroutine pool size was 64 from long ago, and the basis was organized in the commit message in 4d68e86b. At that time, virtio-blk queue-size and num-queue were not configuable, and equivalent values were 128 and 1. Coroutine pool size 64 was fine then. Later queue-size and num-queue got configuable, and default values were increased. Coroutine pool with size 64 exhausts frequently with random disk IO in new size, and slows down. This commit adjusts coroutine pool size adaptively with new values. This commit adds 64 by default, but now coroutine is not only for block devices, and is not too much burdon comparing with new default. pool size of 128 * vCPUs. Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> Message-id: 20220214115302.13294-2-hnarukaw@yahoo-corp.jp Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-02-06util/oslib-posix: Fix missing unlock in the error path of os_mem_prealloc()David Hildenbrand1-0/+1
We're missing an unlock in case installing the signal handler failed. Fortunately, we barely see this error in real life. Fixes: a960d6642d39 ("util/oslib-posix: Support concurrent os_mem_prealloc() invocation") Fixes: CID 1468941 Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta@ionos.com> Cc: Daniel P. Berrangé <berrange@redhat.com> Cc: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20220111120830.119912-1-david@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-02-04cpuid: use unsigned for max cpuidMichael S. Tsirkin1-1/+1
__get_cpuid_max returns an unsigned value. For consistency, store the result in an unsigned variable. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-02-01block/export: Fix vhost-user-blk shutdown with requests in flightKevin Wolf1-0/+22
The vhost-user-blk export runs requests asynchronously in their own coroutine. When the vhost connection goes away and we want to stop the vhost-user server, we need to wait for these coroutines to stop before we can unmap the shared memory. Otherwise, they would still access the unmapped memory and crash. This introduces a refcount to VuServer which is increased when spawning a new request coroutine and decreased before the coroutine exits. The memory is only unmapped when the refcount reaches zero. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220125151435.48792-1-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-01-14Merge remote-tracking branch ↵Peter Maydell7-34/+90
'remotes/stefanha-gitlab/tags/block-pull-request' into staging Pull request # gpg: Signature made Wed 12 Jan 2022 17:13:54 GMT # gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full] # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full] # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha-gitlab/tags/block-pull-request: virtio: unify dataplane and non-dataplane ->handle_output() virtio: use ->handle_output() instead of ->handle_aio_output() virtio-scsi: prepare virtio_scsi_handle_cmd for dataplane virtio-blk: drop unused virtio_blk_handle_vq() return value virtio: get rid of VirtIOHandleAIOOutput aio-posix: split poll check from ready handler Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-12aio-posix: split poll check from ready handlerStefan Hajnoczi7-34/+90
Adaptive polling measures the execution time of the polling check plus handlers called when a polled event becomes ready. Handlers can take a significant amount of time, making it look like polling was running for a long time when in fact the event handler was running for a long time. For example, on Linux the io_submit(2) syscall invoked when a virtio-blk device's virtqueue becomes ready can take 10s of microseconds. This can exceed the default polling interval (32 microseconds) and cause adaptive polling to stop polling. By excluding the handler's execution time from the polling check we make the adaptive polling calculation more accurate. As a result, the event loop now stays in polling mode where previously it would have fallen back to file descriptor monitoring. The following data was collected with virtio-blk num-queues=2 event_idx=off using an IOThread. Before: 168k IOPS, IOThread syscalls: 9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0) = 16 9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8) = 8 9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8) = 8 9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3 9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0) = 32 174k IOPS (+3.6%), IOThread syscalls: 9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0) = 32 9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8) = 8 9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8) = 8 9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50) = 32 Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because the IOThread stays in polling mode instead of falling back to file descriptor monitoring. As usual, polling is not implemented on Windows so this patch ignores the new io_poll_read() callback in aio-win32.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-2-stefanha@redhat.com [Fixed up aio_set_event_notifier() calls in tests/unit/test-fdmon-epoll.c added after this series was queued. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12meson: reenable filemonitor-inotify compilationVolker Rümelin1-2/+5
Reenable util/filemonitor-inotify compilation. Compilation was disabled when commit a620fbe9ac ("configure: convert compiler tests to meson, part 5") moved CONFIG_INOTIFY1 from config-host.mak to config-host.h. This fixes the usb-mtp device and reenables test-util-filemonitor. Fixes: a620fbe9ac ("configure: convert compiler tests to meson, part 5") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/800 Signed-off-by: Volker Rümelin <vr_qemu@t-online.de> Message-Id: <20220107133514.7785-1-vr_qemu@t-online.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-08qemu/int128: addition of div/rem 128-bit operationsFrédéric Pétrot2-0/+148
Addition of div and rem on 128-bit integers, using the 128/64->128 divu and 64x64->128 mulu in host-utils. These operations will be used within div/rem helpers in the 128-bit riscv target. Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20220106210108.138226-4-frederic.petrot@univ-grenoble-alpes.fr Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-01-07util/oslib-posix: Forward SIGBUS to MCE handler under LinuxDavid Hildenbrand1-3/+34
Temporarily modifying the SIGBUS handler is really nasty, as we might be unlucky and receive an MCE SIGBUS while having our handler registered. Unfortunately, there is no way around messing with SIGBUS when MADV_POPULATE_WRITE is not applicable or not around. Let's forward SIGBUS that don't belong to us to the already registered handler and document the situation. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-8-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Support concurrent os_mem_prealloc() invocationDavid Hildenbrand1-0/+9
Add a mutex to protect the SIGBUS case, as we cannot mess concurrently with the sigbus handler and we have to manage the global variable sigbus_memset_context. The MADV_POPULATE_WRITE path can run concurrently. Note that page_mutex and page_cond are shared between concurrent invocations, which shouldn't be a problem. This is a preparation for future virtio-mem prealloc code, which will call os_mem_prealloc() asynchronously from an iothread when handling guest requests. Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-7-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Avoid creating a single thread with MADV_POPULATE_WRITEDavid Hildenbrand1-0/+8
Let's simplify the case when we only want a single thread and don't have to mess with signal handlers. Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-6-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Don't create too many threads with small memory or little ↵David Hildenbrand1-2/+10
pages Let's limit the number of threads to something sane, especially that - We don't have more threads than the number of pages we have - We don't have threads that initialize small (< 64 MiB) memory Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-5-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Introduce and use MemsetContext for touch_all_pages()David Hildenbrand1-26/+47
Let's minimize the number of global variables to prepare for os_mem_prealloc() getting called concurrently and make the code a bit easier to read. The only consumer that really needs a global variable is the sigbus handler, which will require protection via a mutex in the future either way as we cannot concurrently mess with the SIGBUS handler. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-4-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc()David Hildenbrand1-21/+62
Let's sense support and use it for preallocation. MADV_POPULATE_WRITE does not require a SIGBUS handler, doesn't actually touch page content, and avoids context switches; it is, therefore, faster and easier to handle than our current approach. While MADV_POPULATE_WRITE is, in general, faster than manual prefaulting, and especially faster with 4k pages, there is still value in prefaulting using multiple threads to speed up preallocation. More details on MADV_POPULATE_WRITE can be found in the Linux commits 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1]. This resolves the TODO in do_touch_pages(). In the future, we might want to look into using fallocate(), eventually combined with MADV_POPULATE_READ, when dealing with shared file/fd mappings and not caring about memory bindings. [1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-3-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Let touch_all_pages() return an errorDavid Hildenbrand1-12/+16
Let's prepare touch_all_pages() for returning differing errors. Return an error from the thread and report the last processed error. Translate SIGBUS to -EFAULT, as a SIGBUS can mean all different kind of things (memory error, read error, out of memory). When allocating memory fails via the current SIGBUS-based mechanism, we'll get: os_mem_prealloc: preallocating memory failed: Bad address Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-2-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-16transactions: Invoke clean() after everything elseHanna Reitz1-2/+6
Invoke the transaction drivers' .clean() methods only after all .commit() or .abort() handlers are done. This makes it easier to have nested transactions where the top-level transactions pass objects to lower transactions that the latter can still use throughout their commit/abort phases, while the top-level transaction keeps a reference that is released in its .clean() method. (Before this commit, that is also possible, but the top-level transaction would need to take care to invoke tran_add() before the lower-level transaction does. This commit makes the ordering irrelevant, which is just a bit nicer.) Signed-off-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20211111120829.81329-8-hreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20211115145409.176785-8-kwolf@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-11-10rcu: Introduce force_rcu notifierGreg Kurz1-0/+19
The drain_rcu_call() function can be blocked as long as an RCU reader stays in a read-side critical section. This is typically what happens when a TCG vCPU is executing a busy loop. It can deadlock the QEMU monitor as reported in https://gitlab.com/qemu-project/qemu/-/issues/650 . This can be avoided by allowing drain_rcu_call() to enforce an RCU grace period. Since each reader might need to do specific actions to end a read-side critical section, do it with notifiers. Prepare ground for this by adding a notifier list to the RCU reader struct and use it in wait_for_readers() if drain_rcu_call() is in progress. An API is added for readers to register their notifiers. This is largely based on a draft from Paolo Bonzini. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211109183523.47726-2-groug@kaod.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-02util: Make some iova_tree parameters constEugenio Pérez1-6/+6
As qemu guidelines: Unless a pointer is used to modify the pointed-to storage, give it the "const" attribute. In the particular case of iova_tree_find it allows to enforce what is requested by its comment, since the compiler would shout in case of modifying or freeing the const-qualified returned pointer. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211013182713.888753-2-eperezma@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-27host-utils: add 128-bit quotient support to divu128/divs128Luis Pires1-41/+86
These will be used to implement new decimal floating point instructions from Power ISA 3.1. The remainder is now returned directly by divu128/divs128, freeing up phigh to receive the high 64 bits of the quotient. Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211025191154.350831-4-luis.pires@eldorado.org.br> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-27host-utils: move checks out of divu128/divs128Luis Pires1-22/+18
In preparation for changing the divu128/divs128 implementations to allow for quotients larger than 64 bits, move the div-by-zero and overflow checks to the callers. Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211025191154.350831-2-luis.pires@eldorado.org.br> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-15qemu-option: Allow deleting opts during qemu_opts_foreach()Kevin Wolf1-2/+2
Use QTAILQ_FOREACH_SAFE() so that the current QemuOpts can be deleted while iterating through the whole list. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211008133442.141332-11-kwolf@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-14configure, meson: move more compiler checks to MesonPaolo Bonzini1-1/+3
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20211007130829.632254-15-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-14configure, meson: move pthread_setname_np checks to MesonPaolo Bonzini1-3/+2
This makes the pthreads check dead in configure, so remove it as well. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20211007130829.632254-9-pbonzini@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-13util/compatfd.c: use libc signalfd wrapper instead of raw syscallKacper Słomiński1-3/+2
This allows the use of native signalfd instead of the sigtimedwait based emulation on systems other than Linux. Signed-off-by: Kacper Słomiński <kacper.slominski72@gmail.com> Message-Id: <20210905011621.200785-1-kacper.slominski72@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into ↵Peter Maydell1-3/+3
staging * SGX implementation for x86 * Miscellaneous bugfixes * Fix dependencies from ROMs to qtests # gpg: Signature made Thu 30 Sep 2021 14:30:35 BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: (33 commits) meson_options.txt: Switch the default value for the vnc option to 'auto' build-sys: add HAVE_IPPROTO_MPTCP memory: Add tracepoint for dirty sync memory: Name all the memory listeners target/i386: Fix memory leak in sev_read_file_base64() tests: qtest: bios-tables-test depends on the unpacked edk2 ROMs meson: unpack edk2 firmware even if --disable-blobs target/i386: Add the query-sgx-capabilities QMP command target/i386: Add HMP and QMP interfaces for SGX docs/system: Add SGX documentation to the system manual sgx-epc: Add the fill_device_info() callback support i440fx: Add support for SGX EPC q35: Add support for SGX EPC i386: acpi: Add SGX EPC entry to ACPI tables i386/pc: Add e820 entry for SGX EPC section(s) hw/i386/pc: Account for SGX EPC sections when calculating device memory hw/i386/fw_cfg: Set SGX bits in feature control fw_cfg accordingly Adjust min CPUID level to 0x12 when SGX is enabled i386: Propagate SGX CPUID sub-leafs to KVM i386: kvm: Add support for exposing PROVISIONKEY to guest ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-09-30build-sys: add HAVE_IPPROTO_MPTCPMarc-André Lureau1-3/+3
The QAPI schema shouldn't rely on C system headers #define, but on configure-time project #define, so we can express the build condition in a C-independent way. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20210907121943.3498701-3-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-29host-utils: Fix overflow detection in divu128()Luis Pires1-1/+1
The previous code didn't detect overflows if the high 64-bit of the dividend were equal to the 64-bit divisor. In that case, 64 bits wouldn't be enough to hold the quotient. Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210910112624.72748-2-luis.pires@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27qapi: Convert simple union SocketAddressLegacy to flat oneMarkus Armbruster1-4/+4
Simple unions predate flat unions. Having both complicates the QAPI schema language and the QAPI generator. We haven't been using simple unions in new code for a long time, because they are less flexible and somewhat awkward on the wire. To prepare for their removal, convert simple union SocketAddressLegacy to an equivalent flat one, with existing enum SocketAddressType replacing implicit enum type SocketAddressLegacyKind. Adds some boilerplate to the schema, which is a bit ugly, but a lot easier to maintain than the simple union feature. Cc: "Daniel P. Berrangé" <berrange@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210917143134.412106-9-armbru@redhat.com>
2021-09-15util: Remove redundant checks in the openpty()AlexChen1-4/+3
As we can see from the following function call stack, amaster and aslave can not be NULL: char_pty_open() -> qemu_openpty_raw() -> openpty(). In addition, according to the API specification for openpty(): https://www.gnu.org/software/libc/manual/html_node/Pseudo_002dTerminal-Pairs.html, the arguments name, termp and winp can all be NULL, but arguments amaster or aslave can not be NULL. Finally, amaster and aslave has been dereferenced at the beginning of the openpty(). So the checks on amaster and aslave in the openpty() are redundant. Remove them. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <5F9FE5B8.1030803@huawei.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-09-13util: Suppress -Wstringop-overflow in qemu_thread_startRichard Henderson1-0/+19
This seems to be either a glibc or gcc bug, but the code appears to be fine with the warning suppressed. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210803211907.150525-1-richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-07Merge remote-tracking branch ↵Peter Maydell1-44/+55
'remotes/stefanha-gitlab/tags/block-pull-request' into staging Pull request Userspace NVMe driver patches. # gpg: Signature made Tue 07 Sep 2021 09:13:57 BST # gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full] # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full] # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha-gitlab/tags/block-pull-request: block/nvme: Only report VFIO error on failed retry util/vfio-helpers: Let qemu_vfio_do_mapping() propagate Error util/vfio-helpers: Simplify qemu_vfio_dma_map() returning directly util/vfio-helpers: Use error_setg in qemu_vfio_find_[fixed/temp]_iova util/vfio-helpers: Extract qemu_vfio_water_mark_reached() util/vfio-helpers: Pass Error handle to qemu_vfio_dma_map() block/nvme: Have nvme_create_queue_pair() report errors consistently util/vfio-helpers: Remove unreachable code in qemu_vfio_dma_map() util/vfio-helpers: Replace qemu_mutex_lock() calls with QEMU_LOCK_GUARD util/vfio-helpers: Let qemu_vfio_verify_mappings() use error_report() block/nvme: Use safer trace format string Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-09-07util/vfio-helpers: Let qemu_vfio_do_mapping() propagate ErrorPhilippe Mathieu-Daudé1-4/+4
Pass qemu_vfio_do_mapping() an Error* argument so it can propagate any error to callers. Replace error_report() which only report to the monitor by the more generic error_setg_errno(). Reviewed-by: Fam Zheng <fam@euphon.net> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210902070025.197072-11-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-07util/vfio-helpers: Simplify qemu_vfio_dma_map() returning directlyPhilippe Mathieu-Daudé1-13/+10
To simplify qemu_vfio_dma_map(): - reduce 'ret' (returned value) scope by returning errno directly, - remove the goto 'out' label. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210902070025.197072-10-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-07util/vfio-helpers: Use error_setg in qemu_vfio_find_[fixed/temp]_iovaPhilippe Mathieu-Daudé1-10/+14
Both qemu_vfio_find_fixed_iova() and qemu_vfio_find_temp_iova() return an errno which is unused (or overwritten). Have them propagate eventual errors to callers, returning a boolean (which is what the Error API recommends, see commit e3fe3988d78 "error: Document Error API usage rules" for rationale). Suggested-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210902070025.197072-9-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-07util/vfio-helpers: Extract qemu_vfio_water_mark_reached()Philippe Mathieu-Daudé1-1/+16
Extract qemu_vfio_water_mark_reached() for readability, and have it provide an error hint it its Error* handle. Suggested-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210902070025.197072-8-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-07util/vfio-helpers: Pass Error handle to qemu_vfio_dma_map()Philippe Mathieu-Daudé1-4/+6
Currently qemu_vfio_dma_map() displays errors on stderr. When using management interface, this information is simply lost. Pass qemu_vfio_dma_map() an Error** handle so it can propagate the error to callers. Reviewed-by: Fam Zheng <fam@euphon.net> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210902070025.197072-7-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-07util/vfio-helpers: Remove unreachable code in qemu_vfio_dma_map()Philippe Mathieu-Daudé1-4/+0
qemu_vfio_add_mapping() returns a pointer to an indexed entry in pre-allocated QEMUVFIOState::mappings[], thus can not be NULL. Remove the pointless check. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210902070025.197072-5-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-07util/vfio-helpers: Replace qemu_mutex_lock() calls with QEMU_LOCK_GUARDPhilippe Mathieu-Daudé1-6/+3
Simplify qemu_vfio_dma_[un]map() handlers by replacing a pair of qemu_mutex_lock/qemu_mutex_unlock calls by the WITH_QEMU_LOCK_GUARD macro. Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210902070025.197072-4-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-07util/vfio-helpers: Let qemu_vfio_verify_mappings() use error_report()Philippe Mathieu-Daudé1-2/+2
Instead of displaying the error on stderr, use error_report() which also report to the monitor. Reviewed-by: Fam Zheng <fam@euphon.net> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20210902070025.197072-3-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-06qemu-sockets: fix unix socket path copy (again)Michael Tokarev1-8/+5
Commit 4cfd970ec188558daa6214f26203fe553fb1e01f added an assert which ensures the path within an address of a unix socket returned from the kernel is at least one byte and does not exceed sun_path buffer. Both of this constraints are wrong: A unix socket can be unnamed, in this case the path is completely empty (not even \0) And some implementations (notable linux) can add extra trailing byte (\0) _after_ the sun_path buffer if we passed buffer larger than it (and we do). So remove the assertion (since it causes real-life breakage) but at the same time fix the usage of sun_path. Namely, we should not access sun_path[0] if kernel did not return it at all (this is the case for unnamed sockets), and use the returned salen when copyig actual path as an upper constraint for the amount of bytes to copy - this will ensure we wont exceed the information provided by the kernel, regardless whenever there is a trailing \0 or not. This also helps with unnamed sockets. Note the case of abstract socket, the sun_path is actually a blob and can contain \0 characters, - it should not be passed to g_strndup and the like, it should be accessed by memcpy-like functions. Fixes: 4cfd970ec188558daa6214f26203fe553fb1e01f Fixes: http://bugs.debian.org/993145 Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> CC: qemu-stable@nongnu.org
2021-08-04util: fix abstract socket path copyMarc-André Lureau1-1/+4
Commit 776b97d360 "qemu-sockets: add abstract UNIX domain socket support" neglected to update socket_sockaddr_to_address_unix() and copied the whole sun_path without taking "salen" into account. Later, commit 3b14b4ec49 "sockets: Fix socket_sockaddr_to_address_unix() for abstract sockets" handled the abstract UNIX path, by stripping the leading \0 character and fixing address details, but didn't use salen either. Not taking "salen" into account may result in incorrect "path" being returned in monitors commands, as we read past the address which is not necessarily \0-terminated. Fixes: 776b97d3605ed0fc94443048fdf988c7725e38a9 Fixes: 3b14b4ec49a801067da19d6b8469eb1c1911c020 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: xiaoqiang zhao <zxq_yx_007@163.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-07-26util/selfmap: Discard mapping on errorRichard Henderson1-12/+17
From clang-13: util/selfmap.c:26:21: error: variable 'errors' set but not used \ [-Werror,-Wunused-but-set-variable] Quite right of course, but there's no reason not to check errors. First, incrementing errors is incorrect, because qemu_strtoul returns an errno not a count -- just or them together so that we have a non-zero value at the end. Second, if we have an error, do not add the struct to the list, but free it instead. Cc: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-07-22Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into ↵Peter Maydell1-5/+11
staging Bugfixes. # gpg: Signature made Thu 22 Jul 2021 14:11:27 BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: configure: Let --without-default-features disable vhost-kernel and vhost-vdpa configure: Fix the default setting of the "xen" feature configure: Allow vnc to get disabled with --without-default-features configure: Fix --without-default-features propagation to meson meson: fix dependencies for modinfo configure: Drop obsolete check for the alloc_size attribute target/i386: Added consistency checks for EFER target/i386: Added consistency checks for CR4 target/i386: Added V_INTR_PRIO check to virtual interrupts qemu-config: restore "machine" in qmp_query_command_line_options() usb: fix usb-host dependency check chardev-spice: add missing module_obj directive vl: Parse legacy default_machine_opts qemu-config: fix memory leak on ferror() qemu-config: never call the callback after an error, fix leak Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-07-22qemu-config: restore "machine" in qmp_query_command_line_options()Stefan Hajnoczi1-2/+7
Commit d8fb7d0969d5c32b3d1b9e20b63ec6c0abe80be4 ("vl: switch -M parsing to keyval") stopped adding the "machine" QemuOptsList. This causes "machine" options to not show up in QMP query-command-line-options output. For example, libvirt cannot detect that kernel_irqchip support is available. Adjust the "machine" opts enumeration in qmp_query_command_line_options() so that options are properly reported. Fixes: d8fb7d0969d5 ("vl: switch -M parsing to keyval") Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210721151055.424580-1-stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-21qemu/atomic: Add aligned_{int64,uint64}_t typesRichard Henderson1-2/+2
Use it to avoid some clang-12 -Watomic-alignment errors, forcing some structures to be aligned and as a pointer when we have ensured that the address is aligned. Tested-by: Cole Robinson <crobinso@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-07-21iothread: add aio-max-batch parameterStefano Garzarella3-0/+19
The `aio-max-batch` parameter will be propagated to AIO engines and it will be used to control the maximum number of queued requests. When there are in queue a number of requests equal to `aio-max-batch`, the engine invokes the system call to forward the requests to the kernel. This parameter allows us to control the maximum batch size to reduce the latency that requests might accumulate while queued in the AIO engine queue. If `aio-max-batch` is equal to 0 (default value), the AIO engine will use its default maximum batch size value. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20210721094211.69853-3-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-07-20qemu-config: fix memory leak on ferror()Paolo Bonzini1-1/+2
The leak is basically impossible to reach, since the only common way to get ferror(fp) is by passing a directory to -readconfig. In that case, the error occurs before qdict is set to anything non-NULL. However, it's theoretically possible to get there after an EIO. Cc: armbru@redhat.com Reported-by: Peter Maydell <peter.maydell@linaro.org> Fixes: f7544edcd3 ("qemu-config: add error propagation to qemu_config_parse", 2021-03-06) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-20qemu-config: never call the callback after an error, fix leakPaolo Bonzini1-2/+2
Ensure that the callback to qemu_config_foreach is never called upon an error, by moving the invocation before the "out" label. Cc: armbru@redhat.com Fixes: 3770141139 ("qemu-config: parse configuration files to a QDict", 2021-06-04) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-11Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into ↵Peter Maydell3-88/+118
staging * More SVM fixes (Lara) * Module annotation database (Gerd) * Memory leak fixes (myself) * Build fixes (myself) * --with-devices-* support (Alex) # gpg: Signature made Fri 09 Jul 2021 17:23:52 BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: (48 commits) meson: Use input/output for entitlements target configure: allow the selection of alternate config in the build configs: rename default-configs to configs and reorganise hw/arm: move CONFIG_V7M out of default-devices hw/arm: add dependency on OR_IRQ for XLNX_VERSAL meson: Introduce target-specific Kconfig meson: switch function tests from compilation to linking vl: fix leak of qdict_crumple return value target/i386: fix exceptions for MOV to DR target/i386: Added DR6 and DR7 consistency checks target/i386: Added MSRPM and IOPM size check monitor/tcg: move tcg hmp commands to accel/tcg, register them dynamically usb: build usb-host as module monitor/usb: register 'info usbhost' dynamically usb: drop usb_host_dev_is_scsi_storage hook monitor: allow register hmp commands accel: build tcg modular accel: add tcg module annotations accel: build qtest modular accel: add qtest module annotations ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-07-11Merge remote-tracking branch ↵Peter Maydell2-3/+1
'remotes/vivier2/tags/trivial-branch-for-6.1-pull-request' into staging Trivial patches pull request 20210709 # gpg: Signature made Fri 09 Jul 2021 21:26:52 BST # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier2/tags/trivial-branch-for-6.1-pull-request: util/guest-random: Fix size arg to tail memcpy migration: fix typo in mig_throttle_guest_down comment target/xtensa/xtensa-semi: Fix compilation problem on Haiku hw/virtio: Document *_should_notify() are called within rcu_read_lock() misc: Remove redundant new line in perror() virtiofsd: Add missing newline in error message misc: Fix "havn't" typo memory: Display MemoryRegion name in read/write ops trace events qemu-option: Drop dead assertion Signed-off-by: Peter Maydell <peter.maydell@linaro.org>