aboutsummaryrefslogtreecommitdiff
path: root/util
AgeCommit message (Collapse)AuthorFilesLines
2022-03-18util/osdep: Remove some early cruftAndrew Deason1-7/+0
The include for statvfs.h has not been needed since all statvfs calls were removed in commit 4a1418e07bdc ("Unbreak large mem support by removing kqemu"). The comment mentioning CONFIG_BSD hasn't made sense since an include for config-host.h was removed in commit aafd75841001 ("util: Clean up includes"). Remove this cruft. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Andrew Deason <adeason@sinenomine.net> Message-id: 20220316035227.3702-4-adeason@sinenomine.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-03-18util/osdep: Avoid madvise proto on modern SolarisAndrew Deason1-3/+0
On older Solaris releases (before Solaris 11), we didn't get a prototype for madvise, and so util/osdep.c provides its own prototype. Some time between the public Solaris 11.4 release and Solaris 11.4.42 CBE, we started getting an madvise prototype that looks like this: extern int madvise(void *, size_t, int); which conflicts with the prototype in util/osdeps.c. Instead of always declaring this prototype, check if we're missing the madvise() prototype, and only declare it ourselves if the prototype is missing. Move the prototype to include/qemu/osdep.h, the normal place to handle platform-specific header quirks. The 'missing_madvise_proto' meson check contains an obviously wrong prototype for madvise. So if that code compiles and links, we must be missing the actual prototype for madvise. Signed-off-by: Andrew Deason <adeason@sinenomine.net> Message-id: 20220316035227.3702-2-adeason@sinenomine.net Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-03-15util: add iova_tree_find_iovaEugenio Pérez1-0/+34
This function does the reverse operation of iova_tree_find: To look for a mapping that match a translated address so we can do the reverse. This have linear complexity instead of logarithmic, but it supports overlapping HVA. Future developments could reduce it. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15util: Add iova_tree_alloc_mapEugenio Pérez1-0/+136
This iova tree function allows it to look for a hole in allocated regions and return a totally new translation for a given translated address. It's usage is mainly to allow devices to access qemu address space, remapping guest's one into a new iova space where qemu can add chunks of addresses. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-08Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell1-0/+5
virtio,pc,pci: features, cleanups, fixes vhost-user enabled on non-linux systems beginning of nvme sriov support bigger tx queue for vdpa virtio iommu bypass FADT flag to detect legacy keyboards Fixes, cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon 07 Mar 2022 22:43:31 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (47 commits) hw/acpi/microvm: turn on 8042 bit in FADT boot architecture flags if present tests/acpi: i386: update FACP table differences hw/acpi: add indication for i8042 in IA-PC boot flags of the FADT table tests/acpi: i386: allow FACP acpi table changes docs: vhost-user: add subsection for non-Linux platforms configure, meson: allow enabling vhost-user on all POSIX systems vhost: use wfd on functions setting vring call fd event_notifier: add event_notifier_get_wfd() pci: drop COMPAT_PROP_PCP for 2.0 machine types hw/smbios: Add table 4 parameter, "processor-id" x86: cleanup unused compat_apic_id_mode vhost-vsock: detach the virqueue element in case of error pc: add option to disable PS/2 mouse/keyboard acpi: pcihp: pcie: set power on cap on parent slot pci: expose TYPE_XIO3130_DOWNSTREAM name pci: show id info when pci BDF conflict hw/misc/pvpanic: Use standard headers instead headers: Add pvpanic.h pci-bridge/xio3130_downstream: Fix error handling pci-bridge/xio3130_upstream: Fix error handling ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # docs/specs/index.rst
2022-03-08Merge remote-tracking branch ↵Peter Maydell6-81/+95
'remotes/pmaydell/tags/pull-target-arm-20220307' into staging target-arm queue: * cleanups of qemu_oom_check() and qemu_memalign() * target/arm/translate-neon: UNDEF if VLD1/VST1 stride bits are non-zero * target/arm/translate-neon: Simplify align field check for VLD3 * GICv3 ITS: add more trace events * GICv3 ITS: implement 8-byte accesses properly * GICv3: fix minor issues with some trace/log messages * ui/cocoa: Use the standard about panel * target/arm: Provide cpu property for controling FEAT_LPA2 * hw/arm/virt: Disable LPA2 for -machine virt-6.2 # gpg: Signature made Mon 07 Mar 2022 16:46:06 GMT # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate] # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20220307: hw/arm/virt: Disable LPA2 for -machine virt-6.2 target/arm: Provide cpu property for controling FEAT_LPA2 ui/cocoa: Use the standard about panel hw/intc/arm_gicv3_cpuif: Fix register names in ICV_HPPIR read trace event hw/intc/arm_gicv3: Fix missing spaces in error log messages hw/intc/arm_gicv3: Specify valid and impl in MemoryRegionOps hw/intc/arm_gicv3_its: Add trace events for table reads and writes hw/intc/arm_gicv3_its: Add trace events for commands target/arm/translate-neon: Simplify align field check for VLD3 target/arm/translate-neon: UNDEF if VLD1/VST1 stride bits are non-zero osdep: Move memalign-related functions to their own header util: Put qemu_vfree() in memalign.c util: Use meson checks for valloc() and memalign() presence util: Share qemu_try_memalign() implementation between POSIX and Windows meson.build: Don't misdetect posix_memalign() on Windows util: Return valid allocation for qemu_try_memalign() with zero size util: Unify implementations of qemu_memalign() util: Make qemu_oom_check() a static function Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-03-08Merge remote-tracking branch 'remotes/cschoenebeck/tags/pull-9p-20220307' ↵Peter Maydell1-21/+0
into staging 9pfs: introduce macOS host support and cleanup * Add support for Darwin (a.k.a. macOS) hosts. * Code cleanup (move qemu_dirent_dup() from osdep -> 9p-util). * API doc cleanup (convert Doxygen -> kerneldoc format). # gpg: Signature made Mon 07 Mar 2022 11:14:45 GMT # gpg: using RSA key 96D8D110CF7AF8084F88590134C2B58765A47395 # gpg: issuer "qemu_oss@crudebyte.com" # gpg: Good signature from "Christian Schoenebeck <qemu_oss@crudebyte.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: ECAB 1A45 4014 1413 BA38 4926 30DB 47C3 A012 D5F4 # Subkey fingerprint: 96D8 D110 CF7A F808 4F88 5901 34C2 B587 65A4 7395 * remotes/cschoenebeck/tags/pull-9p-20220307: fsdev/p9array.h: convert Doxygen -> kerneldoc format 9pfs/coth.h: drop Doxygen format on v9fs_co_run_in_worker() 9pfs/9p-util.h: convert Doxygen -> kerneldoc format 9pfs/9p.c: convert Doxygen -> kerneldoc format 9pfs/codir.c: convert Doxygen -> kerneldoc format 9pfs/9p.h: convert Doxygen -> kerneldoc format 9pfs: drop Doxygen format from qemu_dirent_dup() API comment 9pfs: move qemu_dirent_dup() from osdep -> 9p-util 9p: darwin: meson: Allow VirtFS on Darwin 9p: darwin: Adjust assumption on virtio-9p-test 9p: darwin: Implement compatibility for mknodat 9p: darwin: Compatibility for f/l*xattr 9p: darwin: *xattr_nofollow implementations 9p: darwin: Move XATTR_SIZE_MAX->P9_XATTR_SIZE_MAX 9p: darwin: Ignore O_{NOATIME, DIRECT} 9p: darwin: Handle struct dirent differences 9p: darwin: Handle struct stat(fs) differences 9p: Rename 9p-util -> 9p-util-linux 9p: linux: Fix a couple Linux assumptions Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-03-07osdep: Move memalign-related functions to their own headerPeter Maydell3-0/+3
Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org
2022-03-07util: Put qemu_vfree() in memalign.cPeter Maydell3-12/+11
qemu_vfree() is the companion free function to qemu_memalign(); put it in memalign.c so the allocation and free functions are together. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220226180723.1706285-9-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-03-07util: Use meson checks for valloc() and memalign() presencePeter Maydell1-2/+4
Instead of assuming that all CONFIG_BSD have valloc() and anything else is memalign(), explicitly check for those functions in meson.build and use the "is the function present" define. Tests for specific functionality are better than which-OS checks; this also lets us give a helpful error message if somehow there's no usable function present. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-8-peter.maydell@linaro.org
2022-03-07util: Share qemu_try_memalign() implementation between POSIX and WindowsPeter Maydell3-46/+39
The qemu_try_memalign() functions for POSIX and Windows used to be significantly different, but these days they are identical except for the actual allocation function called, and the POSIX version already has to have ifdeffery for different allocation functions. Move to a single implementation in memalign.c, which uses the Windows _aligned_malloc if we detect that function in meson. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220226180723.1706285-7-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-03-07util: Return valid allocation for qemu_try_memalign() with zero sizePeter Maydell2-1/+6
Currently qemu_try_memalign()'s behaviour if asked to allocate 0 bytes is rather variable: * on Windows, we will assert * on POSIX platforms, we get the underlying behaviour of the posix_memalign() or equivalent function, which may be either "return a valid non-NULL pointer" or "return NULL" Explictly check for 0 byte allocations, so we get consistent behaviour across platforms. We handle them by incrementing the size so that we return a valid non-NULL pointer that can later be passed to qemu_vfree(). This is permitted behaviour for the posix_memalign() API and is the most usual way that underlying malloc() etc implementations handle a zero-sized allocation request, because it won't trip up calling code that assumes NULL means an error. (This includes our own qemu_memalign(), which will abort on NULL.) This change is a preparation for sharing the qemu_try_memalign() code between Windows and POSIX. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-03-07util: Unify implementations of qemu_memalign()Peter Maydell4-28/+40
We implement qemu_memalign() in both oslib-posix.c and oslib-win32.c, but the two versions are essentially the same: they call qemu_try_memalign(), and abort() after printing an error message if it fails. The only difference is that the win32 version prints the GetLastError() value whereas the POSIX version prints strerror(errno). However, this is a bug in the win32 version: in commit dfbd0b873a85021 in 2020 we changed the implementation of qemu_try_memalign() from using VirtualAlloc() (which sets the GetLastError() value) to using _aligned_malloc() (which sets errno), but didn't update the error message to match. Replace the two separate functions with a single version in a new memalign.c file, which drops the unnecessary extra qemu_oom_check() function and instead prints a more useful message including the requested size and alignment as well as the errno string. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220226180723.1706285-4-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-03-07util: Make qemu_oom_check() a static functionPeter Maydell2-2/+2
The qemu_oom_check() function, which we define in both oslib-posix.c and oslib-win32.c, is now used only locally in that file; make it static. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-3-peter.maydell@linaro.org
2022-03-079pfs: move qemu_dirent_dup() from osdep -> 9p-utilChristian Schoenebeck1-21/+0
Function qemu_dirent_dup() is currently only used by 9pfs server, so move it from project global header osdep.h to 9pfs specific header 9p-util.h. Link: https://lore.kernel.org/qemu-devel/CAFEAcA_=HAUNomKD2wurSVaAHa5mrk22A1oHKLWUDjk7v6Khmg@mail.gmail.com/ Based-on: <20220227223522.91937-12-wwcohen@gmail.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <E1nP9Oz-00043L-KJ@lizzy.crudebyte.com>
2022-03-07block/dirty-bitmap: introduce bdrv_dirty_bitmap_status()Vladimir Sementsov-Ogievskiy1-0/+33
Add a convenient function similar with bdrv_block_status() to get status of dirty bitmap. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220303194349.2304213-9-vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-06event_notifier: add event_notifier_get_wfd()Sergio Lopez1-0/+5
event_notifier_get_fd(const EventNotifier *e) always returns EventNotifier's read file descriptor (rfd). This is not a problem when the EventNotifier is backed by a an eventfd, as a single file descriptor is used both for reading and triggering events (rfd == wfd). But, when EventNotifier is backed by a pipe pair, we have two file descriptors, one that can only be used for reads (rfd), and the other only for writes (wfd). There's, at least, one known situation in which we need to obtain wfd instead of rfd, which is when setting up the file that's going to be sent to the peer in vhost's SET_VRING_CALL. Add a new event_notifier_get_wfd(const EventNotifier *e) that can be used to obtain wfd where needed. Signed-off-by: Sergio Lopez <slp@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220304100854.14829-2-slp@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04rcu: use coroutine TLS macrosStefan Hajnoczi1-5/+5
RCU may be used from coroutines. Standard __thread variables cannot be used by coroutines. Use the coroutine TLS macros instead. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220222140150.27240-4-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04util/async: replace __thread with QEMU TLS macrosStefan Hajnoczi1-5/+7
QEMU TLS macros must be used to make TLS variables safe with coroutines. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220222140150.27240-3-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-02-28keyval: Fix grammar comment to cover downstream prefixMarkus Armbruster1-1/+3
According to the grammar, a key __com.redhat_foo would be parsed as two key fragments __com and redhat_foo. It's actually parsed as a single fragment. Fix the grammar. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20220218145551.892787-2-armbru@redhat.com>
2022-02-21Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into ↵Peter Maydell1-1/+3
staging * More Meson conversions (0.59.x now required rather than suggested) * UMIP support for TCG x86 * Fix migration crash * Restore error output for check-block # gpg: Signature made Mon 21 Feb 2022 09:35:59 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: (29 commits) configure, meson: move CONFIG_IASL to a Meson option meson, configure: move ntddscsi API check to meson meson: require dynamic linking for VSS support qga/vss-win32: require widl/midl, remove pre-built TLB file meson: do not make qga/vss-win32/meson.build conditional on C++ presence configure, meson: replace VSS SDK checks and options with --enable-vss-sdk qga/vss: use standard windows headers location qga/vss-win32: use widl if available meson: drop --with-win-sdk qga/vss-win32: fix midl arguments meson: refine check for whether to look for virglrenderer configure, meson: move guest-agent, tools to meson configure, meson: move smbd options to meson_options.txt configure, meson: move coroutine options to meson_options.txt configure, meson: move some default-disabled options to meson_options.txt meson: define qemu_cflags/qemu_ldflags configure, meson: move block layer options to meson_options.txt configure, meson: move image format options to meson_options.txt configure, meson: cleanup qemu-ga libraries configure, meson: move TPM check to meson ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-21include: Move hardware version declarations to new qemu/hw-version.hPeter Maydell1-0/+1
The "hardware version" machinery (qemu_set_hw_version(), qemu_hw_version(), and the QEMU_HW_VERSION define) is used by fewer than 10 files. Move it out from osdep.h into a new qemu/hw-version.h. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-6-peter.maydell@linaro.org
2022-02-21include: Move qemu_[id]cache_* declarations to new qemu/cacheinfo.hPeter Maydell3-0/+3
The qemu_icache_linesize, qemu_icache_linesize_log, qemu_dcache_linesize, and qemu_dcache_linesize_log variables are not used in many files. Move them out of osdep.h to a new qemu/cacheinfo.h, and document them. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-5-peter.maydell@linaro.org
2022-02-21include: Move qemu_mprotect_*() to new qemu/mprotect.hPeter Maydell1-0/+1
The qemu_mprotect_*() family of functions are used in very few files; move them from osdep.h to a new qemu/mprotect.h. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-3-peter.maydell@linaro.org
2022-02-21include: Move qemu_madvise() and related #defines to new qemu/madvise.hPeter Maydell2-0/+2
The function qemu_madvise() and the QEMU_MADV_* constants associated with it are used in only 10 files. Move them out of osdep.h to a new qemu/madvise.h header that is included where it is needed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
2022-02-179pfs: Fix segfault in do_readdir_many caused by struct dirent overreadVitaly Chikunov1-0/+21
`struct dirent' returned from readdir(3) could be shorter (or longer) than `sizeof(struct dirent)', thus memcpy of sizeof length will overread into unallocated page causing SIGSEGV. Example stack trace: #0 0x00005555559ebeed v9fs_co_readdir_many (/usr/bin/qemu-system-x86_64 + 0x497eed) #1 0x00005555559ec2e9 v9fs_readdir (/usr/bin/qemu-system-x86_64 + 0x4982e9) #2 0x0000555555eb7983 coroutine_trampoline (/usr/bin/qemu-system-x86_64 + 0x963983) #3 0x00007ffff73e0be0 n/a (n/a + 0x0) While fixing this, provide a helper for any future `struct dirent' cloning. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/841 Cc: qemu-stable@nongnu.org Co-authored-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Vitaly Chikunov <vt@altlinux.org> Tested-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Acked-by: Greg Kurz <groug@kaod.org> Tested-by: Vitaly Chikunov <vt@altlinux.org> Message-Id: <20220216181821.3481527-1-vt@altlinux.org> [C.S. - Fix typo in source comment. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-02-16configure, meson: move membarrier test to mesonPaolo Bonzini1-1/+3
The test is a bit different from the others, in that it does not run if $membarrier is empty. For meson, the default can simply be disabled; if one day we will toggle the default, no change is needed in meson.build. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-14util: adjust coroutine pool size to virtio block queueHiroki Narukawa1-4/+16
Coroutine pool size was 64 from long ago, and the basis was organized in the commit message in 4d68e86b. At that time, virtio-blk queue-size and num-queue were not configuable, and equivalent values were 128 and 1. Coroutine pool size 64 was fine then. Later queue-size and num-queue got configuable, and default values were increased. Coroutine pool with size 64 exhausts frequently with random disk IO in new size, and slows down. This commit adjusts coroutine pool size adaptively with new values. This commit adds 64 by default, but now coroutine is not only for block devices, and is not too much burdon comparing with new default. pool size of 128 * vCPUs. Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> Message-id: 20220214115302.13294-2-hnarukaw@yahoo-corp.jp Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-02-06util/oslib-posix: Fix missing unlock in the error path of os_mem_prealloc()David Hildenbrand1-0/+1
We're missing an unlock in case installing the signal handler failed. Fortunately, we barely see this error in real life. Fixes: a960d6642d39 ("util/oslib-posix: Support concurrent os_mem_prealloc() invocation") Fixes: CID 1468941 Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta@ionos.com> Cc: Daniel P. Berrangé <berrange@redhat.com> Cc: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20220111120830.119912-1-david@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-02-04cpuid: use unsigned for max cpuidMichael S. Tsirkin1-1/+1
__get_cpuid_max returns an unsigned value. For consistency, store the result in an unsigned variable. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-02-01block/export: Fix vhost-user-blk shutdown with requests in flightKevin Wolf1-0/+22
The vhost-user-blk export runs requests asynchronously in their own coroutine. When the vhost connection goes away and we want to stop the vhost-user server, we need to wait for these coroutines to stop before we can unmap the shared memory. Otherwise, they would still access the unmapped memory and crash. This introduces a refcount to VuServer which is increased when spawning a new request coroutine and decreased before the coroutine exits. The memory is only unmapped when the refcount reaches zero. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220125151435.48792-1-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-01-14Merge remote-tracking branch ↵Peter Maydell7-34/+90
'remotes/stefanha-gitlab/tags/block-pull-request' into staging Pull request # gpg: Signature made Wed 12 Jan 2022 17:13:54 GMT # gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full] # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full] # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha-gitlab/tags/block-pull-request: virtio: unify dataplane and non-dataplane ->handle_output() virtio: use ->handle_output() instead of ->handle_aio_output() virtio-scsi: prepare virtio_scsi_handle_cmd for dataplane virtio-blk: drop unused virtio_blk_handle_vq() return value virtio: get rid of VirtIOHandleAIOOutput aio-posix: split poll check from ready handler Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-12aio-posix: split poll check from ready handlerStefan Hajnoczi7-34/+90
Adaptive polling measures the execution time of the polling check plus handlers called when a polled event becomes ready. Handlers can take a significant amount of time, making it look like polling was running for a long time when in fact the event handler was running for a long time. For example, on Linux the io_submit(2) syscall invoked when a virtio-blk device's virtqueue becomes ready can take 10s of microseconds. This can exceed the default polling interval (32 microseconds) and cause adaptive polling to stop polling. By excluding the handler's execution time from the polling check we make the adaptive polling calculation more accurate. As a result, the event loop now stays in polling mode where previously it would have fallen back to file descriptor monitoring. The following data was collected with virtio-blk num-queues=2 event_idx=off using an IOThread. Before: 168k IOPS, IOThread syscalls: 9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0) = 16 9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8) = 8 9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8) = 8 9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3 9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0) = 32 174k IOPS (+3.6%), IOThread syscalls: 9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0) = 32 9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8) = 8 9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8) = 8 9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50) = 32 Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because the IOThread stays in polling mode instead of falling back to file descriptor monitoring. As usual, polling is not implemented on Windows so this patch ignores the new io_poll_read() callback in aio-win32.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-2-stefanha@redhat.com [Fixed up aio_set_event_notifier() calls in tests/unit/test-fdmon-epoll.c added after this series was queued. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12meson: reenable filemonitor-inotify compilationVolker Rümelin1-2/+5
Reenable util/filemonitor-inotify compilation. Compilation was disabled when commit a620fbe9ac ("configure: convert compiler tests to meson, part 5") moved CONFIG_INOTIFY1 from config-host.mak to config-host.h. This fixes the usb-mtp device and reenables test-util-filemonitor. Fixes: a620fbe9ac ("configure: convert compiler tests to meson, part 5") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/800 Signed-off-by: Volker Rümelin <vr_qemu@t-online.de> Message-Id: <20220107133514.7785-1-vr_qemu@t-online.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-08qemu/int128: addition of div/rem 128-bit operationsFrédéric Pétrot2-0/+148
Addition of div and rem on 128-bit integers, using the 128/64->128 divu and 64x64->128 mulu in host-utils. These operations will be used within div/rem helpers in the 128-bit riscv target. Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20220106210108.138226-4-frederic.petrot@univ-grenoble-alpes.fr Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-01-07util/oslib-posix: Forward SIGBUS to MCE handler under LinuxDavid Hildenbrand1-3/+34
Temporarily modifying the SIGBUS handler is really nasty, as we might be unlucky and receive an MCE SIGBUS while having our handler registered. Unfortunately, there is no way around messing with SIGBUS when MADV_POPULATE_WRITE is not applicable or not around. Let's forward SIGBUS that don't belong to us to the already registered handler and document the situation. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-8-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Support concurrent os_mem_prealloc() invocationDavid Hildenbrand1-0/+9
Add a mutex to protect the SIGBUS case, as we cannot mess concurrently with the sigbus handler and we have to manage the global variable sigbus_memset_context. The MADV_POPULATE_WRITE path can run concurrently. Note that page_mutex and page_cond are shared between concurrent invocations, which shouldn't be a problem. This is a preparation for future virtio-mem prealloc code, which will call os_mem_prealloc() asynchronously from an iothread when handling guest requests. Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-7-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Avoid creating a single thread with MADV_POPULATE_WRITEDavid Hildenbrand1-0/+8
Let's simplify the case when we only want a single thread and don't have to mess with signal handlers. Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-6-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Don't create too many threads with small memory or little ↵David Hildenbrand1-2/+10
pages Let's limit the number of threads to something sane, especially that - We don't have more threads than the number of pages we have - We don't have threads that initialize small (< 64 MiB) memory Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-5-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Introduce and use MemsetContext for touch_all_pages()David Hildenbrand1-26/+47
Let's minimize the number of global variables to prepare for os_mem_prealloc() getting called concurrently and make the code a bit easier to read. The only consumer that really needs a global variable is the sigbus handler, which will require protection via a mutex in the future either way as we cannot concurrently mess with the SIGBUS handler. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-4-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc()David Hildenbrand1-21/+62
Let's sense support and use it for preallocation. MADV_POPULATE_WRITE does not require a SIGBUS handler, doesn't actually touch page content, and avoids context switches; it is, therefore, faster and easier to handle than our current approach. While MADV_POPULATE_WRITE is, in general, faster than manual prefaulting, and especially faster with 4k pages, there is still value in prefaulting using multiple threads to speed up preallocation. More details on MADV_POPULATE_WRITE can be found in the Linux commits 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1]. This resolves the TODO in do_touch_pages(). In the future, we might want to look into using fallocate(), eventually combined with MADV_POPULATE_READ, when dealing with shared file/fd mappings and not caring about memory bindings. [1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-3-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07util/oslib-posix: Let touch_all_pages() return an errorDavid Hildenbrand1-12/+16
Let's prepare touch_all_pages() for returning differing errors. Return an error from the thread and report the last processed error. Translate SIGBUS to -EFAULT, as a SIGBUS can mean all different kind of things (memory error, read error, out of memory). When allocating memory fails via the current SIGBUS-based mechanism, we'll get: os_mem_prealloc: preallocating memory failed: Bad address Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-2-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-16transactions: Invoke clean() after everything elseHanna Reitz1-2/+6
Invoke the transaction drivers' .clean() methods only after all .commit() or .abort() handlers are done. This makes it easier to have nested transactions where the top-level transactions pass objects to lower transactions that the latter can still use throughout their commit/abort phases, while the top-level transaction keeps a reference that is released in its .clean() method. (Before this commit, that is also possible, but the top-level transaction would need to take care to invoke tran_add() before the lower-level transaction does. This commit makes the ordering irrelevant, which is just a bit nicer.) Signed-off-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20211111120829.81329-8-hreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20211115145409.176785-8-kwolf@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-11-10rcu: Introduce force_rcu notifierGreg Kurz1-0/+19
The drain_rcu_call() function can be blocked as long as an RCU reader stays in a read-side critical section. This is typically what happens when a TCG vCPU is executing a busy loop. It can deadlock the QEMU monitor as reported in https://gitlab.com/qemu-project/qemu/-/issues/650 . This can be avoided by allowing drain_rcu_call() to enforce an RCU grace period. Since each reader might need to do specific actions to end a read-side critical section, do it with notifiers. Prepare ground for this by adding a notifier list to the RCU reader struct and use it in wait_for_readers() if drain_rcu_call() is in progress. An API is added for readers to register their notifiers. This is largely based on a draft from Paolo Bonzini. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211109183523.47726-2-groug@kaod.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-02util: Make some iova_tree parameters constEugenio Pérez1-6/+6
As qemu guidelines: Unless a pointer is used to modify the pointed-to storage, give it the "const" attribute. In the particular case of iova_tree_find it allows to enforce what is requested by its comment, since the compiler would shout in case of modifying or freeing the const-qualified returned pointer. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211013182713.888753-2-eperezma@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-27host-utils: add 128-bit quotient support to divu128/divs128Luis Pires1-41/+86
These will be used to implement new decimal floating point instructions from Power ISA 3.1. The remainder is now returned directly by divu128/divs128, freeing up phigh to receive the high 64 bits of the quotient. Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211025191154.350831-4-luis.pires@eldorado.org.br> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-27host-utils: move checks out of divu128/divs128Luis Pires1-22/+18
In preparation for changing the divu128/divs128 implementations to allow for quotients larger than 64 bits, move the div-by-zero and overflow checks to the callers. Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211025191154.350831-2-luis.pires@eldorado.org.br> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-15qemu-option: Allow deleting opts during qemu_opts_foreach()Kevin Wolf1-2/+2
Use QTAILQ_FOREACH_SAFE() so that the current QemuOpts can be deleted while iterating through the whole list. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211008133442.141332-11-kwolf@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-14configure, meson: move more compiler checks to MesonPaolo Bonzini1-1/+3
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20211007130829.632254-15-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-14configure, meson: move pthread_setname_np checks to MesonPaolo Bonzini1-3/+2
This makes the pthreads check dead in configure, so remove it as well. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20211007130829.632254-9-pbonzini@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>