path: root/util
Age | Commit message | Author | Files | Lines
2022-10-27 | util: Introduce ThreadContext user-creatable object | David Hildenbrand | 3 | -0/+280
Setting the CPU affinity of QEMU threads is a bit problematic, because QEMU doesn't always have permissions to set the CPU affinity itself, for example, once seccomp has been initialized by QEMU with:

    -sandbox enable=on,resourcecontrol=deny

General information about CPU affinities can be found in the man page of taskset:

    CPU affinity is a scheduler property that "bonds" a process to a given set of CPUs on the system. The Linux scheduler will honor the given CPU affinity and the process will not run on any other CPUs.

While upper layers are already aware of how to handle CPU affinities for long-lived threads like iothreads or vcpu threads, short-lived threads, as used for memory-backend preallocation, are more involved to handle. These threads are created on demand and upper layers are not even able to identify and configure them.

Introduce the concept of a ThreadContext, that is essentially a thread used for creating new threads. All threads created via that context thread inherit the configured CPU affinity. Consequently, it's sufficient to create a ThreadContext and configure it once, and have all threads created via that ThreadContext inherit the same CPU affinity.

The CPU affinity of a ThreadContext can be configured two ways:

(1) Obtaining the thread id via the "thread-id" property and setting the CPU affinity manually (e.g., via taskset).

(2) Setting the "cpu-affinity" property and letting QEMU try to set the CPU affinity itself. This will fail if QEMU doesn't have permissions to do so anymore after seccomp was initialized.

A simple QEMU example to set the CPU affinity to host CPU 0,1,6,7 would be:

    qemu-system-x86_64 -S \
      -object thread-context,id=tc1,cpu-affinity=0-1,cpu-affinity=6-7

And we can query it via HMP/QMP:

    (qemu) qom-get tc1 cpu-affinity
    [
        0,
        1,
        6,
        7
    ]

But note that due to dynamic library loading this example will not work before we actually make use of thread_context_create_thread() in QEMU code, because the type will otherwise not get registered. We'll wire this up next to make it work.

In general, the interface behaves like pthread_setaffinity_np(): host CPU numbers that are currently not available are ignored; only host CPU numbers that are impossible with the current kernel will fail. If the list of host CPU numbers does not include a single CPU that is available, setting the CPU affinity will fail.

A ThreadContext can be reused, simply by reconfiguring the CPU affinity. Note that the CPU affinity of previously created threads will not get adjusted.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221014134720.168738-4-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
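The inheritance that a ThreadContext relies on can be observed outside QEMU. The following is a minimal sketch for Linux with glibc, not QEMU code: the helper names `record_affinity` and `child_inherits_affinity` are made up for illustration. It pins the calling thread to one CPU, creates a worker, and checks that the worker starts with the same mask.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Worker body: records the CPU affinity mask it was started with. */
static void *record_affinity(void *opaque)
{
    cpu_set_t *out = opaque;

    assert(out != NULL);
    CPU_ZERO(out);
    pthread_getaffinity_np(pthread_self(), sizeof(*out), out);
    return NULL;
}

/*
 * Hypothetical helper (not a QEMU function): pin the calling thread to
 * the first CPU it may currently run on, create a worker, and compare
 * the worker's mask with the pinned mask.
 *
 * Returns 1 if the worker inherited the mask, 0 if it did not, and -1
 * if the affinity could not be queried or changed (for example because
 * a seccomp filter blocks sched_setaffinity()).
 */
static int child_inherits_affinity(void)
{
    cpu_set_t allowed, pinned, child;
    pthread_t worker;
    int cpu = 0;
    int ret = -1;

    if (pthread_getaffinity_np(pthread_self(), sizeof(allowed), &allowed)) {
        return -1;
    }
    while (cpu < CPU_SETSIZE && !CPU_ISSET(cpu, &allowed)) {
        cpu++;
    }
    if (cpu == CPU_SETSIZE) {
        return -1;
    }

    CPU_ZERO(&pinned);
    CPU_SET(cpu, &pinned);
    if (pthread_setaffinity_np(pthread_self(), sizeof(pinned), &pinned)) {
        return -1;
    }

    CPU_ZERO(&child);
    if (pthread_create(&worker, NULL, record_affinity, &child) == 0) {
        pthread_join(worker, NULL);
        ret = CPU_EQUAL(&pinned, &child) ? 1 : 0;
    }

    /* Restore the original mask so the caller is left unchanged. */
    pthread_setaffinity_np(pthread_self(), sizeof(allowed), &allowed);
    return ret;
}
```

A ThreadContext applies the same idea with a dedicated, long-lived thread doing the creating, so the thread that requests a new worker does not have to change its own affinity.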
2022-10-27 | util: Introduce qemu_thread_set_affinity() and qemu_thread_get_affinity() | David Hildenbrand | 2 | -0/+82
Usually, we let upper layers handle CPU pinning, because pthread_setaffinity_np() (-> sched_setaffinity()) is blocked via seccomp when starting QEMU with

    -sandbox enable=on,resourcecontrol=deny

However, we want to configure and observe the CPU affinity of threads from QEMU directly in some cases when the sandbox option is either not enabled or not active yet.

So let's add a way to configure CPU pinning via qemu_thread_set_affinity() and obtain CPU affinity via qemu_thread_get_affinity() and implement them under POSIX using pthread_setaffinity_np() + pthread_getaffinity_np().

Implementation under Windows is possible using SetProcessAffinityMask() + GetProcessAffinityMask(); however, that is left as future work.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <20221014134720.168738-3-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2022-10-27 | util: Cleanup and rename os_mem_prealloc() | David Hildenbrand | 2 | -16/+16
Let's
* give the function a "qemu_*" style name
* make sure the parameters in the implementation match the prototype
* rename smp_cpus to max_threads, which makes the semantics of that parameter clearer

... and add function documentation.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <20221014134720.168738-2-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2022-10-26 | numa: call ->ram_block_removed() in ram_block_notifier_remove() | Stefan Hajnoczi | 1 | -1/+4
When a RAMBlockNotifier is added, ->ram_block_added() is called with all existing RAMBlocks. There is no equivalent ->ram_block_removed() call when a RAMBlockNotifier is removed.

The util/vfio-helpers.c code (the sole user of RAMBlockNotifier) is fine with this asymmetry because it does not rely on RAMBlockNotifier for cleanup. It walks its internal list of DMA mappings and unmaps them by itself.

Future users of RAMBlockNotifier may not have an internal data structure that records added RAMBlocks so they will need ->ram_block_removed() callbacks.

This patch makes ram_block_notifier_remove() symmetric with respect to callbacks. Now util/vfio-helpers.c needs to unmap remaining DMA mappings after ram_block_notifier_remove() has been called. This is necessary since users like block/nvme.c may create additional DMA mappings that do not originate from the RAMBlockNotifier.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20221013185908.1297568-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26 | coroutine: add flag to re-queue at front of CoQueue | Stefan Hajnoczi | 1 | -2/+7
When a coroutine wakes up it may determine that it must re-queue. Normally coroutines are pushed onto the back of the CoQueue, but for fairness it may be necessary to push it onto the front of the CoQueue. Add a flag to specify that the coroutine should be pushed onto the front of the CoQueue. A later patch will use this to ensure fairness in the bounce buffer CoQueue used by the blkio BlockDriver. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20221013185908.1297568-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
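The front-or-back choice described above can be sketched with a plain linked queue. This is an illustration only: `Waiter`, `WaitQueue`, and the `WAIT_QUEUE_ADD_*` flags are hypothetical stand-ins, not QEMU's coroutine types or flag names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a coroutine and the queue it waits on. */
typedef struct Waiter {
    int id;
    struct Waiter *next;
} Waiter;

typedef struct WaitQueue {
    Waiter *head;
    Waiter *tail;
} WaitQueue;

typedef enum {
    WAIT_QUEUE_ADD_BACK = 0,   /* default: plain FIFO order */
    WAIT_QUEUE_ADD_FRONT = 1,  /* re-queue ahead of all other waiters */
} WaitQueueFlags;

/* Enqueue w at the back, or at the front when the flag asks for it. */
static void wait_queue_add(WaitQueue *q, Waiter *w, WaitQueueFlags flags)
{
    assert(q != NULL && w != NULL);

    if (flags & WAIT_QUEUE_ADD_FRONT) {
        w->next = q->head;
        q->head = w;
        if (q->tail == NULL) {
            q->tail = w;
        }
    } else {
        w->next = NULL;
        if (q->tail != NULL) {
            q->tail->next = w;
        } else {
            q->head = w;
        }
        q->tail = w;
    }
}

/* Remove and return the waiter at the front, or NULL if the queue is empty. */
static Waiter *wait_queue_pop(WaitQueue *q)
{
    Waiter *w = q->head;

    if (w != NULL) {
        q->head = w->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        w->next = NULL;
    }
    return w;
}
```

A waiter that is woken, finds it still cannot proceed, and re-queues with the front flag keeps its place in line instead of moving behind waiters that arrived later.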
2022-10-26 | util/qemu-sockets: Use g_get_tmp_dir() to get the directory for temporary files | Bin Meng | 1 | -3/+2
Replace the existing logic to get the directory for temporary files with g_get_tmp_dir(), which works for win32 too. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-13 | Merge tag 'win32-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging | Stefan Hajnoczi | 3 | -7/+68
win32-related misc patches

# gpg: Signature made Wed 12 Oct 2022 11:57:03 EDT
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5

* tag 'win32-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
  tests/unit: make test-io-channel-command work on win32
  io/command: implement support for win32
  io/command: use glib GSpawn, instead of open-coding fork/exec
  tests/channel-helper: set blocking in main thread
  util: make do_send_recv work with partial send/recv
  osdep: make readv_writev() work with partial read/write
  win32: set threads name

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-12 | util: make do_send_recv work with partial send/recv | Marc-André Lureau | 1 | -2/+8
According to the MSDN documentation and the Linux man pages, send() should try to send as much as possible in blocking mode, while recv() may return earlier with a smaller available amount. In either case, we should try to continue the send/recv from there. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20221006113657.2656108-3-marcandre.lureau@redhat.com>
2022-10-12 | osdep: make readv_writev() work with partial read/write | Marc-André Lureau | 1 | -3/+8
With a pipe, or for other reasons, read/write may return fewer bytes than requested. This happens with the test-io-channel-command test on Windows: the glib spawn code uses a binary pipe of 4096 bytes, and the first read returns that much (although more is requested), for some unclear reason... Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20221006113657.2656108-2-marcandre.lureau@redhat.com>
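The usual way to cope with short reads is to loop until the request is satisfied or the stream ends. The sketch below shows that pattern for POSIX read(); `read_full` is a hypothetical helper for illustration and is not the readv_writev() code from this commit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from fd, continuing after
 * short reads. Returns the number of bytes read (less than len only at
 * end of file), or -1 on error with errno set by read().
 */
static ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted before any data: retry */
            }
            return -1;
        }
        if (n == 0) {
            break;              /* end of file */
        }
        done += (size_t)n;      /* short read: keep going from here */
    }
    return (ssize_t)done;
}
```

The same loop shape applies to write() and to send()/recv(), with the caveat that a non-blocking descriptor additionally needs EAGAIN handling, which this sketch leaves out.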
2022-10-12 | win32: set threads name | Marc-André Lureau | 1 | -2/+52
As described in: https://learn.microsoft.com/en-us/visualstudio/debugger/how-to-set-a-thread-name-in-native-code?view=vs-2022 SetThreadDescription() is available since Windows 10, version 1607 and in some versions only by "Run Time Dynamic Linking". Its declaration is not yet in mingw, so we look up the function the same way glib does. Tested with Visual Studio Community 2022 debugger. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Acked-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-07 | coroutine-lock: add missing coroutine_fn annotations | Paolo Bonzini | 1 | -7/+7
Callers of coroutine_fn must be coroutine_fn themselves, or the call must be within "if (qemu_in_coroutine())". Apply coroutine_fn to functions where this holds. Reviewed-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220922084924.201610-23-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-07 | coroutine: remove incorrect coroutine_fn annotations | Paolo Bonzini | 1 | -1/+1
qemu_coroutine_get_aio_context inspects a coroutine, but it does not have to be called from the coroutine itself (or from any coroutine). Reviewed-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220922084924.201610-6-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-09-29 | oslib-posix: Introduce qemu_socketpair() | Guoyi Tu | 1 | -0/+19
qemu_socketpair() will create a pair of connected sockets with FD_CLOEXEC set. Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <17fa1eff729eeabd9a001f4639abccb127ceec81.1661240709.git.tugy@chinatelecom.cn>
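One way to obtain a connected pair with FD_CLOEXEC set on both ends is sketched below. It is an assumption about how such a helper could look, not a copy of qemu_socketpair(): the names `set_cloexec` and `socketpair_cloexec` are hypothetical, and the fallback path is only needed on systems without SOCK_CLOEXEC.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Set FD_CLOEXEC on fd; returns 0 on success, -1 on error. */
static int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: like socketpair(), but both descriptors are
 * closed automatically across exec(). Returns 0 on success, -1 on error.
 */
static int socketpair_cloexec(int domain, int type, int protocol, int sv[2])
{
    assert(sv != NULL);

#ifdef SOCK_CLOEXEC
    /* Preferred path: the kernel sets the flag atomically at creation. */
    if (socketpair(domain, type | SOCK_CLOEXEC, protocol, sv) == 0) {
        return 0;
    }
    if (errno != EINVAL) {
        return -1;
    }
#endif
    /* Fallback: create the pair first, then flag each end. */
    if (socketpair(domain, type, protocol, sv) < 0) {
        return -1;
    }
    if (set_cloexec(sv[0]) < 0 || set_cloexec(sv[1]) < 0) {
        int saved = errno;

        close(sv[0]);
        close(sv[1]);
        errno = saved;
        return -1;
    }
    return 0;
}
```

The atomic path matters in multi-threaded programs: with the fallback, another thread could fork and exec between socketpair() and fcntl() and leak the descriptors into the child.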
2022-09-26 | s390x/s390-virtio-ccw: add zpcii-disable machine property | Matthew Rosato | 1 | -0/+4
The zpcii-disable machine property can be used to force-disable the use of zPCI interpretation facilities for a VM. By default, this setting will be off for machine 7.2 and newer. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Message-Id: <20220902172737.170349-9-mjrosato@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> [thuth: Fix contextual conflict in ccw_machine_7_1_instance_options()] Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-09-02 | Merge tag 'net-pull-request' of https://github.com/jasowang/qemu into staging | Stefan Hajnoczi | 1 | -2/+2
# gpg: Signature made Fri 02 Sep 2022 02:30:35 EDT
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [full]
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211

* tag 'net-pull-request' of https://github.com/jasowang/qemu: (21 commits)
  net: tulip: Restrict DMA engine to memories
  net/colo.c: Fix the pointer issue reported by Coverity.
  vdpa: Delete CVQ migration blocker
  vdpa: Add virtio-net mac address via CVQ at start
  vhost_net: add NetClientState->load() callback
  vdpa: extract vhost_vdpa_net_cvq_add from vhost_vdpa_net_handle_ctrl_avail
  vdpa: Move command buffers map to start of net device
  vdpa: add net_vhost_vdpa_cvq_info NetClientInfo
  vhost_net: Add NetClientInfo stop callback
  vhost_net: Add NetClientInfo start callback
  vhost: Do not depend on !NULL VirtQueueElement on vhost_svq_flush
  vhost: Delete useless read memory barrier
  vhost: use SVQ element ndescs instead of opaque data for desc validation
  vhost: stop transfer elem ownership in vhost_handle_guest_kick
  vdpa: Use ring hwaddr at vhost_vdpa_svq_unmap_ring
  vhost: Always store new kick fd on vhost_svq_set_svq_kick_fd
  vdpa: Make SVQ vring unmapping return void
  vdpa: Remove SVQ vring from iova_tree at shutdown
  util: accept iova_tree_remove_parameter by value
  vdpa: do not save failed dma maps in SVQ iova tree
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-09-02 | Merge tag 'char-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging | Stefan Hajnoczi | 1 | -25/+0
chardev patches & small audio fix

# gpg: Signature made Fri 02 Sep 2022 09:13:26 EDT
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5

* tag 'char-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
  audio: exit(1) if audio backend failed to be found or initialized
  tests/unit: Update test-io-channel-socket.c for Windows
  chardev/char-socket: Update AF_UNIX for Windows
  util/qemu-sockets: Enable unix socket support on Windows

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-09-02 | util/qemu-sockets: Enable unix socket support on Windows | Bin Meng | 1 | -25/+0
Support for the unix socket has existed both in BSD and Linux for the longest time, but not on Windows. Since Windows 10 build 17063 [1], the native support for the unix socket has come to Windows. Starting with this build, two Win32 processes can use the AF_UNIX address family over the Winsock API to communicate with each other. [1] https://devblogs.microsoft.com/commandline/af_unix-comes-to-windows/ Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com> Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220802075200.907360-3-bmeng.cn@gmail.com>
2022-09-02 | util: accept iova_tree_remove_parameter by value | Eugenio Pérez | 1 | -2/+2
It's convenient to call iova_tree_remove from a map returned from iova_tree_find or iova_tree_find_iova. With the current code this is not possible, since we will free it, and then we will try to search for it again. Fix it by accepting the map by value, forcing a copy of the argument. Not applying a Fixes tag, since there is no use like that at the moment. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-08-26 | util/mmap-alloc: Remove qemu_mempath_getpagesize() | Thomas Huth | 1 | -31/+0
The last user of this function has just been removed, so we can drop this function now, too. Message-Id: <20220810125720.3849835-4-thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-08-12 | cutils: Add missing dyld(3) include on macOS | Philippe Mathieu-Daudé | 2 | -4/+4
Commit 06680b15b4 moved qemu_*_exec_dir() to cutils but forgot to move the macOS dyld(3) include, resulting in the following error (when building with Homebrew GCC on macOS Monterey 12.4):

    [313/1197] Compiling C object libqemuutil.a.p/util_cutils.c.o
    FAILED: libqemuutil.a.p/util_cutils.c.o
    ../../util/cutils.c:1039:13: error: implicit declaration of function '_NSGetExecutablePath' [-Werror=implicit-function-declaration]
     1039 | if (_NSGetExecutablePath(fpath, &len) == 0) {
          | ^~~~~~~~~~~~~~~~~~~~
    ../../util/cutils.c:1039:13: error: nested extern declaration of '_NSGetExecutablePath' [-Werror=nested-externs]

Fix by moving the include line to cutils.

Fixes: 06680b15b4 ("include: move qemu_*_exec_dir() to cutils")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220809222046.30812-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-08-05 | util/qemu-sockets: Replace the call to close a socket with closesocket() | Bin Meng | 1 | -2/+2
close() is a *nix function. It works on any file descriptor, and sockets in *nix are an example of a file descriptor.

closesocket() is a Windows-specific function, which works only with sockets. Sockets on Windows do not use *nix-style file descriptors, and socket() returns a handle to a kernel object instead, so it must be closed with closesocket().

In QEMU there is already logic to handle this platform difference in os-posix.h and os-win32.h:
* closesocket maps to close on POSIX
* closesocket maps to a wrapper that calls the real closesocket() on Windows

Replace the call to close a socket with closesocket() instead.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-07-18 | util: Fix broken build on Haiku | Thomas Huth | 2 | -4/+4
A recent commit moved some Haiku-specific code parts from oslib-posix.c to cutils.c, but failed to move the corresponding header #include statement, too, so "make vm-build-haiku.x86_64" is currently broken. Fix it by moving the header #include, too. Fixes: 06680b15b4 ("include: move qemu_*_exec_dir() to cutils") Message-Id: <20220718172026.139004-1-thuth@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-07-13 | module: Use bundle mechanism | Akihiko Odaki | 1 | -1/+0
Before this change, the directory of the executable was being added to resolve modules in the build tree. However, get_relocated_path() can now resolve them with the new bundle mechanism. Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com> Message-Id: <20220624145039.49929-5-akihiko.odaki@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-13 | cutils: Introduce bundle mechanism | Akihiko Odaki | 2 | -21/+48
Developers often run QEMU without installing. The bundle mechanism allows looking up files that should be present in the installation even in such a situation. It is a general mechanism and can find any files in the installation tree. The build tree will have a new directory, qemu-bundle, to represent what files the installation tree would have for reference by the executables. Note that it abandons compatibility with Windows older than 8. Extended support for the prior version, Windows 7, ended more than 2 years ago, and it is unlikely that someone would like to run the latest QEMU on such an old system. Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220624145039.49929-3-akihiko.odaki@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-30 | Merge tag 'trivial-branch-for-7.1-pull-request' of https://gitlab.com/laurent_vivier/qemu into staging | Richard Henderson | 1 | -3/+1
trivial patches pull request 20220629

# gpg: Signature made Wed 29 Jun 2022 02:37:55 PM +0530
# gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg: issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [undefined]
# gpg: aka "Laurent Vivier <laurent@vivier.eu>" [undefined]
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C

* tag 'trivial-branch-for-7.1-pull-request' of https://gitlab.com/laurent_vivier/qemu:
  hw/i386/xen/xen-hvm: Inline xen_piix_pci_write_config_client() and remove it
  hw/i386/xen/xen-hvm: Allow for stubbing xen_set_pci_link_route()
  hw/ide/atapi.c: Correct typos (CD-CDROM -> CD-ROM)
  common-user: Only compile the common user code if have_user is set
  hw/pci-host/i440fx: Remove unused parameter from i440fx_init()
  MAINTAINERS: Add softmmu/runstate.c to "Main loop"
  trivial typos: namesapce
  Trivial: 3 char repeat typos
  util: Return void on iova_tree_remove
  qom/object: Remove circular include dependency
  vga: avoid crash if no default vga card

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-06-29 | util: add qemu-co-timeout | Vladimir Sementsov-Ogievskiy | 2 | -0/+90
Add a new API to make a time-limited call of a coroutine. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2022-06-28 | util: Return void on iova_tree_remove | Eugenio Pérez | 1 | -3/+1
It always returns IOVA_OK so nobody uses it. Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220427154931.3166388-1-eperezma@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-06-24 | aio_wait_kick: add missing memory barrier | Emanuele Giuseppe Esposito | 1 | -1/+15
It seems that aio_wait_kick always required a memory barrier or atomic operation in the caller, but nobody actually took care of doing it. Let's put the barrier in the function instead, and pair it with another one in AIO_WAIT_WHILE. Read aio_wait_kick() comment for further explanation. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220524173054.12651-1-eesposit@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-06-24 | block: simplify handling of try to merge different sized bitmaps | Vladimir Sementsov-Ogievskiy | 1 | -18/+7
We have too much logic to simply check that bitmaps are of the same size. Let's just define that hbitmap_merge() and bdrv_dirty_bitmap_merge_internal() require their argument bitmaps to be of the same size; this simplifies things. Let's look through the callers: For backup_init_bcs_bitmap() we already assert that merge can't fail. In bdrv_reclaim_dirty_bitmap_locked() we gracefully handle an error that can't happen: the successor always has the same size as its parent, so drop this logic. In bdrv_merge_dirty_bitmap() we already have an assertion and a separate check. Make the check explicit and improve the error message. Signed-off-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru> Reviewed-by: Nikita Lapshin <nikita.lapshin@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220517111206.23585-4-v.sementsov-og@mail.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-06-21 | Merge tag 'pull-tcg-20220621' of https://gitlab.com/rth7680/qemu into staging | Richard Henderson | 4 | -217/+235
Speed empty timer list in qemu_clock_deadline_ns_all.
Implement remainder for Power3.1 hosts.
Optimize ppc host icache flushing.
Cleanups to tcg_accel_ops_init.
Fix mmio crash accessing unmapped physical memory.

# gpg: Signature made Tue 21 Jun 2022 01:45:31 PM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]

* tag 'pull-tcg-20220621' of https://gitlab.com/rth7680/qemu:
  util/cacheflush: Optimize flushing when ppc host has coherent icache
  util/cacheflush: Merge aarch64 ctr_el0 usage
  util: Merge cacheflush.c and cacheinfo.c
  softmmu: Always initialize xlat in address_space_translate_for_iotlb
  qemu-timer: Skip empty timer lists before locking in qemu_clock_deadline_ns_all
  accel/tcg: Reorganize tcg_accel_ops_init()
  accel/tcg: Init TCG cflags in vCPU thread handler
  target/avr: Drop avr_cpu_memory_rw_debug()
  tcg/ppc: implement rem[u]_i{32,64} with mod[su][wd]

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-06-21 | util/cacheflush: Optimize flushing when ppc host has coherent icache | Nicholas Piggin | 1 | -2/+23
On Linux, the AT_HWCAP bit PPC_FEATURE_ICACHE_SNOOP indicates that we can use a simplified three-instruction flush sequence. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20220519141131.29839-1-npiggin@gmail.com> [rth: update after merging cacheflush.c and cacheinfo.c] Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20220621014837.189139-4-richard.henderson@linaro.org>
2022-06-21 | util/cacheflush: Merge aarch64 ctr_el0 usage | Richard Henderson | 1 | -25/+19
Merge init_ctr_el0 into arch_cache_info. In flush_idcache_range, use the pre-computed line sizes from the global variables. Use CONFIG_DARWIN in preference to __APPLE__. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20220621014837.189139-3-richard.henderson@linaro.org>
2022-06-21 | util: Merge cacheflush.c and cacheinfo.c | Richard Henderson | 3 | -202/+202
Combine the two files into cacheflush.c. There's a couple of bits that would be helpful to share between the two, and combining them seems better than exporting the bits. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20220621014837.189139-2-richard.henderson@linaro.org>
2022-06-21 | qemu-timer: Skip empty timer lists before locking in qemu_clock_deadline_ns_all | Idan Horowitz | 1 | -0/+3
This decreases qemu_clock_deadline_ns_all's share from 23.2% to 13% in a profile of icount-enabled aarch64-softmmu. Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220114004358.299534-2-idan.horowitz@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-06-20 | host-utils: Implemented signed 256-by-128 division | Lucas Mateus Castro (alqotel) | 1 | -0/+51
Based on the already existing QEMU implementation, created a signed 256-bit by 128-bit division needed to implement the vector divide extended signed quadword instruction from PowerISA 3.1. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220525134954.85056-6-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-06-20 | host-utils: Implemented unsigned 256-by-128 division | Lucas Mateus Castro (alqotel) | 1 | -0/+129
Based on the already existing QEMU implementation, created an unsigned 256-bit by 128-bit division needed to implement the vector divide extended unsigned instruction from PowerISA 3.1. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220525134954.85056-5-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-06-14 | cutils: add functions for IEC and SI prefixes | Paolo Bonzini | 1 | -9/+25
Extract the knowledge of IEC and SI prefixes out of size_to_str and freq_to_str, so that it can be reused when printing statistics. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
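Factoring the prefix tables out into small lookup functions could look roughly like the sketch below. The function names `iec_prefix_sketch` and `si_prefix_sketch` are hypothetical, only non-negative exponents are covered, and the lowercase "k" for 10^3 follows the SI convention, which may differ from what QEMU prints.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical lookup: IEC binary prefix for 2^exp2, where exp2 is a
 * multiple of 10 (0 -> "", 10 -> "Ki", 20 -> "Mi", ...).
 */
static const char *iec_prefix_sketch(unsigned exp2)
{
    static const char *const prefixes[] = {
        "", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei"
    };

    assert(exp2 % 10 == 0);
    assert(exp2 / 10 < ARRAY_LEN(prefixes));
    return prefixes[exp2 / 10];
}

/*
 * Hypothetical lookup: SI decimal prefix for 10^exp10, where exp10 is a
 * multiple of 3 (0 -> "", 3 -> "k", 6 -> "M", ...).
 */
static const char *si_prefix_sketch(unsigned exp10)
{
    static const char *const prefixes[] = {
        "", "k", "M", "G", "T", "P", "E"
    };

    assert(exp10 % 3 == 0);
    assert(exp10 / 3 < ARRAY_LEN(prefixes));
    return prefixes[exp10 / 3];
}
```

With the tables behind functions, a size formatter and a frequency formatter can share them, and so can any new code that prints statistics with units.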
2022-06-06 | replay: notify vCPU when BH is scheduled | Pavel Dovgalyuk | 1 | -0/+8
vCPU execution should be suspended when a new BH is scheduled. This is needed to avoid guest timeouts caused by long execution cycles. In replay mode, execution may hang when the vCPU sleeps and a block event arrives in the queue. This patch adds a notification that wakes up the vCPU or interrupts execution of guest code. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> -- v2: changed first_cpu to current_cpu (suggested by Richard Henderson) v4: moved vCPU notification to aio_bh_enqueue (suggested by Paolo Bonzini) Message-Id: <165364837317.688121.17680519919871405281.stgit@pasha-ThinkPad-X280> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-28 | util/win32: simplify qemu_get_local_state_dir() | Marc-André Lureau | 1 | -13/+4
SHGetFolderPath() is a deprecated API: https://docs.microsoft.com/en-us/windows/win32/api/shlobj_core/nf-shlobj_core-shgetfolderpatha It is a wrapper for SHGetKnownFolderPath() and CSIDL_COMMON_APPDATA is mapped to FOLDERID_ProgramData: https://docs.microsoft.com/en-us/windows/win32/shell/csidl g_get_system_data_dirs() is a suitable replacement, as it will have FOLDERID_ProgramData in the returned list. However, it follows the XDG Base Directory Specification: if `XDG_DATA_DIRS` is defined, its value is returned instead. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20220525144140.591926-3-marcandre.lureau@redhat.com>
2022-05-28  include: move qemu_*_exec_dir() to cutils  Marc-André Lureau  3  -120/+119
The function is required by get_relocated_path() (already in cutils), is used by qemu-ga, and may be generally useful.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220525144140.591926-2-marcandre.lureau@redhat.com>
2022-05-25  thread-pool: remove stopping variable  Paolo Bonzini  1  -3/+2
Just setting the max threads to 0 is enough to stop all workers.

Message-Id: <20220514065012.1149539-4-pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-25  thread-pool: replace semaphore with condition variable  Paolo Bonzini  1  -40/+28
Since commit f9fc8932b1 ("thread-posix: remove the posix semaphore support", 2022-04-06) QemuSemaphore has its own mutex and condition variable; this adds unnecessary overhead on I/O with small block sizes. Check the QTAILQ directly instead of adding the indirection of a semaphore's count.

Using a semaphore has not been necessary since qemu_cond_timedwait was introduced; the new code has to be careful about spurious wakeups but it is simpler, for example thread_pool_cancel does not have to worry about synchronizing the semaphore count with the number of elements of pool->request_list.

Note that the return value of qemu_cond_timedwait (0 for timeout, 1 for signal or spurious wakeup) is different from that of qemu_sem_timedwait (-1 for timeout, 0 for success).

Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Message-Id: <20220514065012.1149539-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-25  thread-pool: optimize scheduling of completion bottom half  Paolo Bonzini  1  -2/+1
The completion bottom half was scheduled within the pool->lock critical section. That actually results in worse performance, because the worker thread can run its own small critical section and go to sleep before the bottom half starts running.

Note that this simple change does not produce an improvement without changing the thread pool QemuSemaphore to a condition variable.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Message-Id: <20220514065012.1149539-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-12  Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging  Richard Henderson  1  -28/+19
* small cleanups for pc-bios/optionrom Makefiles
* checkpatch: fix g_malloc check
* fix mremap() and RDMA detection
* confine igd-passthrough-isa-bridge to Xen-enabled builds
* cover PCI in arm-virt machine qtests
* add -M boot and -M mem compound properties
* bump SLIRP submodule
* support CFI with system libslirp (>= 4.7)
* clean up CoQueue wakeup functions
* fix vhost-vsock regression
* fix --disable-vnc compilation
* other minor bugfixes

# gpg: Signature made Thu 12 May 2022 05:25:07 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (27 commits)
  vmxcap: add tertiary execution controls
  vl: make machine type deprecation a warning
  meson: link libpng independent of vnc
  vhost-backend: do not depend on CONFIG_VHOST_VSOCK
  coroutine-lock: qemu_co_queue_restart_all is a coroutine-only qemu_co_enter_all
  coroutine-lock: introduce qemu_co_queue_enter_all
  coroutine-lock: qemu_co_queue_next is a coroutine-only qemu_co_enter_next
  net: slirp: allow CFI with libslirp >= 4.7
  net: slirp: add support for CFI-friendly timer API
  net: slirp: switch to slirp_new
  net: slirp: introduce a wrapper struct for QemuTimer
  slirp: bump submodule past 4.7 release
  machine: move more memory validation to Machine object
  machine: make memory-backend a link property
  machine: add mem compound property
  machine: add boot compound property
  machine: use QAPI struct for boot configuration
  tests/qtest/libqos: Add generic pci host bridge in arm-virt machine
  tests/qtest/libqos: Skip hotplug tests if pci root bus is not hotpluggable
  tests/qtest/libqos/pci: Introduce pio_limit
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-05-12  coroutine-lock: qemu_co_queue_restart_all is a coroutine-only qemu_co_enter_all  Paolo Bonzini  1  -15/+6
qemu_co_queue_restart_all is basically the same as qemu_co_enter_all but without a QemuLockable argument. That's perfectly fine, but only as long as the function is marked coroutine_fn. If used outside coroutine context, qemu_co_queue_wait will attempt to take the lock and that is just broken: if you are calling qemu_co_queue_restart_all outside coroutine context, the lock is going to be a QemuMutex which cannot be taken twice by the same thread.

The patch adds the marker to qemu_co_queue_restart_all and to its sole non-coroutine_fn caller; it then reimplements the function in terms of qemu_co_enter_all_impl, to remove duplicated code and to clarify that the latter also works in coroutine context.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20220427130830.150180-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-12  coroutine-lock: introduce qemu_co_queue_enter_all  Paolo Bonzini  1  -0/+7
Because qemu_co_queue_restart_all does not release the lock, it should be used only in coroutine context. Introduce a new function that, like qemu_co_enter_next, does release the lock, and use it whenever qemu_co_queue_restart_all was used outside coroutine context.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20220427130830.150180-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
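A wake-all helper that releases the caller's lock around each wake-up might look roughly like this. It is a sketch with pthread primitives and hypothetical names (`sketch_waiter`, `sketch_enter_all`), not the coroutine implementation; a plain callback stands in for entering a coroutine.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void sketch_wake_fn(void *opaque);

struct sketch_waiter {
    sketch_wake_fn *wake;         /* stand-in for entering a coroutine */
    void *opaque;
    struct sketch_waiter *next;
};

/* Example callback: counts how many waiters were woken. */
static void sketch_count_wake(void *opaque)
{
    ++*(int *)opaque;
}

/*
 * The caller holds *lock. It is dropped around each wake-up, so the woken
 * code may take it, and it is held again when the function returns.
 */
static int sketch_enter_all(struct sketch_waiter **queue, pthread_mutex_t *lock)
{
    int woken = 0;

    while (*queue != NULL) {
        struct sketch_waiter *w = *queue;

        assert(w->wake != NULL);
        *queue = w->next;             /* dequeue while still holding the lock */

        pthread_mutex_unlock(lock);
        w->wake(w->opaque);
        pthread_mutex_lock(lock);
        woken++;
    }
    return woken;
}
```

Dequeuing before unlocking keeps the list consistent even if a woken waiter queues itself again.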
2022-05-12  coroutine-lock: qemu_co_queue_next is a coroutine-only qemu_co_enter_next  Paolo Bonzini  1  -14/+7
qemu_co_queue_next is basically the same as qemu_co_enter_next but without a QemuLockable argument. That's perfectly fine, but only as long as the function is marked coroutine_fn. If used outside coroutine context, qemu_co_queue_wait will attempt to take the lock and that is just broken: if you are calling qemu_co_queue_next outside coroutine context, the lock is going to be a QemuMutex which cannot be taken twice by the same thread.

The patch adds the marker and reimplements qemu_co_queue_next in terms of qemu_co_enter_next_impl, to remove duplicated code and to clarify that the latter also works in coroutine context.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20220427130830.150180-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-12  coroutine: Revert to constant batch size  Kevin Wolf  1  -8/+14
Commit 4c41c69e changed the way the coroutine pool is sized because for virtio-blk devices with a large queue size and heavy I/O, it was just too small and caused coroutines to be deleted and reallocated soon afterwards. The change made the size dynamic based on the number of queues and the queue size of virtio-blk devices.

There are two important numbers here: Slightly simplified, when a coroutine terminates, it is generally stored in the global release pool up to a certain pool size, and if the pool is full, it is freed. Conversely, when allocating a new coroutine, the coroutines in the release pool are reused if the pool already has reached a certain minimum size (the batch size), otherwise we allocate new coroutines.

The problem after commit 4c41c69e is that it not only increases the maximum pool size (which is the intended effect), but also the batch size for reusing coroutines (which is a bug). It means that in cases with many devices and/or a large queue size (which defaults to the number of vcpus for virtio-blk-pci), many thousand coroutines could be sitting in the release pool without being reused.

This is not only a waste of memory and allocations, but it actually makes the QEMU process likely to hit the vm.max_map_count limit on Linux because each coroutine requires two mappings (its stack and the guard page for the stack), causing it to abort() in qemu_alloc_stack() because when the limit is hit, mprotect() starts to fail with ENOMEM.

In order to fix the problem, change the batch size back to 64 to avoid uselessly accumulating coroutines in the release pool, but keep the dynamic maximum pool size so that coroutines aren't freed too early in heavy I/O scenarios.

Note that this fix doesn't strictly make it impossible to hit the limit, but this would only happen if most of the coroutines are actually in use at the same time, not just sitting in a pool. This is the same behaviour as we already had before commit 4c41c69e. Fully preventing this would require allowing qemu_coroutine_create() to return an error, but it doesn't seem to be a scenario that people hit in practice.

Cc: qemu-stable@nongnu.org
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2079938
Fixes: 4c41c69e05fe28c0f95f8abd2ebf407e95a4f04b
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220510151020.105528-3-kwolf@redhat.com>
Tested-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
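The two numbers in the message above, the maximum pool size and the batch size, can be modelled with counters alone. The sketch below follows the "slightly simplified" description in the message rather than the actual code; the struct and function names are hypothetical, and only the value 64 comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_BATCH_SIZE = 64 };  /* constant again after this fix */

struct sketch_pools {
    unsigned release_pool_size;   /* global pool fed by terminating coroutines */
    unsigned alloc_pool_size;     /* per-thread pool that allocations draw from */
    unsigned pool_max_size;       /* dynamic limit, grows with device queues */
};

/* A coroutine terminates: true if kept for reuse, false if it is freed. */
static bool sketch_release(struct sketch_pools *p)
{
    assert(p != NULL);
    if (p->release_pool_size < p->pool_max_size) {
        p->release_pool_size++;
        return true;
    }
    return false;
}

/* A coroutine is needed: true if one was reused, false if newly allocated. */
static bool sketch_acquire(struct sketch_pools *p)
{
    assert(p != NULL);
    /* Steal the whole release pool once it holds at least one batch. */
    if (p->alloc_pool_size == 0 && p->release_pool_size >= SKETCH_BATCH_SIZE) {
        p->alloc_pool_size = p->release_pool_size;
        p->release_pool_size = 0;
    }
    if (p->alloc_pool_size > 0) {
        p->alloc_pool_size--;
        return true;
    }
    return false;
}
```

If the batch threshold scaled with `pool_max_size`, as it did before the fix, the release pool could hold thousands of idle coroutines before any of them were reused.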
2022-05-12  coroutine: Rename qemu_coroutine_inc/dec_pool_size()  Kevin Wolf  1  -2/+2
It's true that these functions currently affect the batch size in which coroutines are reused (i.e. moved from the global release pool to the allocation pool of a specific thread), but this is a bug and will be fixed in a separate patch.

In fact, the comment in the header file already just promises that it influences the pool size, so reflect this in the name of the functions. As a nice side effect, the shorter function name makes some line wrapping unnecessary.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220510151020.105528-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-05-09  util/event-loop-base: Introduce options to set the thread pool size  Nicolas Saenz Julienne  4  -4/+81
The thread pool regulates itself: when idle, it kills threads until empty; when in demand, it creates new threads until full. This behaviour doesn't play well with latency-sensitive workloads where the price of creating a new thread is too high. For example, when paired with qemu's '-mlock', or using safety features like SafeStack, creating a new thread has been measured to take multiple milliseconds.

In order to mitigate this, let's introduce a new 'EventLoopBase' property to set the thread pool size. The threads will be created during the pool's initialization or upon updating the property's value, remain available during its lifetime regardless of demand, and be destroyed upon freeing it. A properly characterized workload will then be able to configure the pool to avoid any latency spikes.

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20220425075723.20019-4-nsaenzju@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
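The self-regulation described above, plus a configurable floor, comes down to two small decisions. The sketch below is an assumption-laden model: the struct, its field names and both functions are hypothetical, and the real pool's bookkeeping is more involved.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_limits {
    int cur_threads;    /* workers that currently exist */
    int idle_threads;   /* workers waiting for a request */
    int min_threads;    /* floor kept alive regardless of demand */
    int max_threads;    /* ceiling the pool may grow to */
};

/* On submit: spawn a worker only if nobody is idle and there is headroom. */
static bool sketch_should_spawn(const struct sketch_limits *l)
{
    assert(l->min_threads <= l->max_threads);
    return l->idle_threads == 0 && l->cur_threads < l->max_threads;
}

/*
 * In a worker: exit if the pool has shrunk below the current thread count,
 * or if this worker sat idle past its timeout and the floor is still met.
 */
static bool sketch_worker_should_exit(const struct sketch_limits *l,
                                      bool idle_timed_out)
{
    assert(l->min_threads <= l->max_threads);
    return l->cur_threads > l->max_threads ||
           (idle_timed_out && l->cur_threads > l->min_threads);
}
```

With the floor equal to the ceiling, no worker is ever created or destroyed on the request path, which is the latency property the commit is after.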