aboutsummaryrefslogtreecommitdiff
path: root/accel/tcg/cpu-exec.c
AgeCommit message (Collapse)AuthorFilesLines
2024-01-29accel/tcg: Inline need_replay_interruptRichard Henderson1-15/+2
The function is now trivial, and with inlining we can re-use the calling function's tcg_ops variable. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-01-29target/i386: Extract x86_need_replay_interrupt() from accel/tcg/Philippe Mathieu-Daudé1-4/+0
Move this x86-specific code out of the generic accel/tcg/. Reviewed-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240124101639.30056-8-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-01-29accel/tcg: Introduce TCGCPUOps::need_replay_interrupt() handlerPhilippe Mathieu-Daudé1-3/+5
In order to make accel/tcg/ target agnostic, introduce the need_replay_interrupt() handler. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Message-Id: <20240124101639.30056-7-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-01-29accel/tcg: Use CPUState.cc instead of CPU_GET_CLASS in cpu-exec.cRichard Henderson1-49/+52
CPU_GET_CLASS does runtime type checking; use the cached copy of the class instead. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-01-29accel/tcg: Un-inline icount_exit_request() for clarityPhilippe Mathieu-Daudé1-4/+12
Convert packed logic to dumb icount_exit_request() helper. No functional change intended. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-Id: <20240124101639.30056-5-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-01-29accel/tcg/cpu-exec: Use RCU_READ_LOCK_GUARDPhilippe Mathieu-Daudé1-3/+1
Replace the manual rcu_read_(un)lock calls in cpu_exec(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240124074201.8239-2-philmd@linaro.org> [rth: Use RCU_READ_LOCK_GUARD not WITH_RCU_READ_LOCK_GUARD] Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-01-29cpu-exec: simplify jump cache managementPaolo Bonzini1-43/+23
Unless I'm missing something egregious, the jmp cache is only every populated with a valid entry by the same thread that reads the cache. Therefore, the contents of any valid entry are always consistent and there is no need for any acquire/release magic. Indeed ->tb has to be accessed with atomics, because concurrent invalidations would otherwise cause data races. But ->pc is only ever accessed by one thread, and accesses to ->tb and ->pc within tb_lookup can never race with another tb_lookup. While the TranslationBlock (especially the flags) could be modified by a concurrent invalidation, store-release and load-acquire operations on the cache entry would not add any additional ordering beyond what you get from performing the accesses within a single thread. Because of this, there is really nothing to win in splitting the CF_PCREL and !CF_PCREL paths. It is easier to just always use the ->pc field in the jump cache. I noticed this while working on splitting commit 8ed558ec0cb ("accel/tcg: Introduce TARGET_TB_PCREL", 2022-10-04) into multiple pieces, for the sake of finding a more fine-grained bisection result for https://gitlab.com/qemu-project/qemu/-/issues/2092. It does not (and does not intend to) fix that issue; therefore it may make sense to not commit it until the root cause of issue #2092 is found. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20240122153409.351959-1-pbonzini@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-01-08system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()Stefan Hajnoczi1-13/+13
The Big QEMU Lock (BQL) has many names and they are confusing. The actual QemuMutex variable is called qemu_global_mutex but it's commonly referred to as the BQL in discussions and some code comments. The locking APIs, however, are called qemu_mutex_lock_iothread() and qemu_mutex_unlock_iothread(). The "iothread" name is historic and comes from when the main thread was split into into KVM vcpu threads and the "iothread" (now called the main loop thread). I have contributed to the confusion myself by introducing a separate --object iothread, a separate concept unrelated to the BQL. The "iothread" name is no longer appropriate for the BQL. Rename the locking APIs to: - void bql_lock(void) - void bql_unlock(void) - bool bql_locked(void) There are more APIs with "iothread" in their names. Subsequent patches will rename them. There are also comments and documentation that will be updated in later patches. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Fabiano Rosas <farosas@suse.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Cédric Le Goater <clg@kaod.org> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Acked-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-id: 20240102153529.486531-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-11-14accel/tcg: Remove CF_LAST_IORichard Henderson1-1/+1
In cpu_exec_step_atomic, we did not set CF_LAST_IO, which lead to a loop with cpu_io_recompile. But since 18a536f1f8 ("Always require can_do_io") we no longer need a flag to indicate when the last insn should have can_do_io set, so remove the flag entirely. Reported-by: Clément Chigot <chigot@adacore.com> Tested-by: Clément Chigot <chigot@adacore.com> Reviewed-by: Claudio Fontana <cfontana@suse.de> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1961 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-04accel/tcg: Make monitor.c a target-agnostic unitPhilippe Mathieu-Daudé1-0/+1
Move target-agnostic declarations from "internal-target.h" to a new "internal-common.h" header. monitor.c now don't include target specific headers and can be compiled once in system_ss[]. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-Id: <20230914185718.76241-10-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-04accel/tcg: Rename target-specific 'internal.h' -> 'internal-target.h'Philippe Mathieu-Daudé1-1/+1
accel/tcg/internal.h contains target specific declarations. Unit files including it become "target tainted": they can not be compiled as target agnostic. Rename using the '-target' suffix to make this explicit. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-Id: <20230914185718.76241-9-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-04accel/tcg: Replace CPUState.env_ptr with cpu_env()Richard Henderson1-4/+4
Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-03accel/tcg: Remove cpu_neg()Richard Henderson1-7/+7
Now that CPUNegativeOffsetState is part of CPUState, we can reference it directly. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-03accel/tcg: Move can_do_io to CPUNegativeOffsetStateRichard Henderson1-1/+1
Minimize the displacement to can_do_io, since it may be touched at the start of each TranslationBlock. It fits into other padding within the substructure. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-03accel/tcg: Have tcg_exec_realizefn() return a booleanPhilippe Mathieu-Daudé1-1/+3
Following the example documented since commit e3fe3988d7 ("error: Document Error API usage rules"), have tcg_exec_realizefn() return a boolean indicating whether an error is set or not. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20231003123026.99229-7-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-28accel/tcg: Always set CF_LAST_IO with CF_NOIRQRichard Henderson1-1/+1
Without this we can get see loops through cpu_io_recompile, in which the cpu makes no progress. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-17accel/tcg: Zero-pad PC in TCG CPU exec trace linesPeter Maydell1-2/+2
In commit f0a08b0913befbd we changed the type of the PC from target_ulong to vaddr. In doing so we inadvertently dropped the zero-padding on the PC in trace lines (the second item inside the [] in these lines). They used to look like this on AArch64, for instance: Trace 0: 0x7f2260000100 [00000000/0000000040000000/00000061/ff200000] and now they look like this: Trace 0: 0x7f4f50000100 [00000000/40000000/00000061/ff200000] and if the PC happens to be somewhere low like 0x5000 then the field is shown as /5000/. This is because TARGET_FMT_lx is a "%08x" or "%016x" specifier, depending on TARGET_LONG_SIZE, whereas VADDR_PRIx is just PRIx64 with no width specifier. Restore the zero-padding by adding an 016 width specifier to this tracing and a couple of others that were similarly recently changed to use VADDR_PRIx without a width specifier. We can't unfortunately restore the "32-bit guests are padded to 8 hex digits and 64-bit guests to 16 hex digits" behaviour so easily. Fixes: f0a08b0913befbd ("accel/tcg/cpu-exec.c: Widen pc to vaddr") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-id: 20230711165434.4123674-1-peter.maydell@linaro.org
2023-07-15accel/tcg: Always lock pages before translationRichard Henderson1-0/+20
We had done this for user-mode by invoking page_protect within the translator loop. Extend this to handle system mode as well. Move page locking out of tb_link_page. Reported-by: Liren Wei <lrwei@bupt.edu.cn> Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Richard W.M. Jones <rjones@redhat.com>
2023-07-15accel/tcg: Split out cpu_exec_longjmp_cleanupRichard Henderson1-24/+19
Share the setjmp cleanup between cpu_exec_step_atomic and cpu_exec_setjmp. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26accel/tcg/cpu-exec.c: Widen pc to vaddrAnton Johansson1-17/+17
Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230621135633.1649-7-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26target: Widen pc/cs_base in cpu_get_tb_cpu_stateAnton Johansson1-3/+6
Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230621135633.1649-4-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20accel/tcg/cpu-exec: Use generic 'helper-proto-common.h' headerPhilippe Mathieu-Daudé1-1/+1
We only need lookup_tb_ptr() prototype. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230611085846.21415-3-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20accel/tcg: Check for USER_ONLY definition instead of SOFTMMU onePhilippe Mathieu-Daudé1-2/+2
Since we *might* have user emulation with softmmu, replace the system emulation check by !user emulation one. Invert some if() ladders for clarity. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-7-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-13util/log: Add vector registers to logIvan Klokov1-0/+3
Added QEMU option 'vpu' to log vector extension registers such as gpr\fpu. Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20230410124451.15929-2-ivan.klokov@syntacore.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-06-06atomics: eliminate mb_read/mb_setPaolo Bonzini1-1/+1
qatomic_mb_read and qatomic_mb_set were the very first atomic primitives introduced for QEMU; their semantics are unclear and they provide a false sense of safety. The last use of qatomic_mb_read() has been removed, so delete it. qatomic_mb_set() instead can survive as an optimized qatomic_set()+smp_mb(), similar to Linux's smp_store_mb(), but rename it to qatomic_set_mb() to match the order of the two operations. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-05exec-all: Widen TranslationBlock pc and cs_base to 64-bitsRichard Henderson1-1/+1
This makes TranslationBlock agnostic to the address size of the guest. Use vaddr for pc, since that's always a virtual address. Use uint64_t for cs_base, since usage varies between guests. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-01accel/tcg: include cs_base in our hash calculationsAlex Bennée1-1/+1
We weren't using cs_base in the hash calculations before. Since the arm front end moved a chunk of flags in a378206a20 (target/arm: Move mode specific TB flags to tb->cs_base) they comprise of an important part of the execution state. Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8() to accommodate it. My initial benchmark shows very little difference in the runtime. Before: armhf ➜ hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot" Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.627 s ± 2.708 s [User: 34.309 s, System: 1.797 s] Range (min … max): 22.345 s … 29.864 s 20 runs arm64 ➜ hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.559 s ± 2.917 s [User: 189.115 s, System: 4.089 s] Range (min … max): 59.997 s … 70.153 s 10 runs After: armhf Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.223 s ± 2.151 s [User: 34.284 s, System: 1.906 s] Range (min … max): 22.000 s … 28.476 s 20 runs arm64 hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.769 s ± 1.978 s [User: 188.431 s, System: 5.269 s] Range (min … max): 60.285 s … 66.868 s 10 runs Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20230526165401.574474-12-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01tcg: remove the final vestiges of dstateAlex Bennée1-6/+1
Now we no longer have dynamic state affecting things we can remove the additional fields in cpu.h and simplify the TB hash calculation. For the benchmark: hyperfine -w 2 -m 20 \ "./arm-softmmu/qemu-system-arm -cpu cortex-a15 \ -machine type=virt,highmem=off \ -display none -m 2048 \ -serial mon:stdio \ -netdev user,id=unet,hostfwd=tcp::2222-:22 \ -device virtio-net-pci,netdev=unet \ -device virtio-scsi-pci \ -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf \ -device scsi-hd,drive=hd -smp 4 \ -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage \ -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' \ -snapshot" It has a marginal effect on runtime, before: Time (mean ± σ): 26.279 s ± 2.438 s [User: 41.113 s, System: 1.843 s] Range (min … max): 24.420 s … 32.565 s 20 runs after: Time (mean ± σ): 24.440 s ± 2.885 s [User: 34.474 s, System: 2.028 s] Range (min … max): 21.663 s … 29.937 s 20 runs Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1358 Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20230526165401.574474-10-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-9-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-23tcg: Remove DEBUG_DISASRichard Henderson1-2/+0
This had been set since the beginning, is never undefined, and it would seem to be harmful to debugging to do so. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-02accel/tcg: Use one_insn_per_tb global instead of old singlestep globalPeter Maydell1-1/+1
The only place left that looks at the old 'singlestep' global variable is the TCG curr_cflags() function. Replace the old global with a new 'one_insn_per_tb' which is defined in tcg-all.c and declared in accel/tcg/internal.h. This keeps it restricted to the TCG code, unlike 'singlestep' which was available to every file in the system and defined in multiple different places for softmmu vs linux-user vs bsd-user. While we're making this change, use qatomic_read() and qatomic_set() on the accesses to the new global, because TCG will read it without holding a lock. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230417164041.684562-4-peter.maydell@linaro.org
2023-04-04accel/tcg: Fix jump cache set in cpu_exec_loopRichard Henderson1-4/+13
Assign pc and use store_release to assign tb. Fixes: 2dd5b7a1b91 ("accel/tcg: Move jmp-cache `CF_PCREL` checks to caller") Reported-by: Weiwei Li <liweiwei@iscas.ac.cn> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-22tcg: Clear plugin_mem_cbs on TB exitRichard Henderson1-4/+1
Do this in cpu_tb_exec (normal exit) and cpu_loop_exit (exception), adjacent to where we reset can_do_io. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1381 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230310195252.210956-2-richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230315174331.2959-12-alex.bennee@linaro.org> Reviewed-by: Emilio Cota <cota@braap.org>
2023-03-01accel/tcg: Replace `tb_pc()` with `tb->pc`Anton Johansson1-3/+3
Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230227135202.9710-13-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-01accel/tcg: Move jmp-cache `CF_PCREL` checks to callerAnton Johansson1-15/+41
tb-jmp-cache.h contains a few small functions that only exist to hide a CF_PCREL check, however the caller often already performs such a check. This patch moves CF_PCREL checks from the callee to the caller, and also removes these functions which now only hide an access of the jmp-cache. Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230227135202.9710-12-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-01accel/tcg: Replace `TARGET_TB_PCREL` with `CF_PCREL`Anton Johansson1-4/+4
Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230227135202.9710-5-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-02-27replay: Extract core API to 'exec/replay-core.h'Philippe Mathieu-Daudé1-1/+1
replay API is used deeply within TCG common code (common to user and system emulation). Unfortunately "sysemu/replay.h" requires some QAPI headers for few system-specific declarations, example: void replay_input_event(QemuConsole *src, InputEvent *evt); Since commit c2651c0eaa ("qapi/meson: Restrict UI module to system emulation and tools") the QAPI header defining the InputEvent is not generated anymore. To keep it simple, extract the 'core' replay prototypes to a new "exec/replay-core.h" header which we include in the TCG code that doesn't need the rest of the replay API. Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Message-Id: <20221219170806.60580-5-philmd@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-02-27accel/tcg: Restrict 'qapi-commands-machine.h' to system emulationPhilippe Mathieu-Daudé1-86/+2
Since commit a0e61807a3 ("qapi: Remove QMP events and commands from user-mode builds") we don't generate the "qapi-commands-machine.h" header in a user-emulation-only build. Rename 'hmp.c' as 'monitor.c' and move the QMP functions from cpu-exec.c (which is always compiled) to monitor.c (which is only compiled when system-emulation is selected). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221219170806.60580-4-philmd@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-02-27exec: Remove unused 'qemu/timer.h' timerPhilippe Mathieu-Daudé1-1/+0
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221219170806.60580-2-philmd@linaro.org>
2023-02-08Don't include headers already included by qemu/osdep.hMarkus Armbruster1-1/+0
This commit was created with scripts/clean-includes. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20230202133830.2152150-19-armbru@redhat.com>
2023-02-02cpu-exec: assert that plugin_mem_cbs is NULL after executionEmilio Cota1-0/+2
Fixes: #1381 Signed-off-by: Emilio Cota <cota@braap.org> Message-Id: <20230108165107.62488-1-cota@braap.org> [AJB: manually applied follow-up fix] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230124180127.1881110-35-alex.bennee@linaro.org>
2023-02-02cpu: free cpu->tb_jmp_cache with RCUEmilio Cota1-2/+1
Fixes the appended use-after-free. The root cause is that during tb invalidation we use CPU_FOREACH, and therefore to safely free a vCPU we must wait for an RCU grace period to elapse. $ x86_64-linux-user/qemu-x86_64 tests/tcg/x86_64-linux-user/munmap-pthread ================================================================= ==1800604==ERROR: AddressSanitizer: heap-use-after-free on address 0x62d0005f7418 at pc 0x5593da6704eb bp 0x7f4961a7ac70 sp 0x7f4961a7ac60 READ of size 8 at 0x62d0005f7418 thread T2 #0 0x5593da6704ea in tb_jmp_cache_inval_tb ../accel/tcg/tb-maint.c:244 #1 0x5593da6704ea in do_tb_phys_invalidate ../accel/tcg/tb-maint.c:290 #2 0x5593da670631 in tb_phys_invalidate__locked ../accel/tcg/tb-maint.c:306 #3 0x5593da670631 in tb_invalidate_phys_page_range__locked ../accel/tcg/tb-maint.c:542 #4 0x5593da67106d in tb_invalidate_phys_range ../accel/tcg/tb-maint.c:614 #5 0x5593da6a64d4 in target_munmap ../linux-user/mmap.c:766 #6 0x5593da6dba05 in do_syscall1 ../linux-user/syscall.c:10105 #7 0x5593da6f564c in do_syscall ../linux-user/syscall.c:13329 #8 0x5593da49e80c in cpu_loop ../linux-user/x86_64/../i386/cpu_loop.c:233 #9 0x5593da6be28c in clone_func ../linux-user/syscall.c:6633 #10 0x7f496231cb42 in start_thread nptl/pthread_create.c:442 #11 0x7f49623ae9ff (/lib/x86_64-linux-gnu/libc.so.6+0x1269ff) 0x62d0005f7418 is located 28696 bytes inside of 32768-byte region [0x62d0005f0400,0x62d0005f8400) freed by thread T148 here: #0 0x7f49627b6460 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52 #1 0x5593da5ac057 in cpu_exec_unrealizefn ../cpu.c:180 #2 0x5593da81f851 (/home/cota/src/qemu/build/qemu-x86_64+0x484851) Signed-off-by: Emilio Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230111151628.320011-2-cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230124180127.1881110-27-alex.bennee@linaro.org>
2023-01-17tcg: Remove TCG_TARGET_HAS_direct_jumpRichard Henderson1-12/+11
We now have the option to generate direct or indirect goto_tb depending on the dynamic displacement, thus the define is no longer necessary or completely accurate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Change tb_target_set_jmp_target argumentsRichard Henderson1-3/+8
Replace 'tc_ptr' and 'addr' with 'tb' and 'n'. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Add TranslationBlock.jmp_insn_offsetRichard Henderson1-3/+2
Stop overloading jmp_target_arg for both offset and address, depending on TCG_TARGET_HAS_direct_jump. Instead, add a new field to hold the jump insn offset and always set the target address in jmp_target_addr[]. This will allow a tcg backend to use either direct or indirect depending on displacement. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-16accel/tcg: Split out cpu_exec_{setjmp,loop}Richard Henderson1-57/+54
Recently the g_assert(cpu == current_cpu) test has been intermittently failing with gcc. Reorg the code around the setjmp to minimize the lifetime of the cpu variable affected by the setjmp. This appears to fix the existing issue with clang as well. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1147 Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01accel/tcg: Complete cpu initialization before registrationRichard Henderson1-3/+5
Delay cpu_list_add until realize is complete, so that cross-cpu interaction does not happen with incomplete cpu state. For this, we must delay plugin initialization out of tcg_exec_realizefn, because no cpu_index has been assigned. Fixes a problem with cross-cpu jump cache flushing, when the jump cache has not yet been allocated. Fixes: a976a99a2975 ("include/hw/core: Create struct CPUJumpCache") Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26accel/tcg: Introduce tb_{set_}page_addr{0,1}Richard Henderson1-4/+5
This data structure will be replaced for user-only: add accessors. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26accel/tcg: Add a quicker check for breakpointsLeandro Lupori1-6/+9
Profiling QEMU during Fedora 35 for PPC64 boot revealed that a considerable amount of time was being spent in check_for_breakpoints() (0.61% of total time on PPC64 and 2.19% on amd64), even though it was just checking that its queue was empty and returning, when no breakpoints were set. It turns out this function is not inlined by the compiler and it's always called by helper_lookup_tb_ptr(), one of the most called functions. By leaving only the check for empty queue in check_for_breakpoints() and moving the remaining code to check_for_breakpoints_slow(), called only when the queue is not empty, it's possible to avoid the call overhead. An improvement of about 3% in total time was measured on POWER9. Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221025202424.195984-2-leandro.lupori@eldorado.org.br> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04accel/tcg: Introduce TARGET_TB_PCRELRichard Henderson1-6/+10
Prepare for targets to be able to produce TBs that can run in more than one virtual context. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04accel/tcg: Introduce tb_pc and log_pcRichard Henderson1-21/+25
The availability of tb->pc will shortly be conditional. Introduce accessor functions to minimize ifdefs. Pass around a known pc to places like tcg_gen_code, where the caller must already have the value. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>