aboutsummaryrefslogtreecommitdiff
path: root/target
AgeCommit message (Collapse)AuthorFilesLines
2022-11-01Merge tag 'pull-testing-for-7.2-011122-3' of https://github.com/stsquad/qemu ↵Stefan Hajnoczi1-4/+10
into staging testing and plugin updates for 7.2: - cleanup win32/64 docker files - update test-mingw test - add flex/bison to debian-all-test - handle --enable-static/--disable-pie in config - extend timeouts on x86_64 avocado tests - add flex/bison to debian-hexagon-cross - use regular semihosting for nios2 check-tcg - fix obscure linker error to nios2 softmmu tests - various windows portability fixes for tests - clean-up of MAINTAINERS - use -machine none when appropriate in avocado - make raspi2_initrd test detect shutdown - disable sh4 rd2 tests on gitlab - re-enable threadcount/linux-test for sh4 - clean-up s390x handling of "ex" instruction - better handle new CPUs in execlog plugin - pass CONFIG_DEBUG_TCG to plugin builds - try and avoid races in test-io-channel-command - speed up ssh key checking for tests/vm # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmNhI/MACgkQ+9DbCVqe # KkSFXggAg0HIpBDcNz0V5Mh5p69F14pwbDSygKqGDFBebdOHeL7f+WCvQPUGEWxp # 814zjvRY3SC4Mo4mtzguRvNu0styaUpemvRw5FDYK48GpEjg2eVxTnAFD4nr7ud0 # dhw3iaHP+RjA6s3EpPUqQ5nlZEgFJ+Tvkckk3wKSpksBYA4tJra6Uey5kpZ27x0T # KOzB2P6w+9B/B11n/aeSxvRPZdnXt2MyfS/3pwwfoFYioEyaEQ3Ie6ooachtdSL3 # PEvnJVK0VVYbZQwBXJlycNLlK/D++s4AEwmnZ5GmvDFuXlkRO9YMy9Wa5TKJl7gz # 76Aw1KHsE03SyAPvH4bE7eGkIwhJOQ== # =6hXE # -----END PGP SIGNATURE----- # gpg: Signature made Tue 01 Nov 2022 09:49:39 EDT # gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * tag 'pull-testing-for-7.2-011122-3' of https://github.com/stsquad/qemu: (31 commits) tests/vm: use -o IdentitiesOnly=yes for ssh tests/unit: cleanups for test-io-channel-command contrib/plugins: protect execlog's last_exec expansion contrib/plugins: enable debug on CONFIG_DEBUG_TCG tests/tcg: include CONFIG_PLUGIN in config-host.mak target/s390x: fake instruction loading when handling 'ex' target/s390x: don't probe next pc for EXecuted insns target/s390x: don't use ld_code2 to probe next pc tests/tcg: re-enable threadcount for sh4 tests/tcg: re-enable linux-test for sh4 tests/avocado: disable sh4 rd2 tests on Gitlab tests/avocado: raspi2_initrd: Wait for guest shutdown message before stopping tests/avocado: set -machine none for userfwd and vnc tests MAINTAINERS: fix-up for check-tcg Makefile changes MAINTAINERS: add features_to_c.sh to gdbstub files MAINTAINERS: add entries for the key build bits hw/usb: dev-mtp: Use g_mkdir() block/vvfat: Unify the mkdir() call tcg: Avoid using hardcoded /tmp semihosting/arm-compat-semi: Avoid using hardcoded /tmp ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-11-01target/i386: Expand eflags updates inlineRichard Henderson3-51/+25
The helpers for reset_rf, cli, sti, clac, stac are completely trivial; implement them inline. Drop some nearby #if 0 code. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01accel/tcg: Remove will_exit argument from cpu_restore_stateRichard Henderson14-21/+21
The value passed is always true, and if the target's synchronize_from_tb hook is non-trivial, not exiting may be erroneous. Reviewed-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01target/openrisc: Use cpu_unwind_state_data for mfsprRichard Henderson1-2/+9
Since we do not plan to exit, use cpu_unwind_state_data and extract exactly the data requested. This is a bug fix, in that we no longer clobber dflag. Consider: l.j L2 // branch l.mfspr r1, ppc // delay L1: boom L2: l.lwa r3, (r4) Here, dflag would be set by cpu_restore_state (because that is the current state of the cpu), but but not cleared by tb_stop on exiting the TB (because DisasContext has recorded the current value as zero). The next TB begins at L2 with dflag incorrectly set. If the load has a tlb miss, then the exception will be delivered as per a delay slot: with DSX set in the status register and PC decremented (delay slots restart by re-executing the branch). This will cause the return from interrupt to go to L1, and boom! Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01target/openrisc: Always exit after mtspr npcRichard Henderson1-1/+1
We have called cpu_restore_state asserting will_exit. Do not go back on that promise. This affects icount. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01target/i386: Use cpu_unwind_state_data for tpr accessRichard Henderson1-2/+23
Avoid cpu_restore_state, and modifying env->eip out from underneath the translator with TARGET_TB_PCREL. There is some slight duplication from x86_restore_state_to_opc, but it's just a few lines. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1269 Reviewed-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-31target/s390x: fake instruction loading when handling 'ex'Alex Bennée1-0/+6
The s390x EXecute instruction is a bit weird as we synthesis the executed instruction from what we have stored in memory. This missed the plugin instrumentation. Work around this with a special helper to inform the rest of the translator about the instruction so things stay consistent. Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Cc: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221027183637.2772968-26-alex.bennee@linaro.org>
2022-10-31target/s390x: don't probe next pc for EXecuted insnsAlex Bennée1-3/+3
We have finished the TB anyway so we can shortcut the other tests by checking dc->ex_value first. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20221027183637.2772968-25-alex.bennee@linaro.org>
2022-10-31target/s390x: don't use ld_code2 to probe next pcAlex Bennée1-1/+1
This isn't an translator picking up an instruction so we shouldn't use the translator_lduw function which has side effects for plugins. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20221027183637.2772968-24-alex.bennee@linaro.org>
2022-10-31Merge tag 'pull-ppc-20221029' of https://gitlab.com/danielhb/qemu into stagingStefan Hajnoczi19-794/+1801
ppc patch queue for 2022-10-29: This queue has the second part of the ppc4xx_sdram cleanups, doorbell instructions for POWER8, new pflash handling for the e500 machine and a Radix MMU regression fix. It also has a lot of performance optimizations in the PowerPC emulation done by the researchers of the Eldorado institute. Between using gvec for VMX/VSX instructions, a full rework of the interrupt model and PMU optimizations, they managed to drastically speed up the emulation of powernv8/9/10 machines. Here's an example with avocado tests: - with master: tests/avocado/boot_linux_console.py:BootLinuxConsole.test_ppc_powernv8: PASS (38.89 s) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_ppc_powernv9: PASS (43.89 s) - with this queue applied: tests/avocado/boot_linux_console.py:BootLinuxConsole.test_ppc_powernv8: PASS (21.23 s) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_ppc_powernv9: PASS (22.58 s) Other ppc machines, like pseries, also had a noticeable performance boost. # -----BEGIN PGP SIGNATURE----- # # iHUEABYKAB0WIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCY10J/gAKCRA82cqW3gMx # ZAbjAPwKNbE1wE2POJbMALBQAM5MewwLMV/UKGjE6jA7HAbb/AEA9e3o11FoUmSJ # rZkmTvMzBQZ81mMGRlS0cnqbrr4ADgc= # =gnKY # -----END PGP SIGNATURE----- # gpg: Signature made Sat 29 Oct 2022 07:09:50 EDT # gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164 # gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164 * tag 'pull-ppc-20221029' of https://gitlab.com/danielhb/qemu: (63 commits) target/ppc: Fix regression in Radix MMU hw/ppc/e500: Implement pflash handling hw/sd/sdhci: Rename ESDHC_* defines to USDHC_* hw/sd/sdhci-internal: Unexport ESDHC defines hw/block/pflash_cfi0{1, 2}: Error out if device length isn't a power of two docs/system/ppc/ppce500: Use qemu-system-ppc64 across the board(s) target/ppc: Increment PMC5 with inline insns target/ppc: Add new PMC HFLAGS ppc4xx_sdram: Add errp parameter to ppc4xx_sdram_banks() ppc4xx_sdram: Convert DDR SDRAM controller to new bank handling ppc4xx_sdram: Generalise bank setup ppc4xx_sdram: Rename local state variable for brevity ppc4xx_sdram: Use hwaddr for memory bank size ppc4xx_sdram: Move ppc4xx_sdram_banks() to ppc4xx_sdram.c ppc4xx_devs.c: Move DDR SDRAM controller model to ppc4xx_sdram.c ppc440_uc.c: Move DDR2 SDRAM controller model to ppc4xx_sdram.c target/ppc: move the p*_interrupt_powersave methods to excp_helper.c target/ppc: unify cpu->has_work based on cs->interrupt_request target/ppc: introduce ppc_maybe_interrupt target/ppc: remove ppc_store_lpcr from CONFIG_USER_ONLY builds ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-31Merge tag 'pull-request-2022-10-28' of https://gitlab.com/thuth/qemu into ↵Stefan Hajnoczi1-1/+1
staging * Fix and test the VISTR instruction on s390x * Some more small s390x fixes and maintainer updates * Make sure to remove all temporary files from qtests * OpenBSD VM test update to version 7.2 * Add sndio to FreeBSD tests * More patches to enable the qtests on Windows # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmNb1x8RHHRodXRoQHJl # ZGhhdC5jb20ACgkQLtnXdP5wLbXmcA//TCliiFkhprVxzIqy7zb9uz2Odu+sS4dT # azUSlXvC14fECm/Rb/rd2VLqCu5x2er8CYauxKQ4VhRImzcDta4kvpt/HKIppN2t # sqw5tipJL0DYcWBwYL1llvfutM26M+Oh0igwR8uV7b+W1FjojEZdcOr9IZ6E6V55 # wQCE5OHm0VCr61QeI5IBfZTsiPo+DFomUCpj7w66j6i0CVDvmpoe36tCmvGgrcpZ # SP7ep7/Iq+dnGh2YnJyoUOPlXeeiBCxAygOVnIRXptDeniGoliCFn7ksLdKDQ9qY # 69pSPR/W7mTZB/HkCRalAbYuYrI9Rcqxdu6c9vcyB8Pr0snQLTf8qThY+BJ2oC4w # JSGgWVniAk5MmrDazwNRkSbgngYLYf+CcT1h5AANuU5Kt50Bdy9Y3TuL5YVmofEp # N4bypV0ICImQyDECz76+i5/iJOcWiRyjMfLT6y00dspeuy983xHakrsHGD8xj0U/ # 3IVxnF9bDnUSVg6lFhYrgCB3dRG1TNPJoYQOM7raS5MAPRrDtIuSabwtyn84jo4+ # 9kZRPJBriMBHNsCjGVlJ9CATmaK1SKVAbRcabjgOKoIwhZTpAe6JalykREUJlTys # hB2V//lWWYPaSpzwY+OkvxoOmJIziixEskOmx6hPcoxID5v/bqlR69W15aUlKuLq # VWFb+/yMvaE= # =h0Ep # -----END PGP SIGNATURE----- # gpg: Signature made Fri 28 Oct 2022 09:20:31 EDT # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * tag 'pull-request-2022-10-28' of https://gitlab.com/thuth/qemu: (21 commits) tests/qtest: libqtest: Correct the timeout unit of blocking receive calls for win32 tests/qtest: libqos: Do not build virtio-9p unconditionally tests/qtest: migration-test: Make sure QEMU process "to" exited after migration is canceled tests/qtest: libqtest: Introduce qtest_wait_qemu() tests/qtest: Use EXIT_FAILURE instead of magic number tests/qtest: device-plug-test: Reverse the usage of double/single quotes tests/qtest: Support libqtest to build and run on Windows tests/qtest: Use send/recv for socket communication accel/qtest: Support qtest accelerator for Windows tests: Add sndio to the FreeBSD CI containers / VM tests/vm: update openbsd to release 7.2 tests/qtest/libqos/e1000e: Use e1000_regs.h tests/qtest/cxl-test: Remove temporary directories after testing tests/qtest/tpm: Clean up remainders of swtpm MAINTAINERS: target/s390x/: add Ilya as reviewer tests/tcg/s390x: Add a test for the vistr instruction target/s390x: Fix emulation of the VISTR instruction tests/tcg/s390x: Test compiler flags only once, not every time s390x/tod-kvm: don't save/restore the TOD in PV guests s390x: step down as general arch maintainer ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-29target/ppc: Fix regression in Radix MMULeandro Lupori1-8/+21
Commit 47e83d9107 ended up unintentionally changing the control flow of ppc_radix64_process_scoped_xlate(). When guest_visible is false, it must not raise an exception, even if the radix configuration is not valid. This regression prevented Linux boot in a nested environment with L1 using TCG and emulating KVM (cap-nested-hv=on) and L2 using KVM. L2 would hang on Linux's futex_init(), when it tested how a futex_atomic_cmpxchg_inatomic() handled a fault, because L1 would start a loop of trying to perform partition scoped translations and raising exceptions. Fixes: 47e83d9107 ("target/ppc: Improve Radix xlate level validation") Reported-by: Victor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Tested-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20221028183617.121786-1-leandro.lupori@eldorado.org.br> [danielhb: use %"PRIu64" to print 'nls'] Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Increment PMC5 with inline insnsLeandro Lupori4-39/+67
Profiling QEMU during Fedora 35 for PPC64 boot revealed that 6.39% of total time was being spent in helper_insns_inc(), on a POWER9 machine. To avoid calling this helper every time PMCs had to be incremented, an inline implementation of PMC5 increment and check for overflow was developed. This led to a reduction of about 12% in Fedora's boot time. Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20221025202424.195984-4-leandro.lupori@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Add new PMC HFLAGSLeandro Lupori3-1/+13
Add 2 new PMC related HFLAGS: - HFLAGS_PMCJCE - value of MMCR0 PMCjCE bit - HFLAGS_PMC_OTHER - set if a PMC other than PMC5-6 is enabled These flags allow further optimization of PMC5 update code, by allowing frequently tested conditions to be performed at translation time. Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20221025202424.195984-3-leandro.lupori@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: move the p*_interrupt_powersave methods to excp_helper.cMatheus Ferst3-108/+102
Move the methods to excp_helper.c and make them static. Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20221021142156.4134411-4-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: unify cpu->has_work based on cs->interrupt_requestMatheus Ferst1-93/+1
Now that cs->interrupt_request indicates if there is any unmasked interrupt, checking if the CPU has work to do can be simplified to a single check that works for all CPU models. Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20221021142156.4134411-3-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: introduce ppc_maybe_interruptMatheus Ferst6-1/+58
This new method will check if any pending interrupt was unmasked and then call cpu_interrupt/cpu_reset_interrupt accordingly. Code that raises/lowers or masks/unmasks interrupts should call this method to keep CPU_INTERRUPT_HARD coherent with env->pending_interrupts. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20221021142156.4134411-2-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: remove ppc_store_lpcr from CONFIG_USER_ONLY buildsMatheus Ferst2-1/+3
Writes to LPCR are hypervisor privileged. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-27-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: add power-saving interrupt masking logic to ↵Matheus Ferst3-13/+14
p7_next_unmasked_interrupt Export p7_interrupt_powersave and use it in p7_next_unmasked_interrupt. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-26-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: move power-saving interrupt masking out of cpu_has_work_POWER7Matheus Ferst1-20/+25
Move the interrupt masking logic out of cpu_has_work_POWER7 in a new method, p7_interrupt_powersave, that only returns an interrupt if it can wake the processor from power-saving mode. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-25-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: remove generic architecture checks from p7_deliver_interruptMatheus Ferst1-3/+0
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-24-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: remove unused interrupts from p7_deliver_interruptMatheus Ferst1-50/+0
Remove the following unused interrupts from the POWER7 interrupt processing method: - PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p; - Hypervisor Virtualization: introduced in Power ISA v3.0; - Hypervisor Doorbell and Event-Based Branch: introduced in Power ISA v2.07; - Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined for embedded CPUs; - Doorbell and Critical Doorbell Interrupt: processor does not implement the Embedded.Processor Control category; - Programmable Interval Timer: 40x-only; - PPC_INTERRUPT_THERM: only raised for 970 and POWER5p; Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-23-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: create an interrupt deliver method for POWER7Matheus Ferst1-0/+107
The new method is identical to ppc_deliver_interrupt, processor-specific code will be added/removed in the following patches. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-22-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: remove unused interrupts from p7_next_unmasked_interruptMatheus Ferst1-55/+8
Remove the following unused interrupts from the POWER7 interrupt masking method: - PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p; - Hypervisor Virtualization: introduced in Power ISA v3.0; - Hypervisor Doorbell and Event-Based Branch: introduced in Power ISA v2.07; - Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined for embedded CPUs; - Doorbell and Critical Doorbell Interrupt: processor does not implement the Embedded.Processor Control category; - Programmable Interval Timer: 40x-only; - PPC_INTERRUPT_THERM: only raised for 970 and POWER5p; Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-21-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: create an interrupt masking method for POWER7Matheus Ferst1-0/+108
The new method is identical to ppc_next_unmasked_interrupt_generic, processor-specific code will be added/removed in the following patches. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-20-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: add power-saving interrupt masking logic to ↵Matheus Ferst3-13/+14
p8_next_unmasked_interrupt Export p8_interrupt_powersave and use it in p8_next_unmasked_interrupt. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-19-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: move power-saving interrupt masking out of cpu_has_work_POWER8Matheus Ferst1-28/+33
Move the interrupt masking logic out of cpu_has_work_POWER8 in a new method, p8_interrupt_powersave, that only returns an interrupt if it can wake the processor from power-saving mode. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-18-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: remove generic architecture checks from p8_deliver_interruptMatheus Ferst1-3/+0
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-17-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: remove unused interrupts from p8_deliver_interruptMatheus Ferst1-30/+0
Remove the following unused interrupts from the POWER8 interrupt processing method: - PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p; - Debug Interrupt: removed in Power ISA v2.07; - Hypervisor Virtualization: introduced in Power ISA v3.0; - Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined for embedded CPUs; - Critical Doorbell: processor does not implement the "Embedded.Processor Control" category; - Programmable Interval Timer: 40x-only; - PPC_INTERRUPT_THERM: only raised for 970 and POWER5p; Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-16-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: create an interrupt deliver method for POWER8Matheus Ferst1-0/+107
The new method is identical to ppc_deliver_interrupt, processor-specific code will be added/removed in the following patches. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-15-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: remove unused interrupts from p8_next_unmasked_interruptMatheus Ferst1-38/+7
Remove the following unused interrupts from the POWER8 interrupt masking method: - PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970, and POWER5p; - Debug Interrupt: removed in Power ISA v2.07; - Hypervisor Virtualization: introduced in Power ISA v3.0; - Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined for embedded CPUs; - Critical Doorbell: processor does not implement the "Embedded.Processor Control" category; - Programmable Interval Timer: 40x-only; - PPC_INTERRUPT_THERM: only raised for 970 and POWER5p; Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-14-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: create an interrupt masking method for POWER8Matheus Ferst1-0/+108
The new method is identical to ppc_next_unmasked_interrupt_generic, processor-specific code will be added/removed in the following patches. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-13-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: add power-saving interrupt masking logic to ↵Matheus Ferst3-14/+38
p9_next_unmasked_interrupt Export p9_interrupt_powersave and use it in p9_next_unmasked_interrupt. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-12-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: move power-saving interrupt masking out of cpu_has_work_POWER9Matheus Ferst1-76/+50
Move the interrupt masking logic out of cpu_has_work_POWER9 in a new method, p9_interrupt_powersave, that only returns an interrupt if it can wake the processor from power-saving mode. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-11-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: remove generic architecture checks from p9_deliver_interruptMatheus Ferst1-8/+1
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-10-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: remove unused interrupts from p9_deliver_interruptMatheus Ferst1-27/+0
Remove the following unused interrupts from the POWER9 interrupt processing method: - PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p; - Debug Interrupt: removed in Power ISA v2.07; - Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined for embedded CPUs; - Critical Doorbell Interrupt: removed in Power ISA v3.0; - Programmable Interval Timer: 40x-only. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-9-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: create an interrupt deliver method for POWER9/POWER10Matheus Ferst1-0/+112
The new method is identical to ppc_deliver_interrupt, processor-specific code will be added/removed in the following patches. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-8-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: remove unused interrupts from p9_next_unmasked_interruptMatheus Ferst1-29/+7
Remove the following unused interrupts from the POWER9 interrupt masking method: - PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p; - Debug Interrupt: removed in Power ISA v2.07; - Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined for embedded CPUs; - Critical Doorbell Interrupt: removed in Power ISA v3.0; - Programmable Interval Timer: 40x-only. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-7-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: create an interrupt masking method for POWER9/POWER10Matheus Ferst1-0/+113
The new method is identical to ppc_next_unmasked_interrupt_generic, processor-specific code will be added/removed in the following patches. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-6-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: prepare to split interrupt masking and delivery by excp_modelMatheus Ferst1-2/+18
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-5-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: split interrupt masking and delivery from ppc_hw_interruptMatheus Ferst1-76/+125
Split ppc_hw_interrupt into an interrupt masking method, ppc_next_unmasked_interrupt, and an interrupt processing method, ppc_deliver_interrupt. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20221011204829.1641124-4-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: always use ppc_set_irq to set env->pending_interruptsMatheus Ferst2-17/+9
Use ppc_set_irq to raise/clear interrupts to ensure CPU_INTERRUPT_HARD will be set/reset accordingly. Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20221011204829.1641124-3-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: define PPC_INTERRUPT_* values directlyMatheus Ferst4-88/+88
This enum defines the bit positions in env->pending_interrupts for each interrupt. However, except for the comparison in kvmppc_set_interrupt, the values are always used as (1 << PPC_INTERRUPT_*). Define them directly like that to save some clutter. No functional change intended. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20221011204829.1641124-2-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Use gvec to decode XVTSTDC[DS]PLucas Mateus Castro (alqotel)1-10/+157
Used gvec to translate XVTSTDCSP and XVTSTDCDP. xvtstdcsp: rept loop imm master version prev version current version 25 4000 0 0,206200 0,040730 (-80.2%) 0,040740 (-80.2%) 25 4000 1 0,205120 0,053650 (-73.8%) 0,053510 (-73.9%) 25 4000 3 0,206160 0,058630 (-71.6%) 0,058570 (-71.6%) 25 4000 51 0,217110 0,191490 (-11.8%) 0,192320 (-11.4%) 25 4000 127 0,206160 0,191490 (-7.1%) 0,192640 (-6.6%) 8000 12 0 1,234719 0,418833 (-66.1%) 0,386365 (-68.7%) 8000 12 1 1,232417 1,435979 (+16.5%) 1,462792 (+18.7%) 8000 12 3 1,232760 1,766073 (+43.3%) 1,743990 (+41.5%) 8000 12 51 1,239281 1,319562 (+6.5%) 1,423479 (+14.9%) 8000 12 127 1,231708 1,315760 (+6.8%) 1,426667 (+15.8%) xvtstdcdp: rept loop imm master version prev version current version 25 4000 0 0,159930 0,040830 (-74.5%) 0,040610 (-74.6%) 25 4000 1 0,160640 0,053670 (-66.6%) 0,053480 (-66.7%) 25 4000 3 0,160020 0,063030 (-60.6%) 0,062960 (-60.7%) 25 4000 51 0,160410 0,128620 (-19.8%) 0,127470 (-20.5%) 25 4000 127 0,160330 0,127670 (-20.4%) 0,128690 (-19.7%) 8000 12 0 1,190365 0,422146 (-64.5%) 0,388417 (-67.4%) 8000 12 1 1,191292 1,445312 (+21.3%) 1,428698 (+19.9%) 8000 12 3 1,188687 1,980656 (+66.6%) 1,975354 (+66.2%) 8000 12 51 1,191250 1,264500 (+6.1%) 1,355083 (+13.8%) 8000 12 127 1,197313 1,266729 (+5.8%) 1,349156 (+12.7%) Overall, these instructions are the hardest ones to measure performance as the gvec implementation is affected by the immediate. Above there are 5 different scenarios when it comes to immediate and 2 when it comes to rept/loop combination. The immediates scenarios are: all bits are 0 therefore the target register should just be changed to 0, with 1 bit set, with 2 bits set in a combination the new implementation can deal with using gvec, 4 bits set and the new implementation can't deal with it using gvec and all bits set. The rept/loop scenarios are high loop and low rept (so it should spend more time executing it than translating it) and high rept low loop (so it should spend more time translating it than executing this code). These comparisons are between the upstream version, a previous similar implementation and a one with a cleaner code(this one). For a comparison with o previous different implementation: <20221010191356.83659-13-lucas.araujo@eldorado.org.br> Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-13-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Moved XSTSTDC[QDS]P to decodetreeLucas Mateus Castro (alqotel)5-90/+60
Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of its decoding away from the helper as previously the DCMX, XB and BF were calculated in the helper with the help of cpu_env, now that part was moved to the decodetree with the rest. xvtstdcsp: rept loop master patch 8 12500 1,85393600 1,94683600 (+5.0%) 25 4000 1,78779800 1,92479000 (+7.7%) 100 1000 2,12775000 2,28895500 (+7.6%) 500 200 2,99655300 3,23102900 (+7.8%) 2500 40 6,89082200 7,44827500 (+8.1%) 8000 12 17,50585500 18,95152100 (+8.3%) xvtstdcdp: rept loop master patch 8 12500 1,39043100 1,33539800 (-4.0%) 25 4000 1,35731800 1,37347800 (+1.2%) 100 1000 1,51514800 1,56053000 (+3.0%) 500 200 2,21014400 2,47906000 (+12.2%) 2500 40 5,39488200 6,68766700 (+24.0%) 8000 12 13,98623900 18,17661900 (+30.0%) xvtstdcdp: rept loop master patch 8 12500 1,35123800 1,34455800 (-0.5%) 25 4000 1,36441200 1,36759600 (+0.2%) 100 1000 1,49763500 1,54138400 (+2.9%) 500 200 2,19020200 2,46196400 (+12.4%) 2500 40 5,39265700 6,68147900 (+23.9%) 8000 12 14,04163600 18,19669600 (+29.6%) As some values are now decoded outside the helper and passed to it as an argument the number of arguments of the helper increased, the number of TCGop needed to load the arguments increased. I suspect that's why the slow-down in the tests with a high REPT but low LOOP. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-12-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Moved XVTSTDC[DS]P to decodetreeLucas Mateus Castro (alqotel)5-14/+70
Moved XVTSTDCSP and XVTSTDCDP to decodetree an restructured the helper to be simpler and do all decoding in the decodetree (so XB, XT and DCMX are all calculated outside the helper). Obs: The tests in this one are slightly different, these are the sum of these instructions with all possible immediate and those instructions are repeated 10 times. xvtstdcsp: rept loop master patch 8 12500 2,76402100 2,70699100 (-2.1%) 25 4000 2,64867100 2,67884100 (+1.1%) 100 1000 2,73806300 2,78701000 (+1.8%) 500 200 3,44666500 3,61027600 (+4.7%) 2500 40 5,85790200 6,47475500 (+10.5%) 8000 12 15,22102100 17,46062900 (+14.7%) xvtstdcdp: rept loop master patch 8 12500 2,11818000 1,61065300 (-24.0%) 25 4000 2,04573400 1,60132200 (-21.7%) 100 1000 2,13834100 1,69988100 (-20.5%) 500 200 2,73977000 2,48631700 (-9.3%) 2500 40 5,05067000 5,25914100 (+4.1%) 8000 12 14,60507800 15,93704900 (+9.1%) Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-11-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Use gvec to decode XVCPSGN[SD]PLucas Mateus Castro (alqotel)3-59/+55
Moved XVCPSGNSP and XVCPSGNDP to decodetree and used gvec to translate them. xvcpsgnsp: rept loop master patch 8 12500 0,00561400 0,00537900 (-4.2%) 25 4000 0,00562100 0,00400000 (-28.8%) 100 1000 0,00696900 0,00416300 (-40.3%) 500 200 0,02211900 0,00840700 (-62.0%) 2500 40 0,09328600 0,02728300 (-70.8%) 8000 12 0,27295300 0,06867800 (-74.8%) xvcpsgndp: rept loop master patch 8 12500 0,00556300 0,00584200 (+5.0%) 25 4000 0,00482700 0,00431700 (-10.6%) 100 1000 0,00585800 0,00464400 (-20.7%) 500 200 0,01565300 0,00839700 (-46.4%) 2500 40 0,05766500 0,02430600 (-57.8%) 8000 12 0,19875300 0,07947100 (-60.0%) Like the previous instructions there seemed to be a improvement on translation time. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-10-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Use gvec to decode XV[N]ABS[DS]P/XVNEG[DS]PLucas Mateus Castro (alqotel)3-12/+76
Moved XVABSSP, XVABSDP, XVNABSSP,XVNABSDP, XVNEGSP and XVNEGDP to decodetree and used gvec to translate them. xvabssp: rept loop master patch 8 12500 0,00477900 0,00476000 (-0.4%) 25 4000 0,00442800 0,00353300 (-20.2%) 100 1000 0,00478700 0,00366100 (-23.5%) 500 200 0,00973200 0,00649400 (-33.3%) 2500 40 0,03165200 0,02226700 (-29.7%) 8000 12 0,09315900 0,06674900 (-28.3%) xvabsdp: rept loop master patch 8 12500 0,00475000 0,00474400 (-0.1%) 25 4000 0,00355600 0,00367500 (+3.3%) 100 1000 0,00444200 0,00366000 (-17.6%) 500 200 0,00942700 0,00732400 (-22.3%) 2500 40 0,02990000 0,02308500 (-22.8%) 8000 12 0,08770300 0,06683800 (-23.8%) xvnabssp: rept loop master patch 8 12500 0,00494500 0,00492900 (-0.3%) 25 4000 0,00397700 0,00338600 (-14.9%) 100 1000 0,00421400 0,00353500 (-16.1%) 500 200 0,01048000 0,00707100 (-32.5%) 2500 40 0,03251500 0,02238300 (-31.2%) 8000 12 0,08889100 0,06469800 (-27.2%) xvnabsdp: rept loop master patch 8 12500 0,00511000 0,00492700 (-3.6%) 25 4000 0,00398800 0,00381500 (-4.3%) 100 1000 0,00390500 0,00365900 (-6.3%) 500 200 0,00924800 0,00784600 (-15.2%) 2500 40 0,03138900 0,02391600 (-23.8%) 8000 12 0,09654200 0,05684600 (-41.1%) xvnegsp: rept loop master patch 8 12500 0,00493900 0,00452800 (-8.3%) 25 4000 0,00369100 0,00366800 (-0.6%) 100 1000 0,00371100 0,00380000 (+2.4%) 500 200 0,00991100 0,00652300 (-34.2%) 2500 40 0,03025800 0,02422300 (-19.9%) 8000 12 0,09251100 0,06457600 (-30.2%) xvnegdp: rept loop master patch 8 12500 0,00474900 0,00454400 (-4.3%) 25 4000 0,00353100 0,00325600 (-7.8%) 100 1000 0,00398600 0,00366800 (-8.0%) 500 200 0,01032300 0,00702400 (-32.0%) 2500 40 0,03125000 0,02422400 (-22.5%) 8000 12 0,09475100 0,06173000 (-34.9%) This one to me seemed the opposite of the previous instructions, as it looks like there was an improvement in the translation time (itself not a surprise as operations were done twice before so there was the need to translate twice as many TCGop) Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-9-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Move VABSDU[BHW] to decodetree and use gvecLucas Mateus Castro (alqotel)5-17/+60
Moved VABSDUB, VABSDUH and VABSDUW to decodetree and use gvec to translate them. vabsdub: rept loop master patch 8 12500 0,03601600 0,00688500 (-80.9%) 25 4000 0,03651000 0,00532100 (-85.4%) 100 1000 0,03666900 0,00595300 (-83.8%) 500 200 0,04305800 0,01244600 (-71.1%) 2500 40 0,06893300 0,04273700 (-38.0%) 8000 12 0,14633200 0,12660300 (-13.5%) vabsduh: rept loop master patch 8 12500 0,02172400 0,00687500 (-68.4%) 25 4000 0,02154100 0,00531500 (-75.3%) 100 1000 0,02235400 0,00596300 (-73.3%) 500 200 0,02827500 0,01245100 (-56.0%) 2500 40 0,05638400 0,04285500 (-24.0%) 8000 12 0,13166000 0,12641400 (-4.0%) vabsduw: rept loop master patch 8 12500 0,01646400 0,00688300 (-58.2%) 25 4000 0,01454500 0,00475500 (-67.3%) 100 1000 0,01545800 0,00511800 (-66.9%) 500 200 0,02168200 0,01114300 (-48.6%) 2500 40 0,04571300 0,04138800 (-9.5%) 8000 12 0,12209500 0,12178500 (-0.3%) Same as VADDCUW and VSUBCUW, overall performance gain but it uses more TCGop (4 before the patch, 6 after). Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-8-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Move VAVG[SU][BHW] to decodetree and use gvecLucas Mateus Castro (alqotel)5-41/+127
Moved the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH, VAVGSW, to decodetree and use gvec with them. For these one the right shift had to be made before the sum as to avoid an overflow, so add 1 at the end if any of the entries had 1 in its LSB as to replicate the "+ 1" before the shift described by the ISA. vavgub: rept loop master patch 8 12500 0,02616600 0,00754200 (-71.2%) 25 4000 0,02530000 0,00637700 (-74.8%) 100 1000 0,02604600 0,00790100 (-69.7%) 500 200 0,03189300 0,01838400 (-42.4%) 2500 40 0,06006900 0,06851000 (+14.1%) 8000 12 0,13941000 0,20548500 (+47.4%) vavguh: rept loop master patch 8 12500 0,01818200 0,00780600 (-57.1%) 25 4000 0,01789300 0,00641600 (-64.1%) 100 1000 0,01899100 0,00787200 (-58.5%) 500 200 0,02527200 0,01828400 (-27.7%) 2500 40 0,05361800 0,06773000 (+26.3%) 8000 12 0,12886600 0,20291400 (+57.5%) vavguw: rept loop master patch 8 12500 0,01423100 0,00776600 (-45.4%) 25 4000 0,01780800 0,00638600 (-64.1%) 100 1000 0,02085500 0,00787000 (-62.3%) 500 200 0,02737100 0,01828800 (-33.2%) 2500 40 0,05572600 0,06774200 (+21.6%) 8000 12 0,13101700 0,20311600 (+55.0%) vavgsb: rept loop master patch 8 12500 0,03006000 0,00788600 (-73.8%) 25 4000 0,02882200 0,00637800 (-77.9%) 100 1000 0,02958000 0,00791400 (-73.2%) 500 200 0,03548800 0,01860400 (-47.6%) 2500 40 0,06360000 0,06850800 (+7.7%) 8000 12 0,13816500 0,20550300 (+48.7%) vavgsh: rept loop master patch 8 12500 0,01965900 0,00776600 (-60.5%) 25 4000 0,01875400 0,00638700 (-65.9%) 100 1000 0,01952200 0,00786900 (-59.7%) 500 200 0,02562000 0,01760300 (-31.3%) 2500 40 0,05384300 0,06742800 (+25.2%) 8000 12 0,13240800 0,20330000 (+53.5%) vavgsw: rept loop master patch 8 12500 0,01407700 0,00775600 (-44.9%) 25 4000 0,01762300 0,00640000 (-63.7%) 100 1000 0,02046500 0,00788500 (-61.5%) 500 200 0,02745600 0,01843000 (-32.9%) 2500 40 0,05375500 0,06820500 (+26.9%) 8000 12 0,13068300 0,20304900 (+55.4%) These results to me seems to indicate that with gvec the results have a slower translation but faster execution. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-7-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>