aboutsummaryrefslogtreecommitdiff
path: root/tcg
AgeCommit message (Collapse)AuthorFilesLines
2013-03-22Use proper term in TCG README陳韋任 (Wei-Ren Chen)1-5/+9
In TCG, "target" means the host architecture for which TCG generates the code. Using "guest" rather than "target" to make the document more consistent. Signed-off-by: Chen Wei-Ren <chenwj@iis.sinica.edu.tw> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-03Handle CPU interrupts by inline checking of a flagPeter Maydell1-0/+5
Fix some of the nasty TCG race conditions and crashes by implementing cpu_exit() as setting a flag which is checked at the start of each TB. This avoids crashes if a thread or signal handler calls cpu_exit() while the execution thread is itself modifying the TB graph (which may happen in system emulation mode as well as in linux-user mode with a multithreaded guest binary). This fixes the crashes seen in LP:668799; however there are another class of crashes described in LP:1098729 which stem from the fact that in linux-user with a multithreaded guest all threads will use and modify the same global TCG date structures (including the generated code buffer) without any kind of locking. This means that multithreaded guest binaries are still in the "unsupported" category. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-03-03tcg: Document tcg_qemu_tb_exec() and provide constants for low bit usesPeter Maydell1-1/+43
Document tcg_qemu_tb_exec(). In particular, its return value is a combination of a pointer to the next translation block and some extra information in the low two bits. Provide some #defines for the values passed in these bits to improve code clarity. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-03-03tcg-sparc: fix buildBlue Swirl1-1/+1
Fix build breakage by 803d805bcef4ea7b7d6ef0b4929263e1160d6b3c: make tcg_out_addsub2() always available. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-26qemu-log: default to stderr for logging outputPeter Maydell1-1/+1
Switch the default for qemu_log logging output from "/tmp/qemu.log" to stderr. This is an incompatible change in some sense, but logging is mostly used for debugging purposes so it shouldn't affect production use. The previous behaviour can be obtained by adding "-D /tmp/qemu.log" to the command line. This change requires us to: * update all the documentation/help text (we take the opportunity to smooth out minor inconsistencies between the phrasing in linux-user/bsd-user/system help messages) * make linux-user and bsd-user defer to qemu-log for the default logging destination rather than overriding it themselves * ensure that all logfile closing is done via qemu_log_close() and that that function doesn't close stderr as well as the obvious change to the behaviour of do_qemu_set_log() when no logfile name has been specified. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-id: 1361901160-28729-1-git-send-email-peter.maydell@linaro.org Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-23tcg: Apply life analysis to 64-bit multiword arithmetic opsRichard Henderson1-8/+20
Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg: Implement muls2 with mulu2Richard Henderson1-0/+40
Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg-arm: Implement muls2_i32Richard Henderson2-1/+5
We even had the encoding of smull already handy... Cc: Andrzej Zaborowski <balrogg@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg-i386: Implement multiword arithmetic opsRichard Henderson2-17/+26
Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg: Implement multiword addition helpersRichard Henderson1-0/+82
Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg: Implement multiword multiply helpersRichard Henderson2-0/+86
Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg: Implement a 64-bit to 32-bit extraction helperRichard Henderson1-0/+22
We're going to have use for this shortly in implementing other helpers. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg: Add signed multiword multiplication operationsRichard Henderson14-0/+24
Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg: Add 64-bit multiword arithmetic operationsRichard Henderson10-14/+41
Matching the 32-bit multiword arithmetic that we already have. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg-sparc: Always implement 32-bit multiword opsRichard Henderson2-6/+7
Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg-i386: Always implement 32-bit multiword opsRichard Henderson2-12/+13
Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23tcg: Make 32-bit multiword operations optional for 64-bit hostsRichard Henderson8-4/+29
Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-17tcg/ppc: Fix build of tcg_qemu_tb_exec()Andreas Färber1-1/+1
Commit 0b0d3320db74cde233ee7855ad32a9c121d20eb4 (TCG: Final globals clean-up) moved code_gen_prologue but forgot to update ppc code. This broke the build on 32-bit ppc. ppc64 is unaffected. Cc: Evgeny Voevodin <evgenyvoevodin@gmail.com> Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Andreas Färber <andreas.faerber@web.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-16qemu-log: Rename the public-facing cpu_set_log function to qemu_set_logPeter Maydell1-1/+1
Rename the public-facing function cpu_set_log to qemu_set_log. This requires us to rename the internal-only qemu_set_log() to do_qemu_set_log(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-16TCG: Move translation block variables to new context inside tcg_ctx: tb_ctxEvgeny Voevodin1-0/+2
It's worth to clean-up translation blocks variables and move them into one context as was suggested by Swirl. Also if we use this context directly inside tcg_ctx, then it speeds up code generation a bit. Signed-off-by: Evgeny Voevodin <evgenyvoevodin@gmail.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-16TCG: Final globals clean-upEvgeny Voevodin2-4/+12
Signed-off-by: Evgeny Voevodin <evgenyvoevodin@gmail.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-19tcg/target-arm: Add missing parens to assertionsPeter Maydell1-2/+2
Silence a (legitimate) complaint about missing parentheses: tcg/arm/tcg-target.c: In function ‘tcg_out_qemu_ld’: tcg/arm/tcg-target.c:1148:5: error: suggest parentheses around comparison in operand of ‘&’ [-Werror=parentheses] tcg/arm/tcg-target.c: In function ‘tcg_out_qemu_st’: tcg/arm/tcg-target.c:1357:5: error: suggest parentheses around comparison in operand of ‘&’ [-Werror=parentheses] which meant that we would mistakenly always assert if running a QEMU built with debug enabled on ARM. Signed-off-by: Peter Maydell <peter.maydelL@linaro.org> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-19optimize: optimize using nonzero bitsPaolo Bonzini1-2/+28
This adds two optimizations using the non-zero bit mask. In some cases involving shifts or ANDs the value can become zero, and can thus be optimized to a move of zero. Second, useless zero-extension or an AND with constant can be detected that would only zero bits that are already zero. The main advantage of this optimization is that it turns zero-extensions into moves, thus enabling much better copy propagation (around 1% code reduction). Here is for example a "test $0xff0000,%ecx + je" before optimization: mov_i64 tmp0,rcx movi_i64 tmp1,$0xff0000 discard cc_src and_i64 cc_dst,tmp0,tmp1 movi_i32 cc_op,$0x1c ext32u_i64 tmp0,cc_dst movi_i64 tmp12,$0x0 brcond_i64 tmp0,tmp12,eq,$0x0 and after (without patch on the left, with on the right): movi_i64 tmp1,$0xff0000 movi_i64 tmp1,$0xff0000 discard cc_src discard cc_src and_i64 cc_dst,rcx,tmp1 and_i64 cc_dst,rcx,tmp1 movi_i32 cc_op,$0x1c movi_i32 cc_op,$0x1c ext32u_i64 tmp0,cc_dst movi_i64 tmp12,$0x0 movi_i64 tmp12,$0x0 brcond_i64 tmp0,tmp12,eq,$0x0 brcond_i64 cc_dst,tmp12,eq,$0x0 Other similar cases: "test %eax, %eax + jne" where eax is already 32-bit (after optimization, without patch on the left, with on the right): discard cc_src discard cc_src mov_i64 cc_dst,rax mov_i64 cc_dst,rax movi_i32 cc_op,$0x1c movi_i32 cc_op,$0x1c ext32u_i64 tmp0,cc_dst movi_i64 tmp12,$0x0 movi_i64 tmp12,$0x0 brcond_i64 tmp0,tmp12,ne,$0x0 brcond_i64 rax,tmp12,ne,$0x0 "test $0x1, %dl + je": movi_i64 tmp1,$0x1 movi_i64 tmp1,$0x1 discard cc_src discard cc_src and_i64 cc_dst,rdx,tmp1 and_i64 cc_dst,rdx,tmp1 movi_i32 cc_op,$0x1a movi_i32 cc_op,$0x1a ext8u_i64 tmp0,cc_dst movi_i64 tmp12,$0x0 movi_i64 tmp12,$0x0 brcond_i64 tmp0,tmp12,eq,$0x0 brcond_i64 cc_dst,tmp12,eq,$0x0 In some cases TCG even outsmarts GCC. :) Here the input code has "and $0x2,%eax + movslq %eax,%rbx + test %rbx, %rbx" and the optimizer, thanks to copy propagation, does the following: movi_i64 tmp12,$0x2 movi_i64 tmp12,$0x2 and_i64 rax,rax,tmp12 and_i64 rax,rax,tmp12 mov_i64 cc_dst,rax mov_i64 cc_dst,rax ext32s_i64 tmp0,rax -> nop mov_i64 rbx,tmp0 -> mov_i64 rbx,cc_dst and_i64 cc_dst,rbx,rbx -> nop Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-19optimize: track nonzero bits of registersPaolo Bonzini1-22/+110
Add a "mask" field to the tcg_temp_info struct. A bit that is zero in "mask" will always be zero in the corresponding temporary. Zero bits in the mask can be produced from moves of immediates, zero-extensions, ANDs with constants, shifts; they can then be be propagated by logical operations, shifts, sign-extensions, negations, deposit operations, and conditional moves. Other operations will just reset the mask to all-ones, i.e. unknown. [rth: s/target_ulong/tcg_target_ulong/] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-19optimize: only write to state when clearing optimizer dataPaolo Bonzini1-5/+14
The next patch will add to the TCG optimizer a field that should be non-zero in the default case. Thus, replace the memset of the temps array with a loop. Only the state field has to be up-to-date, because others are not used except if the state is TCG_TEMP_COPY or TCG_TEMP_CONST. [rth: Extracted the loop to a function.] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-12tcg-i386: use LEA for 3-operand 64-bit additionPaolo Bonzini1-1/+1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-02tcg: Remove unneeded assertionStefan Weil1-1/+0
Commit 7f6f0ae5b95adfa76e10eabe2c34424a955fd10c added two assertions. One of these assertions is not needed: The pointer ts is never NULL because it is initialized with the address of an array element. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-12-29tcg-hppa: Fix typo in brcond2Richard Henderson1-1/+1
Reported-by: Stuart Brady <sdb@zubnet.me.uk> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-29tcg-i386: Perform cmov detection at runtime for 32-bit.Richard Henderson2-6/+30
Existing compile-time detection is spotty at best. Convert it all to runtime detection instead. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-29tcg: Add TCGV_IS_UNUSED_*Richard Henderson2-0/+5
Cc: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-19misc: move include files to include/qemu/Paolo Bonzini1-3/+3
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19exec: move include files to include/exec/Paolo Bonzini9-9/+9
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19janitor: add guards to headersPaolo Bonzini9-0/+27
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-10Merge remote-tracking branch 'stefanha/trivial-patches' into stagingAnthony Liguori1-2/+2
* stefanha/trivial-patches: pc_sysfw: Plug memory leak on pc_fw_add_pflash_drv() error path qemu-options: Fix space at EOL Fix spelling in comments and documentation Clean up pci_drive_hot_add()'s use of BlockInterfaceType arm: a9mpcore: remove un-used ptimer_iomem field target-sparc: Remove t0, t1 from CPUSPARCState target-m68k: Remove t1 from CPUM68KState target-alpha: Remove t0, t1 from CPUAlphaState s390x: Spelling fixes (endianess -> endianness, occured -> occurred) Fix comments (adress -> address, layed -> laid, wierd -> weird) Fix spelling (prefered -> preferred) configure: Remove stray debug output sd: Send debug printfery to stderr not stdout Conflicts: configure Resolve spelling conflict in configure. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-12-08tcg/tcg.h: Duplicate global TCG gen_opc_ arrays into TCGContext.Evgeny Voevodin1-0/+3
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-07Fix comments (adress -> address, layed -> laid, wierd -> weird)Stefan Weil1-2/+2
Remove also a duplicated 'the'. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-11-24tcg: mark local temps as MEM in dead_temp()Aurelien Jarno1-1/+1
In dead_temp, local temps should always be marked as back to memory, even if they have not been allocated (i.e. they are discared before cross a basic block). It fixes the following assertion in target-xtensa: qemu-system-xtensa: tcg/tcg.c:1665: temp_save: Assertion `s->temps[temp].val_type == 2 || s->temps[temp].fixed_reg' failed. Aborted Reported-by: Max Filippov <jcmvbkbc@gmail.com> Tested-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-11-24tcg/arm: fix cross-endian qemu_st16Aurelien Jarno1-2/+18
The bswap16 TCG opcode assumes that the high bytes of the temp equal to 0 before calling it. The ARM backend implementation takes this assumption to slightly optimize the generated code. The same implementation is called for implementing the cross-endian qemu_st16 opcode, where this assumption is not true anymore. One way to fix that would be to zero the high bytes before calling it. Given the store instruction just ignore them, it is possible to provide a slightly more optimized version. With ARMv6+ the rev16 instruction does the work correctly. For lower ARM versions the patch provides a version which behaves correctly with non-zero high bytes, but fill them with junk. Cc: Andrzej Zaborowski <balrogg@gmail.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-11-24tcg/arm: fix TLB access in qemu-ld/st opsAurelien Jarno1-36/+42
The TCG arm backend considers likely that the offset to the TLB entries does not exceed 12 bits for mem_index = 0. In practice this is not true for at least the MIPS target. The current patch fixes that by loading the bits 23-12 with a separate instruction, and using loads with address writeback, independently of the value of mem_idx. In total this allow a 24-bit offset, which is a lot more than needed. Cc: Andrzej Zaborowski <balrogg@gmail.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: qemu-stable@nongnu.org Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-11-21tcg/ppc: Fix !softmmu casemalc1-4/+8
Signed-off-by: malc <av1474@comtv.ru>
2012-11-19tcg/ppc: Remove unused s_bits variablemalc1-3/+0
Thanks to Alexander Graf for heads up. Signed-off-by: malc <av1474@comtv.ru>
2012-11-18tci: Support deposit operationsStefan Weil2-2/+26
The operations for INDEX_op_deposit_i32 and INDEX_op_deposit_i64 are now supported and enabled by default. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-11-17TCG: Remove unused global variablesEvgeny Voevodin2-8/+0
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-11-17TCG: Use gen_opparam_buf from context instead of global variable.Evgeny Voevodin1-5/+6
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-11-17TCG: Use gen_opc_buf from context instead of global variable.Evgeny Voevodin2-46/+46
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-11-17TCG: Use gen_opparam_ptr from context instead of global variable.Evgeny Voevodin2-145/+145
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-11-17TCG: Use gen_opc_ptr from context instead of global variable.Evgeny Voevodin2-43/+43
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-11-17tcg/tcg.h: Duplicate global TCG variables in TCGContextEvgeny Voevodin1-0/+6
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-11-11tcg: properly check that op's output needs to be synced to memoryKirill Batuzov1-4/+4
Fix typo introduced in b3a1be87bac3a6aaa59bb88c1410f170dc9b22d5. Reported-by: Ruslan Savchenko <ruslan.savchenko@gmail.com> Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-11-06tcg/ppc32: Use trampolines to trim the code size for mmu slow path accessorsmalc1-8/+24
mmu access looks something like: <check tlb> if miss goto slow_path <fast path> done: ... ; end of the TB slow_path: <pre process> mr r3, r27 ; move areg0 to r3 ; (r3 holds the first argument for all the PPC32 ABIs) <call mmu_helper> b $+8 .long done <post process> b done On ppc32 <call mmu_helper> is: (SysV and Darwin) mmu_helper is most likely not within direct branching distance from the call site, necessitating a. moving 32 bit offset of mmu_helper into a GPR ; 8 bytes b. moving GPR to CTR/LR ; 4 bytes c. (finally) branching to CTR/LR ; 4 bytes r3 setting - 4 bytes call - 16 bytes dummy jump over retaddr - 4 bytes embedded retaddr - 4 bytes Total overhead - 28 bytes (PowerOpen (AIX)) a. moving 32 bit offset of mmu_helper's TOC into a GPR1 ; 8 bytes b. loading 32 bit function pointer into GPR2 ; 4 bytes c. moving GPR2 to CTR/LR ; 4 bytes d. loading 32 bit small area pointer into R2 ; 4 bytes e. (finally) branching to CTR/LR ; 4 bytes r3 setting - 4 bytes call - 24 bytes dummy jump over retaddr - 4 bytes embedded retaddr - 4 bytes Total overhead - 36 bytes Following is done to trim the code size of slow path sections: In tcg_target_qemu_prologue trampolines are emitted that look like this: trampoline: mfspr r3, LR addi r3, 4 mtspr LR, r3 ; fixup LR to point over embedded retaddr mr r3, r27 <jump mmu_helper> ; tail call of sorts And slow path becomes: slow_path: <pre process> <call trampoline> .long done <post process> b done call - 4 bytes (trampoline is within code gen buffer and most likely accessible via direct branch) embedded retaddr - 4 bytes Total overhead - 8 bytes In the end the icache pressure is decreased by 20/28 bytes at the cost of an extra jump to trampoline and adjusting LR (to skip over embedded retaddr) once inside. Signed-off-by: malc <av1474@comtv.ru>