author | Roger Sayle <roger@nextmovesoftware.com> | 2022-06-27 07:47:40 +0100 |
---|---|---|
committer | Roger Sayle <roger@nextmovesoftware.com> | 2022-06-27 07:47:40 +0100 |
commit | 64d4f27a0ce47e97867512bda7fa5683acf8a134 (patch) | |
tree | ce98fc5bd02f772e5fffe058cb00d38efe6ca474 /gcc/final.cc | |
parent | f3f73e86ec8613f176db3e52bbfbfbb9636cb714 (diff) | |
Implement __imag__ of float _Complex using shufps on x86_64.

This patch is a follow-up improvement to my recent patch for
PR rtl-optimization/7061.  That patch added the test case
gcc.target/i386/pr7061-2.c:

    float im(float _Complex a) { return __imag__ a; }

For which GCC on x86_64 currently generates:

    movq    %xmm0, %rax
    shrq    $32, %rax
    movd    %eax, %xmm0
    ret

but with this patch we now generate (the same as LLVM):

    shufps  $85, %xmm0, %xmm0
    ret

This is achieved by providing a define_insn_and_split that allows
truncated lshiftrt:DI by 32 to be performed on either SSE or general
regs, where if the register allocator prefers to use SSE, we split
to a shufps_v4si, or if not, we use a regular shrq.
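
The equivalence of the two sequences can be checked with a small portable C sketch. It is an illustration only, not part of the patch: the helper names (float_bits, im_via_shift, shufps_same, im_via_shufps) are hypothetical, and the layout it models (real part in bits 0-31 and imaginary part in bits 32-63 of %xmm0) is assumed from the assembly above. The immediate 85 is 0b01010101, so each of the four 2-bit selectors picks element 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back (no aliasing issues). */
static uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* General-register path: models movq / shrq $32 / movd.
   Assumed layout: real part in bits 0-31, imaginary part in bits 32-63. */
static float im_via_shift(float re, float im)
{
    uint64_t packed = ((uint64_t)float_bits(im) << 32) | float_bits(re);
    uint32_t high = (uint32_t)(packed >> 32);   /* truncated lshiftrt:DI by 32 */
    return bits_float(high);
}

/* Software model of "shufps $imm8, %x, %x" (source and destination equal):
   result lane i takes input element (imm8 >> 2*i) & 3. */
static void shufps_same(float v[4], unsigned imm8)
{
    float in[4];
    memcpy(in, v, sizeof in);
    for (int lane = 0; lane < 4; lane++)
        v[lane] = in[(imm8 >> (2 * lane)) & 3];
}

/* SSE path: with imm8 = 85 every lane receives element 1, so lane 0
   (the scalar float return value) ends up holding the imaginary part. */
static float im_via_shufps(float re, float im)
{
    float v[4] = { re, im, 0.0f, 0.0f };
    shufps_same(v, 85);
    return v[0];
}
```

Calling im_via_shift(1.5f, -2.25f) and im_via_shufps(1.5f, -2.25f) should both yield -2.25f, which is why a single shufps can stand in for the three-instruction round trip through %rax.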

2022-06-27  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
    PR rtl-optimization/7061
    * config/i386/i386.md (*highpartdisi2): New define_insn_and_split.

gcc/testsuite/ChangeLog
    PR rtl-optimization/7061
    * gcc.target/i386/pr7061-2.c: Update to look for shufps.