author: Roger Sayle <roger@nextmovesoftware.com> 2024-08-15 22:02:05 +0100
committer: Roger Sayle <roger@nextmovesoftware.com> 2024-08-15 22:02:05 +0100
commit: b6fb4f7f651d2aa89548c5833fe2679af2638df5
tree: 5e0265e427ea0a1463259ab09770a626fa28d342 /gcc/dwarf2codeview.cc
parent: 0f8b11968472ff12674d67fd856610646b373bd0
i386: Improve split of *extendv2di2_highpart_stv_noavx512vl.

This patch follows up on the previous patch to fix PR target/116275 by
improving the code STV (ultimately) generates for highpart sign extensions
like (x<<8)>>8.  The arithmetic right shift is able to take advantage of
the available common subexpressions from the preceding left shift.

Hence previously with -O2 -m32 -mavx -mno-avx512vl we'd generate:

        vpsllq  $8, %xmm0, %xmm0
        vpsrad  $8, %xmm0, %xmm1
        vpsrlq  $8, %xmm0, %xmm0
        vpblendw        $51, %xmm0, %xmm1, %xmm0

But with improved splitting, we now generate three instructions:

        vpslld  $8, %xmm1, %xmm0
        vpsrad  $8, %xmm0, %xmm0
        vpblendw        $51, %xmm1, %xmm0, %xmm0

This patch also implements Uros' suggestion that the pre-reload splitter
could introduce a new pseudo to hold the intermediate, to potentially help
reload with register allocation.  This applies when not performing the
above optimization, i.e. on TARGET_XOP.

2024-08-15  Roger Sayle  <roger@nextmovesoftware.com>
            Uros Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
        * config/i386/i386.md (*extendv2di2_highpart_stv_noavx512vl): Split
        to an improved implementation on !TARGET_XOP.  On TARGET_XOP, use a
        new pseudo for the intermediate to simplify register allocation.

gcc/testsuite/ChangeLog
        * g++.target/i386/pr116275-2.C: New test case.
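
As an illustration only (this is not the code from the patch or from
pr116275-2.C), the following scalar C++ sketch models why the
three-instruction sequence computes the same value as (x<<8)>>8 on one
64-bit lane.  The function names are hypothetical.  It assumes that >> on
a negative signed value is an arithmetic shift and that unsigned-to-signed
conversion wraps, which C++20 guarantees and GCC has always implemented.

```cpp
#include <cassert>
#include <cstdint>

// Reference: the source-level idiom from the commit message, (x<<8)>>8,
// which sign-extends the low 56 bits of a 64-bit value.
// The left shift is done on the unsigned type to avoid signed overflow.
inline std::int64_t highpart_sext_ref(std::int64_t x) {
    return static_cast<std::int64_t>(static_cast<std::uint64_t>(x) << 8) >> 8;
}

// Scalar model of one 64-bit lane of the three-instruction sequence.
inline std::int64_t highpart_sext_model(std::int64_t x) {
    const std::uint64_t ux = static_cast<std::uint64_t>(x);
    const std::uint32_t lo = static_cast<std::uint32_t>(ux);        // words 0-1
    const std::uint32_t hi = static_cast<std::uint32_t>(ux >> 32);  // words 2-3

    // vpslld $8: logical left shift of each 32-bit element (only the
    // high element's result is kept by the blend below).
    const std::uint32_t hi_shl = hi << 8;

    // vpsrad $8: arithmetic right shift of each 32-bit element.
    const std::int32_t hi_sar = static_cast<std::int32_t>(hi_shl) >> 8;

    // vpblendw $51 (0b00110011): per 64-bit lane, the two low 16-bit words
    // come from the original input and the two high words from the shifted
    // value.
    const std::uint64_t result =
        (static_cast<std::uint64_t>(static_cast<std::uint32_t>(hi_sar)) << 32) | lo;
    return static_cast<std::int64_t>(result);
}

// Spot-check that the model agrees with the reference on a few inputs.
inline void self_check() {
    const std::int64_t samples[] = {
        0, 1, -1,
        0x00FFFFFFFFFFFFFFLL,   // bit 55 set: result is -1
        0x0123456789ABCDEFLL,   // bit 55 clear: top byte becomes 0x00
        0x7F80000000000000LL,   // bit 55 set: top byte becomes 0xFF
    };
    for (std::int64_t x : samples)
        assert(highpart_sext_model(x) == highpart_sext_ref(x));
}
```

The low 32 bits of the result are identical to the input's low 32 bits,
so only the high 32-bit element of each lane needs shifting, and the blend
can take the low element directly from the unshifted input.  This is
consistent with the new sequence dropping the separate vpsrlq.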
Diffstat (limited to 'gcc/dwarf2codeview.cc')
0 files changed, 0 insertions, 0 deletions