author: Roger Sayle <roger@nextmovesoftware.com> 2022-01-14 10:06:03 +0000
committer: Roger Sayle <roger@nextmovesoftware.com> 2022-01-14 10:08:26 +0000
commit: 51e9e8a2e2098d87e4e1932424938bd11078860f
tree: 21578568f1da13e14d5c3977ae73d78314cbf288
parent: 89b4e316a02be9fda3b793a7be871f7c7913cd58
x86_64: Improvements to arithmetic right shifts of V1TImode values.

This patch to the i386 backend's ix86_expand_v1ti_ashiftrt provides
improved (shorter) implementations of V1TI mode arithmetic right shifts
for constant amounts between 111 and 126 bits.  The significance of this
range is that this functionality is useful for (eventually) providing
sign extension from HImode and QImode to V1TImode.

For example, x>>112 (to sign extend a 16-bit value) was previously
generated as a four operation sequence:

        movdqa  %xmm0, %xmm1            // word    7 6 5 4 3 2 1 0
        psrad   $31, %xmm0              // V8HI = [S,S,?,?,?,?,?,?]
        psrad   $16, %xmm1              // V8HI = [S,X,?,?,?,?,?,?]
        punpckhqdq      %xmm0, %xmm1    // V8HI = [S,S,?,?,S,X,?,?]
        pshufd  $253, %xmm1, %xmm0      // V8HI = [S,S,S,S,S,S,S,X]

With this patch, we now generate a three operation sequence:

        psrad   $16, %xmm0              // V8HI = [S,X,?,?,?,?,?,?]
        pshufhw $254, %xmm0, %xmm0      // V8HI = [S,S,S,X,?,?,?,?]
        pshufd  $254, %xmm0, %xmm0      // V8HI = [S,S,S,S,S,S,S,X]

The correctness of the generated code is confirmed by the existing
run-time test gcc.target/i386/sse2-v1ti-ashiftrt-1.c in the testsuite.
This idiom is safe to use for shifts by 127, but that case is handled
by a two operation sequence earlier in this function.

2022-01-14  Roger Sayle  <roger@nextmovesoftware.com>
            Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
        * config/i386/i386-expand.c (ix86_expand_v1ti_to_ti): Use force_reg.
        (ix86_expand_ti_to_v1ti): Use force_reg.
        (ix86_expand_v1ti_shift): Use force_reg.
        (ix86_expand_v1ti_rotate): Use force_reg.
        (ix86_expand_v1ti_ashiftrt): Provide new three operation
        implementations for shifts by 111..126 bits.  Use force_reg.
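
The word diagrams for the three operation sequence can be checked by hand with a small portable model. The following is only an illustrative sketch in plain C, not code from the patch: the type `v8hi` and the `model_*` functions are hypothetical names introduced here, and each function mimics the documented lane behaviour of the corresponding SSE2 instruction on an array of eight 16-bit words (with `w[0]` as the least significant word, so the `[S,...,X]` diagrams read from `w[7]` down to `w[0]`).

```c
#include <stdint.h>

/* Hypothetical model type: eight 16-bit words, w[0] least significant. */
typedef struct { uint16_t w[8]; } v8hi;

/* Models psrad $16: arithmetic right shift of each 32-bit lane by 16.
   The lane's high word moves to the low word and the high word is
   filled with copies of the sign bit.  */
static v8hi model_psrad16(v8hi x)
{
    v8hi r;
    for (int lane = 0; lane < 4; lane++) {
        uint16_t hi = x.w[2 * lane + 1];
        r.w[2 * lane] = hi;
        r.w[2 * lane + 1] = (hi & 0x8000) ? 0xFFFF : 0x0000;
    }
    return r;
}

/* Models pshufhw $imm: the low four words are copied unchanged; each of
   the high four words is selected from the high four source words by a
   2-bit field of imm.  */
static v8hi model_pshufhw(v8hi x, unsigned imm)
{
    v8hi r = x;
    for (int i = 0; i < 4; i++)
        r.w[4 + i] = x.w[4 + ((imm >> (2 * i)) & 3)];
    return r;
}

/* Models pshufd $imm: each 32-bit lane of the result is selected from
   the four source lanes by a 2-bit field of imm.  */
static v8hi model_pshufd(v8hi x, unsigned imm)
{
    v8hi r;
    for (int i = 0; i < 4; i++) {
        unsigned src = (imm >> (2 * i)) & 3;
        r.w[2 * i] = x.w[2 * src];
        r.w[2 * i + 1] = x.w[2 * src + 1];
    }
    return r;
}

/* The three operation sequence from the commit message for x>>112.  */
static v8hi model_ashiftrt112(v8hi x)
{
    x = model_psrad16(x);        /* [S,X,?,?,?,?,?,?] */
    x = model_pshufhw(x, 254);   /* [S,S,S,X,?,?,?,?] */
    x = model_pshufd(x, 254);    /* [S,S,S,S,S,S,S,X] */
    return x;
}
```

Under this model, the lowest word of the result is the original top word X and the remaining seven words are copies of its sign, which is what an arithmetic right shift of the 128-bit value by 112 produces. The model only covers the shift-by-112 case shown in the message; the other amounts in the 111..126 range are not modelled here.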