author | Xionghu Luo <luoxhu@linux.ibm.com> | 2020-07-12 20:21:05 -0500 |
---|---|---|
committer | Xionghu Luo <luoxhu@linux.ibm.com> | 2020-07-12 20:21:05 -0500 |
commit | 466dd1629c699599050f68d2bfee58be9db40aab (patch) | |
tree | b7cff812d3ee52f3a4b9e8587984880c30fadf1f /gcc/expr.c | |
parent | 9e28851b345461dd2d097abeb2d1ee4218191a1d (diff) | |
rs6000: Init V4SF vector without converting SP to DP
Move V4SF to V4SI, initialize the vector as V4SI, and move it back to V4SF.
A better instruction sequence can then be generated on Power9:
lfs + xxpermdi + xvcvdpsp + vmrgew
=>
lwz + (sldi + or) + mtvsrdd
With the follow-up patch, this can be further optimized to:
lwz + rldimi + mtvsrdd
The point is to use lwz so that the single-precision values are not
converted to double precision on load, and to pack the four 32-bit
values directly into one 128-bit register.
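
The packing idea can be sketched in portable C. This is only an illustration of the data movement, not the code GCC emits: the helper names (float_bits, pack_pair, pack_v4sf) and the vec128_bits struct are hypothetical, IEEE-754 single precision is assumed, and the element order within the 128-bit value is chosen for readability rather than to match either Power endianness.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch the raw 32-bit pattern of a float without
   any floating-point conversion, as an integer load such as lwz would. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: merge two 32-bit patterns into one 64-bit value,
   the shift-and-or step (sldi + or, or a single rldimi). */
static uint64_t pack_pair(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Two 64-bit halves standing in for one 128-bit vector register; forming
   the register from two doublewords is the role mtvsrdd plays. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} vec128_bits;

/* Pack four floats as raw bits; no SP to DP conversion is involved.
   The element order here is an assumption made for illustration. */
static vec128_bits pack_v4sf(float a, float b, float c, float d)
{
    vec128_bits v;
    v.hi = pack_pair(float_bits(a), float_bits(b));
    v.lo = pack_pair(float_bits(c), float_bits(d));
    return v;
}
```

Because the values are handled purely as integers, the bits that arrive in the vector are exactly the bits that were in memory, which is why the xvcvdpsp conversion step in the old sequence becomes unnecessary.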
gcc/ChangeLog:
2020-07-13 Xionghu Luo <luoxhu@linux.ibm.com>
* config/rs6000/rs6000.c (rs6000_expand_vector_init):
Move V4SF to V4SI, init vector like V4SI and move to V4SF back.