author | Sudakshina Das <sudi.das@arm.com> | 2020-08-04 12:01:21 +0100 |
---|---|---|
committer | Sudakshina Das <sudi.das@arm.com> | 2020-08-04 12:01:53 +0100 |
commit | 7cda9e0878da44dcaf025d3d146534dfaf0b9986 (patch) | |
tree | 55d3496d7ffbd9f70eef4063b9fafbdd6c980423 /gcc/fortran/trans-openmp.c | |
parent | d2b86e14c14020f3e119ab8f462e2a91bd7d46e5 (diff) | |
aarch64: Use Q-reg loads/stores in movmem expansion
This is my attempt at reviving the old patch
https://gcc.gnu.org/pipermail/gcc-patches/2019-January/514632.html
I have followed up on Kyrill's upstream comment at the link above and
am using option iii, which he recommended there.
"1) Adjust the copy_limit to 256 bits after checking
AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS in the tuning.
2) Adjust aarch64_copy_one_block_and_progress_pointers to handle
256-bit moves, by iii:
iii) Emit explicit V4SI (or any other 128-bit vector mode) pairs
ldp/stps. This wouldn't need any adjustments to MD patterns,
but would make aarch64_copy_one_block_and_progress_pointers
more complex as it would now have two paths, where one
handles two adjacent memory addresses in one call."
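To illustrate the effect of raising the copy limit, here is a minimal sketch in plain C. It is not the code from aarch64.c: model_copy_blocks and its parameters are hypothetical names, a limit of 32 bytes stands in for a Q-register ldp/stp pair, and a limit of 16 bytes stands in for a single Q register (the case where AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS is set). The tail handling is also an assumption made to keep the model simple: it just halves the chunk size until the chunk fits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not GCC code.  Copy N bytes from SRC to DST using
   the widest chunk that fits, capped at COPY_LIMIT_BYTES, which must be
   a power of two.  Each chunk stands for one load/store (or one
   load-pair/store-pair) in the expansion.  Returns the chunk count.  */
size_t
model_copy_blocks (unsigned char *dst, const unsigned char *src,
                   size_t n, size_t copy_limit_bytes)
{
  size_t chunks = 0;

  assert (copy_limit_bytes != 0
          && (copy_limit_bytes & (copy_limit_bytes - 1)) == 0);

  while (n > 0)
    {
      /* Start at the limit and halve until the chunk fits in what is
         left; this terminates because the size reaches 1.  */
      size_t chunk = copy_limit_bytes;
      while (chunk > n)
        chunk /= 2;

      /* Copy one block and progress both pointers.  */
      memcpy (dst, src, chunk);
      dst += chunk;
      src += chunk;
      n -= chunk;
      chunks++;
    }

  return chunks;
}
```

In this model a 64-byte copy takes two chunks with a 32-byte limit and four chunks with a 16-byte limit, which is the saving the wider limit is after.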
gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_gen_store_pair): Add case
for E_V4SImode.
(aarch64_gen_load_pair): Likewise.
(aarch64_copy_one_block_and_progress_pointers): Handle 256 bit copy.
(aarch64_expand_cpymem): Expand copy_limit to 256 bits where
appropriate.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cpymem-q-reg_1.c: New test.
* gcc.target/aarch64/large_struct_copy_2.c: Update for ldp q regs.