path: root/gcc/fortran/trans-openmp.c
author: Sudakshina Das <sudi.das@arm.com> 2020-08-04 12:01:21 +0100
committer: Sudakshina Das <sudi.das@arm.com> 2020-08-04 12:01:53 +0100
commit: 7cda9e0878da44dcaf025d3d146534dfaf0b9986
tree: 55d3496d7ffbd9f70eef4063b9fafbdd6c980423
parent: d2b86e14c14020f3e119ab8f462e2a91bd7d46e5
aarch64: Use Q-reg loads/stores in movmem expansion
This is my attempt at reviving the old patch
https://gcc.gnu.org/pipermail/gcc-patches/2019-January/514632.html

I have followed up on Kyrill's upstream comment at the link above and I am
using the recommended option iii that he mentioned.

"1) Adjust the copy_limit to 256 bits after checking
 AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS in the tuning.
 2) Adjust aarch64_copy_one_block_and_progress_pointers to handle 256-bit
 moves. by iii:
 iii) Emit explicit V4SI (or any other 128-bit vector mode) pairs ldp/stps.
 This wouldn't need any adjustments to MD patterns, but would make
 aarch64_copy_one_block_and_progress_pointers more complex as it would now
 have two paths, where one handles two adjacent memory addresses in one
 calls."

gcc/ChangeLog:

	* config/aarch64/aarch64.c (aarch64_gen_store_pair): Add case for
	E_V4SImode.
	(aarch64_gen_load_pair): Likewise.
	(aarch64_copy_one_block_and_progress_pointers): Handle 256 bit copy.
	(aarch64_expand_cpymem): Expand copy_limit to 256bits where
	appropriate.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpymem-q-reg_1.c: New test.
	* gcc.target/aarch64/large_struct_copy_2.c: Update for ldp q regs.
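
The effect of raising copy_limit can be illustrated with a small standalone
model. This is only a sketch and not the code in aarch64.c: the names
copy_limit_bytes and plan_copy are hypothetical, the 128-bit fallback when
Q-register pairs are disabled is an assumption, and the model simply splits
the copy into power-of-two chunks capped at the limit. A 32-byte chunk stands
for one ldp/stp pair of Q registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not a GCC function: largest single copy step in
   bytes.  Assumes 256 bits (one Q-register pair) when Q-reg ldp/stp is
   allowed by the tuning, and 128 bits otherwise.  */
static size_t
copy_limit_bytes (bool no_ldp_stp_qregs)
{
  return no_ldp_stp_qregs ? 16 : 32;
}

/* Hypothetical helper: split an N-byte copy into power-of-two chunks no
   larger than the copy limit.  Chunk sizes are written to OUT (at most
   MAX_OUT entries); the number of chunks written is returned.  */
static size_t
plan_copy (size_t n, bool no_ldp_stp_qregs, size_t *out, size_t max_out)
{
  assert (out != NULL);

  size_t limit = copy_limit_bytes (no_ldp_stp_qregs);
  size_t count = 0;

  while (n > 0 && count < max_out)
    {
      /* Start from the limit and halve until the chunk fits in what is
	 left to copy.  */
      size_t chunk = limit;
      while (chunk > n)
	chunk /= 2;

      out[count++] = chunk;
      n -= chunk;
    }

  return count;
}
```

Under these assumptions a 48-byte copy is planned as 32 + 16 bytes when
Q-register pairs are allowed, and as 16 + 16 + 16 bytes when they are not.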
Diffstat (limited to 'gcc/fortran/trans-openmp.c')
0 files changed, 0 insertions, 0 deletions