field | value | date |
---|---|---|
author | Richard Biener <rguenther@suse.de> | 2024-11-08 11:17:22 +0100 |
committer | Richard Biener <rguenth@gcc.gnu.org> | 2024-11-12 09:54:25 +0100 |
commit | 9a62c1495891032922af5bf9bd1906999cf63605 | |
tree | b58f6178322bb75070742cdfc09c1b5414cfc586 /gcc/ada/raise-gcc.c | |
parent | 82d955b0a8acfdf3e63e82135077806c19e622e6 | |
Add X86_TUNE_AVX512_TWO_EPILOGUES, enable for Zen4 and Zen5
The following adds the X86_TUNE_AVX512_TWO_EPILOGUES tuning, which, when set,
directs the vectorizer to produce both an AVX2 and an SSE vector epilogue
for AVX512 vectorized loops. The tuning is enabled by default for Zen4
and Zen5, where I benchmarked it to be overall positive on SPEC CPU 2017
in both performance and code size. In particular it speeds up
525.x264_r, which with only an AVX2 epilogue currently ends up in
unvectorized code.
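To illustrate why a second epilogue can matter, here is a minimal model of how a loop's trip count might be split across the stages. It is a sketch, not the vectorizer's actual logic: the names `struct split` and `split_trip_count` are hypothetical, elements are assumed to be one byte wide (so a 64-byte AVX512 vector holds 64 of them), and the main loop is assumed not to be unrolled, so each vector epilogue runs at most once.

```c
#include <stddef.h>

/* Hypothetical model: number of elements handled by each stage. */
struct split {
    size_t avx512;  /* covered by the 64-byte main vector loop */
    size_t avx2;    /* covered by the 32-byte vector epilogue */
    size_t sse;     /* covered by the 16-byte vector epilogue */
    size_t scalar;  /* left over for the scalar tail loop */
};

/* Split a trip count of n one-byte elements across the stages.
   two_epilogues != 0 models the tuning being enabled, i.e. an SSE
   epilogue follows the AVX2 epilogue.  Assumption: no unrolling, so
   each epilogue executes at most one vector iteration.  */
static struct split split_trip_count(size_t n, int two_epilogues)
{
    struct split s = {0, 0, 0, 0};

    /* Main loop: as many full 64-element iterations as fit. */
    s.avx512 = n / 64 * 64;
    n -= s.avx512;

    /* AVX2 epilogue: fewer than 64 elements remain, so at most one pass. */
    if (n >= 32) {
        s.avx2 = 32;
        n -= 32;
    }

    /* SSE epilogue: only present when the tuning is enabled. */
    if (two_epilogues && n >= 16) {
        s.sse = 16;
        n -= 16;
    }

    /* Whatever is left runs in the scalar loop. */
    s.scalar = n;
    return s;
}
```

Under these assumptions, `split_trip_count(16, 0)` leaves all 16 elements to the scalar loop, while `split_trip_count(16, 1)` covers them with a single SSE iteration. The trip count of 16 is chosen for illustration and is not taken from the 525.x264_r sources.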
* config/i386/i386.cc (ix86_vector_costs::finish_cost): Set
m_suggested_epilogue_mode according to X86_TUNE_AVX512_TWO_EPILOGUES.
* config/i386/x86-tune.def (X86_TUNE_AVX512_TWO_EPILOGUES): Add.
Enable for znver4 and znver5.