author | H.J. Lu <hjl.tools@gmail.com> | 2021-03-11 16:56:26 -0800 |
---|---|---|
committer | H.J. Lu <hjl.tools@gmail.com> | 2021-04-06 05:36:00 -0700 |
commit | a32452a5442cd05040af53787af0d8b537ac77a6 (patch) | |
tree | 1090d473d81fc8ed8a028d92d22ecb224c264f49 /gcc/gcov-io.h | |
parent | e5c170e080399fb3d24a38bbfcd66bd4675abe53 (diff) | |
x86: Update memcpy/memset inline strategies for Skylake family CPUs
Simplify memcpy and memset inline strategies to avoid branches for
Skylake family CPUs:
1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
load and store for up to 16 * 16 (256) bytes when the data size is
fixed and known.
2. Inline only if data size is known to be <= 256.
a. Use "rep movsb/stosb" with simple code sequence if the data size
is a constant.
b. Use loop if data size is not a constant.
3. Use memcpy/memset library function if data size is unknown or > 256.
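The three cases above can be illustrated at the source level. This is a minimal sketch, not taken from the commit or its test cases: the function names copy_fixed, copy_bounded and copy_unknown are hypothetical, and the instruction sequence GCC actually emits for each one depends on the compiler version and on the -march/-mtune options in use.

```c
#include <stddef.h>
#include <string.h>

/* Cases 1 and 2a: the size is a compile-time constant no larger than 256,
   so the copy is a candidate for inline expansion. */
void copy_fixed(char *dst, const char *src)
{
    memcpy(dst, src, 64);
}

/* Case 2b: the size is not a constant, but the guard lets the compiler
   see that it is at most 256 when memcpy is reached. */
void copy_bounded(char *dst, const char *src, size_t n)
{
    if (n <= 256)
        memcpy(dst, src, n);
}

/* Case 3: nothing is known about the size, so a call to the library
   memcpy is the expected outcome. */
void copy_unknown(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
}
```

Compiling such a file with -S and a Skylake-family -march value is one way to see which strategy was chosen for each function.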
On Cascadelake processor with -march=native -Ofast -flto,
1. Performance impacts of SPEC CPU 2017 rate are:
500.perlbench_r 0.17%
502.gcc_r -0.36%
505.mcf_r 0.00%
520.omnetpp_r 0.08%
523.xalancbmk_r -0.62%
525.x264_r 1.04%
531.deepsjeng_r 0.11%
541.leela_r -1.09%
548.exchange2_r -0.25%
557.xz_r 0.17%
Geomean -0.08%
503.bwaves_r 0.00%
507.cactuBSSN_r 0.69%
508.namd_r -0.07%
510.parest_r 1.12%
511.povray_r 1.82%
519.lbm_r 0.00%
521.wrf_r -1.32%
526.blender_r -0.47%
527.cam4_r 0.23%
538.imagick_r -1.72%
544.nab_r -0.56%
549.fotonik3d_r 0.12%
554.roms_r 0.43%
Geomean 0.02%
2. Significant impacts on eembc benchmarks are:
eembc/idctrn01 9.23%
eembc/nnet_test 29.26%
gcc/
* config/i386/x86-tune-costs.h (skylake_memcpy): Updated.
(skylake_memset): Likewise.
(skylake_cost): Change CLEAR_RATIO to 17.
* config/i386/x86-tune.def (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
Replace m_CANNONLAKE, m_ICELAKE_CLIENT, m_ICELAKE_SERVER,
m_TIGERLAKE and m_SAPPHIRERAPIDS with m_SKYLAKE and m_CORE_AVX512.
gcc/testsuite/
* gcc.target/i386/memcpy-strategy-9.c: New test.
* gcc.target/i386/memcpy-strategy-10.c: Likewise.
* gcc.target/i386/memcpy-strategy-11.c: Likewise.
* gcc.target/i386/memset-strategy-7.c: Likewise.
* gcc.target/i386/memset-strategy-8.c: Likewise.
* gcc.target/i386/memset-strategy-9.c: Likewise.