path: root/gcc/omp-expand.c
author Roger Sayle <roger@nextmovesoftware.com> 2021-11-16 08:55:21 +0000
committer Roger Sayle <roger@nextmovesoftware.com> 2021-11-16 08:55:21 +0000
commit 473b5e87346edf9885abc28b7de68e3cd7059746
tree 606d11010341ebf687aa40844dd78aa5ea1f572f /gcc/omp-expand.c
parent e69b7c5779863469479698f863ab25e0d9b4586e
x86_64: Avoid rorx rotation instructions with -Os.
This patch teaches the i386 backend to avoid using BMI2's rorx
instructions when optimizing for size.  The benefits are shown
with the following example:

unsigned int ror1(unsigned int x) { return (x >> 1) | (x << 31); }
unsigned int ror2(unsigned int x) { return (x >> 2) | (x << 30); }
unsigned int rol2(unsigned int x) { return (x >> 30) | (x << 2); }
unsigned int rol1(unsigned int x) { return (x >> 31) | (x << 1); }

which currently with -Os -march=cascadelake generates:

ror1:   rorx    $1, %edi, %eax  // 6 bytes
        ret
ror2:   rorx    $2, %edi, %eax  // 6 bytes
        ret
rol2:   rorx    $30, %edi, %eax // 6 bytes
        ret
rol1:   rorx    $31, %edi, %eax // 6 bytes
        ret

but with this patch now generates:

ror1:   movl    %edi, %eax      // 2 bytes
        rorl    %eax            // 2 bytes
        ret
ror2:   movl    %edi, %eax      // 2 bytes
        rorl    $2, %eax        // 3 bytes
        ret
rol2:   movl    %edi, %eax      // 2 bytes
        roll    $2, %eax        // 3 bytes
        ret
rol1:   movl    %edi, %eax      // 2 bytes
        roll    %eax            // 2 bytes
        ret

I've confirmed that this patch is a win on the CSiBE benchmark,
even though rotations are rare, where for example
libmspack/test/md5.o shrinks from 5824 bytes to 5632 bytes.

2021-11-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.md (*bmi2_rorx<mode3>_1): Make conditional
	on !optimize_function_for_size_p.
	(*<any_rotate><mode>3_1): Add preferred_for_size attribute.
	(define_splits): Conditionalize on !optimize_function_for_size_p.
	(*bmi2_rorxsi3_1_zext): Likewise.
	(*<any_rotate>si2_1_zext): Add preferred_for_size attribute.
	(define_splits): Conditionalize on !optimize_function_for_size_p.
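The rol1 and rol2 cases in the message work because a 32-bit left rotation by n is the same operation as a right rotation by 32 - n, which is why a rorx by 30 or 31 can become a roll by 2 or 1. The following is a minimal sketch of that identity and of the per-function size arithmetic, not code from the patch. The names ror32, rol32 and bytes_saved are hypothetical, and the byte counts are simply the ones quoted in the message above (6 bytes for rorx, 2 for movl, 2 for a rotate by one, 3 for a rotate by an immediate).

```c
#include <stdint.h>

/* Rotate right by n bits, valid for 0 < n < 32.  Same shift-and-or
   idiom as the ror1/ror2 examples in the commit message.  */
static uint32_t ror32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}

/* Rotate left by n bits, valid for 0 < n < 32.  Same idiom as the
   rol1/rol2 examples in the commit message.  */
static uint32_t rol32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* Hypothetical helper: bytes saved per rotation when the 6-byte rorx
   is replaced by movl (2 bytes) plus a legacy rotate, which takes
   2 bytes for a count of one and 3 bytes for any other immediate.
   The sizes are assumptions taken from the commit message.  */
static int bytes_saved(unsigned count)
{
    int rorx_size = 6;
    int legacy_size = 2 + (count == 1 ? 2 : 3);
    return rorx_size - legacy_size;
}
```

Under these assumptions, rol32(x, 2) and ror32(x, 30) return the same value for any x, and the replacement saves 2 bytes for a rotate by one and 1 byte for any other count.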
Diffstat (limited to 'gcc/omp-expand.c')
0 files changed, 0 insertions, 0 deletions