author | Uros Bizjak <ubizjak@gmail.com> | 2023-08-08 18:53:51 +0200 |
---|---|---|
committer | Uros Bizjak <ubizjak@gmail.com> | 2023-08-08 18:56:07 +0200 |
commit | ad5b757d99b5a121198b79a6a42c1f15ae86a190 (patch) | |
tree | e9f3179e9e8ac70689fc986ada4d12247a448cb1 /gcc/tree-vectorizer.h | |
parent | aadc5c07feb0ab08729ab25d0d896b55860ad9e6 (diff) | |
i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]
Also introduce -m[no-]partial-vector-fp-math option to disable trapping
V2SF named patterns in order to avoid generation of partial vector V4SFmode
trapping instructions.
The new option is enabled by default because, even with sanitization,
a small but consistent speedup of 2 to 3% on the Polyhedron capacita
benchmark can be achieved vs. scalar code.
Using -fno-trapping-math improves Polyhedron capacita runtime by 8 to 9%
vs. scalar code. This is what clang does by default, as it defaults
to -fno-trapping-math.
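
As an illustration of the kind of code affected, here is a minimal sketch
and not the contents of the new pr110832-*.c tests. It uses the GCC vector
extension to form a two-element float vector (V2SFmode on x86). The names
v2sf, v4sf, add_v2sf and add_v2sf_widened are hypothetical. The second
function only models, in plain C, the idea of running a four-lane (V4SF)
operation on a two-lane value whose upper lanes have been zeroed first; it
is an assumption about the general approach, not GCC's actual expansion.

```c
/* Hypothetical typedefs: 8-byte and 16-byte float vectors built with the
   GCC vector extension (also accepted by clang).  */
typedef float v2sf __attribute__ ((vector_size (8)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* A plain two-lane add.  Whether the compiler emits a packed or a scalar
   instruction sequence for it is expected to depend on options such as
   -mpartial-vector-fp-math and -fno-trapping-math.  */
v2sf
add_v2sf (v2sf a, v2sf b)
{
  return a + b;
}

/* Model of "sanitizing" the upper part: widen each operand to four lanes
   with zeros in lanes 2 and 3, so the four-lane add only ever sees
   well-defined values there (0 + 0 raises no FP exception).  */
v2sf
add_v2sf_widened (v2sf a, v2sf b)
{
  v4sf wa = { a[0], a[1], 0.0f, 0.0f };
  v4sf wb = { b[0], b[1], 0.0f, 0.0f };
  v4sf sum = wa + wb;

  /* Only the low two lanes carry the V2SF result.  */
  v2sf result = { sum[0], sum[1] };
  return result;
}
```

Compiling add_v2sf with, for example, -O2 -msse2 and then again with
-mno-partial-vector-fp-math or -fno-trapping-math added, and comparing the
assembly, should show the effect of the options described above; the exact
instructions produced are not claimed here.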
PR target/110832
gcc/ChangeLog:
* config/i386/i386.opt (mpartial-vector-fp-math): New option.
* config/i386/mmx.md (movq_<mode>_to_sse): Do not sanitize
upper part of V2SFmode register with -fno-trapping-math.
(<plusminusmult:insn>v2sf3): Enable for ix86_partial_vec_fp_math.
(divv2sf3): Ditto.
(<smaxmin:code>v2sf3): Ditto.
(sqrtv2sf2): Ditto.
(*mmx_haddv2sf3_low): Ditto.
(*mmx_hsubv2sf3_low): Ditto.
(vec_addsubv2sf3): Ditto.
(vec_cmpv2sfv2si): Ditto.
(vcond<V2FI:mode>v2sf): Ditto.
(fmav2sf4): Ditto.
(fmsv2sf4): Ditto.
(fnmav2sf4): Ditto.
(fnmsv2sf4): Ditto.
(fix_truncv2sfv2si2): Ditto.
(fixuns_truncv2sfv2si2): Ditto.
(floatv2siv2sf2): Ditto.
(floatunsv2siv2sf2): Ditto.
(nearbyintv2sf2): Ditto.
(rintv2sf2): Ditto.
(lrintv2sfv2si2): Ditto.
(ceilv2sf2): Ditto.
(lceilv2sfv2si2): Ditto.
(floorv2sf2): Ditto.
(lfloorv2sfv2si2): Ditto.
(btruncv2sf2): Ditto.
(roundv2sf2): Ditto.
(lroundv2sfv2si2): Ditto.
* doc/invoke.texi (x86 Options): Document
-mpartial-vector-fp-math option.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr110832-1.c: New test.
* gcc.target/i386/pr110832-2.c: New test.
* gcc.target/i386/pr110832-3.c: New test.