author | Jan Beulich <jbeulich@suse.com> | 2023-06-21 08:03:05 +0200 |
---|---|---|
committer | Jan Beulich <jbeulich@suse.com> | 2023-06-21 08:03:05 +0200 |
commit | 864c6471bdc6cdec6da60b66ac13e9fe3cd73fb8 (patch) | |
tree | 693e8c381d27d59d961845d0403b849243934c7e /libjava/java/lang/Process.h | |
parent | 67061960b6ccdb706b11613a27c4ae30ee81c2c5 (diff) | |
x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F
There's no reason to constrain this to AVX512VL, unless instructed to do
so by -mprefer-vector-width=, as the wider operation is unusable for
narrower operands only when the possible memory source is a non-broadcast
one. This way even the scalar copysign<mode>3 can benefit from the
operation being a single-insn one (leaving aside moves which the
compiler decides to insert for unclear reasons, and leaving aside the
fact that bcst_mem_operand() is too restrictive for broadcast to be
embedded right into VPTERNLOG*).
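
As an illustration of why copysign maps onto a single ternary-logic operation, here is a minimal scalar sketch in C. It emulates the per-bit truth-table lookup that VPTERNLOG performs and uses it as a bitwise select between two floats. The helper names (`ternlog32`, `copysign_ternlog`) and the choice of immediate 0xCA are assumptions made for this example; they are not taken from the patch, and the immediate GCC actually emits depends on operand order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: software model of a 32-bit ternary-logic operation.
   For every bit position, the three input bits form an index
   (a << 2 | b << 1 | c) into the 8-entry truth table imm8.  */
static uint32_t ternlog32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm8)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        unsigned idx = (((a >> i) & 1u) << 2)
                     | (((b >> i) & 1u) << 1)
                     |  ((c >> i) & 1u);
        r |= (uint32_t)((imm8 >> idx) & 1u) << i;
    }
    return r;
}

/* Hypothetical helper: copysign as one bitwise select.
   With a = mask, b = y, c = x, the table 0xCA computes
   (mask & y) | (~mask & x), i.e. sign from y, magnitude from x.  */
static float copysign_ternlog(float x, float y)
{
    uint32_t xb, yb, rb;
    float r;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&xb, &x, sizeof xb);
    memcpy(&yb, &y, sizeof yb);
    rb = ternlog32(0x80000000u, yb, xb, 0xCA);  /* sign-bit mask selects from y */
    memcpy(&r, &rb, sizeof r);
    return r;
}
```

The sign-bit mask plays the role of the constant that ix86_build_signbit_mask() provides in the real expander; here it is simply a literal.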
While there, also bring *<avx512>_vternlog<mode>_all's condition in sync
with that of the three splitters.
Along with this also request value duplication in
ix86_expand_copysign()'s call to ix86_build_signbit_mask(), eliminating
excess space allocation in .rodata.*, filled with zeros which are never
read.
gcc/
* config/i386/i386-expand.cc (ix86_expand_copysign): Request
value duplication by ix86_build_signbit_mask() when AVX512F and
not HFmode.
* config/i386/sse.md (*<avx512>_vternlog<mode>_all): Convert to
2-alternative form. Adjust "mode" attribute. Add "enabled"
attribute.
(*<avx512>_vpternlog<mode>_1): Also permit when TARGET_AVX512F
&& !TARGET_PREFER_AVX256.
(*<avx512>_vpternlog<mode>_2): Likewise.
(*<avx512>_vpternlog<mode>_3): Likewise.
gcc/testsuite/
* gcc.target/i386/avx512f-copysign.c: New test.