author | Matthias Kretz <kretz@kde.org> | 2021-02-03 15:49:30 +0000 |
---|---|---|
committer | Jonathan Wakely <jwakely@redhat.com> | 2021-02-03 15:49:30 +0000 |
commit | 81c2c32de9c1058c33fcf77ada31186b4ae1f1fe (patch) | |
tree | ade43ae42ef8baf375965866e4811c3a871d9389 /libcpp | |
parent | 71f9b9bd0acc7d0749e159efb1b9b4c57197a77d (diff) | |
libstdc++: Fix mask reduction of simd_mask<double> on POWER7

POWER7 does not support __vector long long reductions, making the
generic _S_popcount implementation ill-formed. Specializing _S_popcount
for PPC allows optimization and avoids the issue.

libstdc++-v3/ChangeLog:

	* include/experimental/bits/simd.h: Add __have_power10vec
	conditional on _ARCH_PWR10.
	* include/experimental/bits/simd_builtin.h: Forward declare
	_MaskImplPpc and use it as _MaskImpl when __ALTIVEC__ is
	defined.
	(_MaskImplBuiltin::_S_some_of): Call _S_popcount from the
	_SuperImpl for optimizations and correctness.
	* include/experimental/bits/simd_ppc.h: Add _MaskImplPpc.
	(_MaskImplPpc::_S_popcount): Implement via vec_cntm for POWER10.
	Otherwise, for >=int use -vec_sums divided by a sizeof factor.
	For <int use -vec_sums(vec_sum4s(...)) to sum all mask entries.
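
The arithmetic behind the non-POWER10 paths can be illustrated with a portable scalar model. This is only a sketch, not the code from simd_ppc.h: it assumes a 16-byte mask register whose true lanes are all-ones and whose false lanes are all-zeros, and the names make_mask, popcount_wide and popcount_narrow are hypothetical. Plain loops stand in for vec_sums and vec_sum4s so that the example runs on any target.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical scalar model of a 16-byte mask register: each lane of
// width sizeof(T) is all-ones when true and all-zeros when false.
template <typename T>
std::array<unsigned char, 16>
make_mask(const bool (&lanes)[16 / sizeof(T)])
{
  std::array<unsigned char, 16> bytes{};
  for (std::size_t i = 0; i < 16 / sizeof(T); ++i)
    if (lanes[i])
      std::memset(bytes.data() + i * sizeof(T), 0xFF, sizeof(T));
  return bytes;
}

// Lanes at least as wide as int: view the register as four 32-bit ints.
// Every true lane contributes sizeof(T) / sizeof(int) words equal to -1,
// so the negated sum has to be divided by that factor.
template <typename T>
int popcount_wide(const std::array<unsigned char, 16>& bytes)
{
  static_assert(sizeof(T) >= sizeof(std::int32_t), "use popcount_narrow");
  std::int32_t words[4];
  std::memcpy(words, bytes.data(), sizeof(words));
  int sum = 0;
  for (std::int32_t w : words) // stands in for vec_sums
    sum += w;
  return -sum / static_cast<int>(sizeof(T) / sizeof(std::int32_t));
}

// Lanes narrower than int (T is a signed integer type here): every true
// lane is -1, so the negated sum of all lanes is the count directly.
template <typename T>
int popcount_narrow(const std::array<unsigned char, 16>& bytes)
{
  static_assert(sizeof(T) < sizeof(std::int32_t), "use popcount_wide");
  T lanes[16 / sizeof(T)];
  std::memcpy(lanes, bytes.data(), sizeof(lanes));
  int sum = 0;
  for (T lane : lanes) // stands in for vec_sums(vec_sum4s(...))
    sum += lane;
  return -sum;
}
```

For a two-lane double mask with both lanes set, the four 32-bit words sum to -4, and dividing the negated sum by sizeof(double) / sizeof(int) = 2 gives the expected count of 2.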