diff options
author | Luc Grosheintz <luc.grosheintz@gmail.com> | 2025-08-03 22:57:29 +0200 |
---|---|---|
committer | Tomasz Kamiński <tkaminsk@redhat.com> | 2025-08-21 10:24:23 +0200 |
commit | 6dd2a42ab652e243b73e46b7f6e4ecfb3346e8b0 (patch) | |
tree | d5199bf78d7d55c278d31f4d239fa86a2a8791f6 | |
parent | 2d3282663c1ab392801ca498daf1542e67a89ae2 (diff) | |
download | gcc-6dd2a42ab652e243b73e46b7f6e4ecfb3346e8b0.zip gcc-6dd2a42ab652e243b73e46b7f6e4ecfb3346e8b0.tar.gz gcc-6dd2a42ab652e243b73e46b7f6e4ecfb3346e8b0.tar.bz2 |
libstdc++: Improve extents::operator==.
An interesting case to consider is:
bool same11(const std::extents<int, dyn, 2, 3>& e1,
const std::extents<int, dyn, dyn, 3>& e2)
{ return e1 == e2; }
Which has the following properties:
- There's no mismatching static extents, preventing any
short-circuiting.
- There's a comparison between dynamic and static extents.
- There's one trivial comparison: ... && 3 == 3.
Let E[i] denote the array of static extents, D[k] denote the array of
dynamic extents and k[i] be the index of the i-th extent in D.
(Naturally, k[i] is only meaningful if i is a dynamic extent).
The previous implementation results in assembly that's more or less a
literal translation of:
for (i = 0; i < 3; ++i)
e1 = E1[i] == -1 ? D1[k1[i]] : E1[i];
e2 = E2[i] == -1 ? D2[k2[i]] : E2[i];
if e1 != e2:
return false
return true;
While the proposed method results in assembly for
if(D1[0] == D2[0]) return false;
return 2 == D2[1];
i.e.
110: 8b 17 mov edx,DWORD PTR [rdi]
112: 31 c0 xor eax,eax
114: 39 16 cmp DWORD PTR [rsi],edx
116: 74 08 je 120 <same11+0x10>
118: c3 ret
119: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
120: 83 7e 04 02 cmp DWORD PTR [rsi+0x4],0x2
124: 0f 94 c0 sete al
127: c3 ret
It has the following nice properties:
- It eliminated the indirection D[k[i]], because k[i] is known at
compile time. Saving us a comparison E[i] == -1 and conditionally
loading k[i].
- It eliminated the trivial condition 3 == 3.
The result is code that only loads the required values and performs
exactly the number of comparisons needed by the algorithm. It also
results in smaller object files. Therefore, this seems like a sensible
change. We've check several other examples, including fully statically
determined cases and high-rank examples. The example given above
illustrates the other cases well.
The constexpr condition:
if constexpr (!_S_is_compatible_extents<...>)
return false;
is no longer needed, because the optimizer correctly handles this case.
However, it's retained for clarity/certainty.
libstdc++-v3/ChangeLog:
* include/std/mdspan (extents::operator==): Replace loop with
pack expansion.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
-rw-r--r-- | libstdc++-v3/include/std/mdspan | 9 |
1 files changed, 5 insertions, 4 deletions
diff --git a/libstdc++-v3/include/std/mdspan b/libstdc++-v3/include/std/mdspan index 4b27111..fdec813 100644 --- a/libstdc++-v3/include/std/mdspan +++ b/libstdc++-v3/include/std/mdspan @@ -407,10 +407,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return false; else { - for (size_t __i = 0; __i < __self.rank(); ++__i) - if (!cmp_equal(__self.extent(__i), __other.extent(__i))) - return false; - return true; + auto __impl = [&__self, &__other]<size_t... _Counts>( + index_sequence<_Counts...>) + { return (cmp_equal(__self.extent(_Counts), + __other.extent(_Counts)) && ...); }; + return __impl(make_index_sequence<__self.rank()>()); } } |