aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorLuc Grosheintz <luc.grosheintz@gmail.com>2025-08-03 22:57:29 +0200
committerTomasz Kamiński <tkaminsk@redhat.com>2025-08-21 10:24:23 +0200
commit6dd2a42ab652e243b73e46b7f6e4ecfb3346e8b0 (patch)
treed5199bf78d7d55c278d31f4d239fa86a2a8791f6
parent2d3282663c1ab392801ca498daf1542e67a89ae2 (diff)
downloadgcc-6dd2a42ab652e243b73e46b7f6e4ecfb3346e8b0.zip
gcc-6dd2a42ab652e243b73e46b7f6e4ecfb3346e8b0.tar.gz
gcc-6dd2a42ab652e243b73e46b7f6e4ecfb3346e8b0.tar.bz2
libstdc++: Improve extents::operator==.
An interesting case to consider is: bool same11(const std::extents<int, dyn, 2, 3>& e1, const std::extents<int, dyn, dyn, 3>& e2) { return e1 == e2; } Which has the following properties: - There's no mismatching static extents, preventing any short-circuiting. - There's a comparison between dynamic and static extents. - There's one trivial comparison: ... && 3 == 3. Let E[i] denote the array of static extents, D[k] denote the array of dynamic extents and k[i] be the index of the i-th extent in D. (Naturally, k[i] is only meaningful if i is a dynamic extent). The previous implementation results in assembly that's more or less a literal translation of: for (i = 0; i < 3; ++i) e1 = E1[i] == -1 ? D1[k1[i]] : E1[i]; e2 = E2[i] == -1 ? D2[k2[i]] : E2[i]; if e1 != e2: return false return true; While the proposed method results in assembly for if(D1[0] == D2[0]) return false; return 2 == D2[1]; i.e. 110: 8b 17 mov edx,DWORD PTR [rdi] 112: 31 c0 xor eax,eax 114: 39 16 cmp DWORD PTR [rsi],edx 116: 74 08 je 120 <same11+0x10> 118: c3 ret 119: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0] 120: 83 7e 04 02 cmp DWORD PTR [rsi+0x4],0x2 124: 0f 94 c0 sete al 127: c3 ret It has the following nice properties: - It eliminated the indirection D[k[i]], because k[i] is known at compile time. Saving us a comparison E[i] == -1 and conditionally loading k[i]. - It eliminated the trivial condition 3 == 3. The result is code that only loads the required values and performs exactly the number of comparisons needed by the algorithm. It also results in smaller object files. Therefore, this seems like a sensible change. We've check several other examples, including fully statically determined cases and high-rank examples. The example given above illustrates the other cases well. The constexpr condition: if constexpr (!_S_is_compatible_extents<...>) return false; is no longer needed, because the optimizer correctly handles this case. However, it's retained for clarity/certainty. libstdc++-v3/ChangeLog: * include/std/mdspan (extents::operator==): Replace loop with pack expansion. Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com> Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
-rw-r--r--libstdc++-v3/include/std/mdspan9
1 files changed, 5 insertions, 4 deletions
diff --git a/libstdc++-v3/include/std/mdspan b/libstdc++-v3/include/std/mdspan
index 4b27111..fdec813 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -407,10 +407,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return false;
else
{
- for (size_t __i = 0; __i < __self.rank(); ++__i)
- if (!cmp_equal(__self.extent(__i), __other.extent(__i)))
- return false;
- return true;
+ auto __impl = [&__self, &__other]<size_t... _Counts>(
+ index_sequence<_Counts...>)
+ { return (cmp_equal(__self.extent(_Counts),
+ __other.extent(_Counts)) && ...); };
+ return __impl(make_index_sequence<__self.rank()>());
}
}