|
stop_source
The move constructors for stop_source and stop_token are equivalent to
copying and clearing the raw pointer, as they are wrappers for a
counted-shared state.
For jthread, the move constructor performs a member-wise move of stop_source
and thread. While std::thread could also have a _Never_valueless_alt
specialization due to its inexpensive move (only moving a handle), doing
so now would change the ABI. This patch takes the opportunity to correct
this behavior for jthread, before C++20 API is marked stable.
libstdc++-v3/ChangeLog:
* include/std/stop_token (__variant::_Never_valueless_alt): Declare.
(__variant::_Never_valueless_alt<std::stop_token>)
(__variant::_Never_valueless_alt<std::stop_source>): Define.
* include/std/thread (__variant::_Never_valueless_alt): Declare.
(__variant::_Never_valueless_alt<std::jthread>): Define.
|
|
The change in r14-905-g3b7cb33033fbe6 to disable the use of
pthread_mutex_clocklock when TSan is active assumed that the
_GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK macro was always checked with #if
rather than #ifdef, which was not true.
This makes the checks use #if consistently.
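The difference between the two directives can be sketched as follows. This is a minimal illustration, assuming a configuration macro that may be defined to 0; EXAMPLE_USE_CLOCKLOCK is a hypothetical stand-in, not the real libstdc++ macro.

```cpp
// Hypothetical configuration macro: defined, but set to 0 ("disabled").
#define EXAMPLE_USE_CLOCKLOCK 0

// #ifdef only asks whether the macro is defined, so it is true even for 0.
constexpr bool enabled_per_ifdef()
{
#ifdef EXAMPLE_USE_CLOCKLOCK
  return true;
#else
  return false;
#endif
}

// #if evaluates the macro's value, so 0 means "disabled".
constexpr bool enabled_per_if()
{
#if EXAMPLE_USE_CLOCKLOCK
  return true;
#else
  return false;
#endif
}

static_assert(enabled_per_ifdef(), "#ifdef sees the macro as present");
static_assert(!enabled_per_if(), "#if honours the value 0");
```

A check written with #ifdef would therefore keep using the feature even after the macro has been set to 0.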
libstdc++-v3/ChangeLog:
PR libstdc++/121496
* include/std/mutex (__timed_mutex_impl::_M_try_wait_until):
Change preprocessor condition to use #if instead of #ifdef.
(recursive_timed_mutex::_M_clocklock): Likewise.
* testsuite/30_threads/timed_mutex/121496.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
This commit completes the implementation of P2897R7 by implementing and
testing the template class aligned_accessor.
PR libstdc++/120994
libstdc++-v3/ChangeLog:
* include/bits/version.def (aligned_accessor): Add.
* include/bits/version.h: Regenerate.
* include/std/mdspan (aligned_accessor): New class.
* src/c++23/std.cc.in (aligned_accessor): Add.
* testsuite/23_containers/mdspan/accessors/generic.cc: Add tests
for aligned_accessor.
* testsuite/23_containers/mdspan/accessors/aligned_neg.cc: New test.
* testsuite/23_containers/mdspan/version.cc: Add test for
__cpp_lib_aligned_accessor.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
This commit implements and tests the function is_sufficiently_aligned
from P2897R7.
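For readers unfamiliar with the function, the check it performs can be sketched roughly as below. This is not the libstdc++ implementation; sketch_is_sufficiently_aligned and example_buffer are hypothetical names, and the sketch assumes that converting a pointer to std::uintptr_t yields its address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative helper: true if ptr is aligned to at least Align bytes.
template<std::size_t Align, typename T>
bool sketch_is_sufficiently_aligned(T* ptr)
{
  static_assert(Align != 0 && (Align & (Align - 1)) == 0,
                "Align must be a power of two");
  return reinterpret_cast<std::uintptr_t>(ptr) % Align == 0;
}

// A 64-byte aligned buffer to try the helper on.
alignas(64) unsigned char example_buffer[128] = {};

// Small self-check exercising the helper.
inline void run_alignment_demo()
{
  assert(sketch_is_sufficiently_aligned<64>(example_buffer));
  assert(!sketch_is_sufficiently_aligned<64>(example_buffer + 1));
  assert(sketch_is_sufficiently_aligned<1>(example_buffer + 1));
}
```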
PR libstdc++/120994
libstdc++-v3/ChangeLog:
* include/bits/align.h (is_sufficiently_aligned): New function.
* include/bits/version.def (is_sufficiently_aligned): Add.
* include/bits/version.h: Regenerate.
* include/std/memory: Add __glibcxx_want_is_sufficiently_aligned.
* src/c++23/std.cc.in (is_sufficiently_aligned): Add.
* testsuite/20_util/headers/memory/version.cc: Add test for
__cpp_lib_is_sufficiently_aligned.
* testsuite/20_util/is_sufficiently_aligned/1.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
When I added this explicit specialization in r14-1433-gf150a084e25eaa I
used the wrong value for the number of mantissa digits (I used 112
instead of 113). Then when I refactored it in r14-1582-g6261d10521f9fd I
used the value calculated from the incorrect value (35 instead of 36).
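The dependency between the two values can be checked with a short calculation. max_digits10 is ceil(1 + digits * log10(2)); the sketch below evaluates it in integer arithmetic, approximating log10(2) by 643/2136. The function name and the choice of approximation are assumptions of this sketch.

```cpp
// 2 + floor(d * log10(2)) equals ceil(1 + d * log10(2)) for any positive
// integer d, because d * log10(2) is never an integer.
constexpr int sketch_max_digits10(int mantissa_digits)
{
  return static_cast<int>(2 + mantissa_digits * 643L / 2136);
}

static_assert(sketch_max_digits10(113) == 36, "correct value for __float128");
static_assert(sketch_max_digits10(112) == 35, "value derived from the wrong digit count");
static_assert(sketch_max_digits10(53) == 17, "matches IEEE double");
static_assert(sketch_max_digits10(24) == 9, "matches IEEE float");
```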
libstdc++-v3/ChangeLog:
PR libstdc++/121374
* include/std/limits (numeric_limits<__float128>::max_digits10):
Fix value.
* testsuite/18_support/numeric_limits/128bit.cc: Check value.
|
|
This commit implements the C++26 feature std::dims described in P2389R2.
It sets the feature testing macro to 202406 and adds tests.
Also fixes the test mdspan/version.cc.
libstdc++-v3/ChangeLog:
* include/bits/version.def (mdspan): Set value for C++26.
* include/bits/version.h: Regenerate.
* include/std/mdspan (dims): Add.
* src/c++23/std.cc.in (dims): Add.
* testsuite/23_containers/mdspan/extents/misc.cc: Add tests.
* testsuite/23_containers/mdspan/version.cc: Update test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
Prior to this commit, the partial products of static extents in <mdspan>
were computed in a loop that calls a function computing each partial
product from scratch. The complexity is quadratic in the rank.
This commit removes the quadratic complexity.
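A single-pass computation of this kind can be sketched as follows (C++17). This is an illustration only; sketch_fwd_partial_prods and the example arrays are hypothetical names, and dynamic extents are assumed to contribute a factor of 1 to the static product.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t dyn = static_cast<std::size_t>(-1);

// out[r] is the product of the static extents with index < r.
// Each extent is visited once, so the cost is O(rank).
template<std::size_t N>
constexpr std::array<std::size_t, N>
sketch_fwd_partial_prods(const std::array<std::size_t, N>& exts)
{
  std::array<std::size_t, N> out{};
  std::size_t prod = 1;
  for (std::size_t r = 0; r < N; ++r)
    {
      out[r] = prod;
      if (exts[r] != dyn)
        prod *= exts[r];
    }
  return out;
}

constexpr std::array<std::size_t, 4> example_exts{3, 5, dyn, 7};
constexpr auto example_prods = sketch_fwd_partial_prods(example_exts);
static_assert(example_prods[0] == 1 && example_prods[1] == 3, "");
static_assert(example_prods[2] == 15 && example_prods[3] == 15, "");
```

The running product is carried from one element to the next instead of being recomputed for every r.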
libstdc++-v3/ChangeLog:
* include/std/mdspan (__static_prod): Delete.
(__fwd_partial_prods): Compute at compile-time in O(rank), not
O(rank**2).
(__rev_partial_prods): Ditto.
(__size): Inline __static_prod.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
This fixes an oversight in a previous commit that improved mdspan
related code. Because __size doesn't use __fwd_prod, __fwd_prod(__rank)
is not needed anymore. Hence, one can shrink the size of
__fwd_partial_prods.
libstdc++-v3/ChangeLog:
* include/std/mdspan (__fwd_partial_prods): Reduce size of the
array by 1 element.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
Using __int_traits avoids the need to include <limits> from <mdspan>.
This in turn should reduce the size of the pre-compiled <mdspan>.
Similar refactoring was carried out for PR92546. Unfortunately,
./gcc/xgcc -std=c++23 -P -E -x c++ - -include mdspan | wc -l
shows a decrease by 1(!) line. This is due to bits/max_size_type.h which
includes <limits>.
libstdc++-v3/ChangeLog:
* include/std/mdspan (__valid_static_extent): Replace
numeric_limits with __int_traits.
(extents::_S_ctor_explicit): Ditto.
(extents::__static_quotient): Ditto.
(layout_stride::mapping::mapping): Ditto.
(mdspan::size): Ditto.
* testsuite/23_containers/mdspan/extents/class_mandates_neg.cc:
Update test with additional diagnostics.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
An interesting case to consider is:
bool same11(const std::extents<int, dyn, 2, 3>& e1,
const std::extents<int, dyn, dyn, 3>& e2)
{ return e1 == e2; }
Which has the following properties:
- There are no mismatching static extents, which prevents any
short-circuiting.
- There's a comparison between dynamic and static extents.
- There's one trivial comparison: ... && 3 == 3.
Let E[i] denote the array of static extents, D[k] denote the array of
dynamic extents and k[i] be the index of the i-th extent in D.
(Naturally, k[i] is only meaningful if i is a dynamic extent).
The previous implementation results in assembly that's more or less a
literal translation of:
for (i = 0; i < 3; ++i) {
  e1 = E1[i] == -1 ? D1[k1[i]] : E1[i];
  e2 = E2[i] == -1 ? D2[k2[i]] : E2[i];
  if (e1 != e2)
    return false;
}
return true;
While the proposed method results in assembly for
if (D1[0] != D2[0]) return false;
return 2 == D2[1];
i.e.
110: 8b 17 mov edx,DWORD PTR [rdi]
112: 31 c0 xor eax,eax
114: 39 16 cmp DWORD PTR [rsi],edx
116: 74 08 je 120 <same11+0x10>
118: c3 ret
119: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
120: 83 7e 04 02 cmp DWORD PTR [rsi+0x4],0x2
124: 0f 94 c0 sete al
127: c3 ret
It has the following nice properties:
- It eliminated the indirection D[k[i]], because k[i] is known at
compile time. Saving us a comparison E[i] == -1 and conditionally
loading k[i].
- It eliminated the trivial condition 3 == 3.
The result is code that only loads the required values and performs
exactly the number of comparisons needed by the algorithm. It also
results in smaller object files. Therefore, this seems like a sensible
change. We've checked several other examples, including fully statically
determined cases and high-rank examples. The example given above
illustrates the other cases well.
The constexpr condition:
if constexpr (!_S_is_compatible_extents<...>)
return false;
is no longer needed, because the optimizer correctly handles this case.
However, it's retained for clarity/certainty.
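The idea of replacing the loop with a pack expansion can be sketched on a simplified stand-in for std::extents (C++17). MiniExtents, extents_equal and the example objects are hypothetical; the real libstdc++ code is organised differently.

```cpp
#include <array>
#include <cstddef>
#include <utility>

constexpr std::size_t dyn = static_cast<std::size_t>(-1);

// Simplified stand-in for std::extents, storing only the dynamic extents.
template<std::size_t... Es>
struct MiniExtents
{
  static constexpr std::size_t rank = sizeof...(Es);
  static constexpr std::array<std::size_t, rank> static_exts{Es...};
  static constexpr std::size_t rank_dynamic
    = (std::size_t(0) + ... + (Es == dyn ? 1 : 0));

  std::array<int, rank_dynamic> dynamic_exts;

  // k[I]: position of the I-th extent in dynamic_exts, known at compile time.
  template<std::size_t I>
  static constexpr std::size_t dynamic_index()
  {
    std::size_t k = 0;
    for (std::size_t j = 0; j < I; ++j)
      k += (static_exts[j] == dyn) ? 1 : 0;
    return k;
  }

  template<std::size_t I>
  constexpr int extent() const
  {
    if constexpr (static_exts[I] == dyn)
      return dynamic_exts[dynamic_index<I>()];
    else
      return static_cast<int>(static_exts[I]);
  }
};

// One comparison per index, expanded at compile time instead of looping.
template<std::size_t... As, std::size_t... Bs, std::size_t... Is>
constexpr bool
extents_equal_impl(const MiniExtents<As...>& a, const MiniExtents<Bs...>& b,
                   std::index_sequence<Is...>)
{ return ((a.template extent<Is>() == b.template extent<Is>()) && ...); }

template<std::size_t... As, std::size_t... Bs>
constexpr bool
extents_equal(const MiniExtents<As...>& a, const MiniExtents<Bs...>& b)
{
  if constexpr (sizeof...(As) != sizeof...(Bs))
    return false;
  else
    return extents_equal_impl(a, b,
                              std::make_index_sequence<sizeof...(As)>{});
}

constexpr MiniExtents<dyn, 2, 3> example_e1{{4}};
constexpr MiniExtents<dyn, dyn, 3> example_e2{{4, 2}};
constexpr MiniExtents<dyn, dyn, 3> example_e3{{4, 5}};
static_assert(extents_equal(example_e1, example_e2), "");
static_assert(!extents_equal(example_e1, example_e3), "");
```

Because every index is a template argument, the static/dynamic decision and the index into the dynamic array are resolved per comparison at compile time.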
libstdc++-v3/ChangeLog:
* include/std/mdspan (extents::operator==): Replace loop with
pack expansion.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
In both fully static and dynamic extents the comparison
static_extent(i) == dynamic_extent
is known at compile time. As a result, extents::extent doesn't
need to perform the check at runtime.
An illustrative example is:
using E = std::extents<int, 3, 5, 7, 11, 13, 17>;
int required_span_size(const typename Layout::mapping<E>& m)
{ return m.required_span_size(); }
Prior to this commit the generated code (on -O2) is:
2a0: b9 01 00 00 00 mov ecx,0x1
2a5: 31 d2 xor edx,edx
2a7: 66 66 2e 0f 1f 84 00 data16 cs nop WORD PTR [rax+rax*1+0x0]
2ae: 00 00 00 00
2b2: 66 66 2e 0f 1f 84 00 data16 cs nop WORD PTR [rax+rax*1+0x0]
2b9: 00 00 00 00
2bd: 0f 1f 00 nop DWORD PTR [rax]
2c0: 48 8b 04 d5 00 00 00 mov rax,QWORD PTR [rdx*8+0x0]
2c7: 00
2c8: 48 83 f8 ff cmp rax,0xffffffffffffffff
2cc: 0f 84 00 00 00 00 je 2d2 <required_span_size_6d_static+0x32>
2d2: 83 e8 01 sub eax,0x1
2d5: 0f af 04 97 imul eax,DWORD PTR [rdi+rdx*4]
2d9: 48 83 c2 01 add rdx,0x1
2dd: 01 c1 add ecx,eax
2df: 48 83 fa 06 cmp rdx,0x6
2e3: 75 db jne 2c0 <required_span_size_6d_static+0x20>
2e5: 89 c8 mov eax,ecx
2e7: c3 ret
which is a scalar loop, and notably includes the check
2c8: 48 83 f8 ff cmp rax,0xffffffffffffffff
to assert that the static extent is indeed not -1. Note that at -O3 the
optimizer eliminates the comparison and generates a sequence of scalar
operations: lea, shl, add and mov. The aim of this commit is to
eliminate this comparison at -O2 as well. With the optimization applied we
get:
2e0: f3 0f 6f 0f movdqu xmm1,XMMWORD PTR [rdi]
2e4: 66 0f 6f 15 00 00 00 movdqa xmm2,XMMWORD PTR [rip+0x0]
2eb: 00
2ec: 8b 57 10 mov edx,DWORD PTR [rdi+0x10]
2ef: 66 0f 6f c1 movdqa xmm0,xmm1
2f3: 66 0f 73 d1 20 psrlq xmm1,0x20
2f8: 66 0f f4 c2 pmuludq xmm0,xmm2
2fc: 66 0f 73 d2 20 psrlq xmm2,0x20
301: 8d 14 52 lea edx,[rdx+rdx*2]
304: 66 0f f4 ca pmuludq xmm1,xmm2
308: 66 0f 70 c0 08 pshufd xmm0,xmm0,0x8
30d: 66 0f 70 c9 08 pshufd xmm1,xmm1,0x8
312: 66 0f 62 c1 punpckldq xmm0,xmm1
316: 66 0f 6f c8 movdqa xmm1,xmm0
31a: 66 0f 73 d9 08 psrldq xmm1,0x8
31f: 66 0f fe c1 paddd xmm0,xmm1
323: 66 0f 6f c8 movdqa xmm1,xmm0
327: 66 0f 73 d9 04 psrldq xmm1,0x4
32c: 66 0f fe c1 paddd xmm0,xmm1
330: 66 0f 7e c0 movd eax,xmm0
334: 8d 54 90 01 lea edx,[rax+rdx*4+0x1]
338: 8b 47 14 mov eax,DWORD PTR [rdi+0x14]
33b: c1 e0 04 shl eax,0x4
33e: 01 d0 add eax,edx
340: c3 ret
This shows that eliminating the trivial comparison unlocks a new set of
optimizations, i.e. SIMD vectorization. In particular, the loop has been
vectorized by loading the first four constants from aligned memory and the
first four strides from non-aligned memory, then computing the product
and reduction. It interleaves the above with computing 1 + 12*S[4] +
16*S[5] (as scalar operations) and then finishes the reduction.
A similar effect can be observed for fully dynamic extents.
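The compile-time shortcut can be sketched as below (C++17). The names all_static, all_dynamic and sketch_is_dynamic are hypothetical and only loosely modelled on the helpers named in the ChangeLog.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t dyn = static_cast<std::size_t>(-1);

template<std::size_t... Es>
constexpr bool all_static = ((Es != dyn) && ...);

template<std::size_t... Es>
constexpr bool all_dynamic = ((Es == dyn) && ...);

// Answers "is the i-th extent dynamic?" without touching the table of
// static extents when the answer is the same for every i.
template<std::size_t... Es>
constexpr bool sketch_is_dynamic([[maybe_unused]] std::size_t i)
{
  if constexpr (all_static<Es...>)
    return false;
  else if constexpr (all_dynamic<Es...>)
    return true;
  else
    {
      constexpr std::array<std::size_t, sizeof...(Es)> exts{Es...};
      return exts[i] == dyn;
    }
}

static_assert(!sketch_is_dynamic<3, 5, 7>(1), "fully static");
static_assert(sketch_is_dynamic<dyn, dyn>(0), "fully dynamic");
static_assert(sketch_is_dynamic<3, dyn>(1) && !sketch_is_dynamic<3, dyn>(0), "mixed");
```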
libstdc++-v3/ChangeLog:
* include/std/mdspan (__mdspan::__all_static): New function.
(__mdspan::_StaticExtents::_S_is_dyn): Inline and eliminate.
(__mdspan::_ExtentsStorage::_S_is_dynamic): New method.
(__mdspan::_ExtentsStorage::_M_extent): Use _S_is_dynamic.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
One previous commit optimized fully dynamic extents; and another
refactored __size such that __fwd_prod is valid for __r = 0, ..., rank
(exclusive).
Therefore, by noticing that __rev_prod (and __fwd_prod) never accesses
the first (or last) extent, one can avoid pre-computing partial products
of static extents in those cases, if all other extents are dynamic.
We check that the size of the reference object file decreases further
and the .rodata sections for
__fwd_prod<dyn, ..., dyn, 11>
__rev_prod<3, dyn, ..., dyn>
are absent.
libstdc++-v3/ChangeLog:
* include/std/mdspan (__fwd_prods): Relax condition for fully-dynamic
extents to cover (dyn, ..., dyn, X).
(__rev_partial_prods): Analogous for (X, dyn, ..., dyn).
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
In mdspan related code, for extents with no static extents, i.e. only
dynamic extents, the following simplifications can be made:
- The array of dynamic extents has size rank.
- The two arrays dynamic-index and dynamic-index-inv become
trivial, e.g. k[i] == i.
- All elements of the arrays __{fwd,rev}_partial_prods are 1.
This commit eliminates the arrays for dynamic-index, dynamic-index-inv
and __{fwd,rev}_partial_prods. It also removes the indirection k[i] == i
from the source code, which isn't as relevant because the optimizer is
(often) capable of eliminating the indirection.
To check if it's working we look at:
using E2 = std::extents<int, dyn, dyn, dyn, dyn>;
int stride_left_E2(const std::layout_left::mapping<E2>& m, size_t r)
{ return m.stride(r); }
which generates the following
0000000000000190 <stride_left_E2>:
190: 48 c1 e6 02 shl rsi,0x2
194: 74 22 je 1b8 <stride_left_E2+0x28>
196: 48 01 fe add rsi,rdi
199: b8 01 00 00 00 mov eax,0x1
19e: 66 90 xchg ax,ax
1a0: 48 63 17 movsxd rdx,DWORD PTR [rdi]
1a3: 48 83 c7 04 add rdi,0x4
1a7: 48 0f af c2 imul rax,rdx
1ab: 48 39 fe cmp rsi,rdi
1ae: 75 f0 jne 1a0 <stride_left_E2+0x10>
1b0: c3 ret
1b1: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
1b8: b8 01 00 00 00 mov eax,0x1
1bd: c3 ret
We see that:
- There's no code to load the partial product of static extents.
- There's no indirection D[k[i]], it's just D[i] (as before).
On a test file which computes both mapping::stride(r) and
mapping::required_span_size, we check for static storage with
objdump -h
and no longer see the NTTP _Extents or anything related to
_StaticExtents, __fwd_partial_prods or __rev_partial_prods. We also
check that the size of the reference object file (described three
commits prior) is reduced by a few percent, from 41.9kB to 39.4kB.
libstdc++-v3/ChangeLog:
* include/std/mdspan (__mdspan::__all_dynamic): New function.
(__mdspan::_StaticExtents::_S_dynamic_index): Convert to method.
(__mdspan::_StaticExtents::_S_dynamic_index_inv): Ditto.
(__mdspan::_StaticExtents): New specialization for fully dynamic
extents.
(__mdspan::__fwd_prod): New constexpr if branch to avoid
instantiating __fwd_partial_prods.
(__mdspan::__rev_prod): Ditto.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
The methods layout_{left,right}::mapping::stride are defined
as
\prod_{i = 0}^r E[i]
\prod_{i = r+1}^n E[i]
This is computed as the product of a precomputed static product and the
product of the required dynamic extents.
Disassembly shows that even for low-rank extents, i.e. rank == 1 and
rank == 2, with at least one dynamic extent, the generated code loads
two values; and then runs the loop over at most one element, e.g. for
stride_left_d5 defined below the generated code is:
220: 48 8b 04 f5 00 00 00 mov rax,QWORD PTR [rsi*8+0x0]
227: 00
228: 31 d2 xor edx,edx
22a: 48 85 c0 test rax,rax
22d: 74 23 je 252 <stride_left_d5+0x32>
22f: 48 8b 0c f5 00 00 00 mov rcx,QWORD PTR [rsi*8+0x0]
236: 00
237: 48 c1 e1 02 shl rcx,0x2
23b: 74 13 je 250 <stride_left_d5+0x30>
23d: 48 01 f9 add rcx,rdi
240: 48 63 17 movsxd rdx,DWORD PTR [rdi]
243: 48 83 c7 04 add rdi,0x4
247: 48 0f af c2 imul rax,rdx
24b: 48 39 f9 cmp rcx,rdi
24e: 75 f0 jne 240 <stride_left_d5+0x20>
250: 89 c2 mov edx,eax
252: 89 d0 mov eax,edx
254: c3 ret
If there's no dynamic extents, it simply loads the precomputed product
of static extents.
For rank == 1 the answer is the constant `1`; for rank == 2 it's either 1 or
extents.extent(k), with k == 0 for layout_left and k == 1 for
layout_right.
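The low-rank special cases for layout_left can be sketched as follows (C++17). sketch_stride_left is a hypothetical free function operating on a plain array of extents, not the libstdc++ code.

```cpp
#include <array>
#include <cstddef>

// Stride of dimension r for a layout_left-style mapping.
template<std::size_t Rank>
constexpr int
sketch_stride_left([[maybe_unused]] const std::array<int, Rank>& exts,
                   [[maybe_unused]] std::size_t r)
{
  if constexpr (Rank <= 1)
    return 1;                       // rank 1: the constant 1
  else if constexpr (Rank == 2)
    return r == 0 ? 1 : exts[0];    // rank 2: 1 or extent(0)
  else
    {
      int prod = 1;                 // general case: product of exts[0..r)
      for (std::size_t i = 0; i < r; ++i)
        prod *= exts[i];
      return prod;
    }
}

static_assert(sketch_stride_left(std::array<int, 1>{7}, 0) == 1, "");
static_assert(sketch_stride_left(std::array<int, 2>{3, 9}, 1) == 3, "");
static_assert(sketch_stride_left(std::array<int, 3>{3, 5, 7}, 2) == 15, "");
```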
Consider,
using Ed = std::extents<int, dyn>;
int stride_left_d(const std::layout_left::mapping<Ed>& m, size_t r)
{ return m.stride(r); }
using E3d = std::extents<int, 3, dyn>;
int stride_left_3d(const std::layout_left::mapping<E3d>& m, size_t r)
{ return m.stride(r); }
using Ed5 = std::extents<int, dyn, 5>;
int stride_left_d5(const std::layout_left::mapping<Ed5>& m, size_t r)
{ return m.stride(r); }
The optimized code for these three cases is:
0000000000000060 <stride_left_d>:
60: b8 01 00 00 00 mov eax,0x1
65: c3 ret
0000000000000090 <stride_left_3d>:
90: 48 83 fe 01 cmp rsi,0x1
94: 19 c0 sbb eax,eax
96: 83 e0 fe and eax,0xfffffffe
99: 83 c0 03 add eax,0x3
9c: c3 ret
00000000000000a0 <stride_left_d5>:
a0: b8 01 00 00 00 mov eax,0x1
a5: 48 85 f6 test rsi,rsi
a8: 74 02 je ac <stride_left_d5+0xc>
aa: 8b 07 mov eax,DWORD PTR [rdi]
ac: c3 ret
For rank == 1 it simply returns 1 (as expected). For rank == 2, it
either implements a branchless formula, or conditionally loads one
value. In all cases involving a dynamic extent this appears to do
clearly less work, both in terms of computation and loads.
In cases not involving a dynamic extent, it replaces loading one value
with a branchless sequence of four instructions.
This commit also refactors __size to not use any of the precomputed
arrays. This prevents instantiating __{fwd,rev}_partial_prods for
low-rank extents. This results in a further size reduction of a
reference object file (described two commits prior) by 9% from 46.0kB to
41.9kB.
In a prior commit we optimized __size to produce better object code by
precomputing the static products. This refactor enables the optimizer to
generate the same optimized code.
libstdc++-v3/ChangeLog:
* include/std/mdspan (__mdspan::__fwd_prod): Optimize
for rank <= 2.
(__mdspan::__rev_prod): Ditto.
(__mdspan::__size): Refactor to use a pre-computed product, not
a partial product.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
Let E denote a multi-dimensional extent; n the rank of E; r = 0, ...,
n; E[i] the i-th extent; and D[k] be the (possibly empty) array of
dynamic extents.
The two partial products for r = 0, ..., n:
\prod_{i = 0}^r E[i] (fwd)
\prod_{i = r+1}^n E[i] (rev)
can be computed as the product of static and dynamic extents. The static
fwd and rev product can be computed at compile time for all values of r.
Three methods are directly affected by this optimization:
layout_left::mapping::stride
layout_right::mapping::stride
mdspan::size
We'll check the generated code (-O2) for all three methods for a generic
(artificially) high-dimensional extents object.
Consider a generic case:
using Extents = std::extents<int, 3, 5, dyn, dyn, dyn, 7, dyn>;
int stride_left(const std::layout_left::mapping<Extents>& m, size_t r)
{ return m.stride(r); }
The code generated prior to this commit:
4f0: 66 0f 6f 05 00 00 00 movdqa xmm0,XMMWORD PTR [rip+0x0] # 4f8
4f7: 00
4f8: 48 83 c6 01 add rsi,0x1
4fc: 48 c7 44 24 e8 ff ff mov QWORD PTR [rsp-0x18],0xffffffffffffffff
503: ff ff
505: 48 8d 04 f5 00 00 00 lea rax,[rsi*8+0x0]
50c: 00
50d: 0f 29 44 24 b8 movaps XMMWORD PTR [rsp-0x48],xmm0
512: 66 0f 76 c0 pcmpeqd xmm0,xmm0
516: 0f 29 44 24 c8 movaps XMMWORD PTR [rsp-0x38],xmm0
51b: 66 0f 6f 05 00 00 00 movdqa xmm0,XMMWORD PTR [rip+0x0] # 523
522: 00
523: 0f 29 44 24 d8 movaps XMMWORD PTR [rsp-0x28],xmm0
528: 48 83 f8 38 cmp rax,0x38
52c: 74 72 je 5a0 <stride_right_E1+0xb0>
52e: 48 8d 54 04 b8 lea rdx,[rsp+rax*1-0x48]
533: 4c 8d 4c 24 f0 lea r9,[rsp-0x10]
538: b8 01 00 00 00 mov eax,0x1
53d: 0f 1f 00 nop DWORD PTR [rax]
540: 48 8b 0a mov rcx,QWORD PTR [rdx]
543: 49 89 c0 mov r8,rax
546: 4c 0f af c1 imul r8,rcx
54a: 48 83 f9 ff cmp rcx,0xffffffffffffffff
54e: 49 0f 45 c0 cmovne rax,r8
552: 48 83 c2 08 add rdx,0x8
556: 49 39 d1 cmp r9,rdx
559: 75 e5 jne 540 <stride_right_E1+0x50>
55b: 48 85 c0 test rax,rax
55e: 74 38 je 598 <stride_right_E1+0xa8>
560: 48 8b 14 f5 00 00 00 mov rdx,QWORD PTR [rsi*8+0x0]
567: 00
568: 48 c1 e2 02 shl rdx,0x2
56c: 48 83 fa 10 cmp rdx,0x10
570: 74 1e je 590 <stride_right_E1+0xa0>
572: 48 8d 4f 10 lea rcx,[rdi+0x10]
576: 48 01 d7 add rdi,rdx
579: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
580: 48 63 17 movsxd rdx,DWORD PTR [rdi]
583: 48 83 c7 04 add rdi,0x4
587: 48 0f af c2 imul rax,rdx
58b: 48 39 f9 cmp rcx,rdi
58e: 75 f0 jne 580 <stride_right_E1+0x90>
590: c3 ret
591: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
598: c3 ret
599: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
5a0: b8 01 00 00 00 mov eax,0x1
5a5: eb b9 jmp 560 <stride_right_E1+0x70>
5a7: 66 0f 1f 84 00 00 00 nop WORD PTR [rax+rax*1+0x0]
5ae: 00 00
which seems to be performing:
preparatory_work();
ret = 1
for(i = 0; i < rank; ++i)
tmp = ret * E[i]
if E[i] != -1
ret = tmp
for(i = 0; i < rank_dynamic; ++i)
ret *= D[i]
This commit reduces it down to:
270: 48 8b 04 f5 00 00 00 mov rax,QWORD PTR [rsi*8+0x0]
277: 00
278: 31 d2 xor edx,edx
27a: 48 85 c0 test rax,rax
27d: 74 33 je 2b2 <stride_right_E1+0x42>
27f: 48 8b 14 f5 00 00 00 mov rdx,QWORD PTR [rsi*8+0x0]
286: 00
287: 48 c1 e2 02 shl rdx,0x2
28b: 48 83 fa 10 cmp rdx,0x10
28f: 74 1f je 2b0 <stride_right_E1+0x40>
291: 48 8d 4f 10 lea rcx,[rdi+0x10]
295: 48 01 d7 add rdi,rdx
298: 0f 1f 84 00 00 00 00 nop DWORD PTR [rax+rax*1+0x0]
29f: 00
2a0: 48 63 17 movsxd rdx,DWORD PTR [rdi]
2a3: 48 83 c7 04 add rdi,0x4
2a7: 48 0f af c2 imul rax,rdx
2ab: 48 39 f9 cmp rcx,rdi
2ae: 75 f0 jne 2a0 <stride_right_E1+0x30>
2b0: 89 c2 mov edx,eax
2b2: 89 d0 mov eax,edx
2b4: c3 ret
Loosely speaking this does the following:
1. Load the starting position k in the array of dynamic extents; and
return if possible.
2. Load the partial product of static extents.
3. Compute \prod_{i = k}^d D[i] in a loop, where d is the number of
dynamic extents.
It shows that the span used for passing in the dynamic extents is
completely eliminated; and the fact that the product always runs to the
end of the array of dynamic extents is used by the compiler to eliminate
one indirection to determine the end position in the array of dynamic
extents.
The analogous code is generated for layout_left.
Next, consider
using E2 = std::extents<int, 3, 5, dyn, dyn, 7, dyn, 11>;
int size2(const std::mdspan<double, E2>& md)
{ return md.size(); }
On the immediately preceding commit the generated code is
10: 66 0f 6f 05 00 00 00 movdqa xmm0,XMMWORD PTR [rip+0x0] # 18
17: 00
18: 49 89 f8 mov r8,rdi
1b: 48 8d 44 24 b8 lea rax,[rsp-0x48]
20: 48 c7 44 24 e8 0b 00 mov QWORD PTR [rsp-0x18],0xb
27: 00 00
29: 48 8d 7c 24 f0 lea rdi,[rsp-0x10]
2e: ba 01 00 00 00 mov edx,0x1
33: 0f 29 44 24 b8 movaps XMMWORD PTR [rsp-0x48],xmm0
38: 66 0f 76 c0 pcmpeqd xmm0,xmm0
3c: 0f 29 44 24 c8 movaps XMMWORD PTR [rsp-0x38],xmm0
41: 66 0f 6f 05 00 00 00 movdqa xmm0,XMMWORD PTR [rip+0x0] # 49
48: 00
49: 0f 29 44 24 d8 movaps XMMWORD PTR [rsp-0x28],xmm0
4e: 66 66 2e 0f 1f 84 00 data16 cs nop WORD PTR [rax+rax*1+0x0]
55: 00 00 00 00
59: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
60: 48 8b 08 mov rcx,QWORD PTR [rax]
63: 48 89 d6 mov rsi,rdx
66: 48 0f af f1 imul rsi,rcx
6a: 48 83 f9 ff cmp rcx,0xffffffffffffffff
6e: 48 0f 45 d6 cmovne rdx,rsi
72: 48 83 c0 08 add rax,0x8
76: 48 39 c7 cmp rdi,rax
79: 75 e5 jne 60 <size2+0x50>
7b: 48 85 d2 test rdx,rdx
7e: 74 18 je 98 <size2+0x88>
80: 49 63 00 movsxd rax,DWORD PTR [r8]
83: 49 63 48 04 movsxd rcx,DWORD PTR [r8+0x4]
87: 48 0f af c1 imul rax,rcx
8b: 41 0f af 40 08 imul eax,DWORD PTR [r8+0x8]
90: 0f af c2 imul eax,edx
93: c3 ret
94: 0f 1f 40 00 nop DWORD PTR [rax+0x0]
98: 31 c0 xor eax,eax
9a: c3 ret
which is needlessly long. The current commit reduces it down to:
10: 48 63 07 movsxd rax,DWORD PTR [rdi]
13: 48 63 57 04 movsxd rdx,DWORD PTR [rdi+0x4]
17: 48 0f af c2 imul rax,rdx
1b: 0f af 47 08 imul eax,DWORD PTR [rdi+0x8]
1f: 69 c0 83 04 00 00 imul eax,eax,0x483
25: c3 ret
Which simply computes the product:
D[0] * D[1] * D[2] * const
where const is the product of all static extents. Meaning the loop to
compute the product of dynamic extents has been fully unrolled and
all constants are perfectly precomputed.
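The constant in the final imul can be reproduced with a fold expression (C++17). In the sketch below, static_prod, rank_dynamic and sketch_size are hypothetical names; the static product for E2 is 3 * 5 * 7 * 11 = 1155 = 0x483.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t dyn = static_cast<std::size_t>(-1);

// Product of all static extents; dynamic extents contribute a factor of 1.
template<std::size_t... Es>
constexpr std::size_t static_prod
  = (std::size_t(1) * ... * (Es == dyn ? std::size_t(1) : Es));

template<std::size_t... Es>
constexpr std::size_t rank_dynamic
  = (std::size_t(0) + ... + (Es == dyn ? 1 : 0));

// size = (compile-time constant) * (product of the dynamic extents).
template<std::size_t... Es>
constexpr std::size_t
sketch_size(const std::array<int, rank_dynamic<Es...>>& d)
{
  std::size_t n = static_prod<Es...>;
  for (int e : d)
    n *= static_cast<std::size_t>(e);
  return n;
}

static_assert(static_prod<3, 5, dyn, dyn, 7, dyn, 11> == 0x483, "");
static_assert(sketch_size<3, 5, dyn, dyn, 7, dyn, 11>(
                std::array<int, 3>{2, 3, 4}) == 27720, "");
```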
The size of the object file described in the previous commit reduces
by 17% from 55.8kB to 46.0kB.
libstdc++-v3/ChangeLog:
* include/std/mdspan (__mdspan::__static_prod): New function.
(__mdspan::__fwd_partial_prods): Constexpr array of partial
forward products.
(__mdspan::__rev_partial_prods): Same for reverse partial
products.
(__mdspan::__static_extents_prod): Delete function.
(__mdspan::__extents_prod): Renamed from __exts_prod and refactored.
(__mdspan::__fwd_prod): Compute as the product of the pre-computed
static product and the product of dynamic extents.
(__mdspan::__rev_prod): Ditto.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
In mdspan related code involving static extents, often the IndexType is
part of the template parameters, even though it's not needed.
This commit extracts the parts of _ExtentsStorage not related to
IndexType into a separate class _StaticExtents.
It also prefers passing the array of static extents, instead of the
whole extents object where possible.
The size of an object file compiled with -O2 that instantiates
Layout::mapping<extents<IndexType, Indices...>>::stride
Layout::mapping<extents<IndexType, Indices...>>::required_span_size
for the product of
- eight IndexTypes,
- three Layouts,
- nine choices of Indices...
decreases by 19% from 69.2kB to 55.8kB.
libstdc++-v3/ChangeLog:
* include/std/mdspan (__mdspan::_StaticExtents): Extract non IndexType
related code from _ExtentsStorage.
(__mdspan::_ExtentsStorage): Use _StaticExtents.
(__mdspan::__static_extents): Return reference to NTTP of _StaticExtents.
(__mdspan::__contains_zero): New overload.
(__mdspan::__exts_prod, __mdspan::__static_quotient): Use span to avoid
copying __sta_exts.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
This patch adds the [[nodiscard]] attribute to the operator() of ranges
algorithm function objects if their std counterpart has it.
Furthermore, we mark as [[nodiscard]] the operator() of the following ranges
algorithms that lack a std counterpart:
* find_last, find_last_if, find_last_if_not (to match other find
algorithms)
* contains, contains_subrange (to match find/any_of and search)
Finally, [[nodiscard]] is added to std::min and std::max overloads
that accept std::initializer_list. This appears to be an oversight,
as std::minmax is already marked, and other min overloads are as well.
The same applies to corresponding operator() overloads of ranges::min and
ranges::max.
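The effect of the attribute on a function object can be sketched as follows (C++17). sketch::contains is a hypothetical, simplified function object, not the libstdc++ ranges::contains.

```cpp
#include <array>

namespace sketch
{
  struct contains_fn
  {
    // Discarding the result of a pure query is almost certainly a bug,
    // so the call operator is marked [[nodiscard]].
    template<typename Range, typename T>
    [[nodiscard]] constexpr bool
    operator()(const Range& r, const T& value) const
    {
      for (const auto& x : r)
        if (x == value)
          return true;
      return false;
    }
  };

  inline constexpr contains_fn contains{};
}

constexpr std::array<int, 3> example_values{1, 2, 3};
static_assert(sketch::contains(example_values, 2), "");
static_assert(!sketch::contains(example_values, 7), "");
// sketch::contains(example_values, 2);  // would trigger a nodiscard warning
```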
PR libstdc++/121476
libstdc++-v3/ChangeLog:
* include/bits/ranges_algo.h (__all_of_fn::operator())
(__any_of_fn::operator(), __none_of_fn::operator())
(__find_first_of_fn::operator(), __count_fn::operator())
(__find_end_fn::operator(), __remove_if_fn::operator())
(__remove_fn::operator(), __unique_fn::operator())
(__is_sorted_until_fn::operator(), __is_sorted_fn::operator())
(__lower_bound_fn::operator(), __upper_bound_fn::operator())
(__equal_range_fn::operator(), __binary_search_fn::operator())
(__is_partitioned_fn::operator(), __partition_point_fn::operator())
(__minmax_fn::operator(), __min_element_fn::operator())
(__includes_fn::operator(), __max_fn::operator())
(__lexicographical_compare_fn::operator(), __clamp_fn::operator())
(__find_last_fn::operator(), __find_last_if_fn::operator())
(__find_last_if_not_fn::operator()): Add [[nodiscard]] attribute.
* include/bits/ranges_algobase.h (__equal_fn::operator()):
Add [[nodiscard]] attribute.
* include/bits/ranges_util.h (__find_fn::operator())
(__find_if_fn::operator(), __find_if_not_fn::operator())
(__mismatch_fn::operator(), __search_fn::operator())
(__min_fn::operator(), __adjacent_find_fn::operator()):
Add [[nodiscard]] attribute.
* include/bits/stl_algo.h (std::min(initializer_list<T>))
(std::min(initializer_list<T>, _Compare))
(std::max(initializer_list<T>))
(std::max(initializer_list<T>, _Compare)): Add _GLIBCXX_NODISCARD.
* testsuite/25_algorithms/min/constrained.cc: Silence nodiscard
warning.
* testsuite/25_algorithms/max/constrained.cc: Likewise.
* testsuite/25_algorithms/minmax/constrained.cc: Likewise.
* testsuite/25_algorithms/minmax_element/constrained.cc: Likewise.
|
|
[PR121313]
For __n == 0, the elements were self move-assigned by
std::move_backward(__ins, __old_finish - __n, __old_finish).
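The effect can be reproduced with a small model. The sketch below uses a hand-written backward move loop standing in for std::move_backward, and a hypothetical Tracked type that counts self move-assignments; it does not use the vector internals.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Tracked
{
  int value = 0;
  static inline int self_moves = 0;

  Tracked() = default;
  Tracked(Tracked&&) = default;
  Tracked& operator=(Tracked&& other) noexcept
  {
    if (this == &other)
      ++self_moves;            // record the self move-assignment
    else
      value = other.value;
    return *this;
  }
};

// Moves [ins, old_finish - n) backwards so that it ends at old_finish.
template<typename It>
void shift_tail(It ins, It old_finish, std::ptrdiff_t n)
{
  It src = old_finish - n;
  It dst = old_finish;
  while (src != ins)
    *--dst = std::move(*--src);
}

// Returns the number of self move-assignments for a 4-element vector.
int count_self_moves(std::ptrdiff_t n, bool guard_empty)
{
  std::vector<Tracked> v(4);
  Tracked::self_moves = 0;
  if (!guard_empty || n != 0)
    shift_tail(v.begin() + 1, v.end(), n);
  return Tracked::self_moves;
}

inline void run_self_move_demo()
{
  assert(count_self_moves(0, false) == 3);  // n == 0: every element hits itself
  assert(count_self_moves(0, true) == 0);   // guarded: nothing happens
  assert(count_self_moves(1, false) == 0);  // n > 0: ordinary shift
}
```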
PR libstdc++/121313
libstdc++-v3/ChangeLog:
* include/bits/vector.tcc (vector::insert_range): Add check for
empty size.
* testsuite/23_containers/vector/modifiers/insert/insert_range.cc:
New tests.
|
|
For rvalues the _Self parameter deduces a non-reference type. Consequently,
((_Self)__self) moved the object to a temporary, which was then destroyed on
function exit.
This patch fixes this by using a C-style cast of __self to (const indirect&).
This not only resolves the above issue but also correctly handles types that
are derived (publicly and privately) from indirect. Allocator requirements in
[allocator.requirements.general] p22 guarantee that dereferencing const _M_objp
works with equivalent semantics to dereferencing _M_objp.
PR libstdc++/121128
libstdc++-v3/ChangeLog:
* include/bits/indirect.h (indirect::operator*):
Cast __self to appropriately qualified indirect.
* testsuite/std/memory/indirect/access.cc: New test.
* testsuite/std/memory/polymorphic/access.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
When the C++98 std::distance and std::advance functions (and C++11
std::next and std::prev) are used with C++20 iterators there can be
unexpected results, ranging from compilation failure to decreased
performance to undefined behaviour.
An iterator which satisfies std::input_iterator but does not meet the
Cpp17InputIterator requirements might have std::output_iterator_tag for
its std::iterator_traits<I>::iterator_category, which means it currently
cannot be used with std::advance at all. However, the implementation of
std::advance for a Cpp17InputIterator doesn't do anything that isn't
valid for iterator types satisfying C++20 std::input_iterator.
Similarly, a type satisfying C++20 std::bidirectional_iterator might be
usable with std::prev, if it weren't for the fact that its C++17
iterator_category is std::input_iterator_tag.
Finally, a type satisfying C++20 std::random_access_iterator might use a
slower implementation for std::distance or std::advance if its C++17
iterator_category is not std::random_access_iterator_tag.
This commit adds a __promotable_iterator concept to detect C++20
iterators which explicitly define an iterator_concept member, and which
either have no iterator_category, or their iterator_category is weaker
than their iterator_concept. This is used by std::distance and
std::advance to detect iterators which should dispatch based on their
iterator_concept instead of their iterator_category. This means that
those functions just work and do the right thing for C++20 iterators
which would otherwise fail to compile or have suboptimal performance.
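The dispatch idea can be sketched in C++17 with tag dispatch instead of concepts. This is a simplification: it prefers a member iterator_concept whenever one exists, without the "weaker category" check described above. CountingIter, dispatch_tag and sketch_distance are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <type_traits>

// Random access via iterator_concept, but only an input iterator
// according to its C++17 iterator_category.
struct CountingIter
{
  using iterator_concept = std::random_access_iterator_tag;
  using iterator_category = std::input_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = void;
  using reference = int;

  int pos = 0;
  static inline int increments = 0;

  int operator*() const { return pos; }
  CountingIter& operator++() { ++pos; ++increments; return *this; }
  friend bool operator!=(CountingIter a, CountingIter b)
  { return a.pos != b.pos; }
  friend std::ptrdiff_t operator-(CountingIter a, CountingIter b)
  { return a.pos - b.pos; }
};

// Use the member iterator_concept if present, else the category.
template<typename It, typename = void>
struct dispatch_tag
{ using type = typename std::iterator_traits<It>::iterator_category; };

template<typename It>
struct dispatch_tag<It, std::void_t<typename It::iterator_concept>>
{ using type = typename It::iterator_concept; };

template<typename It>
std::ptrdiff_t distance_impl(It first, It last, std::input_iterator_tag)
{
  std::ptrdiff_t n = 0;
  for (; first != last; ++first)
    ++n;
  return n;
}

template<typename It>
std::ptrdiff_t
distance_impl(It first, It last, std::random_access_iterator_tag)
{ return last - first; }

template<typename It>
std::ptrdiff_t sketch_distance(It first, It last)
{ return distance_impl(first, last, typename dispatch_tag<It>::type{}); }

const std::forward_list<int> example_list{1, 2, 3};

inline void run_distance_demo()
{
  assert(sketch_distance(CountingIter{0}, CountingIter{1000}) == 1000);
  assert(CountingIter::increments == 0);  // subtraction, no stepping
  assert(sketch_distance(example_list.begin(), example_list.end()) == 3);
}
```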
This is related to LWG 3197, which considers making it undefined to use
std::prev with types which do not meet the Cpp17BidirectionalIterator
requirements. I think making it work, as in this commit, is a better
solution than banning it (or rejecting it at compile-time as libc++
does).
PR libstdc++/102181
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator_base_funcs.h (distance, advance):
Check C++20 iterator concepts and handle appropriately.
(__detail::__iter_category_converts_to_concept): New concept.
(__detail::__promotable_iterator): New concept.
* testsuite/24_iterators/operations/cxx20_iterators.cc: New
test.
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
This adds the new bitset constructor from string_view
defined in P2697 to the debug version of the type.
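On standard libraries without that constructor, a similar result can be obtained from the long-standing pointer-and-length constructor. The helper below is a hypothetical sketch, not part of libstdc++, and assumes sv.data() is not null.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string_view>

// Builds a bitset from the characters of sv, which need not be
// null-terminated. Throws std::invalid_argument for characters
// other than '0' and '1'.
template<std::size_t N>
std::bitset<N> sketch_bitset_from_view(std::string_view sv)
{ return std::bitset<N>(sv.data(), sv.size()); }

inline void run_bitset_demo()
{
  assert(sketch_bitset_from_view<8>("1010").to_ulong() == 10);
  // A view into the middle of a longer string: only "1100" is used.
  std::string_view middle = std::string_view("xx1100yy").substr(2, 4);
  assert(sketch_bitset_from_view<8>(middle).to_ulong() == 12);
}
```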
libstdc++-v3/ChangeLog:
PR libstdc++/119742
* include/debug/bitset: Add new ctor.
|
|
libstdc++-v3/ChangeLog:
* include/std/mdspan: Small stylistic adjustments.
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
|
|
[PR121196]
PR libstdc++/121196
libstdc++-v3/ChangeLog:
* include/std/inplace_vector (std::erase): Provide default argument
for _Up parameter.
* testsuite/23_containers/inplace_vector/erasure.cc: Add test for
using braced-init-list as arguments to erase_if and use function
to verify content of inplace_vector.
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
The unordered_map header incorrectly refers to a non-existent template parameter
_Value in default template argument descriptions. They should refer to _Key instead.
This patch fixes these descriptions to match the actual template parameters.
libstdc++-v3/ChangeLog:
* include/bits/unordered_map.h: Rectify referencing of
non-existent type.
|
|
In case of input iterators, the loop that assigns to existing elements
should run up to number of elements in vector (_M_size) not capacity (_Nm).
PR libstdc++/119137
libstdc++-v3/ChangeLog:
* include/std/inplace_vector (inplace_vector::assign_range):
Replace _Nm with _M_size in the assignment loop.
|
|
Previously, the default ctor of mdspan was never noexcept, even if all
members of mdspan were nothrow default constructible.
This commit makes mdspan conditionally nothrow default constructible.
A similar strengthening happens in libc++.
libstdc++-v3/ChangeLog:
* include/std/mdspan (mdspan::mdspan): Make default ctor
conditionally noexcept.
* testsuite/23_containers/mdspan/mdspan.cc: Add tests.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
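A minimal sketch of a conditionally noexcept default constructor, under the
assumption of a simplified view type with a single mapping member. The names
View, QuietMapping and LoudMapping are hypothetical and this is not the
mdspan implementation.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

struct QuietMapping { QuietMapping() noexcept {} };
struct LoudMapping { LoudMapping() {} };  // not declared noexcept

template<typename Mapping>
struct View {
  int* data;
  Mapping mapping;

  // noexcept only if default construction of the member cannot throw.
  View() noexcept(std::is_nothrow_default_constructible_v<Mapping>)
  : data(nullptr), mapping()
  { }
};

static_assert(std::is_nothrow_default_constructible_v<View<QuietMapping>>);
static_assert(!std::is_nothrow_default_constructible_v<View<LoudMapping>>);

} // namespace sketch
```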
|
|
The mdspan::is_{,always}_{unique,strided,exhaustive} methods only call
their counterparts in mdspan::mapping_type. The standard specifies that
the methods of mdspan::mapping_type are noexcept, but doesn't specify if
the methods of mdspan are noexcept.
Libc++ strengthened the exception guarantee for these mdspan methods.
This commit conditionally strengthens these methods for libstdc++.
libstdc++-v3/ChangeLog:
* include/std/mdspan (mdspan::is_always_unique): Make
conditionally noexcept.
(mdspan::is_always_exhaustive): Ditto.
(mdspan::is_always_strided): Ditto.
(mdspan::is_unique): Ditto.
(mdspan::is_exhaustive): Ditto.
(mdspan::is_strided): Ditto.
* testsuite/23_containers/mdspan/layout_like.h: Make noexcept
configurable. Add ThrowingLayout.
* testsuite/23_containers/mdspan/mdspan.cc: Add tests for
noexcept.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
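The conditional strengthening can be illustrated with a forwarding member
whose exception specification mirrors the call it wraps. This is only a
sketch: View and NothrowLayout are hypothetical, and ThrowingLayout here is
a stand-in that borrows the name of the test helper but not its definition.

```cpp
#include <cassert>
#include <utility>

namespace sketch {

struct NothrowLayout {
  bool is_unique() const noexcept { return true; }
};

struct ThrowingLayout {
  bool is_unique() const { return true; }  // potentially throwing
};

template<typename Mapping>
struct View {
  Mapping mapping;

  // noexcept exactly when the forwarded call is noexcept.
  bool is_unique() const noexcept(noexcept(mapping.is_unique()))
  { return mapping.is_unique(); }
};

} // namespace sketch
```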
|
|
Currently this new concept will get defined for -std=c++17 -fconcepts
but as it uses std::input_iterator, which is new in C++20, that won't
work. Guard it with __cpp_lib_concepts as well as __cpp_concepts.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator_base_types.h (__any_input_iterator):
Only define when __cpp_lib_concepts is defined.
|
|
PR libstdc++/119137
libstdc++-v3/ChangeLog:
* include/std/inplace_vector (inplace_vector::operator=):
Qualify call to std::addressof.
|
|
Previously, for localized output, if the _M_debug option was set, _M_check_ok
completed successfully and _M_locale_fmt was called for months/weekdays that
are !ok().
This patch lifts the debug checks from each conversion function into
_M_check_ok, which for !ok() values returns a string_view containing the kind
of calendar data, to be included after the "is not a valid" string. The
localized output (_M_locale_fmt) is not used if the string is non-empty.
Emitting this message is now handled in _M_format_to, further reducing each
specifier function.
To handle weekday (%a,%A) and month (%b,%B), _M_check_ok now accepts a
mutable reference to conversion specifier, and updates it to corresponding
numeric value (%w, %m). Extra care needs to be taken to handle a month(0)
that needs to be printed as single digit in debug format.
Finally, the _M_time_point is replaced with _M_needs_ok_check member, that
indicates if input contains any user-supplied values that are checked for
being ok() and these values are referenced in chrono-specs.
PR libstdc++/121154
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (_ChronoSpec::_M_time_point): Remove.
(_ChronoSpec::_M_needs_ok_check): Define.
(__formatter_chrono::_M_parse): Set _M_needs_ok_check.
(__formatter_chrono::_M_check_ok): Check values also for debug mode,
and return __string_view.
(__formatter_chrono::_M_format_to): Handle results of _M_check_ok.
(__formatter_chrono::_M_wi, __formatter_chrono::_M_a_A)
(__formatter_chrono::_M_b_B, __formatter_chrono::_M_C_y_Y)
(__formatter_chrono::_M_d_e, __formatter_chrono::_M_F):
Removed handling of _M_debug.
(__formatter_chrono::__M_m): Print zero unpadded in _M_debug mode.
(__formatter_duration::_S_spec_for): Remove _M_time_point reference.
(__formatter_duration::_M_parse): Override _M_needs_ok_check.
* testsuite/std/time/month/io.cc: Test for localized !ok() values.
* testsuite/std/time/weekday/io.cc: Test for localized !ok() values.
|
|
This implements the missing functions in _Utf_iterator to support
reverse iteration. All existing tests pass when the view is reversed, so
that the same code units are seen when iterating forwards or backwards.
libstdc++-v3/ChangeLog:
* include/bits/unicode.h (_Utf_iterator::operator--): Reorder
conditions and update position after reading a code unit.
(_Utf_iterator::_M_read_reverse): Define.
(_Utf_iterator::_M_read_utf8): Return extracted code point.
(_Utf_iterator::_M_read_reverse_utf8): Define.
(_Utf_iterator::_M_read_reverse_utf16): Define.
(_Utf_iterator::_M_read_reverse_utf32): Define.
* testsuite/ext/unicode/view.cc: Add checks for reversed views
and reverse iteration.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
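To illustrate what reading a code point in reverse involves for UTF-8, here
is a small sketch that walks backwards over continuation bytes until it
reaches the lead byte. It assumes the input is valid UTF-8, which the real
_Utf_iterator does not; the function names read_reverse_utf8 and
decode_backwards are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

namespace sketch {

// Decode the code point that ends just before `pos` and move `pos` back to
// its first code unit. Assumes `s` is valid UTF-8 and pos > 0.
inline char32_t read_reverse_utf8(std::string_view s, std::size_t& pos) {
  char32_t cp = 0;
  int shift = 0;
  // Walk back over continuation bytes (10xxxxxx), taking 6 bits from each.
  while (pos > 1
         && (static_cast<unsigned char>(s[pos - 1]) & 0xC0) == 0x80) {
    unsigned char unit = static_cast<unsigned char>(s[pos - 1]);
    cp |= static_cast<char32_t>(unit & 0x3F) << shift;
    shift += 6;
    --pos;
  }
  // The lead byte holds the remaining high bits; how many depends on the
  // number of continuation bytes that followed it.
  unsigned char lead = static_cast<unsigned char>(s[--pos]);
  unsigned mask = shift == 0 ? 0x7Fu : (0x3Fu >> (shift / 6));
  cp |= static_cast<char32_t>(lead & mask) << shift;
  return cp;
}

// Collect all code points of `s`, last one first.
inline std::vector<char32_t> decode_backwards(std::string_view s) {
  std::vector<char32_t> out;
  for (std::size_t pos = s.size(); pos > 0; )
    out.push_back(read_reverse_utf8(s, pos));
  return out;
}

} // namespace sketch
```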
|
|
This reorders the data members of _Utf_iterator to avoid padding bytes
between members due to alignment requirements. For x86_64 the previous
layout had padding after _M_buf and after _M_to_increment for the common
case where the iterators and sentinel types are pointers, so the size
shrinks from 40 bytes to 32 bytes. (For i686 there's no change, it's
still 20 bytes).
We could compress the three uint8_t members into one byte by using
bit-fields:
uint8_t _M_buf_index : 2; // [0,3]
uint8_t _M_buf_last : 3; // [0,4]
uint8_t _M_to_increment : 3; // [0,4]
But there doesn't seem to be any point, because it will just be slower
to access them and there will be tail padding so the size isn't any
smaller. We could also reduce _M_buf_last and _M_to_increment to 2 bits
because the 0 value is only used for a default constructed iterator, and
we don't actually care about the values in that case. Again, this
doesn't seem worth doing.
libstdc++-v3/ChangeLog:
* include/bits/unicode.h (_Utf_iterator): Reorder data members
to be more compact.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
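The effect of member order on padding can be seen with two hypothetical
structs that hold the same members. These are not the actual _Utf_iterator
members or layout, only an illustration of the principle. On a typical
64-bit target the interleaved form needs padding after every one-byte
member, while the grouped form only has tail padding.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Each one-byte member is followed by a pointer, so the compiler has to
// insert padding to align the pointer.
struct Interleaved {
  const char* first;
  std::uint8_t buf_index;
  const char* current;
  std::uint8_t buf_last;
  const char* last;
  std::uint8_t to_increment;
};

// Same members, with the small ones grouped at the end.
struct Grouped {
  const char* first;
  const char* current;
  const char* last;
  std::uint8_t buf_index;
  std::uint8_t buf_last;
  std::uint8_t to_increment;
};

static_assert(sizeof(Grouped) <= sizeof(Interleaved),
              "grouping small members should never increase the size");

} // namespace sketch
```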
|
|
Implement std::inplace_vector as specified in P0843R14, without follow
up papers, in particular P3074R7 (trivial unions). In consequence
inplace_vector<T, N> can be used inside constant evaluations only
if T is trivial or N is equal to zero.
We provide a separate specialization for inplace_vector<T, 0> to meet
the requirements of N5008 [inplace.vector.overview] p5. In particular
objects of such types need to be empty.
To allow a constexpr variable v of type inplace_vector where
v.size() < v.capacity(), we need to guarantee that all elements of the
storage array are initialized, even ones in the range
[v.data() + v.size(), v.data() + v.capacity()). This is performed by the
_M_init function, which is called by each constructor. By storing the
array in an anonymous union, we can perform this initialization in
constant evaluation, avoiding the impact on the runtime path.
The size() function conveys the information that _M_size <= _Nm to the
compiler, by calling __builtin_unreachable(). In particular this allows us
to eliminate false-positive warnings by using _Nm - size() instead of
_Nm - _M_size when computing the number of available elements.
The included tests cover almost all code paths at runtime, however some
compile-time evaluation tests are not yet implemented:
* operations on range, they depend on making testsuite_iterators constexpr
* negative test for invoking operations with preconditions at compile time,
especially for zero size specialization.
PR libstdc++/119137
libstdc++-v3/ChangeLog:
* doc/doxygen/user.cfg.in (INPUT): Add new header.
* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/bits/stl_iterator_base_types.h (__any_input_iterator):
Define.
* include/bits/version.def (inplace_vector): Define.
* include/bits/version.h: Regenerate.
* include/precompiled/stdc++.h: Include new header.
* src/c++23/std.cc.in: Export contents of new header.
* include/std/inplace_vector: New file.
* testsuite/23_containers/inplace_vector/access/capacity.cc: New file.
* testsuite/23_containers/inplace_vector/access/elem.cc: New file.
* testsuite/23_containers/inplace_vector/access/elem_neg.cc: New file.
* testsuite/23_containers/inplace_vector/cons/1.cc: New file.
* testsuite/23_containers/inplace_vector/cons/from_range.cc: New file.
* testsuite/23_containers/inplace_vector/cons/throws.cc: New file.
* testsuite/23_containers/inplace_vector/copy.cc: New file.
* testsuite/23_containers/inplace_vector/erasure.cc: New file.
* testsuite/23_containers/inplace_vector/modifiers/assign.cc: New file.
* testsuite/23_containers/inplace_vector/modifiers/erase.cc: New file.
* testsuite/23_containers/inplace_vector/modifiers/multi_insert.cc:
New file.
* testsuite/23_containers/inplace_vector/modifiers/single_insert.cc:
New file.
* testsuite/23_containers/inplace_vector/move.cc: New file.
* testsuite/23_containers/inplace_vector/relops.cc: New file.
* testsuite/23_containers/inplace_vector/version.cc: New file.
* testsuite/util/testsuite_iterators.h (input_iterator_wrapper::base):
Define.
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Co-authored-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
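The size() technique mentioned above can be sketched with a much simpler
fixed-capacity buffer. The class fixed_buf and its members are hypothetical
and far from a conforming inplace_vector: elements are value-initialized up
front and N must be positive. __builtin_unreachable() is a GCC/Clang
builtin, so it is guarded here.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

template<typename T, std::size_t N>
class fixed_buf {
  static_assert(N > 0, "this sketch does not handle zero capacity");

  T elems_[N] = {};
  std::size_t size_ = 0;

public:
  std::size_t size() const noexcept {
#if defined(__GNUC__)
    // Tell the optimizer that size_ never exceeds the capacity.
    if (size_ > N)
      __builtin_unreachable();
#endif
    return size_;
  }

  // Uses size() rather than size_, so the compiler can see that the
  // subtraction does not wrap around.
  std::size_t remaining() const noexcept { return N - size(); }

  bool try_push_back(const T& value) {
    if (size() == N)
      return false;
    elems_[size_++] = value;
    return true;
  }
};

} // namespace sketch
```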
|
|
The second bug report in PR121061 is that the conversion of custom
OtherIndexType to IndexType is incorrectly not done via r-value
references.
This commit fixes the forwarding issue, adds a custom IndexType called
RValueInt, which only allows conversion to int via r-value reference.
PR libstdc++/121061
libstdc++-v3/ChangeLog:
* include/std/mdspan (extents::extents): Perform conversion to
index_type of an r-value reference.
(layout_left::mapping::operator()): Ditto.
(layout_right::mapping::operator()): Ditto.
(layout_stride::mapping::operator()): Ditto.
* testsuite/23_containers/mdspan/extents/custom_integer.cc: Add
tests for RValueInt and MutatingInt.
* testsuite/23_containers/mdspan/int_like.h (RValueInt): Add.
* testsuite/23_containers/mdspan/layouts/mapping.cc: Test with
RValueInt.
* testsuite/23_containers/mdspan/mdspan.cc: Ditto.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
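A small sketch of the forwarding issue. RValueInt is named in the commit,
but the definition below is an assumed one, and to_index is a hypothetical
helper rather than anything in the mdspan header.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {

// Assumed shape of the test type: convertible to int only from an rvalue.
struct RValueInt {
  int value;
  constexpr operator int() && noexcept { return value; }
};

// Convert a user-supplied index, forwarding it so that an rvalue argument
// is still an rvalue when the conversion operator is selected.
template<typename IndexType, typename OtherIndexType>
constexpr IndexType to_index(OtherIndexType&& i) {
  return static_cast<IndexType>(std::forward<OtherIndexType>(i));
}

static_assert(std::is_convertible_v<RValueInt, int>);
static_assert(!std::is_convertible_v<RValueInt&, int>);

} // namespace sketch
```

Passing the parameter on by name instead of forwarding it would turn it into
an lvalue, and the conversion above would no longer compile.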
|
|
PR121061 consists of two bugs for mdspan related code. This commit fixes
the first one. Namely, when passing custom IndexType as an array or
span, the conversion to int must be const. Prior to this commit the
constraint incorrectly also allowed non-const conversion. This commit
updates all related constraints to check
__valid_index_type<const OtherIndexType&, index_type>
in those cases. Also adds a MutatingInt to int_like.h which only
supports non-const conversion to int and updates the tests.
PR libstdc++/121061
libstdc++-v3/ChangeLog:
* include/std/mdspan (extents::extents): Fix constraint to
prevent non-const conversion to index_type.
(layout_stride::mapping::mapping): Ditto.
(mdspan::mdspan): Ditto.
(mdspan::operator[]): Ditto.
* testsuite/23_containers/mdspan/extents/custom_integer.cc: Add
test for MutatingInt.
* testsuite/23_containers/mdspan/int_like.h (MutatingInt): Add.
* testsuite/23_containers/mdspan/layouts/mapping.cc: Add test for
MutatingInt.
* testsuite/23_containers/mdspan/layouts/stride.cc: Ditto.
* testsuite/23_containers/mdspan/mdspan.cc: Ditto.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>
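The const-qualified check can be illustrated as follows. The trait
valid_index_type is a stand-in whose definition is assumed (the internal
__valid_index_type may differ), and ConstInt, MutatingInt and usable_in_span
are hypothetical definitions for this sketch only.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Assumed stand-in for the internal constraint.
template<typename From, typename IndexType>
inline constexpr bool valid_index_type =
  std::is_convertible_v<From, IndexType>
  && std::is_nothrow_constructible_v<IndexType, From>;

struct ConstInt {
  int value;
  constexpr operator int() const noexcept { return value; }
};

// Only a non-const object can be converted.
struct MutatingInt {
  int value;
  constexpr operator int() noexcept { return value; }
};

// Indices passed in an array or span are checked as const lvalues.
template<typename T>
inline constexpr bool usable_in_span = valid_index_type<const T&, int>;

} // namespace sketch
```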
|
|
Add comments documenting what it does and how it does it.
Also reorder the if-else in operator++ so that we check whether to
iterate over code units in the local buffer before checking whether to
refill that buffer. That seems the more natural way to structure the
function.
libstdc++-v3/ChangeLog:
* include/bits/unicode.h (__unicode::_Utf_iterator): Add
comments.
(__unicode::_Utf_iterator::operator++()): Check whether to
iterate over the buffer first.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
The __promoted_t alias is only defined when __cpp_fold_expressions is
defined, which might not be the case for some hypothetical C++17
compilers.
Change the 3-arg std::hypot to just use __gnu_cxx::__promote_3 which is
always available.
libstdc++-v3/ChangeLog:
PR libstdc++/121097
* include/c_global/cmath (hypot): Use __promote_3 instead of
__promoted.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
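For illustration, a three-argument promotion can be written without fold
expressions. The aliases promote_one and promote_3 and the function hypot3
below are hypothetical stand-ins, not __gnu_cxx::__promote_3 itself, and the
sketch assumes arithmetic argument types and a C++17 <cmath>.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

namespace sketch {

// Integers are promoted to double, floating-point types are kept.
template<typename T>
using promote_one = std::conditional_t<std::is_integral_v<T>, double, T>;

// The usual arithmetic conversions pick the widest of the three types.
template<typename T, typename U, typename V>
using promote_3 =
  decltype(promote_one<T>() + promote_one<U>() + promote_one<V>());

template<typename T, typename U, typename V>
promote_3<T, U, V> hypot3(T x, U y, V z) {
  using R = promote_3<T, U, V>;
  return std::hypot(static_cast<R>(x), static_cast<R>(y),
                    static_cast<R>(z));
}

} // namespace sketch
```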
|
|
This change was part of P2697R1 (Interfacing bitset with string_view)
and should be slightly cheaper to instantiate.
We should consider using basic_string_view for C++17, C++20, and C++23
as well. This patch just conservatively changes it for C++26 to match
the working draft. It's conceivable that a program-defined
specialization of basic_string<_CharT> or basic_string_view<_CharT> will
observe a difference and be affected by this change.
libstdc++-v3/ChangeLog:
* include/std/bitset (__bitset::__string) [__cpp_lib_bitset]:
Change alias to refer to basic_string_view instead.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
Do not advertise library support for constexpr exceptions, as our
approach of throwing via the __throw_* functions from <bits/functexcept.h>
causes constant evaluation to fail, because these functions are not constexpr.
PR libstdc++/121114
libstdc++-v3/ChangeLog:
* include/bits/version.def (constexpr_exceptions): Add no_stdname
and changed value.
* include/bits/version.h: Regenerated.
* testsuite/18_support/exception/version.cc: Test that macro is
not exported.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kaminski <tkaminsk@redhat.com>
|
|
This is a minor compile-time optimization for C++20.
libstdc++-v3/ChangeLog:
* include/bits/move.h (swap): Replace enable_if with concepts
when available, and with __enable_if_t alias otherwise.
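The two ways of constraining a function can be sketched side by side. This
is a simplified swap, not the one in bits/move.h: movable_v is a
hypothetical helper and std::enable_if_t stands in for the internal
__enable_if_t alias.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {

template<typename T>
inline constexpr bool movable_v =
  std::is_move_constructible_v<T> && std::is_move_assignable_v<T>;

#if __cpp_concepts >= 201907L
// C++20: a requires-clause states the constraint directly.
template<typename T>
  requires movable_v<T>
void swap(T& a, T& b)
#else
// Before C++20: the constraint is encoded in the return type.
template<typename T>
std::enable_if_t<movable_v<T>> swap(T& a, T& b)
#endif
{
  T tmp = std::move(a);
  a = std::move(b);
  b = std::move(tmp);
}

} // namespace sketch
```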
|
|
The standard specifies some of the effects of ranges::advance in terms
of "Equivalent to:" and it's observable that our current implementation
deviates from the precise specification in the standard. This was
causing some failures in the libc++ testsuite.
For the sized_sentinel_for<I, S> case I optimized our implementation to
avoid redundant calls when we have already checked that there's nothing
to do. We were eliding `advance(i, bound)` when the iterator already
equals the sentinel, and eliding `advance(i, n)` when `n` is zero. In
both cases, removing the seemingly redundant calls is not equivalent to
the spec because `i = std::move(bound)` or `i += 0` operations can be
observed by program-defined iterators. This patch inlines the observable
side effects of advance(i, bound) or advance(i, 0) without actually
calling those functions.
For the non-sized sentinel case, `if (i == bound || n == 0)` is
different from `if (n == 0 || i == bound)` for the case where n is zero
and a program-defined iterator observes the number of comparisons.
This patch changes it to do `n == 0` first. I don't think this is
required by the standard, as this condition is not "Equivalent to:" any
observable sequence of operations, but testing `n == 0` first is
probably cheaper anyway.
libstdc++-v3/ChangeLog:
* include/bits/ranges_base.h (ranges::advance(i, n, bound)):
Ensure that observable side effects on iterators match what is
specified in the standard.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
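The difference between the two orders of evaluation can be observed with an
iterator that counts comparisons. This sketch covers only the non-sized
sentinel case with non-negative n; Iter, Sentinel, comparisons and
sketch::advance are hypothetical and not the ranges::advance code.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

int comparisons = 0;  // number of iterator/sentinel comparisons made

struct Sentinel { int end; };

struct Iter {
  int pos;
  Iter& operator++() { ++pos; return *this; }
  friend bool operator==(const Iter& i, const Sentinel& s)
  { ++comparisons; return i.pos == s.end; }
};

// Advance i by at most n steps without passing bound, and return the
// number of steps that were not taken.
template<typename I, typename S>
std::ptrdiff_t advance(I& i, std::ptrdiff_t n, S bound) {
  // Testing n == 0 first means no comparison is made when there is
  // nothing to do.
  if (n == 0 || i == bound)
    return n;
  do {
    ++i;
    --n;
  } while (n != 0 && !(i == bound));
  return n;
}

} // namespace sketch
```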
|
|
Data members of type __maybe_present_t where the conditionally present
type might be an aggregate or fundamental type need to be explicitly
value-initialized (rather than implicitly default-initialized), so that
default-initialization of the containing class always results in an
a completely initialized object.
PR libstdc++/119962
libstdc++-v3/ChangeLog:
* include/std/ranges (join_view::_Iterator::_M_outer): Initialize.
(lazy_split_view::_OuterIter::_M_current): Initialize.
(join_with_view::_Iterator::_M_outer_it): Initialize.
* testsuite/std/ranges/adaptors/join.cc (test15): New test.
* testsuite/std/ranges/adaptors/join_with/1.cc (test05): New test.
* testsuite/std/ranges/adaptors/lazy_split.cc (test13): New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
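The difference between the two forms of member initialization can be shown
with two hypothetical wrappers (these are not the __maybe_present_t members
from <ranges>). The helper below default-initializes an object on top of
storage filled with non-zero bytes and reads the member back, which is only
well-defined for the value-initialized variant.

```cpp
#include <cassert>
#include <cstring>
#include <new>

namespace sketch {

// Default-initialization leaves `member` indeterminate for fundamental T.
template<typename T>
struct DefaultInit { T member; };

// The default member initializer makes default-initialization of the
// containing class initialize `member` as well.
template<typename T>
struct ValueInit { T member = T(); };

inline int default_constructed_member() {
  alignas(ValueInit<int>) unsigned char storage[sizeof(ValueInit<int>)];
  std::memset(storage, 0xFF, sizeof storage);
  // No parentheses or braces: this is default-initialization.
  ValueInit<int>* p = ::new (storage) ValueInit<int>;
  return p->member;
}

} // namespace sketch
```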
|
|
The new test is currently marked as XFAIL because PR c++/102284 means
that GCC doesn't notice that the lifetimes have ended.
libstdc++-v3/ChangeLog:
PR libstdc++/121024
* include/bits/ranges_uninitialized.h (ranges::destroy): Do not
optimize away trivial destructors during constant evaluation.
(ranges::destroy_n): Likewise.
* testsuite/20_util/specialized_algorithms/destroy/121024.cc:
New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
I've just created LWG 4295 proposing this change, and am implementing it
via this patch.
libstdc++-v3/ChangeLog:
* include/experimental/memory (swap, make_observer_ptr): Add
constexpr.
(operator==, operator!=, operator<, operator>, operator<=)
(operator>=): Likewise.
* testsuite/experimental/memory/observer_ptr/make_observer.cc:
Checks for constant evaluation.
* testsuite/experimental/memory/observer_ptr/relops/relops.cc:
Likewise.
* testsuite/experimental/memory/observer_ptr/swap/swap.cc:
Likewise.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
Another follow-up to r16-2190-g4faa42ac0dee2c, ensuring that make_signed
and make_unsigned work on enumeration types with 128-bit integers as
their underlying type.
libstdc++-v3/ChangeLog:
* include/std/type_traits (__make_unsigned_selector): Add
unsigned __int128 to type list.
* testsuite/20_util/make_unsigned/int128.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
This is a follow-up to r16-2190-g4faa42ac0dee2c which ensures that
std::hash is always enabled for signed and unsigned __int128. The
standard requires std::hash to be enabled for all arithmetic types.
libstdc++-v3/ChangeLog:
PR libstdc++/96710
* include/bits/functional_hash.h (hash<__int128>): Define for
strict modes.
(hash<unsigned __int128>): Likewise.
* testsuite/20_util/hash/int128.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
|
|
With changes r16-2063-g8ad5968a8dcb47 the _M_a_A, _M_b_B and _M_p functions
are called only if the locale is equal to locale::classic(), for which
the behavior is known. This patch changes their implementation, so instead of
referring to __timepunct facet members, they use a hardcoded list of English
weekday and month names. Only one list is needed, as in the case of
locale::classic() the abbreviated name corresponds to the first three letters
of the full name.
For _M_p, _M_r we use a new _S_fill_ampm helper, that fills the provided buffer
with "AM"/"PM" depending on the hours value.
In _M_S we no longer guard querying of the numpunct facet with a check that
requires potentially equally expensive construction of locale::classic. We
also mark the localized path as unlikely.
The _M_locale method is no longer used in __formatter_chrono, and thus was
moved to __formatter_duration.
PR libstdc++/110739
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__formatter_chrono::_S_weekdays)
(__formatter_chrono::_S_months, __formatter_chrono::_S_fill_ampm):
Define.
(__formatter_chrono::_M_format_to): Do not pass context parameter
to functions listed below.
(__formatter_chrono::_M_a_A, __formatter_chrono::_M_b_B): Implement
using hardcoded list of names, and remove format context parameter.
(__formatter_chrono::_M_p, __formatter_chrono::_M_r): Implement
using _S_fill_ampm.
(__formatter_chrono::_M_c): Removed format context parameter.
(__formatter_chrono::_M_subsecs): Call __ctx.locale() directly,
instead of _M_locale and do not compare with locale::classic().
Add [[unlikely]] attributes.
(__formatter_chrono::_M_locale): Move to __formatter_duration.
(__formatter_duration::_M_locale): Moved from __formatter_chrono.
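The single-list idea can be sketched as follows. The names weekdays,
weekday_name and fill_ampm are hypothetical and the signatures are assumed;
they are not the _S_weekdays or _S_fill_ampm members.

```cpp
#include <cassert>
#include <string_view>

namespace sketch {

inline constexpr std::string_view weekdays[7] = {
  "Sunday", "Monday", "Tuesday", "Wednesday",
  "Thursday", "Friday", "Saturday"
};

// full == true corresponds to %A, otherwise to %a. In the classic locale
// the abbreviation is the first three letters of the full name.
constexpr std::string_view weekday_name(unsigned wd, bool full) {
  std::string_view name = weekdays[wd % 7];
  return full ? name : name.substr(0, 3);
}

// Writes "AM" or "PM" into buf for an hour in [0, 23].
constexpr void fill_ampm(char (&buf)[2], unsigned hour) {
  buf[0] = hour < 12 ? 'A' : 'P';
  buf[1] = 'M';
}

} // namespace sketch
```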
|
|
We pre-emptively implemented part of LWG 2766, which still hasn't been
approved. Add comments to the deleted swap overloads saying why they're
there, because the standard doesn't require them.
libstdc++-v3/ChangeLog:
* include/bits/stl_pair.h (swap): Add comment to deleted
overload.
* include/bits/unique_ptr.h (swap): Likewise.
* include/std/array (swap): Likewise.
* include/std/optional (swap): Likewise.
* include/std/tuple (swap): Likewise.
* include/std/variant (swap): Likewise.
* testsuite/23_containers/array/tuple_interface/get_neg.cc:
Adjust dg-error line numbers.
|
|
Reported upstream: https://github.com/uxlfoundation/oneDPL/issues/2342
libstdc++-v3/ChangeLog:
* include/pstl/algorithm_impl.h (__for_each_n_it_serial):
Protect against overloaded comma operator.
(__brick_walk2): Likewise.
(__brick_walk2_n): Likewise.
(__brick_walk3): Likewise.
(__brick_move_destroy::operator()): Likewise.
(__brick_calc_mask_1): Likewise.
(__brick_copy_by_mask): Likewise.
(__brick_partition_by_mask): Likewise.
(__brick_calc_mask_2): Likewise.
(__brick_reverse): Likewise.
(__pattern_partial_sort_copy): Likewise.
* include/pstl/memory_impl.h (__brick_uninitialized_move):
Likewise.
(__brick_uninitialized_copy): Likewise.
* include/pstl/numeric_impl.h (__brick_transform_scan):
Likewise.
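The kind of protection meant here can be sketched with an iterator type that
overloads the comma operator. OddIter, comma_calls and copy_range are
hypothetical and not taken from the PSTL headers. Casting one operand to
void forces the built-in comma operator, because an overloaded operator
cannot accept a void argument.

```cpp
#include <cassert>

namespace sketch {

int comma_calls = 0;  // counts uses of the overloaded comma operator

struct OddIter {
  int* p;
  int& operator*() const { return *p; }
  OddIter& operator++() { ++p; return *this; }
  friend bool operator!=(OddIter a, OddIter b) { return a.p != b.p; }

  // Hijacks any `x, y` expression whose left operand is an OddIter.
  template<typename T>
  friend void operator,(const OddIter&, const T&) { ++comma_calls; }
};

template<typename In, typename Out>
Out copy_range(In first, In last, Out out) {
  // The (void) cast keeps the user-provided operator, out of the loop.
  for (; first != last; ++first, (void)++out)
    *out = *first;
  return out;
}

} // namespace sketch
```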
|
|
Only P3068R6 (Allowing exception throwing in constant-evaluation) is
implemented in the library so far, so the value of the
constexpr_exceptions feature test macro should be 202411L. Once we
support the library changes in P3378R2 (constexpr exception types) then
we can set the value to 202502L again.
libstdc++-v3/ChangeLog:
PR libstdc++/117785
* include/bits/version.def (constexpr_exceptions): Define
correct value.
* include/bits/version.h: Regenerate.
* libsupc++/exception: Check correct value.
* testsuite/18_support/exception/version.cc: New test.
|