Age | Commit message | Author | Files | Lines |
|
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_base.h: Do not include <bit>.
|
|
Use a requires-clause on the partial specialization of the
__pmf_expects_stop_token variable template, which is used for the
extension that allows constructing std::jthread with a
pointer-to-member-function that accepts a std::stop_token argument.
Also add a comment referring to the related Bugzilla PR.
libstdc++-v3/ChangeLog:
PR libstdc++/100612
* include/std/thread (__pmf_expects_stop_token): Constrain
variable template specialization with concept. Add comment.
|
|
Add conditional noexcept to the remaining range access functions that
were not changed in r15-5669-g8692cb10e82e72. This is now being proposed
for C++26 by P3623R0 (not published yet).
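A minimal sketch of what a conditional noexcept on such a function looks
like, assuming a hypothetical free function my_rbegin rather than the
real declarations in range_access.h:

```cpp
#include <cassert>
#include <vector>

// Hypothetical function modelled on std::rbegin. The noexcept-specifier
// mirrors whether the member call itself can throw.
template<typename Container>
constexpr auto
my_rbegin(Container& c) noexcept(noexcept(c.rbegin()))
  -> decltype(c.rbegin())
{ return c.rbegin(); }

struct ThrowingRange
{
  int* rbegin() { return nullptr; }  // not declared noexcept
};

inline bool vector_rbegin_is_noexcept()
{
  std::vector<int> v;
  return noexcept(my_rbegin(v));     // vector::rbegin() is noexcept
}

inline bool throwing_rbegin_is_noexcept()
{
  ThrowingRange r;
  return noexcept(my_rbegin(r));     // propagates the potentially-throwing call
}
```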
libstdc++-v3/ChangeLog:
* include/bits/range_access.h (rbegin, rend, crbegin, crend):
Add conditional noexcept, as per P3623R0.
* testsuite/24_iterators/headers/iterator/range_access.cc: Add
noexcept-specifier to rbegin, rend, crbegin and crend
declarations.
|
|
The code example here does:
```
if (begin == end) __builtin_unreachable();
std::list nl(begin, end);
for (auto it = nl.begin(); it != nl.end(); it++)
{
...
}
/* Remove the first element of the list. */
nl.erase(nl.begin());
```
And we get a warning because we jump threaded the case where we
think the list was empty out of the for loop, BUT we populated it from
a non-empty array. So we can help the compiler here by adding, after
initializing the list from a non-empty array, that the list will not be
empty either.
This removes the -Wfree-nonheap-object warning in the first reduced
testcase (with the fix for the `begin == end` case added) in PR 118865; the
second reduced testcase has been filed as PR 118867.
Bootstrapped and tested on x86_64-linux-gnu.
libstdc++-v3/ChangeLog:
PR libstdc++/118865
* include/bits/stl_list.h (_M_initialize_dispatch): Add an
unreachable hint: if the iterator range was not empty, the list
will not be empty either.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
There are some unqualified calls to make_pair in Parallel Mode. Fix
these by just using a qualified call, because it's simpler and I don't
care about this code much.
libstdc++-v3/ChangeLog:
* include/parallel/algobase.h (__mismatch_switch): Qualify calls
to make_pair to avoid ADL.
|
|
_Rb_tree::_M_equal_range calls make_pair unqualified, which means it
uses ADL. As the new testcase shows, this can find something other than
std::make_pair. Rather than just changing it to use a qualified call,
remove the use of make_pair entirely. We don't need to deduce any types
here, we know exactly what type of std::pair we want to construct, so do
that explicitly.
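To illustrate the problem in isolation, here is a small sketch. The
namespace user, the type Iter and both range_* functions are
hypothetical and are not taken from the new testcase.

```cpp
#include <cassert>
#include <utility>

namespace user
{
  struct Iter { int pos; };

  // A user-provided function that ADL finds for arguments of type
  // user::Iter. It does not return a std::pair at all.
  template<typename A, typename B>
  int make_pair(A, B) { return -1; }
}

// Unqualified call: the name is looked up via ADL at instantiation time,
// so user::make_pair is selected for user::Iter arguments.
template<typename It>
auto range_unqualified(It first, It last)
{ return make_pair(first, last); }

// Explicit construction: no name lookup for make_pair is involved.
template<typename It>
std::pair<It, It> range_explicit(It first, It last)
{ return std::pair<It, It>(first, last); }
```

Calling range_unqualified with user::Iter arguments yields an int from
user::make_pair, while range_explicit always yields std::pair<It, It>.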
libstdc++-v3/ChangeLog:
* include/bits/stl_tree.h (_Rb_tree::_M_equal_range): Replace
unqualified call to make_pair with explicit construction of
std::pair.
* testsuite/23_containers/set/operations/equal_range_adl.cc:
New test.
|
|
- Some hardware has support for floating point atomic fetch_add (and
similar).
- There are existing compilers targeting this hardware that use
libstdc++ -- e.g. NVC++.
- Since the libstdc++ atomic<float>::fetch_add and similar is written
directly as a CAS loop these compilers can not emit optimal code when
seeing such constructs.
- I hope to use __atomic_fetch_add builtins on floating point types
directly in libstdc++ so these compilers can emit better code.
- Clang already handles some floating point types in the
__atomic_fetch_add family of builtins.
- In order to only use this when available, I originally thought I could
check against the resolved versions of the builtin in a manner
something like `__has_builtin(__atomic_fetch_add_<fp-suffix>)`.
I then realised that clang does not expose resolved versions of these
atomic builtins to the user.
From the clang discourse it was suggested we instead use SFINAE (which
clang already supports).
- I have recently pushed a patch for allowing the use of SFINAE on
builtins: r15-6042-g9ed094a817ecaf
Now that patch is committed, this patch does not change what happens
for GCC, while it uses the builtin for codegen with clang.
- I have previously sent a patchset upstream adding the ability to use
__atomic_fetch_add and similar on floating point types.
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668754.html
Once that patchset is upstream (plus the automatic linking of
libatomic as Joseph pointed out in the email below
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665408.html )
then current GCC should start to use the builtin branch added in this
patch.
So *currently*, this patch allows external compilers (NVC++ in
particular) to generate better code, and similarly lets clang understand
the operation better since it maps to a known builtin.
I hope that by GCC 16 this patch would also allow GCC to understand the
operation better via mapping to a known builtin.
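For reference, the CAS-loop form mentioned above looks roughly like the
sketch below. It is written against the public std::atomic<float>
interface rather than the internal __atomic_impl helpers, and
fetch_add_cas is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>

// Generic fallback: emulate fetch_add for a floating-point atomic with a
// compare-exchange loop. Returns the value held before the addition.
inline float
fetch_add_cas(std::atomic<float>& a, float v,
              std::memory_order mo = std::memory_order_seq_cst)
{
  float old = a.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak reloads 'old' with the current value,
  // so 'old + v' is recomputed on every iteration.
  while (!a.compare_exchange_weak(old, old + v, mo,
                                  std::memory_order_relaxed))
    { }
  return old;
}
```

A compiler that only sees this loop has to recognize the pattern before
it can map it onto a hardware floating-point atomic add, which is the
motivation for calling the builtin directly when it is usable.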
libstdc++-v3/ChangeLog:
* include/bits/atomic_base.h (__atomic_fetch_addable): Define
new concept.
(__atomic_impl::__fetch_add_flt): Use new concept to make use of
__atomic_fetch_add when available.
(__atomic_fetch_subtractable, __fetch_sub_flt): Likewise.
(__atomic_add_fetchable, __add_fetch_flt): Likewise.
(__atomic_sub_fetchable, __sub_fetch_flt): Likewise.
Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
|
|
The code was caching the result of `invoke(proj, *it)` in a local
`auto &&` variable. The problem is that this may create dangling
references, for instance in case `proj` is `std::identity` (the common
case) and `*it` produces a prvalue: lifetime extension does not
apply here due to the expressions involved.
Instead, store (and lifetime-extend) the result of `*it` in a separate
variable, then project that variable. While at it, also forward the
result of the projection to the predicate, so that the predicate can
act on the proper value category.
libstdc++-v3/ChangeLog:
PR libstdc++/118160
PR libstdc++/100249
* include/bits/ranges_algo.h (__is_permutation_fn): Avoid a
dangling reference by storing the result of the iterator
dereference and the result of the projection in two distinct
variables, in order to lifetime-extend each one.
Forward the projected value to the predicate.
* testsuite/25_algorithms/is_permutation/constrained.cc: Add a
test with a range returning prvalues. Test it in a constexpr
context, in order to rely on the compiler to catch UB.
Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
|
|
Because basic_ostream::sentry::~sentry is implicitly noexcept, we can't
let any exceptions escape from it, or the program would terminate. If
the streambuf's sync() function throws, or if it returns an error and
setting badbit in the stream state throws, then the program would
terminate.
LWG 835 intended to prevent exceptions from being thrown by the
std::basic_ostream::sentry destructor, but failed to cover the case
where the streambuf's sync() member throws an exception. LWG 4188 is
needed to fix that part. In any case, LWG 835 was never implemented for
libstdc++ so this does that, as well as my proposed fix for 4188 (that
badbit should be set if pubsync() exits via an exception).
In order to avoid a second try-catch block to handle an exception that
might be thrown by setting badbit, this introduces an RAII helper class
that temporarily clears the stream's exceptions mask, then restores it
afterwards.
The try-catch block doesn't handle the forced_unwind exception
explicitly, because catching and rethrowing that would just terminate
when it reached the sentry's implicit noexcept(true) anyway.
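The idea behind the RAII helper can be sketched with the public
basic_ios interface only. The class exceptions_guard below is
hypothetical; the real _Disable_exceptions type is internal to the
library and is not restricted to the public interface.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Hypothetical helper: clear the exceptions mask for the current scope and
// restore it on exit.
class exceptions_guard
{
  std::ios& ios_;
  std::ios_base::iostate saved_;

public:
  explicit exceptions_guard(std::ios& ios)
  : ios_(ios), saved_(ios.exceptions())
  { ios_.exceptions(std::ios_base::goodbit); }

  ~exceptions_guard()
  {
    // exceptions(mask) also calls clear(rdstate()), which throws if an
    // enabled bit is already set. Swallow that so the destructor never
    // throws; the mask itself has been restored by then.
    try { ios_.exceptions(saved_); } catch (...) { }
  }

  exceptions_guard(const exceptions_guard&) = delete;
  exceptions_guard& operator=(const exceptions_guard&) = delete;
};

inline bool demo_guard()
{
  std::ostringstream os;
  os.exceptions(std::ios_base::badbit);
  {
    exceptions_guard guard(os);
    os.setstate(std::ios_base::badbit);  // would throw without the guard
  }
  return os.bad() && os.exceptions() == std::ios_base::badbit;
}
```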
libstdc++-v3/ChangeLog:
* include/bits/ostream.h (basic_ostream::_Disable_exceptions):
RAII helper type.
(basic_ostream::sentry::~sentry): Use _Disable_exceptions. Add
try-catch block around call to pubsync.
* testsuite/27_io/basic_ostream/exceptions/char/lwg4188.cc: New
test.
* testsuite/27_io/basic_ostream/exceptions/wchar_t/lwg4188.cc:
New test.
|
|
Add a comment referencing PR 111050, to ensure the fix made by
r12-9903-g1be57348229666 doesn't get reverted.
libstdc++-v3/ChangeLog:
PR libstdc++/111050
* include/bits/hashtable_policy.h (_Hash_node_value_base): Add
comment about always_inline attributes.
|
|
This fixes flat_map/multimap::insert_range by just generalizing the
insert implementation to handle a heterogeneous iterator/sentinel pair.
I'm not sure we can do better than this, e.g. we can't implement it in
terms of the adapted containers' insert_range because that'd require two
passes over the range.
For flat_set/multiset, we can implement insert_range directly in terms
of the adapted container's insert_range. A fallback implementation
is also provided if insert_range isn't available, as is the case for
std::deque currently.
PR libstdc++/118156
libstdc++-v3/ChangeLog:
* include/std/flat_map (_Flat_map_impl::_M_insert): Generalized
version of insert taking heterogenous iterator/sentinel pair.
(_Flat_map_impl::insert): Dispatch to _M_insert.
(_Flat_map_impl::insert_range): Likewise.
(flat_map): Export _Flat_map_impl::insert_range.
(flat_multimap): Likewise.
* include/std/flat_set (_Flat_set_impl::insert_range):
Reimplement directly, not in terms of insert.
(flat_set): Export _Flat_set_impl::insert_range.
(flat_multiset): Likewise.
* testsuite/23_containers/flat_map/1.cc (test06): New test.
* testsuite/23_containers/flat_multimap/1.cc (test06): New test.
* testsuite/23_containers/flat_multiset/1.cc (test06): New test.
* testsuite/23_containers/flat_set/1.cc (test06): New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
In some cases we're wrongly returning an iterator to (one past) the last
element inserted instead of to the first element inserted.
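The expected convention is the same as for the long-standing
iterator-pair overload of vector::insert, which the sketch below uses
for illustration (offset_of_inserted is a hypothetical helper, and
insert_range itself is not called here):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Insert 'src' into 'v' at index 'pos' and report where the returned
// iterator points, as an offset from v.begin().
inline std::ptrdiff_t
offset_of_inserted(std::vector<int>& v, std::size_t pos,
                   const std::vector<int>& src)
{
  auto it = v.insert(v.begin() + static_cast<std::ptrdiff_t>(pos),
                     src.begin(), src.end());
  return it - v.begin();
}
```

The returned offset equals the insertion position, i.e. the iterator
refers to the first inserted element and not to the end of the inserted
range.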
libstdc++-v3/ChangeLog:
* include/bits/stl_bvector.h (vector<bool>::insert_range):
Consistently return an iterator pointing to the first element
inserted.
* include/bits/vector.tcc (vector::insert_range): Likewise.
* testsuite/23_containers/vector/bool/modifiers/insert/insert_range.cc:
Verify insert_range return values.
* testsuite/23_containers/vector/modifiers/insert/insert_range.cc:
Likewise.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
The std::latch::max() function assumes that the returned value can be
represented by ptrdiff_t, which is true when __platform_wait_t is int
(e.g. on Linux) but not when it's unsigned long, which is the case for
most other 64-bit targets. We should use the smaller of PTRDIFF_MAX and
std::numeric_limits<__platform_wait_t>::max(). Use std::cmp_less to do a
safe comparison that works for all types. We can also use std::cmp_less
and std::cmp_equal in std::latch::count_down so that we don't need to
deal with comparisons between signed and unsigned.
Also add a missing precondition check to the constructor and fix the
existing check in count_down, which was duplicated by mistake.
libstdc++-v3/ChangeLog:
PR libstdc++/98749
* include/std/latch (latch::max()): Ensure the return value is
representable as the return type.
(latch::latch(ptrdiff_t)): Add assertion.
(latch::count_down): Fix copy & pasted duplicate assertion. Use
std::cmp_equal to compare __platform_wait_t and ptrdiff_t
values.
(latch::_M_a): Use defined constant for alignment.
* testsuite/30_threads/latch/1.cc: Check max(). Check constant
initialization works for values in the valid range. Check
alignment.
|
|
The range adaptor perfect forwarding simplification mechanism is currently
only enabled for trivially copyable bound arguments, to prevent undesirable
copies of complex objects. But "trivially copyable" is the wrong property
to check for here, since a move-only type with a trivial move constructor
is considered trivially copyable, and after P2494R2 we can't assume copy
constructibility of the bound arguments. This patch makes the mechanism
more specifically check for trivial copy constructibility instead so
that it's properly disabled for move-only bound arguments.
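The difference between the two traits can be seen with a small
move-only type (MoveOnly is a hypothetical example, not one of the
types used in the tests):

```cpp
#include <cassert>
#include <type_traits>

// Move-only type with trivial move operations. Declaring the move
// constructor makes the copy constructor implicitly deleted.
struct MoveOnly
{
  MoveOnly() = default;
  MoveOnly(MoveOnly&&) = default;
  MoveOnly& operator=(MoveOnly&&) = default;
  int value;
};

// Trivially copyable, because every eligible copy/move operation is trivial.
static_assert(std::is_trivially_copyable_v<MoveOnly>);
// But it cannot be copy constructed at all, trivially or otherwise.
static_assert(!std::is_copy_constructible_v<MoveOnly>);
static_assert(!std::is_trivially_copy_constructible_v<MoveOnly>);
```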
PR libstdc++/118413
libstdc++-v3/ChangeLog:
* include/std/ranges (views::__adaptor::_Partial): Adjust
constraints on the "simple" partial specializations to require
is_trivially_copy_constructible_v instead of
is_trivially_copyable_v.
* testsuite/std/ranges/adaptors/adjacent_transform/1.cc (test04):
Extend P2494R2 test.
* testsuite/std/ranges/adaptors/transform.cc (test09): Likewise.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
The following testcase:
bool f(const std::vector<bool>& v, std::size_t x) {
return v[x];
}
is compiled as:
f(std::vector<bool, std::allocator<bool> > const&, unsigned long):
testq %rsi, %rsi
leaq 63(%rsi), %rax
movq (%rdi), %rdx
cmovns %rsi, %rax
sarq $6, %rax
leaq (%rdx,%rax,8), %rdx
movq %rsi, %rax
sarq $63, %rax
shrq $58, %rax
addq %rax, %rsi
andl $63, %esi
subq %rax, %rsi
jns .L2
addq $64, %rsi
subq $8, %rdx
.L2:
movl $1, %eax
shlx %rsi, %rax, %rax
andq (%rdx), %rax
setne %al
ret
which is quite expensive for simple bit access in a bitmap. The reason is that
the bit access is implemented using iterators
return begin()[__n];
This in turn has to handle the case where __n is negative, yielding the extra
conditional:
_GLIBCXX20_CONSTEXPR
void
_M_incr(ptrdiff_t __i)
{
_M_assume_normalized();
difference_type __n = __i + _M_offset;
_M_p += __n / int(_S_word_bit);
__n = __n % int(_S_word_bit);
if (__n < 0)
{
__n += int(_S_word_bit);
--_M_p;
}
_M_offset = static_cast<unsigned int>(__n);
}
We could use __builtin_unreachable to declare that __n is in the range
0...max_size(), but I think it is better to implement it directly, since the
resulting code is shorter and much easier to optimize.
We now produce:
.LFB1248:
.cfi_startproc
movq (%rdi), %rax
movq %rsi, %rdx
shrq $6, %rdx
andq (%rax,%rdx,8), %rsi
andl $63, %esi
setne %al
ret
Testcase suggests
movq (%rdi), %rax
movl %esi, %ecx
shrq $5, %rsi # does still need to be 64-bit
movl (%rax,%rsi,4), %eax
btl %ecx, %eax
setb %al
retq
Which is still one instruction shorter.
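A sketch of direct bit access with an unsigned index, which is the idea
behind the new operator[]. The names test_bit and word_bits are
hypothetical, and this is not the code added to stl_bvector.h.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Number of bits stored per word.
constexpr std::size_t word_bits = sizeof(unsigned long) * CHAR_BIT;

// Because 'n' is unsigned, the division and remainder need no fix-up for
// negative values and reduce to a shift and a mask.
inline bool
test_bit(const unsigned long* words, std::size_t n)
{
  return (words[n / word_bits] >> (n % word_bits)) & 1UL;
}
```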
libstdc++-v3/ChangeLog:
PR target/80813
* include/bits/stl_bvector.h (vector<bool, _Alloc>::operator []): Do
not use iterators.
gcc/testsuite/ChangeLog:
PR target/80813
* g++.dg/tree-ssa/bvector-3.C: New test.
|
|
As reported in PR118185, std::ranges::clamp does not correctly forward
the projected value to the comparator. Add the missing forward.
libstdc++-v3/ChangeLog:
PR libstdc++/118185
PR libstdc++/100249
* include/bits/ranges_algo.h (__clamp_fn): Correctly forward the
projected value to the comparator.
* testsuite/25_algorithms/clamp/118185.cc: New test.
Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
This adds <bits/ostream.h> so that other headers don't need to include
all of <ostream>, which pulls in all of <format> since C++23 (for the
std::print and std::println overloads in <ostream>). This new header
allows the constrained operator<< in <bits/unique_ptr.h> to be defined
without all of std::format being compiled.
We could also replace <ostream> with <bits/ostream.h> in all of
<istream>, <fstream>, <sstream>, and <spanstream>. That seems more
likely to cause problems for users who might be expecting <sstream> to
define std::endl, for example. Although the standard doesn't guarantee
that, it is more reasonable than expecting <memory> to define it! We can
look into making those changes for GCC 16.
libstdc++-v3/ChangeLog:
PR libstdc++/99995
* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/bits/unique_ptr.h: Include bits/ostream.h instead of
ostream.
* include/std/ostream: Include new header.
* include/bits/ostream.h: New file.
|
|
Replace some `__cplusplus > 201402L` preprocessor checks with more
expressive checks for the appropriate feature test macro.
libstdc++-v3/ChangeLog:
* include/bits/stl_map.h: Check __glibcxx_node_extract instead
of __cplusplus.
* include/bits/stl_multimap.h: Likewise.
* include/bits/stl_multiset.h: Likewise.
* include/bits/stl_set.h: Likewise.
* include/bits/stl_tree.h: Likewise.
|
|
libstdc++-v3/ChangeLog:
PR libstdc++/109849
* include/bits/vector.tcc (vector::_M_range_insert): Fix
reversed args in length calculation.
|
|
This fixes warnings like the following during bootstrap:
sparc-sun-solaris2.11/libstdc++-v3/include/bits/atomic_futex.h:324:53: warning: unused parameter ‘__mo’ [-Wunused-parameter]
324 | _M_load_when_equal(unsigned __val, memory_order __mo)
| ~~~~~~~~~~~~~^~~~
libstdc++-v3/ChangeLog:
* include/bits/atomic_futex.h (__atomic_futex_unsigned): Remove
names of unused parameters in non-futex implementation.
|
|
libstdc++-v3/ChangeLog:
* include/bits/move.h (__addressof, forward, forward_like, move)
(move_if_noexcept, addressof): Add always_inline attribute.
Replace _GLIBCXX_NODISCARD with [[__nodiscard__]].
|
|
libstdc++-v3/ChangeLog:
* include/std/span: Fix indentation.
|
|
This commit implements P2447R6. The code is straightforward (just one
extra constructor, with constraints and conditional explicit).
I decided to suppress -Winit-list-lifetime because otherwise it would
give too many false positives. The new constructor is meant to be used
as a parameter-passing interface (this is a design choice, see
P2447R6/§2) and, as such, the initializer_list won't dangle despite
GCC's warnings.
The new constructor isn't 100% backwards compatible. A couple of
examples are included in Annex C, but I have also lifted some more
from R4. A new test checks for the old and the new behaviors.
libstdc++-v3/ChangeLog:
* include/bits/version.def: Add the new feature-testing macro.
* include/bits/version.h: Regenerate.
* include/std/span: Add constructor from initializer_list.
* testsuite/23_containers/span/init_list_cons.cc: New test.
* testsuite/23_containers/span/init_list_cons_neg.cc: New test.
Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
|
|
Any std::span<T, N> constructor with a runtime length has a precondition
that the length is equal to N (except when N == std::dynamic_extent).
Currently every constructor with a runtime length does:
if constexpr (extent != dynamic_extent)
__glibcxx_assert(n == extent);
We can move those assertions into the __detail::__extent_storage<N>
constructor so they are only done in one place. To avoid checking the
assertions when we have a constant length we can add a second
constructor which is consteval and takes an integral_constant<size_t, N>
argument. The std::span constructors can pass a size_t for runtime
lengths and a std::integral_constant<size_t, N> for constant lengths
that don't need to be checked.
The __detail::__extent_storage<dynamic_extent> specialization only needs
one constructor, as a std::integral_constant<size_t, N> argument can
implicitly convert to size_t.
For the member functions that return a subspan with a constant extent we
return std::span<T,C>(ptr, C) which is redundant in two ways. Repeating
the constant length C when it's already a template argument is
redundant, and using the std::span(T*, size_t) constructor implies a
runtime length which will do a redundant assertion check. Even though
that assertion won't fail and should be optimized away, it's still
unnecessary code that doesn't need to be instantiated and then optimized
away again. We can avoid that by adding a new private constructor that
only takes a pointer (wrapped in a custom tag struct to avoid
accidentally using that constructor) and automatically sets _M_extent to
the correct value.
libstdc++-v3/ChangeLog:
* include/std/span (__detail::__extent_storage): Check
precondition in constructor. Add consteval constructor for valid
lengths and deleted constructor for invalid constant lengths.
Make member functions always_inline.
(__detail::__span_ptr): New class template.
(span): Adjust constructors to use a std::integral_constant
value for constant lengths. Declare all specializations of
std::span as friends.
(span::first<C>, span::last<C>, span::subspan<O,C>): Use new
private constructor.
(span(__span_ptr<T>)): New private constructor for constant
lengths.
|
|
std::regex builds a cache of equivalence classes by calling
std::regex_traits<char>::transform_primary(c) for every char, which then
calls std::collate<char>::transform which calls strxfrm. On several
targets strxfrm fails for non-ASCII characters. Because strxfrm has no
return value reserved to indicate an error, some implementations return
INT_MAX or SIZE_MAX. This causes std::collate::transform to try to
allocate a huge buffer, which is either very slow or throws
std::bad_alloc. We should check errno after calling strxfrm to detect
errors and then throw a more appropriate exception instead of trying to
allocate a huge buffer.
Unfortunately the std::collate<C>::_M_transform function has a
non-throwing exception specifier, so we can't do the error handling
there.
As well as checking errno, this patch changes std::collate::do_transform
to use __builtin_alloca for small inputs, and to use RAII to deallocate
the buffers used for large inputs.
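The errno check can be sketched in isolation like this. The function
transform_checked is hypothetical, calls std::strxfrm directly instead
of going through _M_transform, and ignores embedded null characters.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

inline std::string
transform_checked(const std::string& s)
{
  const int saved_errno = errno;
  errno = 0;
  // With a zero-length buffer strxfrm only reports the required length.
  const std::size_t len = std::strxfrm(nullptr, s.c_str(), 0);
  if (errno != 0)
    {
      // strxfrm has no error return value, so errno is the only signal.
      errno = saved_errno;
      throw std::runtime_error("strxfrm failed");
    }
  std::string out(len + 1, '\0');
  std::strxfrm(&out[0], s.c_str(), out.size());
  const bool failed = errno != 0;
  errno = saved_errno;
  if (failed)
    throw std::runtime_error("strxfrm failed");
  out.resize(len);  // drop the terminating null character
  return out;
}
```

Checking errno before allocating avoids treating a bogus return value
such as SIZE_MAX as a buffer size.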
This change isn't sufficient to fix the three std::regex bugs caused by
the lack of error handling in std::collate::do_transform, we also need
to make std::regex_traits::transform_primary handle exceptions. This
change also attempts to make transform_primary closer to the effects
described in the standard, by not even attempting to use std::collate if
the locale's std::collate facet has been replaced (see PR 118105).
Implementing the correct effects for transform_primary requires RTTI, so
that we don't use some user-defined std::collate facet with unknown
semantics. When -fno-rtti is used transform_primary just returns an
empty string, making equivalence classes unusable in std::basic_regex.
That's not ideal, but I don't have any better ideas.
I'm unsure if std::regex_traits<C>::transform_primary is supposed to
convert the string to lower case or not. The general regex traits
requirements ([re.req] p20) do say "when character case is not
considered" but the specification for the std::regex_traits<char> and
std::regex_traits<wchar_t> specializations ([re.traits] p7) don't say
anything about that.
With the r15-6317-geb339c29ee42aa change, transform_primary is not
called unless the regex actually uses an equivalence class. But using an
equivalence class would still fail (or be incredibly slow) on some
targets. With this commit, equivalence classes should be usable on all
targets, without excessive memory allocations.
Arguably, we should not even try to call transform_primary for any char
values over 127, since they're never valid in locales that use UTF-8 or
7-bit ASCII, and probably for other charsets too. Handling 128
exceptions for every std::regex compilation is very inefficient, but at
least it now works instead of failing with std::bad_alloc, and no longer
allocates 128 x 2GB. Maybe for C++26 we could check the locale's
std::text_encoding and use that to decide whether to cache equivalence
classes for char values over 127.
libstdc++-v3/ChangeLog:
PR libstdc++/85824
PR libstdc++/94409
PR libstdc++/98723
PR libstdc++/118105
* include/bits/locale_classes.tcc (collate::do_transform): Check
errno after calling _M_transform. Use RAII type to manage the
buffer and to restore errno.
* include/bits/regex.h (regex_traits::transform_primary): Handle
exceptions from std::collate::transform and do not try to use
std::collate for user-defined facets.
|
|
The current check for negative times (i.e. before the epoch) only checks
for a negative number of seconds. For a time 1ms before the epoch the
seconds part will be zero, but the futex syscall will still fail with an
EINVAL error. Extend the check to handle this case.
This change adds a redundant check in the headers too, so that we avoid
even calling into the library for negative times. Both checks can be
marked [[unlikely]]. The check in the headers avoids the cost of
splitting the time into seconds and nanoseconds and then making a PLT
call. The check inside the library matches where we were checking
already, and fixes existing binaries that were compiled against older
headers but use a newer libstdc++.so.6 at runtime.
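The case that was missed can be reproduced with a small sketch
(is_before_epoch is a hypothetical helper, not the code in futex.cc):

```cpp
#include <cassert>
#include <chrono>

// Split a time since the epoch into seconds and nanoseconds and report
// whether it lies before the epoch.
inline bool
is_before_epoch(std::chrono::nanoseconds since_epoch)
{
  using namespace std::chrono;
  // duration_cast truncates toward zero, so -1ms has a seconds part of 0.
  const auto s = duration_cast<seconds>(since_epoch);
  // The remainder keeps the sign of the original value.
  const auto ns = since_epoch - s;
  return s.count() < 0 || ns.count() < 0;
}
```

Checking only the seconds part would classify a time 1ms before the
epoch as non-negative.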
libstdc++-v3/ChangeLog:
PR libstdc++/118093
* include/bits/atomic_futex.h (_M_load_and_test_until_impl):
Return false for times before the epoch.
* src/c++11/futex.cc (_M_futex_wait_until): Extend check for
negative times to check for subsecond times. Add unlikely
attribute.
(_M_futex_wait_until_steady): Likewise.
* testsuite/30_threads/future/members/118093.cc: New test.
|
|
We have several overloads of std::deque::_M_insert_aux, one of which is
variadic and called by std::deque::emplace. With a suitable set of
arguments to emplace, it's possible for one of the non-variadic
_M_insert_aux overloads to be selected by overload resolution, making
emplace ill-formed.
Rename the variadic _M_insert_aux to _M_emplace_aux so that calls to
emplace never select an _M_insert_aux overload. Also add an inline
_M_insert_aux for the const lvalue overload that is called from
insert(const_iterator, const value_type&).
libstdc++-v3/ChangeLog:
PR libstdc++/90389
* include/bits/deque.tcc (_M_insert_aux): Rename variadic
overload to _M_emplace_aux.
* include/bits/stl_deque.h (_M_insert_aux): Define inline.
(_M_emplace_aux): Declare.
* testsuite/23_containers/deque/modifiers/emplace/90389.cc: New
test.
|
|
Also add "@since C++11" to std::move, std::forward etc.
libstdc++-v3/ChangeLog:
* include/bits/move.h (forward, move, move_if_noexcept, addressof):
Add @since to Doxygen comments.
(forward_like): Add Doxygen comment.
|
|
libstdc++-v3/ChangeLog:
PR libstdc++/118196
* include/std/generator (generator::operator=(generator)): Add
missing 'return *this;'.
* testsuite/24_iterators/range_generators/pr118196.cc: New test.
|
|
This overload requires
constructible_from<remove_cvref_t<yielded>,
const remove_reference_t<yielded>&>
... but then tries to construct remove_cvref_t<yielded> implicitly,
which means it imposes an additional constraint not in the standard.
libstdc++-v3/ChangeLog:
PR libstdc++/118022
* include/std/generator
(_Promise_erased::yield_value(const _Yielded_deref&)): Don't
implicit-construct _Yielded_decvref.
* testsuite/24_iterators/range_generators/pr118022.cc: New test.
|
|
The fancy allocator pointer type support is added to std::map,
std::multimap, std::multiset and std::set through the underlying
std::_Rb_tree class.
To respect the ABI, a new parallel hierarchy of node types has been added.
This change introduces a new class template parameterized on the
allocator's void_pointer type, __rb_tree::_Node_base, and new class
templates parameterized on the allocator's pointer type, __rb_tree::_Node,
__rb_tree::_Iterator. The iterator class template is used for both
iterator and const_iterator. Whether std::_Rb_tree<K, V, KoV, C, A>
should use the old _Rb_tree_node<V> or new __rb_tree::_Node<A::pointer>
type family internally is controlled by a new __rb_tree::_Node_traits
traits template.
Because std::pointer_traits and std::__to_address are not defined for
C++98, there is no way to support fancy pointers in C++98. For C++98 the
_Node_traits traits always choose the old _Rb_tree_node family.
In case anybody is currently using std::_Rb_tree with an allocator that
has a fancy pointer, this change would be an ABI break, because their
std::_Rb_tree instantiations would start to (correctly) use the fancy
pointer type. If the fancy pointer just contains a single pointer and so
has the same size, layout, and object representation as a raw pointer,
the code might still work (despite being an ODR violation). But if their
fancy pointer has a different representation, they would need to
recompile all their code using that allocator with std::_Rb_tree. Because
std::_Rb_tree will never use fancy pointers in C++98 mode, recompiling
everything to use fancy pointers isn't even possible if mixing C++98 and
C++11 code that uses std::_Rb_tree. To alleviate this problem, compiling
with -D_GLIBCXX_USE_ALLOC_PTR_FOR_RB_TREE=0 will force std::_Rb_tree to
have the old, non-conforming behaviour and use raw pointers internally.
For testing purposes, compiling with -D_GLIBCXX_USE_ALLOC_PTR_FOR_RB_TREE=9001
will force std::_Rb_tree to always use the new node types. This macro is
currently undocumented, which needs to be fixed.
As _Rb_tree is using _Base_ptr to represent the tree this change also
simplifies the implementation by removing all the const pointer types
and associated methods.
libstdc++-v3/ChangeLog:
PR libstdc++/57272
* include/bits/stl_tree.h
[_GLIBCXX_USE_ALLOC_PTR_FOR_RB_TREE]: New macro to control usage of the
code required to support fancy allocator pointer type.
(_Rb_tree_node_base::_Const_Base_ptr): Remove.
(_Rb_tree_node_base::_S_minimum, _Rb_tree_node_base::_S_maximum): Remove
overloads for _Const_Base_ptr.
(_Rb_tree_node_base::_M_base_ptr()): New.
(_Rb_tree_node::_Link_type): Remove.
(_Rb_tree_node::_M_node_ptr()): New.
(__rb_tree::_Node_base<>): New.
(__rb_tree::_Header<>): New.
(__rb_tree::_Node<>): New.
(_Rb_tree_increment(const _Rb_tree_node_base*)): Remove declaration.
(_Rb_tree_decrement(const _Rb_tree_node_base*)): Remove declaration.
(_Rb_tree_iterator<>::_Self): Remove.
(_Rb_tree_iterator<>::_Link_type): Rename into...
(_Rb_tree_iterator<>::_Node_ptr): ...this.
(_Rb_tree_const_iterator<>::_Link_type): Rename into...
(_Rb_tree_const_iterator<>::_Node_ptr): ...this.
(_Rb_tree_const_iterator<>::_M_const_cast): Remove.
(_Rb_tree_const_iterator<>::_M_node): Change type into _Base_ptr.
(__rb_tree::_Iterator<>): New.
(__rb_tree::_Node_traits<>): New.
(_Rb_tree<>::_Node_base, _Rb_tree::_Node): New.
(_Rb_tree<>::_Link_type): Rename into...
(_Rb_tree<>::_Node_ptr): ...this.
(_Rb_tree<>::_Const_Base_ptr, _Rb_tree<>::_Const_Node_ptr): Remove.
(_Rb_tree<>::_M_mbegin): Remove.
(_Rb_tree<>::_M_begin_node()): New.
(_S_key(const _Node&)): New.
(_S_key(_Base_ptr)): New, call latter.
(_S_key(_Node_ptr)): Likewise.
(_Rb_tree<>::_S_left(_Const_Base_ptr)): Remove.
(_Rb_tree<>::_S_right(_Const_Base_ptr)): Remove.
(_Rb_tree<>::_S_maximum(_Const_Base_ptr)): Remove.
(_Rb_tree<>::_S_minimum(_Const_Base_ptr)): Remove.
* testsuite/23_containers/map/allocator/ext_ptr.cc: New test case.
* testsuite/23_containers/multimap/allocator/ext_ptr.cc: New test case.
* testsuite/23_containers/multiset/allocator/ext_ptr.cc: New test case.
* testsuite/23_containers/set/allocator/ext_ptr.cc: New test case.
* testsuite/23_containers/set/requirements/explicit_instantiation/alloc_ptr.cc:
New test case.
* testsuite/23_containers/set/requirements/explicit_instantiation/alloc_ptr_ignored.cc:
New test case.
|
|
This implements the C++23 container adaptors std::flat_set and
std::flat_multiset from P1222R4. The implementation is essentially
a simpler and pared down version of std::flat_map.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new header <flat_set>.
* include/Makefile.in: Regenerate.
* include/bits/version.def (__cpp_flat_set): Define.
* include/bits/version.h: Regenerate.
* include/precompiled/stdc++.h: Include <flat_set>.
* include/std/flat_set: New file.
* src/c++23/std.cc.in: Export <flat_set>.
* testsuite/23_containers/flat_multiset/1.cc: New test.
* testsuite/23_containers/flat_set/1.cc: New test.
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
This implements the C++23 container adaptors std::flat_map and
std::flat_multimap from P0429R9. The implementation is shared
as much as possible between the two adaptors via a common base
class that's parameterized according to key uniqueness.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new header <flat_map>.
* include/Makefile.in: Regenerate.
* include/bits/alloc_traits.h (__not_allocator_like): New concept.
* include/bits/stl_function.h (__transparent_comparator): Likewise.
* include/bits/stl_iterator_base_types.h (__has_input_iter_cat):
Likewise.
* include/bits/uses_allocator.h (__allocator_for): Likewise.
* include/bits/utility.h (sorted_unique_t): Define for C++23.
(sorted_unique): Likewise.
(sorted_equivalent_t): Likewise.
(sorted_equivalent): Likewise.
* include/bits/version.def (flat_map): Define.
* include/bits/version.h: Regenerate.
* include/precompiled/stdc++.h: Include <flat_map>.
* include/std/flat_map: New file.
* src/c++23/std.cc.in: Export <flat_map>.
* testsuite/23_containers/flat_map/1.cc: New test.
* testsuite/23_containers/flat_multimap/1.cc: New test.
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
libstdc++-v3/ChangeLog:
* include/bits/ranges_base.h (__detail::__range_key_type):
Define as per P1206R7.
(__detail::__range_mapped_type): Likewise.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
Rather than calling std::__addressof in std::addressof we can directly
call __builtin_addressof to bypass 1 function call.
libstdc++-v3/ChangeLog:
* include/bits/move.h (std::addressof): Call __builtin_addressof.
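For readers unfamiliar with why std::addressof exists at all, here is a minimal sketch. The type Odd and the function addressof_ignores_overload are hypothetical names invented for illustration and are not part of the patch. The sketch only shows the observable behaviour, which is the same whether the library forwards to std::__addressof or calls the builtin directly.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type (not from the patch) whose overloaded unary operator&
// hides the real address of the object.
struct Odd
{
  int value = 0;
  Odd* operator&() { return nullptr; }
};

// Returns true if std::addressof sees through the overloaded operator&.
bool addressof_ignores_overload()
{
  Odd o;
  Odd* fake = &o;                 // calls Odd::operator&, yields nullptr
  Odd* real = std::addressof(o);  // yields the real address of o
  assert(fake == nullptr);
  // Odd is standard-layout, so its address equals that of its first member.
  return static_cast<void*>(real) == static_cast<void*>(&o.value);
}
```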
|
|
We are currently generating a loop which has more comparisons than you'd
typically need, as the probabilities on the small size loop are such that it
assumes the likely case is that an element is not found.
This again generates a pattern that's harder for branch predictors to follow,
but also just generates more instructions for what one could say is the
typical case: that your hashtable contains the entry you are looking for.
This patch adds a __builtin_expect in _M_find_before_node where at the moment
the loop is optimized for the case where we don't do any iterations.
A simple testcase is (compiled with -fno-split-path to simulate the loop
in libstdc++):
#include <stdbool.h>
bool foo (int **a, int n, int val, int *tkn)
{
  for (int i = 0; i < n; i++)
    {
      if (!a[i] || a[i]==tkn)
        return false;
      if (*a[i] == val)
        return true;
    }
}
which generates:
foo:
cmp w1, 0
ble .L1
add x1, x0, w1, uxtw 3
b .L4
.L9:
ldr w4, [x4]
cmp w4, w2
beq .L6
cmp x0, x1
beq .L1
.L4:
ldr x4, [x0]
add x0, x0, 8
cmp x4, 0
ccmp x4, x3, 4, ne
bne .L9
mov w0, 0
.L1:
ret
.L6:
mov w0, 1
ret
i.e. BB rotation makes it generate an unconditional branch to a conditional
branch. However this method is only called when the size is above a certain
threshold, and so it's likely that we have to do that first iteration.
Adding:
#include <stdbool.h>
bool foo (int **a, int n, int val, int *tkn)
{
  for (int i = 0; i < n; i++)
    {
      if (__builtin_expect(!a[i] || a[i]==tkn, 0))
        return false;
      if (*a[i] == val)
        return true;
    }
}
to indicate that we will likely do an iteration more generates:
foo:
cmp w1, 0
ble .L1
add x1, x0, w1, uxtw 3
.L4:
ldr x4, [x0]
add x0, x0, 8
cmp x4, 0
ccmp x4, x3, 4, ne
beq .L5
ldr w4, [x4]
cmp w4, w2
beq .L6
cmp x0, x1
bne .L4
.L1:
ret
.L5:
mov w0, 0
ret
.L6:
mov w0, 1
ret
which results in ~0-10% extra on top of the previous patch.
In table form:
+-------------+---------------+-------+--------------------+-------------------+-----------------+
| benchmark | Type | Size | Inline vs baseline | final vs baseline | final vs inline |
+-------------+---------------+-------+--------------------+-------------------+-----------------+
| find many | uint64_t | 11253 | -15.67% | -22.96% | -8.65% |
| find many | uint64_t | 11253 | -16.74% | -23.37% | -7.96% |
| find single | uint64_t | 345 | -5.88% | -11.54% | -6.02% |
| find many | string | 11253 | -4.50% | -9.56% | -5.29% |
| find single | uint64_t | 345 | -4.38% | -9.41% | -5.26% |
| find single | shared string | 11253 | -6.67% | -11.00% | -4.64% |
| find single | shared string | 11253 | -4.63% | -9.03% | -4.61% |
| find single | shared string | 345 | -10.41% | -14.44% | -4.50% |
| find many | string | 11253 | -3.41% | -7.51% | -4.24% |
| find many | shared string | 11253 | -2.30% | -5.72% | -3.50% |
| find many | string | 13 | 2.86% | -0.30% | -3.07% |
| find single | string | 11253 | 4.47% | 1.34% | -3.00% |
| find many | custom string | 11253 | 0.25% | -2.75% | -2.99% |
| find single | uint64_t | 345 | 2.99% | 0.01% | -2.90% |
| find single | shared string | 345 | -11.53% | -13.67% | -2.41% |
| find single | uint64_t | 11253 | 0.49% | -1.59% | -2.07% |
+-------------+---------------+-------+--------------------+-------------------+-----------------+
libstdc++-v3/ChangeLog:
* include/bits/hashtable.h
(_M_find_before_node): Make it likely that the map has at least one
entry and so we do at least one iteration.
|
|
We don't know what state an arbitrary sequence container will be in
after moving from it, so a moved-from std::priority_queue needs to clear
the moved-from container to ensure it doesn't contain elements that are
in an invalid order for the queue. An alternative would be to call
std::make_heap again to re-establish the rvalue queue's invariant, but
that could potentially cause an exception to be thrown. Just clearing it
so the sequence is empty seems safer and more likely to match user
expectations.
libstdc++-v3/ChangeLog:
PR libstdc++/118088
* include/bits/stl_queue.h (priority_queue(priority_queue&&)):
Clear the source object after moving from it.
(priority_queue(priority_queue&&, const Alloc&)): Likewise.
(operator=(priority_queue&&)): Likewise.
* testsuite/23_containers/priority_queue/118088.cc: New test.
Reviewed-by: Patrick Palka <ppalka@redhat.com>
|
|
In GCC 12 there was a ~40% regression in the performance of hashmap->find.
This regression came about accidentally:
Before GCC 12 the find function was small enough that IPA would inline it even
though it wasn't marked inline. In GCC 12 an optimization was added to perform
a linear search when the entries in the hashmap are small.
This increased the size of the function enough that IPA would no longer inline.
Inlining had two benefits:
1. The return value is a reference, so it has to be returned and dereferenced
even though the search loop may have already dereferenced it.
2. The pattern is a hard pattern to track for branch predictors. This causes
a large number of branch misses if the value is immediately checked and
branched on. i.e. if (a != m.end()) which is a common pattern.
The patch fixes both these issues by adding the inline keyword to _M_locate
to allow the inliner to consider inlining again.
This and the other patches have been run through several benchmarks where
the size, number of elements searched for and type (reference vs value) etc
were tested.
The change shows no statistical regression, but an average find improvement of
~27% and a range between ~10-60% improvements. A selection of the results:
+-----------+--------------------+-------+----------+
| Group | Benchmark | Size | % Inline |
+-----------+--------------------+-------+----------+
| Find | unord<uint64_t | 11274 | 53.52% |
| Find | unord<uint64_t | 11254 | 47.98% |
| Find Mult | unord<uint64_t | 12 | 47.62% |
| Find Mult | unord<std::string | 12 | 44.94% |
| Find Mult | unord<std::string | 10 | 44.89% |
| Find Mult | unord<uint64_t | 11 | 40.90% |
| Find Mult | unord<uint64_t | 352 | 30.57% |
| Find | unord<uint64_t | 351 | 28.27% |
| Find Mult | unord<uint64_t | 342 | 26.80% |
| Find | unord<std::string | 12 | 25.66% |
| Find Mult | unord<std::string | 352 | 23.12% |
| Find | unord<std::string | 13 | 20.36% |
| Find Mult | unord<std::string | 355 | 19.23% |
| Find | unord<std::string | 353 | 18.59% |
| Find | unord<uint64_t | 350 | 15.43% |
| Find | unord<std::string | 11260 | 11.80% |
| Find | unord<std::string | 352 | 11.12% |
| Find | unord<std::string | 11262 | 9.97% |
+-----------+--------------------+-------+----------+
libstdc++-v3/ChangeLog:
* include/bits/hashtable.h: Inline _M_locate.
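The "find, then immediately compare against end()" pattern mentioned above looks roughly like the following sketch. The helper names lookup_or and lookup_demo are hypothetical and exist only for illustration. Nothing here measures performance; it only shows the call shape that benefits when the lookup is inlined.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical helper: returns the mapped value for key, or fallback if the
// key is absent.
std::string lookup_or(const std::unordered_map<std::uint64_t, std::string>& m,
                      std::uint64_t key, const std::string& fallback)
{
  auto it = m.find(key);
  if (it != m.end())   // the result of find is checked and branched on at once
    return it->second;
  return fallback;
}

// Small usage example.
void lookup_demo()
{
  std::unordered_map<std::uint64_t, std::string> m{{1, "one"}, {2, "two"}};
  assert(lookup_or(m, 2, "none") == "two");
  assert(lookup_or(m, 3, "none") == "none");
}
```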
|
|
The mapping from char to wchar_t needs to handle 'i' and 'I' but those
were absent from the table that is used for some non-ASCII encodings.
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (__to_wstring_numeric): Add 'i'
and 'I' to mapping.
|
|
This is both a performance optimization and a partial fix for PR 98723.
This commit fixes the issue for bracket expressions that do not depend
on the locale's collation facet. Examples:
* Character ranges ([a-z]) when std::regex::collate is not set
* Character classes ([:alnum:])
* Individual characters ([abc])
Signed-off-by: Luca Bacci <luca.bacci982@gmail.com>
libstdc++-v3/ChangeLog:
PR libstdc++/98723
* include/bits/regex_compiler.tcc (_BracketMatcher::_M_apply):
Only use transform_primary when an equivalence set is used.
|
|
libstdc++-v3/ChangeLog:
* include/debug/safe_local_iterator.h (_GLIBCXX_DEBUG_VERIFY_OPERANDS):
Add parentheses to avoid -Wparentheses warning.
|
|
[PR118035]
Inserting an empty range into a std::deque results in undefined calls to
either std::copy, std::copy_backward, std::move, or std::move_backward.
We call those algos with invalid arguments where the output range is the
same as the input range, e.g. std::copy(first, last, first) which
violates the preconditions for the algorithms.
This fix simply returns early if there's nothing to insert. Most callers
already ensure that we don't even call _M_range_insert_aux with an empty
range, but some callers don't. Rather than checking for n == 0 in each
of the callers, this just does the check once and uses __builtin_expect
to treat empty insertions as unlikely.
libstdc++-v3/ChangeLog:
PR libstdc++/118035
* include/bits/deque.tcc (_M_range_insert_aux): Return
immediately if inserting an empty range.
* testsuite/23_containers/deque/modifiers/insert/118035.cc: New
test.
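As an illustration of the user-visible case, the sketch below inserts a possibly empty range into the middle of a std::deque. The helper names insert_in_middle and insert_demo are hypothetical, and this is not the test added by the patch. Inserting an empty range is valid per the standard and must leave the container unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical helper: inserts all of src (which may be empty) at the middle
// of dq and returns the new size.
std::size_t insert_in_middle(std::deque<int>& dq, const std::vector<int>& src)
{
  const std::size_t old_size = dq.size();
  dq.insert(dq.begin() + old_size / 2, src.begin(), src.end());
  assert(dq.size() == old_size + src.size());
  return dq.size();
}

// Small usage example: an empty insertion is a no-op.
void insert_demo()
{
  std::deque<int> dq{1, 2, 3, 4};
  const std::vector<int> nothing;
  insert_in_middle(dq, nothing);
  assert((dq == std::deque<int>{1, 2, 3, 4}));
}
```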
|
|
Currently the _M_bucket members are left uninitialized for
default-initialized local iterators, and then copy construction copies
indeterminate values. We should just ensure they're initialized on
construction.
Setting them to zero makes default-initialization consistent with
value-initialization and avoids indeterminate values.
For the _Local_iterator_base<..., false> specialization we preserve the
existing behaviour of setting _M_bucket_count to -1 in the default
constructor, as a sentinel value to indicate there's no hash object
present.
libstdc++-v3/ChangeLog:
* include/bits/hashtable_policy.h (_Local_iterator_base): Use
default member-initializers.
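The effect of default member-initializers can be sketched with two simplified stand-in types. LocalIterSketch and NoHashLocalIterSketch are hypothetical names, and their members only loosely mirror _M_bucket and _M_bucket_count; the real class templates have more members and bases.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in: default member-initializers give every member a
// determinate value even under default-initialization.
struct LocalIterSketch
{
  std::size_t bucket = 0;
  std::size_t bucket_count = 0;
};

// Hypothetical stand-in for the case without a stored hash object: the
// default constructor still sets the count to -1 as a sentinel.
struct NoHashLocalIterSketch
{
  NoHashLocalIterSketch() : bucket_count(static_cast<std::size_t>(-1)) { }
  std::size_t bucket = 0;
  std::size_t bucket_count = 0;
};

// Returns true if default- and value-initialization agree.
bool default_init_matches_value_init()
{
  LocalIterSketch a;    // default-initialization
  LocalIterSketch b{};  // value-initialization
  return a.bucket == b.bucket && a.bucket_count == b.bucket_count;
}

// Small usage example.
void init_demo()
{
  assert(default_init_matches_value_init());
  NoHashLocalIterSketch it;
  assert(it.bucket == 0);
  assert(it.bucket_count == static_cast<std::size_t>(-1));
}
```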
|
|
This file is only for C++11 and later, so replace typedefs with
alias-declarations for clarity. Also remove redundant std::
qualification on size_t, ptrdiff_t etc.
We can also remove the result_type, first_argument_type and
second_argument_type typedefs from the range hashers. We don't need
those types to follow the C++98 adaptable function object protocol.
libstdc++-v3/ChangeLog:
* include/bits/hashtable_policy.h: Replace typedefs with
alias-declarations. Remove redundant std:: qualification.
(_Mod_range_hashing, _Mask_range_hashing): Remove adaptable
function object typedefs.
|
|
The fix for PR libstdc++/56267 (relating to the lifetime of the hash
object stored in a local iterator) has undefined behaviour, as it relies
on being able to call a member function on an empty object that never
started its lifetime. Although the member function probably doesn't care
about the empty object's state, this is still technically undefined
because there is no object of that type at that address. It's also
possible that the hash object would have a stricter alignment than the
_Hash_code_storage object, so that the reinterpret_cast would produce a
misaligned pointer.
This fix replaces _Local_iterator_base's _Hash_code_storage base-class
with a new class template containing a potentially-overlapping (i.e.
[[no_unique_address]]) union member. This means that we always have
storage of the correct type, and it can be initialized/destroyed when
required. We no longer need a reinterpret_cast that gives us a pointer
that we should not dereference.
It would be nice if we could just use a union containing the _Hash
object as a data member of _Local_iterator_base, but that would be an
ABI change. The _Hash_code_storage that contains the _Hash object is the
first base-class, before the _Node_iterator_base base-class. Making the
union a data member of _Local_iterator_base would make it come after the
_Node_iterator_base base instead of before it, altering the layout.
Since we're changing _Hash_code_storage anyway, we can replace it with a
new class template that stores the _Hash object itself in the union,
rather than a _Hash_code_base that holds the _Hash. This removes an
unnecessary level of indirection in the class hierarchy. This change
requires the effects of _Hash_code_base::_M_bucket_index to be inlined
into the _Local_iterator_base::_M_incr function, but that's easy.
We don't need separate specializations of _Hash_obj_storage for an empty
hash function and a non-empty one. Using [[no_unique_address]] gives us
an empty base-class when possible.
libstdc++-v3/ChangeLog:
* include/bits/hashtable_policy.h (_Hash_code_storage): Remove.
(_Hash_obj_storage): New class template. Store the hash
function as a union member instead of using a byte buffer.
(_Local_iterator_base): Use _Hash_obj_storage instead of
_Hash_code_storage, adjust members that construct and destroy
the hash object.
(_Local_iterator_base::_M_incr): Calculate bucket index.
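The general technique of keeping an object in a union member, so that storage of the right type and alignment always exists while the object's lifetime is started and ended on demand, can be sketched as follows. All names (HashObjStorageSketch, CountingHash, init, destroy, storage_demo) are hypothetical. This is a simplified illustration under those assumptions, not the actual _Hash_obj_storage definition.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical sketch of union-based storage for a hash object.
template<typename Hash>
struct HashObjStorageSketch
{
  union Storage
  {
    Storage() noexcept { }  // does not start the lifetime of h
    ~Storage() { }          // the owner must destroy h explicitly
    [[no_unique_address]] Hash h;
  };

  [[no_unique_address]] Storage storage;

  // Start the lifetime of the stored hash object.
  void init(const Hash& hash)
  { ::new (static_cast<void*>(std::addressof(storage.h))) Hash(hash); }

  // End the lifetime of the stored hash object.
  void destroy() noexcept { storage.h.~Hash(); }
};

// Hypothetical hash type that counts how many instances are alive.
struct CountingHash
{
  static int live;
  CountingHash() { ++live; }
  CountingHash(const CountingHash&) { ++live; }
  ~CountingHash() { --live; }
  std::size_t operator()(int x) const { return static_cast<std::size_t>(x); }
};
int CountingHash::live = 0;

// Small usage example: the object exists only between init and destroy.
void storage_demo()
{
  HashObjStorageSketch<CountingHash> s;
  assert(CountingHash::live == 0);
  {
    CountingHash h;
    s.init(h);
  }
  assert(CountingHash::live == 1);
  assert(s.storage.h(7) == 7);
  s.destroy();
  assert(CountingHash::live == 0);
}
```

No reinterpret_cast is needed here because storage.h already has type Hash.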
|
|
The main change here is using [[no_unique_address]] instead of the Empty
Base-class Optimization. Using the attribute allows us to use data
members instead of base-classes. That simplifies the inheritance
hierarchy, which means less work for the compiler. It also means that
ADL has fewer associated classes and associated namespaces to consider,
further reducing the work the compiler has to do.
Reducing the differences between the _Hashtable_ebo_helper primary
template and the partial specialization means we no longer need to use
member functions to access the stored object, because it's now always a
data member called _M_obj. This means we can also remove a number of
other helper functions that were using those member functions to access
the object, for example we can swap the _Hash and _Equal objects
directly in _Hashtable::swap instead of calling _Hashtable_base::_M_swap
which then calls _Hash_code_base::_M_swap.
Although [[no_unique_address]] would allow us to reduce the size for
empty types that are also 'final', doing so would be an ABI break
because those types were previously excluded from using the EBO. So we
still need the _Hashtable_ebo_helper class template and a partial
specialization, so that we only use the attribute under exactly the same
conditions as we previously used the EBO. This could be avoided with a
non-standard [[no_unique_address(expr)]] attribute that took a boolean
condition, or with reflection and token sequence injection, but we don't
have either of those things.
Because _Hashtable_ebo_helper is no longer used as a base-class we don't
need to disambiguate possible identical bases, so it doesn't need an
integral non-type template parameter.
libstdc++-v3/ChangeLog:
* include/bits/hashtable.h (_Hashtable::swap): Swap hash
function and equality predicate here. Inline allocator swap
instead of using __alloc_on_swap.
* include/bits/hashtable_policy.h (_Hashtable_ebo_helper):
Replace EBO with no_unique_address attribute. Remove NTTP.
(_Hash_code_base): Replace base class with data member using
no_unique_address attribute.
(_Hash_code_base::_M_swap): Remove.
(_Hash_code_base::_M_hash): Remove.
(_Hashtable_base): Replace base class with data member using
no_unique_address attribute.
(_Hashtable_base::_M_swap): Remove.
(_Hashtable_alloc): Replace base class with data member using
no_unique_address attribute.
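The difference between the two approaches can be sketched with simplified stand-in types. EmptyHash, WithEbo, WithAttribute and the two functions are hypothetical names, not the library's classes. The size comparison assumes a compiler that honours the attribute, such as GCC or Clang targeting the Itanium C++ ABI; other compilers may ignore it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical empty function object.
struct EmptyHash
{
  std::size_t operator()(int x) const noexcept
  { return static_cast<std::size_t>(x); }
};

// Empty Base-class Optimization: the empty helper is a base class and has to
// be reached through an accessor.
struct WithEbo : private EmptyHash
{
  int payload = 0;
  const EmptyHash& hash() const noexcept { return *this; }
};

// Attribute form: the helper is an ordinary data member.
struct WithAttribute
{
  [[no_unique_address]] EmptyHash hash_obj;
  int payload = 0;
};

// Returns true if both layouts have the same size (expected where the
// attribute is honoured).
bool same_size() { return sizeof(WithAttribute) == sizeof(WithEbo); }

// Small usage example: both forms give access to the same function object.
void access_demo()
{
  WithEbo e;
  WithAttribute a;
  assert(e.hash()(5) == 5);
  assert(a.hash_obj(5) == 5);
}
```

With the member form there is no extra base class, so the helper does not add to the set of associated classes considered by ADL.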
|
|
The union members I used in the new _Node types for fancy pointers only
work for value types that are trivially default constructible. This
change replaces the anonymous union with a named union so it can be
given a default constructor and destructor, to leave the variant member
uninitialized.
This also fixes the incorrect macro names in the alloc_ptr_ignored.cc
tests as pointed out by François, and fixes some std::list pointer
confusions that the fixed alloc_ptr_ignored.cc test revealed.
libstdc++-v3/ChangeLog:
PR libstdc++/57272
* include/bits/forward_list.h (__fwd_list::_Node): Add
user-provided special member functions to union.
* include/bits/stl_list.h (__list::_Node): Likewise.
(_Node_base::_M_hook, _Node_base::swap): Use _M_base() instead
of std::pointer_traits::pointer_to.
(_Node_base::_M_transfer): Likewise. Add noexcept.
(_List_base::_M_put_node): Use 'if constexpr' to avoid using
pointer_traits::pointer_to when not necessary.
(_List_base::_M_destroy_node): Fix parameter to be the pointer
type used internally, not the allocator's pointer.
(list::_M_create_node): Likewise.
* testsuite/23_containers/forward_list/requirements/explicit_instantiation/alloc_ptr.cc:
Check explicit instantiation of non-trivial value type.
* testsuite/23_containers/list/requirements/explicit_instantiation/alloc_ptr.cc:
Likewise.
* testsuite/23_containers/forward_list/requirements/explicit_instantiation/alloc_ptr_ignored.cc:
Fix macro name.
* testsuite/23_containers/list/requirements/explicit_instantiation/alloc_ptr_ignored.cc:
Likewise.
|
|
libstdc++-v3/ChangeLog:
* include/c_compatibility/wchar.h (fgetwc): Remove duplicate
using-declaration.
|
|
Use a local reference for the (now possibly lifetime extended) result of
*__first so that we copy it only when necessary.
PR libstdc++/112349
libstdc++-v3/ChangeLog:
* include/bits/ranges_algo.h (__min_fn::operator()): Turn local
object __tmp into a reference.
* include/bits/ranges_util.h (__max_fn::operator()): Likewise.
* testsuite/25_algorithms/max/constrained.cc (test04): New test.
* testsuite/25_algorithms/min/constrained.cc (test04): New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|