author | Jonathan Wakely <jwakely@redhat.com> | 2024-01-15 15:42:50 +0000 |
---|---|---|
committer | Jonathan Wakely <jwakely@redhat.com> | 2024-01-17 11:49:11 +0000 |
commit | df0a668b784556fe4317317d58961652d93d53de (patch) | |
tree | 2a91cc7e587af6283bef9f06483645175d0e70b0 /libstdc++-v3/testsuite/22_locale | |
parent | 665a3ff1539ce24e1215e52a14450ecd9a26e87f (diff) | |
libstdc++: Implement C++26 std::text_encoding (P1885R12) [PR113318]
This is another C++26 change, approved in Varna 2023. We require a new
static array of data that is extracted from the IANA Character Sets
database. A new Python script to generate a header from the IANA CSV
file is added.
The text_encoding class is basically just a pointer to an {ID,name} pair
in the static array. The aliases view is also just the same pointer (or
empty), and the view's iterator moves forwards and backwards in the
array while the array elements have the same ID (or to one element
further, for a past-the-end iterator).
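The layout described above can be illustrated with a minimal sketch. It is not the libstdc++ code: the names `Row`, `table`, `find_first` and `alias_count` are hypothetical, and the handful of rows only stands in for the generated IANA data. It assumes rows are sorted by ID, so all aliases of one encoding are adjacent.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string_view>

// Hypothetical miniature of the generated table. Rows sharing an ID are
// aliases of one encoding; ID 0 marks the unreachable rows at both ends.
struct Row { int id; std::string_view name; };

constexpr Row table[] = {
  {0, ""},                                   // leading unreachable row
  {3, "US-ASCII"}, {3, "ASCII"}, {3, "us"},
  {4, "ISO_8859-1"}, {4, "latin1"},
  {106, "UTF-8"},
  {0, ""},                                   // trailing unreachable row
};

// Return the first row for an ID, or null if the ID is not in the table.
constexpr const Row* find_first(int id)
{
  for (std::size_t i = 1; i + 1 < std::size(table); ++i)
    if (table[i].id == id)
      return &table[i];
  return nullptr;
}

// Count the aliases by walking forward while the ID stays the same.
// The trailing row guarantees the walk stops inside the array.
constexpr std::size_t alias_count(const Row* first)
{
  if (!first)
    return 0;
  std::size_t n = 0;
  for (const Row* p = first; p->id == first->id; ++p)
    ++n;
  return n;
}
```

In this sketch an "encoding" is just the pointer returned by `find_first`, and the alias range is that same pointer plus the rule "keep going while the ID matches".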
Because those iterators refer to a global array that never goes out of
scope, there's no reason they should ever produce undefined behaviour
or indeterminate values. They should either have well-defined
behaviour, or abort. The overhead of ensuring those properties is pretty
low, so it seems worth it.
This means that an aliases_view iterator should never be able to access
out-of-bounds. A non-value-initialized iterator always points to an
element of the static array even when not dereferenceable (the array has
unreachable entries at the start and end, which means that even a
past-the-end iterator for the last encoding in the array still points to
valid memory). Dereferencing an iterator can always return a valid
array element, or "" for a non-dereferenceable iterator (but doing so
will abort when assertions are enabled). In the language being proposed
for C++26, dereferencing an invalid iterator erroneously returns "".
Attempting to increment/decrement past the last/first element in the
view is erroneously a no-op, so aborts when assertions are enabled, and
doesn't change value otherwise.
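The non-asserting behaviour described above might look roughly like the following sketch. `AliasIter`, `Row` and `table` are hypothetical names, not the real libstdc++ types, and the assertion checks that would abort in an assertions-enabled build are deliberately left out so that only the fallback path is shown.

```cpp
#include <cassert>
#include <string_view>

struct Row { int id; std::string_view name; };

// Hypothetical table with unreachable rows (ID 0) at both ends.
constexpr Row table[] = {
  {0, ""},
  {3, "US-ASCII"}, {3, "ASCII"},
  {106, "UTF-8"},
  {0, ""},
};

class AliasIter
{
  const Row* p_ = nullptr; // current position in the table
  int id_ = 0;             // ID of the encoding being iterated

  bool dereferenceable() const { return p_ && p_->id == id_; }

public:
  AliasIter() = default;

  // Assumes `first` points at the first row of an encoding in `table`.
  explicit AliasIter(const Row* first) : p_(first), id_(first->id) { }

  // Returns "" instead of reading a row that belongs to another encoding.
  std::string_view operator*() const
  { return dereferenceable() ? p_->name : std::string_view(); }

  // Incrementing a past-the-end (or value-initialized) iterator is a no-op.
  AliasIter& operator++()
  {
    if (dereferenceable())
      ++p_;
    return *this;
  }

  // Decrementing the first element is a no-op; the leading unreachable
  // row makes reading p_[-1] safe.
  AliasIter& operator--()
  {
    if (p_ && p_[-1].id == id_)
      --p_;
    return *this;
  }

  bool operator==(const AliasIter& other) const { return p_ == other.p_; }
};
```

Every operation either stays within the rows of one encoding or leaves the iterator unchanged, so no path reads outside the array.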
Similarly, constructing a std::text_encoding with an invalid id (one
that doesn't have the value of an enumerator) erroneously behaves the
same as constructing with id::unknown, or aborts with assertions
enabled.
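As a rough illustration of that fallback, the sketch below maps any value that is not a known enumerator to `unknown`. The enum `Id` and the functions `is_enumerator` and `normalize` are hypothetical and cover only a few values; the real constructor would additionally assert in an assertions-enabled build.

```cpp
#include <cassert>

// Hypothetical subset of the encoding IDs, for illustration only.
enum class Id : int { other = 1, unknown = 2, ASCII = 3, UTF8 = 106 };

// True if the value is one of the enumerators listed above.
constexpr bool is_enumerator(Id i)
{
  switch (i)
    {
    case Id::other:
    case Id::unknown:
    case Id::ASCII:
    case Id::UTF8:
      return true;
    }
  return false;
}

// Invalid values behave the same as Id::unknown.
constexpr Id normalize(Id i)
{ return is_enumerator(i) ? i : Id::unknown; }
```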
libstdc++-v3/ChangeLog:
PR libstdc++/113318
* acinclude.m4 (GLIBCXX_CONFIGURE): Add c++26 directory.
(GLIBCXX_CHECK_TEXT_ENCODING): Define.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use GLIBCXX_CHECK_TEXT_ENCODING.
* include/Makefile.am: Add new headers.
* include/Makefile.in: Regenerate.
* include/bits/locale_classes.h (locale::encoding): Declare new
member function.
* include/bits/unicode.h (__charset_alias_match): New function.
* include/bits/text_encoding-data.h: New file.
* include/bits/version.def (text_encoding): Define.
* include/bits/version.h: Regenerate.
* include/std/text_encoding: New file.
* src/Makefile.am: Add new subdirectory.
* src/Makefile.in: Regenerate.
* src/c++26/Makefile.am: New file.
* src/c++26/Makefile.in: New file.
* src/c++26/text_encoding.cc: New file.
* src/experimental/Makefile.am: Include c++26 convenience
library.
* src/experimental/Makefile.in: Regenerate.
* python/libstdcxx/v6/printers.py (StdTextEncodingPrinter): New
printer.
* scripts/gen_text_encoding_data.py: New file.
* testsuite/22_locale/locale/encoding.cc: New test.
* testsuite/ext/unicode/charset_alias_match.cc: New test.
* testsuite/std/text_encoding/cons.cc: New test.
* testsuite/std/text_encoding/members.cc: New test.
* testsuite/std/text_encoding/requirements.cc: New test.
Reviewed-by: Ulrich Drepper <drepper.fsp@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Diffstat (limited to 'libstdc++-v3/testsuite/22_locale')
-rw-r--r-- | libstdc++-v3/testsuite/22_locale/locale/encoding.cc | 36 |
1 files changed, 36 insertions, 0 deletions
diff --git a/libstdc++-v3/testsuite/22_locale/locale/encoding.cc b/libstdc++-v3/testsuite/22_locale/locale/encoding.cc
new file mode 100644
index 0000000..18825fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/22_locale/locale/encoding.cc
@@ -0,0 +1,36 @@
+// { dg-options "-lstdc++exp" }
+// { dg-do run { target c++26 } }
+// { dg-require-namedlocale "en_US.ISO8859-1" }
+// { dg-require-namedlocale "fr_FR.ISO8859-15" }
+
+#include <locale>
+#include <testsuite_hooks.h>
+
+void
+test_encoding()
+{
+  const std::locale c = std::locale::classic();
+  std::text_encoding c_enc = c.encoding();
+  VERIFY( c_enc == std::text_encoding::ASCII );
+
+  const std::locale fr = std::locale(ISO_8859(15, fr_FR));
+  std::text_encoding fr_enc = fr.encoding();
+  VERIFY( fr_enc == std::text_encoding::ISO885915 );
+
+  const std::locale en = std::locale(ISO_8859(1, en_US));
+  std::text_encoding en_enc = en.encoding();
+  VERIFY( en_enc == std::text_encoding::ISOLatin1 );
+
+#if __cpp_exceptions
+  try {
+    const std::locale c_utf8 = std::locale("C.UTF-8");
+    VERIFY( c_utf8.encoding() == std::text_encoding::UTF8 );
+  } catch (...) {
+  }
+#endif
+}
+
+int main()
+{
+  test_encoding();
+}