author | Mike FABIAN <mfabian@redhat.com> | 2024-09-12 15:02:55 +0200 |
---|---|---|
committer | Mike FABIAN <mfabian@redhat.com> | 2024-09-27 14:43:38 +0200 |
commit | a7b5eb821d48b0cb14d0c0d2706410d4f7838cf6 (patch) | |
tree | 831b43757b570557d202c74d957c674ffad10729 /localedata/locales/translit_compat | |
parent | f47596fcfe32ef96ba9b322a414803b25b8ce608 (diff) | |
download | glibc-a7b5eb821d48b0cb14d0c0d2706410d4f7838cf6.zip glibc-a7b5eb821d48b0cb14d0c0d2706410d4f7838cf6.tar.gz glibc-a7b5eb821d48b0cb14d0c0d2706410d4f7838cf6.tar.bz2 |
Update to Unicode 16.0.0 [BZ #32168]
Unicode 16.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 16.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).
Changes in CHARMAP and WIDTH:
Total added characters in newly generated CHARMAP: 5185
Total removed characters in newly generated WIDTH: 1
Total added characters in newly generated WIDTH: 170
The removed character from WIDTH is U+1171E AHOM CONSONANT SIGN MEDIAL RA.
It changed like this:
UnicodeData.txt 15.1.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;
UnicodeData.txt 16.0.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;
EastAsianWidth.txt 15.1.0: 1171D..1171F ; N # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
EastAsianWidth.txt 16.0.0: 1171E ; N # Mc AHOM CONSONANT SIGN MEDIAL RA
I.e., it changed from Mn (Mark, Nonspacing) to Mc (Mark, Spacing
Combining). It should therefore now have width 1 instead of 0, so it
is OK that it was removed from WIDTH: characters not in WIDTH get
width 1 by default.
Nothing suspicious when browsing the list of the 170 added characters.
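The WIDTH reasoning above can be illustrated with a minimal Python sketch. It is not the logic of the actual generator scripts: the helper names (`general_category`, `sketch_width`, `needs_width_entry`) are hypothetical, and the rule "only Mn is zero width" is a deliberate simplification of whatever the real generator does. The two input lines are the UnicodeData.txt entries quoted above.

```python
# Simplified illustration only; the helper names and the width rule are
# assumptions, not the glibc generator's real implementation.

# UnicodeData.txt entries for U+1171E as quoted in this commit message.
UNICODE_DATA_15_1 = "1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;"
UNICODE_DATA_16_0 = "1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;"


def general_category(unicode_data_line: str) -> str:
    """Return the General_Category field (third field) of a UnicodeData.txt line."""
    return unicode_data_line.split(";")[2]


def sketch_width(unicode_data_line: str) -> int:
    """Assumed rule: nonspacing marks (Mn) get width 0, everything else width 1."""
    return 0 if general_category(unicode_data_line) == "Mn" else 1


def needs_width_entry(unicode_data_line: str) -> bool:
    """Characters with the default width 1 need no explicit WIDTH entry."""
    return sketch_width(unicode_data_line) != 1


for version, line in (("15.1.0", UNICODE_DATA_15_1), ("16.0.0", UNICODE_DATA_16_0)):
    print(version, general_category(line), sketch_width(line), needs_width_entry(line))
```

Under these assumptions the 15.1.0 entry needs an explicit width 0 entry, while the 16.0.0 entry falls back to the default width 1 and therefore drops out of WIDTH.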
Changes in ctype:
alpha: Added 4452 characters in new ctype which were not in old ctype
combining: Added 51 characters in new ctype which were not in old ctype
combining_level3: Added 43 characters in new ctype which were not in old ctype
graph: Added 5185 characters in new ctype which were not in old ctype
lower: Added 25 characters in new ctype which were not in old ctype
print: Added 5185 characters in new ctype which were not in old ctype
punct: Missing 33 characters of old ctype in new ctype
punct: Added 766 characters in new ctype which were not in old ctype
tolower: Added 27 characters in new ctype which were not in old ctype
totitle: Added 27 characters in new ctype which were not in old ctype
toupper: Added 27 characters in new ctype which were not in old ctype
upper: Added 27 characters in new ctype which were not in old ctype
Nothing suspicious in the additions.
About the 33 characters removed from `punct`:
U+0363 - U+036F are identical in UnicodeData.txt. Difference in DerivedCoreProperties.txt:
DerivedCoreProperties.txt 15.1.0: not there.
DerivedCoreProperties.txt 16.0.0: 0363..036F ; Alphabetic # Mn [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X
So that’s the reason why they are added to `alpha` and removed from `punct`.
Same for U+1DD3 - U+1DE6, they are identical in UnicodeData.txt but there is a difference in DerivedCoreProperties.txt:
DerivedCoreProperties.txt 15.1.0: 1DE7..1DF4 ; Alphabetic # Mn [14] COMBINING LATIN SMALL LETTER ALPHA..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
DerivedCoreProperties.txt 16.0.0: 1DD3..1DF4 ; Alphabetic # Mn [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
So they became `Alphabetic` and were thus added to `alpha` and removed from `punct`.
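As a cross-check of the count of 33, the following Python sketch computes which code points became `Alphabetic` using only the DerivedCoreProperties.txt lines quoted above. The function name `alphabetic_code_points` and the list names are hypothetical, and the inputs are just these excerpts, not the full data files.

```python
# Cross-check sketch; names are illustrative assumptions and the input is
# limited to the DerivedCoreProperties.txt excerpts quoted in this commit message.

ALPHABETIC_15_1 = [
    "1DE7..1DF4 ; Alphabetic # Mn [14] COMBINING LATIN SMALL LETTER ALPHA"
    "..COMBINING LATIN SMALL LETTER U WITH DIAERESIS",
]
ALPHABETIC_16_0 = [
    "0363..036F ; Alphabetic # Mn [13] COMBINING LATIN SMALL LETTER A"
    "..COMBINING LATIN SMALL LETTER X",
    "1DD3..1DF4 ; Alphabetic # Mn [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE"
    "..COMBINING LATIN SMALL LETTER U WITH DIAERESIS",
]


def alphabetic_code_points(lines):
    """Collect all code points marked Alphabetic in the given property lines."""
    result = set()
    for line in lines:
        data = line.split("#", 1)[0]            # drop the trailing comment
        fields = [field.strip() for field in data.split(";")]
        if len(fields) < 2 or fields[1] != "Alphabetic":
            continue
        start, _, end = fields[0].partition("..")
        first = int(start, 16)
        last = int(end, 16) if end else first   # single code point or range
        result.update(range(first, last + 1))
    return result


newly_alphabetic = alphabetic_code_points(ALPHABETIC_16_0) - alphabetic_code_points(ALPHABETIC_15_1)
print(len(newly_alphabetic))
print(f"U+{min(newly_alphabetic):04X} .. U+{max(newly_alphabetic):04X}")
```

The difference contains 13 code points (U+0363 - U+036F) plus 20 code points (U+1DD3 - U+1DE6), which matches the 33 characters reported as missing from `punct`.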
Resolves: BZ #32168
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Diffstat (limited to 'localedata/locales/translit_compat')
-rw-r--r-- | localedata/locales/translit_compat | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/localedata/locales/translit_compat b/localedata/locales/translit_compat
index 7a214b2..dd36388 100644
--- a/localedata/locales/translit_compat
+++ b/localedata/locales/translit_compat
@@ -9,7 +9,7 @@ comment_char %
 % otherwise be governed by that license.
 % Transliterations of compatibility characters and ligatures.
-% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2023-09-15 for Unicode 15.1.0.
+% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2024-09-12 for Unicode 16.0.0.
 LC_CTYPE