author | Mike FABIAN <mfabian@redhat.com> | 2017-12-11 18:26:22 +0100 |
---|---|---|
committer | Mike FABIAN <mfabian@redhat.com> | 2018-02-27 17:47:50 +0100 |
commit | 159738548130d5ac4fe6178977e940ed5f8cfdc4 (patch) | |
tree | 03f90b90e7bb794cfdbd4b3e66c9fff7ad6a9b24 /localedata/locales/uz_UZ | |
parent | ce6636b06b67d6bb9b3d6927bf2a926b9b7478f5 (diff) | |
Adapt collation in several locales to the new iso14651_t1_common file
[BZ #22550] - es_ES locale (and other es_* locales): collation should
treat ñ as a primary different character, sync the collation
for Spanish with CLDR
[BZ #21547] - Tibetan script collation broken (Dzongkha and Tibetan)
* localedata/Makefile: Add new test files.
* localedata/lv_LV.UTF-8.in: Adapt test file to new collation order.
* localedata/sv_SE.ISO-8859-1.in: Adapt test file to new collation order.
* localedata/uk_UA.UTF-8.in: Adapt test file to new collation order.
* localedata/am_ET.UTF-8.in: New test file.
* localedata/az_AZ.UTF-8.in: Likewise.
* localedata/be_BY.UTF-8.in: Likewise.
* localedata/ber_DZ.UTF-8.in: Likewise.
* localedata/ber_MA.UTF-8.in: Likewise.
* localedata/bg_BG.UTF-8.in: Likewise.
* localedata/br_FR.UTF-8.in: Likewise.
* localedata/cmn_TW.UTF-8.in: Likewise.
* localedata/crh_UA.UTF-8.in: Likewise.
* localedata/csb_PL.UTF-8.in: Likewise.
* localedata/cv_RU.UTF-8.in: Likewise.
* localedata/cy_GB.UTF-8.in: Likewise.
* localedata/dz_BT.UTF-8.in: Likewise.
* localedata/eo.UTF-8.in: Likewise.
* localedata/es_ES.UTF-8.in: Likewise.
* localedata/fa_IR.UTF-8.in: Likewise.
* localedata/fi_FI.UTF-8.in: Likewise.
* localedata/fil_PH.UTF-8.in: Likewise.
* localedata/fur_IT.UTF-8.in: Likewise.
* localedata/gez_ER.UTF-8@abegede.in: Likewise.
* localedata/ha_NG.UTF-8.in: Likewise.
* localedata/ig_NG.UTF-8.in: Likewise.
* localedata/ik_CA.UTF-8.in: Likewise.
* localedata/kk_KZ.UTF-8.in: Likewise.
* localedata/ku_TR.UTF-8.in: Likewise.
* localedata/ky_KG.UTF-8.in: Likewise.
* localedata/ln_CD.UTF-8.in: Likewise.
* localedata/mi_NZ.UTF-8.in: Likewise.
* localedata/ml_IN.UTF-8.in: Likewise.
* localedata/mn_MN.UTF-8.in: Likewise.
* localedata/mr_IN.UTF-8.in: Likewise.
* localedata/mt_MT.UTF-8.in: Likewise.
* localedata/nb_NO.UTF-8.in: Likewise.
* localedata/om_KE.UTF-8.in: Likewise.
* localedata/os_RU.UTF-8.in: Likewise.
* localedata/ps_AF.UTF-8.in: Likewise.
* localedata/ro_RO.UTF-8.in: Likewise.
* localedata/ru_RU.UTF-8.in: Likewise.
* localedata/sc_IT.UTF-8.in: Likewise.
* localedata/se_NO.UTF-8.in: Likewise.
* localedata/sq_AL.UTF-8.in: Likewise.
* localedata/sv_SE.UTF-8.in: Likewise.
* localedata/szl_PL.UTF-8.in: Likewise.
* localedata/tg_TJ.UTF-8.in: Likewise.
* localedata/tk_TM.UTF-8.in: Likewise.
* localedata/tt_RU.UTF-8.in: Likewise.
* localedata/tt_RU.UTF-8@iqtelif.in: Likewise.
* localedata/ug_CN.UTF-8.in: Likewise.
* localedata/uz_UZ.UTF-8.in: Likewise.
* localedata/vi_VN.UTF-8.in: Likewise.
* localedata/yi_US.UTF-8.in: Likewise.
* localedata/yo_NG.UTF-8.in: Likewise.
* localedata/zh_CN.UTF-8.in: Likewise.
* localedata/locales/am_ET: Adapt collation rules to new iso14651_t1_common
file and fix bugs in the collation.
* localedata/locales/az_AZ: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/br_FR: Likewise.
* localedata/locales/br_FR@euro: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/cns11643_stroke: Likewise.
* localedata/locales/crh_UA: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/csb_PL: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/en_CA: Likewise.
* localedata/locales/eo: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_EC: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/es_US: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fil_PH: Likewise.
* localedata/locales/fur_IT: Likewise.
* localedata/locales/gez_ER@abegede: Likewise.
* localedata/locales/ha_NG: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/hsb_DE: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/ig_NG: Likewise.
* localedata/locales/ik_CA: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/iso14651_t1_pinyin: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/ku_TR: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mi_NZ: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/mn_MN: Likewise.
* localedata/locales/mr_IN: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/ps_AF: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/ru_UA: Likewise.
* localedata/locales/sc_IT: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/sv_FI: Likewise.
* localedata/locales/sv_FI@euro: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/szl_PL: Likewise.
* localedata/locales/tg_TJ: Likewise.
* localedata/locales/ti_ER: Likewise.
* localedata/locales/tk_TM: Likewise.
* localedata/locales/tl_PH: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/tt_RU: Likewise.
* localedata/locales/tt_RU@iqtelif: Likewise.
* localedata/locales/ug_CN: Likewise.
* localedata/locales/uk_UA: Likewise.
* localedata/locales/uz_UZ: Likewise.
* localedata/locales/uz_UZ@cyrillic: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yo_NG: Likewise.
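
The new `*.in` test files listed above can be spot-checked outside the glibc test suite. The following is a minimal sketch, assuming each `.in` file lists its strings one per line in the expected collation order and that the matching locale is installed on the system. The helper names `is_in_collation_order` and `check_file` are hypothetical and are not part of glibc.

```python
import locale


def is_in_collation_order(lines, cmp=locale.strcoll):
    """Return True if every line sorts at or before its successor under cmp."""
    return all(cmp(a, b) <= 0 for a, b in zip(lines, lines[1:]))


def check_file(path, locale_name):
    """Check one test file against an installed locale, e.g. "uz_UZ.UTF-8".

    Assumption: the file is UTF-8 encoded with one string per line.
    Raises locale.Error if the locale is not available on this system.
    """
    locale.setlocale(locale.LC_COLLATE, locale_name)
    with open(path, encoding="utf-8") as fh:
        lines = [line.rstrip("\n") for line in fh]
    return is_in_collation_order(lines)
```

For example, `check_file("localedata/uz_UZ.UTF-8.in", "uz_UZ.UTF-8")` would report whether the installed locale agrees with the order in the test file. This compares adjacent pairs only, which is enough to detect any out-of-order line.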
Diffstat (limited to 'localedata/locales/uz_UZ')
-rw-r--r-- | localedata/locales/uz_UZ | 127 |
1 files changed, 83 insertions, 44 deletions
diff --git a/localedata/locales/uz_UZ b/localedata/locales/uz_UZ
index c5afbf7..2dae80c 100644
--- a/localedata/locales/uz_UZ
+++ b/localedata/locales/uz_UZ
@@ -155,53 +155,92 @@ END LC_CTYPE
 LC_COLLATE
 copy "iso14651_t1"
+% CLDR collation rules for Uzbek:
+% (see: https://unicode.org/cldr/trac/browser/trunk/common/collation/uz.xml)
+%
+% <collations>
+% <collation type="standard"><cr><![CDATA[
+% # The following letters sort after z, see
+% # https://en.wikipedia.org/wiki/Uzbek_alphabet#Alphabetical_order
+% # Native speaker+linguists say that
+% # the digraph ⟨ng⟩ and the symbol ⟨ʼ⟩ are not considered separate letters.
+% #
+% # Reset between the last z-like letter and ezh.
+% #
+% # U+02BB ʻ MODIFIER LETTER TURNED COMMA is hard to type, so make
+% # equivalent contractions with U+2018 ‘ LEFT SINGLE QUOTATION MARK
+% # and U+0027 APOSTROPHE.
+% # (https://en.wikipedia.org/wiki/Uzbek_alphabet#Distinct_characters)
+% # Remember that a pair of apostrophes encodes just one of them.
+% &[before 1]ʒ<oʻ=o‘=o''<<<Oʻ=O‘=O''
+% <gʻ=g‘=g''<<<Gʻ=G‘=G''
+% <sh<<<Sh<<<SH
+% <ch<<<Ch<<<CH
+% ]]></cr></collation>
+% </collations>
+%
+% And CLDR also lists the following
+% index characters:
+% (see: https://unicode.org/cldr/trac/browser/trunk/common/main/uz.xml)
+%
+% <exemplarCharacters type="index">[A B D E F G H I J K L M N O P Q R S T U V X Y Z {Oʻ} {Gʻ} {Sh} {Ch}]</exemplarCharacters>
+%
-%% a b c d e f g g' h i j k l m n o o' p q r s t u v x y z
-%% cyr: a=, b=, v=, g=, d=, e=, io, z%, z=, i=, j=, k=, l=, m=, n=, o=,
-%% p=, r=, s=, t=, u=, f=, h=, c=, c%, s%, sc, =' , y=, je, ju, ja,
-%% v%, k,=, g-=, h,=
-collating-symbol <g-'-uz>
-collating-element <g-'> from "<U0067><U0027>"
-collating-element <G-'> from "<U0047><U0027>"
-collating-symbol <o-'-uz>
-collating-element <o-'> from "<U006F><U0027>"
-collating-element <O-'> from "<U004F><U0027>"
-
-collating-symbol <k,=>
-collating-symbol <g-=>
-collating-symbol <h,=>
-
-reorder-after <g>
-<g-'-uz>
-reorder-after <o>
-<o-'-uz>
-reorder-after <CYR-YA>
-<CYR-OUBRE>
-<k,=>
-<g-=>
-<h,=>
-
-reorder-after <U0067>
-<g-'> <g-'-uz>;<PCL>;<MIN>;IGNORE
-reorder-after <U0047>
-<G-'> <g-'-uz>;<PCL>;<CAP>;IGNORE
-
-reorder-after <U006F>
-<o-'> <o-'-uz>;<PCL>;<MIN>;IGNORE
-reorder-after <U004F>
-<O-'> <o-'-uz>;<PCL>;<CAP>;IGNORE
+collating-symbol <g'-digraph>
+collating-symbol <o'-digraph>
+collating-element <g-turned-comma> from "g<U02BB>"
+collating-element <G-turned-comma> from "G<U02BB>"
+collating-element <o-turned-comma> from "o<U02BB>"
+collating-element <O-turned-comma> from "O<U02BB>"
+% Unfortunately we cannot use “left single quotation mark” because
+% it fails when creating the uz_UZ.iso88591 locale. In UTF-8 it works
+% but in ISO-8859-1 one gets error messages that it uses the same
+% encoding as “turned comma”
+% collating-element <g-left-single-quotation-mark> from "g<U2018>"
+% collating-element <G-left-single-quotation-mark> from "G<U2018>"
+% collating-element <o-left-single-quotation-mark> from "o<U2018>"
+% collating-element <O-left-single-quotation-mark> from "O<U2018>"
+collating-element <g-double-apostrophe> from "g''"
+collating-element <G-double-apostrophe> from "G''"
+collating-element <o-double-apostrophe> from "o''"
+collating-element <O-double-apostrophe> from "O''"
+collating-symbol <sh-digraph>
+collating-element <sh> from "sh"
+collating-element <sH> from "sH"
+collating-element <Sh> from "Sh"
+collating-element <SH> from "SH"
+collating-symbol <ch-digraph>
+collating-element <ch> from "ch"
+collating-element <cH> from "cH"
+collating-element <Ch> from "Ch"
+collating-element <CH> from "CH"

-reorder-after <U044F>
-<U045E> <CYR-OUBRE>;<PCL>;<MIN>;IGNORE
-<U049B> <k,=>;<PCL>;<MIN>;IGNORE
-<U0493> <g-=>;<PCL>;<MIN>;IGNORE
-<U04B3> <h,=>;<PCL>;<MIN>;IGNORE
+reorder-after <AFTER-Z>
+<o'-digraph>
+<g'-digraph>
+<sh-digraph>
+<ch-digraph>

-reorder-after <U042F>
-<U040E> <CYR-OUBRE>;<PCL>;<CAP>;IGNORE
-<U049A> <k,=>;<PCL>;<CAP>;IGNORE
-<U0492> <g-=>;<PCL>;<CAP>;IGNORE
-<U04B2> <h,=>;<PCL>;<CAP>;IGNORE
+<o-turned-comma> <o'-digraph>;"<BASE><BASE>";"<MIN><MIN>";<VRNT1>
+<O-turned-comma> <o'-digraph>;"<BASE><BASE>";"<CAP><MIN>";<VRNT1>
+% <o-left-single-quotation-mark> <o'-digraph>;"<BASE><BASE>";"<MIN><MIN>";<VRNT2>
+% <O-left-single-quotation-mark> <o'-digraph>;"<BASE><BASE>";"<CAP><MIN>";<VRNT2>
+<o-double-apostrophe> <o'-digraph>;"<BASE><BASE>";"<MIN><MIN>";<VRNT3>
+<O-double-apostrophe> <o'-digraph>;"<BASE><BASE>";"<CAP><MIN>";<VRNT3>
+<g-turned-comma> <g'-digraph>;"<BASE><BASE>";"<MIN><MIN>";<VRNT1>
+<G-turned-comma> <g'-digraph>;"<BASE><BASE>";"<CAP><MIN>";<VRNT1>
+% <g-left-single-quotation-mark> <g'-digraph>;"<BASE><BASE>";"<MIN><MIN>";<VRNT2>
+% <G-left-single-quotation-mark> <g'-digraph>;"<BASE><BASE>";"<CAP><MIN>";<VRNT2>
+<g-double-apostrophe> <g'-digraph>;"<BASE><BASE>";"<MIN><MIN>";<VRNT3>
+<G-double-apostrophe> <g'-digraph>;"<BASE><BASE>";"<CAP><MIN>";<VRNT3>
+<sh> <sh-digraph>;"<BASE><BASE>";"<MIN><MIN>";IGNORE
+<sH> <sh-digraph>;"<BASE><BASE>";"<MIN><CAP>";IGNORE
+<Sh> <sh-digraph>;"<BASE><BASE>";"<CAP><MIN>";IGNORE
+<SH> <sh-digraph>;"<BASE><BASE>";"<CAP><CAP>";IGNORE
+<ch> <ch-digraph>;"<BASE><BASE>";"<MIN><MIN>";IGNORE
+<cH> <ch-digraph>;"<BASE><BASE>";"<MIN><CAP>";IGNORE
+<Ch> <ch-digraph>;"<BASE><BASE>";"<CAP><MIN>";IGNORE
+<CH> <ch-digraph>;"<BASE><BASE>";"<CAP><CAP>";IGNORE
 reorder-end
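
The primary-level effect of the new Uzbek rules (oʻ, gʻ, sh and ch treated as single letters that sort after z) can be illustrated with a small model. This is a rough sketch in Python, not glibc's implementation: it covers only lowercase-folded Latin letters and the U+02BB contractions, ignores the case and variant levels, skips the double-apostrophe spellings, and the name `primary_key` is hypothetical.

```python
# Primary-level sketch only; case and variant weights are ignored.
BASE = "abcdefghijklmnopqrstuvwxyz"

# Contractions that sort after z, in the order the rules list them.
# "\u02bb" is U+02BB MODIFIER LETTER TURNED COMMA.
CONTRACTIONS = ["o\u02bb", "g\u02bb", "sh", "ch"]

WEIGHT = {ch: i for i, ch in enumerate(BASE)}
WEIGHT.update({c: len(BASE) + i for i, c in enumerate(CONTRACTIONS)})


def primary_key(word):
    """Map a word to a list of primary weights, matching contractions first."""
    s = word.lower()
    key = []
    i = 0
    while i < len(s):
        pair = s[i:i + 2]
        if len(pair) == 2 and pair in WEIGHT:
            # Two-character contraction such as "sh" counts as one letter.
            key.append(WEIGHT[pair])
            i += 2
        else:
            # Assumption of this sketch: unknown characters sort last.
            key.append(WEIGHT.get(s[i], len(WEIGHT)))
            i += 1
    return key
```

Passing `primary_key` as the `key` argument to `sorted` puts words beginning with s or z ahead of words beginning with oʻ, gʻ, sh or ch, which is the alphabetical order the CLDR comment in the diff describes.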