author: Johannes Schindelin <johannes.schindelin@gmx.de> 2021-11-16 11:26:10 +0100
committer: Takashi Yano <takashi.yano@nifty.ne.jp> 2021-11-16 23:20:43 +0900
commit: 782aac590af7f065877168848d5fbb20535bfcf9
tree: 279355d4cfe8c538e4ec7df1595d06dbc15bb2d3 (/winsup)
parent: 076c85673981493ed41aa176518a5e86fc71a33f
Cygwin: console: Handle Unicode surrogate pairs.

When running Cygwin's Bash in the Windows Terminal (see https://docs.microsoft.com/en-us/windows/terminal/ for details), Cygwin receives keyboard input as UTF-16 code units. A single 16-bit code unit cannot represent the full Unicode range. To make up for that, UTF-16 reserves the ranges U+D800-U+DBFF and U+DC00-U+DFFF, which are illegal except when they come as a pair encoding a Unicode character beyond U+FFFF.

Cygwin does not handle such surrogate pairs correctly at the moment. This can be seen, e.g., when running Cygwin's Bash in the Windows Terminal and then inserting an emoji (e.g. via Windows + <dot>, which opens an emoji picker on recent Windows versions): instead of the emoji, this shows the infamous question mark in a black diamond, i.e. the Unicode replacement character (U+FFFD).

Let's special-case surrogate pairs in this scenario.

This fixes https://github.com/git-for-windows/git/issues/3281

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

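To make the surrogate ranges above concrete, here is a minimal standalone sketch of how a high/low surrogate pair maps to a single code point and then to UTF-8. It is not Cygwin code: the helper names (`is_high_surrogate`, `is_low_surrogate`, `combine_surrogates`, `to_utf8`) are hypothetical and exist only for this illustration. The patch itself delegates the conversion to `sys_wcstombs`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers for illustration only; not part of Cygwin.

// True if u is a high (leading) surrogate, U+D800..U+DBFF.
inline bool is_high_surrogate (char16_t u) { return u >= 0xd800 && u <= 0xdbff; }

// True if u is a low (trailing) surrogate, U+DC00..U+DFFF.
inline bool is_low_surrogate (char16_t u) { return u >= 0xdc00 && u <= 0xdfff; }

// Combine a valid surrogate pair into one code point in U+10000..U+10FFFF.
inline char32_t combine_surrogates (char16_t high, char16_t low)
{
  assert (is_high_surrogate (high) && is_low_surrogate (low));
  return 0x10000
	 + ((static_cast<char32_t> (high) - 0xd800) << 10)
	 + (static_cast<char32_t> (low) - 0xdc00);
}

// Encode a code point as UTF-8.  Assumes cp <= 0x10FFFF and does no
// further validation.
inline std::string to_utf8 (char32_t cp)
{
  std::string out;
  if (cp < 0x80)
    out += static_cast<char> (cp);
  else if (cp < 0x800)
    {
      out += static_cast<char> (0xc0 | (cp >> 6));
      out += static_cast<char> (0x80 | (cp & 0x3f));
    }
  else if (cp < 0x10000)
    {
      out += static_cast<char> (0xe0 | (cp >> 12));
      out += static_cast<char> (0x80 | ((cp >> 6) & 0x3f));
      out += static_cast<char> (0x80 | (cp & 0x3f));
    }
  else
    {
      out += static_cast<char> (0xf0 | (cp >> 18));
      out += static_cast<char> (0x80 | ((cp >> 12) & 0x3f));
      out += static_cast<char> (0x80 | ((cp >> 6) & 0x3f));
      out += static_cast<char> (0x80 | (cp & 0x3f));
    }
  return out;
}
```

For example, the emoji U+1F600 arrives as the two code units 0xD83D and 0xDE00. Converting each unit on its own cannot yield a valid character, which is why the pair has to be converted as a whole.
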
Diffstat (limited to 'winsup')
-rw-r--r-- winsup/cygwin/fhandler_console.cc | 17
-rw-r--r-- winsup/cygwin/release/3.3.3 | 6
2 files changed, 22 insertions, 1 deletions
diff --git a/winsup/cygwin/fhandler_console.cc b/winsup/cygwin/fhandler_console.cc
index 0501b36..f4241ee 100644
--- a/winsup/cygwin/fhandler_console.cc
+++ b/winsup/cygwin/fhandler_console.cc
@@ -919,7 +919,22 @@ fhandler_console::process_input_message (void)
 	      }
 	    else
 	      {
-		nread = con.con_to_str (tmp + 1, 59, unicode_char);
+		WCHAR second = unicode_char >= 0xd800 && unicode_char <= 0xdbff
+		    && i + 1 < total_read ?
+		    input_rec[i + 1].Event.KeyEvent.uChar.UnicodeChar : 0;
+
+		if (second < 0xdc00 || second > 0xdfff)
+		  {
+		    nread = con.con_to_str (tmp + 1, 59, unicode_char);
+		  }
+		else
+		  {
+		    /* handle surrogate pairs */
+		    WCHAR pair[2] = { unicode_char, second };
+		    nread = sys_wcstombs (tmp + 1, 59, pair, 2);
+		    i++;
+		  }
+
 		/* Determine if the keystroke is modified by META. The tricky
 		   part is to distinguish whether the right Alt key should be
 		   recognized as Alt, or as AltGr. */
diff --git a/winsup/cygwin/release/3.3.3 b/winsup/cygwin/release/3.3.3
index 1eb25e2..c1e8cef 100644
--- a/winsup/cygwin/release/3.3.3
+++ b/winsup/cygwin/release/3.3.3
@@ -16,3 +16,9 @@ Bug Fixes
 - Fix long-standing problem that new files don't get created with the
   FILE_ATTRIBUTE_ARCHIVE DOS attribute set.
   Addresses: https://cygwin.com/pipermail/cygwin/2021-November/249909.html
+
+- Handle Unicode surrogate pairs in console. Cygwin console does not
+  handle surrogate pairs correctly at the moment. Fix issue that
+  running bash in Windows Terminal and inserting an emoji does not
+  work as expected.
+  Addresses: https://github.com/git-for-windows/git/issues/3281
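
The look-ahead that the fhandler_console.cc hunk adds to process_input_message can be tried outside of Cygwin with a small standalone sketch. It assumes the input is a plain `std::u16string` rather than an array of console input records, and the function name `group_utf16_units` is hypothetical. Like the patch, it peeks at the next code unit only after seeing a high surrogate, and it skips that unit once it has been consumed as part of a pair.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration, not Cygwin code: split UTF-16 input into
// groups.  A valid surrogate pair stays together; every other code unit,
// including a lone surrogate, forms a group of its own.
std::vector<std::u16string> group_utf16_units (const std::u16string &in)
{
  std::vector<std::u16string> groups;
  for (std::size_t i = 0; i < in.size (); i++)
    {
      char16_t first = in[i];
      // Peek at the next unit only if `first` is a high surrogate and a
      // next unit exists; 0 stands for "no candidate".
      char16_t second = first >= 0xd800 && first <= 0xdbff
			&& i + 1 < in.size () ? in[i + 1] : 0;

      if (second < 0xdc00 || second > 0xdfff)
	groups.push_back (std::u16string (1, first));
      else
	{
	  groups.push_back (std::u16string { first, second });
	  i++;	// the low surrogate was consumed together with `first`
	}
    }
  return groups;
}
```

With the input `u"a\U0001F600b"` this yields three groups, the middle one holding both surrogates. A high surrogate followed by an ordinary character is left as a single-unit group, which corresponds to the patch falling back to the existing `con_to_str` path.
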