aboutsummaryrefslogtreecommitdiff
path: root/winsup/doc/setup2.sgml
blob: bafecef8952125a9021ae8b37d3289f8096bccdb (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
<sect1 id="setup-env"><title>Environment Variables</title>

<sect2 id="setup-env-ov"><title>Overview</title>

<para>
All Windows environment variables are imported when Cygwin starts.
Apart from that, you may wish to specify settings of several important
environment variables that affect Cygwin's operation.</para>

<para>
The <envar>CYGWIN</envar> variable is used to configure a few global
settings for the Cygwin runtime system.  Typically you can leave
<envar>CYGWIN</envar> unset, but if you want to set one ore more
options, you can set it using a syntax like this, depending on the shell
in which you're setting it.  Here is an example in CMD syntax:</para>

<screen>
<prompt>C:\&gt;</prompt> <userinput>set CYGWIN=error_start:C:\cygwin\bin\gdb.exe glob</userinput>
</screen>

<para>
This is, of course, just an example.  For the recognized settings of the
<envar>CYGWIN</envar> environment variable, see
<xref linkend="using-cygwinenv"></xref>.
</para>

<para>
Locale support is controlled by the <envar>LANG</envar> and
<envar>LC_xxx</envar> environment variables.  Since Cygwin 1.7.2, all of
them are honored and have a meaning.  For a more detailed description see
<xref linkend="setup-locale"></xref>.
</para>

<para>
The <envar>PATH</envar> environment variable is used by Cygwin
applications as a list of directories to search for executable files
to run.  This environment variable is converted from Windows format
(e.g. <filename>C:\Windows\system32;C:\Windows</filename>) to UNIX format
(e.g., <filename>/cygdrive/c/Windows/system32:/cygdrive/c/Windows</filename>)
when a Cygwin process first starts.
Set it so that it contains at least the <filename>x:\cygwin\bin</filename>
directory where "<filename>x:\cygwin</filename> is the "root" of your
cygwin installation if you wish to use cygwin tools outside of bash.
This is usually done by the batch file you're starting your shell with.
</para>

<para> 
The <envar>HOME</envar> environment variable is used by many programs to
determine the location of your home directory and we recommend that it be
defined.  This environment variable is also converted from Windows format
when a Cygwin process first starts.  It's usually set in the shell
profile scripts in the /etc directory.
</para>

<para>
The <envar>TERM</envar> environment variable specifies your terminal
type.  It is automatically set to <literal>cygwin</literal> if you have
not set it to something else.
</para>

<para>The <envar>LD_LIBRARY_PATH</envar> environment variable is used by
the Cygwin function <function>dlopen ()</function> as a list of
directories to search for .dll files to load.  This environment variable
is converted from Windows format to UNIX format when a Cygwin process
first starts.  Most Cygwin applications do not make use of the
<function>dlopen ()</function> call and do not need this variable.
</para>

<para>
In addition to <envar>PATH</envar>, <envar>HOME</envar>,
and <envar>LD_LIBRARY_PATH</envar>, there are three other environment
variables which, if they exist in the Windows environment, are
converted to UNIX format: <envar>TMPDIR</envar>, <envar>TMP</envar>,
and <envar>TEMP</envar>.  The first is not set by default in the
Windows environment but the other two are, and they point to the
default Windows temporary directory.  If set, these variables will be
used by some Cygwin applications, possibly with unexpected results.
You may therefore want to unset them by adding the following two lines
to your <filename>~/.bashrc</filename> file:

<screen>
unset TMP
unset TEMP
</screen>

This is done in the default <filename>~/.bashrc</filename> file.
Alternatively, you could set <envar>TMP</envar>
and <envar>TEMP</envar> to point to <filename>/tmp</filename> or to
any other temporary directory of your choice.  For example:

<screen>
export TMP=/tmp
export TEMP=/tmp
</screen>
</para>

</sect2>

<sect2 id="setup-env-win32"><title>Restricted Win32 environment</title>

<para>There is a restriction when calling Win32 API functions which
require a fully set up application environment.  Cygwin maintains its own
environment in POSIX style.  The Win32 environment is usually stripped
to a bare minimum and not at all kept in sync with the Cygwin POSIX
environment.</para>

<para>If you need the full Win32 environment set up in a Cygwin process,
you have to call</para>

<screen>
#include &lt;sys/cygwin.h&gt;

cygwin_internal (CW_SYNC_WINENV);
</screen>

<para>to synchronize the Win32 environment with the Cygwin environment.
Note that this only synchronizes the Win32 environment once with the
Cygwin environment.  Later changes using the <function>setenv</function>
or <function>putenv</function> calls are not reflected in the Win32
environment.  In these cases, you have to call the aforementioned
<function>cygwin_internal</function> call again.</para>

</sect2>

</sect1>

<sect1 id="setup-maxmem"><title>Changing Cygwin's Maximum Memory</title>

<para>
Cygwin's heap is extensible.  However, it does start out at a fixed size
and attempts to extend it may run into memory which has been previously
allocated by Windows.  In some cases, this problem can be solved by
changing a field in the file header which is utilized by Cygwin since
version 1.7.10 to keep the initial size of the application heap.  If the
field contains 0, which is the default, the application heap defaults to
a size of 384 Megabyte.  If the field is set to any other value between 4 and
2048, Cygwin tries to reserve as much Megabytes for the application heap.
The field used for this is the "LoaderFlags" field in the NT-specific
PE header structure (<literal>(IMAGE_NT_HEADER)->OptionalHeader.LoaderFlags</literal>).</para>

<para>
This value can be changed for any executable by using a more recent version
of the <command>peflags</command> tool from the <literal>rebase</literal>
Cygwin package.  Example:

<screen>
$ peflags --cygwin-heap foo.exe
foo.exe: initial Cygwin heap size: 0 (0x0) MB
$ peflags --cygwin-heap=500 foo.exe
foo.exe: initial Cygwin heap size: 500 (0x1f4) MB
</screen>
</para>

<para>
Heap memory can be allocated up to the size of the biggest available free
block in the processes virtual memory (VM).  By default, the VM per process
is 2 GB for 32 processes.  To get more VM for a process, the executable
must have the "large address aware" flag set in the file header.  You can
use the aforementioned <command>peflags</command> tool to set this flag.
On 64 bit systems this results in a 4 GB VM for a process started from that
executable.  On 32 bit systems you also have to prepare the system to allow
up to 3 GB per process.  See the Microsoft article
<ulink url="http://msdn.microsoft.com/en-us/library/bb613473%28VS.85%29.aspx">4-Gigabyte Tuning</ulink>
for more information.
</para>

<note>
<para>
Older Cygwin releases only supported a global registry setting to
change the initial heap size for all Cygwin processes.  This setting is
not used anymore.  However, if you're running an older Cygwin release
than 1.7.10, you can add the <literal>DWORD</literal> value
<literal>heap_chunk_in_mb</literal> and set it to the desired memory limit
in decimal MB.  You have to stop all Cygwin processes for this setting to
have any effect.  It is preferred to do this in Cygwin using the
<command>regtool</command> program included in the Cygwin package.
(see <xref linkend="regtool"></xref>) This example sets the memory limit
to 1024 MB for all Cygwin processes (use HKCU instead of HKLM if you
want to set this only for the current user):

<screen>
$ regtool -i set /HKLM/Software/Cygwin/heap_chunk_in_mb 1024
$ regtool -v list /HKLM/Software/Cygwin
</screen>
</para>
</note>

</sect1>

<sect1 id="setup-locale"><title>Internationalization</title>

<sect2 id="setup-locale-ov"><title>Overview</title>

<para>
Internationalization support is controlled by the <envar>LANG</envar> and
<envar>LC_xxx</envar> environment variables.  You can set all of them
but Cygwin itself only honors the variables <envar>LC_ALL</envar>,
<envar>LC_CTYPE</envar>, and <envar>LANG</envar>, in this order, according
to the POSIX standard.  The content of these variables should follow the
POSIX standard for a locale specifier.  The correct form of a locale
specifier is</para>

<screen>
  language[[_TERRITORY][.charset][@modifier]]
</screen>

<para>"language" is a lowercase two character string per ISO 639-1, or,
if there is no ISO 639-1 code for the language (for instance, "Lower Sorbian"),
a three character string per ISO 639-3.</para>

<para>"TERRITORY" is an uppercase two character string per ISO 3166, charset is
one of a list of supported character sets.  The modifier doesn't matter
here (though some are recognized, see below).  If you're interested in the
exact description, you can find it in the online publication of the POSIX
manual pages on the homepage of the
<ulink url="http://www.opengroup.org/">Open Group</ulink>.</para>

<para>Typical locale specifiers are</para>

<screen>
  "de_CH"	   language = German, territory = Switzerland, default charset
  "fr_FR.UTF-8"    language = french, territory = France, charset = UTF-8
  "ko_KR.eucKR"    language = korean, territory = South Korea, charset = eucKR
  "syr_SY"         language = Syriac, territory = Syria, default charset
</screen>

<para>
If the locale specifier does not follow the above form, Cygwin checks
if the locale is one of the locale aliases defined in the file
<filename>/usr/share/locale/locale.alias</filename>.  If so, and if
the replacement localename is supported by the underlying Windows,
the locale is accepted, too.  So, given the default content of the
<filename>/usr/share/locale/locale.alias</filename> file, the below
examples would be valid locale specifiers as well.
</para>

<screen>
  "catalan"        defined as "ca_ES.ISO-8859-1" in locale.alias
  "japanese"       defined as "ja_JP.eucJP"      in locale.alias
  "turkish"        defined as "tr_TR.ISO-8859-9" in locale.alias
</screen>

<para>The file <filename>/usr/share/locale/locale.alias</filename> is
provided by the gettext package under Cygwin.</para>

<para>
At application startup, the application's locale is set to the default
"C" or "POSIX" locale.  Under Cygwin 1.7.2 and later, this locale defaults
to the ASCII character set on the application level.  If you want to stick
to the "C" locale and only change to another charset, you can define this
by setting one of the locale environment variables to "C.charset".  For
instance</para>

<screen>
  "C.ISO-8859-1"
</screen>

<note><para>The default locale in the absence of the aforementioned locale
environment variables is "C.UTF-8".</para></note>

<para>Windows uses the UTF-16 charset exclusively to store the names
of any object used by the Operating System.  This is especially important
with filenames.  Cygwin uses the setting of the locale environment variables
<envar>LC_ALL</envar>, <envar>LC_CTYPE</envar>, and <envar>LANG</envar>, to
determine how to convert Windows filenames from their UTF-16 representation
to the singlebyte or multibyte character set used by Cygwin.</para>

<para>
The setting of the locale environment variables at process startup
is effective for Cygwin's internal conversions to and from the Windows UTF-16
object names for the entire lifetime of the current process.  Changing
the environment variables to another value changes the way filenames are
converted in subsequently started child processes, but not within the same
process.</para>

<para>
However, even if one of the locale environment variables is set to
some other value than "C", this does <emphasis>only</emphasis> affect
how Cygwin itself converts filenames.  As the POSIX standard requires,
it's the application's responsibility to activate that locale for its
own purposes, typically by using the call</para>

<screen>
  setlocale (LC_ALL, "");
</screen>

<para>early in the application code.  Again, so that this doesn't get
lost:  If the application calls setlocale as above, and there is none
of the important locale variables set in the environment, the locale
is set to the default locale, which is "C.UTF-8".</para>

<para>But what about applications which are not locale-aware?  Per POSIX,
they are running in the "C" or "POSIX" locale, which implies the ASCII
charset.  The Cygwin DLL itself, however, will nevertheless use the locale
set in the environment (or the "C.UTF-8" default locale) for converting
filenames etc.</para>

<para>When the locale in the environment specifies an ASCII charset,
for example "C" or "en_US.ASCII", Cygwin will still use UTF-8
under the hood to translate filenames.  This allows for easier
interoperability with applications running in the default "C.UTF-8" locale.
</para>

<para>
Starting with Cygwin 1.7.2, the language and territory are used to
fetch locale-dependent information from Windows.  If the language and
territory are not known to Windows, the <function>setlocale</function>
function fails.</para>

<para>The following modifiers are recognized.  Any other modifier is simply
ignored for now.</para>

<itemizedlist mark="bullet">

<listitem><para>
For locales which use the Euro (EUR) as currency, the modifier "@euro"
can be added to enforce usage of the ISO-8859-15 character set, which
includes a character for the "Euro" currency sign.
</para></listitem>

<listitem><para>
The default script used for all Serbian language locales (sr_BA, sr_ME, sr_RS,
and the deprecated sr_CS and sr_SP) is cyrillic.  With the "@latin" modifier
it gets switched to the latin script with the respective collation behaviour.
</para></listitem>

<listitem><para>
The default charset of the "be_BY" locale (Belarusian/Belarus) is CP1251.
With the "@latin" modifier it's UTF-8.
</para></listitem>

<listitem><para>
The default charset of the "tt_RU" locale (Tatar/Russia) is ISO-8859-5.
With the "@iqtelif" modifier it's UTF-8.
</para></listitem>

<listitem><para>
The default charset of the "uz_UZ" locale (Uzbek/Uzbekistan) is ISO-8859-1.
With the "@cyrillic" modifier it's UTF-8.
</para></listitem>

<listitem><para>
There's a class of characters in the Unicode character set, called the
"CJK Ambiguous Width" characters.  For these characters, the width
returned by the wcwidth/wcswidth functions is usually 1.  This can be a
problem with East-Asian languages, which historically use character sets
where these characters have a width of 2.  Therefore, wcwidth/wcswidth
return 2 as the width of these characters when an East-Asian charset such
as GBK or SJIS is selected, or when UTF-8 is selected and the language is
specified as "zh" (Chinese), "ja" (Japanese), or "ko" (Korean).  This is
not correct in all circumstances, hence the locale modifier "@cjknarrow"
can be used to force wcwidth/wcswidth to return 1 for the ambiguous width
characters.
</para></listitem>

</itemizedlist>

</sect2>

<sect2 id="setup-locale-how"><title>How to set the locale</title>

<itemizedlist mark="bullet">

<listitem><para>
Assume that you've set one of the aforementioned environment variables to some
valid POSIX locale value, other than "C" and "POSIX".  Assume further that
you're living in Japan.  You might want to use the language code "ja" and the
territory "JP", thus setting, say, <envar>LANG</envar> to "ja_JP".  You didn't
set a character set, so what will Cygwin use now?  Starting with Cygwin 1.7.2,
the default character set is determined by the default Windows ANSI codepage
for this language and territory.  Cygwin uses a character set which is the
typical Unix-equivalent to the Windows ANSI codepage.  For instance:</para>

<screen>
  "en_US"		ISO-8859-1
  "el_GR"		ISO-8859-7
  "pl_PL"		ISO-8859-2
  "pl_PL@euro"		ISO-8859-15
  "ja_JP"		EUCJP
  "ko_KR"		EUCKR
  "te_IN"		UTF-8
</screen>
</listitem>

<listitem><para>
You don't want to use the default character set?  In that case you have to
specify the charset explicitly.  For instance, assume you're from Japan and
don't want to use the japanese default charset EUC-JP, but the Windows
default charset SJIS.  What you can do, for instance, is to set the
<envar>LANG</envar> variable in the <command>mintty</command> Cygwin Terminal
in the "Text" section of its "Options" dialog.  If you're starting your
Cygwin session via a batch file or a shortcut to a batch file, you can also
just set LANG there:</para>

<screen>
  @echo off

  C:
  chdir C:\cygwin\bin
  set LANG=ja_JP.SJIS
  bash --login -i
</screen>

<note><para>For a list of locales supported by your Windows machine, use the new
<command>locale -a</command> command, which is part of the Cygwin package.
For a description see <xref linkend="locale"></xref></para></note>

<note><para>For a list of supported character sets, see
<xref linkend="setup-locale-charsetlist"></xref>
</para></note>
</listitem>

<listitem><para>
Last, but not least, most singlebyte or doublebyte charsets have a big
disadvantage.  Windows filesystems use the Unicode character set in the
UTF-16 encoding to store filename information.  Not all characters
from the Unicode character set are available in a singlebyte or doublebyte
charset.  While Cygwin has a workaround to access files with unusual
characters (see <xref linkend="pathnames-unusual"></xref>), a better
workaround is to use always the UTF-8 character set.</para>

<para><emphasis>UTF-8 is the only multibyte character set which can represent
every Unicode character.</emphasis></para>

<screen>
  set LANG=es_MX.UTF-8
</screen>

<para>For a description of the Unicode standard, see the homepage of the
<ulink url="http://www.unicode.org/">Unicode Consortium</ulink>.
</para></listitem>

</itemizedlist>

</sect2>

<sect2 id="setup-locale-console"><title>The Windows Console character set</title>

<para>Sometimes the Windows console is used to run Cygwin applications.
While terminal emulations like the Cygwin Terminal <command>mintty</command>
or <command>xterm</command> have a distinct way to set the character set
used for in- and output, the Windows console hasn't such a way, since it's
not an application in its own right.</para>

<para>This problem is solved in Cygwin as follows.  When a Cygwin
process is started in a Windows console (either explicitly from cmd.exe,
or implicitly by, for instance, running the
<filename>C:\cygwin\Cygwin.bat</filename> batch file), the Console character
set is determined by the setting of the aforementioned
internationalization environment variables, the same way as described in
<xref linkend="setup-locale-how"></xref>.  </para>

<para>What is that good for?  Why not switch the console character set with
the applications requirements?  After all, the application knows if it uses
localization or not.  However, what if a non-localized application calls
a remote application which itself is localized?  This can happen with
<command>ssh</command> or <command>rlogin</command>.  Both commands don't
have and don't need localization and they never call
<function>setlocale</function>.  Setting one of the internationalization
environment variable to the same charset as the remote machine before
starting <command>ssh</command> or <command>rlogin</command> fixes that
problem.</para>

</sect2>

<sect2 id="setup-locale-problems"><title>Potential Problems when using Locales</title>

<para>
You can set the above internationalization variables not only when
starting the first Cygwin process, but also in your Cygwin shell on the
fly, even switch to yet another character set, and yet another.  In bash
for instance:</para>

<screen>
  <prompt>bash$</prompt> export LC_CTYPE="nl_BE.UTF-8"
</screen>

<para>However, here's a problem.  At the start of the first Cygwin process
in a session, the Windows environment is converted from UTF-16 to UTF-8.
The environment is another of the system objects stored in UTF-16 in
Windows.</para>

<para>As long as the environment only contains ASCII characters, this is
no problem at all.  But if it contains native characters, and you're planning
to use, say, GBK, the environment will result in invalid characters in
the GBK charset.  This would be especially a problem in variables like
<envar>PATH</envar>.  To circumvent the worst problems, Cygwin converts
the <envar>PATH</envar> environment variable to the charset set in the
environment, if it's different from the UTF-8 charset.</para>

<note><para>Per POSIX, the name of an environment variable should only
consist of valid ASCII characters, and only of uppercase letters, digits, and
the underscore for maximum portability.</para></note>

<para>Symbolic links, too, may pose a problem when switching charsets on
the fly.  A symbolic link contains the filename of the target file the
symlink points to.  When a symlink had been created with older versions
of Cygwin, the current ANSI or OEM character set had been used to store
the target filename, dependent on the old <envar>CYGWIN</envar>
environment variable setting <envar>codepage</envar> (see <xref
linkend="cygwinenv-removed-options"></xref>.  If the target filename
contains non-ASCII characters and you use another character set than
your default ANSI/OEM charset, the target filename of the symlink is now
potentially an invalid character sequence in the new character set.
This behaviour is not different from the behaviour in other Operating
Systems.  So, if you suddenly can't access a symlink anymore which
worked all these years before, maybe it's because you switched to
another character set.  This doesn't occur with symlinks created with
Cygwin 1.7 or later.  </para>

<para>Another problem you might encounter is that older versions of
Windows did not install all charsets by default.  If you are running
Windows XP or older, you can open the "Regional and Language Options"
portion of the Control Panel, select the "Advanced" tab, and select
entries from the "Code page conversion tables" list.  The following
entries are useful to cygwin: 932/SJIS, 936/GBK, 949/EUC-KR, 950/Big5,
20932/EUC-JP.</para>

</sect2>

<sect2 id="setup-locale-charsetlist"><title>List of supported character sets</title>

<para>Last but not least, here's the list of currently supported character
sets.  The left-hand expression is the name of the charset, as you would use
it in the internationalization environment variables as outlined above.
Note that charset specifiers are case-insensitive.  <literal>EUCJP</literal>
is equivalent to <literal>eucJP</literal> or <literal>eUcJp</literal>.
Writing the charset in the exact case as given in the list below is a
good convention, though.
</para>

<para>The right-hand side is the number of the equivalent Windows
codepage as well as the Windows name of the codepage.  They are only
noted here for reference.  Don't try to use the bare codepage number or
the Windows name of the codepage as charset in locale specifiers, unless
they happen to be identical with the left-hand side.  Especially in case
of the "CPxxx" style charsets, always use them with the trailing "CP".</para>

<para>This works:</para>

<screen>
  set LC_ALL=en_US.CP437
</screen>

<para>This does <emphasis>not</emphasis> work:</para>

<screen>
  set LC_ALL=en_US.437
</screen>

<para>You can find a full list of Windows codepages on the Microsoft MSDN page
<ulink url="http://msdn.microsoft.com/en-us/library/dd317756(VS.85).aspx">Code Page Identifiers</ulink>.</para>

<screen>
    Charset               Codepage
    -------------------   -------------------------------------------
    ASCII                 20127 (US_ASCII)

    CP437                   437 (OEM United States)
    CP720                   720 (DOS Arabic)
    CP737                   737 (OEM Greek)
    CP775                   775 (OEM Baltic)
    CP850                   850 (OEM Latin 1, Western European)
    CP852                   852 (OEM Latin 2, Central European)
    CP855                   855 (OEM Cyrillic)
    CP857                   857 (OEM Turkish)
    CP858                   858 (OEM Latin 1 + Euro Symbol)
    CP862                   862 (OEM Hebrew)
    CP866                   866 (OEM Russian)
    CP874                   874 (ANSI/OEM Thai)
    CP932		    932 (Shift_JIS, not exactly identical to SJIS)
    CP1125                 1125 (OEM Ukraine)
    CP1250                 1250 (ANSI Central European)
    CP1251                 1251 (ANSI Cyrillic)
    CP1252                 1252 (ANSI Latin 1, Western European)
    CP1253                 1253 (ANSI Greek)
    CP1254                 1254 (ANSI Turkish)
    CP1255                 1255 (ANSI Hebrew)
    CP1256                 1256 (ANSI Arabic)
    CP1257                 1257 (ANSI Baltic)
    CP1258                 1258 (ANSI/OEM Vietnamese)

    ISO-8859-1            28591 (ISO-8859-1)
    ISO-8859-2            28592 (ISO-8859-2)
    ISO-8859-3            28593 (ISO-8859-3)
    ISO-8859-4            28594 (ISO-8859-4)
    ISO-8859-5            28595 (ISO-8859-5)
    ISO-8859-6            28596 (ISO-8859-6)
    ISO-8859-7            28597 (ISO-8859-7)
    ISO-8859-8            28598 (ISO-8859-8)
    ISO-8859-9            28599 (ISO-8859-9)
    ISO-8859-10             -   (not available)
    ISO-8859-11             -   (not available)
    ISO-8859-13           28603 (ISO-8859-13)
    ISO-8859-14             -   (not available)
    ISO-8859-15           28605 (ISO-8859-15)
    ISO-8859-16             -   (not available)

    Big5                    950 (ANSI/OEM Traditional Chinese)
    EUCCN or euc-CN         936 (ANSI/OEM Simplified Chinese)
    EUCJP or euc-JP       20932 (EUC Japanese)
    EUCKR or euc-KR         949 (EUC Korean)
    GB2312                  936 (ANSI/OEM Simplified Chinese)
    GBK                     936 (ANSI/OEM Simplified Chinese)
    GEORGIAN-PS             -   (not available)
    KOI8-R                20866 (KOI8-R Russian Cyrillic)
    KOI8-U                21866 (KOI8-U Ukrainian Cyrillic)
    PT154                   -   (not available)
    SJIS                    -   (not available, almost, but not exactly CP932)
    TIS620 or TIS-620       874 (ANSI/OEM Thai)

    UTF-8 or utf8         65001 (UTF-8)
</screen>

</sect2>

</sect1>

<sect1 id="setup-files"><title>Customizing bash</title>

<para>
To set up bash so that cut and paste work properly, click on the
"Properties" button of the window, then on the "Misc" tab.  Make sure
that "QuickEdit mode" and "Insert mode" are checked.  These settings
will be remembered next time you run bash from that shortcut. Similarly
you can set the working directory inside the "Program" tab. The entry
"%HOME%" is valid, but requires that you set <envar>HOME</envar> in
the Windows environment.
</para>

<para>
Your home directory should contain three initialization files
that control the behavior of bash.  They are
<filename>.profile</filename>, <filename>.bashrc</filename> and
<filename>.inputrc</filename>.  The Cygwin base installation creates
stub files when you start bash for the first time.</para>

<para>
<filename>.profile</filename> (other names are also valid, see the bash man
page) contains bash commands.  It is executed when bash is started as login
shell, e.g. from the command <command>bash --login</command>.
This is a useful place to define and
export environment variables and bash functions that will be used by bash
and the programs invoked by bash.  It is a good place to redefine
<envar>PATH</envar> if needed.  We recommend adding a ":." to the end of
<envar>PATH</envar> to also search the current working directory (contrary
to DOS, the local directory is not searched by default).  Also to avoid
delays you should either <command>unset</command> <envar>MAILCHECK</envar> 
or define <envar>MAILPATH</envar> to point to your existing mail inbox.
</para>

<para>
<filename>.bashrc</filename> is similar to
<filename>.profile</filename> but is executed each time an interactive
bash shell is launched.  It serves to define elements that are not
inherited through the environment, such as aliases. If you do not use
login shells, you may want to put the contents of
<filename>.profile</filename> as discussed above in this file
instead.
</para>

<para>
<screen>
shopt -s nocaseglob
</screen>
will allow bash to glob filenames in a case-insensitive manner.
Note that <filename>.bashrc</filename> is not called automatically for login 
shells. You can source it from <filename>.profile</filename>.
</para>

<para>
<filename>.inputrc</filename> controls how programs using the readline
library (including <command>bash</command>) behave.  It is loaded
automatically.  For full details see the <literal>Function and Variable
Index</literal> section of the GNU <systemitem>readline</systemitem> manual.
Consider the following settings:
<screen>
# Ignore case while completing
set completion-ignore-case on
# Make Bash 8bit clean
set meta-flag on
set convert-meta off
set output-meta on
</screen>
The first command makes filename completion case insensitive, which can
be convenient in a Windows environment.  The next three commands allow
<command>bash</command> to display 8-bit characters, useful for
languages with accented characters.  Note that tools that do not use
<systemitem>readline</systemitem> for display, such as
<command>less</command> and <command>ls</command>, require additional
settings, which could be put in your <filename>.bashrc</filename>:
<screen>
alias less='/bin/less -r'
alias ls='/bin/ls -F --color=tty --show-control-chars'
</screen>
</para>

</sect1>