Notes on GCC's Native Language Support

GCC's Native Language Support (NLS) is relatively new and
experimental, so NLS is currently disabled by default. Use
configure's --enable-nls option to enable it. Eventually, NLS will be
enabled by default, and you'll need --disable-nls to disable it. You
must enable NLS in order to make a GCC distribution.

By and large, only diagnostic messages have been internationalized.
Some work remains in other areas; for example, GCC does not yet allow
non-ASCII letters in identifiers.

Not all of GCC's diagnostic messages have been internationalized.
Programs like `enquire' and `genattr' are not internationalized, as
their users are GCC maintainers who typically need to be able to read
English anyway; internationalizing them would thus entail needless
work for the human translators. And no one has yet gotten around to
internationalizing the messages in the C++ compiler, or in the
specialized MIPS-specific programs mips-tdump and mips-tfile.

The GCC library should not contain any messages that need
internationalization, because it operates below the
internationalization library.

Currently, the only language translation supplied is en_UK (British
English).

Unlike some other GNU programs, the GCC sources contain few instances
of explicit translation calls like _("string"). Instead, the
diagnostic printing routines automatically translate their arguments.
For example, GCC source code should not contain calls like `error
(_("unterminated comment"))'; it should contain calls like `error
("unterminated comment")' instead, as it is the `error' function's
responsibility to translate the message before the user sees it.

By convention, any function parameter in the GCC sources whose name
ends in `msgid' is expected to be a message requiring translation.
For example, the `error' function's first parameter is named `msgid'.
GCC's exgettext script uses this convention to determine which
function parameter strings need to be translated.
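
The convention described above can be sketched in a few lines of C.
This is only an illustration under stated assumptions, not GCC's
actual implementation: `translate_msgid', `format_error', and the
one-entry catalog are hypothetical, and a real build would look the
message up through the internationalization library instead of a
hard-coded table.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the message catalog lookup.  The single
   entry below is made up for illustration.  */
const char *
translate_msgid (const char *msgid)
{
  static const struct { const char *msgid; const char *msgstr; } catalog[] = {
    { "unrecognized option `%s'", "unrecognised option `%s'" },
  };
  size_t i;

  assert (msgid != NULL);
  for (i = 0; i < sizeof catalog / sizeof catalog[0]; i++)
    if (strcmp (catalog[i].msgid, msgid) == 0)
      return catalog[i].msgstr;

  /* Messages without a translation pass through unchanged.  */
  return msgid;
}

/* Hypothetical diagnostic routine.  Its first parameter is named
   `msgid', so by the convention above callers pass the untranslated
   string and the routine translates it before formatting.  The
   result is written to BUF instead of stderr to keep the sketch
   easy to check.  */
int
format_error (char *buf, size_t size, const char *msgid, ...)
{
  va_list ap;
  int n;

  va_start (ap, msgid);
  n = vsnprintf (buf, size, translate_msgid (msgid), ap);
  va_end (ap);
  return n;
}
```

A caller would write format_error (buf, sizeof buf, "unrecognized
option `%s'", "-x") and never wrap the literal in _(); the
translation happens inside the routine.
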
The exgettext script also assumes that any occurrence of `%eMSGID}'
on a source line, where MSGID does not contain `%' or `}',
corresponds to a message MSGID that requires translation; this is
needed to identify diagnostics in GCC spec strings.

If you enable NLS and modify source files, you'll need to use a
special version of the GNU gettext package to propagate the
modifications to the translation tables. Apply the following patch
(use `patch -p0') to GNU gettext 0.10.35, which you can retrieve from:

ftp://alpha.gnu.org/gnu/gettext-0.10.35.tar.gz

This patch has been submitted to the GNU gettext maintainer, so
eventually we shouldn't need this special gettext version.

This patch is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.

This patch is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this patch; see the file COPYING. If not, write to
the Free Software Foundation, 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.

1998-07-26  Paul Eggert

        * po/Makefile.in.in (maintainer-clean): Remove cat-id-tbl.c and
        stamp-cat-id.

1998-07-24  Paul Eggert

        * po/Makefile.in.in (cat-id-tbl.o): Depend on
        $(top_srcdir)/intl/libgettext.h, not ../intl/libgettext.h.

1998-07-20  Paul Eggert

        * po/Makefile.in.in (.po.pox, all-yes, $(srcdir)/cat-id-tbl.c,
        $(srcdir)/stamp-cat-id, update-po): Prepend `$(srcdir)/' to
        files built in the source directory; this is needed for
        VPATH-based make in Solaris 2.6.

1998-07-17  Paul Eggert

        Add support for user-specified argument numbers for keywords.
        Extract all strings from a keyword arg, not just the first one.
        Handle parenthesized commas inside keyword args correctly.
        Warn about nested keywords.

        * doc/gettext.texi: Document --keyword=id:argnum.

        * src/xgettext.c (scan_c_file):
        Warn about nested keywords, e.g. _(_("xxx")).
        Warn also about not-yet-implemented but allowed nesting, e.g.
        dcgettext(..._("xxx")..., "yyy").
        Get all strings in a keyword arg, not just the first one.
        Handle parenthesized commas inside keyword args correctly.

        * src/xget-lex.h (enum xgettext_token_type_ty):
        Replace xgettext_token_type_keyword1 and
        xgettext_token_type_keyword2 with just plain
        xgettext_token_type_keyword; it now has argnum value.
        Add xgettext_token_type_rp.
        (struct xgettext_token_ty): Add argnum member.
        line_number and file_name are now also set for
        xgettext_token_type_keyword.
        (xgettext_lex_keyword): Arg is const char *.

        * src/xget-lex.c: Include "hash.h".
        (enum token_type_ty): Add token_type_rp.
        (keywords): Now a hash table.
        (phase5_get): Return token_type_rp for ')'.
        (xgettext_lex, xgettext_lex_keyword): Add support for keyword argnums.
        (xgettext_lex): Return xgettext_token_type_rp for ')'.
        Report keyword argnum, line number, and file name back to caller.

1998-07-09  Paul Eggert

        * intl/Makefile.in (uninstall): Do nothing unless $(PACKAGE) is gettext.

===================================================================
RCS file: doc/gettext.texi,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.1
diff -pu -r0.10.35.0 -r0.10.35.1
--- doc/gettext.texi	1998/05/01 05:53:32	0.10.35.0
+++ doc/gettext.texi	1998/07/18 00:25:15	0.10.35.1
@@ -1854,13 +1854,19 @@ List of directories searched for input f
 Join messages with existing file.
 
 @item -k @var{word}
-@itemx --keyword[=@var{word}]
-Additonal keyword to be looked for (without @var{word} means not to
+@itemx --keyword[=@var{keywordspec}]
+Additonal keyword to be looked for (without @var{keywordspec} means not to
 use default keywords).
 
-The default keywords, which are always looked for if not explicitly
-disabled, are @code{gettext}, @code{dgettext}, @code{dcgettext} and
-@code{gettext_noop}.
+If @var{keywordspec} is a C identifer @var{id}, @code{xgettext} looks
+for strings in the first argument of each call to the function or macro
+@var{id}. If @var{keywordspec} is of the form
+@samp{@var{id}:@var{argnum}}, @code{xgettext} looks for strings in the
+@var{argnum}th argument of the call.
+
+The default keyword specifications, which are always looked for if not
+explicitly disabled, are @code{gettext}, @code{dgettext:2},
+@code{dcgettext:2} and @code{gettext_noop}.
 
 @item -m [@var{string}]
 @itemx --msgstr-prefix[=@var{string}]
===================================================================
RCS file: intl/Makefile.in,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.1
diff -pu -r0.10.35.0 -r0.10.35.1
--- intl/Makefile.in	1998/04/27 21:53:18	0.10.35.0
+++ intl/Makefile.in	1998/07/09 21:39:18	0.10.35.1
@@ -143,10 +143,14 @@ install-data: all
 installcheck:
 
 uninstall:
-	dists="$(DISTFILES.common)"; \
-	for file in $$dists; do \
-	  rm -f $(gettextsrcdir)/$$file; \
-	done
+	if test "$(PACKAGE)" = "gettext"; then \
+	  dists="$(DISTFILES.common)"; \
+	  for file in $$dists; do \
+	    rm -f $(gettextsrcdir)/$$file; \
+	  done
+	else \
+	  : ; \
+	fi
 
 info dvi:
 
===================================================================
RCS file: src/xget-lex.c,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.1
diff -pu -r0.10.35.0 -r0.10.35.1
--- src/xget-lex.c	1998/07/09 22:49:48	0.10.35.0
+++ src/xget-lex.c	1998/07/18 00:25:15	0.10.35.1
@@ -33,6 +33,7 @@
 #include "error.h"
 #include "system.h"
 #include "libgettext.h"
+#include "hash.h"
 #include "str-list.h"
 #include "xget-lex.h"
 
@@ -83,6 +84,7 @@ enum token_type_ty
   token_type_eoln,
   token_type_hash,
   token_type_lp,
+  token_type_rp,
   token_type_comma,
   token_type_name,
   token_type_number,
@@ -109,7 +111,7 @@ static FILE *fp;
 static int trigraphs;
 static int cplusplus_comments;
 static string_list_ty *comment;
-static string_list_ty *keywords;
+static hash_table keywords;
 static int default_keywords = 1;
 
 /* These are for tracking whether comments count as immediately before
@@ -941,6 +943,10 @@ phase5_get (tp)
       tp->type = token_type_lp;
       return;
 
+    case ')':
+      tp->type = token_type_rp;
+      return;
+
     case ',':
       tp->type = token_type_comma;
       return;
@@ -1179,6 +1185,7 @@ xgettext_lex (tp)
   while (1)
     {
       token_ty token;
+      void *keyword_value;
 
       phase8_get (&token);
       switch (token.type)
@@ -1213,17 +1220,20 @@ xgettext_lex (tp)
           if (default_keywords)
             {
               xgettext_lex_keyword ("gettext");
-              xgettext_lex_keyword ("dgettext");
-              xgettext_lex_keyword ("dcgettext");
+              xgettext_lex_keyword ("dgettext:2");
+              xgettext_lex_keyword ("dcgettext:2");
               xgettext_lex_keyword ("gettext_noop");
               default_keywords = 0;
             }
 
-          if (string_list_member (keywords, token.string))
-            {
-              tp->type = (strcmp (token.string, "dgettext") == 0
-                          || strcmp (token.string, "dcgettext") == 0)
-                ? xgettext_token_type_keyword2 : xgettext_token_type_keyword1;
+          if (find_entry (&keywords, token.string, strlen (token.string),
+                          &keyword_value)
+              == 0)
+            {
+              tp->type = xgettext_token_type_keyword;
+              tp->argnum = (int) keyword_value;
+              tp->line_number = token.line_number;
+              tp->file_name = logical_file_name;
             }
           else
             tp->type = xgettext_token_type_symbol;
@@ -1236,6 +1246,12 @@ xgettext_lex (tp)
           tp->type = xgettext_token_type_lp;
           return;
 
+        case token_type_rp:
+          last_non_comment_line = newline_count;
+
+          tp->type = xgettext_token_type_rp;
+          return;
+
         case token_type_comma:
           last_non_comment_line = newline_count;
 
@@ -1263,16 +1279,32 @@ xgettext_lex (tp)
 
 void
 xgettext_lex_keyword (name)
-     char *name;
+     const char *name;
 {
   if (name == NULL)
     default_keywords = 0;
   else
     {
-      if (keywords == NULL)
-        keywords = string_list_alloc ();
+      int argnum;
+      size_t len;
+      const char *sp;
+
+      if (keywords.table == NULL)
+        init_hash (&keywords, 100);
+
+      sp = strchr (name, ':');
+      if (sp)
+        {
+          len = sp - name;
+          argnum = atoi (sp + 1);
+        }
+      else
+        {
+          len = strlen (name);
+          argnum = 1;
+        }
 
-      string_list_append_unique (keywords, name);
+      insert_entry (&keywords, name, len, (void *) argnum);
     }
 }
 
===================================================================
RCS file: src/xget-lex.h,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.1
diff -pu -r0.10.35.0 -r0.10.35.1
--- src/xget-lex.h	1998/07/09 22:49:48	0.10.35.0
+++ src/xget-lex.h	1998/07/18 00:25:15	0.10.35.1
@@ -23,9 +23,9 @@ Foundation, Inc., 59 Temple Place - Suit
 enum xgettext_token_type_ty
 {
   xgettext_token_type_eof,
-  xgettext_token_type_keyword1,
-  xgettext_token_type_keyword2,
+  xgettext_token_type_keyword,
   xgettext_token_type_lp,
+  xgettext_token_type_rp,
   xgettext_token_type_comma,
   xgettext_token_type_string_literal,
   xgettext_token_type_symbol
@@ -37,8 +37,14 @@ struct xgettext_token_ty
 {
   xgettext_token_type_ty type;
 
-  /* These 3 are only set for xgettext_token_type_string_literal. */
+  /* This 1 is set only for xgettext_token_type_keyword. */
+  int argnum;
+
+  /* This 1 is set only for xgettext_token_type_string_literal. */
   char *string;
+
+  /* These 2 are set only for xgettext_token_type_keyword and
+     xgettext_token_type_string_literal. */
   int line_number;
   char *file_name;
 };
@@ -50,7 +56,7 @@ void xgettext_lex PARAMS ((xgettext_toke
 const char *xgettext_lex_comment PARAMS ((size_t __n));
 void xgettext_lex_comment_reset PARAMS ((void));
 /* void xgettext_lex_filepos PARAMS ((char **, int *)); FIXME needed? */
-void xgettext_lex_keyword PARAMS ((char *__name));
+void xgettext_lex_keyword PARAMS ((const char *__name));
 void xgettext_lex_cplusplus PARAMS ((void));
 void xgettext_lex_trigraphs PARAMS ((void));
 
===================================================================
RCS file: src/xgettext.c,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.1
diff -pu -r0.10.35.0 -r0.10.35.1
--- src/xgettext.c	1998/07/09 22:49:48	0.10.35.0
+++ src/xgettext.c	1998/07/18 00:25:15	0.10.35.1
@@ -835,6 +835,8 @@ scan_c_file(filename, mlp, is_cpp_file)
      int is_cpp_file;
 {
   int state;
+  int commas_to_skip; /* defined only when in states 1 and 2 */
+  int paren_nesting; /* defined only when in state 2 */
 
   /* Inform scanner whether we have C++ files or not. */
   if (is_cpp_file)
@@ -854,63 +856,79 @@ scan_c_file(filename, mlp, is_cpp_file)
      {
        xgettext_token_ty token;
 
-       /* A simple state machine is used to do the recognising:
+       /* A state machine is used to do the recognising:
           State 0 = waiting for something to happen
-          State 1 = seen one of our keywords with string in first parameter
-          State 2 = was in state 1 and now saw a left paren
-          State 3 = seen one of our keywords with string in second parameter
-          State 4 = was in state 3 and now saw a left paren
-          State 5 = waiting for comma after being in state 4
-          State 6 = saw comma after being in state 5 */
+          State 1 = seen one of our keywords
+          State 2 = waiting for part of an argument */
        xgettext_lex (&token);
        switch (token.type)
          {
-         case xgettext_token_type_keyword1:
+         case xgettext_token_type_keyword:
+           if (!extract_all && state == 2)
+             {
+               if (commas_to_skip == 0)
+                 {
+                   error (0, 0,
+                          _("%s:%d: warning: keyword nested in keyword arg"),
+                          token.file_name, token.line_number);
+                   continue;
+                 }
+
+               /* Here we should nest properly, but this would require a
+                  potentially unbounded stack. We haven't run across an
+                  example that needs this functionality yet. For now,
+                  we punt and forget the outer keyword. */
+               error (0, 0,
+                      _("%s:%d: warning: keyword between outer keyword and its arg"),
+                      token.file_name, token.line_number);
+             }
+           commas_to_skip = token.argnum - 1;
            state = 1;
            continue;
 
-         case xgettext_token_type_keyword2:
-           state = 3;
-           continue;
-
          case xgettext_token_type_lp:
            switch (state)
              {
              case 1:
+               paren_nesting = 0;
                state = 2;
                break;
-             case 3:
-               state = 4;
+             case 2:
+               paren_nesting++;
                break;
-             default:
-               state = 0;
              }
            continue;
 
+         case xgettext_token_type_rp:
+           if (state == 2 && paren_nesting != 0)
+             paren_nesting--;
+           else
+             state = 0;
+           continue;
+
          case xgettext_token_type_comma:
-           state = state == 5 ? 6 : 0;
+           if (state == 2 && commas_to_skip != 0)
+             commas_to_skip -= paren_nesting == 0;
+           else
+             state = 0;
            continue;
 
          case xgettext_token_type_string_literal:
-           if (extract_all || state == 2 || state == 6)
-             {
-               remember_a_message (mlp, &token);
-               state = 0;
-             }
+           if (extract_all || (state == 2 && commas_to_skip == 0))
+             remember_a_message (mlp, &token);
            else
              {
                free (token.string);
-               state = (state == 4 || state == 5) ? 5 : 0;
+               state = state == 2 ? 2 : 0;
              }
            continue;
 
          case xgettext_token_type_symbol:
-           state = (state == 4 || state == 5) ? 5 : 0;
+           state = state == 2 ? 2 : 0;
            continue;
 
          default:
-           state = 0;
-           continue;
+           abort ();
 
          case xgettext_token_type_eof:
            break;
===================================================================
RCS file: po/Makefile.in.in,v
retrieving revision 0.10.35.0
retrieving revision 0.10.35.5
diff -u -r0.10.35.0 -r0.10.35.5
--- po/Makefile.in.in	1998/07/20 20:20:38	0.10.35.0
+++ po/Makefile.in.in	1998/07/26 09:07:52	0.10.35.5
@@ -62,7 +62,7 @@
 	$(COMPILE) $<
 
 .po.pox:
-	$(MAKE) $(PACKAGE).pot
+	$(MAKE) $(srcdir)/$(PACKAGE).pot
 	$(MSGMERGE) $< $(srcdir)/$(PACKAGE).pot -o $*.pox
 
 .po.mo:
@@ -79,7 +79,7 @@
 
 all: all-@USE_NLS@
 
-all-yes: cat-id-tbl.c $(CATALOGS)
+all-yes: $(srcdir)/cat-id-tbl.c $(CATALOGS)
 all-no:
 
 $(srcdir)/$(PACKAGE).pot: $(POTFILES)
@@ -90,8 +90,8 @@
 	   || ( rm -f $(srcdir)/$(PACKAGE).pot \
 	        && mv $(PACKAGE).po $(srcdir)/$(PACKAGE).pot )
 
-$(srcdir)/cat-id-tbl.c: stamp-cat-id; @:
-$(srcdir)/stamp-cat-id: $(PACKAGE).pot
+$(srcdir)/cat-id-tbl.c: $(srcdir)/stamp-cat-id; @:
+$(srcdir)/stamp-cat-id: $(srcdir)/$(PACKAGE).pot
 	rm -f cat-id-tbl.tmp
 	sed -f ../intl/po2tbl.sed $(srcdir)/$(PACKAGE).pot \
 	  | sed -e "s/@PACKAGE NAME@/$(PACKAGE)/" > cat-id-tbl.tmp
@@ -180,7 +180,8 @@
 
 check: all
 
-cat-id-tbl.o: ../intl/libgettext.h
+cat-id-tbl.o: $(srcdir)/cat-id-tbl.c $(top_srcdir)/intl/libgettext.h
+	$(COMPILE) $(srcdir)/cat-id-tbl.c
 
 dvi info tags TAGS ID:
 
@@ -196,7 +197,7 @@
 maintainer-clean: distclean
 	@echo "This command is intended for maintainers to use;"
 	@echo "it deletes files that may require special tools to rebuild."
-	rm -f $(GMOFILES)
+	rm -f $(GMOFILES) cat-id-tbl.c stamp-cat-id
 
 distdir = ../$(PACKAGE)-$(VERSION)/$(subdir)
 dist distdir: update-po $(DISTFILES)
@@ -207,7 +208,7 @@
 	done
 
 update-po: Makefile
-	$(MAKE) $(PACKAGE).pot
+	$(MAKE) $(srcdir)/$(PACKAGE).pot
 	PATH=`pwd`/../src:$$PATH; \
 	cd $(srcdir); \
 	catalogs='$(CATALOGS)'; \
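
As an illustration of the `id:argnum' keyword specification that the
patch above adds to xgettext, the splitting step can be sketched as a
small standalone C function. This mirrors the logic of the patched
xgettext_lex_keyword but is not part of the patch: `struct
keyword_spec' and `parse_keyword_spec' are hypothetical names, and
the sketch returns the result instead of storing it in xgettext's
hash table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch only.  */
struct keyword_spec
{
  size_t name_len;  /* length of the identifier part of the spec */
  int argnum;       /* 1-based index of the argument holding the string */
};

/* Split SPEC at the first ':'.  Without a ':' the whole spec is the
   identifier and the string is expected in the first argument.  As in
   the patch, the number is read with atoi and is not validated.  */
struct keyword_spec
parse_keyword_spec (const char *spec)
{
  struct keyword_spec result;
  const char *colon;

  assert (spec != NULL);
  colon = strchr (spec, ':');
  if (colon)
    {
      result.name_len = (size_t) (colon - spec);
      result.argnum = atoi (colon + 1);
    }
  else
    {
      result.name_len = strlen (spec);
      result.argnum = 1;
    }
  return result;
}
```

With this sketch, "dcgettext:2" yields an identifier of length 9 and
argument number 2, while a bare "gettext" yields length 7 and
argument number 1, matching the default keyword specifications listed
in the patched documentation.
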