From: Patrice Dumas
Date: Fri, 2 Feb 2024 03:45:57 -0500 (EST)

branch: master
commit d19fcb4aa877352562cd6131bc1297da6ff81fa3
Author: Patrice Dumas <pertusus@free.fr>
AuthorDate: Fri Feb 2 09:45:45 2024 +0100

    No encoding, no us-ascii added to locale for document translations
    
    * NEWS, doc/texinfo.texi (Internationalization of Document Strings),
    tp/Texinfo/Translations.pm (translate_string),
    tp/Texinfo/XS/main/translations.c (translate_string): do not append
    the encoding or us-ascii to the locale name used to find the
    translations.  The encoding was never appended in XS, has long been
    unnecessary in Perl, and was never available when translating from a
    parser.  Note that this does not constrain the actual encoding
    declared in the po/mo file (which could be us-ascii), and it does
    not prevent the use of accented @-commands in translations (which
    would be needed for accented letters if the encoding is us-ascii).
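
    As a concrete illustration of the simplified lookup, here is a minimal
    Perl sketch (not the actual translate_string code; the 'pt_BR' language
    value is hypothetical) of how the locale search list is now built with
    the Locale::Messages interface already used in Translations.pm:

        # Sketch only: build the LANGUAGE search list without any
        # encoding or us-ascii suffix.
        use Locale::Messages ();

        my $lang = 'pt_BR';        # assumed @documentlanguage value
        my @langs = ($lang);
        # A language of the form ll_CC is also tried as plain ll.
        if ($lang =~ /^([a-z]+)_([A-Z]+)/) {
          push @langs, $1;
        }
        # Previously "$language.$encoding" and "$language.us-ascii"
        # variants were appended here as well; now only the bare
        # language codes remain.
        my $locales = join(':', @langs);              # "pt_BR:pt"
        Locale::Messages::nl_putenv("LANGUAGE=$locales");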
---
 ChangeLog                         | 15 +++++++++
 NEWS                              |  2 ++
 doc/texinfo.texi                  | 19 ------------
 tp/Texinfo/Translations.pm        | 64 ++-------------------------------------
 tp/Texinfo/XS/main/translations.c | 41 ++-----------------------
 5 files changed, 23 insertions(+), 118 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 3f92168a38..273bc613b2 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,18 @@
+2024-02-02  Patrice Dumas  <pertusus@free.fr>
+
+       No encoding, no us-ascii added to locale for document translations
+
+       * NEWS, doc/texinfo.texi (Internationalization of Document Strings),
+       tp/Texinfo/Translations.pm (translate_string),
+       tp/Texinfo/XS/main/translations.c (translate_string): do not append
+       the encoding or us-ascii to the locale name used to find the
+       translations.  The encoding was never appended in XS, has long been
+       unnecessary in Perl, and was never available when translating from a
+       parser.  Note that this does not constrain the actual encoding
+       declared in the po/mo file (which could be us-ascii), and it does
+       not prevent the use of accented @-commands in translations (which
+       would be needed for accented letters if the encoding is us-ascii).
+
 2024-02-02  Patrice Dumas  <pertusus@free.fr>
 
        * tp/tests/Makefile.onetst, tp/tests/coverage/list-of-tests: disable
diff --git a/NEWS b/NEWS
index d3ade883b4..2a763f600c 100644
--- a/NEWS
+++ b/NEWS
@@ -23,6 +23,8 @@ See the manual for detailed information.
  . set CHECK_NORMAL_MENU_STRUCTURE in the default case.
  . some unused translation files have been removed for the
    "texinfo_document" domain
+ . the us-ascii encoding is no longer tried when looking for a document
+   output string locale.  Accent @-commands can still be used in translations.
  . HTML, Texinfo and raw text output:
    An implementation of the conversion in C has been included.  Set the
     `TEXINFO_XS_CONVERT' environment variable to 1 to use.
diff --git a/doc/texinfo.texi b/doc/texinfo.texi
index 0c7b379d3b..f865e8fcbe 100644
--- a/doc/texinfo.texi
+++ b/doc/texinfo.texi
@@ -16620,28 +16620,9 @@ The expansion of a translation string is done like this:
 @item First, the string is translated.  The locale
 is @var{documentlanguage}@code{.}@var{documentencoding}.
 
-@cindex @code{us-ascii} encoding, and translations
 If the @var{documentlanguage} has the form @samp{ll_CC}, that is
 tried first, and then just @samp{ll}.
 
-To cope with the possibility of having multiple encodings, a
-special use of the @code{us-ascii} locale encoding is also possible.
-If the @samp{ll} locale in the current encoding does not exist, and the
-encoding is not @code{us-ascii}, then @code{us-ascii} is tried.
-
-The idea is that if there is a @code{us-ascii} encoding, it means that
-all the characters in the charset may be expressed as @@-commands.
-For example, there is a @code{fr.us-ascii} locale that can accommodate
-any encoding, since all the Latin@tie{}1 characters have associated
-@@-commands.  On the other hand, Japanese has only a translation
-@code{ja.utf-8}, since there are no @@-commands for Japanese
-characters.
-
-The @code{us-ascii} locales are not needed much now that
-UTF-8 is used for most documents.  Note that accented characters
-are required to be expressed as @@-commands in the @code{us-ascii} locales,
-which may be inconvenient for translators.
-
 @item Next, in most cases, the string is expanded as Texinfo, and converted.
 The arguments are substituted; for example, @samp{@{arg_name@}} is replaced by
 the corresponding actual argument.
diff --git a/tp/Texinfo/Translations.pm b/tp/Texinfo/Translations.pm
index cda40c1d60..f5deeb1448 100644
--- a/tp/Texinfo/Translations.pm
+++ b/tp/Texinfo/Translations.pm
@@ -99,8 +99,7 @@ sub configure($;$)
 }
 
 # libintl converts between encodings but doesn't decode them into the
-# perl internal format.  This is only called if the encoding is a proper
-# perl encoding.
+# perl internal format.
 sub _decode_i18n_string($$)
 {
   my $string = shift;
@@ -190,25 +189,6 @@ sub translate_string($$;$$)
 
   Locale::Messages::textdomain($strings_textdomain);
 
-  # FIXME do this only once when encoding is seen (or at beginning)
-  # instead of here, each time that gdt is called?
-  my $encoding;
-  #my $perl_encoding;
-  if ($customization_information) {
-    # NOTE the following customization variables are not set for
-    # a Parser, so the encoding will be undef when gdt is called from
-    # parsers.
-    if ($customization_information->get_conf('OUTPUT_ENCODING_NAME')) {
-      $encoding = $customization_information->get_conf('OUTPUT_ENCODING_NAME');
-    }
-    #if (defined($customization_information->get_conf('OUTPUT_PERL_ENCODING'))) {
-    #  $perl_encoding = $customization_information->get_conf('OUTPUT_PERL_ENCODING');
-    #}
-  } else {
-    # NOTE never happens in the tests, unlikely to happen at all.
-    $encoding = $DEFAULT_ENCODING;
-    #$perl_encoding = $DEFAULT_PERL_ENCODING;
-  }
   Locale::Messages::bind_textdomain_codeset($strings_textdomain, 'UTF-8');
   Locale::Messages::bind_textdomain_filter($strings_textdomain,
                           \&_decode_i18n_string, 'UTF-8');
@@ -219,16 +199,8 @@ sub translate_string($$;$$)
   # with UTF-8.  If there are actually characters that cannot be encoded in the
   # output encoding, issues will still show up when encoding to output, though.
   # Should be more similar to the code used in XS modules, too.
-  # As a side note, the best would have been to directly decode using the
+  # As a side note, the best could have been to directly decode using the
   # charset used in the po/gmo files, but it does not seem to be available.
-  #Locale::Messages::bind_textdomain_codeset($strings_textdomain, $encoding)
-  #  if (defined($encoding) and $encoding ne 'us-ascii');
-  #if (!($encoding and $encoding eq 'us-ascii')) {
-  #  if (defined($perl_encoding)) {
-  #    Locale::Messages::bind_textdomain_filter($strings_textdomain,
-  #      \&_decode_i18n_string, $perl_encoding);
-  #  }
-  #}
 
   my @langs = ($lang);
   if ($lang =~ /^([a-z]+)_([A-Z]+)/) {
@@ -237,31 +209,7 @@ sub translate_string($$;$$)
     push @langs, $main_lang;
   }
 
-  my @locales;
-  foreach my $language (@langs) {
-    # NOTE the locale file with appended encoding are searched for, but if
-    # not found, files with stripped encoding are searched for too:
-    # https://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html
-    if (defined($encoding)) {
-      push @locales, "$language.$encoding";
-    } else {
-      push @locales, $language;
-    }
-    # also try us-ascii, the charset should be compatible with other
-    # charset, and should resort to @-commands if needed for non
-    # ascii characters
-    # REMARK this is not necessarily true for every language/encoding.
-    # This can be true for latin1, and maybe some other 8 bit encodings
-    # with accents available as @-commands, but not for most
-    # language.  However, for those languages, it is unlikely that
-    # the locale with .us-ascii are set, so it should not hurt
-    # to add this possibility.
-    if (!$encoding or ($encoding and $encoding ne 'us-ascii')) {
-      push @locales, "$language.us-ascii";
-    }
-  }
-
-  my $locales = join(':', @locales);
+  my $locales = join(':', @langs);
 
   Locale::Messages::nl_putenv("LANGUAGE=$locales");
 
@@ -394,12 +342,6 @@ sub replace_convert_substrings($$;$)
       }
       $parser_conf->{'DEBUG'} = $debug_level;
     }
-    #foreach my $conf_variable () {
-    #  if (defined($customization_information->get_conf($conf_variable))) {
-    #    $parser_conf->{$conf_variable}
-    #      = $customization_information->get_conf($conf_variable);
-    #  }
-    #}
   }
   my $parser = Texinfo::Parser::simple_parser($parser_conf);
 
diff --git a/tp/Texinfo/XS/main/translations.c b/tp/Texinfo/XS/main/translations.c
index ccda036b8f..a36a275a4b 100644
--- a/tp/Texinfo/XS/main/translations.c
+++ b/tp/Texinfo/XS/main/translations.c
@@ -156,7 +156,6 @@ translate_string (OPTIONS *options, const char * string,
   char *saved_LANGUAGE;
   char *saved_LANG;
   char *saved_LC_MESSAGES;
-  char *encoding = 0;
   char *langs[2] = {0, 0};
   char *main_lang = 0;
   char *translated_string;
@@ -264,31 +263,9 @@ translate_string (OPTIONS *options, const char * string,
       if (i > 0)
         text_append_n (&language_locales, ":", 1);
       text_append (&language_locales, langs[i]);
-      if (encoding)
-        {
-          text_append_n (&language_locales, ".", 1);
-          text_append (&language_locales, encoding);
-        }
-    /*
-      also try us-ascii, the charset should be compatible with other
-      charset, and should resort to @-commands if needed for non
-      ascii characters
-      REMARK this is not necessarily true for every language/encoding.
-      This can be true for latin1, and maybe some other 8 bit encodings
-      with accents available as @-commands, but not for most
-      language.  However, for those languages, it is unlikely that
-      the locale with .us-ascii are set, so it should not hurt
-      to add this possibility.
-     */
-      if (!encoding || !strcmp (encoding, "us-ascii"))
-        {
-          text_append_n (&language_locales, ":", 1);
-          text_append (&language_locales, langs[i]);
-          text_append_n (&language_locales, ".", 1);
-          text_append (&language_locales, "us-ascii");
-        }
       free (langs[i]);
     }
+
   if (setenv ("LANGUAGE", language_locales.text, 1) != 0)
     {
       fprintf (stderr, "gdt: setenv `%s' error for string `%s': %s\n",
@@ -492,21 +469,9 @@ replace_convert_substrings (OPTIONS *options, char *translated_string,
    */
   parser_set_accept_internalvalue (1);
 
-  /* TODO implement setting configuration.  This may not be needed when
+  /* TODO implement setting DEBUG.  This may not be needed when
      called from a parser without reset_parser being called, but could be
-     when called from a converter.  As long as only DEBUG is passed
-     this is not really problematic. */
-  /*
-  # general customization relevant for parser
-  if ($customization_information) {
-    foreach my $conf_variable ('DEBUG') {
-      if (defined($customization_information->get_conf($conf_variable))) {
-        $parser_conf->{$conf_variable}
-          = $customization_information->get_conf($conf_variable);
-      }
-    }
-  }
-   */
+     when called from a converter. */
   document_descriptor = parse_string (texinfo_line, 1);
 
   /* FIXME if called from parser through complete_indices, options will
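
The removal works because the message catalogs are recoded to UTF-8 and then
decoded into Perl's internal representation, so no encoding ever needs to
appear in the locale name itself.  The following minimal sketch mirrors the
bind_textdomain_codeset and bind_textdomain_filter calls kept in
Translations.pm; the 'texinfo_document' domain name comes from NEWS, while
the filter and the final dgettext call are illustrative stand-ins, not the
actual code:

    # Sketch only: have libintl recode the catalog to UTF-8 and decode
    # the result to Perl's internal string format.
    use Encode ();
    use Locale::Messages ();

    my $strings_textdomain = 'texinfo_document';

    sub decode_i18n_string {
      my ($string, $encoding) = @_;
      return Encode::decode($encoding, $string);
    }

    Locale::Messages::textdomain($strings_textdomain);
    Locale::Messages::bind_textdomain_codeset($strings_textdomain, 'UTF-8');
    Locale::Messages::bind_textdomain_filter($strings_textdomain,
                                             \&decode_i18n_string, 'UTF-8');
    # Translations may still contain accent @-commands such as @'e;
    # they pass through unchanged and are later expanded as Texinfo.
    my $translated = Locale::Messages::dgettext($strings_textdomain,
                                                'Table of Contents');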


