texinfo-commits
[no subject]


From: Patrice Dumas
Date: Sun, 4 Feb 2024 17:35:48 -0500 (EST)

branch: master
commit 4a8cbae86777647a4dbf236c7a781a08c85c8a8a
Author: Patrice Dumas <pertusus@free.fr>
AuthorDate: Sun Feb 4 23:35:39 2024 +0100

    Add COLLATION_LANGUAGE for linguistic tailoring of indices sorting
    
    * tp/Texinfo/options_data.txt (COLLATION_LANGUAGE),
    tp/Texinfo/Indices.pm (setup_sortable_index_entries), doc/texinfo.texi
    (Other Customization Variables): add COLLATION_LANGUAGE to set
    linguistic tailoring for index sorting.  Not set in the default case.
    Requires Unicode::Collate::Locale to be effective.
    For a discussion about this customization option, see bug-texinfo, 4
    Feb 2024.
---
 ChangeLog                   | 12 ++++++++++++
 doc/texinfo.texi            | 14 ++++++++++++++
 tp/Texinfo/Indices.pm       | 35 +++++++++++++++++++++--------------
 tp/Texinfo/options_data.txt |  1 +
 4 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index e4dc87d106..54ddce2dd8 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,15 @@
+2024-02-04  Patrice Dumas  <pertusus@free.fr>
+
+       Add COLLATION_LANGUAGE for linguistic tailoring of indices sorting
+
+       * tp/Texinfo/options_data.txt (COLLATION_LANGUAGE),
+       tp/Texinfo/Indices.pm (setup_sortable_index_entries), doc/texinfo.texi
+       (Other Customization Variables): add COLLATION_LANGUAGE to set
+       linguistic tailoring for index sorting.  Not set in the default case.
+       Requires Unicode::Collate::Locale to be effective.
+       For a discussion about this customization option, see bug-texinfo, 4
+       Feb 2024.
+
 2024-02-04  Patrice Dumas  <pertusus@free.fr>
 
        * tp/Texinfo/Common.pm (%text_brace_no_arg_commands),
diff --git a/doc/texinfo.texi b/doc/texinfo.texi
index f865e8fcbe..87ad45f323 100644
--- a/doc/texinfo.texi
+++ b/doc/texinfo.texi
@@ -16157,6 +16157,20 @@ character.  The default for Info is set the same as for
 @code{OPEN_DOUBLE_QUOTE_SYMBOL}, except that the Unicode code is a closing
 double quote (see below).
 
+@item COLLATION_LANGUAGE
+By default, @command{texi2any} sorts document indices according to the
+@dfn{Unicode Collation Algorithm} (defined in
+@uref{http://www.unicode.org/reports/tr10/, Unicode Technical Standard #10}),
+without language-specific collation tailoring.  If set, the given language
+is used for linguistic tailoring of index sorting.
+
+If there is no support for linguistic tailoring because of missing software or
+missing support for the specified language, revert silently to the default.  In
+Perl, the @code{Unicode::Collate::Locale} module is used for linguistic
+tailoring, therefore if this module is not installed the variable will be
+silently ignored.  If @code{USE_UNICODE_COLLATION} is set to @samp{0},
+there is no Unicode collation and no linguistic tailoring either.
+
 @item COMMAND_LINE_ENCODING
 Encoding used to decode command-line arguments.  Default is based on the locale
 encoding.  This may affect file names inserted into output files or error
diff --git a/tp/Texinfo/Indices.pm b/tp/Texinfo/Indices.pm
index ee5860a193..a3d0060a50 100644
--- a/tp/Texinfo/Indices.pm
+++ b/tp/Texinfo/Indices.pm
@@ -371,14 +371,6 @@ sub setup_sortable_index_entries($$$$$)
   # http://www.unicode.org/reports/tr10/#Variable_Weighting
   my %collate_options = ( 'variable' => 'Non-Ignorable' );
 
-  # TODO Unicode::Collate has been in perl core long enough, but
-  # Unicode::Collate::Locale is present since perl major version 5.14 only,
-  # released in 2011.  So probably better to use Unicode::Collate until 2031
-  # (and if documentlanguage is not set) and switch to Unicode::Collate::Locale
-  # at this date.
-  #my $collator = Unicode::Collate::Locale->new('locale' => $documentlanguage,
-  #                                             %collate_options);
-
   # The Unicode::Collate sorting changes often, based on the UCA version.
   # To test the result with a specific version, the UCA_Version should be set,
   # and, more importantly the table should correspond to that version.
@@ -405,12 +397,27 @@ sub setup_sortable_index_entries($$$$$)
     = $customization_information->get_conf('USE_UNICODE_COLLATION');
 
   my $collator;
-  if (!(defined($use_unicode_collation)
-        and !$use_unicode_collation)) {
-    eval { require Unicode::Collate; Unicode::Collate->import; };
-    my $unicode_collate_loading_error = $@;
-    if ($unicode_collate_loading_error eq '') {
-      $collator = Unicode::Collate->new(%collate_options);
+  if (!(defined($use_unicode_collation) and !$use_unicode_collation)) {
+    # Unicode::Collate::Locale is present in perl core since perl major
+    # version 5.14 released in 2011.
+    if (defined($customization_information->get_conf('COLLATION_LANGUAGE'))) {
+      eval { require Unicode::Collate::Locale;
+             Unicode::Collate::Locale->import; };
+      my $unicode_collate_locale_loading_error = $@;
+      if ($unicode_collate_locale_loading_error eq '') {
+        my $locale_lang
+          = $customization_information->get_conf('COLLATION_LANGUAGE');
+        $collator = Unicode::Collate::Locale->new('locale' => $locale_lang,
+                                                  %collate_options);
+      }
+    }
+
+    if (!defined($collator)) {
+      eval { require Unicode::Collate; Unicode::Collate->import; };
+      my $unicode_collate_loading_error = $@;
+      if ($unicode_collate_loading_error eq '') {
+        $collator = Unicode::Collate->new(%collate_options);
+      }
     }
   }
   # Fall back to stub if Unicode::Collate not wanted or not available.
diff --git a/tp/Texinfo/options_data.txt b/tp/Texinfo/options_data.txt
index c78a4706db..ec32f2e13d 100644
--- a/tp/Texinfo/options_data.txt
+++ b/tp/Texinfo/options_data.txt
@@ -213,6 +213,7 @@ CHAPTER_HEADER_LEVEL               converter_customization undef   integer
 CHECK_HTMLXREF                     converter_customization undef   integer
 CLOSE_DOUBLE_QUOTE_SYMBOL          converter_customization undef   char
 CLOSE_QUOTE_SYMBOL                 converter_customization undef   char
+COLLATION_LANGUAGE                 converter_customization undef   char
 COMMAND_LINE_ENCODING              converter_customization undef   char
 COMPLEX_FORMAT_IN_TABLE            converter_customization undef   integer
 CONTENTS_OUTPUT_LOCATION           converter_customization undef   char
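
The fallback logic in the Indices.pm hunk above can be tried outside of texi2any. What follows is a minimal standalone sketch, not the Texinfo code itself: the helper name make_collator and the sample entries are hypothetical, and the 'sv' locale is only an assumed example value for COLLATION_LANGUAGE. It prefers Unicode::Collate::Locale when a language is given, and falls back to plain Unicode::Collate otherwise.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper, not part of Texinfo: build a collator, preferring
# linguistic tailoring when a language is given and the module can be loaded.
sub make_collator {
  my ($language) = @_;
  # Same variable weighting option as in the patch.
  my %collate_options = ('variable' => 'Non-Ignorable');
  my $collator;

  if (defined($language)) {
    # eval returns 1 only if the require succeeded.
    if (eval { require Unicode::Collate::Locale; 1 }) {
      $collator = Unicode::Collate::Locale->new('locale' => $language,
                                                %collate_options);
    }
  }

  # No language requested, or the Locale module is missing: use the
  # untailored Unicode Collation Algorithm.
  if (!defined($collator)) {
    if (eval { require Unicode::Collate; 1 }) {
      $collator = Unicode::Collate->new(%collate_options);
    }
  }
  # May still be undef if neither module is available.
  return $collator;
}

my @entries = ('zebra', 'öl', 'apple');
my $default = make_collator(undef);
my $swedish = make_collator('sv');   # assumed example language

binmode(STDOUT, ':encoding(UTF-8)');
print join(' ', $default->sort(@entries)), "\n" if defined($default);
print join(' ', $swedish->sort(@entries)), "\n" if defined($swedish);
```

With the untailored collator, "öl" should sort next to the other "o" words, so before "zebra"; with Swedish tailoring, "ö" is treated as a separate letter after "z". Assuming the usual texi2any option for customization variables, the equivalent setting on the command line would be something like `texi2any -c COLLATION_LANGUAGE=sv manual.texi`, where manual.texi is a placeholder file name.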


