texinfo-commits

From: Patrice Dumas
Subject: branch master updated: * tp/Texinfo/Structuring.pm (setup_index_entry_keys_formatting): remove ascii_punctuation obsolete option set.
Date: Sun, 07 Jan 2024 18:36:24 -0500

This is an automated email from the git hooks/post-receive script.

pertusus pushed a commit to branch master
in repository texinfo.

The following commit(s) were added to refs/heads/master by this push:
     new b4296ae27c * tp/Texinfo/Structuring.pm (setup_index_entry_keys_formatting): remove ascii_punctuation obsolete option set.
b4296ae27c is described below

commit b4296ae27c7409b45d7af82e736df45e295120f3
Author: Patrice Dumas <pertusus@free.fr>
AuthorDate: Mon Jan 8 00:36:23 2024 +0100

    * tp/Texinfo/Structuring.pm (setup_index_entry_keys_formatting):
    remove ascii_punctuation obsolete option set.
    
    * tp/Texinfo/Structuring.pm (index_entry_element_sort_string)
    (_index_entry_element_sort_string_key)
    (setup_sortable_index_entries), tp/t/test_sort.t, tp/t/test_utils.pl:
    add an argument with document information for XS for
    index_entry_element_sort_string and
    _index_entry_element_sort_string_key to retrieve the document
    descriptor in XS.  Use setup_sortable_index_entries
    $customization_information as a source of that information.  If it is
    a converter based on Texinfo::Convert::Converter, the information is
    there; if not, it needs to be added explicitly.  Add the information
    explicitly by setting 'document_descriptor' to
    document->document_descriptor() in test_sort.t and test_utils.pl.
    
    * tp/Texinfo/XS/convert/indices_in_conversion.c
    (index_entry_element_sort_string): implement in C.
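
The character filtering step of the C implementation can be sketched in isolation as follows. This is a minimal sketch, not the committed code: the function name strip_ignored_chars is hypothetical, plain malloc stands in for the TEXT buffer used in the patch, and the ignore list is assumed to hold ASCII characters only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of S with every
   character listed in IGNORE_CHARS removed.  A UTF-8 lead byte is copied
   together with its continuation bytes, so multi-byte characters stay
   whole.  IGNORE_CHARS is assumed to be ASCII only.  The caller frees
   the result; NULL is returned if allocation fails. */
char *
strip_ignored_chars (const char *s, const char *ignore_chars)
{
  char *result;
  char *out;
  const char *p = s;

  assert (s != NULL && ignore_chars != NULL);

  /* the result is never longer than the input */
  result = malloc (strlen (s) + 1);
  if (!result)
    return NULL;
  out = result;

  while (*p)
    {
      /* skip a run of ignored characters */
      p += strspn (p, ignore_chars);
      if (*p)
        {
          /* copy one character: first byte plus continuation bytes */
          size_t char_len = 1;
          while (((unsigned char) p[char_len] & 0xC0) == 0x80)
            char_len++;
          memcpy (out, p, char_len);
          out += char_len;
          p += char_len;
        }
    }
  *out = '\0';
  return result;
}
```

For instance, strip_ignored_chars ("a-b_c", "-_") yields a new string "abc".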
    
    * tp/Texinfo/XS/main/get_perl_info.c (find_index_entry_sv): add based
    on get_sv_index_entries_sorted_by_letter code.
    
    * tp/Texinfo/XS/main/get_perl_info.c
    (copy_sv_options_for_convert_text): get 'code' from perl.
    
    * tp/Texinfo/XS/main/get_perl_info.c (find_index_entry_subentry)
    (subentry_hv_parent, find_subentry_index_command_sv)
    (find_element_from_sv): find subentry C element based on perl element,
    by finding the 'subentry parent' index entry C element going down
    subentry levels, and then find the C subentry element going up
    subentry levels.
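
The two-step subentry lookup can be illustrated with simplified stand-in types. This is only a sketch under stated assumptions: PerlElem, CElem and both function names are hypothetical, Perl hashes are modelled as plain structs, and the mapping from the Perl main entry to its C element (done by find_element_from_sv in the patch) is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Perl tree element hash. */
typedef struct PerlElem {
  struct PerlElem *subentry_parent;  /* models extra 'subentry_parent' */
} PerlElem;

/* Hypothetical stand-in for a C tree element. */
typedef struct CElem {
  const PerlElem *hv;      /* Perl element mirrored by this C element */
  struct CElem *subentry;  /* models extra "subentry" */
} CElem;

/* Step 1: follow subentry_parent links from a subentry until reaching
   the element that has no parent, the main index entry.  Returns NULL
   if SUB has no parent at all. */
const PerlElem *
find_main_entry (const PerlElem *sub)
{
  const PerlElem *current = sub;

  assert (sub != NULL);
  while (current->subentry_parent)
    current = current->subentry_parent;
  return current == sub ? NULL : current;
}

/* Step 2: starting from the C main entry element, follow "subentry"
   links until the C element mirroring TARGET is found. */
CElem *
find_subentry (CElem *main_entry, const PerlElem *target)
{
  CElem *current;

  assert (main_entry != NULL);
  for (current = main_entry->subentry; current;
       current = current->subentry)
    {
      if (current->hv == target)
        return current;
    }
  return NULL;
}
```

Comparing stored pointers rather than element contents mirrors the hv comparison in find_index_entry_subentry: two subentries with identical text remain distinct.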
    
    * tp/Texinfo/XS/main/get_perl_info.c
    (find_element_extra_index_entry_sv)
    (find_index_entry_associated_hv, find_element_from_sv): have
    find_element_extra_index_entry_sv return an index entry, such that the
    caller can find the element based on entry_associated_element or
    entry_element.  Add find_index_entry_associated_hv to go through both
    entry_associated_element and entry_element to find the C element
    associated to a perl element.  Use that code in find_element_from_sv.
    
    * tp/Texinfo/XS/structuring_transfo/StructuringTransfoXS.xs
    (index_entry_element_sort_string): XS interface for
    index_entry_element_sort_string.  Not used as it is slower than the
    perl...
    
    * tp/Texinfo/XS/main/get_perl_info.c (debug_print_element_hv)
    (debug_print_element_sv): add debug functions that print information
    on perl tree elements.
---
 ChangeLog                                          |  52 ++++
 tp/TODO                                            |  12 -
 tp/Texinfo/Convert/LaTeX.pm                        |   3 +-
 tp/Texinfo/Structuring.pm                          |  42 ++-
 tp/Texinfo/XS/Makefile.am                          |   4 +-
 tp/Texinfo/XS/convert/indices_in_conversion.c      |  55 ++++
 tp/Texinfo/XS/convert/indices_in_conversion.h      |   6 +-
 tp/Texinfo/XS/main/debug.c                         |   3 +-
 tp/Texinfo/XS/main/get_perl_info.c                 | 326 +++++++++++++++++----
 tp/Texinfo/XS/main/get_perl_info.h                 |   4 +
 .../XS/structuring_transfo/StructuringTransfoXS.xs |  34 +++
 tp/t/test_sort.t                                   |   4 +
 tp/t/test_utils.pl                                 |   2 +
 13 files changed, 460 insertions(+), 87 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index ac9dd5b8ad..417ffd4736 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,55 @@
+2024-01-07  Patrice Dumas  <pertusus@free.fr>
+
+       * tp/Texinfo/Structuring.pm (setup_index_entry_keys_formatting):
+       remove ascii_punctuation obsolete option set.
+
+       * tp/Texinfo/Structuring.pm (index_entry_element_sort_string)
+       (_index_entry_element_sort_string_key)
+       (setup_sortable_index_entries), tp/t/test_sort.t, tp/t/test_utils.pl:
+       add an argument with document information for XS for
+       index_entry_element_sort_string and
+       _index_entry_element_sort_string_key to retrieve the document
+       descriptor in XS.  Use setup_sortable_index_entries
+       $customization_information as a source of that information.  If it is
+       a converter based on Texinfo::Convert::Converter, the information is
+       there, if not, it needs to be added explcitely.  Add the information
+       explicitely setting 'document_descriptor' to
+       document->document_descriptor() in test_sort.t and test_utils.pl.
+
+       * tp/Texinfo/XS/convert/indices_in_conversion.c
+       (index_entry_element_sort_string): implement in C.
+
+       * tp/Texinfo/XS/main/get_perl_info.c (find_index_entry_sv): add based
+       on get_sv_index_entries_sorted_by_letter code.
+
+       * tp/Texinfo/XS/main/get_perl_info.c
+       (copy_sv_options_for_convert_text): get 'code' from perl.
+
+       * tp/Texinfo/XS/main/get_perl_info.c (find_index_entry_subentry)
+       (subentry_hv_parent, find_subentry_index_command_sv)
+       (find_element_from_sv): find subentry C element based on perl element,
+       by finding the 'subentry parent' index entry C element going down
+       subentry levels, and then find the C subentry element going up
+       subentry levels.
+
+       * tp/Texinfo/XS/main/get_perl_info.c
+       (find_element_extra_index_entry_sv)
+       (find_index_entry_associated_hv, find_element_from_sv): have
+       find_element_extra_index_entry_sv return an index entry, such that the
+       caller can find the element based on entry_associated_element or
+       entry_element.  Add find_index_entry_associated_hv to go through both
+       entry_associated_element and entry_element to find the C element
+       associated to a perl element.  Use that code in find_element_from_sv.
+
+       * tp/Texinfo/XS/structuring_transfo/StructuringTransfoXS.xs
+       (index_entry_element_sort_string): XS interface for
+       index_entry_element_sort_string.  Not used as it is slower than the
+       perl...
+
+       * tp/Texinfo/XS/main/get_perl_info.c (debug_print_element_hv)
+       (debug_print_element_sv): add debug functions that print information
+       on perl tree elements.
+
 2024-01-07  Patrice Dumas  <pertusus@free.fr>
 
        * tp/Texinfo/XS/convert/convert_html.c (convert_printindex_command):
diff --git a/tp/TODO b/tp/TODO
index 5db2a0c266..179ba95fe8 100644
--- a/tp/TODO
+++ b/tp/TODO
@@ -20,18 +20,6 @@ Document that Texinfo::Document::rebuild_document or
 Texinfo::Document::rebuild_tree should be called after tree modifications
 if the parser is XS but converters are perl.
 
-Find perl index element in C:
-$element->{'extra'}->{'index_entry'} = [$index_name, $number];
-in C:
-index[$index_name][number] = { 'index_name'           => $index_name,
-                      'entry_element'        => $element,
-                      'entry_number'         => $number,
-                    };
-(find index j based on index_name, could be a bsearch or a lin search)
- INDEX_ENTRY *index_entry;
-   index_entry = idx->index_entries[j];
-                  main_entry_element = index_entry->entry_element;
-
 
 Bugs
 ====
diff --git a/tp/Texinfo/Convert/LaTeX.pm b/tp/Texinfo/Convert/LaTeX.pm
index 03cd19e2e7..e178016a0e 100644
--- a/tp/Texinfo/Convert/LaTeX.pm
+++ b/tp/Texinfo/Convert/LaTeX.pm
@@ -2454,7 +2454,8 @@ sub _index_entry($$)
       # always setup a string to sort with as we may use commands
       my $convert_to_text_options = {%$options, 'code' => $in_code};
       my $sort_string
-           = Texinfo::Structuring::index_entry_element_sort_string($entry,
+           = Texinfo::Structuring::index_entry_element_sort_string(
+                                          $self, $entry,
                                           $subindex_command,
                                           $convert_to_text_options, 1);
       my $result = '';
diff --git a/tp/Texinfo/Structuring.pm b/tp/Texinfo/Structuring.pm
index 89baa4ee2a..abbb3fb4e8 100644
--- a/tp/Texinfo/Structuring.pm
+++ b/tp/Texinfo/Structuring.pm
@@ -114,8 +114,11 @@ our %XS_overrides = (
   "Texinfo::Structuring::_XS_unsplit"
     => "Texinfo::StructTransfXS::unsplit",
 
-#  "Texinfo::Structuring::index_entry_element_sort_string"
-#    => "Texinfo::StructTransfXS::index_entry_element_sort_string",
+  # TODO the XS override is slower than the perl function.
+  # One possible reason could be that the text options are read
+  # from perl for each entry instead of once for each index.
+  #"Texinfo::Structuring::index_entry_element_sort_string"
+  #  => "Texinfo::StructTransfXS::index_entry_element_sort_string",
 
   # Not useful for HTML as functions, as the calling functions are
   # already overriden
@@ -2270,7 +2273,7 @@ sub setup_index_entry_keys_formatting($)
 {
   my $customization_info = shift;
 
-  my $options = {'ascii_punctuation' => 1,
+  my $options = {
      Texinfo::Convert::Text::copy_options_for_convert_text(
                                   $customization_info)};
   if (not $customization_info->get_conf('ENABLE_ENCODING')
@@ -2281,9 +2284,11 @@ sub setup_index_entry_keys_formatting($)
   return $options;
 }
 
-# can be used for subentries
-sub index_entry_element_sort_string($$$;$)
+# can be used for subentries.
+# $DOCUMENT_INFO is used in XS to retrieve the document.
+sub index_entry_element_sort_string($$$$;$)
 {
+  my $document_info = shift;
   my $main_entry = shift;
   my $index_entry_element = shift;
   my $options = shift;
@@ -2312,16 +2317,17 @@ sub index_entry_element_sort_string($$$;$)
   return $sort_string;
 }
 
-sub _index_entry_element_sort_string_key($$$$;$)
+sub _index_entry_element_sort_string_key($$$$$;$)
 {
+  my $document_info = shift;
   my $main_entry = shift;
   my $index_entry_element = shift;
   my $options = shift;
   my $collator = shift;
   my $prefer_reference_element = shift;
 
-  my $sort_string = index_entry_element_sort_string ($main_entry,
-                                             $index_entry_element,
+  my $sort_string = index_entry_element_sort_string ($document_info,
+                               $main_entry, $index_entry_element,
                                $options, $prefer_reference_element);
 
   # This avoids varying results depending on whether the string is
@@ -2397,6 +2403,12 @@ sub _converter_or_registrar_line_warn($$$$)
   }
 }
 
+# There is no neeed for document information in Perl, however, in XS
+# it is needed to retrieve the Tree elements in the C structures.
+# $CUSTOMIZATION_INFORMATION is used as the source of document
+# information.  It should already be set if it is a converter based
+# on Texinfo::Convert::Converter, but otherwise it should be set by
+# the caller, setting 'document_descriptor' to document->document_descriptor().
 sub setup_sortable_index_entries($$$$$;$)
 {
   my $registrar = shift;
@@ -2469,8 +2481,8 @@ sub setup_sortable_index_entries($$$$$;$)
       my $convert_to_text_options = {%$options,
         'code' => $indices_information->{$entry_index_name}->{'in_code'}};
       my ($entry_key, $sort_entry_key)
-        = _index_entry_element_sort_string_key($index_entry,
-                                               $main_entry_element,
+        = _index_entry_element_sort_string_key($customization_information,
+                                   $index_entry, $main_entry_element,
                                   $convert_to_text_options, $entries_collator);
       my @entry_keys;
       my @sort_entry_keys;
@@ -2498,8 +2510,8 @@ sub setup_sortable_index_entries($$$$$;$)
         $subentry_nr ++;
         $subentry = $subentry->{'extra'}->{'subentry'};
         my ($subentry_key, $sort_subentry_key)
-              = _index_entry_element_sort_string_key($index_entry,
-                                $subentry, $convert_to_text_options,
+              = _index_entry_element_sort_string_key($customization_information,
+                             $index_entry, $subentry, $convert_to_text_options,
                                 $entries_collator);
         if ($subentry_key !~ /\S/) {
           my $entry_cmdname = $main_entry_element->{'cmdname'};
@@ -2879,10 +2891,14 @@ I<$node> is a node tree element.  Find the node I<$node> children based
 on the sectioning structure.  For the node associated with C<@top>
 sectioning command, the sections associated with parts are considered.
 
-=item $sort_string = index_entry_element_sort_string($main_entry, $index_entry_element, $options, $prefer_reference_element)
+=item $sort_string = index_entry_element_sort_string($document_info, $main_entry, $index_entry_element, $options, $prefer_reference_element)
 X<C<index_entry_element_sort_string>>
 
 Return a string suitable as a sort string, for index entries.
+I<$document_info> is used by C code to retrieve the document data,
+using the C<document_descriptor> key.  I<$document_info> can be a
+converter based on L<Texinfo::Convert::Converter>, otherwise
+C<document_descriptor> need, in general, to be set up explicitely.
 The tree element index entry processed is I<$index_entry_element>,
 and can be a C<@subentry>.  I<$main_entry> is the main index entry
 that can be used to gather information.  The I<$options> are options
diff --git a/tp/Texinfo/XS/Makefile.am b/tp/Texinfo/XS/Makefile.am
index 7658a51b6c..3d85482b84 100644
--- a/tp/Texinfo/XS/Makefile.am
+++ b/tp/Texinfo/XS/Makefile.am
@@ -350,13 +350,15 @@ nodist_StructuringTransfoXS_la_SOURCES = \
 CLEANFILES += \
                     structuring_transfo/StructuringTransfoXS.c
 StructuringTransfoXS_la_SOURCES = \
+                    convert/indices_in_conversion.h \
+                    convert/indices_in_conversion.c \
                     structuring_transfo/transformations.c \
                     structuring_transfo/transformations.h
 
 EXTRA_DIST += structuring_transfo/StructuringTransfoXS.xs
 
 # locate include files under out-of-source builds.
-StructuringTransfoXS_la_CPPFLAGS = -I$(srcdir)/main -I$(srcdir)/structuring_transfo $(AM_CPPFLAGS) $(GNULIB_CPPFLAGS) $(XSLIBS_CPPFLAGS)
+StructuringTransfoXS_la_CPPFLAGS = -I$(srcdir)/main -I$(srcdir)/structuring_transfo -I$(srcdir)/convert $(AM_CPPFLAGS) $(GNULIB_CPPFLAGS) $(XSLIBS_CPPFLAGS)
 StructuringTransfoXS_la_CFLAGS = $(XSLIBS_CFLAGS)
 StructuringTransfoXS_la_LIBADD = libtexinfoxs.la libtexinfo.la $(top_builddir)/gnulib/lib/libgnu.la
 StructuringTransfoXS_la_LDFLAGS = $(XSLIBS_LDFLAGS) $(LTLIBICONV) $(LTLIBUNISTRING)
diff --git a/tp/Texinfo/XS/convert/indices_in_conversion.c b/tp/Texinfo/XS/convert/indices_in_conversion.c
index 3774dec70f..cd8b43097c 100644
--- a/tp/Texinfo/XS/convert/indices_in_conversion.c
+++ b/tp/Texinfo/XS/convert/indices_in_conversion.c
@@ -26,6 +26,7 @@
 #include "utils.h"
 #include "extra.h"
 #include "unicode.h"
+#include "convert_to_text.h"
 #include "indices_in_conversion.h"
 
 /* corresponding perl code in Texinfo::Structuring */
@@ -160,3 +161,57 @@ index_content_element (const ELEMENT *element, int prefer_reference_element)
    }
 }
 
+char *
+index_entry_element_sort_string (INDEX_ENTRY *main_entry,
+                                 ELEMENT *index_entry_element,
+                                 TEXT_OPTIONS *options,
+                                 int prefer_reference_element)
+{
+  char *sort_string;
+  char *index_ignore_chars;
+  ELEMENT *entry_tree_element;
+
+  if (!index_entry_element)
+    {
+      fatal ("index_entry_element_sort_string: NUL element");
+    }
+
+  char *sortas = lookup_extra_string (index_entry_element, "sortas");
+  if (sortas)
+    return strdup (sortas);
+
+  entry_tree_element = index_content_element (index_entry_element,
+                                          prefer_reference_element);
+
+  sort_string = convert_to_text (entry_tree_element, options);
+
+  index_ignore_chars = lookup_extra_string (main_entry->entry_element,
+                                            "index_ignore_chars");
+  if (index_ignore_chars)
+    {
+      TEXT sort_string_text;
+      char *p = sort_string;
+      text_init (&sort_string_text);
+
+      while (*p)
+        {
+          int n = strspn (p, index_ignore_chars);
+          if (n)
+            {
+              p += n;
+            }
+          if (*p)
+            {
+              /* store a character */
+              int char_len = 1;
+              while ((p[char_len] & 0xC0) == 0x80)
+                char_len++;
+              text_append_n (&sort_string_text, p, char_len);
+              p += char_len;
+            }
+        }
+      free (sort_string);
+      sort_string = sort_string_text.text;
+    }
+  return sort_string;
+}
diff --git a/tp/Texinfo/XS/convert/indices_in_conversion.h b/tp/Texinfo/XS/convert/indices_in_conversion.h
index f516695632..7d9c7f7ab0 100644
--- a/tp/Texinfo/XS/convert/indices_in_conversion.h
+++ b/tp/Texinfo/XS/convert/indices_in_conversion.h
@@ -3,7 +3,7 @@
 #define INDICES_IN_CONVERSION_H
 
 #include "tree_types.h"
-#include "indices_in_conversion.h"
+#include "convert_to_text.h"
 
 MERGED_INDEX *merge_indices (INDEX **index_names);
 void destroy_merged_indices (MERGED_INDEX *merged_indices);
@@ -14,4 +14,8 @@ void destroy_indices_sorted_by_letter (
 ELEMENT *index_content_element (const ELEMENT *element,
                                 int prefer_reference_element);
 
+char *index_entry_element_sort_string (INDEX_ENTRY *main_entry,
+                                 ELEMENT *index_entry_element,
+                                 TEXT_OPTIONS *options,
+                                 int prefer_reference_element);
 #endif
diff --git a/tp/Texinfo/XS/main/debug.c b/tp/Texinfo/XS/main/debug.c
index c925d44deb..43d72c9d65 100644
--- a/tp/Texinfo/XS/main/debug.c
+++ b/tp/Texinfo/XS/main/debug.c
@@ -99,7 +99,8 @@ print_element_debug (const ELEMENT *e, int print_parent)
   return result;
 }
 
-char *print_associate_info_debug (const ASSOCIATED_INFO *info)
+char *
+print_associate_info_debug (const ASSOCIATED_INFO *info)
 {
   TEXT text;
   char *result;
diff --git a/tp/Texinfo/XS/main/get_perl_info.c b/tp/Texinfo/XS/main/get_perl_info.c
index 6727421db1..984bca7ac0 100644
--- a/tp/Texinfo/XS/main/get_perl_info.c
+++ b/tp/Texinfo/XS/main/get_perl_info.c
@@ -41,6 +41,9 @@ FIXME add an initialization of translations?
 #include "options_types.h"
 #include "document_types.h"
 #include "converter_types.h"
+#include "text.h"
+#include "extra.h"
+#include "debug.h"
 #include "utils.h"
 #include "builtin_commands.h"
 #include "errors.h"
@@ -51,6 +54,63 @@ FIXME add an initialization of translations?
 #include "converter.h"
 #include "get_perl_info.h"
 
+#define FETCH(key) key##_sv = hv_fetch (element_hv, #key, strlen(#key), 0);
+
+static void
+debug_print_element_hv (HV *element_hv)
+{
+  SV **cmdname_sv;
+  SV **type_sv;
+  SV **text_sv;
+  TEXT msg;
+
+  dTHX;
+
+  text_init (&msg);
+  text_append (&msg, "");
+
+  FETCH(cmdname)
+  if (cmdname_sv)
+    {
+      text_printf (&msg, "@%s", SvPVutf8_nolen (*cmdname_sv));
+    }
+  FETCH(type)
+  if (type_sv)
+    {
+      text_printf (&msg, "(%s)", SvPVutf8_nolen (*type_sv));
+    }
+  FETCH(text)
+  if (text_sv)
+    {
+      int allocated = 0;
+      char *text = SvPVutf8_nolen (*text_sv);
+      char *protected_text = debug_protect_eol (text,
+                                              &allocated);
+      text_printf (&msg, "[T: %s]", protected_text);
+      if (allocated)
+        free (protected_text);
+    }
+  fprintf (stderr, "ELT_sv: %s\n", msg.text);
+  free (msg.text);
+}
+
+void
+debug_print_element_sv (SV *element_sv)
+{
+  dTHX;
+
+  if (SvOK (element_sv))
+    {
+      HV *element_hv = (HV *) SvRV (element_sv);
+      debug_print_element_hv (element_hv);
+    }
+  else
+    {
+      fprintf(stderr, "debug_print_element_sv: NUL\n");
+    }
+}
+#undef FETCH
+
 DOCUMENT *
 get_document_or_warn (SV *sv_in, char *key, char *warn_string)
 {
@@ -336,7 +396,7 @@ get_expanded_formats (HV *hv, EXPANDED_FORMAT **expanded_formats)
 
   expanded_formats_sv = hv_fetch (hv, "expanded_formats",
                                   strlen ("expanded_formats"), 0);
-  if (expanded_formats_sv)
+  if (expanded_formats_sv && SvOK (*expanded_formats_sv))
     {
       I32 i;
       I32 formats_nr;
@@ -573,6 +633,54 @@ reset_output_init_conf (SV *sv_in)
     }
 }
 
+INDEX_ENTRY *
+find_index_entry_sv (SV *index_entry_sv, INDEX **index_names,
+                     const char *warn_string, char **entry_index_name,
+                     int *entry_number)
+{
+  HV *index_entry_hv;
+  SV **index_name_sv;
+  SV **entry_number_sv;
+  int entry_idx_in_index;
+  INDEX *idx;
+
+  dTHX;
+
+  index_entry_hv = (HV *) SvRV (index_entry_sv);
+  index_name_sv = hv_fetch (index_entry_hv, "index_name",
+                            strlen ("index_name"), 0);
+  entry_number_sv = hv_fetch (index_entry_hv, "entry_number",
+                              strlen ("entry_number"), 0);
+
+  *entry_index_name = 0;
+  *entry_number = 0;
+
+  if (!index_name_sv || !entry_number_sv)
+    {
+      char *msg;
+      const char *warn_str = warn_string;
+      if (!warn_str)
+        warn_str = "find_index_entry_sv";
+      xasprintf (&msg, "%s: no entry info\n", warn_str);
+      fatal (msg);
+    }
+  *entry_index_name = (char *) SvPVutf8_nolen (*index_name_sv);
+  *entry_number = SvIV (*entry_number_sv);
+  entry_idx_in_index = *entry_number - 1;
+
+  idx = indices_info_index_by_name (index_names,
+                                    *entry_index_name);
+
+  if (idx)
+    {
+      if (entry_idx_in_index < idx->entries_number)
+        return &idx->index_entries[entry_idx_in_index];
+    }
+
+  return 0;
+}
+
+
 /* code in comments allow to sort the index names to have a fixed order
    in the data structure.  Not clear that it is useful or not, not enabled
    for now */
@@ -736,15 +844,10 @@ get_sv_index_entries_sorted_by_letter (INDEX **index_names,
               for (k = 0; k < entries_nr; k++)
                 {
                   SV** index_entry_sv = av_fetch (entries_av, k, 0);
-                  HV *index_entry_hv;
-                  SV** index_name_sv;
-                  SV** entry_number_sv;
-                  INDEX *idx;
                   char *entry_index_name;
                   int entry_number;
-                  int entry_idx_in_index;
-
-                  letter_entries->entries[k] = 0;
+                  char *warn_string;
+                  INDEX_ENTRY *index_entry = 0;
 
                   if (!index_entry_sv)
                     {
@@ -754,32 +857,16 @@ get_sv_index_entries_sorted_by_letter (INDEX **index_names,
                              idx_name, i, letter_entries->letter, k);
                       fatal (msg);
                     }
-                  index_entry_hv = (HV *) SvRV (*index_entry_sv);
-                  index_name_sv = hv_fetch (index_entry_hv, "index_name",
-                                            strlen ("index_name"), 0);
-                  entry_number_sv = hv_fetch (index_entry_hv, "entry_number",
-                                              strlen ("entry_number"), 0);
-                  if (!index_name_sv || !entry_number_sv)
-                    {
-                      char *msg;
-                      xasprintf (&msg,
-  "get_sv_index_entries_sorted_by_letter: %s: %d: %s: %d: no entry info\n",
-                             idx_name, i, letter_entries->letter, k);
-                      fatal (msg);
-                    }
-                  entry_index_name = (char *) SvPVutf8_nolen (*index_name_sv);
-                  entry_number = SvIV (*entry_number_sv);
-                  entry_idx_in_index = entry_number - 1;
+                  xasprintf (&warn_string,
+                         "get_sv_index_entries_sorted_by_letter: %s: %d: %s: %d",
+                         idx_name, i, letter_entries->letter, k);
+                  index_entry = find_index_entry_sv (*index_entry_sv, index_names,
+                                                     warn_string, &entry_index_name,
+                                                     &entry_number);
+                  free (warn_string);
 
-                  idx = indices_info_index_by_name (index_names,
-                                                    entry_index_name);
+                  letter_entries->entries[k] = index_entry;
 
-                  if (idx)
-                    {
-                      if (entry_idx_in_index < idx->entries_number)
-                        letter_entries->entries[k]
-                          = &idx->index_entries[entry_idx_in_index];
-                    }
                   if (!letter_entries->entries[k])
                     {
                       char *msg;
@@ -824,12 +911,13 @@ force_conf (CONVERTER *converter, const char *conf, SV *value)
 /* output format specific */
 
 /* map hash reference of Convert::Text options to TEXT_OPTIONS */
-/* TODO more to do */
+/* TODO more to do? */
 #define FETCH(key) key##_sv = hv_fetch (hv_in, #key, strlen(#key), 0);
 TEXT_OPTIONS *
 copy_sv_options_for_convert_text (SV *sv_in)
 {
   HV *hv_in;
+  SV **code_sv;
   SV **TEST_sv;
   SV **INCLUDE_DIRECTORIES_sv;
   SV **converter_sv;
@@ -853,8 +941,11 @@ copy_sv_options_for_convert_text (SV *sv_in)
   if (enabled_encoding_sv)
     text_options->encoding = strdup (SvPVutf8_nolen (*enabled_encoding_sv));
 
-  FETCH(INCLUDE_DIRECTORIES)
+  FETCH(code)
+  if (code_sv)
+    text_options->code_state = SvIV (*code_sv);
 
+  FETCH(INCLUDE_DIRECTORIES)
   if (INCLUDE_DIRECTORIES_sv)
     add_svav_to_string_list (*INCLUDE_DIRECTORIES_sv,
                              &text_options->include_directories, svt_dir);
@@ -1145,6 +1236,7 @@ find_document_index_entry_extra_index_entry_sv (DOCUMENT *document,
 {
   AV *extra_index_entry_av;
   SV **index_name_sv;
+  char *index_name = 0;
   INDEX *idx = 0;
 
   dTHX;
@@ -1157,7 +1249,7 @@ find_document_index_entry_extra_index_entry_sv (DOCUMENT *document,
   index_name_sv = av_fetch (extra_index_entry_av, 0, 0);
   if (index_name_sv)
     {
-      char *index_name = SvPVutf8_nolen (*index_name_sv);
+      index_name = SvPVutf8_nolen (*index_name_sv);
       idx = indices_info_index_by_name (document->index_names,
                                         index_name);
     }
@@ -1178,7 +1270,7 @@ find_document_index_entry_extra_index_entry_sv (DOCUMENT *document,
 /* if there is a converter with sorted index names, use the
    sorted index names, otherwise use the index information from
    a document */
-ELEMENT *
+static INDEX_ENTRY *
 find_element_extra_index_entry_sv (DOCUMENT *document,
                                    CONVERTER *converter,
                                    SV *extra_index_entry_sv)
@@ -1197,14 +1289,7 @@ find_element_extra_index_entry_sv (DOCUMENT *document,
    index_entry = find_sorted_index_names_index_entry_extra_index_entry_sv (
                     &converter->sorted_index_names, extra_index_entry_sv);
 
-  if (index_entry)
-    {
-      if (index_entry->entry_associated_element)
-        return index_entry->entry_associated_element;
-      else if (index_entry->entry_element)
-        return index_entry->entry_element;
-    }
-  return 0;
+  return index_entry;
 }
 
 #define FETCH(key) key##_sv = hv_fetch (element_hv, #key, strlen(#key), 0);
@@ -1262,6 +1347,111 @@ ELEMENT *find_root_command (DOCUMENT *document, HV *element_hv,
   return 0;
 }
 
+/* find the subentry matching ELEMENT_HV */
+static ELEMENT *
+find_index_entry_subentry (ELEMENT *index_element, HV *element_hv)
+{
+  ELEMENT *current_element = index_element;
+
+  while (1)
+    {
+      ELEMENT *subentry = lookup_extra_element (current_element,
+                                                "subentry");
+      if (subentry)
+        {
+          if (subentry->hv == element_hv)
+            return subentry;
+          current_element = subentry;
+        }
+      else
+        return 0;
+    }
+}
+
+#define EXTRA(key) key##_sv = hv_fetch (extra_hv, #key, strlen(#key), 0);
+
+/* returns the subentry direct parent based on "subentry_parent" */
+static SV *
+subentry_hv_parent (HV *element_hv)
+{
+  SV **extra_sv;
+
+  dTHX;
+
+  FETCH(extra)
+
+  if (extra_sv)
+    {
+      SV **subentry_parent_sv;
+      HV *extra_hv = (HV *) SvRV (*extra_sv);
+
+      EXTRA(subentry_parent)
+      if (subentry_parent_sv)
+        {
+          return *subentry_parent_sv;
+        }
+    }
+  return 0;
+}
+
+/* Find the index entry parent of a subentry going through
+   "subentry_parent" until finding the index element hash */
+ELEMENT *
+find_subentry_index_command_sv (DOCUMENT *document, HV *element_hv)
+{
+  HV *current_parent = element_hv;
+  SV *current_sv = 0;
+
+  dTHX;
+
+  while (1)
+    {
+      SV *subentry_parent_sv = subentry_hv_parent (current_parent);
+      if (subentry_parent_sv)
+        {
+          current_parent = (HV *) SvRV (subentry_parent_sv);
+          current_sv = subentry_parent_sv;
+        }
+      else
+        {
+          if (!current_sv)
+            return 0;
+          return find_element_from_sv (0, document, current_sv, 0);
+        }
+    }
+}
+
+/* find the INDEX_ENTRY associated element matching ELEMENT_HV.
+
+   If the index entry was reassociated, the tree element the
+   index entry is reassociated to is not index_entry->entry_element
+   but index_entry->entry_associated_element.  The original
+   tree element that was associated is index_entry->entry_element.
+   Depending on the situation one or the other may be looked for
+   and the code tries both.
+
+   The reassociated tree element, for example, would be used
+   when doing a link to the tree from the index entry.  But it may
+   also be the original tree element that is used, for example
+   to get the index entry tree element content, for instance
+   when going through the elements associated to indices to setup
+   index entries sort strings.
+ */
+ELEMENT *find_index_entry_associated_hv (INDEX_ENTRY *index_entry,
+                                         HV *element_hv)
+{
+  if (index_entry->entry_associated_element
+      && index_entry->entry_associated_element->hv == element_hv)
+    return index_entry->entry_associated_element;
+
+  if (index_entry->entry_element
+  /* if the index entry was reassociated it is important to check */
+      && index_entry->entry_element->hv == element_hv)
+    return index_entry->entry_element;
+
+  return 0;
+}
+
 /* TODO nodedescription using the extra element_node and the
  * node extra node_description? */
 
@@ -1289,11 +1479,11 @@ find_element_from_sv (CONVERTER *converter, DOCUMENT *document_in,
 
   element_hv = (HV *) SvRV (element_sv);
 
-  FETCH(cmdname)
-
   if (!document && converter && converter->document)
     document = converter->document;
 
+  FETCH(cmdname)
+
   if (cmdname_sv && (output_units_descriptor || document))
     {
       char *cmdname = SvPVutf8_nolen (*cmdname_sv);
@@ -1308,11 +1498,22 @@ find_element_from_sv (CONVERTER *converter, DOCUMENT *document_in,
           if (element)
             return element;
         }
+      else if (cmd == CM_subentry)
+        {
+          ELEMENT *index_element = find_subentry_index_command_sv (document,
+                                                                   element_hv);
+          if (index_element)
+            {
+              ELEMENT *element = find_index_entry_subentry (index_element,
+                                                            element_hv);
+              if (element)
+                return element;
+            }
+        }
     }
 
   FETCH(extra)
 
-#define EXTRA(key) key##_sv = hv_fetch (extra_hv, #key, strlen(#key), 0);
   if (extra_sv)
     {
       HV *extra_hv = (HV *) SvRV (*extra_sv);
@@ -1352,27 +1553,36 @@ find_element_from_sv (CONVERTER *converter, DOCUMENT *document_in,
             }
         }
 
-
       EXTRA(associated_index_entry)
       if (associated_index_entry_sv)
         {
-          ELEMENT *index_element = find_element_extra_index_entry_sv (document,
-                                               converter,
-                                               *associated_index_entry_sv);
-          /* there should be no ambiguity, but we check nevertheless */
-          if (index_element && index_element->hv == element_hv)
-            return (index_element);
+          INDEX_ENTRY *index_entry
+               = find_element_extra_index_entry_sv (document,
+                                                    converter,
+                                              *associated_index_entry_sv);
+          if (index_entry)
+            {
+              ELEMENT *index_element
+                = find_index_entry_associated_hv (index_entry, element_hv);
+              if (index_element)
+                return (index_element);
+            }
         }
 
       EXTRA(index_entry)
       if (index_entry_sv)
         {
-          ELEMENT *index_element = find_element_extra_index_entry_sv (document,
+          INDEX_ENTRY *index_entry
+                     = find_element_extra_index_entry_sv (document,
                                                           converter,
                                                           *index_entry_sv);
-          /* it is important to check if the index entry was reassociated */
-          if (index_element && index_element->hv == element_hv)
-            return (index_element);
+          if (index_entry)
+            {
+              ELEMENT *index_element
+                = find_index_entry_associated_hv (index_entry, element_hv);
+              if (index_element)
+                return (index_element);
+            }
         }
     }
   return 0;
diff --git a/tp/Texinfo/XS/main/get_perl_info.h b/tp/Texinfo/XS/main/get_perl_info.h
index 8276e43f27..5be575eff5 100644
--- a/tp/Texinfo/XS/main/get_perl_info.h
+++ b/tp/Texinfo/XS/main/get_perl_info.h
@@ -42,6 +42,10 @@ CONVERTER *get_sv_converter (SV *sv_in, const char *warn_string);
 int converter_initialize (SV *converter_sv);
 void reset_output_init_conf (SV *sv_in);
 
+INDEX_ENTRY *find_index_entry_sv (SV *index_entry_sv, INDEX **index_names,
+                     const char *warn_string, char **entry_index_name,
+                     int *entry_number);
+
 INDEX_SORTED_BY_LETTER *get_sv_index_entries_sorted_by_letter
                  (INDEX **index_names, SV *index_entries_sorted_by_letter);
 
diff --git a/tp/Texinfo/XS/structuring_transfo/StructuringTransfoXS.xs b/tp/Texinfo/XS/structuring_transfo/StructuringTransfoXS.xs
index d1cad4a903..900f9e9c22 100644
--- a/tp/Texinfo/XS/structuring_transfo/StructuringTransfoXS.xs
+++ b/tp/Texinfo/XS/structuring_transfo/StructuringTransfoXS.xs
@@ -40,6 +40,7 @@
 #include "transformations.h"
 #include "structuring.h"
 #include "output_unit.h"
+#include "indices_in_conversion.h"
 #include "get_perl_info.h"
 #include "build_perl_info.h"
 
@@ -437,4 +438,37 @@ split_pages (SV *output_units_in, char *split)
         if (output_units)
           split_pages (output_units, split);
 
+SV *
+index_entry_element_sort_string (SV *document_in, SV *main_entry_sv, SV *element_sv, SV *options_sv, SV *prefer_reference_element_sv=0)
+    PREINIT:
+        DOCUMENT *document;
+        char *sort_string = 0;
+     CODE:
+        document = get_sv_document_document (document_in,
+                   "index_entry_element_sort_string");
+        if (document)
+          {
+            char *entry_index_name;
+            int entry_number;
+            int prefer_reference_element = 0;
+            ELEMENT *element = find_element_from_sv (0, document,
+                                                    element_sv, 0);
+            INDEX_ENTRY *main_entry = find_index_entry_sv (main_entry_sv,
+                                          document->index_names, 0,
+                                          &entry_index_name, &entry_number);
+            TEXT_OPTIONS *options
+              = copy_sv_options_for_convert_text (options_sv);
+            if (prefer_reference_element_sv && SvOK (prefer_reference_element_sv))
+              prefer_reference_element = SvIV (prefer_reference_element_sv);
+            sort_string = index_entry_element_sort_string (main_entry,
+                                 element, options, prefer_reference_element);
+            destroy_text_options (options);
+          }
+
+       if (!sort_string)
+         RETVAL = newSV (0);
+       else
+         RETVAL = newSVpv_utf8 (sort_string, 0);
+    OUTPUT:
+         RETVAL
 
diff --git a/tp/t/test_sort.t b/tp/t/test_sort.t
index fac7006334..eef08d15c0 100644
--- a/tp/t/test_sort.t
+++ b/tp/t/test_sort.t
@@ -49,6 +49,8 @@ my $index_entries = Texinfo::Structuring::merge_indices($indices_information);
 my $document_information = $document->global_information();
 my $main_configuration = Texinfo::MainConfig::new({'ENABLE_ENCODING' => 1});
 Texinfo::Common::set_output_encodings($main_configuration, $document_information);
+$main_configuration->{'document_descriptor'}
+  = $document->document_descriptor();
 my ($sorted_index_entries, $index_entries_sort_strings)
   = Texinfo::Structuring::sort_indices_by_index($registrar, $main_configuration,
                                           $index_entries, $indices_information);
@@ -124,6 +126,8 @@ $tree = $document->tree();
 $registrar = $parser->registered_errors();
 $indices_information = $document->indices_information();
 $index_entries = Texinfo::Structuring::merge_indices($indices_information);
+$main_configuration->{'document_descriptor'}
+  = $document->document_descriptor();
 ($sorted_index_entries, $index_entries_sort_strings)
   = Texinfo::Structuring::sort_indices_by_index($registrar, $main_configuration,
                                           $index_entries, $indices_information);
diff --git a/tp/t/test_utils.pl b/tp/t/test_utils.pl
index 00930006bc..560dab6dd3 100644
--- a/tp/t/test_utils.pl
+++ b/tp/t/test_utils.pl
@@ -1177,6 +1177,8 @@ sub test($$)
   my ($sorted_index_entries, $index_entries_sort_strings);
   my $indices_sorted_sort_strings;
   if ($merged_index_entries) {
+    $main_configuration->{'document_descriptor'}
+      = $document->document_descriptor();
     ($sorted_index_entries, $index_entries_sort_strings)
       = Texinfo::Structuring::sort_indices_by_index($registrar,
                                    $main_configuration,
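
The two-candidate lookup described in the comment above
find_index_entry_associated_hv can be tried in isolation with the
sketch below.  It is only an illustration, not code from the commit:
SKETCH_ELEMENT, SKETCH_INDEX_ENTRY, sketch_find_associated and
sketch_run_example are hypothetical names, and a plain void pointer
stands in for the Perl HV pointer, so that no Perl headers are needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ELEMENT and INDEX_ENTRY; only the fields
   used by the lookup are kept, and void * replaces HV *. */
typedef struct {
  void *hv;
} SKETCH_ELEMENT;

typedef struct {
  SKETCH_ELEMENT *entry_element;            /* originally associated */
  SKETCH_ELEMENT *entry_associated_element; /* set when reassociated */
} SKETCH_INDEX_ENTRY;

/* Same order of checks as in the commit: the reassociated element is
   tried first, then the original one; NULL if neither matches. */
SKETCH_ELEMENT *
sketch_find_associated (SKETCH_INDEX_ENTRY *index_entry, void *element_hv)
{
  if (index_entry->entry_associated_element
      && index_entry->entry_associated_element->hv == element_hv)
    return index_entry->entry_associated_element;

  if (index_entry->entry_element
      && index_entry->entry_element->hv == element_hv)
    return index_entry->entry_element;

  return NULL;
}

/* Builds one reassociated index entry and checks the three cases. */
void
sketch_run_example (void)
{
  /* the addresses of these variables stand in for distinct Perl hashes */
  int hash_a, hash_b, hash_c;
  SKETCH_ELEMENT original = { &hash_a };
  SKETCH_ELEMENT reassociated = { &hash_b };
  SKETCH_INDEX_ENTRY entry = { &original, &reassociated };

  /* the hash of the reassociated element finds the reassociated element */
  assert (sketch_find_associated (&entry, &hash_b) == &reassociated);
  /* the hash of the original element still finds the original element */
  assert (sketch_find_associated (&entry, &hash_a) == &original);
  /* an unrelated hash matches neither candidate */
  assert (sketch_find_associated (&entry, &hash_c) == NULL);
}
```

Either candidate can be returned depending on which hash the caller
holds, which appears to be why both EXTRA(associated_index_entry) and
EXTRA(index_entry) in find_element_from_sv go through the same helper.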


