From: Patrice Dumas
Subject: branch master updated: * tp/Texinfo/Convert/Converter.pm (converter_options_for_output): rename encode_converter_for_output as converter_options_for_output. Update callers.
Date: Sun, 05 Nov 2023 14:32:09 -0500
This is an automated email from the git hooks/post-receive script.
pertusus pushed a commit to branch master
in repository texinfo.
The following commit(s) were added to refs/heads/master by this push:
new 150d948494 * tp/Texinfo/Convert/Converter.pm
(converter_options_for_output): rename encode_converter_for_output as
converter_options_for_output. Update callers.
150d948494 is described below
commit 150d94849494cd9d34da1e50d6bf947302ef1d92
Author: Patrice Dumas <pertusus@free.fr>
AuthorDate: Sun Nov 5 20:31:56 2023 +0100
* tp/Texinfo/Convert/Converter.pm (converter_options_for_output):
rename encode_converter_for_output as converter_options_for_output.
Update callers.
* tp/Texinfo/Convert/HTML.pm (_prepare_conversion_units)
(_prepare_units_directions_files, _prepare_title_titlepage)
(_html_convert_output), tp/Texinfo/Convert/Text.pm
(select_text_options), tp/Texinfo/Translations.pm (_XS_gdt),
tp/Texinfo/XS/convert/ConvertXS.xs (html_prepare_conversion_units)
(html_prepare_units_directions_files)
(html_prepare_title_titlepage, html_convert_output),
tp/Texinfo/XS/main/TranslationsXS.xs (gettree),
tp/Texinfo/XS/main/get_perl_info.c (copy_sv_options_for_convert_text):
do not convert strings to UTF-8 in perl but in XS. Rename
encode_text_options as select_text_options in Texinfo::Convert::Text.
* tp/Texinfo/Translations.pm (import),
tp/Texinfo/XS/main/TranslationsXS.xs (gdt): remove _XS_gettree, have a
direct override of gdt (in comments, as it does not work). In XS,
rename gettree as gdt and adapt call to match gdt call in perl.
---
ChangeLog | 23 +++++++++++++++++
tp/Texinfo/Convert/Converter.pm | 5 ++--
tp/Texinfo/Convert/HTML.pm | 50 +++++++++++++-----------------------
tp/Texinfo/Convert/Text.pm | 44 ++++++++++---------------------
tp/Texinfo/Translations.pm | 39 +++++-----------------------
tp/Texinfo/XS/convert/ConvertXS.xs | 22 ++++++++--------
tp/Texinfo/XS/main/TranslationsXS.xs | 24 ++++++++++-------
tp/Texinfo/XS/main/get_perl_info.c | 4 +--
8 files changed, 90 insertions(+), 121 deletions(-)
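The central change, "do not convert strings to UTF-8 in perl but in XS",
replaces an Encode::encode('UTF-8', ...) call in Perl followed by
SvPVbyte_nolen in XS with a plain character string read through
SvPVutf8_nolen. The standalone sketch below does not use the Perl API; it
only illustrates, for code points below 256, how the UTF-8 bytes that
SvPVutf8 is documented to return differ from a one-byte-per-character
(Latin-1) representation, which is why the accessor had to change once the
Perl-side encoding was dropped. The helper name latin1_to_utf8 is
hypothetical and is not part of Texinfo.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not Texinfo or Perl API code: encode a
   NUL-terminated Latin-1 byte string as UTF-8.  The caller frees the
   returned buffer.  Returns 0 if allocation fails. */
char *
latin1_to_utf8 (const char *latin1)
{
  size_t len = strlen (latin1);
  /* a Latin-1 byte needs at most two UTF-8 bytes */
  char *result = malloc (2 * len + 1);
  char *out = result;
  const unsigned char *in = (const unsigned char *) latin1;

  if (!result)
    return 0;

  for (; *in; in++)
    {
      if (*in < 0x80)
        /* ASCII is unchanged in UTF-8 */
        *out++ = (char) *in;
      else
        {
          /* code points 0x80..0xFF become a two-byte sequence */
          *out++ = (char) (0xC0 | (*in >> 6));
          *out++ = (char) (0x80 | (*in & 0x3F));
        }
    }
  *out = '\0';
  return result;
}
```

Under these assumptions, an ASCII file name such as "index.html" is the
same in both representations, while "caf\xE9" (four bytes in Latin-1)
becomes the five bytes "caf\xC3\xA9" in UTF-8.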
diff --git a/ChangeLog b/ChangeLog
index 58fb5724f9..b031d5e694 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,26 @@
+2023-11-05 Patrice Dumas <pertusus@free.fr>
+
+ * tp/Texinfo/Convert/Converter.pm (converter_options_for_output):
+ rename encode_converter_for_output as converter_options_for_output.
+ Update callers.
+
+ * tp/Texinfo/Convert/HTML.pm (_prepare_conversion_units)
+ (_prepare_units_directions_files, _prepare_title_titlepage)
+ (_html_convert_output), tp/Texinfo/Convert/Text.pm
+ (select_text_options), tp/Texinfo/Translations.pm (_XS_gdt),
+ tp/Texinfo/XS/convert/ConvertXS.xs (html_prepare_conversion_units)
+ (html_prepare_units_directions_files)
+ (html_prepare_title_titlepage, html_convert_output),
+ tp/Texinfo/XS/main/TranslationsXS.xs (gettree),
+ tp/Texinfo/XS/main/get_perl_info.c (copy_sv_options_for_convert_text):
+ do not convert strings to UTF-8 in perl but in XS. Rename
+ encode_text_options as select_text_options in Texinfo::Convert::Text.
+
+ * tp/Texinfo/Translations.pm (import),
+ tp/Texinfo/XS/main/TranslationsXS.xs (gdt): remove _XS_gettree, have a
+ direct override of gdt (in comments, as it does not work). In XS,
+ rename gettree as gdt and adapt call to match gdt call in perl.
+
2023-11-05 Patrice Dumas <pertusus@free.fr>
* tp/Texinfo/Common.pm, tp/Texinfo/Config.pm
diff --git a/tp/Texinfo/Convert/Converter.pm b/tp/Texinfo/Convert/Converter.pm
index 808afa7d31..bd27c40b49 100644
--- a/tp/Texinfo/Convert/Converter.pm
+++ b/tp/Texinfo/Convert/Converter.pm
@@ -548,9 +548,8 @@ sub encode_converter_document($)
return $result;
}
-# to be used before output
-# TODO document?
-sub encode_converter_for_output($)
+# FIXME remove, do in XS
+sub converter_options_for_output($)
{
my $self = shift;
diff --git a/tp/Texinfo/Convert/HTML.pm b/tp/Texinfo/Convert/HTML.pm
index d2645869bf..0d43e18bd9 100644
--- a/tp/Texinfo/Convert/HTML.pm
+++ b/tp/Texinfo/Convert/HTML.pm
@@ -8524,8 +8524,7 @@ sub convert_tree($$;$)
my $explanation = shift;
# No XS, convert_tree is not called on trees registered in XS
- #my $XS_result = _XS_html_convert_tree($self, $tree,
- # (defined($explanation) ? Encode::encode('UTf-8', $explanation) : ''));
+ #my $XS_result = _XS_html_convert_tree($self, $tree, $explanation);
#return $XS_result if (defined($XS_result));
# when formatting accents, goes through xml_accent without
@@ -9378,13 +9377,12 @@ sub _prepare_conversion_units($$$)
my ($output_units, $special_units, $associated_special_units);
if ($self->{'converter_descriptor'} and $XS_convert) {
- my $encoded_converter = $self->encode_converter_for_output();
- my $encoded_document_name = Encode::encode('UTF-8', $document_name);
+ my $converter_info = $self->converter_options_for_output();
my ($targets, $special_targets, $seen_ids);
($output_units, $special_units, $associated_special_units,
$targets, $special_targets, $seen_ids)
- = _XS_prepare_conversion_units($encoded_converter,
- $encoded_document_name);
+ = _XS_prepare_conversion_units($converter_info,
+ $document_name);
$self->{'targets'} = $targets;
$self->{'special_targets'} = $special_targets;
$self->{'seen_ids'} = $seen_ids;
@@ -9451,20 +9449,15 @@ sub _prepare_units_directions_files($$$$$$$$)
my $document_name = shift;
if ($self->{'converter_descriptor'} and $XS_convert) {
- my $encoded_converter = $self->encode_converter_for_output();
- my $encoded_document_name = Encode::encode('UTF-8', $document_name);
- my $encoded_output_file = Encode::encode('UTF-8', $output_file);
- my $encoded_destination_directory
- = Encode::encode('UTF-8', $destination_directory);
- my $encoded_output_filename = Encode::encode('UTF-8', $output_filename);
+ my $converter_info = $self->converter_options_for_output();
my ($XS_files_source_info, $global_units_directions,
$elements_in_file_count, $filenames,
$file_counters, $out_filepaths)
- = _XS_prepare_units_directions_files($encoded_converter,
+ = _XS_prepare_units_directions_files($converter_info,
$output_units, $special_units, $associated_special_units,
- $encoded_output_file, $encoded_destination_directory,
- $encoded_output_filename, $encoded_document_name);
+ $output_file, $destination_directory,
+ $output_filename, $document_name);
$self->{'global_units_directions'} = $global_units_directions;
$self->{'elements_in_file_count'} = $elements_in_file_count;
@@ -11141,10 +11134,8 @@ sub _prepare_title_titlepage($$$$)
my $output_filename = shift;
if ($self->{'converter_descriptor'} and $XS_convert) {
- my $encoded_output_filename = Encode::encode('UTF-8', $output_filename);
- my $encoded_output_file = Encode::encode('UTF-8', $output_file);
_XS_html_prepare_title_titlepage($self, $output_units,
- $encoded_output_file, $encoded_output_filename);
+ $output_file, $output_filename);
return;
}
@@ -11177,7 +11168,7 @@ sub convert($$)
my $self = shift;
my $document = shift;
- my $encoded_converter;
+ my $converter_info;
my $root = $document->tree();
my $result = '';
@@ -11207,9 +11198,9 @@ sub convert($$)
if ($self->{'converter_descriptor'} and $XS_convert) {
# Do it preferentially in XS, and import to perl, to have data
# setup in C for XS too.
- $encoded_converter = $self->encode_converter_for_output();
+ $converter_info = $self->converter_options_for_output();
my $global_units_directions =
- _XS_prepare_output_units_global_targets($encoded_converter,
+ _XS_prepare_output_units_global_targets($converter_info,
$output_units, $special_units, $associated_special_units);
$self->{'global_units_directions'} = $global_units_directions;
} else {
@@ -11246,7 +11237,7 @@ sub convert($$)
$self->{'current_filename'} = '';
if ($self->{'converter_descriptor'} and $XS_convert) {
- my $XS_result = _XS_html_convert_convert ($encoded_converter, $root,
+ my $XS_result = _XS_html_convert_convert ($converter_info, $root,
$output_units, $special_units);
$self->_finalize_output_state();
return $XS_result;
@@ -11533,18 +11524,13 @@ sub _html_convert_output($$$$$$$$)
$destination_directory, $output_filename, $document_name) = @_;
if ($self->{'converter_descriptor'} and $XS_convert) {
- my $encoded_converter = $self->encode_converter_for_output();
- my $encoded_document_name = Encode::encode('UTF-8', $document_name);
- my $encoded_output_file = Encode::encode('UTF-8', $output_file);
- my $encoded_destination_directory
- = Encode::encode('UTF-8', $destination_directory);
- my $encoded_output_filename = Encode::encode('UTF-8', $output_filename);
+ my $converter_info = $self->converter_options_for_output();
my $XS_text_output
- = _XS_html_convert_output ($encoded_converter,
- $root, $output_units, $special_units, $encoded_output_file,
- $encoded_destination_directory, $encoded_output_filename,
- $encoded_document_name);
+ = _XS_html_convert_output ($converter_info,
+ $root, $output_units, $special_units, $output_file,
+ $destination_directory, $output_filename,
+ $document_name);
return $XS_text_output;
}
diff --git a/tp/Texinfo/Convert/Text.pm b/tp/Texinfo/Convert/Text.pm
index f00e418d0c..8422b22e48 100644
--- a/tp/Texinfo/Convert/Text.pm
+++ b/tp/Texinfo/Convert/Text.pm
@@ -412,50 +412,32 @@ sub copy_options_for_convert_text($;$)
return %options;
}
-# encode to UTF-8 bytes before passing to XS code. Specific
-# text options are in general ASCII strings, but this is still
-# cleaner. Also encode and select converter options passed.
-sub encode_text_options($)
+# select converter options passed.
+sub select_text_options($)
{
my $options = shift;
- my $encoded_options = {};
-
- foreach my $option ('enabled_encoding') {
- if (defined($options->{$option})) {
- $encoded_options->{$option}
- = Encode::encode("UTF-8", $options->{$option});
- }
- }
+ my $selected_options = {};
foreach my $option (@text_indicator_converter_options,
'INCLUDE_DIRECTORIES',
+ 'expanded_formats',
# non-converter indicator options
- 'sc', 'code', 'sort_string') {
+ 'enabled_encoding', 'sc', 'code', 'sort_string') {
if (defined($options->{$option})) {
- $encoded_options->{$option} = $options->{$option};
- }
- }
-
- if (defined($options->{'expanded_formats'})) {
- # FIXME may not need to encode, as the formats are ascii strings
- my $expanded_formats = {};
- foreach my $format (keys(%{$options->{'expanded_formats'}})) {
- my $encoded_format = Encode::encode("UTF-8", $format);
- $expanded_formats->{$encoded_format} = 1;
+ $selected_options->{$option} = $options->{$option};
}
- $encoded_options->{'expanded_formats'} = $expanded_formats;
}
# called through convert_to_text with a converter in text options
if ($options->{'converter'}
and $options->{'converter'}->{'conf'}) {
- $encoded_options->{'other_converter_options'}
+ $selected_options->{'other_converter_options'}
= $options->{'converter'}->{'conf'};
}
- $encoded_options->{'self_converter_options'} = $options;
+ $selected_options->{'self_converter_options'} = $options;
- return $encoded_options;
+ return $selected_options;
}
# This is used if the document is available for XS, but XS is not
@@ -490,8 +472,8 @@ sub convert_to_text($;$)
# Interface with XS converter.
if ($XS_convert and defined($root->{'tree_document_descriptor'})) {
- my $encoded_options = encode_text_options($options);
- my $XS_result = _convert_tree_with_XS($encoded_options, $root, $options);
+ my $selected_options = select_text_options($options);
+ my $XS_result = _convert_tree_with_XS($selected_options, $root, $options);
if (defined ($XS_result)) {
return $XS_result;
} else {
@@ -999,8 +981,8 @@ sub output($$)
my $result;
# Interface with XS converter.
if ($XS_convert and defined($root->{'tree_document_descriptor'})) {
- my $encoded_options = encode_text_options($self);
- my $XS_result = _convert_tree_with_XS($encoded_options, $root, $self);
+ my $selected_options = select_text_options($self);
+ my $XS_result = _convert_tree_with_XS($selected_options, $root, $self);
if (defined ($XS_result)) {
$result = $XS_result;
} else {
diff --git a/tp/Texinfo/Translations.pm b/tp/Texinfo/Translations.pm
index 0071934e01..a916af8a8e 100644
--- a/tp/Texinfo/Translations.pm
+++ b/tp/Texinfo/Translations.pm
@@ -52,10 +52,13 @@ sub import {
Texinfo::XSLoader::override(
"Texinfo::Translations::_XS_configure",
"Texinfo::TranslationsXS::configure");
- # not loaded because it is not usable. See comments near _XS_gettree
+ # Example of how gdt could be overriden. Not used because
+ # the approach is flawed as there won't be any substitution if the trees in
+ # $replaced_substrings are not registered in C data, as is the case in
+ # general.
#Texinfo::XSLoader::override(
- # "Texinfo::Translations::_XS_gettree",
- # "Texinfo::TranslationsXS::gettree");
+ # "Texinfo::Translations::gdt",
+ # "Texinfo::TranslationsXS::gdt");
$module_loaded = 1;
}
# The usual import method
@@ -457,36 +460,6 @@ sub pgdt($$$;$$)
$translation_context, $lang);
}
-
-sub _XS_gettree($;$$$$)
-{
-}
-
-# Example of what a wrapper around XS gdt code could be. Not used because
-# the approach is flawed as there won't be any substitution if the trees in
-# $replaced_substrings are not registered in C data, as is the case in general.
-# The code is fine, though.
-sub _XS_gdt($$;$$$)
-{
- my ($customization_information, $string, $replaced_substrings,
- $translation_context, $lang) = @_;
-
- my $encoded_string = Encode::encode('utf-8', $string);
- my $encoded_translation_context;
- $encoded_translation_context = Encode::encode('utf-8', $translation_context)
- if (defined($translation_context));
- my $encoded_lang;
- $encoded_lang = Encode::encode('utf-8', $lang)
- if (defined($lang));
-
- my $tree = _XS_gettree ($encoded_string, $customization_information,
- $replaced_substrings,
- $encoded_translation_context, $encoded_lang);
-
- return $tree;
-}
-
-
if (0) {
# it is needed to mark the translation as gdt is called like
# gdt($customization_information, '....')
diff --git a/tp/Texinfo/XS/convert/ConvertXS.xs b/tp/Texinfo/XS/convert/ConvertXS.xs
index 17b4f17914..246d321d25 100644
--- a/tp/Texinfo/XS/convert/ConvertXS.xs
+++ b/tp/Texinfo/XS/convert/ConvertXS.xs
@@ -254,7 +254,7 @@ html_prepare_conversion_units (SV *converter_in, ...)
SV *seen_ids_sv;
PPCODE:
if (items > 1 && SvOK(ST(1)))
- document_name = SvPVbyte_nolen (ST(1));
+ document_name = SvPVutf8_nolen (ST(1));
/* add warn string? */
self = set_output_converter_sv (converter_in, 0);
@@ -290,10 +290,10 @@ html_prepare_conversion_units (SV *converter_in, ...)
void
html_prepare_units_directions_files (SV *converter_in, SV *output_units_in, SV *special_units_in, SV *associated_special_units_in, output_file, destination_directory, output_filename, document_name)
- char *output_file = (char *)SvPVbyte_nolen($arg);
- char *destination_directory = (char *)SvPVbyte_nolen($arg);
- char *output_filename = (char *)SvPVbyte_nolen($arg);
- char *document_name = (char *)SvPVbyte_nolen($arg);
+ char *output_file = (char *)SvPVutf8_nolen($arg);
+ char *destination_directory = (char *)SvPVutf8_nolen($arg);
+ char *output_filename = (char *)SvPVutf8_nolen($arg);
+ char *document_name = (char *)SvPVutf8_nolen($arg);
PREINIT:
CONVERTER *self = 0;
int output_units_descriptor = 0;
@@ -428,8 +428,8 @@ html_translate_names (SV *converter_in)
void
html_prepare_title_titlepage (SV *converter_in, SV *output_units_in, output_file, output_filename)
- char *output_file = (char *)SvPVbyte_nolen($arg);
- char *output_filename = (char *)SvPVbyte_nolen($arg);
+ char *output_file = (char *)SvPVutf8_nolen($arg);
+ char *output_filename = (char *)SvPVutf8_nolen($arg);
PREINIT:
CONVERTER *self = 0;
int output_units_descriptor = 0;
@@ -526,10 +526,10 @@ html_convert_tree (SV *converter_in, SV *tree_in, explanation)
SV *
html_convert_output (SV *converter_in, SV *tree_in, SV *output_units_in, SV *special_units_in, output_file, destination_directory, output_filename, document_name)
- char *output_file = (char *)SvPVbyte_nolen($arg);
- char *destination_directory = (char *)SvPVbyte_nolen($arg);
- char *output_filename = (char *)SvPVbyte_nolen($arg);
- char *document_name = (char *)SvPVbyte_nolen($arg);
+ char *output_file = (char *)SvPVutf8_nolen($arg);
+ char *destination_directory = (char *)SvPVutf8_nolen($arg);
+ char *output_filename = (char *)SvPVutf8_nolen($arg);
+ char *document_name = (char *)SvPVutf8_nolen($arg);
PREINIT:
CONVERTER *self = 0;
DOCUMENT *document = 0;
diff --git a/tp/Texinfo/XS/main/TranslationsXS.xs b/tp/Texinfo/XS/main/TranslationsXS.xs
index 286e9c2374..789c210a0b 100644
--- a/tp/Texinfo/XS/main/TranslationsXS.xs
+++ b/tp/Texinfo/XS/main/TranslationsXS.xs
@@ -48,11 +48,17 @@ configure (localesdir, strings_textdomain="texinfo_document")
configure (localesdir, strings_textdomain);
+# TODO not sure that the options_in argument is good to be
+# copy_sv_options argument, may need to retrieve a converter
+# first or Parser configuration. Does not matter much as
+# the approach does not work because replaced_substrings
+# perl element tree cannot be retrieved in C stored documents.
# optional:
-# options, replaced_substrings, translation_context, lang
+# replaced_substrings, translation_context, lang
SV *
-gettree (char *string, ...)
- PROTOTYPE: $;$$$$
+gdt (SV *options_in, string, ...)
+ char *string = (char *)SvPVutf8_nolen($arg);
+ PROTOTYPE: $$;$$$
PREINIT:
char *translation_context = 0;
char *in_lang = 0;
@@ -63,10 +69,14 @@ gettree (char *string, ...)
int gdt_document_descriptor;
DOCUMENT *gdt_document;
CODE:
+ if (SvOK(options_in))
+ {
+ options = copy_sv_options (options_in);
+ }
if (items > 4 && SvOK(ST(4)))
- in_lang = (char *)SvPVbyte_nolen(ST(4));
+ in_lang = (char *)SvPVutf8_nolen(ST(4));
if (items > 3 && SvOK(ST(3)))
- translation_context = (char *)SvPVbyte_nolen(ST(3));
+ translation_context = (char *)SvPVutf8_nolen(ST(3));
if (items > 2 && SvOK(ST(2)))
{
/* TODO put in get_perl_info.h */
@@ -92,10 +102,6 @@ gettree (char *string, ...)
replaced_substrings, key, document->tree);
}
}
- if (items > 1 && SvOK(ST(1)))
- {
- options = copy_sv_options (ST(1));
- }
gdt_document_descriptor
= gdt (string, options, replaced_substrings,
diff --git a/tp/Texinfo/XS/main/get_perl_info.c b/tp/Texinfo/XS/main/get_perl_info.c
index d1ceb3d4b9..b8dfe38af5 100644
--- a/tp/Texinfo/XS/main/get_perl_info.c
+++ b/tp/Texinfo/XS/main/get_perl_info.c
@@ -287,7 +287,7 @@ copy_sv_options_for_convert_text (SV *sv_in)
enabled_encoding_sv = hv_fetch (hv_in, "enabled_encoding",
strlen ("enabled_encoding"), 0);
if (enabled_encoding_sv)
- text_options->encoding = strdup (SvPVbyte_nolen (*enabled_encoding_sv));
+ text_options->encoding = strdup (SvPVutf8_nolen (*enabled_encoding_sv));
include_directories_sv = hv_fetch (hv_in, "INCLUDE_DIRECTORIES",
strlen ("INCLUDE_DIRECTORIES"), 0);
@@ -1594,7 +1594,7 @@ set_conf (CONVERTER *converter, const char *conf, SV *value)
{
if (converter->conf)
get_sv_option (converter->conf, conf, value);
- /* Too early to have aoptions set
+ /* Too early to have options set
else
fprintf (stderr, "HHH no converter conf %s\n", conf);
*/