branch master updated: * tp/Texinfo/Convert/Unicode.pm (string_width): c
From: Patrice Dumas
Subject: branch master updated: * tp/Texinfo/Convert/Unicode.pm (string_width): consider Default_Ignorable_Code_Point to be of zero width. These codepoints are described as having no visible glyph or advance width in and of themselves.
Date: Wed, 06 Sep 2023 16:11:25 -0400
This is an automated email from the git hooks/post-receive script.
pertusus pushed a commit to branch master
in repository texinfo.
The following commit(s) were added to refs/heads/master by this push:
new 15b7cdae14 * tp/Texinfo/Convert/Unicode.pm (string_width): consider
Default_Ignorable_Code_Point to be of zero width. These codepoints are
described as having no visible glyph or advance width in and of themselves.
15b7cdae14 is described below
commit 15b7cdae144fdc959b1b15ed8b2186f0cd453319
Author: Patrice Dumas <pertusus@free.fr>
AuthorDate: Wed Sep 6 22:11:16 2023 +0200
* tp/Texinfo/Convert/Unicode.pm (string_width): consider
Default_Ignorable_Code_Point to be of zero width. These codepoints
are described as having no visible glyph or advance width in and of
themselves.
---
ChangeLog | 7 +++++++
tp/Texinfo/Convert/Unicode.pm | 8 +++++---
tp/t/results/moresectioning/only_special_spaces_node.pl | 6 +++---
tp/t/results/sectioning/in_menu_only_special_spaces_node.pl | 6 +++---
tp/t/results/sectioning/in_menu_only_special_spaces_node_menu.pl | 6 +++---
5 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 91623d93ed..1d40606ddd 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2023-09-06 Patrice Dumas <pertusus@free.fr>
+
+ * tp/Texinfo/Convert/Unicode.pm (string_width): consider
+ Default_Ignorable_Code_Point to be of zero width. These codepoints
+ are described as having no visible glyph or advance width in and of
+ themselves.
+
2023-09-06 Patrice Dumas <pertusus@free.fr>
Consider control characters to have a 0 width
diff --git a/tp/Texinfo/Convert/Unicode.pm b/tp/Texinfo/Convert/Unicode.pm
index a5a308d6a5..b7dcc86cfd 100644
--- a/tp/Texinfo/Convert/Unicode.pm
+++ b/tp/Texinfo/Convert/Unicode.pm
@@ -1667,12 +1667,13 @@ sub string_width($)
# of the string. These regexes are faster than making the substitutions
# below.
if ($string =~ /^[\p{IsPrint}]*$/
- and $string !~ /[\p{InFullwidth}\pM]/) {
+ and $string !~ /[\p{InFullwidth}\pM\p{Default_Ignorable_Code_Point}]/) {
return length($string);
}
$string =~ s/\p{InFullwidth}/\x{02}/g;
$string =~ s/\pM/\x{00}/g;
+ $string =~ s/\p{Default_Ignorable_Code_Point}/\x{00}/g;
$string =~ s/\p{IsPrint}/\x{01}/g;
$string =~ s/[^\x{01}\x{02}]/\x{00}/g;
@@ -1688,8 +1689,9 @@ sub string_width($)
foreach my $character(split '', $string) {
if ($character =~ /\p{InFullwidth}/) {
$width += 2;
- } elsif ($character =~ /\pM/) {
- # a mark set at length 0
+ } elsif ($character =~ /[\pM\p{Default_Ignorable_Code_Point}]/) {
+ # a mark set at length 0 or a Default Ignorable Code Point
+ # that have no visible glyph or advance width in and of themselves
} elsif ($character =~ /\p{IsPrint}/) {
$width += 1;
} elsif ($character =~ /\p{IsControl}/) {
diff --git a/tp/t/results/moresectioning/only_special_spaces_node.pl b/tp/t/results/moresectioning/only_special_spaces_node.pl
index 6da280d00b..9af7427745 100644
--- a/tp/t/results/moresectioning/only_special_spaces_node.pl
+++ b/tp/t/results/moresectioning/only_special_spaces_node.pl
@@ -525,7 +525,7 @@ $result_texts{'only_special_spaces_node'} = 'top
*********************************************************
5 MONGOLIAN VOWEL SEPARATOR|| EM SPACE| |
-******************************************
+*****************************************
6 THREE-PER-EM SPACE| | FOUR-PER-EM SPACE| | SIX-PER-EM SPACE| | FIGURE SPACE|
| PUNCTUATION SPACE| | THIN SPACE| | HAIR SPACE| | LINE SEPARATOR|
| PARAGRAPH
SEPARATOR|
| NARROW NO-BREAK SPACE| | MEDIUM MATHEMATICAL SPACE| | IDEOGRAPHIC
SPACE| |
@@ -812,7 +812,7 @@ File: , Node:
, Next: , Prev: , Up: Top
File: , Node: , Next:
, Prev:
, Up: Top
5 MONGOLIAN VOWEL SEPARATOR|| EM SPACE| |
-******************************************
+*****************************************
File: , Node:
, Prev: , Up: Top
@@ -827,7 +827,7 @@ Node: Top56
Node: 205
Node:
499
Node: 681
-Node:
868
+Node:
867
End Tag Table
diff --git a/tp/t/results/sectioning/in_menu_only_special_spaces_node.pl b/tp/t/results/sectioning/in_menu_only_special_spaces_node.pl
index 0aaab6040b..ad4255810a 100644
--- a/tp/t/results/sectioning/in_menu_only_special_spaces_node.pl
+++ b/tp/t/results/sectioning/in_menu_only_special_spaces_node.pl
@@ -854,7 +854,7 @@ $result_texts{'in_menu_only_special_spaces_node'} = 'top
*********************************************************
5 MONGOLIAN VOWEL SEPARATOR|| EM SPACE| |
-******************************************
+*****************************************
6 THREE-PER-EM SPACE| | FOUR-PER-EM SPACE| | SIX-PER-EM SPACE| | FIGURE SPACE|
| PUNCTUATION SPACE| | THIN SPACE| | HAIR SPACE| | LINE SEPARATOR|
| PARAGRAPH
SEPARATOR|
| NARROW NO-BREAK SPACE| | MEDIUM MATHEMATICAL SPACE| | IDEOGRAPHIC
SPACE| |
@@ -1227,7 +1227,7 @@ File: , Node:
, Next: , Prev: , Up: Top
File: , Node: , Next:
, Prev:
, Up: Top
5 MONGOLIAN VOWEL SEPARATOR|| EM SPACE| |
-******************************************
+*****************************************
File: , Node:
, Prev: , Up: Top
@@ -1242,7 +1242,7 @@ Node: Top64
Node: 227
Node:
521
Node: 703
-Node:
890
+Node:
889
End Tag Table
diff --git a/tp/t/results/sectioning/in_menu_only_special_spaces_node_menu.pl b/tp/t/results/sectioning/in_menu_only_special_spaces_node_menu.pl
index fe32b3ef31..274e8e6299 100644
--- a/tp/t/results/sectioning/in_menu_only_special_spaces_node_menu.pl
+++ b/tp/t/results/sectioning/in_menu_only_special_spaces_node_menu.pl
@@ -854,7 +854,7 @@ $result_texts{'in_menu_only_special_spaces_node_menu'} = 'top
*********************************************************
5 MONGOLIAN VOWEL SEPARATOR|| EM SPACE| |
-******************************************
+*****************************************
6 THREE-PER-EM SPACE| | FOUR-PER-EM SPACE| | SIX-PER-EM SPACE| | FIGURE SPACE|
| PUNCTUATION SPACE| | THIN SPACE| | HAIR SPACE| | LINE SEPARATOR|
| PARAGRAPH
SEPARATOR|
| NARROW NO-BREAK SPACE| | MEDIUM MATHEMATICAL SPACE| | IDEOGRAPHIC
SPACE| |
@@ -1227,7 +1227,7 @@ File: , Node:
, Next: , Prev: , Up: Top
File: , Node: , Next:
, Prev:
, Up: Top
5 MONGOLIAN VOWEL SEPARATOR|| EM SPACE| |
-******************************************
+*****************************************
File: , Node:
, Prev: , Up: Top
@@ -1242,7 +1242,7 @@ Node: Top64
Node: 227
Node:
521
Node: 703
-Node:
890
+Node:
889
End Tag Table
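
The per-character rule in the string_width hunk above can be sketched outside Perl. What follows is a minimal Python approximation, not the Texinfo implementation, and it rests on several assumptions: the function name string_width_sketch and the set DEFAULT_IGNORABLE_SAMPLE are hypothetical; Python's standard library does not expose Default_Ignorable_Code_Point, so a small hand-picked and deliberately incomplete sample of such code points stands in for the property; \p{InFullwidth} is approximated by East Asian Width W or F; and \p{IsPrint} is approximated by "anything that is not a control, surrogate, or unassigned code point", which is close to but not identical to Perl's definition.

```python
import unicodedata

# Hypothetical, incomplete sample of Default_Ignorable_Code_Point members.
# The real property (Unicode DerivedCoreProperties.txt) is much larger.
DEFAULT_IGNORABLE_SAMPLE = {
    0x00AD,  # SOFT HYPHEN
    0x034F,  # COMBINING GRAPHEME JOINER
    0x180E,  # MONGOLIAN VOWEL SEPARATOR
    0x200B,  # ZERO WIDTH SPACE
    0x200C,  # ZERO WIDTH NON-JOINER
    0x200D,  # ZERO WIDTH JOINER
    0x2060,  # WORD JOINER
    0xFEFF,  # ZERO WIDTH NO-BREAK SPACE
}


def string_width_sketch(string):
    """Approximate the display width of a string in terminal columns."""
    width = 0
    for character in string:
        category = unicodedata.category(character)
        if unicodedata.east_asian_width(character) in ("W", "F"):
            # Wide or fullwidth character: two columns
            # (stand-in for \p{InFullwidth}).
            width += 2
        elif category.startswith("M") or ord(character) in DEFAULT_IGNORABLE_SAMPLE:
            # Combining mark, or a code point from the sample of
            # default ignorables: no advance width.
            pass
        elif category in ("Cc", "Cs", "Cn"):
            # Control, surrogate, or unassigned: counted as zero here
            # (a simplification of the remaining branches in the Perl code).
            pass
        else:
            # Everything else is treated as printable, one column.
            width += 1
    return width


# Heading from the updated test results, with U+180E and U+2003 written
# as escapes between the pairs of vertical bars.
heading = "5 MONGOLIAN VOWEL SEPARATOR|\u180e| EM SPACE|\u2003|"
print(string_width_sketch(heading))  # 41
```

Under these assumptions the heading measures 41 columns. Counting U+180E as one column would give 42, which corresponds to the underline of asterisks losing one character in the updated expected output.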