bug-texinfo

Re: Texinfo 7.0.93 pretest available


From: Gavin Smith
Subject: Re: Texinfo 7.0.93 pretest available
Date: Sun, 8 Oct 2023 18:29:23 +0100

On Sun, Oct 08, 2023 at 07:31:12PM +0300, Eli Zaretskii wrote:
> I see a very large diff, full of non-ASCII characters.  A typical hunk
> is below:
> 
>   -(ì) @'{e} é (é) @'{@dotless{i}} í (í) @dotless{i} ı (ı) @dotless{j} ȷ
>   -(ȷ) ‘@H{a}’ a̋ ‘@dotaccent{a}’ ȧ (ȧ) ‘@ringaccent{a}’ å (å)
>   -‘@tieaccent{a}’ a͡ ‘@u{a}’ ă (ă) ‘@ubaraccent{a}’ a̲ ‘@udotaccent{a}’ ạ
>   -(ạ) ‘@v{a}’ ǎ (ǎ) @,c ç (ç) ‘@,{c}’ ç (ç) ‘@ogonek{a}’ ą (ą)
>   +(ì) @'{e} é (é) @'{@dotless{i}} í (í) @dotless{i} ı (ı) @dotless{j} ȷ (ȷ)
>   +‘@H{a}’ a̋ ‘@dotaccent{a}’ ȧ (ȧ) ‘@ringaccent{a}’ å (å) ‘@tieaccent{a}’ a͡
>   +‘@u{a}’ ă (ă) ‘@ubaraccent{a}’ a̲ ‘@udotaccent{a}’ ạ (ạ) ‘@v{a}’ ǎ (ǎ)
>   +@,c ç (ç) ‘@,{c}’ ç (ç) ‘@ogonek{a}’ ą (ą)
> 
> It looks like a filling problem to me, perhaps because something
> counts bytes instead of characters?

It's almost certainly a filling problem, as you say.  In the C (XS)
code, the return value of wcwidth for each character is used to compute
the width of each line.  The pure Perl code doesn't use the wcwidth
function, as far as I know, but keeps a count for each line based on
regex character classes.  The relevant code is in
Texinfo/Convert/Unicode.pm, in the 'string_width' function.
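
To illustrate why the choice of counting method matters for filling, here
is a minimal Python sketch that compares byte count, code point count and
an approximate column width for a few of the accented characters from the
hunk above.  It is not the Texinfo code: the display_width helper is a
hypothetical stand-in for what wcwidth or 'string_width' compute, and it
ignores control characters and other special cases.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns text occupies.

    Illustrative assumption only: combining marks count as 0 columns,
    East Asian wide/fullwidth characters as 2, everything else as 1.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # a combining accent shares the column of its base
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


# "a" + combining double acute, "a with dot above", "a with ring above",
# separated by single spaces.
sample = "a\u030b \u0227 \u00e5"

byte_count = len(sample.encode("utf-8"))  # what counting bytes would give
char_count = len(sample)                  # what counting code points gives
columns = display_width(sample)           # what a filling routine needs

print(byte_count, char_count, columns)
```

The three numbers differ for this string, so a filling routine that counts
bytes or code points instead of columns would break lines earlier than one
that counts columns.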

Do you know whether the XS modules are in use?
You could try "export TEXINFO_XS=omit" or "export TEXINFO_XS=require" to
check whether it makes a difference.  That would narrow down which version
of the code has the problem (or whether both do).
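
The comparison could also be scripted.  The following is only a sketch:
the function names are hypothetical, and the command to run (for example
a texi2any invocation writing to standard output) is left to the caller
as an assumption about the local setup.

```python
import os
import subprocess
import sys


def run_with_xs(setting, command):
    """Run command with TEXINFO_XS set to setting and return its stdout."""
    env = dict(os.environ, TEXINFO_XS=setting)
    result = subprocess.run(command, env=env, capture_output=True, text=True)
    return result.stdout


def compare_xs_modes(command):
    """Return True if the output is the same with XS omitted and required."""
    return run_with_xs("omit", command) == run_with_xs("require", command)


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command to test is taken from the script's own arguments.
    if compare_xs_modes(sys.argv[1:]):
        print("identical output with and without XS")
    else:
        print("output differs between XS and pure Perl")
```

If the two outputs are identical, the problem is either in both code paths
or unrelated to the XS modules.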

I remember that in the past, I broke up some of these lines to avoid
test failures on a platform whose wcwidth returned different results
for some characters.

> The diffs like above are followed by diffs in the Index part, where it
> looks like the differences are just line counts:
> 
>    * Menu:
> 
>   -* truc:                                  chapter.            (line 2236)
>   +* truc:                                  chapter.            (line 2234)
> 
> Probably due to the same problem of incorrect filling of lines?

Yes, that follows from the different line-breaking decisions, and these
parts of the diff will go away once the other problem is fixed.


