Re: [bug #42233] [PATCH] wcwidth(3) used on UCS4/UTF-32 codepoints

groff

From:	G. Branden Robinson
Subject:	Re: [bug #42233] [PATCH] wcwidth(3) used on UCS4/UTF-32 codepoints
Date:	Sun, 1 Nov 2020 15:20:27 +1100
User-agent:	NeoMutt/20180716

Hi Steffen,

At 2020-10-21T15:18:29+0200, Steffen Nurpmeso wrote:
> > Steffen has withdrawn most/all of his other patches and even after
> > reading
>
> I do not know what this has to do with this bug.

As I recall (dimly), you said something at some point that I perhaps
misinterpreted--that you were going to work on a project called s-roff
and were not going to participate in any further "pull requests", if you
will, involving groff.

If I misunderstood you, I apologize.

> > this report a few times I'm not clear on what exactly the problem
> > is supposed to be.
> >
> > The "solution", "drop gnulib", is not likely, especially not during
> > an RC cycle.
>
> Sorry, what??

I refer to this statement in the original bug report:

"The neat side effect of that is that the entire GNULib can be
unhooked and removed from groff(1)."

However, you're right that this side effect was not proposed as
_necessary_, only possible.

> > This could be reopened if we had a simple, reproducible case of
> > groff actually misbehaving.
> >
> > > I think currently groff makes false use of wcwidth(3): if it finds
> > > the `unicode' property in a `DESC' file it uses wcwidth(3) to
> > > determine the visual width, not taking into account the current
> > > locale, but which wcwidth(3) depends upon.
> >
> > I don't understand[.]
>
> I am too old for this shit, really.  I therefore agree.

I am struggling with the non-idiomatic expression "makes false use".  I
can interpret it, but only vaguely.  Also, I may lack domain knowledge
here.

It's my understanding that Unicode defines a property called "East Asian
width"--at least that's what my local unicode(1) command calls it.
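
For illustration only (this is not from the thread): a minimal Python sketch that looks up the East Asian Width property with the standard library's `unicodedata.east_asian_width`. The helper name `east_asian_width_of` and the sample code points are assumptions chosen for the example; the classes returned come from whatever Unicode data the Python build ships with.

```python
import unicodedata


def east_asian_width_of(codepoint: int) -> str:
    """Return the East Asian Width class of a code point.

    Possible classes: 'Na' (narrow), 'W' (wide), 'F' (fullwidth),
    'H' (halfwidth), 'A' (ambiguous), 'N' (neutral).
    """
    return unicodedata.east_asian_width(chr(codepoint))


if __name__ == "__main__":
    # Sample code points (hypothetical selection): Latin A, a CJK
    # ideograph, fullwidth Latin A, halfwidth katakana KA, Greek alpha.
    for cp in (0x0041, 0x4E2D, 0xFF21, 0xFF76, 0x03B1):
        print(f"U+{cp:04X} {east_asian_width_of(cp)}")
```

The lookup takes only the code point as input; no locale is consulted.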

> > [.] why the width of a Unicode character would be locale-dependent.
> > As I understand it, the width property (half-width, full-width,
> > undefined) is determined on a per-codepoint basis and while it might
> > vary, there's no reason to expect it to vary based on the _locale_.
> > More likely, I think, it would vary due to choices taken by a font
> > vendor, and people using the font would be forced to adapt.

Thinking about this some more, the possibility of an "ambiguous" or
"undefined" character width at the UCS level could mean that the locale
is permitted to determine this parameter.
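
A hedged sketch of that idea, in Python rather than C: the "ambiguous" class is the one place where a caller (or a locale) has to pick a column width, while wide and fullwidth characters are fixed at two columns. The function name `column_width` and the `ambiguous_is_wide` flag are hypothetical; combining and control characters, which wcwidth(3) treats specially, are deliberately ignored here. Whether any particular C library's wcwidth(3) resolves ambiguity this way is not established in this thread.

```python
import unicodedata


def column_width(ch: str, ambiguous_is_wide: bool = False) -> int:
    """Estimate the terminal column width of a single character.

    Simplification (assumption): only the East Asian Width class is
    considered; zero-width and non-printable characters are not handled.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        # Wide and fullwidth characters occupy two columns.
        return 2
    if eaw == "A":
        # Ambiguous: the policy is supplied by the caller, standing in
        # for whatever a locale or terminal convention would decide.
        return 2 if ambiguous_is_wide else 1
    # Narrow, halfwidth, and neutral characters occupy one column.
    return 1
```

With this policy, Greek alpha (U+03B1, class "A") is one column by default and two columns when `ambiguous_is_wide=True`, whereas a CJK ideograph such as U+4E2D is two columns either way.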

> > Closing.
>
> I think there was a ML thread by the time i opened the bug report,
> where the according GNUlib function that could simply be used to
> correctly implement the given was named.

Hmm. It would be good to find this.  I wonder if Dave Kemper can help;
to my eyes, he seems to have a fluttering cape that advertises a deep
knowledge of our mailing list history.

> Then that piece of cake would be correct, despite possibly non-capable
> surroundings.

If this would fix the infinite loop that Osamu Ayama found, and that I
crudely hacked around in commit bcdf2f4c7c28328c711c6a7ac2ea17f2ecd5cdd4
(see also https://savannah.gnu.org/bugs/?44018 ), that would be terrific.

I think we just need someone with a little more gnulib and/or
wide-character savvy than I possess right now to articulate the issue so
that I can understand it.  Eventually, I'll learn, but Bertrand's trying
to get an RC1 out.  :)

Regards,
Branden

