bug-groff

[bug #58930] take baby steps toward Unicode


From: Dave
Subject: [bug #58930] take baby steps toward Unicode
Date: Sat, 28 May 2022 13:35:56 -0400 (EDT)

Follow-up Comment #13, bug #58930 (project groff):

[comment #0 original submission:]
> But if the input is some other encoding, preconv converts
> the character into the string "\[u00A0]", which groff does
> _not_ recognize.

The fix for bug #62300 (now resolved) makes preconv emit "\~" rather than
"\[u00A0]" for a U+00A0 input character.

In preconv 1.22.4:

$ echo -e '\xA0' | preconv -elatin1
.lf 1 -
\[u00A0]

In preconv built from the latest code:

$ echo -e '\xA0' | preconv -elatin1
.lf 1 -
\~


So I think we can mark this part as resolved, despite one remaining issue
that bug #62300 points out in its comment #2:

"The input sequence '\[u00A0]' is _syntactically_ valid...but like '\[uFFFF]'
and '\[u0000]', it's not _meaningful_"

This is true of the current implementation but less true conceptually: U+0000
and U+FFFF are not meaningful input characters to groff, but U+00A0 is, and
users ideally ought to be able to specify the character as \[u00A0].

But this is an edge case I don't intend to pursue.  Users who want to stick to
pure-ASCII input have the escape sequence \~ to specify the nonbreaking space,
so they don't need the alternate spelling \[u00A0].
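
For anyone still on preconv 1.22.4, a minimal sketch of a stopgap is to
rewrite the "\[u00A0]" that the old preconv emits into "\~" before groff sees
it.  This is only an illustration, not anything groff ships: the function name
nbsp_to_tilde is hypothetical, and the substitution is a blind textual one, so
it would also rewrite a "\[u00A0]" that follows an escaped backslash.

```shell
# Hypothetical helper (not part of groff): replace every literal
# \[u00A0] on standard input with the \~ escape sequence.
nbsp_to_tilde() {
    sed 's/\\\[u00A0]/\\~/g'
}

# Simulate the line that preconv 1.22.4 produces for a U+00A0 input.
printf '%s\n' 'a\[u00A0]b' | nbsp_to_tilde
```

In a real pipeline the helper would sit directly after preconv, for example
"preconv -elatin1 FILE | nbsp_to_tilde", where FILE stands for your own input.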


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?58930>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/



