bug-gnu-emacs

bug#52459: 28.0.90; prin1-to-string does not escape bidi control charact


From: Daniel Mendler
Subject: bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
Date: Mon, 13 Dec 2021 19:13:32 +0100

>  1) produce strings for using in program source files.
>  2) produce strings for display in various UIs
> 
> The solutions should IMO be different, because the first is not about
> displaying these characters, while the second is about displaying
> them.

No, they are not different for my purposes since I want to have the
ability to copy strings from the UI to a source file. Working around the
problem on the display level (glyphless-display-mode) will preclude this
use case.

> For 1), is print-escape-multibyte satisfactory?  If not, why not?

I already explained this. `print-escape-multibyte` obfuscates the string
too much, which is undesirable for a debugging UI. Note that I am
passing on an experience report from a Russian user who observed that
Marginalia (which currently uses `print-escape-multibyte=t`) produces
output which is not as helpful as it could be, because all multibyte
characters are escaped. The escaping hurts users whose languages are
written with multibyte characters.

> For 2), we now have in Emacs 29 the glyphless-display-mode, whereby
> the bidi control characters are shown as small boxes with their
> acronyms (RLE, FSI, PDI, etc.).  Is that satisfactory?  If not, why
> not?

`glyphless-display-mode` would be a possible workaround if I just
passed the characters on unescaped. However, I want to produce strings
which I can copy to source code buffers. This is not possible
if the strings are not escaped and contain the problematic control
characters in literal form.

Once again, I propose the addition of configuration variables which
configure `prin1-to-string` to produce output where all control characters
are escaped. I would even argue that the current variable
`print-escape-control-characters` is misleading, since it only escapes
ASCII control characters. Is there anything which prevents the addition
of a configuration variable `print-escape-unicode-control-characters`,
which ensures full escaping of *all control characters*? Or we could go
even further and add `print-escape-glyphless-characters`, which would
treat the same characters as `glyphless-display-mode`.
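
To illustrate the kind of output I am after, here is a rough sketch of
what one can do today by post-processing the printed string. The
function name `my-prin1-to-string-escape-bidi` is hypothetical, the
character list is my own selection of the Unicode bidi controls (ALM,
LRM, RLM, LRE..RLO, LRI..PDI), and it assumes the argument is a string,
since `\uXXXX` is only read back as a character inside string syntax.
It is a workaround under these assumptions, not the proposed variable.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-prin1-to-string-escape-bidi (string)
  "Return the printed form of STRING with bidi controls as \\uXXXX escapes."
  (let ((print-escape-control-characters t)) ; covers ASCII controls only
    (replace-regexp-in-string
     ;; Assumed set of bidi controls: U+061C, U+200E, U+200F,
     ;; U+202A..U+202E and U+2066..U+2069.
     "[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]"
     (lambda (match)
       ;; Emit a fixed-width four-digit escape the Lisp reader accepts.
       (format "\\u%04X" (string-to-char match)))
     (prin1-to-string string)
     t      ; FIXEDCASE: do not alter the case of the replacement
     t)))   ; LITERAL: insert the backslash as-is
```

The result contains no literal bidi control characters, other non-ASCII
text such as Cyrillic is left readable, and reading the result back with
`read` gives a string equal to the original.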




