emacs-devel

Re: bidi-string-strip-control-characters


From: Lars Ingebrigtsen
Subject: Re: bidi-string-strip-control-characters
Date: Thu, 20 Jan 2022 10:29:26 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

Eli Zaretskii <eliz@gnu.org> writes:

> Lars, I'm not sure I understand the purpose of this function.  Can you
> explain?

Like the NEWS item says, it's for cases where you want to ensure that
there's no bidiness going on.

> The way it is currently used is also strange, to say the least: you
> apply it to a string made of a single character, so either it does
> nothing to the string, or it will return an empty string.  So the
> following code will present the user with a riddle:
>
>   (textsec-email-address-header-suspicious-p
>    "Lars Ingebrigtsen <larsi@\N{RIGHT-TO-LEFT OVERRIDE}gnus.org>")
>   "Disallowed character: `' (#x202e, RIGHT-TO-LEFT OVERRIDE)"
>
> The empty string between quotes is the riddle.

Well...  perhaps not optimal, but not really a riddle.  The function
will probably be used elsewhere in textsec, too; I just haven't gotten
round to auditing all the strings yet.

> I think I understand the original problem: displaying a literal U+202E
> there will mess up the text on display, but if that is the reason, the
> right way is not to remove the character, it is to append to it the
> necessary bidi controls to prevent the messup (and make the appended
> controls be invisible).
>
> Here's an example:
>
>   (insert (format "Disallowed character: `%s' (#x202e, RIGHT-TO-LEFT OVERRIDE)"
>                   (concat (string ?\x202e)
>                           (propertize (string ?\x202c ?\x200e) 'invisible t))))
>
> This displays the RLO character, but doesn't mess up the description
> after it.

The display is identical to the one we have now, though:

   "Disallowed character: `' (#x202e, RIGHT-TO-LEFT OVERRIDE)"

So still a riddle.

But removing the bidi chars is "obviously correct" (and impervious to
future attacks) for somebody who's not that familiar with the bidi
machinery, so I prefer to remove the chars instead here.
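
To make the stripping approach concrete, here is a minimal sketch.  It is
not the actual implementation: `my-bidi-controls' and
`my-strip-bidi-controls' are hypothetical names, and the list of code
points is an assumption written out by hand for illustration.

```elisp
(require 'seq)

;; Hypothetical list of bidi control code points (an assumption for this
;; sketch): LRM, RLM, ALM, LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI.
(defconst my-bidi-controls
  '(#x200e #x200f #x061c
    #x202a #x202b #x202c #x202d #x202e
    #x2066 #x2067 #x2068 #x2069))

(defun my-strip-bidi-controls (string)
  "Return a copy of STRING with the characters in `my-bidi-controls' removed."
  (apply #'string
         (seq-filter (lambda (char) (not (memq char my-bidi-controls)))
                     string)))

;; Stripping a whole address drops the override and keeps the rest.
(my-strip-bidi-controls "larsi@\N{RIGHT-TO-LEFT OVERRIDE}gnus.org")
;; => "larsi@gnus.org"

;; Stripping a one-character string that is itself a control leaves
;; nothing, which is where the empty quotes in the message come from.
(my-strip-bidi-controls (string #x202e))
;; => ""
```

The result no longer depends on how the display engine treats the
controls, because none of them survive the filter.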

> We do something like that in descr-text.el, so I guess we need to
> factor out that code and use it here.

Isn't that bidi-string-mark-left-to-right?  I forget.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
