help-gnu-emacs

Re: [External] : Re: Strange whitespaces.


From: Hongyi Zhao
Subject: Re: [External] : Re: Strange whitespaces.
Date: Fri, 1 Oct 2021 15:26:51 +0800

On Fri, Oct 1, 2021 at 2:36 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Hongyi Zhao <hongyi.zhao@gmail.com>
> > Date: Fri, 1 Oct 2021 09:51:22 +0800
> > Cc: help-gnu-emacs <help-gnu-emacs@gnu.org>
> >
> > > We now highlight any non-ASCII character whose Unicode
> > > general category is "Space Separator" (or Zs for short).
> >
> > I fail to see the connection between the abbreviation and the original
> > representation it stands for.
>
> You mean, Zs vs "Space Separator"?

Yes.

> Please complain to the Unicode Consortium about any of that, Emacs just
> uses the names and nomenclature they invented.  See
>
>   https://www.unicode.org/reports/tr44/#General_Category_Values

I think I have basically figured out the logic behind the nomenclature,
based on the description given at the above URL:

a space character (of various non-zero widths)

So Z <---> non-zero, and s <---> space.
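
As a rough check of which characters actually carry the Zs category, here
is a minimal Python sketch using the standard unicodedata module. The
helper name chars_in_category is made up for illustration, and the exact
list depends on the Unicode version bundled with the Python build. Note
that the same TR44 table also lists Zl (Line Separator) and Zp (Paragraph
Separator) under the group Z (Separator), so the Z may simply name that
group rather than "non-zero"; the sketch prints those two categories for
comparison.

```python
import sys
import unicodedata


def chars_in_category(category):
    """Return every character whose Unicode general category equals `category`.

    `chars_in_category` is a hypothetical helper, not part of any library.
    """
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == category
    ]


zs = chars_in_category("Zs")

# The non-ASCII members are the ones Emacs is said to highlight.
non_ascii_zs = [c for c in zs if ord(c) > 127]

for c in non_ascii_zs:
    print(f"U+{ord(c):04X}  {unicodedata.name(c)}")

# Sibling categories in the same Z group of the TR44 table.
print("U+2028:", unicodedata.category("\u2028"))  # LINE SEPARATOR
print("U+2029:", unicodedata.category("\u2029"))  # PARAGRAPH SEPARATOR
```

The plain ASCII space is also in Zs, which is why the filter on ord(c) is
needed to get only the non-ASCII ones.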

This Zs naming resembles the convention used for regular expression
metacharacters, for example in Python [1]:

\s
For Unicode (str) patterns:

Matches Unicode whitespace characters (which includes [ \t\n\r\f\v],
and also many other characters, for example the non-breaking spaces
mandated by typography rules in many languages). If the ASCII flag is
used, only [ \t\n\r\f\v] is matched.

For 8-bit (bytes) patterns:

Matches characters considered whitespace in the ASCII character set;
this is equivalent to [ \t\n\r\f\v].

\S

Matches any character which is not a whitespace character. This is the
opposite of \s. If the ASCII flag is used this becomes the equivalent
of [^ \t\n\r\f\v].
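
To make the quoted description concrete, here is a small sketch with the
standard re module. The sample string is my own choice; it contains a
NO-BREAK SPACE (U+00A0, category Zs), an ordinary space, and a tab.

```python
import re

# 'a', NO-BREAK SPACE, 'b', space, 'c', tab, 'd'
sample = "a\u00a0b c\td"

# str pattern: \s matches Unicode whitespace, including U+00A0.
unicode_ws = re.findall(r"\s", sample)

# str pattern with the ASCII flag: only [ \t\n\r\f\v] is matched.
ascii_ws = re.findall(r"\s", sample, re.ASCII)

# bytes pattern: only ASCII whitespace is matched, so the UTF-8 bytes
# of U+00A0 (0xC2 0xA0) are not treated as whitespace.
bytes_ws = re.findall(rb"\s", sample.encode("utf-8"))

# \S is the opposite of \s; with re.ASCII, U+00A0 counts as non-whitespace.
unicode_non_ws = re.findall(r"\S", sample)
ascii_non_ws = re.findall(r"\S", sample, re.ASCII)

print(unicode_ws)      # ['\xa0', ' ', '\t']
print(ascii_ws)        # [' ', '\t']
print(bytes_ws)        # [b' ', b'\t']
print(unicode_non_ws)  # ['a', 'b', 'c', 'd']
print(ascii_non_ws)    # ['a', '\xa0', 'b', 'c', 'd']
```

So the lowercase/uppercase pair \s and \S is a match/negation pair, and
the set matched by \s is wider than the Zs category because it also
covers control characters such as tab and newline.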


[1] https://docs.python.org/3/library/re.html

HZ


