emacs-devel
Re: Single quotes in Info


From: Artur Malabarba
Subject: Re: Single quotes in Info
Date: Tue, 27 Jan 2015 23:15:22 -0200

Eli, if I may ask, did you get a chance to look at the code? (It's quite short.)
The last couple of emails give me the impression we're not quite on the same page.

On 27 Jan 2015 19:18, "Eli Zaretskii" <address@hidden> wrote:
>
> > Date: Tue, 27 Jan 2015 18:24:09 -0200
> > From: Artur Malabarba <address@hidden>
> > Cc: Marcin Borkowski <address@hidden>, emacs-devel <address@hidden>
> >
> > > If this is implemented in isearch, then IMO doing it for quotes alone
> > > makes very little sense.
> >
> > The quotes are just proof of concept.
>
> Yes, but what concept is that?  Does it scale up to a general-purpose
> feature of the kind that suits isearch.el?  Just replacing one
> character for another doesn't, IMO.

No. It replaces one character with an arbitrary regexp. In the quotes case, that regexp matches about a dozen different quotation characters, but it's not limited to that. You can also use that to implement lax-whi

> > > If we do this via our private database, that database is going to be
> > > huge.
> >
> > Is it? I would expect something on the order of 50 lines.
>
> There are more than 5000 characters in the Unicode database that have
> equivalence and canonical decompositions.  (Look for entries in
> UnicodeData.txt whose 6th field is non-empty.)

The purpose of this is to allow the user to search for complex characters (such as curly quotes or any of these: "“””„⹂〞‟‟❞❝❠“„〝〟🙷🙶🙸) by typing a simple character available on simple keyboards (such as the plain double quote "). Each simple character needs an entry in the `isearch-groups-alist' variable. The maximum number of entries we'll ever need in this alist (in the very worst possible scenario) is the number of simple characters on a simple keyboard (which is far fewer than 5000, last I checked).

This might be easier to understand by looking at the code.
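
To illustrate the shape of the idea only (this is not the Emacs Lisp code under discussion), here is a minimal Python sketch. The `GROUPS` table and the `lax_pattern` function are hypothetical names made up for this example; `GROUPS` stands in for what `isearch-groups-alist' is described as holding, with one entry per simple character. The single-quote entry is an assumption added for illustration.

```python
import re

# Hypothetical stand-in for the alist described above: one entry per
# simple (keyboard) character, listing the characters it should match.
GROUPS = {
    '"': '"“”„‟❝❞〝〞〟',
    "'": "'‘’‚‛",  # illustrative entry, not taken from the thread
}

def lax_pattern(query):
    """Return a regexp where each grouped character matches its whole group."""
    parts = []
    for ch in query:
        group = GROUPS.get(ch)
        if group is None:
            # Characters without an entry match only themselves.
            parts.append(re.escape(ch))
        else:
            # Characters with an entry become a character class.
            parts.append("[" + "".join(re.escape(c) for c in group) + "]")
    return "".join(parts)

# Typing plain quotes finds text written with curly quotes.
match = re.search(lax_pattern('"foo"'), "say “foo” now")
```

The table grows with the number of simple characters that get an entry, not with the number of complex characters they stand for.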

>
> > > We already have infrastructure for that, see
> > > the description of the 'decomposition' character property in the ELisp
> > > manual.
> >
> > Building this on preexisting infrastructure would be great, but does that go
> > the right way? Does it relate a simple character to all its complex
> > equivalents? Or does it relate each complex character to a simple alternative?
>
> The latter.  Read paragraph 1.1 of UAX #15 for the starting point, and
> also section 3.7 of the Unicode Standard.

If it's the latter, then it goes the wrong way for an automated approach. What we need is to know the whole set of Unicode characters that are equivalent to a given ASCII character. Of course we can build this table from the Unicode Standard (that's exactly what the `isearch-groups-alist' variable is meant to hold); I'm just saying an automated approach probably isn't viable here.
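
As a rough sketch of what "building this table" from decomposition data could look like, the following Python inverts the complex-to-simple mapping into a simple-to-complex one. It uses the Unicode data shipped with Python's `unicodedata` module rather than Emacs's `decomposition' property, and `build_inverse_table` is a hypothetical name. It is deliberately crude: it follows only one level of decomposition and keys each character on the first code point of its decomposition.

```python
import sys
import unicodedata
from collections import defaultdict

def build_inverse_table():
    """Map each ASCII character to the characters that decompose to it."""
    inverse = defaultdict(set)
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        decomp = unicodedata.decomposition(ch)
        if not decomp:
            continue  # no decomposition recorded for this character
        fields = decomp.split()
        if fields[0].startswith("<"):
            fields = fields[1:]  # drop a compatibility tag such as <wide>
        base = chr(int(fields[0], 16))
        if ord(base) < 128:
            inverse[base].add(ch)
    return inverse

table = build_inverse_table()
```

With this data, `table['e']` contains 'é' and `table['"']` contains the fullwidth quotation mark (U+FF02). The curly quote U+201C has no decomposition, so it does not show up under '"'; entries like that would still have to be added by hand.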
