[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Pattern matching on match-string groups #elisp #question
From: |
Stefan Monnier |
Subject: |
Re: Pattern matching on match-string groups #elisp #question |
Date: |
Sat, 27 Feb 2021 09:39:27 -0500 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) |
>> BTW, I was thinking about making the optimization more conservative, so
>> it only throws away the actual `if` but keeps the computation of the test:
> [...]
>> and it does fix the `pcase-let` problem with your original code.
> Given the trouble I think we can defend not respecting side-effects in
> something as functional as pcase!
Nevertheless, I went ahead with this change (after remembering that
wrapping the code in `ignore` should eliminate the extra warnings).
>> It should macroexpand to something morally equivalent to:
>>
>> (cond ((not (stringp STR)) nil)
>> ((not (string-match "\\(?1:a*\\)" STR)) nil)
>> ((looking-at "^"")
>> (let* ((x1464 (match-string 1 STR)))
>> (let ((FOO x1464)) FOO))))
>
> Oh dear... perhaps we should just go with the intermediate list (or vector)
> and suffer the small allocation penalty? (At least we should treat the case
> of a single variable specially, since no consing would then be necessary.)
It's clearly The Right Thing™.
> My guess is that a vector may be faster than a list if there are more than N
> elements, for some N.
I'll let you benchmark it to determine the N.
> Should we use string-match-p when there are no variables bound in the rx
> clause?
We could. Tho, IIRC currently `string-match-p` is ever so slightly
slower than `string-match` and since we clobber the match data in other
cases, we might as well clobber the match data in this case as well: any
code which presumes the match data isn't affected by some other code
which uses regular expressions is quite confused.
>>> Of course a sufficiently optimising compiler would eliminate the consing!
>> Indeed, and it's not a difficult optimization (at least if you can
>> presume that this data is immutable).
> Right, although we would need some more serious data-flow infrastructure
> first. It would be useful for pattern-matching two or more values at the
> same time.
I don't think it's much more complicated than your current constant
folding: when you see a let-binding of a variable to a *constructor*,
stash that expression in your context as a "partially known constant"
and then do the constant folding when you see a matching *destructor*.
The problems I see are:
- you need to detect side effects between the constructor and the
destructor. You could just consider any *read* of a variable holding
such partial-constant as a (potential) side-effect (except for those
reads recognized as part of destructors, of course). It should be
good enough for the case under discussion.
- more importantly in order not to duplicate code and its side-effects
(and suffer risks linked to moving code into a different scope), you
need to convert your constructor so all its arguments are trivial and
"scope safe" (e.g. gensym'd variables, integer constants, symbols,
...).
>>>> It's linked to the special undocumented pcase pattern `pcase--dontcare`
>>>> (whose name is not well chosen, suggestions for better names are
>>>> welcome)
>>>
>>> pcase--give-up
>>
>> Hmm... probably not much more explanatory than "dontcare".
>
> Well, 'dontcare' suggests that anything would do and the value not being
> used, like '_', but that's quite misleading.
Right, that's part of the problem with this naming: it doesn't want to
mean that anything matches, like `_`, but that any behavior is
acceptable when pcase has to try and match against this pattern.
The other part is that it's not true: we wouldn't settle for "any"
behavior, it still has to be sane-ish.
>> I was thinking of `pcase--impossible` as well.
> Yes, that looks acceptable. In any case, it isn't really a user-facing
> symbol, is it? Otherwise we'd need crystal-clear semantics (and lose the
> double dashes).
I would settle for something a bit less than "crystal-clear", but yes to
drop the "--" we'd need a clear enough semantics.
Here's another idea for its name: `pcase--go-back` since it makes pcase
go back to the last option it tried and accept it even though it failed
to match. It still sucks, but maybe it'll give someone else a better idea?
Stefan
- Pattern matching on match-string groups #elisp #question, Ag Ibragimov, 2021/02/25
- Re: Pattern matching on match-string groups #elisp #question, Basil L. Contovounesios, 2021/02/25
- Re: Pattern matching on match-string groups #elisp #question, Stefan Monnier, 2021/02/25
- Re: Pattern matching on match-string groups #elisp #question, Mattias Engdegård, 2021/02/25
- Re: Pattern matching on match-string groups #elisp #question, Stefan Monnier, 2021/02/25
- Re: Pattern matching on match-string groups #elisp #question, Mattias Engdegård, 2021/02/26
- Re: Pattern matching on match-string groups #elisp #question, Stefan Monnier, 2021/02/26
- Re: Pattern matching on match-string groups #elisp #question, Mattias Engdegård, 2021/02/27
- Re: Pattern matching on match-string groups #elisp #question,
Stefan Monnier <=
- Re: Pattern matching on match-string groups #elisp #question, Mattias Engdegård, 2021/02/27
- Re: Pattern matching on match-string groups #elisp #question, Stefan Monnier, 2021/02/27
- Re: Pattern matching on match-string groups #elisp #question, Mattias Engdegård, 2021/02/28
- Re: Pattern matching on match-string groups #elisp #question, Stefan Monnier, 2021/02/28