[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Discrepancy in definition/use of match-data?
From: |
David Kastrup |
Subject: |
Re: Discrepancy in definition/use of match-data? |
Date: |
12 Jun 2004 11:03:53 +0200 |
User-agent: |
Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50 |
"Stephen J. Turnbull" <address@hidden> writes:
> >>>>> "David" == David Kastrup <address@hidden> writes:
>
> David> How strong were the results from the regression test?
>
> What does that mean? There is one test that specifically checks for
> match-data being preserved across a failed match, and it failed.
Ah, ok. I thought that the regression test might have involved
calling a lot of complex functions to see whether they still worked.
It dod not occur to me that it checked for the particular semantics
explicitly.
> David> What kind and amount of code appears to be affected?
>
> Kind? I know of one specific use in `w3-configuration-data' in
> w3-cfg.el which calls itself recursively, and depends on being able
> to use the top-level match-data if the match in the recursive call
> fails, while using the match-data from the recursive call otherwise.
> Too sneaky to live, I suppose you could say.
Sneakiness is all fine. It is just the question whether it is worth
the price. It would be a good idea to check this price out after the
next release, as said.
> Amount? How should I know? I can say I've tried inserting warning
> code in match_limit and file-name-sans-extension produces the
> warning during the dump phase. I don't know why yet, my
> implementation may be incorrect (the flag is either getting reset
> when it shouldn't, or fails to get set on a successful search---the
> Boyer-Moore code is the most complicated thing I've ever tried to
> deal with).
>
> However, I know that file-handlers can get called there, via
> file-name-directory inter alia. In general, even a simple variable
> reference can call arbitrary Lisp code in XEmacs (because of magic
> Mule handlers) and most likely GNU Emacs (because those handlers
> were introduced for GNU Emacs compatibility).
In which case this particular sort of references better use
save-match-data.
> David> With the current code. But I was also proposing to void
> David> the match-data in the main loop: that would produce
> David> errors more often, and only in situations where the
> David> match-data was completely unpredictable to start with
> David> (and possibly undefined, too).
>
> But that's not true. The match-data was sufficiently predictable
> that split-string only failed when it was voided.
The code in question was the following:
(let (parts (start 0) (len (length string)))
(if (string-match pattern string)
(setq parts (cons (substring string 0 (match-beginning 0)) parts)
start (match-end 0)))
The error occured when the condition was false to start with and setq
not reached. Then we had the following:
(while (and (< start len)
This condition is true for non-empty string
(string-match pattern string (if (> start (match-beginning 0))
start
(1+ start))))
start is still 0, match-beginning has a random value (this is where
the error got flagged). Without an error, we'll get into the false
branch, so the match starts at 1 now. Since the precondition was only
true for non-empty string, starting the match at 1 is valid, will
again fail like the first match, and the loop body gets skipped.
So this time we are lucky: the random contents of match-beginning
would indeed not matter. Would you want to rely on that?
> I think it's likely that that mistake has been made elsewhere.
I doubt that we can rely on its consequences being benign.
> On the other hand, pretty much any time any Lisp code intervenes
> between the return of the matching function and entry to the match
> data access there is an opportunity for a hook or handler to be
> called.
That's what save-match-data is for. Hooks or handlers that intervene
at random places need to use it when they might likely call regexp
code. This is nothing new.
> Probability of hitting it on any given path is low, but potential
> for random annoyance for years to come is high, I fear.
It is obvious that you are not fond of the interface, presumably
preferring a complex return value for matching functions including the
match data. But it is not like it would be feasible to change it,
anyway.
The only addition I can think of is to offer an explicit function
void-match-data that users would be free to call at points when they
know the match-data no longer to be needed. Having such a function
explicitly available might help for debugging use-before-definition
cases like the above.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
- Discrepancy in definition/use of match-data?, David Kastrup, 2004/06/09
- Re: Discrepancy in definition/use of match-data?, Richard Stallman, 2004/06/10
- Re: Discrepancy in definition/use of match-data?, David Kastrup, 2004/06/10
- Re: Discrepancy in definition/use of match-data?, Stephen J. Turnbull, 2004/06/11
- Re: Discrepancy in definition/use of match-data?, David Kastrup, 2004/06/11
- Re: Discrepancy in definition/use of match-data?, Stephen J. Turnbull, 2004/06/12
- Re: Discrepancy in definition/use of match-data?,
David Kastrup <=
- Re: Discrepancy in definition/use of match-data?, Richard Stallman, 2004/06/12
- Re: Discrepancy in definition/use of match-data?, Stephen J. Turnbull, 2004/06/14
- Re: Discrepancy in definition/use of match-data?, David Kastrup, 2004/06/14
- Re: Discrepancy in definition/use of match-data?, Stephen J. Turnbull, 2004/06/14
- Re: Discrepancy in definition/use of match-data?, Stephen J. Turnbull, 2004/06/16
- Re: Discrepancy in definition/use of match-data?, Richard Stallman, 2004/06/18
- Re: Discrepancy in definition/use of match-data?, Richard Stallman, 2004/06/18
- Re: Discrepancy in definition/use of match-data?, Stephen J. Turnbull, 2004/06/23
- Re: Discrepancy in definition/use of match-data?, Richard Stallman, 2004/06/11