emacs-orgmode

Re: [O] ...


From: Carsten Dominik
Subject: Re: [O] ...
Date: Thu, 31 Jan 2013 12:59:29 +0100

Hi Bastien,

as you know, regular expressions are a language to do a programmed search for 
text.  The pattern string has to be compiled before it can be used.  That 
compilation is a costly process, so most languages that have pattern matching 
use some kind of cache to store compiled patterns, so that frequently used 
patterns can be reused without compilation.

I am very much aware of this from studying Perl.  In Perl, a compiled pattern
is associated with a particular instance of a string.  Often you build the
pattern by concatenating other parts etc.  In Perl this means that the pattern
is recompiled each time a match is attempted.  You can work around this issue
in Perl by telling it explicitly, and on the programmer's authority, that
"yes, this pattern is dynamically constructed, but only once, I guarantee
that it will not change, so compile it only once".  So in Perl the difference is

/pattern/      will match against pattern
/$pattern/     will match against the pattern contained in the
               variable $pattern, and recompilation will occur
               each time
/$pattern/o    will compile only once and trust the programmer.

So I am very aware of this speedup issue.  And I thought that in Emacs, the 
caching would work by associating a specific string object with the compiled 
pattern.  But the code Christopher pointed out seems to suggest that the 
pattern cache also works for strings that are `equal', not only for strings
that are `eq'.
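
To make the `eq' versus `equal' distinction concrete, here is a minimal
sketch.  The function name my-eq-vs-equal is made up for illustration; it only
shows how the two predicates treat separately built strings and says nothing
about the cache internals.

```elisp
;; Hypothetical helper: compare two strings that were built separately
;; but have the same contents.
(defun my-eq-vs-equal ()
  "Return (EQ-RESULT EQUAL-RESULT) for two separately concatenated strings."
  (let ((a (concat "^" "xyz"))
        (b (concat "^" "xyz")))
    (list (eq a b)        ; nil: two distinct string objects
          (equal a b))))  ; t: same contents

(my-eq-vs-equal)  ; => (nil t)
```

A cache keyed on `eq' would miss for every freshly concatenated string, while
a cache that compares contents can reuse the compiled pattern.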

If this is the case, this means that there is only a very small difference 
between

(defconst my-pattern (concat "^" "xyz"))
(re-search-forward my-pattern ....)      ; many times in different functions

and

(defconst my-partial-pattern "xyz")
(re-search-forward (concat "^" my-partial-pattern) ....)  ; many times

The difference is only the repeated concatenation operation, and not the
recompilation.  I always thought that this would work differently, and that is
why a lot of regexps get constructed and then stored in variables or constants.
Of course this is also a good practice for readable and maintainable code, but
the impact on efficiency is not as big as I used to think.  So when I saw
Christopher's initial patch, I thought a function to create
org-outline-regexp-bol would be a heavy burden on speed - but it now seems
that it would have only a minor impact.
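
One way to check this on a given Emacs is to time both variants.  The
following is only a rough sketch: my-time-search, the buffer contents and the
repetition count are invented for illustration, and the numbers will vary
between machines and Emacs versions.

```elisp
(require 'benchmark)

(defconst my-pattern (concat "^" "xyz"))   ; regexp built once
(defconst my-partial-pattern "xyz")        ; concatenated on each use

;; Hypothetical helper, not part of Org or Emacs.
(defun my-time-search (make-regexp)
  "Return seconds taken by 10000 searches using the regexp from MAKE-REGEXP.
MAKE-REGEXP is a function of no arguments that returns a regexp string."
  (with-temp-buffer
    (insert "abc\nxyz\n")
    (car (benchmark-run 10000
           (goto-char (point-min))
           (re-search-forward (funcall make-regexp) nil t)))))

;; Elapsed seconds for the stored pattern, then for the rebuilt one.
(list (my-time-search (lambda () my-pattern))
      (my-time-search (lambda () (concat "^" my-partial-pattern))))
```

If the cache really compares string contents, the two timings should differ
mainly by the cost of the extra `concat' calls.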

Still, I think making a local variable in buffers with orgstruct-mode is also
a good way to get the functionality Christopher wants.
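
A buffer-local variable could look roughly like the sketch below.  It assumes
that what is wanted is a heading regexp that differs from buffer to buffer;
my-outline-regexp-bol and my-set-local-outline-regexp are hypothetical
stand-ins, not the actual Org variables or the proposed patch.

```elisp
;; Hypothetical stand-in for a heading regexp anchored at line start.
(defvar my-outline-regexp-bol "^\\*+ "
  "Regexp matching a heading at the beginning of a line.")

;; Hypothetical helper: give the current buffer its own copy of the regexp.
(defun my-set-local-outline-regexp (prefix)
  "Make the heading regexp buffer-local, requiring PREFIX before the stars."
  (set (make-local-variable 'my-outline-regexp-bol)
       (concat "^" (regexp-quote prefix) "\\*+ ")))

;; Example: in a buffer where headings live inside ";; " comments.
(with-temp-buffer
  (my-set-local-outline-regexp ";; ")
  (string-match my-outline-regexp-bol ";; * Heading"))  ; => 0
```

The regexp is built once per buffer, and other buffers keep the default value.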

Clearer?

- Carsten


On 31 jan. 2013, at 12:22, Bastien <address@hidden> wrote:

> Hi Carsten and Christopher,
> 
> Carsten Dominik <address@hidden> writes:
> 
>> I meant to copy the list, I am doing this again now.
>> 
>> Wow, I was not aware that Emacs caches by content, this is an important
>> piece of information.  I guess this removed the main concern I had.  Thanks
>> for looking it up in the code and showing it to me.  I am not sure if I
>> understand that code completely, but I trust your judgment.
> 
> I'm not sure I have all the background to understand the issue at
> stake... can anyone educate me?  Thanks!
> 
> -- 
> Bastien


-- 
There is no unscripted life.  Only a badly scripted one. -- Brothers Bloom



