Re: [patch suggestion] Mitigating the poor Emacs performance on huge org

emacs-orgmode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [patch suggestion] Mitigating the poor Emacs performance on huge org

From:	Ihor Radchenko
Subject:	Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Date:	Sun, 17 May 2020 23:00:10 +0800

Hi,

[All the changes below are relative to commit ed0e75d24. Later commits
make it hard to distinguish between hidden headlines and drawers. I will
need to figure out a way to merge this branch with master. It does not
seem to be trivial.]

I have finished a seemingly stable implementation of handling changes
inside drawer and block elements. For now, I did not bother with
'modification-hooks and 'insert-in-font/behind-hooks, but simply used
before/after-change-functions.

The basic idea is saving parsed org-elements before the modification
(with :begin and :end replaced by markers) and comparing them with the 
versions of the same elements after the modification.
Any valid org element can be examined in such way by an arbitrary
function (see org-track-modification-elements) [1].

For now, I have implemented tracking changes in all the drawer and block
elements. If the contents of an element is changed and the element is
hidden, the contents remains hidden unless the change was done with
self-insert-command. If the begin/end line of the element was changed in
the way that the element changes the type or extends/shrinks, the
element contents is revealed. To illustrate:

Case #1 (the element content is hidden):

:PROPERTIES:
:ID:       279e797c-f4a7-47bb-80f6-e72ac6f3ec55
:END:

is changed to

:ROPERTIES:
:ID:       279e797c-f4a7-47bb-80f6-e72ac6f3ec55
:END:

Text is revealed, because we have drawer in place of property-drawer

Case #2 (the element content is hidden):

:ROPERTIES:
:ID:       279e797c-f4a7-47bb-80f6-e72ac6f3ec55
:END:

is changed to

:OPERTIES:
:ID:       279e797c-f4a7-47bb-80f6-e72ac6f3ec55
:END:

The text remains hidden since it is still a drawer.

Case #3: (the element content is hidden):

:FOO:
bar
tmp
:END:

is changed to

:FOO:
bar
:END:
tmp
:END:

The text is revealed because the drawer contents shrank.

Case #4: (the element content is hidden in both the drawers):

:FOO:
bar
tmp
:END:
:BAR:
jjd
:END:

is changed to

:FOO:
bar
tmp
:BAR:
jjd
:END:

The text is revealed in both the drawers because the drawers are merged
into a new drawer.

> However, I think we can do better than that, and also fix the glitches
> from overlays. Here are two of them. Write the following drawer:
>
>   :FOO:
>   bar
>   :END:
>
> fold it and delete the ":f". The overlay is still there, and you cannot
> remove it with TAB any more. Or, with the same initial drawer, from
> beginning of buffer, evaluate:
>
>   (progn (re-search-forward ":END:") (replace-match ""))
>
> This is no longer a drawer: you just removed its closing line. Yet, the
> overlay is still there, and TAB is ineffective.

I think the above examples cover what you described.

Case #5 (the element content is hidden, point at <!>):

:FOO:<!>
bar
tmp
:END:

is changed (via self-insert-command) to

:FOO:a<!>
bar
tmp
:END:

The text is revealed.

This last case sounds logical and might potentially replace
org-catch-invisible-edits.

------------------------------------------------------------------------

Some potential issues with the implementation:

1. org--after-element-change-function can called many times even for
trivial operations. For example (insert "\n" ":TEST:") seems to call it
two times already. This has two implications: (1) potential performance
degradation; (2) org-element library may not be able to parse the
changed element because its intermediate modified state may not match
the element syntax. Specifically, inserting new property into
:PROPERTIES: drawer inserts a newline at some point, which makes
org-element-at-point think that it is not a 'property-drawer, but just
'drawer.

For (1), I did not really do any workaround for now. One potential way
is making use of combine-after-change-calls (info:elisp#Change Hooks).
At least, within org source code. 

For (2), I have introduced org--property-drawer-modified-re to override
org-property-drawer-re in relevant *-change-function. This seems to work
for property drawers. However, I am not sure if similar problem may
happen in some border cases with ordinary drawers or blocks. 

2. I have noticed that results of org-element-at-point and
org-element-parse-buffer are not always consistent.
In my tests, they returned different number of empty lines after drawers
(:post-blank and :end properties). I am not sure here if I did something
wrong in the code or if it is a real issue in org-element.

For now, I simply called org-element-at-point with point at :begin
property of all the elements returned by org-element-parse buffer to
make things consistent. This indeed introduced overhead, but I do not
see other way to solve the inconsistency.

3. This implementation did not directly solve the previously observed
issue with two ellipsis displayed in folded drawer after adding hidden
text inside:

:PROPERTY: ...  -->  :PROPERTY: ...  ...

For now, I just did

(org-hide-drawer-toggle 'off)
(org-hide-drawer-toggle 'hide)

to hide the second ellipsis, but I still don't understand why it is
happening. Is it some Emacs bug? I am not sure.

4. For some reason, before/after-change-functions do not seem to trigger
when adding note after todo state change. 

------------------------------------------------------------------------

Further plans:

1. Investigate the issue with log notes.
2. Try to change headings to use text properties as well.

The current version of the patch (relative to commit ed0e75d24) is
attached. 

------------------------------------------------------------------------

P.S. I have noticed an issue with hidden text on master (9bc0cc7fb) with
my personal config:

For the following .org file:

* TODO b
:PROPERTIES:
:CREATED:  [2020-05-17 Sun 22:37]
:END:

folded to

* TODO b...

Changing todo to DONE will be shown as

* DONE b
CLOSED: [2020-05-17 Sun 22:54]...:LOGBOOK:...

------------------------------------------------------------------------

[1] If one wants to track changes in two elements types, where one is
always inside the other, it is not possible now.

Best,
Ihor

featuredrawertextprop.patch
Description: Text Data


Nicolas Goaziou <address@hidden> writes:

> Ihor Radchenko <address@hidden> writes:
>
>> This should be better:
>> https://gist.github.com/yantar92/e37c2830d3bb6db8678b14424286c930
>
> Thank you.
>
>> This might get tricky in the following case:
>>
>> :PROPERTIES:
>> :CREATED: [2020-04-13 Mon 22:31]
>> <region-beginning>
>> :SHOWFROMDATE: 2020-05-11
>> :ID:       e05e3b33-71ba-4bbc-abba-8a92c565ad34
>> :END:
>>
>> <many subtrees in between>
>>
>> :PROPERTIES:
>> :CREATED:  [2020-04-27 Mon 13:50]
>> <region-end>
>> :ID:       b2eef49f-1c5c-4ff1-8e10-80423c8d8532
>> :ATTACH_DIR_INHERIT: t
>> :END:
>>
>> If the text in the region is replaced by something else, <many subtrees
>> in between> should not be fully hidden. We cannot simply look at the
>> 'invisible property before and after the changed region. 
>
> Be careful: "replaced by something else" is ambiguous. "Replacing" is an
> unlikely change: you would need to do 
>
>   (setf (buffer-substring x y) "foo")
>
> We can safely assume this will not happen. If it does, we can accept the
> subsequent glitch.
>
> Anyway it is less confusing to think in terms of deletion and insertion.
> In the case above, you probably mean "the region is deleted then
> something else is inserted", or the other way. So there are two actions
> going on, i.e., `after-change-functions' are called twice.
>
> In particular the situation you foresee /cannot happen/ with an
> insertion. Text is inserted at a single point. Let's assume this is in
> the first drawer. Once inserted, both text before and after the new text
> were part of the same drawer. Insertion introduces other problems,
> though. More on this below.
>
> It is true the deletion can produce the situation above. But in this
> case, there is nothing to do, you just merged two drawers into a single
> one, which stays invisible. Problem solved.
>
> IOW, big changes like the one you describe are not an issue. I think the
> "check if previous and next parts match" trick gives us roughly the same
> functionality, and the same glitches, as overlays.
>
> However, I think we can do better than that, and also fix the glitches
> from overlays. Here are two of them. Write the following drawer:
>
>   :FOO:
>   bar
>   :END:
>
> fold it and delete the ":f". The overlay is still there, and you cannot
> remove it with TAB any more. Or, with the same initial drawer, from
> beginning of buffer, evaluate:
>
>   (progn (re-search-forward ":END:") (replace-match ""))
>
> This is no longer a drawer: you just removed its closing line. Yet, the
> overlay is still there, and TAB is ineffective.
>
> Here's an idea to develop that would make folding more robust, and still
> fast.
>
> Each syntactical element has a "sensitive part", a particular area that
> can change the nature of the element when it is altered. Luckily drawers
> (and blocks) are sturdy. For a drawer, there are three things to check:
>
> 1. the opening line must match org-drawer-regexp
> 2. the closing line must match org-property-end-re (case ignored)
> 3. between those, you must not insert text match org-property-end-re or
>    org-outline-regexp-bol
>
> Obviously, point 3 needs not be checked during deletion.
>
> Instead of `after-change-functions', we may use `modification-hooks' for
> deletions, and `insert-behind-hooks' for insertions. For example, we
> might add modification-hooks property to both opening and closing line,
> and `insert-behind-hooks' on all the drawer. If any of the 3 points
> above is verified, we remove all properties.
>
> Note that if we can implement something robust with text properties, we
> might use them for headlines too, for another significant speed-up.
>
> WDYT?
>
>> I think that using fontification (something similar to
>> org-fontify-drawers) instead of after-change-functions should be
>> faster.
>
> I don't think it would be faster. With `after-change-functions',
> `modification-hooks' or `insert-behind-hook', we know exactly where the
> change happened. Fontification is fuzzier. It is not instantaneous
> either.
>
> It is an option only if we cannot do something fast and accurate with
> `after-change-functions', IMO.
>

-- 
Ihor Radchenko,
PhD,
Center for Advancing Materials Performance from the Nanoscale (CAMP-nano)
State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong 
University, Xi'an, China
Email: address@hidden, address@hidden

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, (continued)

Prev by Date: Re: Customizable fixed indentation column
Next by Date: Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Previous by thread: Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Next by thread: Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Index(es):
- Date
- Thread