From: Dmitry Gutov
Subject: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 06:48:32 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0
On 14.08.2021 11:23, João Távora wrote:
> Dmitry Gutov <dgutov@yandex.ru> writes:
>
>> Aside from the mutability/ownership issue,
>>
>> On 13.08.2021 15:05, João Távora wrote:
>>
>>> If one removes these lines, the process becomes much faster, but
>>> there is a problem with highlighting. My idea is indeed to defer
>>> highlighting by not setting the 'face property directly on that
>>> shared string, but some other property that is read later from the
>>> shared string by compliant frontends.
>>
>> I haven't done any direct benchmarking, but I'm pretty sure that
>> this approach cannot, by definition, be as fast as the non-mutating
>> one. Because you go through the whole list and mutate all of its
>> elements, you have to perform a certain bit of work W, N times,
>> where N is the size of the whole list.
>
> Let's call W the work that you perform N times in this situation. In
> the non-mutating approach, let's call it Z. So W <= Z, because Z not
> only propertizes the string with a calculation of faces but _also
> copies its character contents_.
As I pointed out later in the email you're replying to, copying won't happen N times.
> Also I think it's better to talk about copying rather than mutating.
> As Eli points out, putting a text property in a string (my idea) is
> not equivalent to "mutating" it.
In common industry terms, that's mutation. Lisp strings are not C strings, they are aggregate objects.
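To make the distinction concrete, here is a rough sketch of the two operations under discussion (illustrative only; both functions are real Emacs primitives, the surrounding code is just an example):

```elisp
;; `put-text-property' mutates the string object in place: every
;; completion table or cache that shares this string object sees the
;; new property.
(let ((s "foobar"))
  (put-text-property 0 3 'face 'completions-common-part s)
  s)  ; same object, now carrying a 'face property

;; `propertize' returns a fresh string with the property attached,
;; leaving the original object untouched -- this copying of character
;; contents is the extra cost the thread is arguing about.
(let ((s "foobar"))
  (propertize s 'face 'completions-common-part))  ; new string; `s' unchanged
```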
>> Whereas the deferred-mutation approach will only have to do its bit
>> (which is likely more work, like, WWW) only 20 times, or 100 times,
>> or however many completions are displayed. And this is usually
>> negligible.
>
> I think you're falling into the same fallacy you briefly fell into in
> the other bug report. The flex and pcm styles (meaning
> completion-pcm--hilit-commonality) have to go through all the
> completions when deciding the score to attribute to each completion
> that we already know matches the pattern. That's because this scoring
> is essential to sorting. That's a given in both scenarios, copying
> and non-copying.
First of all, note that scoring is only essential to the 'flex' style. The improvements we're discussing should benefit all styles, and can be more pronounced when the scoring doesn't need to be performed.
But ok, let's talk about flex in particular.
> Then, it's true that only a very small set of those eventually have
> to be displayed to the user, depending on where she wants her
> scrolling page to be. So that's when you have to apply 'face' to,
> say, 20 strings, and that can indeed be pretty fast. But where does
> the information come from?
>
> - Currently, it comes from the string's 'face' itself, which was
>   copied entirely.
>
> - In the non-copying approach, it must come from somewhere else. One
>   idea is that it comes from a new "private" property 'lazy-face',
>   also in the string itself, but which was _not_ copied. Another
>   idea is just to remember the pattern and re-match it to those 20
>   strings.
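A rough sketch of the first idea, deferring via a private property (the property name `completion--lazy-face` and both function names are hypothetical, not an existing Emacs API):

```elisp
;; Backend side: instead of propertizing every candidate with 'face up
;; front, stash the match positions on the shared string under a
;; private property. This mutates the string, but only with data no
;; display code reads directly.
(defun my-record-deferred-highlight (string positions)
  (put-text-property 0 1 'completion--lazy-face positions string)
  string)

;; Frontend side: a compliant frontend copies and highlights only the
;; ~20-100 candidates it actually displays.
(defun my-apply-deferred-highlight (string)
  (let ((positions (get-text-property 0 'completion--lazy-face string)))
    (if (not positions)
        string
      (setq string (copy-sequence string))  ; copy only what's shown
      (dolist (pos positions)
        (put-text-property pos (1+ pos) 'face
                           'completions-common-part string))
      string)))
```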
Let's say that the cost to compute the score (on one completion) is S. And the cost to highlight it is H. The cost to copy a string is C (that would be amortized cost, including the subsequent GCs).
The current algorithm costs:

  N x (C + S + H)

C is unavoidable because of the existing API guarantees.

A non-mutating algorithm can cost:

  N x S (for sorting) + 100 x (C + S + H)

(let's say we didn't even cache the scoring info), where 100 is the number of displayed completions (the actual number is usually lower).
As we measured previously, copying is quite expensive. Even in the above, non-caching approach we win (N - 100) x (C + H) and, okay, lose 100 x S. For high values of N, it should be a clear win.
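A back-of-the-envelope check of the two formulas, with made-up unit costs (in practice C, S and H are not equal, and copying tends to dominate):

```elisp
;; Sketch only: N = 10000 candidates, 100 displayed, and pretend each
;; of C (copy), S (score) and H (highlight) costs 1 unit.
(let ((n 10000) (displayed 100)
      (c 1) (s 1) (h 1))
  (list :mutating     (* n (+ c s h))          ; 10000 x 3 = 30000
        :non-mutating (+ (* n s)               ; sorting: 10000
                         (* displayed (+ c s h)))))  ; + 300 = 10300
```

Even with these flattering-to-nobody unit costs, the non-mutating variant does roughly a third of the work, and the gap widens as N grows.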
>> I think the second alternative is always faster. However big the
>> difference is going to be, I can't say in advance, of course, or
>> whether it's going to be shadowed by some other performance costs.
>> But the non-mutating approach should have the best optimization
>> potential when the list is long.
>
> Don't think so. I'm doing benchmarks, will post soon.
I'm guessing you just skip the C step in your benchmarks? That breaks our current contract.
Still, Daniel's patch should provide a comparable performance improvement. If you're saying it doesn't give numbers as good, I'll have to give it a more thorough read and some testing tomorrow before I can comment on that.