bug-gnu-emacs


From: Tassilo Horn
Subject: bug#43016: replace-region-contents takes a lot of time when called from json-pretty-print-buffer
Date: Mon, 24 Aug 2020 21:15:42 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Eli Zaretskii <eliz@gnu.org> writes:

>> So basically I'd say the problem is in gnulib's compareseq.  If it
>> can't be fixed there, I see no other possibility than to stop using
>> replace-buffer/region-contents in json.el (and wherever it might also
>> be used).  That would be sad because except for the performance in
>> some cases, it's very nice. :-(
>
> Could we decide whether to use replace-* functions dynamically, based
> on the size of the region/buffer being prettified?

No, the size is just a secondary measure here.  I've successfully and
quickly prettified much larger JSON files.

I've just tried with some other sample json file which is about three
times the size of the file in this report.  That also triggered an
early_abort of compareseq but within MAX-SECS time.  And here I have
almost half a million compareseq_early_abort tests, not just 321.

I've now modified it to approximately the same size as the file from
ljell (the reporter).  With that, it doesn't trigger early_abort, is
fast, and performs slightly fewer than 300,000 early_abort tests.

Attachment: sample.json.gz
Description: application/gzip

So the question is: why does the file in this report, with about the
same size and number of changes between the minimized and prettified
versions, result in such strange numbers?

ljell is right, it seems to have to do with the non-ASCII characters.
In my sample.json.gz from above, I've just replaced every "e" with an
"Ê" (except in true/false literals).  When I prettify that, it aborts
early (fast) just after 449 early_abort_tests.

Attachment: sample-non-ascii.json.gz
Description: application/gzip
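
For anyone who wants to build a comparable test file, the substitution
described above can be sketched as follows.  This is only an
illustration, not necessarily how the attachment was produced: the
function names swap_e and make_non_ascii are made up here, and the
sketch assumes the input is valid minified JSON.  Parsing first means
the true/false literals become booleans and are never touched.

```python
import json


def swap_e(value):
    """Recursively replace "e" with "Ê" in every string (keys and values)."""
    if isinstance(value, str):
        return value.replace("e", "Ê")
    if isinstance(value, list):
        return [swap_e(item) for item in value]
    if isinstance(value, dict):
        return {swap_e(key): swap_e(val) for key, val in value.items()}
    # Booleans, None and numbers pass through unchanged, so the JSON
    # literals true/false keep their ASCII "e".
    return value


def make_non_ascii(minified_json):
    """Return minified JSON text with "e" replaced by "Ê" inside strings."""
    data = json.loads(minified_json)
    # ensure_ascii=False keeps "Ê" as a real character instead of \u00ca;
    # the separators reproduce a minified layout without extra spaces.
    return json.dumps(swap_e(data), ensure_ascii=False, separators=(",", ":"))
```

Note that numbers are re-serialized by the json module, so a file
containing exponent notation may differ from the original in more than
the "e" characters.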

So just replacing "e" with "Ê" changed "compares in time with 300,000
early_abort_tests" to "doesn't compare in time and makes only 449
early_abort_tests in that time".  The only difference from ljell's test
file is that with mine, there doesn't seem to be a big gap between the
last early_abort_test returning false and the one returning true.

> Btw, there's another problem with compareseq, see bug#42931.  I guess
> we need to add another criterion for early_abort, based on depth of
> recursion?

Ouch, I can also reproduce that.  But as Philipp Stephani said in that
report, I'd consider protecting against too deep recursion the job of
compareseq, not of its callers.

And I think the issue in that report would also vanish if compareseq
somehow ensured that its EARLY_ABORT expression is evaluated regularly,
i.e., that there are no long periods without a check.  That's the
thing I also observe with the 18 MB file from bug#42931: gazillions of
early_abort_tests early on, then a long phase with no test, and
eventually the segfault.  (Of course it's possible that other files
would result in a quick segfault due to unbounded recursion where that
wouldn't help either...)
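
To make "evaluated regularly" concrete, here is a minimal sketch of a
time budget that is polled at a fixed interval of work units.  It is
not gnulib's compareseq and not the Emacs implementation; Budget,
run_with_regular_checks and check_every are hypothetical names chosen
for the illustration.  The point is only that the gap between two
checks is bounded by check_every units of work, so the abort can never
be delayed by a long phase without a test.

```python
import time


class Budget:
    """A deadline that counts how often it has been checked."""

    def __init__(self, max_secs, clock=time.monotonic):
        self.clock = clock
        self.deadline = clock() + max_secs
        self.checks = 0

    def expired(self):
        """Return True once the clock has passed the deadline."""
        self.checks += 1
        return self.clock() > self.deadline


def run_with_regular_checks(steps, budget, check_every=1024):
    """Process up to `steps` work units; return how many were completed.

    The budget is polled every `check_every` units, so at most that
    many units are processed between two consecutive checks.
    """
    for done in range(steps):
        if done % check_every == 0 and budget.expired():
            return done  # abort early, reporting the progress so far
        # One unit of real work would go here.
    return steps
```

A caller would pass something like Budget(max_secs=1.0) and fall back
to a cheaper strategy whenever the returned count is smaller than the
requested number of steps.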

Bye,
Tassilo
