bug-gnu-emacs

bug#64735: 29.0.92; find invocations are ~15x slower because of ignores


From: Dmitry Gutov
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Mon, 24 Jul 2023 15:55:13 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 24/07/2023 14:20, Eli Zaretskii wrote:
> Date: Sun, 23 Jul 2023 22:27:26 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>   64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>

>> On 23/07/2023 20:56, Eli Zaretskii wrote:
>>> And, ideally, do all the relevant benchmarking when proposing the change.
>> Of course.  Although the benchmarks until now already show quite a
>> variability.

>> Speaking of your MS Windows results that are unflattering to 'find', it
>> might be worth it to do a more varied comparison, to determine the
>> OS-specific bottleneck.

>> Off the top of my head, here are some possibilities:

>> 1. 'find' itself is much slower there. There is room for improvement in
>> the port.

> I think it's the filesystem, not the port (which I did myself in this
> case).

But directory-files-recursively goes through the same filesystem, doesn't it?

> But I'd welcome similar tests on other Windows systems with
> other ports of Find.  Just remember to measure this particular
> benchmark, not just Find itself from the shell, as the times are very
> different (as I reported up-thread).

Concur.

>> 2. The process output handling is worse.

> Not sure what that means.

Emacs's ability to process the output of a process on the particular platform.

You said:

  Btw, the Find command with pipe to some other program, like wc,
  finishes much faster, like 2 to 4 times faster than when it is run
  from find-directory-files-recursively.  That's probably the slowdown
  due to communications with async subprocesses in action.

One thing to try is changing the -with-find implementation to use a synchronous call (e.g. using 'process-file'), to compare. And repeat these tests on GNU/Linux too.

That would help us gauge the viability of using an asynchronous process to get the file listing. But also, if one were just looking into reimplementing directory-files-recursively using 'find' (to create an endpoint with swappable implementations, for example), 'process-file' would be a suitable substitute, because the original is also synchronous.

>> 3. Something particular to the project being used for the test.

> I don't think I understand this one.

This described the possibility that the disparity between the implementations' runtimes is due to something unusual in the project structure. If you tested different projects on Windows and on GNU/Linux, that would make a direct comparison less useful. It's the least likely cause, but still sometimes a possibility.

>> To look into the possibility #1, you can try running the same command in
>> the terminal with the output to NUL and comparing the runtime to what's
>> reported in the benchmark.

> Output to the null device is a bad idea, as (AFAIR) Find is clever
> enough to detect that and do nothing.  I run "find | wc" instead, and
> already reported that it is much faster.

Now I see it, thanks.
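
For anyone who wants to repeat that kind of measurement outside Emacs, here is a rough sketch in Python. It is only an illustration, not what was run in this thread. It times 'find' with its output read through a pipe and counted (similar in spirit to "find | wc", so nothing goes to the null device), and times an in-process directory walk over the same tree. The function names are made up for this example. It assumes a 'find' that supports -print0 (GNU and BSD find do). On MS Windows, pass the full path of the GNU port as find_program, since a bare "find" may resolve to the unrelated system FIND.EXE.

```python
import os
import subprocess
import time


def count_with_find(root, find_program="find"):
    """Run find on ROOT, read its output through a pipe, and count files.

    Returns (file_count, elapsed_seconds).  The output is consumed by this
    process rather than sent to the null device.
    """
    start = time.perf_counter()
    proc = subprocess.run(
        [find_program, root, "-type", "f", "-print0"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    elapsed = time.perf_counter() - start
    # With -print0 every file name is terminated by a NUL byte, so the
    # count stays correct even if a file name contains a newline.
    return proc.stdout.count(b"\0"), elapsed


def count_with_scandir(root):
    """Walk ROOT in-process and count regular files (symlinks not followed).

    Returns (file_count, elapsed_seconds).
    """
    start = time.perf_counter()
    count = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        count += 1
        except OSError:
            # Unreadable directory: skip it, as find would (with a warning).
            continue
    return count, time.perf_counter() - start
```

Calling both functions on the same project root, a few times each so that the filesystem cache is warm, gives two numbers that do not involve Emacs at all. If 'find' through a pipe is already much slower than the in-process walk there, that points at the port or the filesystem rather than at process output handling in Emacs.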

>> I actually remember, from my time on MS Windows about 10 years ago, that
>> some older ports of 'find' and/or 'grep' did have performance problems,
>> but IIRC ezwinports contained the improved versions.

> The ezwinports is the version I'm using here.  But maybe someone came
> up with a better one: after all, I did my port many years ago (because
> the native ports available back then were abysmally slow).

We should also look at the exact numbers. If you say that the "| wc" invocation is 2-4x faster than what's reported in the benchmark, then it takes about 2-4 seconds, which is still oddly slower than your reported numbers for directory-files-recursively.




