bug#64735: 29.0.92; find invocations are ~15x slower because of ignores

bug-gnu-emacs

From:	Eli Zaretskii
Subject:	bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date:	Fri, 08 Sep 2023 09:35:45 +0300

> Date: Fri, 8 Sep 2023 03:53:37 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> >> (("with-find 4096" . "Elapsed time: 1.737742s (1.019624s in 28 GCs)")
> >>  ("with-find 40960" . "Elapsed time: 1.515376s (0.942906s in 26 GCs)")
> >>  ("with-find 409600" . "Elapsed time: 1.458987s (0.948857s in 26 GCs)")
> >>  ("with-find 1048576" . "Elapsed time: 1.363882s (0.888599s in 24 GCs)")
> >>  ("with-find-p 4096" . "Elapsed time: 1.202522s (0.745758s in 19 GCs)")
> >>  ("with-find-p 40960" . "Elapsed time: 1.005221s (0.640815s in 16 GCs)")
> >>  ("with-find-p 409600" . "Elapsed time: 0.855483s (0.591208s in 15 GCs)")
> >>  ("with-find-p 1048576" . "Elapsed time: 0.825936s (0.623876s in 16 GCs)")
> >>  ("with-find-sync 4096" . "Elapsed time: 0.848059s (0.272570s in 7 GCs)")
> >>  ("with-find-sync 409600" . "Elapsed time: 0.912932s (0.339230s in 9 GCs)")
> >>  ("with-find-sync 1048576" . "Elapsed time: 0.880479s (0.303047s in 8 GCs)"))
> >>
> >> What was puzzling for me, overall, is that if we take "with-find 409600"
> >> (the fastest among the asynchronous runs without parallelism) and
> >> "with-find-sync", the difference in GC time (which is repeatable),
> >> 0.66s, almost covers all the difference in performance. And as for
> >> "with-find-p 409600", it would come out on top! Which it did in Ihor's
> >> tests when GC was disabled.
> >>
> >> But where does the extra GC time come from? Is it from extra consing in
> >> the asynchronous call's case? If it is, it's not from all the chunked
> >> strings, apparently, given that increasing max string's size (and
> >> decreasing their number by 2x-6x, according to my logging) doesn't
> >> affect the reported GC time much.
> >>
> >> Could the extra time spent in GC just come from the fact that it's given
> >> more opportunities to run, maybe? call_process stays entirely in C,
> >> whereas make-process, with its asynchronous approach, goes between C and
> >> Lisp every time it receives input. The report above might indicate so:
> >> with-find-p has ~20 garbage collection cycles, whereas with-find-sync has
> >> only ~10. Or could there be some other source of consing, unrelated to
> >> the process output strings and how finely they are sliced?
> > 
> > These questions can only be answered by dumping the values of the 2 GC
> > thresholds and of consing_until_gc for each GC cycle.  It could be
> > that we are consing more Lisp memory, or it could be that one of the
> > implementations provides fewer opportunities for Emacs to call
> > maybe_gc.  Or it could be some combination of the two.
> 
> Do you think the outputs of 
> https://elpa.gnu.org/packages/emacs-gc-stats.html could help?

I think you'd need to expose consing_until_gc to Lisp, and then you
can collect the data from Lisp.

> Otherwise, I suppose I need to add some fprintf's somewhere. Would the 
> beginning of maybe_gc inside lisp.h be a good place for that?

I can only recommend the fprintf method if doing this from Lisp is
impossible for some reason.

> >> If we get back to increasing read-process-output-max, which does help
> >> (apparently by reducing the number of times we switch between reading
> >> from the process and doing... whatever else), the sweet spot seems to be
> >> 1048576, which is my system's maximum value. Anything higher - and the
> >> perf goes back to worse -- I'm guessing something somewhere resets the
> >> value to default? Not sure why it doesn't clip to the maximum allowed,
> >> though.
> >>
> >> Anyway, it would be helpful to be able to decide on as high a value as
> >> possible without manually reading from /proc/sys/fs/pipe-max-size. And
> >> what of other OSes?
> > 
> > Is this with pipes or with PTYs?
> 
> All examples which use make-process call it with :connection-type 'pipe.
> 
> The one that calls process-file (the "synchronous" impl) also probably 
> does, but I don't see that in the docstring.

Yes, call-process uses pipes.  So finding the optimum boils down to
running various scenarios.  It is also possible that the optimum will
be different on different systems, btw.
