bug-gnu-emacs

bug#31138: Native json slower than json.el


From: Eli Zaretskii
Subject: bug#31138: Native json slower than json.el
Date: Tue, 23 Apr 2019 13:22:34 +0300

> Cc: p.stephani2@gmail.com, sebastien@chapu.is, yyoncho@gmail.com,
>  31138@debbugs.gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Tue, 23 Apr 2019 00:03:30 +0300
> 
> On 22.04.2019 20:24, Eli Zaretskii wrote:
> 
> > Yes, but I'm slightly surprised why you loop from the end of the
> > string and not from the beginning.
> 
> To avoid creating an additional pointer variable.

I don't think it matters, and looping forward is more natural and may
even be slightly faster.

> > I guess that's expected when the strings in JSON are short enough.
> 
> Longer strings take a proportional amount of time to encode, though 
> (only 2x as fast per character, IIRC).

I was talking about decoding.  Assuming that decode_coding_utf_8 has
some setup overhead before it starts the loop of processing the bytes,
that overhead will become less significant with longer strings.  And
indeed, if I make the strings in large.json be 10K characters (can
this happen in real-life JSONs?), the speedup from using
make_specified_string for valid UTF-8 input goes down to just 40% for
unoptimized builds and 20% for optimized (see the timing data below).
But it's still faster even for such large strings, so I installed a
variant of what we were discussing.
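
For readers who want to see the idea in code, here is a minimal sketch
of the kind of check involved: scan the bytes forward once, and only
if the scan fails fall back to the general decoder.  This is not the
code that was installed in Emacs; the name utf8_valid_p is
hypothetical, and the strictness rules (rejecting overlong forms,
surrogates, and values above U+10FFFF) are an assumption about what
"valid UTF-8" should mean here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not the function installed in Emacs.
   Return true if the LEN bytes starting at S are well-formed UTF-8.  */
bool
utf8_valid_p (const unsigned char *s, size_t len)
{
  const unsigned char *p = s;
  const unsigned char *end = s + len;

  while (p < end)
    {
      unsigned char c = *p;
      size_t n;           /* number of continuation bytes expected */
      unsigned long min;  /* smallest code point this length may encode */
      unsigned long cp;   /* code point decoded so far */

      if (c < 0x80)
        {
          /* ASCII byte: always valid on its own.  */
          p++;
          continue;
        }
      else if ((c & 0xE0) == 0xC0)
        { n = 1; min = 0x80; cp = c & 0x1F; }
      else if ((c & 0xF0) == 0xE0)
        { n = 2; min = 0x800; cp = c & 0x0F; }
      else if ((c & 0xF8) == 0xF0)
        { n = 3; min = 0x10000; cp = c & 0x07; }
      else
        return false;     /* stray continuation or invalid lead byte */

      if ((size_t) (end - p) <= n)
        return false;     /* sequence is truncated */

      for (size_t i = 1; i <= n; i++)
        {
          if ((p[i] & 0xC0) != 0x80)
            return false; /* not a continuation byte */
          cp = (cp << 6) | (p[i] & 0x3F);
        }

      /* Reject overlong forms, surrogates, and out-of-range values.  */
      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;

      p += n + 1;
    }
  return true;
}
```

A caller would build the Lisp string directly from the bytes when this
returns true, and run the full decoder otherwise, so the per-string
setup cost of the decoder is paid only for input that needs it.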

Comparing with json.el shows that we've got an 8-fold to 10-fold
speedup in optimized builds.

Here are my timings for the various variants ("large" means with JSON
input where all strings were enlarged to 10K characters):

  variant                       | unoptimized | optimized
  ------------------------------+-------------+----------
  current master                |    3.563    |   0.664
  current master, large         |  174.0      |  43.34
  no validation                 |    0.980    |   0.326
  no validation, large          |  105.1      |  33.13
  coding_system directly        |    2.962    |   0.660
  coding_system directly, large |  173.4      |  43.19
  UTF-8 validation              |    0.980    |   0.334
  UTF-8 validation, large       |  105.9      |  34.36

In all cases, the times are from 10 benchmark loops, after subtracting
the time used by GC.
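
The 40% and 20% figures quoted earlier can be checked against the
table, assuming "speedup" is read as the fraction of run time saved
relative to current master.  The helper below is only an illustration
of that arithmetic; the name time_saved is hypothetical.

```c
/* Fraction of run time saved when a run that took T_OLD now takes
   T_NEW.  Both arguments must be in the same unit, and T_OLD must
   be nonzero.  */
double
time_saved (double t_old, double t_new)
{
  return (t_old - t_new) / t_old;
}
```

Under that reading, time_saved (174.0, 105.9) is about 0.39 and
time_saved (43.34, 34.36) is about 0.21 for the "large" rows, while
the short-string rows give about 0.72 (3.563 vs 0.980) and about 0.50
(0.664 vs 0.334).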

Thanks.




