Re: GUILE 2/3 and string encoding cost

lilypond-devel


From:	Carl Sorensen
Subject:	Re: GUILE 2/3 and string encoding cost
Date:	Wed, 22 Jan 2020 20:28:41 +0000
User-agent:	Microsoft-MacOutlook/10.10.10.191111


On 1/22/20, 1:21 PM, David Kastrup <address@hidden> wrote:

    Han-Wen Nienhuys <address@hidden> writes:
    
    > On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <address@hidden> wrote:
    >
    >> Han-Wen Nienhuys <address@hidden> writes:
    >>
    >> > I looked a bit through the GUILE source code to see what is going on.
    >> >
    >> > I believe our current hypothesis (LilyPond's slowdown is caused by
    >> > expensive unicode transcoding into 32-bit strings) is incorrect.
    >> >
    >> > If you look into the source code, you can see that the UTF-8 -> SCM
    >> > conversion checks whether there are any code points over 255:
    >> >
    >> > https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
    >> >
    >> > If there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
    >> > string as a normal byte array. This code walks the string twice, but
    >> > that is very cheap due to CPU cache locality, so it should be
    >> > essentially equivalent to whatever GUILE 1.8 was doing.
    >>
    >> GUILE 1.8 did not walk the string even once.
    >>
    >
    > GUILE 1.8 walks it once when you do memcpy.
    
    Ok, but that's sort of a cheap walk.
    
    >> > Even so, if the input file does use UTF-8, there should be little
    >> > overhead, because the number of texts that we process is always
    >> > small. LilyPond is not a text processor.
    >> >
    >> > So, what hard data do we have on GUILE 2/3 slowness, and what does
    >> > that data say?
    >>
    >> That data says "humongous slowdown".  As far as I know, there is not
    >> much more than speculation about what causes it.
    >>
    > Do we have a standardized test file for benchmarking performance?
    
    input/regression/mozart-hrn-3.ly possibly, but it's not particularly
    large.

We don't have a standardized test file, but we do have some representative 
results from a couple of (unknown but described) files:

https://lists.gnu.org/archive/html/lilypond-devel/2018-10/msg00054.html

Perhaps we could get those files to become standards (along with some other, 
shorter-compiling files).

Carl
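
The narrow-versus-wide decision described in the quoted discussion can be
sketched in C. This is a minimal illustration, not GUILE's actual code: the
names sketch_string, decode_one and sketch_from_utf8 are hypothetical, and the
decoder assumes well-formed UTF-8 input and does no validation. It shows the
two walks over the input: one to count characters and look for a code point
above 255, one to fill either a byte array or a 32-bit array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container, not GUILE's real string representation. */
typedef struct {
    int narrow;      /* 1: one byte per character (Latin-1); 0: 32-bit characters */
    size_t len;      /* number of characters */
    uint8_t *bytes;  /* filled when narrow == 1 */
    uint32_t *wide;  /* filled when narrow == 0 */
} sketch_string;

/* Decode one code point and return the number of bytes consumed.
   Assumes well-formed UTF-8; a real decoder must reject malformed input. */
size_t decode_one(const unsigned char *p, uint32_t *cp)
{
    if (p[0] < 0x80) { *cp = p[0]; return 1; }
    if (p[0] < 0xE0) {
        *cp = ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
        return 2;
    }
    if (p[0] < 0xF0) {
        *cp = ((uint32_t)(p[0] & 0x0F) << 12) | ((uint32_t)(p[1] & 0x3F) << 6)
            | (uint32_t)(p[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t)(p[0] & 0x07) << 18) | ((uint32_t)(p[1] & 0x3F) << 12)
        | ((uint32_t)(p[2] & 0x3F) << 6) | (uint32_t)(p[3] & 0x3F);
    return 4;
}

sketch_string sketch_from_utf8(const char *s, size_t nbytes)
{
    const unsigned char *p = (const unsigned char *)s;
    sketch_string out = { 1, 0, NULL, NULL };
    uint32_t cp;
    size_t i, k;

    assert(s != NULL);

    /* Walk 1: count characters and check for any code point over 255. */
    for (i = 0; i < nbytes; ) {
        i += decode_one(p + i, &cp);
        if (cp > 255)
            out.narrow = 0;
        out.len++;
    }

    /* Allocate one byte or four bytes per character, as decided above. */
    if (out.narrow)
        out.bytes = malloc(out.len ? out.len : 1);
    else
        out.wide = malloc((out.len ? out.len : 1) * sizeof *out.wide);
    if (out.bytes == NULL && out.wide == NULL) {
        out.len = 0;  /* allocation failed; return an empty result */
        return out;
    }

    /* Walk 2: decode again and store each character. */
    for (i = 0, k = 0; i < nbytes; k++) {
        i += decode_one(p + i, &cp);
        if (out.narrow)
            out.bytes[k] = (uint8_t)cp;
        else
            out.wide[k] = cp;
    }
    return out;
}
```

Under these assumptions, input whose code points all fit in 8 bits ends up as
a plain byte array, and a single character above 255 switches the whole string
to 32-bit storage. The caller owns the allocated buffer and frees it.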
