guile-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: string-map arg order


From: Alex Shinn
Subject: Re: string-map arg order
Date: 06 Sep 2001 18:05:42 -0400
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.0.104

>>>>> "Dirk" == Dirk Herrmann <address@hidden> writes:

    Dirk> On 6 Sep 2001, Alex Shinn wrote:
    >> But here without threads you're mutating the object
    >> that you're looping over inside a loop.  Is this something that
    >> has to be defined?

    Dirk> I don't think that changing the string semantics is an
    Dirk> option we should consider.  IMO, guile should follow RnRS.

Yes, definitely, I wasn't actually suggesting this.  I had been
looking through R5RS and the SRFI'S to find some precedent - the
closest I found was map, which would have undefined results if you
mutated the list you were mapping over.  But this is the wrong way to
think... I was conceptually caught in the idea of treating strings are
more opaque objects, when in fact they are really character arrays.
string-for-each explicitly says it traverses in left-to-right order,
and there's no reason to think you couldn't mutate the string inside
the loop.  By implementing strings as something other than character
arrays (what utf8 would mean) we end up losing performance because
strings must still behave as character arrays.

    Dirk> Given that, do you see my point that iteration over strings
    Dirk> (even with functions like string-for-each) is O(n*n) for
    Dirk> variable character length encodings, independent of whether
    Dirk> you use threading or not?

Yes.

    Dirk> Again: Both string representations can be used to perform
    Dirk> the same kind of tasks.  Fixed width encodings can use up
    Dirk> more memory and may need additional effort converting from
    Dirk> and to them (given that the rest of the world uses a
    Dirk> different format).  Iteration and a lot of other operations
    Dirk> can, however, be performed in a comparably efficient way.
    Dirk> Variable width encodings can be more space efficient, but
    Dirk> many (and many very common) operations are worse (by a
    Dirk> complexity factor of O(n)) with respect to execution time.
    Dirk> You could, however, save some conversion overhead if the
    Dirk> rest of the world uses the same format.

All true.  I'd like to add that conversions are not only forced by
external considerations, but internal ones (e.g. appending different
byte-sized strings).  Also, performance is not the only concern.
There's the simplicity of the API (Guile used to have multiple string
representations, but no one used them), and the ease with which people
can upgrade.  There's also the complexity of the coding involved, and
the fact that everything using strings (symbols, ports, regexps, etc.)
will have to be able to handle all forms of strings.

-- 
Alex Shinn <address@hidden>



reply via email to

[Prev in Thread] Current Thread [Next in Thread]