guile-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: efficient implementation of string-replace-substring / string-replac


From: Mark H Weaver
Subject: Re: efficient implementation of string-replace-substring / string-replace-all
Date: Mon, 24 Mar 2014 01:19:12 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Andy Wingo <address@hidden> writes:

> On Fri 13 Sep 2013 21:41, Mark H Weaver <address@hidden> writes:
>
>> Here's an implementation that does this benchmark about 80 times faster
>> on my machine: (20 milliseconds vs 1.69 seconds)
>>
>> (define* (string-replace-substring s substr replacement
>>                                    #:optional
>>                                    (start 0)
>>                                    (end (string-length s)))
>>   (let ((substr-length (string-length substr)))
>>     (if (zero? substr-length)
>>         (error "string-replace-substring: empty substr")
>>         (let loop ((start start)
>>                    (pieces (list (substring s 0 start))))
>>           (let ((idx (string-contains s substr start end)))
>>             (if idx
>>                 (loop (+ idx substr-length)
>>                       (cons* replacement
>>                              (substring s start idx)
>>                              pieces))
>>                 (string-concatenate-reverse (cons (substring s start)
>>                                                   pieces))))))))
>
> Inspired to code-golf a bit, here's one that's even faster :)
>
> (define (string-replace-substring s substring replacement)
>   "Replace every instance of substring in s by replacement."
>   (let ((sublen (string-length substring)))
>     (with-output-to-string
>       (lambda ()
>         (let lp ((start 0))
>           (cond
>            ((string-contains s substring start)
>             => (lambda (end)
>                  (display (substring/shared s start end))
>                  (display replacement)
>                  (lp (+ end sublen))))
>            (else
>             (display (substring/shared s start)))))))))
>
> Just marginally so, though.

Nice!  I confess that I find this very surprising.  I would have
expected that the overhead in creating the string port, repeatedly
expanding the string buffer, doing UTF-8 encoding in 'display', and
decoding the UTF-8 when retrieving the result string, would add up to
something slower than what I had.  But experiment trumps theory, and I
guess I'll take your word on it that you did some reasonable benchmarks
and determined that my intuitions were wrong :)

One warning though: in Guile 2.0, string ports only support characters
representable in the %default-port-encoding, which defaults to the
encoding of the current locale.  Importing (srfi srfi-6) fixes this for
open-input-string and open-output-string, but with-output-to-string
remains limited.  Therefore, I recommend using the code above only in
Guile master, where string ports are proper.

    Regards,
      Mark



reply via email to

[Prev in Thread] Current Thread [Next in Thread]