[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: efficient implementation of string-replace-substring / string-replac
From: |
Mark H Weaver |
Subject: |
Re: efficient implementation of string-replace-substring / string-replace-all |
Date: |
Mon, 24 Mar 2014 01:19:12 -0400 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) |
Andy Wingo <address@hidden> writes:
> On Fri 13 Sep 2013 21:41, Mark H Weaver <address@hidden> writes:
>
>> Here's an implementation that does this benchmark about 80 times faster
>> on my machine: (20 milliseconds vs 1.69 seconds)
>>
>> (define* (string-replace-substring s substr replacement
>> #:optional
>> (start 0)
>> (end (string-length s)))
>> (let ((substr-length (string-length substr)))
>> (if (zero? substr-length)
>> (error "string-replace-substring: empty substr")
>> (let loop ((start start)
>> (pieces (list (substring s 0 start))))
>> (let ((idx (string-contains s substr start end)))
>> (if idx
>> (loop (+ idx substr-length)
>> (cons* replacement
>> (substring s start idx)
>> pieces))
>> (string-concatenate-reverse (cons (substring s start)
>> pieces))))))))
>
> Inspired to code-golf a bit, here's one that's even faster :)
>
> (define (string-replace-substring s substring replacement)
> "Replace every instance of substring in s by replacement."
> (let ((sublen (string-length substring)))
> (with-output-to-string
> (lambda ()
> (let lp ((start 0))
> (cond
> ((string-contains s substring start)
> => (lambda (end)
> (display (substring/shared s start end))
> (display replacement)
> (lp (+ end sublen))))
> (else
> (display (substring/shared s start)))))))))
>
> Just marginally so, though.
Nice! I confess that I find this very surprising. I would have
expected that the overhead in creating the string port, repeatedly
expanding the string buffer, doing UTF-8 encoding in 'display', and
decoding the UTF-8 when retrieving the result string, would add up to
something slower than what I had. But experiment trumps theory, and I
guess I'll take your word on it that you did some reasonable benchmarks
and determined that my intuitions were wrong :)
One warning though: in Guile 2.0, string ports only support characters
representable in the %default-port-encoding, which defaults to the
encoding of the current locale. Importing (srfi srfi-6) fixes this for
open-input-string and open-output-string, but with-output-to-string
remains limited. Therefore, I recommend using the code above only in
Guile master, where string ports are proper.
Regards,
Mark