[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: string vs list processing
From: |
Sascha Ziemann |
Subject: |
Re: string vs list processing |
Date: |
16 Apr 2001 15:40:48 +0200 |
User-agent: |
Gnus/5.0803 (Gnus v5.8.3) Emacs/20.6 |
Michael Livshin <address@hidden> writes:
| Sascha Ziemann <address@hidden> writes:
|
| > Is it right, that in Guile list processing is much faster than string
| > processing?
|
| "processing" is not precise enough.
|
| it is certainly true that consing new pairs is *much* faster than
| consing new strings, since the latter involves calling `malloc'.
|
| that seems to explain your results.
I have more examples. Task: remove the escape chars from a HTTP
request.
This is the first version I found on the net:
----------------------------------------------------------------------
(define-public (url-decode str)
;; for efficiency reasons, we scan the string backwards, consing
;; a list of replacement characters as we go; any plus is replaced
;; by a space, and if a `%' is seen then the last two conses are
;; replaced by their hex-decoded equivalent. A regex replace would
;; be conceptually cleaner, but less efficient, and it would require
;; invoking a closure for each hex decode. Maybe scsh's funky
;; regex support would be helpful here!
(let decode ((i (1- (string-length str)))
(chars '()))
(if (< i 0)
(apply string chars)
(case (string-ref str i)
;; convert plusses
((#\+) (decode (1- i) (cons #\space chars)))
;; hex-decode bytes
((#\%) (let ((hi (car chars))
(lo (cadr chars)))
(decode (1- i) (cons (integer->char (hex->number hi lo))
(cddr chars)))))
;; default: copy char verbatim
(else (decode (1- i) (cons (string-ref str i) chars)))))))
(define (hex->number hi lo)
(+ (* 16 (hex-char->number hi))
(hex-char->number lo)))
(define (hex-char->number ch)
(if (char-alphabetic? ch)
(+ 10
(- (char->integer (char-upcase ch))
(char->integer #\A)))
(- (char->integer ch)
(char->integer #\0))))
----------------------------------------------------------------------
And this the second:
----------------------------------------------------------------------
(define (decode-uri str)
(regexp-substitute/global
#f "\\+|%([0-9A-Fa-f][0-9A-Fa-f])" str
'pre
(lambda (m)
(cond ((string=? "+" (match:substring m 0)) " ")
(else (integer->char
(string->number
(match:substring m 1)
16)))))
'post))
----------------------------------------------------------------------
And this is what I wrote:
----------------------------------------------------------------------
(define (hex-char->integer ch)
(case ch
((#\0) 0) ((#\1) 1) ((#\2) 2) ((#\3) 3) ((#\4) 4)
((#\5) 5) ((#\6) 6) ((#\7) 7) ((#\8) 8) ((#\9) 9)
((#\a #\A) 10) ((#\b #\B) 11) ((#\c #\C) 12)
((#\d #\D) 13) ((#\e #\E) 14) ((#\f #\F) 15)
(else #f)))
(define (integer->hex-char int)
(case int
((0) #\0) ((1) #\1) ((2) #\2) ((3) #\3) ((4) #\4)
((5) #\5) ((6) #\6) ((7) #\7) ((8) #\8) ((9) #\9)
((10) #\A) ((11) #\B) ((12) #\C)
((13) #\D) ((14) #\E) ((15) #\F)
(else #f)))
(define-public (uri-unescape str)
(let loop ((done '())
(todo (string->list str)))
(if (null? todo)
(list->string done)
(case (car todo)
((#\%)
(loop (append done
(cons (integer->char
(+ (* (hex-char->integer (cadr todo)) 16)
(hex-char->integer (caddr todo))))
'()))
(cdddr todo)))
((#\+)
(loop (append done (cons #\space '()))
(cdr todo)))
(else
(loop (append done (cons (car todo) '()))
(cdr todo)))))))
----------------------------------------------------------------------
This first version uses the string function string-ref, the second
uses regular expressions and the last version converts the string into
a list and works on the list and converts the resulting list into a
string. And this is the funny result:
(compare-run-time 10000
(url-decode "It%27s%20me%21")
(decode-uri "It%27s%20me%21")
(uri-unescape "It%27s%20me%21")
)
;; 3.24 seconds for 10000 times (url-decode "It%27s%20me%21")
;; 10.42 seconds for 10000 times (decode-uri "It%27s%20me%21")
;; 1.72 seconds for 10000 times (uri-unescape "It%27s%20me%21")
I think I can live with slow regular expressions, but why is a simple
string-ref so slow?
The following example compares string-ref directly with list-ref:
(let* ((str (string-append (list->string (make-list 1000 #\A))))
(lst (string->list str))
(len (1- (length lst)))
(len/2 (quotient len 2)))
(compare-run-time 100000
(string-ref str 0)
(list-ref lst 0)
(string-ref str len/2)
(list-ref lst len/2)
(string-ref str len)
(list-ref lst len)))
;; 1.21 seconds for 100000 times (string-ref str 0)
;; 1.24 seconds for 100000 times (list-ref lst 0)
;; 1.24 seconds for 100000 times (string-ref str len/2)
;; 1.88 seconds for 100000 times (list-ref lst len/2)
;; 1.22 seconds for 100000 times (string-ref str len)
;; 2.58 seconds for 100000 times (list-ref lst len)
Direct indexing of a string is only twice as fast as walking through a
list of 1000 elements? I think there must be something wrong.
bis später...
Sascha
--
Ein Hoch auf den 7. Juni 2000
- string vs list processing, Sascha Ziemann, 2001/04/16
- Re: string vs list processing, Michael Livshin, 2001/04/16
- Re: string vs list processing,
Sascha Ziemann <=
- Re: string vs list processing, Sascha Ziemann, 2001/04/16
- Re: string vs list processing, Harvey J. Stein, 2001/04/16
- Re: string vs list processing, Michael Livshin, 2001/04/16
- Re: string vs list processing, Harvey J. Stein, 2001/04/16
- Re: string vs list processing, Michael Livshin, 2001/04/16
Re: string vs list processing, Masao Uebayashi, 2001/04/16