guile-devel

Re: byte-order marks


From: Andy Wingo
Subject: Re: byte-order marks
Date: Tue, 29 Jan 2013 22:09:38 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

On Tue 29 Jan 2013 20:22, Neil Jerram <address@hidden> writes:

> (define (read-csv file-name)
>   (let ((s (utf16->string (get-bytevector-all (open-input-file file-name))
>                           'little)))
>
>     ;; Discard possible byte order mark.
>     (if (and (>= (string-length s) 1)
>              (char=? (string-ref s 0) #\xfeff))
>         (set! s (substring s 1)))
>
>     ...))

FWIW the procedure I had was:

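;; Detect a UTF-8 or UTF-16 byte-order mark at the start of PORT.  If one
;; is found, consume it and set the port encoding to match; otherwise
;; push back whatever was read and restore the original encoding.
(define (consume-byte-order-mark port)
  (let ((enc (or (port-encoding port) "ISO-8859-1")))
    ;; Temporarily switch to ISO-8859-1 to inspect the raw bytes.
    (set-port-encoding! port "ISO-8859-1")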
    (case (peek-char port)
      ((#\xEF)
       (read-char port)
       (case (peek-char port)
         ((#\xBB)
          (read-char port)
          (case (peek-char port)
            ((#\xBF)
             (read-char port)
             (set-port-encoding! port "UTF-8"))
            (else
             (unread-char #\xBB port)
             (unread-char #\xEF port)
             (set-port-encoding! port enc))))
         (else
          (unread-char #\xEF port)
          (set-port-encoding! port enc))))
      ((#\xFE)
       (read-char port)
       (case (peek-char port)
         ((#\xFF)
          (read-char port)
          (set-port-encoding! port "UTF-16BE"))
         (else
          (unread-char #\xFE port)
          (set-port-encoding! port enc))))
      ((#\xFF)
       (read-char port)
       (case (peek-char port)
         ((#\xFE)
          (read-char port)
          (set-port-encoding! port "UTF-16LE"))
         (else
          (unread-char #\xFF port)
          (set-port-encoding! port enc))))
      (else
       (set-port-encoding! port enc)))))

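The encoding dance is needed because Scheme has no unread-u8, only
unread-char: in ISO-8859-1 every byte maps to exactly one character, so
the bytes already read can be pushed back with unread-char.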
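
When the whole input is already in a bytevector, as in the quoted read-csv, the same three BOM checks can be done on the bytes directly, with no port encoding switch. The following is only a sketch: detect-bom and starts-with? are hypothetical names, not part of Guile, and it covers only the UTF-8 and UTF-16 marks handled by the procedure above (not UTF-32).

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper, not part of Guile.  Returns two values: the
;; encoding name implied by a BOM at the start of BV (or #f if there is
;; none) and the length of that BOM in bytes (0 if there is none).
(define (detect-bom bv)
  ;; True if BV begins with exactly the given byte values.
  (define (starts-with? . bytes)
    (and (>= (bytevector-length bv) (length bytes))
         (let loop ((i 0) (bytes bytes))
           (or (null? bytes)
               (and (= (bytevector-u8-ref bv i) (car bytes))
                    (loop (+ i 1) (cdr bytes)))))))
  (cond ((starts-with? #xEF #xBB #xBF) (values "UTF-8" 3))
        ((starts-with? #xFE #xFF)      (values "UTF-16BE" 2))
        ((starts-with? #xFF #xFE)      (values "UTF-16LE" 2))
        (else                          (values #f 0))))
```

The second return value says how many leading bytes to skip before decoding the rest.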

Andy
-- 
http://wingolog.org/


