guile-devel

Re: string port encodings


From: Ludovic Courtès
Subject: Re: string port encodings
Date: Wed, 16 Jan 2013 16:44:50 +0100
User-agent: Gnus/5.130005 (Ma Gnus v0.5) Emacs/24.2 (gnu/linux)

Hi!

Andy Wingo writes:

> But no, currently the answer is locale-specific.  It encodes the string
> according to the current locale, then decodes it from that encoding.  If
> your locale can't encode the string, tough luck for you!

SRFI-6 has used Unicode-capable ports since commit
ecb48dccbac6b8fdd969f50a23351ef7f4b91ce5.

Otherwise, the `%default-port-encoding' fluid governs the encoding of
string ports; see (info "(guile) String Ports"):

 -- Scheme Procedure: call-with-output-string proc
 -- C Function: scm_call_with_output_string (proc)
     Calls the one-argument procedure PROC with a newly created output
     port.  When the function returns, the string composed of the
     characters written into the port is returned.  PROC should not
     close the port.

     Note that which characters can be written to a string port depends
     on the port's encoding.  The default encoding of string ports is
     specified by the `%default-port-encoding' fluid (*note
     `%default-port-encoding': Ports.).  For instance, it is an error
     to write Greek letter alpha to an ISO-8859-1-encoded string port
     since this character cannot be represented with ISO-8859-1:

          (define alpha (integer->char #x03b1)) ; GREEK SMALL LETTER ALPHA

          (with-fluids ((%default-port-encoding "ISO-8859-1"))
            (call-with-output-string
              (lambda (p)
                (display alpha p))))

          =>
          Throw to key `encoding-error'

     Changing the string port's encoding to a Unicode-capable encoding
     such as UTF-8 solves the problem.
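
For comparison, here is a minimal sketch of the same example with the
fluid bound to UTF-8 instead.  It only reuses the procedures quoted
above; the name `result' is purely illustrative, and the behavior is
what I would expect from the manual text rather than something specific
to a given Guile release:

```scheme
;; Same character as in the manual excerpt.
(define alpha (integer->char #x03b1)) ; GREEK SMALL LETTER ALPHA

;; Bind the default port encoding to a Unicode-capable one, so the
;; string port created by call-with-output-string can represent alpha.
(define result
  (with-fluids ((%default-port-encoding "UTF-8"))
    (call-with-output-string
      (lambda (p)
        (display alpha p)))))

;; `result' should now be a one-character string holding alpha,
;; with no `encoding-error' thrown.
```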

> This is a bit crazy.  Surely the port should be textual?  Surely the
> default encoding for a string port should be utf-8 or something that can
> actually handle all strings?

As was said, “this has been recommended” (note the passive form!) on our
fora as a smart way to do encoding conversion.

The thing is, unlike R6RS, our ports can be used both for textual and
binary I/O.
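
To illustrate what mixing the two looks like, here is a small sketch
that writes one character and one raw byte to the same port.  It
assumes the R6RS-style `open-bytevector-output-port' and `put-u8' from
(rnrs io ports) together with Guile's `set-port-encoding!'; the names
`port', `get-bytes' and `bytes' are just placeholders:

```scheme
(use-modules (rnrs io ports)
             (rnrs bytevectors))

(define alpha (integer->char #x03b1)) ; GREEK SMALL LETTER ALPHA

(define bytes
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      ;; Textual output: the character is encoded per the port's encoding.
      (set-port-encoding! port "UTF-8")
      (display alpha port)
      ;; Binary output on the very same port: one raw byte (a newline).
      (put-u8 port 10)
      ;; Retrieve everything accumulated so far as a bytevector.
      (get-bytes))))

;; With UTF-8, alpha is the two bytes #xCE #xB1, followed by the raw byte 10.
(display (bytevector->u8-list bytes))
(newline)
```

The encoding only matters for the textual write; the `put-u8' call
bypasses it, which is exactly why the port needs an encoding at all.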

This has been discussed at length already, and I think all the pros and
cons have been written already.  :-)

Thanks,
Ludo’.



