guile-devel

textual output primitives


From: Andy Wingo
Subject: textual output primitives
Date: Sun, 05 Jun 2016 23:11:22 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Hi!

I have a simple question.  Imagine you have to write the section of the
Guile manual on textual I/O.  What interfaces do you recommend people to
use to output characters and strings to a port?

For context, I was wrapping up the non-blocking I/O work recently and
came to a quandary.  Most everything is in master already -- basically
some core port functions (close-port, force-output), some binary I/O
primitives (get-u8, lookahead-u8, get-bytevector-n, put-u8,
put-bytevector) and some textual I/O primitives (read-char, peek-char,
put-char, put-string, read-line, read-delimited) are available with
Scheme implementations as well as C implementations.  There's an
`install-sports!' (for Scheme Ports, you see; a terrible name and
probably that should change) function exported by (ice-9 sports) that
will actually set! the core bindings for these primitives to the Scheme
versions.  If the primitives would block, the value of the
`current-read-waiter' or the `current-write-waiter' parameter is called,
as appropriate, and the operation retries.  I have updated the
experimental "wip-ethreads" branch that uses this facility to implement
lightweight user-space threading based on delimited continuations and an
epoll-based scheduler.  I won't merge ethreads any time soon, and
probably not before 2.2; but I wanted to make it possible to
implement ethreads or 8sync or whatever as a library.

OK that's the status.  Anyway, the conundrum: two of the primitives
above, put-char and put-string, are not like the others.

put-char is the same as write-char, with arguments reversed and no
optional args.  Why the extra name?
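
To make the argument-order difference concrete, here is a minimal sketch.
It assumes `put-char' is taken from (rnrs io ports), which is where Guile
has exported it so far; the variable names are only for illustration.

```scheme
;; Assumption: put-char comes from (rnrs io ports).
(use-modules ((rnrs io ports) #:select (put-char)))

;; write-char takes the character first and an optional port second.
(define via-write-char
  (call-with-output-string
    (lambda (port) (write-char #\a port))))

;; put-char takes the port first, then the character; both are required.
(define via-put-char
  (call-with-output-string
    (lambda (port) (put-char port #\a))))

;; Both strings contain the single character "a".
```

Apart from the argument order and the missing optional port, the two calls
produce the same output.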

There is a put-string in (rnrs io ports), but while it's spiritually the
same, it's wrapped in the `with-textual-output-conditions' thing that
our higher-level R6RS ports code uses -- which is fine for the purposes
of a library that has to shim Guile exceptions to standard exceptions,
and maybe we should rework our standard exceptions one day, but it's a
layer above, making it not a primitive for the purposes of (ice-9
sports).

I repeat the original question: what should we recommend for outputting
characters and strings to ports?

Recall that for binary I/O, our recommendation is that users use the
R6RS-like interfaces from (ice-9 binary-ports).

For textual input we recommend read-char/peek-char and (ice-9 rdelim).

For textual output... well there's `display', `write', `format',
`write-char'... but only `write-char' is a primitive.  `display' and
`write' are both big generic things that operate on datums.  `format' is
a monster too.

For textual output, we need two primitives: one to encode a character to
bytes and send it to a port, and something to encode a string (and if
possible, a substring) to bytes and send those bytes to a port.

You could use the string function for characters, but that would mean
allocating a string to write a character, which would be slow.  You
could use the character function for strings, but that would be both slow
and wrong (in a way).  It would be slow because even if the output port
is unbuffered, we want to be able to write the string all in one go, or
at least in big chunks.  The semantic wrongness is that if multiple
processes are writing to a terminal at the same time, as a user you want
to see interleaved lines, not interleaved characters.

The R5RS answer is `write-char' and `display'.  If you want to display a
substring, use `substring' then `display' that.  Fine.  That does cause
an allocation for displaying substrings though, and `(write-char ch)' is
actually `(display ch)' and not `(write ch)', so it's not very
consistent.  For my purposes though I need a primitive that (ice-9
sports) could replace and `display' is not a primitive.  You would use
this primitive to implement `display', is what I'm saying.  So we need
something else.
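
A short sketch of the R5RS approach described above, using only core
bindings.  The helper `output-of' and the variable names are hypothetical,
introduced just to capture output in a string.

```scheme
;; Hypothetical helper: run PROC with a string port, return what it wrote.
(define (output-of proc)
  (call-with-output-string proc))

;; write-char behaves like display on a character...
(define char-written-raw (output-of (lambda (p) (write-char #\a p))))
(define char-displayed   (output-of (lambda (p) (display #\a p))))

;; ...whereas write emits the external representation, #\a.
(define char-written     (output-of (lambda (p) (write #\a p))))

;; Displaying a substring the R5RS way allocates a fresh string first.
(define sub-displayed
  (output-of (lambda (p) (display (substring "hello, world" 7 12) p))))
```

Here `char-written-raw' and `char-displayed' are both "a", `char-written'
is the three characters #\a, and `sub-displayed' is "world".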

The R6RS answer is `put-char' and `put-string'.  This is nice.  It
doesn't rhyme with `display' or `write' or `write-char' but it's
consistent in itself, has consistent argument order, and is consistent
with the binary API which is also already exported in (ice-9
binary-ports) -- put-bytevector, put-u8, and so on.  `display' and
`write' are treated as legacy in some way and R6RS even puts them in
another module.

The R7RS answer is `write-char' and `write-string'.  `write-string'
doesn't exist in Guile yet but it could exist.  Thing is, `(write-string
"foo")' is not the same as `(write "foo")' -- rather, it is `(display
"foo")'.  I think R7RS made the wrong choice in deciding to add
`write-string', preferring consistency with the past (`write-char') over
internal, extensible consistency.  I also am convinced by R6RS rationale
section 20.8 in which they argue for putting the port argument first to
the new `put-u8', `get-bytevector', `put-string' and so on procedures,
as it is consistent and places the optional arguments at the end of
e.g. `put-string' closer to their related object -- i.e. that in
`(put-string port str 6 10)', the 6 and 10 refer to `str'.

Sadly, R7RS also chose to go with a `start'/`end' convention for
subsequences rather than R6RS's `start'/`count', so the equivalent with
`write-string' is `(write-string str port 6 16)'.  Either system is fine
but R7RS chose deliberate incompatibility here.  Taylan points out that
it's particularly egregious in the case of `bytevector-copy!', where the
R6RS and R7RS versions of this procedure have arguments of similar types
but different interpretations.
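
The two subsequence conventions can be compared side by side.  This is a
sketch only: `put-string' is taken from (rnrs io ports), and since Guile
has no `write-string' yet, `write-string/start-end' is a hypothetical
stand-in that mimics the R7RS argument order and start/end convention.

```scheme
;; Assumption: put-string comes from (rnrs io ports).
(use-modules ((rnrs io ports) #:select (put-string)))

;; Hypothetical stand-in for an R7RS-style write-string:
;; string first, port second, then START and END indices.
(define (write-string/start-end str port start end)
  (put-string port str start (- end start)))

(define sample "0123456789abcdefghij")

;; R6RS convention: port first, then START and COUNT.
(define via-count
  (call-with-output-string
    (lambda (port) (put-string port sample 6 10))))

;; R7RS convention: the same ten characters are START 6, END 16.
(define via-end
  (call-with-output-string
    (lambda (port) (write-string/start-end sample port 6 16))))

;; Both are "6789abcdef".
```

The numeric arguments have the same types in both calls, which is why
mixing up the two conventions fails silently rather than with an error.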

Whew, lot of background.  OK.  So again, my question: imagine you have
to write the section of the Guile manual on textual I/O.  Given all of
this, what do you recommend people to use to write characters and
strings to a port?

There do not appear to be many good solutions.

 - Adding put-char and put-string to the default environment would be
   cool but put-char duplicates the existing write-char, and we wouldn't
   have the rest of the R6RS I/O system in the (guile) module

 - Adding write-string would be nasty as it is inconsistent with the
   port implementation internally and with (ice-9 binary-ports)

 - Adding (ice-9 textual-ports) and putting put-char/put-string there
   would be OK, but then it's off to the side in a way; but then also it
   is more amenable to change in the future...

 - Recommending R6RS as-is is a no-go because we don't want to force
   loading all those modules, and also the
   `with-textual-output-conditions' thing is inefficient

 - We could add put-string and put-char to (ice-9 binary-ports), but
   then that makes no sense

 - We could design some new I/O module hierarchy but it would be easy to
   make bad decisions

I am leaning towards adding the bindings to (ice-9 binary-ports), and
blessing binary-ports as just a random grab-bag of user-facing
"primitive" operations on ports.  I guess most people will keep using
`display' for everything anyway, unless they really care about
I/O, in which case they are probably using binary-ports in some way
anyway.

WDYT?

Andy


