bug#18520: string ports should not have an encoding
From: David Kastrup
Subject: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 00:12:58 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (gnu/linux)
address@hidden (Ludovic Courtès) writes:
> David Kastrup <address@hidden> skribis:
>>
>> For error messages, yes. For associating a position in a string with a
>> previously parsed closure, no.
>
> But wouldn’t a line/column pair be as suitable as a unique identifier as
> the position in the file?
Only as long as the re-encoded UTF-8 is byte-identical to the original.
Currently, we flag non-UTF-8 sequences with a warning and continue.
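To make the byte-identity concern concrete, here is a minimal Python sketch. It is not LilyPond or Guile code: the helper name `reencode_lenient` and the sample input are made up for illustration, and it assumes a lenient decoder that substitutes U+FFFD for invalid sequences (Python's `errors="replace"`). Under that assumption, a single stray Latin-1 byte shifts every later byte offset.

```python
def reencode_lenient(raw: bytes) -> bytes:
    """Decode as UTF-8, replacing invalid sequences, then encode again."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


# Hypothetical input: a Latin-1 e-acute (0xE9) in a comment of an
# otherwise ASCII file.
original = b"% caf\xe9\n\\relative c' { c4 }\n"
roundtrip = reencode_lenient(original)

# Byte offset of the same token before and after the round trip.
marker = b"\\relative"
offset_before = original.index(marker)
offset_after = roundtrip.index(marker)

# The one-byte 0xE9 became the three-byte encoding of U+FFFD, so the
# offsets differ by two bytes.
shift = offset_after - offset_before

# Valid UTF-8 survives the round trip unchanged.
valid = "% café\n".encode("utf-8")
valid_unchanged = reencode_lenient(valid) == valid
```

A byte offset recorded against the original buffer no longer points at the same token in the re-encoded one, which is why a position is only usable as an identifier when the two buffers are byte-identical.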
People complained previously about things like Latin-1 characters (most
likely to occur in comments or lyrics where they cause little or
well-identifiable havoc) leading to unceremonious aborts without
identifiable cause.
At any rate, the current behavior does not make sense. Guile 2.0 might
refuse to turn a string into a port, and for Guile 2.2 the port encoding
may be used to have a UTF-8 rendition of the string characters be
interpreted in another encoding (like latin-1) but not the other way
round.
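As an analogy only, the following Python sketch shows what "a UTF-8 rendition of the characters interpreted in another encoding" means in general. It makes no claim about how Guile 2.0 or 2.2 implements string ports; the variable names are illustrative.

```python
text = "é"

# UTF-8 rendition of the character: two bytes.
utf8_bytes = text.encode("utf-8")

# Reading those same bytes as Latin-1 yields two characters instead of one.
misread = utf8_bytes.decode("latin-1")

# The reverse direction: the Latin-1 rendition is a single byte that is
# not valid UTF-8 on its own, so a strict UTF-8 decoder rejects it.
latin1_bytes = text.encode("latin-1")
try:
    latin1_bytes.decode("utf-8")
    reverse_ok = True
except UnicodeDecodeError:
    reverse_ok = False
```

The character count changes from one to two in the first direction, so any position derived from the reinterpreted text no longer matches the original string.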
Both versions make only some half-baked sense. Most resulting problems
can probably be worked around in some manner, but string ports are
actually the main stringbuf-like mechanism that Scheme has (dynamically
growing strings that are more compact than a list of characters).
Wedging a compulsory code conversion into it that is mirrored in the
port positions seems like a distraction.
> Also, if the result of ‘ftell’ is used as a unique identifier, does it
> really matter whether it’s an offset measured in bytes or in
> characters?
In the LilyPond lexer, stuff is usually measured with byte offsets.
Yes, one can certainly parse the UTF-8 character distances and hope to
arrive at the same results as the UTF-8 reencoding.
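The conversion between the two kinds of offset can be sketched in Python as follows. The function names are hypothetical and not part of LilyPond or Guile. The sketch assumes the text is valid UTF-8; for invalid input the mapping depends on how the decoder handles errors, which is exactly the uncertainty described above.

```python
def char_to_byte_offset(text: str, char_offset: int) -> int:
    """Byte offset in the UTF-8 encoding of `text` for a character offset."""
    return len(text[:char_offset].encode("utf-8"))


def byte_to_char_offset(data: bytes, byte_offset: int) -> int:
    """Character offset for a byte offset into valid UTF-8 `data`.

    Raises UnicodeDecodeError if the offset falls inside a multi-byte
    sequence or the prefix is not valid UTF-8.
    """
    return len(data[:byte_offset].decode("utf-8"))


text = "Grüße c4"
data = text.encode("utf-8")

char_pos = text.index("c4")                      # position in characters
byte_pos = char_to_byte_offset(text, char_pos)   # position in bytes
back = byte_to_char_offset(data, byte_pos)       # and back again
```

Each conversion scans the prefix, so the cost is linear in the offset, and both sides must agree on the exact bytes for the results to match.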
But the point of Guile's character set support was not really to make
everything more complicated, was it?
--
David Kastrup