Re: [Chicken-users] BOM in a Scheme source file

From:	Elf
Subject:	Re: [Chicken-users] BOM in a Scheme source file
Date:	Sat, 8 Sep 2007 23:54:55 -0700 (PDT)


and according to the unicode consortium:

 A: Yes, UTF-8 can contain a BOM. However, it makes no difference as
     to the endianness of the byte stream. UTF-8 always has the same
     byte order. An initial BOM is only used as a signature -- an
     indication that an otherwise unmarked text file is in UTF-8. Note
     that some recipients of UTF-8 encoded data do not expect a BOM.
     Where UTF-8 is used transparently in 8-bit environments, the use of
     a BOM will interfere with any protocol or file format that expects
     specific ASCII characters at the beginning, such as the use of "#!"
     at the beginning of Unix shell scripts. [AF] & [MD]
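
To make the "#!" point concrete, here is a minimal Python sketch (not from the thread; the helper name starts_with_shebang is hypothetical). It shows that the UTF-8 signature is the three bytes EF BB BF, and that a byte-oriented check for "#!" at offset 0 fails once they are prepended. It only mimics such a check and is not how any particular kernel or shell implements it.

```python
import codecs

# The UTF-8 signature is simply U+FEFF encoded as UTF-8.
UTF8_BOM = codecs.BOM_UTF8  # b'\xef\xbb\xbf'


def starts_with_shebang(data: bytes) -> bool:
    """Hypothetical byte-oriented check: does the file begin with '#!'?"""
    return data.startswith(b"#!")


script = "#!/bin/sh\necho hi\n"

# Plain UTF-8: no signature, the first two bytes are '#!'.
plain = script.encode("utf-8")

# Python's 'utf-8-sig' codec prepends the signature when encoding.
signed = script.encode("utf-8-sig")

print(starts_with_shebang(plain))
print(starts_with_shebang(signed))
```

The two files hold the same text; only the three leading bytes differ, which is enough to hide the "#!" from anything that looks at raw bytes.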

and
    In the absence of a protocol supporting its use as a BOM and
     when not at the beginning of a text stream, U+FEFF should normally
     not occur.
and
     3. Some byte oriented protocols expect ASCII characters at the             
        beginning of a file. If UTF-8 is used with these protocols, use of
        the BOM as encoding form signature should be avoided.
     4. Where the precise type of the data stream is known (e.g. Unicode
       big-endian or Unicode little-endian), the BOM should not be used.
       In particular, whenever a data stream is declared to be UTF-16BE,
       UTF-16LE, UTF-32BE or UTF-32LE a BOM must not be used. See also [
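
Point 4 can be seen in Python's standard codecs, used here only as a convenient illustration (the helper has_utf16_bom and the sample text are made up for this sketch). The generic "utf-16" codec leaves the byte order open and so writes a BOM, while the explicit "utf-16-be" and "utf-16-le" codecs write none and, on decoding, treat a leading U+FEFF as an ordinary character.

```python
import codecs

text = "(display 1)"

# Byte order not declared: Python writes a BOM so readers can detect it.
generic = text.encode("utf-16")

# Byte order declared by the encoding name: no BOM is written.
big = text.encode("utf-16-be")
little = text.encode("utf-16-le")


def has_utf16_bom(data: bytes) -> bool:
    """Hypothetical helper: True if data starts with either UTF-16 BOM."""
    return data[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)


print(has_utf16_bom(generic), has_utf16_bom(big), has_utf16_bom(little))

# With an explicit byte order, a leading BOM is not stripped on decode;
# it comes back as the character U+FEFF at the start of the string.
decoded = (codecs.BOM_UTF16_BE + big).decode("utf-16-be")
print(decoded.startswith("\ufeff"))
```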

why not fix scite to not put in chars it shouldn't?

-elf


On Sun, 9 Sep 2007, Pierpaolo Bernardi wrote:

On 9/9/07, Graham Fawcett <address@hidden> wrote:

On 9/8/07, Pierpaolo Bernardi <address@hidden> wrote:

UTF8 has no BOM.  A BOM in a utf8 file should be there only if you
put it there.


Not true.

http://en.wikipedia.org/wiki/Byte_Order_Mark


UTF8 is defined by the Unicode consortium, not by wikipedia.

See here for example: http://unicode.org/faq/utf_bom.html#29

which says that you can put a bom in a utf8 file (of course, you can
put whatever character you want in a file), but it is a character
like every other character, it has no particular meaning wrt the encoding.

Then, maybe chicken could consider U+FEFF as whitespace, to work
around this bug in scite, and maybe other broken tools.
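
A rough sketch of that kind of workaround, in Python rather than Scheme and not based on how chicken's reader actually works: drop a single leading BOM (the character U+FEFF) before handing the text to a reader. The function names read_source and strip_leading_bom are hypothetical.

```python
BOM = "\ufeff"  # the BOM character, U+FEFF


def strip_leading_bom(source: str) -> str:
    """Drop one BOM at the very start of already-decoded text, if present."""
    return source[1:] if source.startswith(BOM) else source


def read_source(raw: bytes) -> str:
    """Decode UTF-8 source bytes, ignoring a leading signature if present.

    Python's 'utf-8-sig' codec skips one leading EF BB BF when decoding
    and otherwise behaves like plain 'utf-8'.
    """
    return raw.decode("utf-8-sig")


with_bom = b"\xef\xbb\xbf(display 1)"
without_bom = b"(display 1)"

print(read_source(with_bom) == read_source(without_bom))
```

Only a BOM at the very start is dropped; a U+FEFF in the middle of the text is left alone, in line with the quoted guidance that it should not normally occur there.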

P.



