openexr-devel

Re: [Openexr-devel] UTF-8


From: Florian Kainz
Subject: Re: [Openexr-devel] UTF-8
Date: Wed, 14 Nov 2012 11:47:37 -0800
User-agent: Thunderbird 2.0.0.24 (X11/20100428)


The ACES image container specification, meant to be compatible with
OpenEXR, prescribes UTF-8 for the representation of strings.  Therefore
I suggest that OpenEXR adopt the following rules:

- All text strings are to be interpreted as Unicode, encoded as UTF-8.
  This includes attribute names and strings contained in attributes,
  for example, channel names.

- Text strings stored in files must be in Normalization Form C (NFC,
  canonical decomposition followed by canonical composition).

- Where text strings need to be collated, strcmp() is used to compare
  the corresponding char sequences:  string A comes before (or is less
  than) string B if

    strcmp(A,B) < 0

  (strcmp() is only guaranteed to return some negative value here, not
  necessarily -1.  Note: this is not ambiguous; the C99 standard
  specifies that strcmp() interprets the bytes that make up a string
  as unsigned.)

- Text strings passed to the IlmImf library must be encoded as UTF-8
  and in Normalization Form C.
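
To illustrate the collation rule, here is a minimal sketch in C.  The
helper name_less() is hypothetical and not part of IlmImf; it only wraps
the strcmp() test.  Because strcmp() compares bytes as unsigned char,
byte order and Unicode code point order agree for valid UTF-8, so "Z"
(byte 0x5A) sorts before an NFC "é" (bytes 0xC3 0xA9).

```c
#include <string.h>

/* Hypothetical helper, not an IlmImf function: returns 1 if the UTF-8
   string a collates before the UTF-8 string b under the proposed rule,
   0 otherwise. */
static int name_less(const char *a, const char *b)
{
    /* strcmp() compares the bytes as unsigned char.  Only the sign of
       the result is meaningful, so test for "< 0", not "== -1". */
    return strcmp(a, b) < 0;
}
```

With this helper, name_less("Z", "\xC3\xA9") is 1 and
name_less("\xC3\xA9", "Z") is 0, regardless of whether plain char is
signed on the platform.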

As far as I can tell, these rules are entirely compatible with all
existing versions of the IlmImf library.  Users whose writing system
includes non-ASCII Unicode characters can continue to employ the
existing library versions without change.

Future versions of the library should verify that text strings are
valid UTF-8.  In addition, the library should either verify that
strings are normalized to NFC, or normalize to NFC on the fly.
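
As a rough idea of what the UTF-8 check could look like, here is a
sketch in C.  The function is_valid_utf8() is my own illustration, not
existing or planned IlmImf code.  It rejects malformed sequences,
overlong encodings, UTF-16 surrogates and code points above U+10FFFF.
It does not check or perform NFC normalization; that needs Unicode
character data and would presumably rely on a Unicode library.

```c
/* Hypothetical helper, not an IlmImf function: returns 1 if the
   NUL-terminated string s is well-formed UTF-8, 0 otherwise. */
static int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned long cp;   /* code point decoded so far */
        unsigned long min;  /* smallest code point allowed at this length */
        int extra;          /* continuation bytes still expected */

        if (*p < 0x80)                { cp = *p;        extra = 0; min = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; min = 0x80; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; min = 0x800; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; min = 0x10000; }
        else return 0;      /* stray continuation byte or invalid lead byte */
        p++;

        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return 0;   /* not a continuation byte (also catches NUL) */
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }

        if (cp < min)                     return 0; /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* UTF-16 surrogate */
        if (cp > 0x10FFFF)                return 0; /* outside Unicode range */
    }
    return 1;
}
```

For example, is_valid_utf8("\xC3\xA9") returns 1, while a lone Latin-1
byte "\xE9" or the overlong sequence "\xC0\x80" returns 0.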


Florian



