bug-gnu-utils

Re: Gettext MO format version numbers


From: Bruno Haible
Subject: Re: Gettext MO format version numbers
Date: Sun, 15 Feb 2009 22:28:26 +0100
User-agent: KMail/1.9.9

Hi,

Dwayne Bailey wrote:
> 1) The Gettext documentation states that we are at version 0 of the
> format.

This is not up to date. I'm fixing it as below.

> Yet I have observed some files with a version number of 1, in 
> the wild.  I was able to parse them correctly by simply ignoring the
> version information.  Is there such a version?

Yes, see file intl/gmo.h:

/* Revision number of the currently used .mo (binary) file format.  */
#define MO_REVISION_NUMBER 0
#define MO_REVISION_NUMBER_WITH_SYSDEP_I 1

> 2) .mo files for certain RTL languages have different version number.
> msgfmt and msgunfmt are able to read these files but when opened in a
> hex editor the first few bytes are as follows:
> 
> DE12 0495 0100 0100

This means: major revision number is 1 (meaning that some format string
translations use "I" for internationalized output digits [Farsi]), and
minor revision number is 1 (meaning that some strings have substrings
whose expansion depends on the system type).
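As an illustration only, the decoding of those eight bytes might be sketched
in Python as follows. The function name parse_mo_revision is hypothetical,
and the split of the revision word into an upper 16-bit major part and a
lower 16-bit minor part is an assumption that is not spelled out in this
message.

```python
import struct

# Magic number of GNU MO files, as given in the gettext manual.
MO_MAGIC = 0x950412DE


def parse_mo_revision(header: bytes):
    """Return (byte_order, major, minor) for the first 8 bytes of an MO file.

    byte_order is "<" (little-endian) or ">" (big-endian), in the notation
    of the struct module.  Assumption: major revision = upper 16 bits of
    the second word, minor revision = lower 16 bits.
    """
    if len(header) < 8:
        raise ValueError("need at least 8 bytes of header")
    # The magic is stored in the byte order of the generating machine,
    # so try both orders and keep the one that yields the known magic.
    for order in ("<", ">"):
        magic, revision = struct.unpack(order + "II", header[:8])
        if magic == MO_MAGIC:
            return order, revision >> 16, revision & 0xFFFF
    raise ValueError("not a GNU MO file (bad magic number)")


# The bytes quoted above: DE12 0495 0100 0100
print(parse_mo_revision(bytes.fromhex("DE12049501000100")))
```

Under that assumption, the quoted bytes decode as a little-endian file with
major revision 1 and minor revision 1.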

> Simply ignoring the version information allowed me to read these files
> correctly.  But I would like to know the cause so that we can continue
> to produce correct .mo files.

Supporting these other major and minor revisions would be better, but on
the other hand it is a lot of code for rarely used features. To handle all
kinds of MO file revisions without copying all the hairy stuff from
gettext's write-mo.c and read-mo.c, it is better to simply invoke 'msgfmt'
and 'msgunfmt' when creating or reading MO files, respectively.
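A minimal sketch of that approach in Python, assuming the 'msgfmt' and
'msgunfmt' programs are installed and on the PATH. The helper names are
hypothetical; the only options relied on are msgfmt's -o (output file) and
msgunfmt's default of writing the PO text to standard output.

```python
import subprocess


def msgunfmt_command(mo_path):
    """Command line that converts an MO file to PO text on standard output."""
    return ["msgunfmt", mo_path]


def msgfmt_command(po_path, mo_path):
    """Command line that compiles a PO file into an MO file."""
    return ["msgfmt", "-o", mo_path, po_path]


def read_mo_as_po(mo_path):
    """Run msgunfmt and return the PO text as bytes.

    Bytes are returned rather than str because the encoding depends on
    the charset declared in the catalog's header entry.
    """
    result = subprocess.run(
        msgunfmt_command(mo_path), check=True, capture_output=True
    )
    return result.stdout


def write_mo_from_po(po_path, mo_path):
    """Run msgfmt; raises CalledProcessError if msgfmt reports an error."""
    subprocess.run(msgfmt_command(po_path, mo_path), check=True)
```

The calling program then only needs a PO parser, and leaves the handling of
every MO revision to gettext's own tools.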

Bruno


2009-02-15  Bruno Haible  <address@hidden>

        * gettext.texi (MO Files): Update w.r.t. the maximum revision in use.
        Reported by Dwayne Bailey <address@hidden>.

diff -u -r1.160 gettext.texi
--- gettext.texi        28 Jan 2009 01:55:03 -0000      1.160
+++ gettext.texi        15 Feb 2009 21:20:32 -0000      1.162
@@ -5285,15 +5285,23 @@
 The first two words serve the identification of the file.  The magic
 number will always signal GNU MO files.  The number is stored in the
 byte order of the generating machine, so the magic number really is
-two numbers: @code{0x950412de} and @code{0xde120495}.  The second
-word describes the current revision of the file format.  For now the
-revision is 0.  This might change in future versions, and ensures
-that the readers of MO files can distinguish new formats from old
-ones, so that both can be handled correctly.  The version is kept
+two numbers: @code{0x950412de} and @code{0xde120495}.
+
+The second word describes the current revision of the file format,
+composed of a major and a minor revision number.  The revision numbers
+ensure that the readers of MO files can distinguish new formats from
+old ones and handle their contents, as far as possible.  For now the
+major revision is 0 or 1, and the minor revision is also 0 or 1.  More
+revisions might be added in the future.  A program seeing an unexpected
+major revision number should stop reading the MO file entirely; whereas
+an unexpected minor revision number means that the file can be read but
+will not reveal its full contents, when parsed by a program that
+supports only smaller minor revision numbers.
+
+The version is kept
 separate from the magic number, instead of using different magic
 numbers for different formats, mainly because @file{/etc/magic} is
-not updated often.  It might be better to have magic separated from
-internal format version identification.
+not updated often.
 
 Follow a number of pointers to later tables in the file, allowing
 for the extension of the prefix part of MO files without having to
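
The reader policy described in the new documentation text could be sketched
like this in Python. The names SUPPORTED_MAJOR, MAX_SUPPORTED_MINOR and
check_revision are hypothetical; the values reflect the "0 or 1" stated in
the patch, for a reader that implements both revisions.

```python
# Revisions this hypothetical reader implements; per the patch, the major
# revision is currently 0 or 1 and the minor revision is also 0 or 1.
SUPPORTED_MAJOR = {0, 1}
MAX_SUPPORTED_MINOR = 1


def check_revision(major, minor):
    """Apply the documented policy to a (major, minor) revision pair.

    Raises ValueError for an unexpected major revision, because the file
    should then not be read at all.  Returns True if the full contents can
    be read, and False if the minor revision is newer than supported, in
    which case the file can be read but will not reveal everything.
    """
    if major not in SUPPORTED_MAJOR:
        raise ValueError("unsupported MO major revision: %d" % major)
    return minor <= MAX_SUPPORTED_MINOR
```

A caller would typically treat a False result as a warning rather than an
error, and continue with the parts of the file it understands.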



