bug-gnu-utils

Re: ommiting header causes multibyte errors


From: Bruno Haible
Subject: Re: ommiting header causes multibyte errors
Date: Thu, 7 Jun 2007 14:22:42 +0200
User-agent: KMail/1.5.4

Hello,

Ariel wrote:
> I'm trying to generate a .po file containing UTF-8 using --from-code=UTF-8

This is supported by xgettext.

> If I run xgettext with --omit-header it gives errors:
> 
> warning: The following msgid contains non-ASCII characters. This will 
> cause problems to translators who use a character encoding different from 
> yours. Consider using a pure ASCII msgid instead.
> 
> And a bunch of:
> invalid multibyte sequence

This is normal. A POT or PO file that does not carry a character encoding
specification in the header entry is assumed to be in ASCII. xgettext notices
that its output would violate this rule and gives a warning and an error.
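
To illustrate where that specification lives, here is a minimal sketch of a PO
file whose header entry (the entry with the empty msgid) declares its encoding.
The file name example.po, the msgid "Café" and the sed expression are
assumptions made for this illustration; nothing here calls the gettext tools.

```shell
# Hypothetical example: a PO file whose header entry declares UTF-8.
workdir=$(mktemp -d)
cat > "$workdir/example.po" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "Café"
msgstr ""
EOF

# Extract the declared charset from the Content-Type line of the header entry.
charset=$(sed -n 's/.*charset=\([^\\"]*\).*/\1/p' "$workdir/example.po" | head -n 1)
echo "declared charset: ${charset:-none, so ASCII is assumed}"
```

Without the Content-Type line there is nothing in the file that says how to
interpret the bytes of "Café", which is the situation --omit-header creates.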

> If I let it output the header it works fine.

Yes, this is expected.

> If I omit the header, it's my responsibility to specify charset, etc. 
> elsewhere. xgettext should not suddenly assume ASCII.

No. The GNU gettext tools have been designed for maximum reliability. This
implies that the character encoding is specified in-line in the file, not
out of band.

> For example:
> 
> Say I wanted to cat a bunch of .po files together, so I don't want a 
> header.

The GNU gettext tools provide two programs for this purpose:
  - For generating a POT file, xgettext, like this:
       xgettext -o combined.pot part1.pot part2.pot ...
  - For generating a PO file that combines translations, msgcat, like this:
       msgcat -o de.po de-part1.po de-part2.po ...
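
A minimal sketch of the msgcat case, assuming GNU gettext is installed. The
helper write_po and the two one-entry input files are made up for this
illustration; each input keeps its own header with the charset declaration, and
the script skips the msgcat call if the tool is not on the PATH.

```shell
workdir=$(mktemp -d)

# Hypothetical helper: write a PO file with a UTF-8 header and one entry.
# Arguments: file name, msgid, msgstr.
write_po() {
  cat > "$1" <<EOF
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"
"Content-Transfer-Encoding: 8bit\\n"

msgid "$2"
msgstr "$3"
EOF
}

write_po "$workdir/de-part1.po" "Hello" "Hallo"
write_po "$workdir/de-part2.po" "Goodbye" "Tschüss"

# Combine the two catalogs; msgcat reads the encoding from each header.
if command -v msgcat >/dev/null 2>&1; then
  msgcat -o "$workdir/de.po" "$workdir/de-part1.po" "$workdir/de-part2.po"
else
  echo "msgcat not found; skipping" >&2
fi
```

The point of going through msgcat rather than cat is that the result is again
a single catalog with one header entry, instead of a file with several header
entries or none.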

> Or the reason listed in the manual of removing a source of variance.

That use of --omit-header is only for testing purposes. When it's just for
testing, you can use plain ASCII msgids.

> Or, in my case because I msgmerge to a file that already has a header, and 
> I want it exactly like that, without Report-Msgid-Bugs-To and 
> POT-Creation-Date.

msgmerge is not a tool for concatenating PO files. It's a tool for merging
updated translations from a translator with an updated POT file from the
programmer.
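
For comparison, a minimal sketch of the intended msgmerge workflow, assuming
GNU gettext is installed. The file names de.po, hello.pot and de.new.po and
their contents are hypothetical; -q suppresses progress output and -N turns
off fuzzy matching so the result is easy to inspect.

```shell
workdir=$(mktemp -d)

# Existing translation from the translator (hypothetical content).
cat > "$workdir/de.po" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "Hello"
msgstr "Hallo"
EOF

# Updated template from the programmer, with one new message.
cat > "$workdir/hello.pot" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "Hello"
msgstr ""

msgid "Goodbye"
msgstr ""
EOF

# Merge: keep existing translations, add new messages as untranslated.
if command -v msgmerge >/dev/null 2>&1; then
  msgmerge -q -N -o "$workdir/de.new.po" "$workdir/de.po" "$workdir/hello.pot"
else
  echo "msgmerge not found; skipping" >&2
fi
```

The merged file should carry over the translation of "Hello" and list
"Goodbye" with an empty msgstr for the translator to fill in.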

If this doesn't cover your use case, please explain what you are trying to
do. The GNU gettext tools are designed to cover many use cases, while at
the same time not treating PO files like binary data (because that
would lead to encoding errors in many cases).

Bruno




