bug-gnu-utils

From: Asgeir Frimannsson
Subject: Re: Gettext and ITS (was Re: Feature proposal: string extracting by RegExp for xgettext)
Date: Fri, 14 Mar 2008 22:42:19 +1000

On Fri, Mar 14, 2008 at 10:13 PM, Asgeir Frimannsson <address@hidden> wrote:
> Hi Bruno,
>
>
> On Fri, Mar 14, 2008 at 9:41 PM, Bruno Haible <address@hidden> wrote:
>
> > Hello Asgeir,
> >
> >
> > > For example, for Glade XML files, the following ITS descriptor [2] can be
> > > applied to extract/merge translatable features:
> > >
> > > <its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
> > >  <!-- ITS rules for Glade 2.0, based on 
> > > http://glade.gnome.org/glade-2.0.dtd -->
> > >  <its:translateRule selector="/glade-interface" translate="no"/>
> > >  <its:translateRule selector="//address@hidden'yes']" translate="yes"/>
> > >  <its:translateRule selector="//atkaction/@description" translate="yes"/>
> > >  <its:locNoteRule selector="//address@hidden'yes']"
> > >   locNoteType="description" locNotePointer="@comments"/>
> > > </its:rules>
> >
> > Thank you for posting this example! I had looked at the ITS specification,
> > but not understood what it was really about and how it was meant to be used.
> >
>
>
> Note that this Glade example is an actual example used in the 'best 
> practices' document, not something I came up with :)
>
>
> > So if I understand it right, tools for extracting translatable strings and
> > for merging back translated strings into XML documents could use this
> > W3C ITS specification?
>
>
> Yes, exactly. That is, for merging back you probably don't need it... But 
> imagine this combined with xgettext, e.g. for extracting stuff from odf 
> through xhtml and glade,ts... the absolute path for a translation unit could 
> be stored in the #: reference elem, for example "/html/body/p[34]/table[3]/p" 
> and be used as a locator when merging... something like "xgettext 
> --its=myconfig.its mydoc.xml".
>
>
> >
> >
> > There is no free implementation of it right now?
> >
>
>
> There are a couple: http://www.w3.org/International/its/links.html
>
> Rainbow (mono/.net) is LGPL, so is Spritser.
>
>
> >
> > An implementation of it would have to rely on XPath. For example, use 
> > libxml2.
> > Right?
>
>
> Yeah, the spec relies heavily on xpath expressions, libxml2 is excellent for 
> this.. It should be able to do a 'streaming' implementation, and just rely on 
> xpath for evaluating if the given node is translatable/inline/comment etc, 
> and not rely on loading the whole document into memory.
>
>

One limitation with a PO-based implementation is of course the
handling of inline elements.

For example:

Specify non-translatable elements:
<its:translateRule translate="no" selector="//d:email|//d:uri"/>

Specify inline elements:
<its:withinTextRule withinText="yes" selector="//d:email|//d:uri"/>

Say you have the xml fragment:
<para>Please email us at <email>address@hidden</email>, or visit our
website at <uri>http://www.example.com</uri>.</para>

Here, everything within para would become a single msgid. However, we
have no way of preventing translators from modifying the
non-translatable email or uri elements. The extraction tool could,
however, flag them in automatic comments, and msgfmt could even check
them if the ITS configuration is available.
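
To make the msgfmt-style check a bit more concrete, here is a rough
Python sketch. It is only an illustration: the PROTECTED tuple is a
hypothetical stand-in for whatever the ITS translate="no" rules select,
the function names are made up, and a regular expression takes the place
of real XML parsing.

```python
import re

# Hypothetical: element names that the ITS rules mark as translate="no".
PROTECTED = ("email", "uri")


def protected_spans(text, names=PROTECTED):
    """Return the protected elements (tags plus content) found in text."""
    alternatives = "|".join(re.escape(name) for name in names)
    pattern = re.compile(r"<(%s)>.*?</\1>" % alternatives, re.S)
    return [match.group(0) for match in pattern.finditer(text)]


def check_entry(msgid, msgstr):
    """True if msgstr carries exactly the protected elements of msgid.

    The spans are sorted before comparing, so a translation may reorder
    the elements but may not alter, drop, or add any of them.
    """
    return sorted(protected_spans(msgid)) == sorted(protected_spans(msgstr))
```

With this, a msgstr that keeps <email> and <uri> intact passes, while
one that changes the URL inside <uri> is rejected. Nested or
attribute-bearing elements would need a real parser.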

A possible PO representation:

#: //section/para[34]
#. do not translate content within the <email> element
#. do not translate content within the <uri> element
#, xml-format
msgid ""
"Please email us at <email>address@hidden</email>, or visit "
"our website at <uri>http://www.example.com</uri>."
msgstr ""
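
An extraction tool could produce an entry like the one above roughly as
follows. This is a minimal Python sketch under loud assumptions: the
NO_TRANSLATE and WITHIN_TEXT sets stand in for the result of evaluating
the ITS selectors (a real tool would evaluate the XPath expressions,
e.g. with libxml2), po_entry and inner_xml are made-up names, and the
xml-format flag is simply copied from the example, not an existing
gettext flag as far as I know.

```python
import xml.etree.ElementTree as ET

# Assumption: the outcome of applying the two ITS rules, hard-coded here.
NO_TRANSLATE = {"email", "uri"}   # from its:translateRule translate="no"
WITHIN_TEXT = {"email", "uri"}    # from its:withinTextRule withinText="yes"


def inner_xml(elem):
    """Serialize the content of elem, keeping inline children as markup."""
    parts = [elem.text or ""]
    for child in elem:
        # ET.tostring() includes the child's tail text.
        parts.append(ET.tostring(child, encoding="unicode"))
    # Collapse runs of whitespace (including line breaks) to single spaces.
    return " ".join("".join(parts).split())


def po_entry(elem, reference):
    """Build a PO entry for one translatable element."""
    lines = ["#: " + reference]
    for child in elem:
        if child.tag in WITHIN_TEXT and child.tag in NO_TRANSLATE:
            note = "#. do not translate content within the <%s> element" % child.tag
            if note not in lines:  # one comment per element name
                lines.append(note)
    lines.append("#, xml-format")
    msgid = inner_xml(elem).replace("\\", "\\\\").replace('"', '\\"')
    lines.append('msgid "%s"' % msgid)
    lines.append('msgstr ""')
    return "\n".join(lines)
```

Calling po_entry(ET.fromstring(fragment), "//section/para[34]") on the
para fragment shown earlier yields the reference, the two automatic
comments, the flag, and the msgid on a single line.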

cheers,
asgeir



