Re: Feature proposal: string extracting by RegExp for xgettext

bug-gnu-utils

From:	Bruno Haible
Subject:	Re: Feature proposal: string extracting by RegExp for xgettext
Date:	Thu, 13 Mar 2008 14:00:00 +0100
User-agent:	KMail/1.5.4

Aurélio A. Heckert wrote:
> I don't want to see the messages. I want to help xgettext
> extract messages from new kinds of code.
> 
> Here is what I mean:
> =============================
> #!/usr/bin/mylang
> 
> do something
> print gettext #my text#
> end
> =============================
> 
> xgettext can't extract the "my text" string from this
> strange language. But the problem is not limited to
> new languages: there are many languages not
> supported by xgettext, and more if we think of
> templates... XML-based formats can have localizable
> attributes.
> 
> So... how will xgettext find the gettext function
> in new code? How should it get the string?
> We could give xgettext a regexp so that it recognizes
> where to get the strings in the code.

You are talking about two topics here:

1) About the new languages: You think that you can describe languages
through regular expressions. I don't think so. Regular expressions are
a good means to do some text processing with very short development time.
But when applied to text written in a programming language, they fail.
(Take the syntax colouring of 'vim' for example. It is described by regular
expressions. It's right 95% of the time, and produces wrong results 5%
of the time.)

Also, a single regular expression will not be enough. What you would need
is some kind of programmable execution engine (possibly a state machine)
where regular expressions are only one ingredient.
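
To illustrate the point, here is a minimal sketch of such a state machine in Python. The language rules are assumptions made up for this example: strings are delimited by #...# as in the quoted snippet, "//" starts a comment that runs to the end of the line, and a first line starting with "#!" is skipped. The names extract_msgids, GETTEXT_BEFORE, NAIVE and SAMPLE are hypothetical. The regular expression is only used to decide whether the code just before a string ends in "gettext"; the states decide what counts as code at all.

```python
import re

# One ingredient only: does the code seen so far end in the word "gettext"?
GETTEXT_BEFORE = re.compile(r"\bgettext\s*$")

# The single-regexp approach, for comparison.
NAIVE = re.compile(r"gettext\s*#([^#]*)#")

def extract_msgids(source):
    """Collect strings that directly follow 'gettext' in hypothetical mylang."""
    msgids = []
    state = "code"      # either "code" or "string"
    pending = ""        # code text seen since the last string ended
    current = ""        # contents of the string being read
    wanted = False      # True if the current string follows 'gettext'
    for lineno, line in enumerate(source.splitlines(), 1):
        if lineno == 1 and line.startswith("#!"):
            continue    # interpreter line, not code
        for i, ch in enumerate(line):
            if state == "code":
                if line.startswith("//", i):
                    break           # assumed comment syntax: skip rest of line
                if ch == "#":
                    wanted = GETTEXT_BEFORE.search(pending) is not None
                    state, current = "string", ""
                else:
                    pending += ch
            elif ch == "#":         # closing delimiter of a string
                if wanted:
                    msgids.append(current)
                state, pending = "code", ""
            else:
                current += ch
        # A line break is whitespace in code and part of the text in a string.
        if state == "code":
            pending += "\n"
        else:
            current += "\n"
    return msgids

SAMPLE = """#!/usr/bin/mylang

do something
print gettext #my text#
// print gettext #not this#
print #call gettext # here
end
"""

print(extract_msgids(SAMPLE))   # ['my text']
print(NAIVE.findall(SAMPLE))    # ['my text', 'not this']
```

The single regexp also picks up the string inside the comment, because it has no notion of being inside a comment or inside another string.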

If you are inventing a particular new language, and are only interested in
quick-and-dirty results, you can program your own extractor in a scripting
language like Python. In Python you already have a binding to the libgettextpo
library for creating the .pot file, so you can concentrate on your parsing.
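
As a rough sketch of the output side of such a script: the following does not use the libgettextpo binding at all, but formats the PO template by hand, which is an assumption made to keep the example self-contained. The names po_quote and format_pot and the file name hello.mylang are hypothetical, and the header entry is reduced to a single Content-Type line.

```python
def po_quote(text):
    """Return text as a double-quoted PO string with basic escapes."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n")
                   .replace("\t", "\\t"))
    return '"' + escaped + '"'

def format_pot(entries):
    """Format (msgid, filename, lineno) tuples as the text of a .pot file."""
    # Minimal header entry; a real template carries more header fields.
    out = ['msgid ""',
           'msgstr ""',
           r'"Content-Type: text/plain; charset=UTF-8\n"',
           '']
    # Merge duplicate msgids and remember every place they occur.
    references = {}
    for msgid, filename, lineno in entries:
        references.setdefault(msgid, []).append(f"{filename}:{lineno}")
    for msgid, refs in references.items():
        out.append("#: " + " ".join(refs))
        out.append("msgid " + po_quote(msgid))
        out.append('msgstr ""')
        out.append("")
    return "\n".join(out)

if __name__ == "__main__":
    found = [("my text", "hello.mylang", 4),
             ('say "hi"', "hello.mylang", 7),
             ("my text", "hello.mylang", 9)]
    print(format_pot(found))
```

The parsing part then only has to produce the list of tuples; the result can be checked with msgfmt or msgcat before it is given to translators.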

2) About XML-based formats: xgettext supports the GNOME Glade format. Its
designers soon noticed that a long hardcoded list of localizable tags was
not a good idea. In Glade 2, therefore, there is an attribute
  translatable="yes"
which makes it easier to extract the localizable contents.
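
A small sketch of why the attribute helps: with it, an extractor needs no knowledge of the individual tags. The example below uses Python's standard xml.etree.ElementTree; the function name translatable_strings and the sample document are made up for illustration and only imitate the general shape of a Glade 2 file.

```python
import xml.etree.ElementTree as ET

def translatable_strings(xml_text):
    """Return the text of every element marked translatable="yes"."""
    root = ET.fromstring(xml_text)
    return [element.text
            for element in root.iter()          # document order
            if element.get("translatable") == "yes" and element.text]

# Illustrative input, not taken from a real Glade file.
SAMPLE_XML = """<glade-interface>
  <widget class="GtkWindow" id="window1">
    <property name="title" translatable="yes">Hello</property>
    <property name="visible">True</property>
    <child>
      <widget class="GtkButton" id="button1">
        <property name="label" translatable="yes">Quit</property>
      </widget>
    </child>
  </widget>
</glade-interface>"""

print(translatable_strings(SAMPLE_XML))   # ['Hello', 'Quit']
```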

If you have an XML format of your own and want to produce a PO file from it,
the ideal scripting language for this task is probably XQuery. Less ideal,
but also possible, is XSLT that produces XLIFF, followed by an XLIFF to PO
converter [1].

Bruno

[1] http://xliff-tools.freedesktop.org/wiki/
