Re: [Po4a-dev]HTML module (first revision)


From: Martin Quinson
Subject: Re: [Po4a-dev]HTML module (first revision)
Date: Fri, 14 Feb 2003 16:31:54 +0100
User-agent: Mutt/1.5.3i

On Thu, Feb 13, 2003 at 03:03:22PM +0100, Laurent Hausermann wrote:
> Hi all, 
>  
> I have developed an HTML module for po4a. It still has some bugs and it's
> not perfect, but I think it's a good starting point.
>  
> It uses HTML::TokeParser ( apt-get install libhtml-parser-perl ) 
>  
> I sent the whole diff to Martin Quinson, not to this list (I can send it to
> any of you; just email me).

OK, I committed this to CVS so that others can see it.

This module isn't ready to release yet in my opinion. Here are my objections:

 * The parser you used doesn't let you retrieve the line number. Why not
   use the HTML::Parser module, which seems somewhat more powerful?
 * The sentence:
     a wonderful wife named "<a href="mailto://Armelle.Quinson.fr">Armelle</a>",
     and a marvelous little boy <a href="Tristan.html">Tristan</a>
   (yup, it's part of my homepage ;) is changed to:
# type: td
#: FIXME:0
#, no-wrap
msgid "a wonderful wife named \""
msgstr ""

# type: a
#: FIXME:0
#, no-wrap
msgid "Armelle"
msgstr ""

# type: td
#: FIXME:0
#, no-wrap
msgid "\", and a marvelous little boy"
msgstr ""

# type: a
#: FIXME:0 FIXME:0
#, no-wrap
msgid "Tristan"
msgstr ""

    That is to say, sentences are broken into subparts, which is BAD (see
    http://www.ens-lyon.fr/~mquinson/l10n.html for a rationale).
  * Your version doesn't put the entry type in the po file, which prevents
    gettextization (see po4a(7) for more details). I quickly hacked in
    support for that in the CVS version, but it's not perfect yet.
    
I suggest that:
  - you move to a parser that lets you retrieve the line number (or
    explain to me that I'm an idiot and that this parser does let you
    retrieve the line number, and how)
  - you look at the sgml module to see how we handle the fact that some tags
    delimit a paragraph (like <p>), which should be translated, while
    other tags shouldn't be touched because they don't delimit a sentence
    (like <b>, <i> and so on)

Sorry, but I really can't release this module as is...
Anyway, thanks for your contribution, it IS a good start.

Bye, Mt.

-- 
Source is provided to this software because we believe users have the right
to know exactly what a program is going to do before they run it.



