po4a-dev

Re: [Po4a-dev]HTML module (first revision)


From: Martin Quinson
Subject: Re: [Po4a-dev]HTML module (first revision)
Date: Tue, 18 Feb 2003 08:35:19 +0100
User-agent: Mutt/1.5.3i

In fact, since HTML is an SGML application defined by a DTD, I guess it would
be easier to handle this format with the Sgml.pm module, which offers the
whole mechanism to do what I wanted from HTML.pm...

You would only have to add the HTML-specific parts after the DocBook- and
DebianDoc-specific ones. Provided that your documents start with a line like
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
(as they should), it will work (I guess)!
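
A minimal sketch of the DOCTYPE-based dispatch described above, written in Python for illustration only. It is not Sgml.pm's actual code: the names `detect_doctype`, `inline_tags_for` and the `INLINE_TAGS` table are hypothetical, and the tag list is an assumption.

```python
import re

# Matches the document type name in a declaration such as
# <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([A-Za-z][\w.-]*)", re.IGNORECASE)

# Hypothetical per-format table; the tag list is illustrative, not the
# one any po4a module really uses.
INLINE_TAGS = {
    "html": {"a", "b", "i", "em", "strong"},
}


def detect_doctype(text):
    """Return the lowercased document type name declared in text, or None."""
    match = DOCTYPE_RE.search(text)
    return match.group(1).lower() if match else None


def inline_tags_for(text):
    """Pick the format-specific tag set from the document's DOCTYPE."""
    doctype = detect_doctype(text)
    if doctype not in INLINE_TAGS:
        raise ValueError(f"unsupported or missing DOCTYPE: {doctype!r}")
    return INLINE_TAGS[doctype]
```

A document without a DOCTYPE line raises an error here, which mirrors the "provided that your documents start with a line like" condition.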

Bye, Mt.

On Fri, Feb 14, 2003 at 04:31:54PM +0100, Martin Quinson wrote:
> On Thu, Feb 13, 2003 at 03:03:22PM +0100, Laurent Hausermann wrote:
> > Hi all,
> >
> > I have developed an HTML module for po4a. It still has some bugs and it's
> > not perfect, but I think it's a good starting point.
> >
> > It uses HTML::TokeParser (apt-get install libhtml-parser-perl).
> >
> > I sent the whole diff to Martin Quinson, not to this list. (I can send it
> > to any of you if you email me.)
> 
> OK, I committed this to CVS so that others can see it.
> 
> This module isn't ready to release yet in my opinion. Here are my objections:
> 
>  * The parser you used doesn't allow retrieving the line number. Why not use
>    the HTML::Parser module, which seems somewhat more powerful?
>  * The sentence:
>      a wonderful wife named "<a href="mailto://Armelle.Quinson.fr">Armelle</a>",
>      and a marvelous little boy <a href="Tristan.html">Tristan</a>
>    (yup, it's part of my homepage ;) is changed to:
> # type: td
> #: FIXME:0
> #, no-wrap
> msgid "a wonderful wife named \""
> msgstr ""
>
> # type: a
> #: FIXME:0
> #, no-wrap
> msgid "Armelle"
> msgstr ""
>
> # type: td
> #: FIXME:0
> #, no-wrap
> msgid "\", and a marvelous little boy"
> msgstr ""
>
> # type: a
> #: FIXME:0 FIXME:0
> #, no-wrap
> msgid "Tristan"
> msgstr ""
>
>     That is to say, sentences are broken into subparts, which is BAD (see
>     http://www.ens-lyon.fr/~mquinson/l10n.html for a rationale).
>   * Your version doesn't put the entry type in the PO file, which prevents
>     gettextization (see po4a(7) for more details). I quickly hacked in
>     support for that in the version in CVS, but it's not perfect yet.
>     
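
To illustrate the line-number objection above, here is a small sketch in Python rather than Perl. It uses the standard library's `html.parser.HTMLParser`, whose `getpos()` method reports the current line, so each extracted string can carry a real `file:line` reference instead of `FIXME:0`. The class and function names and the default filename are hypothetical; this shows the idea only and is not how HTML::Parser or po4a is implemented.

```python
from html.parser import HTMLParser


class LineTrackingParser(HTMLParser):
    """Collect text chunks together with the line where each one starts."""

    def __init__(self, filename):
        super().__init__()
        self.filename = filename
        self.chunks = []  # list of (reference, text) pairs

    def handle_data(self, data):
        text = data.strip()
        if text:
            # getpos() returns (line, column) for the start of this chunk;
            # a chunk that begins with a newline is reported on the line
            # where that newline sits.
            lineno, _column = self.getpos()
            self.chunks.append((f"{self.filename}:{lineno}", text))


def extract_with_refs(html, filename="index.html"):
    """Return (reference, text) pairs for every non-blank text chunk."""
    parser = LineTrackingParser(filename)
    parser.feed(html)
    parser.close()
    return parser.chunks
```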
> I suggest that:
>   - you move to a parser that allows you to retrieve the line number (or
>     explain to me that I'm an idiot and that this parser does allow you to
>     retrieve the line number, and how)
>   - you look at the sgml module to see how we handle the fact that some tags
>     delimit a paragraph (like <p>), and should be translated, and that some
>     other tags shouldn't be touched because they don't delimit a sentence
>     (like <b>, <i> and so on)
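
The second suggestion can be sketched as follows, again in Python for illustration. Tags that delimit a paragraph close the current translation unit, while other tags are copied into it untouched, so the example sentence above stays in one piece. The `BLOCK_TAGS` set and the names `ParagraphExtractor` and `extract_units` are assumptions made for this sketch; the sgml module's real tag lists and logic may differ.

```python
from html.parser import HTMLParser

# Hypothetical set of tags that delimit a translatable paragraph.
BLOCK_TAGS = {"p", "td", "th", "li", "h1", "h2", "h3", "title", "div"}


class ParagraphExtractor(HTMLParser):
    """Split HTML into units at block tags, keeping other tags inline."""

    def __init__(self):
        # Keep entities such as &amp; exactly as written in the source.
        super().__init__(convert_charrefs=False)
        self.units = []
        self._buffer = []

    def _flush(self):
        # Collapse runs of whitespace; a real tool would skip this in <pre>.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.units.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()
        else:
            self._buffer.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied once, verbatim.
        if tag in BLOCK_TAGS:
            self._flush()
        else:
            self._buffer.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()
        else:
            self._buffer.append(f"</{tag}>")

    def handle_data(self, data):
        self._buffer.append(data)

    def handle_entityref(self, name):
        self._buffer.append(f"&{name};")

    def handle_charref(self, name):
        self._buffer.append(f"&#{name};")


def extract_units(html):
    """Return the list of paragraph-level strings found in html."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any trailing text outside a block tag
    return parser.units
```

With this approach the translator sees the whole sentence, links included, as a single msgid instead of four fragments.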
> 
> Sorry, but I really can't release this module as is...
> Anyway, thanks for your contribution, it IS a good start.
> 
> Bye, Mt.
> 
> -- 
> Source is provided to this software because we believe users have the right
> to know exactly what a program is going to do before they run it.

-- 
I admit that these new technologies open up other perspectives for us,
but I refuse to let that come at the expense of older possibilities such as
the book. Interaction must not be the whole of communication. For if it
became so, we would communicate only among the living, which would be
barbaric.
          --- Alain Finkielkraut



