po4a-dev

[Po4a-dev] address@hidden: Re: Using antiword parser for another program


From: Martin Quinson
Subject: [Po4a-dev] address@hidden: Re: Using antiword parser for another program]
Date: Wed, 22 Jan 2003 11:36:38 +0100
User-agent: Mutt/1.4i

Well, it looks like a word module is impossible to do. Erk, it would be sooo
cool ;)

Mt.

----- Forwarded message from Antiword team <address@hidden> -----

From: Antiword team <address@hidden>
To: Martin Quinson <address@hidden>
Subject: Re: Using antiword parser for another program
Date: Sun, 19 Jan 2003 13:29:56 +0100

On Wednesday 15 January 2003 10:50, you wrote:
> Hello,
> 
> I'm the principal developer of the po4a project. Its goal is to ease the
> translation of documentation by extracting strings from the document,
> presenting them to translators, and reinjecting their translations back in
> place of the original text in the document.
> 
> For now, it works for simple (i.e., well-documented) formats like nroff (man
> pages), sgml and others.
> 
> I would like to try to do the same thing for Word documents, since there is
> a very strong demand for this from translators.
> 
> Before I start coding on this, I would like to have your advice. Do you
> think it's possible to do the same for Word documents?
> The main problem I can imagine before starting is that we'll have to update
> offsets around the file when the text length changes. But if there is not
> too much such stuff, it could still be feasible.
> Another interesting problem will be to change the encoding of the document
> in case of an English->Korean translation, I guess.
> 
> Any other pitfall I should be aware of?

Hello,

I'm afraid your idea will not work for Word documents. If you replace text
in a Word document, the result is not a Word document anymore, especially
when the replacement text is longer or shorter than the original text.
Word stores text and the information about how the text should be displayed
in different places. Any change in the length of the text has to be
followed by countless changes in offsets in the display information. Miss
one offset or miscompute one offset and Word will crash when it tries to
read the resulting document. It's a hopeless task.
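
To make the offset problem concrete, here is a minimal sketch in Python. It is
a toy model only, not the real Word binary format: the names Run and
replace_text are hypothetical, and it assumes display information is a flat
list of (start, end, style) character ranges kept apart from the text. It
shows that replacing a string with one of a different length forces every
later offset to be shifted.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Hypothetical display record: style applied to text[start:end]."""
    start: int
    end: int
    style: str


def replace_text(text, runs, start, end, new):
    """Replace text[start:end] with `new` and fix up every affected run."""
    delta = len(new) - (end - start)  # how much the text grows or shrinks
    new_text = text[:start] + new + text[end:]
    new_runs = []
    for r in runs:
        if r.end <= start:
            # Run lies entirely before the edit: offsets stay valid.
            new_runs.append(Run(r.start, r.end, r.style))
        elif r.start >= end:
            # Run lies entirely after the edit: both offsets must shift.
            new_runs.append(Run(r.start + delta, r.end + delta, r.style))
        elif r.start <= start and r.end >= end:
            # Run contains the edit: only its end moves.
            new_runs.append(Run(r.start, r.end + delta, r.style))
        else:
            # The toy model gives up on edits that straddle a run boundary.
            raise ValueError("edit straddles a run boundary")
    return new_text, new_runs


text = "Hello world, bye"
runs = [Run(0, 5, "bold"), Run(6, 11, "italic"), Run(13, 16, "plain")]

# Translate "world" into a longer string.
new_text, new_runs = replace_text(text, runs, 6, 11, "tout le monde")

stale = runs[2]      # offsets that were not updated
fixed = new_runs[2]  # offsets after the fix-up pass
print(repr(new_text[stale.start:stale.end]))  # no longer the intended text
print(repr(new_text[fixed.start:fixed.end]))  # the intended text again
```

In this sketch there is a single offset table, so one pass suffices. The point
of the message is that a real Word file has many such structures, and a
single missed or miscomputed offset is enough to corrupt the document.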

The best you can do is use Antiword to extract the text and then translate
the text. But you cannot create a new Word document.
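
As an illustration of the extraction step, a minimal Python sketch follows.
It assumes the antiword executable is installed and on the PATH, and that
running it with just a file name prints the document text to standard
output. The helper names antiword_command and extract_text are hypothetical,
and decoding the output as UTF-8 is an assumption: the actual encoding
depends on the character mapping antiword was run with.

```python
import subprocess


def antiword_command(doc_path, executable="antiword"):
    """Build the argument list for a plain-text extraction run."""
    return [executable, doc_path]


def extract_text(doc_path, run=subprocess.run):
    """Return what antiword prints to stdout for `doc_path`.

    `run` defaults to subprocess.run and can be swapped out for testing.
    Raises subprocess.CalledProcessError if antiword exits with an error.
    """
    result = run(antiword_command(doc_path), capture_output=True, check=True)
    # Assumption: output is UTF-8; undecodable bytes are replaced, not fatal.
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("report.doc") (a placeholder file name) would return the
extracted text for translation. Nothing here writes a Word file back, in line
with the limitation described above.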

Kind Regards,
Adri van Os 

-- 
The Antiword Team                         address@hidden
http://www.winfield.demon.nl/index.html for version 0.33 (05 Jul 2002)


----- End forwarded message -----

-- 
Si les grands esprits se rencontrent, les petits esprits, eux, se cognent.



