[Freecats-Dev] Re: Trados/other CAT, Python/Java, German/English
From: Keith Godfrey
Subject: [Freecats-Dev] Re: Trados/other CAT, Python/Java, German/English
Date: Tue, 25 Feb 2003 15:11:38 -0600
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20021120 Netscape/7.01
Henri Chorand wrote:
As translators, our first aim was to provide a full-fledged standalone
translation editor, because it might be the most productive solution. We
then quickly realized that we would need as many conversion filters as
possible in order to be able to translate whatever customers require,
and we thought about the huge job done by the Open Office team. We soon
realized their conversion filters would have to be integrated into
Free CATS client(s).
Building a good-quality standalone translation editor, capable of
representing source files of different kinds reasonably well (forget
accurately), will be a significant undertaking, imho, and one that will
be very difficult to accomplish in a cross-platform manner. One
possible solution would be to take an open source word processor that
already has some filters (such as Abiword) and build upon that,
but then you're tying yourself to a specific platform. On the plus
side, you've already got a solid infrastructure to start from. A CAT
tool embedded within OpenOffice.org, if it were possible, might be
the best solution, but I've never heard whether such a thing would work.
There are two options there:
- We find a way to build this interactive translation editor (this means
we have to adapt Open Office's filters)
- We build a tool that works from within OO, like Trados with MS Word
(we can reuse these filters without any extra work).
In an ideal world, translators might ask for both a standalone
translation editor (like OmegaT) and integration within a word processor.
My experience suggests focusing on one approach or the other - trying
to satisfy both will be significantly more complex and will likely
never be completed.
I can't pretend we delved deeply into OO's internals, but we found out
the following:
- OO has no macro language. Something may be done at a later stage.
- OO's API is well documented, so building something should be quite
doable, especially if we only implement a toolbar calling a set of
external functions.
Instead of using Abiword, it might be worth considering making a custom
build of OOo with built-in Trados-like features (and porting the parts
of other open source tools, such as OmegaT, that would be needed to make
the CAT side work).
The other solution we see is (assuming we start from OmegaT, which is
an option I would personally favour):
- Separate client & TM server features in OmegaT
- Design a more sophisticated GUI interface (I believe we can bring a
number of clever ideas here)
One thing to consider - all of the file filters in OmegaT would require
a complete rewrite if text style information needs to be extracted, and
separate output filters would need to be created if the user is allowed
to modify the file formatting. OmegaT's filters are reasonably simple -
they extract bits of text from a stream of data (the source file) and
simply replace that text with translated text when writing the
translated file. That method strictly enforces the rule that no
formatting changes are made outside of the proper word processing
editor. Unless one goes with a high quality word processor (such as
OOo), it may be dangerous to try to modify formats - clients may end up
with files that don't work for them (I've spent several years as a
localization engineer and have seen plenty of corrupted files - another
reason for OmegaT's strict formatting policy).
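To make the extract-and-replace idea concrete, here is a rough sketch in Python. It is not OmegaT's actual code (OmegaT is written in Java); the function names, the regex-based notion of a "tag", and the choice to treat every tag as a text boundary are all assumptions made purely for illustration.

```python
import re

# Hypothetical illustration only, not OmegaT's real filter code.
# Anything that looks like <...> is treated as opaque markup.
TAG = re.compile(r"(<[^>]+>)")

def extract_segments(source):
    """Return the text runs found between tags, in document order."""
    parts = TAG.split(source)
    return [p for p in parts if p.strip() and not TAG.fullmatch(p)]

def write_translated(source, translations):
    """Rebuild the file with text runs swapped for their translations.

    Tags and whitespace-only runs are copied through byte for byte,
    so the markup of the output is identical to that of the source.
    Untranslated runs fall back to the source text.
    """
    out = []
    for part in TAG.split(source):
        if TAG.fullmatch(part) or not part.strip():
            out.append(part)
        else:
            out.append(translations.get(part, part))
    return "".join(out)

source = "<p>Hello <b>world</b></p>"
translations = {"Hello ": "Bonjour ", "world": "monde"}
result = write_translated(source, translations)
```

Because the writer only ever touches the text runs, there is no code path through which the formatting could be altered, which is the point of the strict policy.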
We would all very much appreciate if you could provide us with a
general description of OmegaT's indexing & fuzzy matching features.
See my other email for this.
At this stage, we need to know IF (and to which extent) OmegaT keeps
all such formatting (as found in OO's native XML format files).
As mentioned above, OmegaT ignores all formatting information - its
only interest is in whether the formatting tags are 'hard' (like
paragraph boundaries) or 'soft' (like formatting boundaries). The tags
are discarded after identification.
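A minimal sketch of the hard/soft distinction, again in Python and again only an illustration rather than OmegaT's implementation: the set of tags counted as 'hard' below is an assumption, as are the names. Hard tags end the current segment, soft tags are simply dropped, and no tag survives into the extracted text.

```python
import re

# Hypothetical illustration only. Matches opening, closing and
# self-closing tags and captures the tag name.
TAG = re.compile(r"<(/?)(\w+)[^>]*>")

# Assumed classification: these tags mark paragraph-like boundaries.
# Every other tag is treated as 'soft' (inline formatting).
HARD_TAGS = {"p", "br", "h1", "h2", "li", "td"}

def segment(source):
    """Split source into segments at hard tags, discarding all tags."""
    segments = []
    current = []
    pos = 0

    def flush():
        text = "".join(current).strip()
        if text:
            segments.append(text)
        current.clear()

    for match in TAG.finditer(source):
        current.append(source[pos:match.start()])
        pos = match.end()
        if match.group(2).lower() in HARD_TAGS:
            flush()  # hard tag: close the segment here
        # soft tag: dropped, the text simply continues
    current.append(source[pos:])
    flush()
    return segments

segments = segment("<p>Hello <b>big</b> world</p><p>Second <i>one</i></p>")
```

Here the bold and italic tags vanish without breaking the sentence, while each paragraph tag closes a segment.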
BTW - I'm not subscribed to the list so if someone wishes to contact me,
please do so directly.
Keith