Re: [Freecats-Dev] OmegaT
From: Marc Prior
Subject: Re: [Freecats-Dev] OmegaT
Date: Thu, 27 Feb 2003 10:03:18 +0100
Hello All,
I have subscribed to the Free CATS mailing list and would be happy to share
my thoughts with you (as many of you already know me, you will also know that
I think you're entitled to my opinion ;-) ), but at the moment work on the
OmegaT project is taking up any spare time I have. (I also have to do some
translation in-between, in order to buy food :-). Very inconvenient!)
I'll now respond to various comments on the list, mainly by Henri.
> To put it in a nutshell: Free CATS dreamed about it, and Keith did it
This really sums it up. OmegaT is available, accessible, and usable. Indeed,
people are using it - in fact, interest is really starting to pick up
now. It is easy to see potential areas for further improvement of OmegaT, but
my view is that it is better to make an open-source product available, and
then to modify it in the light of user feedback. With a commercial product,
there may be financial and marketing reasons for getting it right first time,
but I feel strongly that a process of continual development is preferable
for projects such as OmegaT and Free CATS. These projects are dependent upon
generating enthusiasm, and technical specifications aren't very exciting.
Also, an obvious source of support is the open-source coding community
outside of translation, as there are programmers there willing to contribute,
and they in turn can benefit from the product by using it to localize
open-source products. The problem here is that the programmers do not
necessarily have experience of the translation process, and if allowed to do
so, may work towards solutions which are not practical and do not meet the
actual needs of translators.
The other thing I would encourage you all to do is, of course, to try OmegaT.
It is easy to learn - in fact, a university lecturer here in Germany says it
can be learnt in an hour. (Think of that - OmegaT as a university subject.
OmegaTology, or OmegaTics?) See
http://www.fask.uni-mainz.de/cafl/linuxfaq/fasklinuxfaq.html
I have been suggesting that users try version 0.9.9, as version 1.0.0 had
certain bugs, and there is little documentation for it as yet (give me
another three to four weeks). However, the bugs have been cured in version
1.0.2, and I imagine list subscribers will be able to find their way around
1.0.2 with what documentation there is (the 0.9.7 manual, plus Keith's
release notes for 1.0.0 on the Sourceforge site). The additional
functionality of 1.0.2, in particular the change to TMX as the native TM
format, is well worth having. 1.0.2 is available from the OmegaT home page
(not Sourceforge yet). Make sure you have version 1.4.x of the J2RE installed.
On the current discussion of fuzzy matching algorithms:
> Please tell us what you think about it, and if ever you came with a
> similar solution. Marc Prior will certainly have something to say here,
> as I understand he works with German language, full of "déclinaison"
I have been using TM, first Trados, then Wordfast and now OmegaT, for six
years now. I must say that for my work, all three have fuzzy matching
algorithms that are more than adequate. There may be performance gains to be
had by changing these algorithms, but such improvements would be a long way
down the list of priorities for me.
I should perhaps point out that my work is seldom highly repetitive. It
seldom involves translating a text which is 90% identical to an existing one,
where the accuracy of fuzzy matching and the time savings gained from making
minimum modifications to an existing segment are critical. It is much more
often the case that I need to retrieve isolated words or phrases from past
translations, each of which would take me several minutes to find by other
methods (searching the file system, opening the file, finding the location in
the text, locating the parallel (translated) file, opening it, finding the
location). This is the real benefit of TM for me. And, I believe, also for
many other translators who don't consider using translation memory because
they think it is only for repetitive texts.
One weakness of OmegaT is that the fuzzy matching algorithm treats inline
formatting tags as words (or rather parts of words), and this reduces the
match rate substantially on heavily formatted texts. That is not a fault of
the algorithm itself, though.
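To illustrate the effect, here is a minimal sketch in Python. It is not OmegaT's matching algorithm: it uses the token-level ratio from Python's standard difflib module as a stand-in, and the sample sentences and the function names (tokens, similarity) are invented for the illustration. The tag pattern assumes tags of the <f0> ... </f0> kind shown later in this mail. Because a tag glued to a word turns it into a different token, the tagged sentence scores lower against its untagged twin unless the tags are removed before comparison.

```python
import re
from difflib import SequenceMatcher

# Assumed tag shape: <f0>, </f0>, <f1>, ... as in the examples in this mail.
TAG = re.compile(r"</?f\d+>")

def tokens(segment, strip_tags):
    """Split a segment on whitespace, optionally removing inline tags first."""
    if strip_tags:
        segment = TAG.sub("", segment)
    return segment.split()

def similarity(a, b, strip_tags=False):
    """Token-level similarity in [0, 1]; difflib's measure, not OmegaT's."""
    return SequenceMatcher(None, tokens(a, strip_tags), tokens(b, strip_tags)).ratio()

# Hypothetical segments: identical wording, one with inline formatting tags.
plain = "Press the red button to start the pump."
tagged = "Press the <f0>red</f0> button to <f1>start</f1> the pump."

with_tags = similarity(plain, tagged)                     # tags count as parts of words
without_tags = similarity(plain, tagged, strip_tags=True) # tags ignored

print(f"tags kept: {with_tags:.2f}, tags stripped: {without_tags:.2f}")
```

Stripping the tags before comparison is only one possible remedy; whether it suits OmegaT's internals is a separate question.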
On the subject of formatting:
OmegaT retains all of OOo's formatting information. I have not yet noticed
any formatting loss whatsoever. I have imported large (5 MB), complex
(styles, images, TOCs, etc.) MS Word files directly (i.e. without going
through RTF) into OOo, translated them without difficulty in OmegaT, and
exported them directly from OOo to Word, without formatting loss.
(Apart from demonstrating OmegaT's strengths, this also proves the value of
the XML format structure for text documents. But don't get me started on that
subject!)
You translate around the tags in the way with which most of you will be
familiar. However, although you can't add formatting, you do have a certain
amount of control over the existing formatting. (I need to do more testing to
find how much.) You can delete, multiply, and change the sequence of certain
formatting attributes. So, the sentence
This is BOLD, this is ITALIC and this is UNDERLINED.
(where BOLD, ITALIC and UNDERLINED have the corresponding formatting)
appears in OmegaT as
This is <f0>bold</f0>, this is <f1>italic</f1> and this is <f2>underlined</f2>.
and can be changed to
This is <f2>underlined</f2>, this is not italic and this is <f0><f2>bold and
underlined</f2></f0>.
This may seem trivial to a programmer, but if you translate between two
languages which frequently have a different word order and which use
different font formatting mechanisms, you will appreciate how valuable it is.
My point here is really that OmegaT should not be seen as leaving the
translator with no formatting control.
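The rules described above (tags may be deleted, repeated, or reordered, but no new formatting can be added) can be sketched as a small check in Python. This is a hypothetical helper, not part of OmegaT, and the name check_target_tags is invented here. It also assumes that tags in the translation must remain properly nested, which is my assumption based on the XML nature of the files rather than something I have tested.

```python
import re

# Assumed tag shape: <f0>, </f0>, <f1>, ... as in the example above.
TAG = re.compile(r"<(/?)(f\d+)>")

def tag_names(segment):
    """Return the set of tag names (f0, f1, ...) used in a segment."""
    return {name for _, name in TAG.findall(segment)}

def check_target_tags(source, target):
    """Return a list of problems; an empty list means the tags look usable."""
    problems = []
    allowed = tag_names(source)
    open_tags = []  # stack of currently open tag names
    for closing, name in TAG.findall(target):
        if name not in allowed:
            # New formatting cannot be added, only existing tags reused.
            problems.append(f"tag {name} does not occur in the source")
        if not closing:
            open_tags.append(name)
        elif open_tags and open_tags[-1] == name:
            open_tags.pop()
        else:
            problems.append(f"closing tag </{name}> has no matching opening tag")
    problems.extend(f"tag <{name}> is never closed" for name in open_tags)
    return problems

source = ("This is <f0>bold</f0>, this is <f1>italic</f1> "
          "and this is <f2>underlined</f2>.")
target = ("This is <f2>underlined</f2>, this is not italic "
          "and this is <f0><f2>bold and underlined</f2></f0>.")

print(check_target_tags(source, target))  # the reordered example from this mail
```

Deleting the <f1> pair and using <f2> twice are both accepted, since the check only objects to tags that are unknown or badly nested.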
Enough for now.
Marc