I’m still very, very happy with my recent switch from Heartsome to OmegaT, but one thing I’m still mastering in OmegaT is the use of tags in formatted documents.
Tags in OmegaT aren’t a big issue if your translation work consists of documents that don’t have much formatting, or in which the formatting isn’t very important. However, most of my clients want their translations to look exactly the same in English as they do in French, so the formatting tags are critically important. If you’re used to working in or at least looking at a markup language like HTML, the tags that OmegaT inserts don’t look that odd. When you’re translating in OmegaT, you have to reproduce these tags in the target segment in order for it to be formatted like the source segment, and OmegaT has a handy “tag validation” feature that lets you know if you’ve missed any tags.
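To make the idea of tag validation concrete, here is a minimal sketch of the kind of check involved. It is not OmegaT's actual implementation: the tag pattern (letters followed by a number, such as <f0> or </f1>), the function name missing_tags, and the French source segment are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Assumed tag shape: <f0>, </f0>, <br1/> (letters followed by digits).
TAG_PATTERN = re.compile(r"</?[a-zA-Z]+\d+/?>")

def missing_tags(source, target):
    """Return the tags that occur in source more often than in target."""
    source_tags = Counter(TAG_PATTERN.findall(source))
    target_tags = Counter(TAG_PATTERN.findall(target))
    # Counter subtraction keeps only tags with a positive remaining count.
    return sorted((source_tags - target_tags).elements())

# Hypothetical segments: the translator forgot the opening <f1> tag.
source = "<f0>1.3.1) Pour les besoins</f0><f1>courants</f1>"
target = "<f0>1.3.1) For ongoing</f0>needs</f1>"

print(missing_tags(source, target))
```

A check like this only compares which tags are present; it says nothing about whether they are in a sensible order or wrapped around the right words.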
The issue that I have is that, as OmegaT’s user manual says, “Tags are usually not taken into account when considering string similarity for matching purposes.” As a result, I get a lot of fuzzy matches where the text in two segments is identical or very similar, but the tags are completely different. For example, the segment “1.3.1) For ongoing needs” would come up as a fuzzy match for “<f0>1.3.1) For ongoing</f0><f1>needs</f1>”. So far, I haven’t found a way to deal with this other than either a) going character by character through the suggested match and inserting the tags into the text or b) inserting the tagged source segment into the target segment field and then copying and pasting in the matching text from the fuzzy match box. If anyone has suggestions on how to deal with this, I would love to hear them!
Another option that OmegaT’s manual suggests, and on some documents I think this might actually save time, is to remove all or most of the formatting from the source document in order to minimize the number of tags that appear in the source segments. Then, the translator could go back and re-format the document after the translation is finished. I also find it helpful to keep an OpenOffice.org document open with the source text so that I can refer to it when I can’t easily see what the tags’ purpose is.
Corinne,
It is important to understand that in a lot of cases, tags are only remnants of an old formatting that does not exist anymore on the surface but that still has its “roots” in the file code.
Such formatting is invisible to the eye, but is very visible in OmegaT…
Also, in OmegaT, tags are not a special entity that OmegaT can automatically ignore when computing the matches. As far as OmegaT is concerned, a tag is a character string, and since matches are computed based on string similarity, taking a single tag into account would add 9 characters in “weight” to the computation. You can easily see that this is not what you’d want as far as matching accuracy is concerned.
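As a rough illustration of that “weight”, the sketch below scores Corinne's example segment against its tagged twin twice: once with the tags counted as ordinary characters and once with them stripped out. It uses Python's difflib.SequenceMatcher as a stand-in similarity measure, which is an assumption and not the algorithm OmegaT uses; the tag pattern and the helper names are also made up for illustration. Note that a pair such as <f0>…</f0> comes to exactly 9 characters.

```python
import re
from difflib import SequenceMatcher

# Assumed tag shape: <f0>, </f0>, <br1/> (letters followed by digits).
TAG_PATTERN = re.compile(r"</?[a-zA-Z]+\d+/?>")

def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 (stand-in for a fuzzy match score)."""
    return SequenceMatcher(None, a, b).ratio()

def strip_tags(segment):
    """Remove every tag, leaving only the translatable text."""
    return TAG_PATTERN.sub("", segment)

plain = "1.3.1) For ongoing needs"
tagged = "<f0>1.3.1) For ongoing</f0><f1>needs</f1>"

with_tags = similarity(plain, tagged)                  # tags counted as text
without_tags = similarity(plain, strip_tags(tagged))   # tags ignored

print(f"tags counted: {with_tags:.2f}")
print(f"tags ignored: {without_tags:.2f}")
```

With the tags counted, the score drops noticeably even though the visible wording is the same, which is why ignoring tags for matching purposes is the more useful default.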
Last but not least, paragraph (or segment level) formatting is not visible in OmegaT. OmegaT only displays the “inline” formatting. For this reason it is sometimes convenient, as the manual says, to simply ignore the tags altogether in OmegaT and add the formatting to the document externally. Since you only lose the inline formatting, it is not such a big deal in most cases.
Tag management is going to be improved in the future, hopefully once the current code re-factoring is completed.
Jean-Christophe, thanks a lot, this is really helpful. I’m learning how to manage the tags more effectively, and I’m finding that as per your suggestion of ignoring the tags that don’t affect the appearance of the text, I can leave out many of the tags with no harmful effects! Otherwise OmegaT is working really, really well for me and I’m not going back to the proprietary guys!
Hi, Corinne. I’m also an OmegaT user, and I really missed a way to insert tags using keyboard shortcuts. A few days ago I created a small set of scripts that adds this feature to OmegaT, so that you can skip to the next/previous tag using ALT+RIGHT/LEFT and paste the tag with ALT+DOWN. In case you’re interested, it’s called taginsert and you may find it here:
http://www.bechtranslations.com.br/programas/start
The site is in Portuguese, but the zip file contains a readme in English. Note that the scripts only work on Linux and with the newest version of OmegaT (2.0.2).
Best regards,
Roberto Bechtlufft
I was just wondering how to do that on an MS OOXML file. Unfortunately still no solution as of 2.02.