Training for Translators

Classes for translators and interpreters

May 26 2008
Corinne McKay

Dealing with tags in OmegaT

I’m still very, very happy with my recent switch from Heartsome to OmegaT, but one thing I’m still mastering in OmegaT is the use of tags in formatted documents.

Tags in OmegaT aren’t a big issue if your translation work consists of documents that don’t have much formatting, or in which the formatting isn’t very important. However, most of my clients want their translations to look exactly the same in English as they do in French, so the formatting tags are critically important. If you’re used to working in or at least looking at a markup language like HTML, the tags that OmegaT inserts don’t look that odd. When you’re translating in OmegaT, you have to reproduce these tags in the target segment in order for it to be formatted like the source segment, and OmegaT has a handy “tag validation” feature that lets you know if you’ve missed any tags.
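As a rough illustration of what a tag check involves, here is a minimal Python sketch. It is not OmegaT's actual validation code: the tag pattern is an assumption based on the <f0>…</f0> style of tag, the helper names are hypothetical, and the French source segment is made up for the example.

```python
import re
from collections import Counter

# Assumed tag shape: <f0>, </f0>, <f1>, ... (a letter code plus a number).
TAG_PATTERN = re.compile(r"</?[a-z]+\d+/?>")


def extract_tags(segment):
    """Return every tag found in a segment, in order of appearance."""
    return TAG_PATTERN.findall(segment)


def missing_tags(source, target):
    """Return the source tags that have no counterpart in the target."""
    remaining = Counter(extract_tags(target))
    missing = []
    for tag in extract_tags(source):
        if remaining[tag] > 0:
            remaining[tag] -= 1  # this target tag is now accounted for
        else:
            missing.append(tag)
    return missing


# Made-up source segment and a translation that drops one opening tag.
source = "<f0>1.3.1) Pour les besoins</f0><f1>courants</f1>"
target = "<f0>1.3.1) For ongoing</f0>needs</f1>"

print(missing_tags(source, target))
```

A check like this only confirms that each tag is present. It does not verify that the tags appear in a sensible order or are correctly nested.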

The issue that I have is that, as OmegaT’s user manual says, “Tags are usually not taken into account when considering string similarity for matching purposes.” As a result, I get a lot of fuzzy matches where the text in two segments is identical or very similar, but the tags are completely different. For example, the segment “1.3.1) For ongoing needs” would come up as a fuzzy match for “<f0>1.3.1) For ongoing</f0><f1>needs</f1>”. So far, I haven’t found a way to deal with this other than either a) going character by character through the suggested match and inserting the tags into the text or b) inserting the tagged source segment into the target segment field and then copying and pasting in the matching text from the fuzzy match box. If anyone has suggestions on how to deal with this, I would love to hear them!
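To make the effect of tags on matching concrete, here is a small Python sketch that scores the two example segments with the tags counted and with the tags stripped. It uses Python's difflib as a stand-in, so this is an assumption for illustration and not OmegaT's own matching algorithm. The tag pattern and helper names are hypothetical as well.

```python
import re
from difflib import SequenceMatcher

# Assumed tag shape: <f0>, </f0>, <f1>, ... (a letter code plus a number).
TAG_PATTERN = re.compile(r"</?[a-z]+\d+/?>")


def strip_tags(segment):
    """Remove all tags from a segment, leaving only the text."""
    return TAG_PATTERN.sub("", segment)


def similarity(a, b):
    """Character-based similarity between 0 and 1 (difflib stand-in)."""
    return SequenceMatcher(None, a, b).ratio()


plain = "1.3.1) For ongoing needs"
tagged = "<f0>1.3.1) For ongoing</f0><f1>needs</f1>"

# Score with the tag characters counted as part of the string.
with_tags = similarity(plain, tagged)
# Score after the tags have been removed.
without_tags = similarity(plain, strip_tags(tagged))

print(f"tags counted:  {with_tags:.2f}")
print(f"tags stripped: {without_tags:.2f}")
```

The score with tags stripped comes out higher, because the tag characters no longer dilute the comparison. That is why two segments with the same words but different tags still show up as close matches.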

Another option that OmegaT’s manual suggests, and on some documents I think this might actually save time, is to remove all or most of the formatting from the source document in order to minimize the number of tags that appear in the source segments. Then, the translator could go back and re-format the document after the translation is finished. I also find it helpful to keep an OpenOffice.org document open with the source text so that I can refer to it when I can’t easily see what the tags’ purpose is.

Written by Corinne McKay · Categorized: Technology

Comments

  1. Jean-Christophe Helary says

    May 27, 2008 at 2:55 pm

    Corinne,

    It is important to understand that in a lot of cases, tags are only remnants of an old formatting that does not exist anymore on the surface but that still has its “roots” in the file code.

    Such formatting is invisible to the eye, but is very visible in OmegaT…

    Also, in OmegaT, tags are not a special entity that OmegaT can automatically ignore when computing the matches. As far as OmegaT is concerned, a tag is a character string, and since matches are computed based on string similarity, taking a single tag into account would add 9 characters in “weight” to the computation. You can easily see that this is not what you’d want as far as matching accuracy is concerned.

    Last but not least, paragraph (or segment level) formatting is not visible in OmegaT. OmegaT only displays the “inline” formatting. For this reason it is sometimes convenient, as the manual says, to simply ignore the tags altogether in OmegaT and add the formatting to the document externally. Since you only lose the inline formatting, it is not such a big deal in most cases.

    Tag management is going to be improved in the future, hopefully once the current code re-factoring is completed.

  2. Corinne McKay says

    May 28, 2008 at 3:19 am

    Jean-Christophe, thanks a lot, this is really helpful. I’m learning how to manage the tags more effectively, and I’m finding that as per your suggestion of ignoring the tags that don’t affect the appearance of the text, I can leave out many of the tags with no harmful effects! Otherwise OmegaT is working really, really well for me and I’m not going back to the proprietary guys!

  3. Etranslate says

    June 9, 2008 at 11:18 am

    Thanks a lot, this is really helpful. Really well for me and I’m not going back to the proprietary guys! If You Need More Information Please Visit us :- eTranslate is an international company specialising in the provision of Internationalization and Globalization Solutions.

  4. Roberto Bechtlufft says

    May 28, 2009 at 6:32 pm

    Hi, Corinne. I’m also an OmegaT user, and I really missed a way to insert tags using keyboard shortcuts. A few days ago I created this small set of scripts that add this feature to OmegaT, so that you can skip to the next/previous tag using ALT+RIGHT/LEFT and paste the tag with ALT+DOWN. In case you’re interested, it’s called taginsert and you may find it here:
    http://www.bechtranslations.com.br/programas/start

    The site is in Portuguese, but the zip file contains a readme in English. Note that the scripts only work on Linux and with the newest version of OmegaT (2.0.2).

    Best regards,

    Roberto Bechtlufft

  5. Textnik says

    November 12, 2009 at 11:06 am

    I was just wondering how to do that on an MS OOXML file. Unfortunately, there is still no solution as of 2.02.


Trackbacks

  1. Update on tags in OmegaT « Thoughts On Translation says:
    June 30, 2008 at 4:12 am

    […] my love of the free and open source translation environment tool OmegaT and my frustration with tags in OmegaT. Since then, I’ve found that the easiest solution, as long as the document doesn’t […]
