Regex tagger

While importing a document

Start importing a document (any monolingual format).
In the Document import options window, select the documents, and click Change filter and configuration.
The Document import settings window appears.
Below Filter configuration, click the Add cascading filter link.
The Select filter to chain window appears. From the Filter drop-down box, choose Regex tagger. Click OK.
In the Document import settings window, the filter chain appears below the cascading filter controls. Click Regex tagger.
The Regex tagger controls appear.

In the translation editor

Open a project, and a document for translation.
In the Preparation ribbon, click Regex Tagger.
The Tag current document window appears. It's the same as the Regex tagger settings in the Document import settings window.

Use an existing set of regular expressions, or save your own

Most of this window is for writing and testing regular expression rules that memoQ uses to find parts of text that are replaced with inline tags. After you write up rules like this, you can save them as a filter configuration. You can also load a set of rules that was saved earlier.

To load an existing set of patterns: Choose one from the Filter configuration dropdown.

To save the rules you just created: In the Filter configuration dropdown, choose <new configuration>, and click the Save icon next to it. The Create new filter configuration window appears, where you can give a name to the new set of rules.

Write or edit regular expression patterns

You can set up several rules in a single regular expression filter configuration. These are listed in the top box of the Rules section.

To add a pattern, first type a regular expression in the Regular expression text box. This can be a simple expression: for example, if you want to replace the word 'memoQ' with an inline tag, simply type 'memoQ' in the Regular expression text box.

You can also enter more complex expressions where a simple pattern can represent several different character sequences. If you need assistance, open the Regex Assistant: Click the icon on the right, and create a regex, or choose one from the regex library. Copy it, then switch back to this window, and paste your regex into the Regular expression text box.

For example, the regular expression <[^/]*?> matches text that starts with the < character, followed by the shortest possible sequence of characters that does not contain the / character, and ends in a > character. In short, text that matches this pattern looks like an XML opening <tag>.
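To try out the pattern above outside memoQ, here is a minimal sketch in Python. It assumes that Python's re module is a close enough stand-in for this simple pattern; the regex engine in memoQ may differ in other details. The names OPENING_TAG and find_opening_tags are made up for this illustration.

```python
import re

# The pattern from the text: "<", then as few non-"/" characters
# as possible, then ">".
OPENING_TAG = re.compile(r"<[^/]*?>")


def find_opening_tags(text: str) -> list[str]:
    """Return every substring that the pattern matches, in order."""
    return OPENING_TAG.findall(text)


sample = "Click <b>OK</b> to <i>continue</i>."
print(find_opening_tags(sample))  # ['<b>', '<i>']
```

The closing tags </b> and </i> are not matched, because they contain the / character that the pattern excludes.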

To learn more about regular expressions: See the topic about Regular expressions.

After you type the regular expression, choose what type of tag you want to see in the place of the text. You can choose to use an opening tag, a closing tag, or an empty tag. These correspond to the types of tags commonly used in XML markup.

If you check the Required check box, memoQ will add a tagging error to a segment if you don't copy the corresponding tag to the target text in the translation editor.

In the Display text text box, you can specify what memoQ should write inside the tag. This is called a replacement rule, and you also use these in auto-translation rules. You can write any text here, but you can also use the pre-defined $0 expression: if the replacement rule is $0, the tag will contain the text that memoQ found when matching the pattern.

Note: If the regular expression contains several non-fixed parts, you can use $1, $2 etc. to refer to the first, second etc. non-fixed part in the replacement rule. You can choose from available options if you click the Pattern link next to the Display text box.
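The following Python sketch illustrates how a replacement rule with $0, $1 and so on could be expanded for each match. It is an illustration only, not memoQ's implementation. It assumes that the "non-fixed parts" correspond to parenthesized groups in the regular expression, and the function name display_text is hypothetical.

```python
import re


def display_text(pattern: str, rule: str, text: str) -> list[str]:
    """For each match of pattern in text, expand $0, $1, ... in rule.

    $0 stands for the whole match; $N stands for group N (assumed to be
    what the help text calls the N-th non-fixed part).
    """
    def expand(match: re.Match) -> str:
        # Replace each $N in the rule with the text of group N.
        # A group that did not take part in the match becomes "".
        return re.sub(
            r"\$(\d+)",
            lambda ref: match.group(int(ref.group(1))) or "",
            rule,
        )

    return [expand(m) for m in re.finditer(pattern, text)]


source = "Hello {name}, you have {count} items."
placeholder = r"\{(\w+)\}"

print(display_text(placeholder, "$0", source))      # ['{name}', '{count}']
print(display_text(placeholder, "$1", source))      # ['name', 'count']
print(display_text(placeholder, "var:$1", source))  # ['var:name', 'var:count']
```

With $0 the tag shows exactly what was matched; with $1 it shows only the part inside the braces.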

After you fill in the Display text box, click Add to add the rule to the list.

To modify an existing rule, click the rule in the list, and click Change.

To remove a rule from the list, click the rule, and click Delete.

If you want the Regex Tagger to work on tabs and newlines, too, check the Rules handle tabs and newlines check box.

Dealing with tabs and newlines: A segment in memoQ never contains tabs or line breaks. If they appear in the document, they are represented by a type of tag. But when you need to import a text-based document (TXT, HTML, XML etc.), you may want to tag newlines and tabs yourself. Normally, the filters before the Regex tagger will have already converted these characters into tags, so you do not have the chance to tag them yourself. To work with tabs and newlines in the Regex tagger, check the Rules handle tabs and newlines check box. Then the filters before the Regex tagger will not touch the tabs and newlines (they won't convert them into tags). But then you need to make sure that you tag all the tabs and newlines with the Regex tagger. If you do not tag them, memoQ won't import the document.
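Before importing with this check box on, it can help to check a sample of the text for tabs and newlines that none of your patterns would cover. The Python sketch below is one rough way to do that outside memoQ. It is not how memoQ validates the document, and the function name untagged_tabs_and_newlines is hypothetical.

```python
import re


def untagged_tabs_and_newlines(text: str, patterns: list[str]) -> list[int]:
    """Return positions of tabs and line breaks that no pattern matches."""
    covered: set[int] = set()
    for pattern in patterns:
        for match in re.finditer(pattern, text):
            # Remember every character position inside this match.
            covered.update(range(match.start(), match.end()))
    return [
        index
        for index, char in enumerate(text)
        if char in "\t\n\r" and index not in covered
    ]


sample = "a\tb\nc"
print(untagged_tabs_and_newlines(sample, [r"\t"]))         # [3]
print(untagged_tabs_and_newlines(sample, [r"\t", r"\n"]))  # []
```

An empty list means that every tab and line break in the sample is matched by at least one pattern.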

The lower part of the window shows how the rules work. After you fill in or edit the Rules list, the Input text box shows what parts of the original text will match your patterns. The Result box shows how memoQ tags them. Matches and replacements are highlighted in red.

Normally, the Input text and Result boxes highlight the matches from all patterns. If you want to see highlights from one rule only, click the Apply only selected rule radio button, then click a rule in the Rules list.

The order of the rules matters: Click the Up and Down buttons to move rules up and down. This can be useful if two patterns match the same paragraph, and the parts they match are overlapping.
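To see why order can matter for overlapping patterns, consider the Python sketch below. It assumes that a rule higher in the list claims its text first, and that a later rule cannot tag text that is already claimed. The help text does not spell out the exact precedence, so treat this as one plausible model, not as memoQ's documented behavior. The function name tag_in_order and the rule names are hypothetical.

```python
import re


def tag_in_order(
    text: str, rules: list[tuple[str, str]]
) -> list[tuple[int, int, str]]:
    """Return (start, end, rule_name) spans; earlier rules claim text first."""
    claimed: list[tuple[int, int, str]] = []
    for name, pattern in rules:
        for match in re.finditer(pattern, text):
            if match.end() == match.start():
                continue  # ignore empty matches
            overlaps = any(
                match.start() < end and start < match.end()
                for start, end, _ in claimed
            )
            if not overlaps:
                claimed.append((match.start(), match.end(), name))
    return sorted(claimed)


text = "Price: {{total}}"
double = ("double", r"\{\{\w+\}\}")  # matches {{total}}
single = ("single", r"\{\w+\}")      # matches {total}

print(tag_in_order(text, [double, single]))  # [(7, 16, 'double')]
print(tag_in_order(text, [single, double]))  # [(8, 15, 'single')]
```

Under this model, putting the more specific rule first tags the whole {{total}} placeholder, while the reverse order tags only the inner {total} and leaves the outer braces as plain text.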

If you come from document import

To confirm the settings, and return to the Document import options window: Click OK.

To return to the Document import options window, and not change the filter settings: Click Cancel.

If this is a cascading filter, you can change the settings of another filter in the chain: Click the name of the filter at the top of the window.

In the Document import options window: Click OK again to start importing the documents.

If you come from the translation editor

To confirm the settings, and return to the translation editor: Click OK. memoQ starts to tag the document.

To return to the translation editor, and not tag the document: Click Cancel.
