Regular expressions

Regular expressions are a powerful means for finding character sequences in text. In memoQ, they are used to define segmentation rules, auto-translation rules, or rules for the Regex tagger. You can also use regular expressions in Find and replace, and in the Filter fields in the translation editor.

Finding character sequences is a familiar task to everyone who has used a word processor or text editor before. The Find or Search dialog serves this purpose – if you search for 'cat', your editor will highlight words (or parts of words) such as 'cat', 'cats', or even 'sophisticated'.
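As a rough illustration of this plain search, the sketch below marks every occurrence of a literal string in square brackets. It uses Python's re module rather than memoQ or the .NET engine, and the function name highlight is made up for this example.

```python
import re


def highlight(text: str, needle: str) -> str:
    """Wrap every literal occurrence of needle in square brackets."""
    # re.escape makes sure the needle is treated as plain text,
    # not as a pattern with special characters.
    pattern = re.compile(re.escape(needle))
    return pattern.sub(lambda m: "[" + m.group(0) + "]", text)


if __name__ == "__main__":
    print(highlight("cat cats sophisticated", "cat"))
```

The search does not care about word boundaries, which is why the letters inside 'sophisticated' are marked as well.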

Regular expressions, however, give you far more freedom to tell the computer what you are looking for. You can identify sequences such as a letter 'a' followed by two or three letters 'c'; a number of letters followed by one or more digits; any of the words 'cat', 'dog', or 'mouse'; or even occurrences of a word that appear between quotation marks – and much more. After reading through this page and experimenting with the examples, you'll know exactly how. If you do not feel ready to learn the details, the Regex Assistant will help you.
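The four searches described above could be written roughly as follows. This is only a sketch: it runs in Python's re module, not in memoQ, and although these basic constructs are written the same way in the .NET flavor, other features differ between the two. The constant names and the find_all helper are invented for this illustration.

```python
import re

# A letter 'a' followed by two or three letters 'c'.
A_THEN_CS = re.compile(r"ac{2,3}")

# A number of letters followed by one or more digits.
# Explicit ranges are used so that only ASCII letters and digits match.
LETTERS_THEN_DIGITS = re.compile(r"[A-Za-z]+[0-9]+")

# Any of the words 'cat', 'dog' or 'mouse', as whole words only.
ANIMALS = re.compile(r"\b(?:cat|dog|mouse)\b")

# The word 'cat' where it stands between straight double quotation marks.
QUOTED_CAT = re.compile(r'"cat"')


def find_all(pattern: re.Pattern, text: str) -> list:
    """Return the text of every non-overlapping match, left to right."""
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    print(find_all(A_THEN_CS, "ac acc accc acccc"))
    print(find_all(LETTERS_THEN_DIGITS, "room42 and B7, but not 99"))
    print(find_all(ANIMALS, "The cat chased a mouse; the caterpillar watched."))
    print(find_all(QUOTED_CAT, 'He said "cat", not cat.'))
```

In the first pattern, 'ac' alone is not matched because at least two letters 'c' are required, and in 'acccc' only the first three are taken. The \b markers in the third pattern keep 'caterpillar' from counting as 'cat'.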

Note: The term regular expression comes from the mathematical theory on which this pattern matching method is based. It is often abbreviated as regexp or regex – here we'll use regex, or in the plural, regexes.

Regex syntax has many variants (flavors): memoQ uses the .NET regex engine, and thus the .NET flavor. This article only describes a part of the .NET regex syntax – for the detailed documentation, see the related part of the Microsoft Learn website.

Standard .NET regex features

memoQ extensions

Sequence    Description
\tag        Inline or memoQ tag
\itag       Inline tag
\mtag       memoQ tag