Microsoft Word (DOCX)
Microsoft Word documents are the most commonly translated documents. In this window, you can control how memoQ imports documents from Microsoft Office Word 2007 and newer (*.docx or *.dotx).
This window is for documents from Word 2007 or newer: Use it to import *.docx or *.dotx files.
How to get here
- Start importing a document from Word 2007 or newer (a .docx file).
- In the Document import options window, select the Word documents, and click Change filter and configuration.
- The Document import settings window appears. From the Filter drop-down list, choose Microsoft Word filter.
What can you do?
To import formatting markup as inline tags: Click Import markup as inline tags. The other option imports all formatting markup as legacy memoQ {tags}, which is not recommended.
Under Content to import, you can import or ignore certain parts of the document. Here are the options you have:
- Import hidden text: Normally, memoQ doesn't import hidden text. Check this check box to import it.
- Import TOC: Normally, memoQ doesn't import the table of contents. The table of contents in Word is a field that is generated automatically. Normally, it's not translated. Check this check box if you still want to import it.
- Import index: Normally, memoQ doesn't import the index. The index in Word is a field that is generated automatically. Normally, it's not translated. Check this check box if you still want to import it.
- Import index entries: Normally, memoQ imports the index entries. Index entries are fields in the middle of the text that mark words or expressions that go into the index. An index entry can specify a different expression that will appear in the index instead of the actual expression in the text. If you don't want to translate these: Clear this check box.
memoQ can exclude text that is formatted in a specific style. For example, a technical document may include program code that mustn't be translated. The program code is formatted in a different style. You list that style here, so that memoQ doesn't import those parts. Thus they can't be corrupted during translation.
You specify the styles by name. memoQ doesn't read the names of styles from the document, so you need to type each style name exactly as it appears in the document. Before you add a style name to the list, open the document in Word, and double-check the styles.
There are two types of excluded parts:
- Internal: This is text formatted in a character style. It usually marks an expression inside a segment that is formatted differently. Example: A reference to a part of program code in a sentence. When memoQ excludes text in an internal style, that part is replaced with an inline tag.
- External: This is text formatted in a paragraph style. Example: A complete snippet of program code in the middle of the document. When memoQ excludes text in an external style, that text doesn't even appear in the translation editor.
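If you want to double-check style names outside Word, the following is a minimal sketch, not part of memoQ. It reads word/styles.xml from the .docx package and lists each style with its type. A "character" style corresponds to an internal style, and a "paragraph" style to an external one. The function name list_styles is hypothetical. It is also an assumption that the name stored in the file is the name memoQ expects: for built-in styles, Word may display a different name, so Word remains the authoritative check.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace, used by word/styles.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_styles(docx_path):
    """Return (name, type) pairs for every named style in a .docx file.

    The type is the value of the w:type attribute,
    for example "paragraph" or "character".
    """
    with zipfile.ZipFile(docx_path) as package:
        root = ET.fromstring(package.read("word/styles.xml"))

    styles = []
    for style in root.iter(f"{{{W}}}style"):
        name = style.find(f"{{{W}}}name")
        if name is None:
            continue  # skip styles that carry no name element
        styles.append((name.get(f"{{{W}}}val"), style.get(f"{{{W}}}type")))
    return styles
```

To use it, call list_styles with the path of your document, and look for the styles that mark the text you want to exclude.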
To add a style to the list:
- Type the name of the style in the text box above the list.
- If it's an external style: Check the External check box.
- Click Add style.
To remove a style from the list: Select it. Click Remove style.
Legacy Trados markup styles are added automatically: Normally, memoQ will exclude parts that are formatted in the tw4winInternal and the tw4winExternal styles. These are the styles used in bilingual Word documents that Trados Translator's Workbench 2007 and earlier produce. In those documents, these styles indicate markup that doesn't belong to the actual text.
Under Special characters, choose how memoQ imports manual line breaks (soft breaks) and tabulators. In Word, a paragraph break always means the end of a segment, but a manual line break doesn't end a paragraph. To insert a manual line break in Word, press Shift+Enter.
The tabulator (tab character) is a character that moves the text after it to a predefined position in a line. It can be used to lay out text that looks like a table. It is also used to indent the first lines in paragraphs that have a hanging indent.
In memoQ, both characters can mean the end of the segment, and they can also be inserted as an inline tag.
For each character, you can choose from three options in the drop-down box:
- Start new segment: Normally, memoQ starts a new segment when there is a manual line break. For a tab character, choose this if you want to start a new segment.
- Show as inline tag and start new segment: Choose this to show each manual line break or tab character as an inline tag, and start a new segment at the same time.
- Show as inline tag: Normally, memoQ shows tab characters as inline tags. For manual line breaks, choose this to show them as inline tags.
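The three options above can be illustrated with a small model. This is a sketch only, not how memoQ segments text internally. It assumes that a manual line break is represented as "\n", a tab as "\t", that the inline tags look like <br> and <tab>, and that a tag stays at the end of the segment it closes. The names BreakHandling and segment_paragraph are hypothetical.

```python
from enum import Enum


class BreakHandling(Enum):
    NEW_SEGMENT = "Start new segment"
    TAG_AND_NEW_SEGMENT = "Show as inline tag and start new segment"
    INLINE_TAG = "Show as inline tag"


def segment_paragraph(text, line_break, tab):
    """Split one paragraph into segments according to the chosen options.

    line_break and tab are BreakHandling values for "\n" and "\t".
    """
    handling = {"\n": (line_break, "<br>"), "\t": (tab, "<tab>")}
    segments = []
    current = ""
    for ch in text:
        if ch not in handling:
            current += ch
            continue
        mode, tag = handling[ch]
        if mode is not BreakHandling.NEW_SEGMENT:
            current += tag  # the character is kept as an inline tag
        if mode is not BreakHandling.INLINE_TAG:
            segments.append(current)  # the character ends the segment
            current = ""
    segments.append(current)
    # Drop empty segments, for example after a trailing line break
    return [segment for segment in segments if segment]
```

With the settings described above as normal (new segment for manual line breaks, inline tag for tabs), the text "Name:\tAnna\nAge:\t30" becomes two segments in this model, each with one <tab> tag.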
When you import a Word document, there can be too many tags, especially if the Word document was converted from a scanned PDF document.
You have two options to get a cleaner import:
- Ignore font "hints" for cleaner import: Don't clear this check box. The main font may need to change when you are translating from Arabic or an Asian language into a European language. The source text may still contain some Latin words. Before memoQ introduced the font substitution option, every switch between the Latin and Arabic/Asian characters resulted in tags in the imported source text because MS Word implicitly changed fonts there.
In many cases, you do not need these font changes in the target text. If you are translating from Japanese (with some English words such as company names interspersed) into English, the translated text won't contain any Japanese characters. But in Word, a formatting instruction can have information like "use this font if the part of text is Asian, but use that font if it is Latin." There is also a "hint" attribute for each part of text that tells Word which type of script that part is in (Asian, Arabic, etc.). If you check this check box, memoQ ignores the "hint" attribute. The attribute wouldn't make sense in the translated text anyway, because the script is very likely to change when you translate, for example, from Japanese into English.
- Ignore minor formatting changes for fewer tags: Check this when you work with a scanned document or a document converted from a PDF file. It reduces the number of tags in your document. memoQ ignores formatting changes such as baseline shifts, character spacing changes, and character compression.
You cannot run a chained (cascaded) filter after a Word import if the imported text contains inline formatting. However, several content management systems return DOCX or Excel documents whose text needs a second filter.
If you need to run an HTML or an XML filter after the DOCX import, you can ignore inline formatting:
On the General tab, check the Ignore inline formatting tags in cascading import check box.
This check box appears only if another filter is chained after the Word filter.
Caution: The formatting memoQ ignores will be missing from the exported document, too.
A Word document may contain a lot of comments. You can decide what happens to them.
Normally, memoQ doesn't import comments from Word documents.
To change this: First, click the Advanced tab.
- To import Word comments as memoQ comments: Click Import as memoQ comments. In memoQ, a comment has a type. Choose it from the Category drop-down box. The options are: Information, Warning, Error, Other. If the comment refers to text that is smaller than a segment, it will be imported as a comment for that part of the source text. If the comment refers to text that spans several sentences, it's imported as a segment-level comment for all the affected segments.
- To import Word comments as text to translate: Click Import as text to translate. With this setting, comments will appear as regular segments.
If you import hidden text, hidden comments are imported, too. If you do not import hidden text, comments for the hidden text are not imported either.
memoQ can export comments into translated Word documents. Normally, memoQ comments are not exported. To make memoQ export comments: Open Options. Choose Miscellaneous. Click the Translation tab. Choose what types of comments you want to export.
Previously, memoQ imported all automatic numbering formats from the document's styles. In memoQ 8.4 and newer versions, you can choose to import only those that are used in the document.
To set this up: Click the Advanced tab.
You have three choices:
- Import all number formats (even unused ones): Just like in earlier versions, memoQ imports all the numbering formats from the document's styles.
- Import used number formats only; they will be collected at the end of document: memoQ imports only the numbering formats that appear in the docx file. You can find them in the last segments of the imported document.
- Import used number formats only; they appear in the document when they are first used: memoQ imports only the numbering formats that appear in the docx file. You can find them at their first occurrence in the imported document. This is the default choice.
Choose how memoQ should process documents saved by Microsoft Office versions that are newer than Office 2007.
To set this up: Click the Advanced tab.
Some parts of the documents are called extensions: For example, text boxes. Newer versions of Office save these in two variants: one copy for the newest Office version, and another copy for compatibility reasons, to be read by Office 2007 or earlier versions.
You have three choices:
- Keep compatible with all Office versions: memoQ imports both versions of the extensions. This means that some of the contents are imported twice: once from the up-to-date extension, and once from its legacy counterpart.
- Keep compatible with original Office version and newer: memoQ imports the up-to-date extensions only. If you open the document in Office 2007 or earlier, you won't see the translations of these. Use this option only if the document was created in a newer version of Office, and it will be used in a newer version than Office 2007.
- Downgrade document to Office 2007: memoQ imports the legacy content only, and exports a document that doesn't have the up-to-date extensions. This is the default choice.
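Before you choose, it may help to know whether the document contains such double content at all. The sketch below is not part of memoQ. It assumes that the two variants are stored as mc:AlternateContent blocks in word/document.xml, which is how Office Open XML markup compatibility usually stores up-to-date content (mc:Choice) next to a legacy copy (mc:Fallback). Whether memoQ looks at exactly these elements is an assumption, and the function name count_alternate_content is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Markup Compatibility namespace of Office Open XML
MC = "http://schemas.openxmlformats.org/markup-compatibility/2006"


def count_alternate_content(docx_path):
    """Return (blocks, blocks_with_fallback) for the main document part."""
    with zipfile.ZipFile(docx_path) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    blocks = list(root.iter(f"{{{MC}}}AlternateContent"))
    # A block with a Fallback child carries a legacy copy of its content
    with_fallback = sum(
        1 for block in blocks if block.find(f"{{{MC}}}Fallback") is not None
    )
    return len(blocks), with_fallback
```

If both numbers are zero, the three choices should make little difference for that document.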
If a Word document contains tracked changes, memoQ can import them as tracked changes. This is useful when you translate a document, and then it gets edited, and you need to translate the edited version, too.
To choose how memoQ treats this, click the Track changes tab.
First, tell memoQ how it should import tracked changes. This setting controls how tracked changes appear in the source text.
- Normally, memoQ will import the final version only - as if all changes were accepted. This is the Accept changes before import radio button.
- To import the unedited version - as if all changes were rejected: Click Reject changes before import.
- To import the tracking markup - to see how the document was edited: Click Import changes as such (source segments will have change tracking). In the translation editor, the source segments will show the tracked insertions and deletions.
Then, tell memoQ how it should export tracked changes. memoQ will export tracked changes from the target text. This is completely independent from tracked changes in the source text.
- Normally, memoQ uses the options that you set in the Options window: in the Miscellaneous pane, on the Import/export tab. This is the Use the global options radio button.
- To export the final version - with all changes accepted: Click Export final version.
- To export the tracking markup - to see what was changed: Click Export change tracking markup.
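To decide whether the import options on this tab matter for a document, you can check if it contains tracked changes at all. The following is a minimal sketch, not part of memoQ. It counts the w:ins (insertion) and w:del (deletion) elements in word/document.xml. It ignores other revision types such as moves and formatting changes, and it doesn't look at headers, footers, or footnotes. The function name count_tracked_changes is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_tracked_changes(docx_path):
    """Count tracked insertions and deletions in the main document part."""
    with zipfile.ZipFile(docx_path) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    return {
        "insertions": sum(1 for _ in root.iter(f"{{{W}}}ins")),
        "deletions": sum(1 for _ in root.iter(f"{{{W}}}del")),
    }
```

If both counts are zero, the main text has no tracked insertions or deletions, and the three import options should give the same source text for it.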
When you finish
To confirm the settings, and return to the Document import options window: Click OK.
To return to the Document import options window, and not change the filter settings: Click Cancel.
If this is a cascading filter, you can change the settings of another filter in the chain: Click the name of the filter at the top of the window.
In the Document import options window: Click OK again to start importing the documents.