Microsoft Word 97–2003 (DOC)
Microsoft Word documents are the most common documents that get translated. In this window, you can control how memoQ imports documents from Microsoft Word 2003 or earlier (these are *.doc or *.dot documents).
This window is for documents from Word 2003 or earlier: Use it to import *.doc or *.dot files.
How to get here
- Start importing a document from Word 2003 or earlier (a .doc file).
- In the Document import options window, select the Word documents, and click Change filter and configuration.
- The Document import settings window appears. From the Filter drop-down list, choose Microsoft Word 97–2003 filter.
What can you do?
First, memoQ converts the document into the DOCX format (the one used by Word 2007 or later).
Under Select import type: Click Import as DOCX. Do not click Import as RTF.
This works if Word 2007 or later is installed on your computer. memoQ calls Word to save the document in the newest format.
If Word 2007 or later isn't installed: Check the Aspose check box. In this case, memoQ converts the document into the newest format on its own. For that, memoQ uses a module called Aspose.
Click Import markup as inline tags. The other option is to import all formatting markup as legacy memoQ {tags}, which is not recommended.
In the Content to import section, you can import or ignore certain parts of the document. Here are the options you have:
- Import hidden text: Normally, memoQ doesn't import hidden text. Check this check box to import it.
- Import TOC: Normally, memoQ doesn't import the table of contents. In Word, TOC is a field that is generated automatically, and usually, it's not translated. Check this check box if you still want to import it.
- Import alternative text of images: Normally, memoQ imports alternative text of images. If you do not want to translate them: Clear this check box.
- Import index: Normally, memoQ doesn't import the index. The index in Word is a field that is generated automatically, and usually it's not translated. Check this check box if you still want to import it.
- Import index entries: Normally, memoQ imports the index entries. They are fields in the middle of the text that mark words or expressions that go into the index. An index entry can specify a different expression that will appear in the index instead of the actual expression in the text. If you don't want to translate these: Clear this check box.
memoQ can exclude text that is formatted in a specific style. For example, a technical document may include program code that mustn't be translated. The program code is formatted in a different style. You list that style here, so that memoQ doesn't import those parts. Thus they can't be corrupted during translation.
You specify the styles by name. memoQ doesn't read the names of styles from the document, so you need to type each style name exactly as it appears in Word. Before you add a style name to the list, open the document in Word, and double-check the styles.
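If you prefer to check style names outside Word, the following is a minimal sketch, not part of memoQ. It assumes you have saved a DOCX copy of the document (the old *.doc format cannot be read this way), and the function name list_style_names is made up for illustration. It reads the style definitions from the word/styles.xml part of the DOCX package using only the Python standard library.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/styles.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_style_names(docx_file):
    """Return (name, type) pairs for every style defined in a DOCX file.

    docx_file is a path or a file-like object. The type is "paragraph"
    or "character" for the styles that matter for exclusion.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/styles.xml"))

    styles = []
    for style in root.iter(f"{{{W_NS}}}style"):
        name = style.find(f"{{{W_NS}}}name")
        if name is not None:
            styles.append((name.get(f"{{{W_NS}}}val"), style.get(f"{{{W_NS}}}type"))
            )
    return styles
```

Character styles correspond to the Internal type and paragraph styles to the External type. For built-in styles, the name stored in the file can differ in capitalization from what Word displays, so treat Word's own style list as the reference.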
There are two types of excluded parts:
- Internal: This is text formatted in a character style. It usually marks an expression inside a segment that is formatted differently. Example: A reference to a part of program code in a sentence. When memoQ excludes text in an internal style, that part is replaced with an inline tag.
- External: This is text formatted in a paragraph style. Example: A complete snippet of program code in the middle of the document. When memoQ excludes text in an external style, that text doesn't even appear in the translation editor.
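The difference between the two types can be sketched with a small model. This is an illustration only, not memoQ's implementation: the Run and Paragraph classes, the style names, and the `<1>` tag notation are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    text: str
    char_style: Optional[str] = None  # character style, if any


@dataclass
class Paragraph:
    style: str   # paragraph style name
    runs: list   # list of Run objects


def import_paragraphs(paragraphs, internal_styles, external_styles):
    """Return the text that would reach the translation editor."""
    segments = []
    for para in paragraphs:
        if para.style in external_styles:
            continue  # external: the whole paragraph is left out
        parts = []
        tag_no = 0
        for run in para.runs:
            if run.char_style in internal_styles:
                tag_no += 1
                parts.append(f"<{tag_no}>")  # internal: replaced by a tag
            else:
                parts.append(run.text)
        segments.append("".join(parts))
    return segments


doc = [
    Paragraph("Normal", [Run("Call "), Run("init()", "CodeInline"), Run(" first.")]),
    Paragraph("CodeBlock", [Run("x = 1")]),
]
print(import_paragraphs(doc, {"CodeInline"}, {"CodeBlock"}))
# prints ['Call <1> first.']
```

The inline code reference survives as a tag that the translator can move but not edit, while the code paragraph never shows up as a segment.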
To add a style to the list:
- Type the name of the style in the text box above the list.
- If it's an external style: Check the External check box.
- Click Add style.
To remove a style from the list: Select it. Click Remove style.
Legacy Trados markup styles are added automatically: Normally, memoQ will exclude parts that are formatted in the tw4winInternal and the tw4winExternal styles. These are the styles used in bilingual Word documents that Trados Translator's Workbench 2007 and earlier produce. In those documents, these styles indicate markup that doesn't belong to the actual text.
Under Special characters, choose how memoQ imports manual line breaks (soft breaks) and tabulators. In Word, a paragraph break always means the end of a segment, but a manual line break doesn't end a paragraph. To insert a manual line break in Word, press Shift+Enter.
The tabulator is a character that jumps to a predefined position in a line. It can be used to write up text that looks like tables. It is also used to indent the first lines in paragraphs that have a hanging indent.
In memoQ, both characters can mean the end of the segment, and they can also be inserted as an inline tag.
For each character, you can choose from three options in the drop-down box:
- Start new segment: Normally, memoQ starts a new segment when there is a manual line break. For a tab character, choose this if you want to start a new segment.
- Show as inline tag and start new segment: Choose this to show each manual line break or tab character as an inline tag, and start a new segment at the same time.
- Show as inline tag: Normally, memoQ shows tab characters as inline tags. For manual line breaks, choose this to show them as inline tags.
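The effect of the three options can be sketched as a toy segmenter. It is only a model of the behavior described above, not memoQ's code: the Mode names, the `<br>` and `<tab>` notation, and the choice to place the tag at the end of the segment it closes are assumptions. A newline character stands in for the manual line break.

```python
from enum import Enum


class Mode(Enum):
    NEW_SEGMENT = "Start new segment"
    TAG_AND_NEW_SEGMENT = "Show as inline tag and start new segment"
    TAG = "Show as inline tag"


def split_paragraph(text, line_break=Mode.NEW_SEGMENT, tab=Mode.TAG):
    """Split one paragraph into segments according to the chosen modes.

    The defaults mirror the normal behavior described above: a manual
    line break starts a new segment, a tab becomes an inline tag.
    """
    modes = {"\n": (line_break, "<br>"), "\t": (tab, "<tab>")}
    segments, current = [], ""
    for ch in text:
        if ch not in modes:
            current += ch
            continue
        mode, tag = modes[ch]
        if mode is not Mode.NEW_SEGMENT:
            current += tag            # the character is shown as a tag
        if mode is not Mode.TAG:
            segments.append(current)  # the character ends the segment
            current = ""
    segments.append(current)
    return [s for s in segments if s]  # drop empty segments


sample = "Name:\tAnna\nCity:\tOslo"
print(split_paragraph(sample))
# prints ['Name:<tab>Anna', 'City:<tab>Oslo']
print(split_paragraph(sample, line_break=Mode.TAG_AND_NEW_SEGMENT))
# prints ['Name:<tab>Anna<br>', 'City:<tab>Oslo']
```

With both characters set to start a new segment, the same sample yields four short segments, which is useful for tab-separated text that imitates a table.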
When you import a Word document, there can be too many tags, especially if the Word document was converted from a scanned PDF document.
You have two options to get a cleaner import:
- Ignore font "hints" for cleaner import: Don't clear this check box. The main font may need to change when you are translating from Arabic or an Asian language into a European language. The source text may still contain some Latin words. Before memoQ introduced the font substitution option, every switch between the Latin and Arabic/Asian characters resulted in tags in the imported source text because MS Word implicitly changed fonts there.
In many cases, you do not need these font changes in the target text. If you are translating from Japanese (with some English words like company names etc. interspersed) into English, the translated text won't contain any Japanese characters. But in Word, a formatting instruction can have information like "use this font if the part of text is Asian, but use that font if it is Latin." There is also a "hint" attribute for each part of text that tells Word about the type of script a part of text has (Asian, Arabic, etc). memoQ ignores the "hint" attribute if you check this check box. The attribute won't make sense in the translated text. It's very likely that the script will change because you translate from Japanese to English.
- Ignore minor formatting changes for fewer tags: Check this when you work with a scanned document or a document converted from a PDF file. It reduces the number of tags in your document. memoQ ignores formatting changes such as baseline shifts, character spacing changes, and character compression.
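Why ignoring minor formatting reduces tags can be shown with a small model. This is a sketch under stated assumptions, not memoQ's algorithm: text is represented as (text, formatting) runs, each boundary between differently formatted runs is assumed to cost a tag, and the property names are invented for the example.

```python
# Hypothetical names for the minor properties listed above
MINOR_PROPERTIES = {"baseline_shift", "character_spacing", "character_compression"}


def merge_runs(runs, ignore_minor=True):
    """Merge adjacent (text, formatting) runs whose formatting is equal.

    With ignore_minor=True, minor properties are dropped before the
    comparison, so runs that differ only in those properties merge.
    """
    merged = []
    for text, fmt in runs:
        if ignore_minor:
            fmt = {k: v for k, v in fmt.items() if k not in MINOR_PROPERTIES}
        if merged and merged[-1][1] == fmt:
            merged[-1] = (merged[-1][0] + text, fmt)
        else:
            merged.append((text, fmt))
    return merged


runs = [
    ("Scan", {"bold": False, "character_spacing": 0}),
    ("ned ", {"bold": False, "character_spacing": 2}),  # OCR noise
    ("text", {"bold": True}),
]
print(len(merge_runs(runs, ignore_minor=False)))  # prints 3
print(len(merge_runs(runs)))                      # prints 2
```

The spacing change inside "Scanned" disappears, so only the real change to bold remains as a formatting boundary.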
A Word document may contain a lot of comments. You can decide what happens to them.
Normally, memoQ doesn't import comments from Word documents.
To change this: First, click the Advanced tab.
- To import Word comments as memoQ comments: Click Import as memoQ comments. In memoQ, a comment has a type. Choose it from the Category drop-down box. The options are: Information, Warning, Error, Other. If the comment refers to text that is smaller than a segment, it will be imported as a comment for that part of the source text. If the comment refers to text that spans several sentences, it's imported as a segment-level comment for all the affected segments.
- To import Word comments as text to translate: Click Import as text to translate. With this setting, comments will appear as regular segments.
If you import hidden text, hidden comments are imported, too. If you do not import hidden text, comments for the hidden text are not imported either.
memoQ can export comments into translated Word documents. Normally, memoQ comments are not exported. To make memoQ export comments: Open Options. Choose Miscellaneous. Click the Translation tab. Choose what types of comments you want to export.
You can choose to import all automatic numbering formats, or only those that are used in the document.
To set this up: Click the Advanced tab.
You have three choices:
- Import all number formats (even unused ones): Just like in earlier versions, memoQ imports all the numbering formats from the document's styles.
- Import used number formats only; they will be collected at the end of the document: memoQ imports only the numbering formats that appear in the docx file. You can find them in the last segments of the imported document.
- Import used number formats only; they appear in the document when they are first used: memoQ imports only the numbering formats that appear in the docx file. You can find them at their first occurrence in the imported document. This is the default choice.
After you translate an old-style Word document, memoQ converts it back to the old format when the translation is exported.
Because the original document is from an old Word version, there will be no extensions that work in Word 2007 or later only.
You can safely ignore the settings in the Compatibility group.
When you finish
- To confirm the settings, and return to the Document import options window: Click OK.
In the Document import options window: Click OK again to start importing the documents.
- To return to the Document import options window, and not change the filter settings: Click Cancel.
- If this is a cascading filter, you can change the settings of another filter in the chain: Click the name of the filter at the top of the window.