PDF (Portable Document Format) files

memoQ can import PDF files. On its own, memoQ can either open them as plain text, or convert them into DOCX first and then import the DOCX file.

The PDF format is not designed for translation. It is closer to a printed page than to an editable document. Whenever possible, work on the original document that was converted into PDF. Then you can convert the translated document to PDF.

The TransPDF service is no longer available in memoQ 10.4 and later.

When you import PDF documents into memoQ, remember that:

  • Can't export PDF: If the source document is PDF, memoQ exports the translation as plain text or as DOCX, depending on the import method.
  • Can't import password-protected PDF files.
  • Can't import scanned PDF files: memoQ doesn't extract text from scanned PDF files, where the pages are saved as images and not as text. To translate these documents, run them through a page reader (OCR) program such as Nuance OmniPage or ABBYY FineReader (PDF Reader). These programs save well-formed DOCX files where the text flow and the formatting are retained as much as possible.
  • Text may become garbled: PDF is not a text format. Normally, it doesn't try to preserve the text flow. As a result, some of the text may be missing or may appear in the wrong order when you import a PDF into memoQ. When this happens, use an OCR program (see above).
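
Before you import a large batch of PDF files, it can help to check them outside memoQ. The following is a minimal sketch of such a check, not a part of memoQ: the function names preflight_pdf and preflight_file are hypothetical, and the check is only a heuristic. It assumes that a PDF file that uses encryption contains an /Encrypt key in its raw bytes. Such a file may or may not need a password to open, and the byte search can give false positives, so treat the result as a hint. The sketch doesn't try to detect scanned files.

```python
from pathlib import Path


def preflight_pdf(data: bytes) -> list[str]:
    """Return a list of warnings for the raw bytes of a file (heuristic only)."""
    warnings = []
    # A PDF file carries a "%PDF-" header near the start of the file.
    if b"%PDF-" not in data[:1024]:
        warnings.append("not a PDF file")
        return warnings
    # Assumption: an encrypted PDF refers to its encryption dictionary
    # through an /Encrypt key that is visible in the raw bytes.
    if b"/Encrypt" in data:
        warnings.append("possibly password-protected")
    return warnings


def preflight_file(path: str) -> list[str]:
    """Read a file from disk and run the same heuristic check on it."""
    return preflight_pdf(Path(path).read_bytes())
```

For example, calling preflight_file on each file in a folder and setting aside the files that return a warning lets you unlock or OCR them before you start the import.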

How to get here

  1. Start importing a Portable Document Format (PDF) file.

  2. In the Document import options window, select the PDF files, and click Change filter and configuration.

  3. The Document import settings window appears. From the Filter drop-down list, choose PDF (Portable Document Format).

What can you do?

When you finish

In the Document import options window: Click OK again to start importing the documents.

  • If this is a cascading filter, you can change the settings of another filter in the chain: Click the name of the filter at the top of the window.

memoQ doesn't import PDF directly

memoQ relies on external modules to import PDF documents. These modules are installed with memoQ, but come from other software makers.

To convert PDF documents into Word (DOCX), memoQ uses Aspose.PDF. To learn how this is done: See the developer's web page.

To convert PDF documents into plain text, memoQ uses Xpdf. Xpdf copyright © 1996-2009 Glyph & Cog, LLC.