PDF files (Portable Document Format)
Use this dialog to choose whether memoQ imports PDF files as plain text or as DOCX.
How to begin
Note: You cannot import password-protected PDF files.
You can choose to
•Import as plain text
•or Import as DOCX.
The plain text filter has no further configuration options. memoQ extracts the text from the PDF and imports it into your memoQ project.
Important: memoQ can open Adobe Acrobat PDF files, but with a severe restriction: a PDF file imported as plain text cannot be saved or exported as a PDF file. You can only export such files as bilingual documents (memoQ XLIFF, bilingual DOC/RTF) or as TXT files, and all formatting is lost in the process. If you need to keep the formatting of a document, either use the Import as DOCX option, or convert the PDF file into a Word document with a tool such as ABBYY PDF Transformer before you import it into memoQ.
Important: If you migrate a project from a previous memoQ version that only had the plain text filter, you can only export the PDF file as plain text. You will need to import the PDF again using the Import as DOCX option to be able to export your PDF as DOCX.
For the Import as DOCX option, which is the default setting, go to the DOCX options tab for further import settings. The tab has the same configuration options as the DOCX import filter. memoQ uses Aspose to convert the PDF document to DOCX on import.
Because Aspose performs the conversion, the following Aspose options are available:
•Conversion type section: You can choose between two conversion options on import: the Text flow conversion (might slightly change formatting) option, which is the default setting, and the Attempt to keep formatting (some text may be lost) option.
The text flow option is a recognition mode in which Aspose performs grouping and multi-level analysis to restore the original document while trying to produce an output document that is easy to edit. The output document, however, may look different from the original PDF document.
The attempt to keep formatting option preserves the original look of the PDF file, but the output document may be harder to edit later in Word. Every visually grouped block of text in the PDF document is converted into a text box during import into memoQ. This option keeps the closest resemblance to the original PDF document.
•Conversion options section:
Check the Specify relative horizontal proximity (normalized, 1=100%) check box to control how close textual elements must be to count as belonging together. The distance is normalized by the font size, so elements in a larger font may be farther apart and still be treated as a single whole. The value is a fraction of the font size (1=100%). For example, with a value of 1, two characters of 10 pt that are placed 10 pt apart are still treated as belonging together.
Check the Recognize bullet points signs check box to enable the recognition of bullets during the conversion to a Word document on import into your memoQ project.
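The arithmetic behind the relative horizontal proximity setting can be illustrated with a small sketch. This is not memoQ or Aspose code: the function name `are_proximal` and its parameters are hypothetical, and treating a gap exactly equal to the threshold as "close enough" is an assumption made for the illustration.

```python
def are_proximal(gap_pt: float, font_size_pt: float,
                 relative_proximity: float = 1.0) -> bool:
    """Illustrative check: do two text elements count as one whole?

    gap_pt             -- horizontal distance between the elements, in points
    font_size_pt       -- font size of the text, in points
    relative_proximity -- the normalized setting (1.0 means 100% of the font size)
    """
    # The allowed gap scales with the font size, so larger fonts
    # tolerate larger distances between elements.
    max_gap_pt = relative_proximity * font_size_pt
    # Assumption: a gap exactly at the threshold still counts as proximal.
    return gap_pt <= max_gap_pt


# Two 10 pt characters placed 10 pt apart, with the setting at 1 (100%).
print(are_proximal(gap_pt=10, font_size_pt=10, relative_proximity=1.0))
# The same characters with the setting at 0.5 (50%): the gap is now too wide.
print(are_proximal(gap_pt=10, font_size_pt=10, relative_proximity=0.5))
```

Under these assumptions, raising the value makes the conversion more tolerant of wide spacing, while lowering it splits text at smaller gaps.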
Further information on PDF conversion using Aspose can be found here (link available at the time of writing):
In memoQ, PDF files are opened with Xpdf, which is an open-source viewer.
Xpdf copyright © 1996-2009 Glyph & Cog, LLC.