In this section, you can fine-tune the term extraction process.
Note: In memoQ, term extraction is fully statistical (based on the length and the frequency of the candidates); it does not use any linguistic intelligence such as stemming or parsing. The options described below control the statistical procedure.
General options
•Maximum length (words) text box: The number of words in the longest term candidate. memoQ will not list expressions that are longer than this. The default value is 4.
•Minimum frequency text box: memoQ lists only candidates that occur in the source text at least as many times as the number specified here. For example, if the minimum frequency is 3, the list will contain candidates that occur 3 or more times in the source text. The default value is 3.
•Expression delimiters text box: A list of characters that mark the beginning or the end of a term candidate. memoQ will not extract expressions where one or more of these characters occur inside the expression.
•Length factor text box: A number between 0.5 and 3 that controls how strongly memoQ favors longer expressions. Each term candidate (that is, each extracted expression) receives a score during the extraction process. The larger the length factor, the larger the difference between the score of a longer expression and that of a shorter one. The default value is 1.5.
•Ignore words with numbers check box: If this check box is checked, memoQ will not include an expression if any word in it contains one or more digits. The check box is not checked by default.
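To make the interplay of the general options more concrete, here is a minimal Python sketch of a purely statistical extractor. It is not memoQ's implementation: the function name `extract_candidates`, the default delimiter set, the restriction to expressions of two or more words, and especially the scoring formula (frequency multiplied by word count raised to the length factor) are assumptions chosen only to illustrate how each option could influence the result.

```python
import re
from collections import Counter


def extract_candidates(text, max_length=4, min_frequency=3,
                       delimiters=".,;:!?()", length_factor=1.5,
                       ignore_words_with_numbers=False):
    """Illustrative multi-word candidate extraction (not memoQ's algorithm)."""
    lowered = text.lower()
    # Split at delimiter characters so that no candidate can contain one.
    if delimiters:
        chunks = re.split("[" + re.escape(delimiters) + "]", lowered)
    else:
        chunks = [lowered]

    counts = Counter()
    for chunk in chunks:
        words = chunk.split()
        # Assumption: this sketch covers expressions of 2..max_length words.
        for n in range(2, max_length + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1

    results = []
    for ngram, freq in counts.items():
        if freq < min_frequency:
            continue  # occurs too rarely in the source text
        if ignore_words_with_numbers and any(
                any(ch.isdigit() for ch in word) for word in ngram):
            continue  # some word contains a digit
        # Hypothetical score: a larger length factor widens the gap
        # between longer and shorter expressions.
        score = freq * len(ngram) ** length_factor
        results.append((" ".join(ngram), freq, score))

    # Highest score first.
    results.sort(key=lambda item: item[2], reverse=True)
    return results


sample = "term base lookup. term base lookup, term base lookup; other words"
for expression, freq, score in extract_candidates(sample):
    print(expression, freq, round(score, 2))
```

With the sample text, only expressions that occur three times survive the frequency filter, and under the assumed score the three-word expression ranks above the two-word ones.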
Single-word terms
memoQ uses a different approach to extract single-word term candidates. The settings below control how they are extracted.
•Minimum length (characters) text box: memoQ does not list words that are shorter than the number specified here. For example, if the minimum length is 3, memoQ extracts single-word candidates that are 3 characters long or longer. The default value is 3. Note: The minimum length does not apply to term candidates that contain multiple words.
•Minimum frequency text box: memoQ lists only single-word candidates that occur in the source text at least as many times as the number specified here. For example, if the minimum frequency is 3, the list will contain candidates that occur 3 or more times in the source text. The default value is 3.
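The two single-word settings act as simple thresholds. The sketch below shows one way to apply them. The function name `single_word_candidates` and the case-insensitive counting are assumptions made for illustration, not a description of memoQ's internals.

```python
from collections import Counter


def single_word_candidates(words, min_length=3, min_frequency=3):
    """Keep words that are long enough and frequent enough (illustrative)."""
    # Assumption: words are counted case-insensitively.
    counts = Counter(word.lower() for word in words)
    return sorted(
        word for word, freq in counts.items()
        if len(word) >= min_length and freq >= min_frequency
    )


words = "the API of the API is an API so is it is".split()
print(single_word_candidates(words))
```

In this sample, "is" occurs three times but is only two characters long, so it is dropped by the minimum length, while "the" is long enough but occurs only twice.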
Term base lookup
When extracting candidates, memoQ looks for expressions in the source-language text only. However, memoQ can retrieve possible translations for the extracted candidates by looking them up in term bases used in the same project.
•Look up candidates check box: Check this if you want memoQ to look up translations for each candidate in the term bases used in the current project. The check box is checked by default.
▪All term bases in project radio button: Choose this if you wish to look up the candidates in all term bases in the current project. This is the default setting.
▪Term base with the highest rank only radio button: Choose this if you wish to look up the candidates in the highest ranked term base only.
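The difference between the two radio buttons can be pictured with a small Python sketch. Everything in it is hypothetical: term bases are modeled as plain dictionaries, the list is assumed to be ordered from the highest to the lowest rank, and `look_up` is an illustrative name, not a memoQ function.

```python
def look_up(candidate, term_bases, highest_rank_only=False):
    """Collect translations for a candidate from term bases (illustrative).

    term_bases: list of dicts mapping a source term to a list of
    translations, assumed to be ordered from highest to lowest rank.
    """
    searched = term_bases[:1] if highest_rank_only else term_bases
    translations = []
    for term_base in searched:
        for translation in term_base.get(candidate, []):
            if translation not in translations:  # skip duplicates
                translations.append(translation)
    return translations


project_term_bases = [
    {"term base": ["Termdatenbank"]},
    {"term base": ["Terminologiedatenbank"], "lookup": ["Suche"]},
]
print(look_up("term base", project_term_bases))
print(look_up("term base", project_term_bases, highest_rank_only=True))
```

Looking up in all term bases can return more possible translations, while restricting the lookup to the highest ranked term base returns only what that one term base contains.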