Weighted word counts

An analysis (also called statistics) quantifies the productivity gain from using memoQ or another CAT tool. The better you can estimate the cost or time a job requires, the more competitive you can be on the market. CAT tools use statistical language processing to calculate how similar two segments are. It is generally accepted that, in most cases, statistically similar segments are also similar in meaning. Because the analysis depends on the core matching engine of each CAT tool, the figures reported by different tools may vary slightly. memoQ examines each segment, looks up the best TM match, compares the two, and counts the segment in the corresponding fuzzy match range.

memoQ calculates the weighted word count from the match ranges (fuzzy ranges) and from X-translate, which is the document-based pre-translation of a document.

Example:

No match (no match found in the TM) = 100% paid

Lower fuzzy ranges = between 50% and 70%

Higher fuzzy ranges = 40%

Exact match (a 100% match from the TM) = 30%

Repetition (the same segment is there several times in the document) = 20%

Context match (a 101% match from the TM, where the previous and next segments are also the same) = 10%

X-translated = 10%

These values give you the total number of words and the weighted total. To see what weighted means, suppose you have:

1000 new words (no match) at 100% = 1000 words

1000 words at 70% = 700 words

1000 words at 30% = 300 words

Total: 3000 words, weighted total: 2000 words
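
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the calculation, not memoQ code; the names example_bands and weighted_total are made up for the sketch, and the percentages are the ones from the example, not fixed memoQ values.

```python
# Illustrative only: (word count, percentage counted) pairs from the example.
example_bands = [
    (1000, 100),  # new words (no match), counted in full
    (1000, 70),   # words counted at 70%
    (1000, 30),   # words counted at 30%
]


def weighted_total(bands):
    """Sum each band's word count scaled by its percentage."""
    return sum(words * percent / 100 for words, percent in bands)


raw_total = sum(words for words, _ in example_bands)
print(raw_total)                      # 3000
print(weighted_total(example_bands))  # 2000.0
```

In this example, 3000 raw words are counted as 2000 weighted words.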

In an online project, the Assign selected documents to users dialog has a Use weighted word counts, if available check box. If you select it, weighted word counts are available only for documents that have an analysis report. If one or more reports exist, memoQ uses the most recently generated one to calculate the weighted word count. If no report is found, memoQ uses the raw word count, even when a weighted option is chosen.
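
The fallback rule described above can be pictured with the following sketch. It is not memoQ's implementation or API: the function name, the use_weighted flag, and the shape of the reports list (pairs of generation time and weighted count) are assumptions made purely for illustration.

```python
from datetime import datetime


def word_count_for_assignment(raw_count, reports, use_weighted):
    """Pick the word count according to the fallback rule.

    reports is a hypothetical list of (generated_at, weighted_count) pairs,
    one per analysis report that exists for the document.
    """
    if use_weighted and reports:
        # Use the most recently generated report.
        _, weighted_count = max(reports, key=lambda report: report[0])
        return weighted_count
    # No report, or the weighted option is off: fall back to the raw count.
    return raw_count


reports = [
    (datetime(2024, 1, 10), 800),
    (datetime(2024, 3, 5), 750),  # most recent report
]
print(word_count_for_assignment(1000, reports, use_weighted=True))  # 750
print(word_count_for_assignment(1000, [], use_weighted=True))       # 1000
```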

Important: Weighted word count is not available for slices of a document.

The weighted word count in memoQ works like this:

  • A match of 100% or above counts as 0.3
  • A 95-99% match counts as 0.5
  • A 75-94% match counts as 0.8
  • A no-match (below 75%) counts as 1.0
  • Locked segments are not counted at all.
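
As a rough sketch, the weights listed above could be applied per segment as follows. The function names and the (words, match rate, locked) tuple layout are hypothetical and are not taken from memoQ; the sketch also assumes whole-number match rates, as in the ranges above.

```python
def match_weight(match_rate, locked=False):
    """Return the weight for a segment, following the ranges listed above."""
    if locked:
        return 0.0  # locked segments are not counted
    if match_rate >= 100:
        return 0.3  # 100% and above
    if match_rate >= 95:
        return 0.5  # 95-99%
    if match_rate >= 75:
        return 0.8  # 75-94%
    return 1.0      # no match (below 75%)


def weighted_word_count(segments):
    """Sum word counts scaled by weight over (words, match_rate, locked) tuples."""
    return sum(words * match_weight(rate, locked) for words, rate, locked in segments)


segments = [
    (10, 101, False),  # context match
    (20, 97, False),   # high fuzzy match
    (30, 80, False),   # lower fuzzy match
    (40, 0, False),    # no match
    (50, 100, True),   # locked, ignored
]
print(round(weighted_word_count(segments), 2))  # 77.0
```

Here 150 raw words, 50 of them locked, come out as 77 weighted words.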