Homogeneity and repetitions in a project

Homogeneity measures internal similarities between the documents within a project. If there are two similar segments in the documents, memoQ simulates what happens if the two segments are translated one after the other in real life. When calculating statistics with the homogeneity option, each segment that memoQ processes is added to a temporary translation memory, and is used for lookup for every segment processed subsequently, as if it had already been translated. In the statistics, this means that if the second segment is 80% similar to the first, the second segment is put in the 75-84% match category.

This gives more realistic values because it simulates the translation memory suggestions a translator gets while working: if a translator actually translates the first segment and then moves on to the second segment, which is 80% similar, they also get an 80% match.
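The temporary translation memory described above can be illustrated with a minimal Python sketch. It is not memoQ's implementation: the `similarity` function uses a word-level `difflib` ratio as a stand-in for memoQ's match rate, and the `BANDS` table is an assumption (only the 75-84% category is named in this topic).

```python
from difflib import SequenceMatcher

# Hypothetical match bands as (lower bound, label).
# Only the 75-84% category is named in the text; the others are assumed.
BANDS = [(100, "100%"), (95, "95-99%"), (85, "85-94%"),
         (75, "75-84%"), (50, "50-74%")]


def similarity(a, b):
    """Stand-in similarity in percent; memoQ's real match rate differs."""
    return round(SequenceMatcher(None, a.split(), b.split()).ratio() * 100)


def band(score):
    """Map a percentage score to its match category."""
    for lower, label in BANDS:
        if score >= lower:
            return label
    return "No match"


def homogeneity_stats(segments):
    """Look up each segment in a temporary TM of the segments before it."""
    temp_tm = []
    results = []
    for seg in segments:
        # Best match against everything processed so far (0 if TM is empty).
        best = max((similarity(seg, prev) for prev in temp_tm), default=0)
        results.append((seg, best, band(best)))
        # Add the segment as if it had already been translated.
        temp_tm.append(seg)
    return results
```

For example, `homogeneity_stats(["Press the red button twice.", "Press the green button twice."])` puts the first segment in "No match" and the second, which scores 80 in this sketch, in "75-84%".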

Using homogeneity is generally not recommended in projects with several translatable documents and several translators – because you cannot know which of the two segments will be translated first. It could be either way. The order in which documents are processed also affects the results. memoQ currently processes documents by full document path, in alphabetical order.

The same applies to repetitions: the repetition numbers will be wrong if you run statistics on two or more documents first and then assign the documents to different translators. A repeated segment may occur first in either document, but statistics count only the first occurrence as a new translation and every later occurrence as a repetition, regardless of who actually translates it first.
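The effect of processing order on repetition counts can be sketched in Python. This is an illustration under assumptions, not memoQ's code: the function name `repetition_counts` and the data layout are hypothetical, and Python's plain string sort stands in for the alphabetical ordering by document path.

```python
def repetition_counts(documents):
    """Count new segments and repetitions per document.

    documents: dict mapping a document path to its list of source segments.
    Documents are processed in alphabetical order of their paths.
    """
    seen = set()
    counts = {}
    for path in sorted(documents):
        new = rep = 0
        for seg in documents[path]:
            if seg in seen:
                rep += 1        # a later occurrence counts as a repetition
            else:
                seen.add(seg)   # the first occurrence counts as new
                new += 1
        counts[path] = {"new": new, "repetition": rep}
    return counts
```

With `{"b.docx": ["Hello.", "Save the file."], "a.docx": ["Save the file.", "Bye."]}`, the shared segment is counted as new in `a.docx` and as a repetition in `b.docx`, simply because `a.docx` sorts first.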

In both scenarios, translators may be paid unfairly. There are workarounds for the problem:

  1. Use post-translation analysis. This feature allows you to pay the translators based on exactly what they translated and the matches they got from the translation memories and corpora.

    Note: If you use this report, instruct the translators not to add their own TMs to the project or use them in it.

  2. Extract the repetitions as a preparatory step when you set up your project. Then translate the repetitions into the Project TM. Use this preparatory step to avoid unfair payment for repetitions.

    Note: This method does not affect internal fuzzy matches. Turn off Homogeneity when creating analyses for the translators.

  3. Run statistics per user. After you have created the project, imported the translatable documents, and assigned the users to the memoQ user roles, select all the documents you assigned to user 1, and run statistics. Pay user 1 accordingly. Then select all the documents assigned to user 2, and run statistics. Pay user 2 accordingly. Proceed this way for all users and for all target languages in the project. This way, homogeneity and repetition matches are counted per user.
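The per-user approach in the third workaround can be sketched as follows. This is only an illustration of the idea: the names `count_repetitions` and `stats_per_user` and the dictionaries they take are hypothetical, and only repetitions (not fuzzy matches) are counted. The key point is that each user's documents are analyzed with a fresh state, so a segment is a repetition only if the same user has already seen it.

```python
from collections import defaultdict


def count_repetitions(segments):
    """Count first occurrences and repetitions in one list of segments."""
    seen = set()
    new = rep = 0
    for seg in segments:
        if seg in seen:
            rep += 1
        else:
            seen.add(seg)
            new += 1
    return {"new": new, "repetition": rep}


def stats_per_user(assignments, documents):
    """Run the count separately for each user's documents.

    assignments: dict mapping a document path to a user name.
    documents:   dict mapping a document path to its source segments.
    """
    by_user = defaultdict(list)
    for path in sorted(documents):
        by_user[assignments[path]].extend(documents[path])
    # Each user starts with an empty 'seen' set inside count_repetitions.
    return {user: count_repetitions(segs) for user, segs in by_user.items()}
```

If two users each have a document containing "Save the file.", the segment is counted as new for both of them, because neither user benefits from the other's translation in this model.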