Edit distance statistics

Edit distance statistics shows how much work was done on a project by a translator. It doesn't show the time spent on the project. For that, there is the editing time report.

Edit distance statistics shows how much translators had to change the translation memory matches.

When it measures the work of a reviewer, the edit distance report shows how much a reviewer had to work on segments that were confirmed by a translator.

To measure the work of a reviewer: It's recommended to use the edit distance report instead.

See also: In online projects, you need to use the edit distance report instead. It's also available in local projects.

When memoQ calculates edit distances, it computes how much you need to edit the first version of the text to get the second one. The difference appears in words or in percent, depending on your choice:

  • Levenshtein: The difference is shown in words.
  • Fuzzy: The difference is shown in percent. memoQ also groups the results by percent ranges, similarly to fuzzy ranges in analysis reports.

    If there was no match, the distance is 100%: When you measure the work of the translator, and a segment had to be translated from scratch because there was no match, the edit distance is 100% (or, with the Levenshtein method, the number of words in the translation).
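The two ways of expressing the difference can be illustrated with a short sketch. This is not memoQ's implementation: the function names, the whitespace-based word splitting, and the choice to divide by the longer version's word count are all assumptions made for illustration. It computes a classic word-level Levenshtein distance and then turns it into a percentage.

```python
def word_levenshtein(before: str, after: str) -> int:
    """Word-level Levenshtein distance: the minimum number of word
    insertions, deletions, and substitutions that turn `before` into `after`.
    Assumption: words are split on whitespace (memoQ's own word counting differs)."""
    a, b = before.split(), after.split()
    prev = list(range(len(b) + 1))  # distances from an empty prefix of `a`
    for i, word_a in enumerate(a, start=1):
        curr = [i]  # turning i words into nothing takes i deletions
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def percent_distance(before: str, after: str) -> float:
    """Distance as a percentage of the longer version's word count.
    Assumption: this normalization is illustrative, not memoQ's formula."""
    longest = max(len(before.split()), len(after.split()))
    if longest == 0:
        return 0.0  # both versions are empty, so nothing was edited
    return 100.0 * word_levenshtein(before, after) / longest


if __name__ == "__main__":
    print(word_levenshtein("the cat sat", "the dog sat"))
    print(percent_distance("", "written from scratch"))
```

In this sketch, a segment written from scratch (an empty first version) gives a word distance equal to the number of words in the translation, and a percentage of 100, which matches the no-match case described above.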

How to get here

To measure the work of the translator:

  1. Open a local project, or a checkout of an online project.
  2. In Project home, choose Translations.
  3. On the Documents ribbon, click Edit Distance. The Edit distance statistics window opens.

To measure the work of a reviewer:

  1. Open a local project, or a checkout of an online project.
  2. In Project home, choose Translations.
  3. Select a document. Right-click the selection. From the menu, choose History/Reports. The History and reports window opens.
  4. Under Minor versions of document, click the earlier version of the document. Press and hold Ctrl, and click the later version.
  5. Click Calculate edit distance. The Edit distance statistics window opens. The Scope radio buttons are grey because you have already chosen a document.

What can you do?

Set up how memoQ should count words and characters. Remember: memoQ counts words and characters to calculate the speed of translation. That means the number of words or characters translated per hour.

Through Edit distance statistics, you can measure the work of a translator only.

memoQ compares the translator's work to the translation memory matches that were inserted in the segments.

Or, it compares the reviewer's work to the translations that were confirmed by a translator. You can do this only if you start with History and reports.

To measure the work of a reviewer: It's recommended to use the edit distance report instead.

To get the amount of work in words, use the Levenshtein method to measure the distance. It makes more sense to see how many words a translator worked on than how much of the inserted text was changed. In an average translation job, many segments are translated from scratch, without inserting a match first.

Use Fuzzy only if the work was similar to editing: Counting in percent is useful if there was a match for most segments. This happens if you upgrade a translation to a new version, and there is little difference. Or, when there was machine translation and you need to fix machine-translated segments.

Confirmed segments only: memoQ calculates the edit distance for those segments that are already confirmed.

Don't use Trados 2007-like word counts: Normally, memoQ counts words like Microsoft Word does. In the past, when Trados 2007 or earlier (Trados Translator's Workbench) used to be a dominant translation tool, it was important that memoQ could produce similar word counts - so that translation companies could compare them. This is no longer the case. Use the Trados 2007-like word counts only if your client still works with an early Trados version, and they insist on using it.

To get separate statistics for each document: Check the Show results for each file check box.

To get the edit-distance statistics: Click Calculate. memoQ displays the results in the main part of the Edit distance statistics window.

The statistics result shows the total size of the documents (in segments and in words), the number of segments that were edited, and what the edit distance is in those segments. You get a number for the absolute edit distance, and the normalized edit distance.

  • Absolute edit distance is the sum of characters or words added or deleted. This gives you a single absolute number that tells you how much the document was edited.
  • A normalized edit distance is a number between 0 (0%) and 1 (100%). 0% means that the current version of a segment is identical to the previous version. 100% means that there is no similarity between the previous version of the segment and the current one. This can happen if the segment was empty, and it was filled in.
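The relationship between the two numbers can be sketched as follows. This is only an illustration under stated assumptions, not memoQ's actual calculation: it counts words (split on whitespace), uses Python's `difflib.SequenceMatcher` to find the added and deleted words, and normalizes by the combined word count of both versions. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def added_and_deleted(before: str, after: str) -> int:
    """Absolute distance for one segment: words deleted from `before`
    plus words added in `after` (a replaced word counts as one of each)."""
    a, b = before.split(), after.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    total = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            total += (i2 - i1) + (j2 - j1)  # deleted words + added words
    return total


def document_stats(segments):
    """Sum the absolute distance over (before, after) pairs and scale it
    to the range 0..1. Assumption: the divisor is the combined word count
    of both versions, so 0 means identical and 1 means nothing in common."""
    absolute = sum(added_and_deleted(before, after) for before, after in segments)
    size = sum(len(before.split()) + len(after.split()) for before, after in segments)
    normalized = absolute / size if size else 0.0
    return absolute, normalized


if __name__ == "__main__":
    pairs = [
        ("the cat sat", "the dog sat"),  # one word replaced
        ("", "filled in later"),         # empty segment that was filled in
    ]
    print(document_stats(pairs))
```

With this normalization, an unchanged segment contributes 0, and a segment that was empty and then filled in contributes the maximum, in line with the 0% and 100% cases described above.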

To save the statistics: Click Export. Choose a folder and a name for the file, then click Save. memoQ saves the results in an HTML file. If you open this file in a browser, it looks the same as the tables in the window.

When you finish

To save the statistics in a file: Click Export.

To return to Project home or to History and reports: Click Close.