Edit distance statistics
You can use the Edit distance feature to measure the quality of machine translation after post-editing. Or, for instance, if the reviewer not only marks but also corrects target segments in memoQ, you can use Edit distance statistics to measure how much work the reviewer did.
The edit distance is calculated whenever a segment is confirmed. The match value is stored in the row. For inserted matches, the edit distance is always calculated between the currently stored segment text and the inserted match. If there is no inserted match, the segment is treated as a new translation, and the edit distance is not applicable.
The following information is stored: whether it is a fuzzy or MT match, source and target segment, source of the match (e.g. from TM).
How to begin
The Edit distance statistics dialog can be invoked on the Documents ribbon tab.
Note: Locked segments are excluded from the edit distance statistics.
Note: Edit distance is also displayed in the translation grid, next to the view or document name.
Distance measurement: Choose the Levenshtein radio button to get a file-by-file report with a summary. The Levenshtein distance measures the difference between two versions of a text as a number of character edits. Choose the Fuzzy radio button to get a fuzzy breakdown of edited segments. Fuzzy edit distance compares the inserted matches with the current state of each row, row by row. The last delivered roles are used to compare document versions.
Click the Calculate button to calculate project-wide statistics in a project summary where each file is summarized in one line of the table. If you select the Show results for each file option, you get a separate table for each file.
The statistics result shows the match type (100%, fuzzies, etc.), the number of segments and words, how many segments were edited, and the absolute and normalized edit distance:
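To illustrate what a Levenshtein comparison of two text versions means, here is a minimal Python sketch of the textbook algorithm. It is only an illustration: the function name is hypothetical, and memoQ's own implementation (for example, how it treats tags or whitespace) is not described here and may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b (textbook definition)."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string in the inner loop
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into an empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if the characters match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Hypothetical example: an inserted MT match and its post-edited version
    inserted_match = "The cat sat on the mat"
    post_edited = "The cat sits on the mat"
    print(levenshtein(inserted_match, post_edited))
```

An unedited match gives a distance of 0. The more characters the translator changes, the larger the number.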
•Absolute edit distance is the sum of added and deleted characters over a text (document, project, etc.). This gives you a single number that tells you how much editing someone did overall. It can be any value from 0 upwards, which is why it is not “normalized”.
•Normalized edit distance is a number between 0 and 1. A value of 0 means that segment A is identical to segment B (no changes). A value of 1 means that the two segments are infinitely different (one segment is empty, the other has an infinite number of characters).
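The two measures can be sketched in Python as follows. This is a rough illustration under stated assumptions, not memoQ's actual formula: the absolute distance is taken literally as added plus deleted characters (computed through the longest common subsequence), and the normalization by the combined length of both texts is an assumption chosen only because it yields values between 0 and 1. All function names are hypothetical.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    previous = [0] * (len(b) + 1)
    for ch_a in a:
        current = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def absolute_edit_distance(before: str, after: str) -> int:
    """Deleted characters plus added characters between two versions."""
    common = lcs_length(before, after)
    deleted = len(before) - common
    added = len(after) - common
    return deleted + added


def normalized_edit_distance(before: str, after: str) -> float:
    """Value between 0 and 1 (assumed normalization: combined text length)."""
    total = len(before) + len(after)
    if total == 0:
        return 0.0  # two empty segments are identical
    return absolute_edit_distance(before, after) / total


if __name__ == "__main__":
    # Hypothetical (inserted match, confirmed segment) pairs
    segments = [
        ("The cat sat.", "The cat sat."),
        ("The cat sat.", "The dog sat."),
    ]
    # The absolute distance over a document is the sum over its segments
    print(sum(absolute_edit_distance(b, a) for b, a in segments))
    for before, after in segments:
        print(normalized_edit_distance(before, after))
```

With this sketch, the absolute value keeps growing as more text is edited, while the normalized value stays comparable across short and long segments.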
You can also invoke the Edit distance statistics from History/Reports: select two versions of a document, then click Calculate edit distance.
Here you can compare two versions of a document and see which segments were edited, split, or joined.
Click Export to export the report in HTML format. Click Close to close the dialog.