Filter for duplicates (term base)
You can remove duplicate entries from your term base. It's easy to create duplicates in a term base. You may end up with a lot of them if you do one of these:
- Import entries from a file;
- Add terms to the term base from the translation editor using the Quick add term command (Ctrl+Q).
In the Filter for duplicates window, you can start looking for the duplicates in your term base. You can run this on one term base at a time.
In the end, memoQ gives you groups of entries. Entries in the same group are duplicates of each other in one way or another, according to the condition you set up in the Filter for duplicates window.
How to get here
- Start editing a term base.
- On the Term base editor ribbon, click Show Duplicates. The Filter for duplicates window appears.
What can you do?
Two term base entries are usually duplicates of each other if the terms in one of the languages are the same in both of them.
- Under Filter for, you can fine-tune this. You have radio buttons to make your choice.
The Filter for section uses the source language and the target language from the term base editor.
If two entries should count as duplicates when just one term is the same in both, use the radio buttons under At least one term same for:
- Language 1 (English in the example): Entries are duplicates if the same term is there in Language 1 in both of them
- Language 2 (German in the example): Entries are duplicates if the same term is there in Language 2 in both of them
- both languages: Entries are duplicates if they have at least one term in common in both Language 1 and Language 2
- all languages: Entries are duplicates if they have at least one term in common in every language of the term base.
If two entries should count as duplicates only when all the terms are the same in both, use the radio buttons under All terms same for:
- Language 1 (English in the example): Entries are duplicates if all the terms are the same in Language 1 in both of them
- Language 2 (German in the example): Entries are duplicates if all the terms are the same in Language 2 in both of them
- both languages: Entries are duplicates if all the terms are the same in them in both Language 1 and Language 2
- all languages: Entries are duplicates if all the terms are the same in them in every language of the term base.
Choose just one of the eight radio buttons. Normally, you would treat two entries as duplicates if they share just one term in the source language. This is the very first radio button on the left.
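The two families of conditions above can be sketched in a few lines of Python. This is only an illustration of the logic, not memoQ's implementation: the Entry model (a dictionary from language name to a set of terms) and both function names are hypothetical, and the sketch assumes that a language with no terms never satisfies a condition.

```python
from typing import Dict, List, Set

# Hypothetical model: an entry maps a language name to its set of terms.
Entry = Dict[str, Set[str]]


def share_at_least_one(a: Entry, b: Entry, languages: List[str]) -> bool:
    """True if the entries have a common term in every listed language."""
    return all(a.get(lang, set()) & b.get(lang, set()) for lang in languages)


def share_all_terms(a: Entry, b: Entry, languages: List[str]) -> bool:
    """True if the term sets are identical in every listed language.

    Assumption: an empty or missing language does not count as a match.
    """
    return all(
        bool(a.get(lang)) and a.get(lang) == b.get(lang)
        for lang in languages
    )


e1 = {"English": {"term base", "termbase"}, "German": {"Termdatenbank"}}
e2 = {"English": {"term base"}, "German": {"Terminologiedatenbank"}}

# "At least one term same for: Language 1" -> duplicates
print(share_at_least_one(e1, e2, ["English"]))
# "At least one term same for: both languages" -> not duplicates
print(share_at_least_one(e1, e2, ["English", "German"]))
# "All terms same for: Language 1" -> not duplicates
print(share_all_terms(e1, e2, ["English"]))
```

Passing one language, two languages, or every language of the term base to these functions corresponds to the four radio buttons in each group.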
- Next, choose how the terms are compared; what counts as 'same term'. Decide how strict you want to be about this. To set this up, use the check boxes under Matching. Normally, all these check boxes are cleared, which means that two terms are the same if they have the exact same words, spaces, and punctuation. Lowercase and uppercase can be different.
- To make the checking case-sensitive, so that uppercase and lowercase must also be the same: Check the Case sensitive check box.
- To accept two terms as identical even if the spaces are different: Check the Ignore different spaces check box.
- To accept two terms as identical even if the punctuation is different: Check the Ignore punctuation check box.
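As a rough illustration of how the three Matching check boxes change what counts as the 'same term', here is a minimal Python sketch. The function and parameter names are hypothetical, and the details are assumptions: the sketch removes all whitespace when spaces are ignored and only knows ASCII punctuation, which may differ from what memoQ actually does.

```python
import string


def normalize(term: str, case_sensitive: bool = False,
              ignore_spaces: bool = False,
              ignore_punctuation: bool = False) -> str:
    """Reduce a term to the form used for comparison (illustrative only)."""
    if not case_sensitive:
        # Default behavior in the text: uppercase and lowercase may differ.
        term = term.casefold()
    if ignore_punctuation:
        # Assumption: ASCII punctuation only.
        term = "".join(ch for ch in term if ch not in string.punctuation)
    if ignore_spaces:
        # Assumption: all whitespace is dropped before comparing.
        term = "".join(term.split())
    return term


def same_term(a: str, b: str, **options: bool) -> bool:
    """True if two terms are identical under the chosen options."""
    return normalize(a, **options) == normalize(b, **options)


print(same_term("Term Base", "term base"))                       # all boxes cleared
print(same_term("Term Base", "term base", case_sensitive=True))  # Case sensitive
print(same_term("term base", "term  base", ignore_spaces=True))  # Ignore different spaces
print(same_term("e-mail", "email", ignore_punctuation=True))     # Ignore punctuation
```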
Next, decide what happens to the duplicates. You can fine-tune this under Keep/merge/delete settings for duplicates.
In a group of duplicate entries, one will always become the 'master' entry. You can decide what happens to the other entries in the group:
- Delete them
- Merge extra details from them into the master entry
- Keep them, so the duplicate remains
You can make your choice for each group when you're back in the term base editor.
First, memoQ needs to decide which entry is the master in the group. Choose one from the Choose master entry based on list: User, Approved, More recent, More metadata. For example, your master entry will be one that was added by a certain user. Normally, memoQ follows this logic:
- The master entry will be the one that comes from a preferred user.
- If there is not one entry with a preferred user: Choose the one that was approved.
- If none of the entries is approved, or several are: Choose the most recent one.
- If there are several entries from the same time: Choose the one that has more metadata filled in.
You can change the order of these. Click a condition, and click the Move up or the Move down button.
In the Users list, you can list the preferred users. These will be senior people, for example, terminologists, who regularly add or approve terms. To add a preferred user, click Add. To remove a user from the list: Click the name of the user. Click Remove.
You can add unpreferred users, too. If you have a list of unpreferred users, other users will be preferred to them. For example, if the duplicate group has four entries from unpreferred users, and one from a user who's not on the list, memoQ will choose this fifth one as the master entry.
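The selection order and the user lists described above can be pictured as a chain of tie-breakers. The following Python sketch is an assumption-laden illustration, not memoQ's code: the Entry fields, the function names, and the idea of ranking users as preferred (2), unlisted (1), or unpreferred (0) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, FrozenSet, List, Sequence


@dataclass
class Entry:
    id: int
    user: str
    approved: bool
    modified: datetime
    metadata: Dict[str, str] = field(default_factory=dict)


def user_rank(entry: Entry, preferred: FrozenSet[str],
              unpreferred: FrozenSet[str]) -> int:
    """Preferred users rank highest, unpreferred lowest (assumed scheme)."""
    if entry.user in preferred:
        return 2
    if entry.user in unpreferred:
        return 0
    return 1


def choose_master(group: List[Entry],
                  preferred: FrozenSet[str] = frozenset(),
                  unpreferred: FrozenSet[str] = frozenset(),
                  order: Sequence[str] = ("user", "approved", "recent", "metadata")) -> Entry:
    """Pick the master entry; later criteria only break ties in earlier ones."""
    criteria = {
        "user": lambda e: user_rank(e, preferred, unpreferred),
        "approved": lambda e: int(e.approved),
        "recent": lambda e: e.modified,
        "metadata": lambda e: sum(1 for v in e.metadata.values() if v),
    }
    return max(group, key=lambda e: tuple(criteria[name](e) for name in order))


# Four entries from an unpreferred user, one from a user who is not on the list.
day = datetime(2024, 1, 1)
group = [Entry(i, "intern", False, day) for i in range(1, 5)]
group.append(Entry(5, "anna", False, day))
master = choose_master(group, unpreferred=frozenset({"intern"}))
print(master.id, master.user)
```

Changing the order argument plays the role of the Move up and Move down buttons: for example, order=("recent", "user", "approved", "metadata") would prefer the most recent entry before looking at users.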
Then, choose what happens to the entries that are not selected as the master entry.
- For duplicates not automatically selected as master: Click either Mark for deletion or Mark for merging. Normally, memoQ marks the non-master entries for merging into the master entry.
Don't choose Mark for deletion unless you are absolutely sure that the duplicate entries are completely identical, all terms, all languages, all fields. In all other cases, you should choose Mark for merging. Otherwise you'll lose information from the term base.
- Last modifying user/date for merged entries: Choose if the modification date and the last modifying user should come from the master entry - or from the current editing session. Either click Keep from master entry or Use current user/date. You can't keep dates and users from the non-master (merged) entries.
- Text metadata values for merged entries: If the metadata fields have conflicting contents in the master entry and in the non-master (merged) entries, choose what should happen. You can use the value from the master entry, or you can use all of them, appended together with commas. For the former, click Keep from master entry. For the latter, click Concatenate using commas.
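The difference between the two text metadata options can be shown with a small Python sketch. The function name and parameters are hypothetical, and some details are assumptions rather than documented memoQ behavior: the sketch skips empty values, drops repeated values, and joins with a comma followed by a space.

```python
from typing import List


def merge_text_field(master_value: str, other_values: List[str],
                     concatenate: bool) -> str:
    """Resolve one text metadata field for a merged entry (illustrative only)."""
    if not concatenate:
        # "Keep from master entry": values from merged entries are ignored.
        return master_value
    # "Concatenate using commas": master value first, then the others.
    merged: List[str] = []
    for value in [master_value, *other_values]:
        if value and value not in merged:  # assumption: skip empties and repeats
            merged.append(value)
    return ", ".join(merged)


print(merge_text_field("IT", ["software", "IT", ""], concatenate=False))
print(merge_text_field("IT", ["software", "IT", ""], concatenate=True))
```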
When you finish
To look for all the duplicates, and return to the term base editor: Click OK.
To return to the term base editor, and not look for the duplicates: Click Cancel.
After memoQ finds the duplicates, it displays a special list that contains the groups of duplicates, and not the individual entries.
To learn more: See the documentation about the term base editor.