Building Glossaries & Terminology with AI

Terminology work is the part of a translator's job that scales worst with human time. Building a 200-term glossary by hand for a new client used to mean a day of work. AI does the first 80% of that work in 20 minutes, leaving you to do the 20% — verification — that actually requires your expertise.

What You'll Learn

How to extract a domain glossary from a source document
How to enrich a glossary with definitions, context, and usage notes
How to convert AI output into a CAT-tool-ready termbase
The verification habit that keeps AI glossaries trustworthy

Why Terminology Is Worth Your AI Time

Inconsistent terminology is a leading cause of quality problems on large projects. The same English term gets translated inconsistently because:

You translated some segments on Monday and some on Friday
A different translator worked on chapters 3–6
The client did not provide a glossary
The reviewer disagrees with the translator's choices

A solid project glossary, agreed with the client before you start, eliminates 60% of avoidable revisions. AI lets you build one in a single afternoon.

Step 1: Term Extraction from the Source

Paste a representative chunk of the source — an executive summary, the first chapter, or the table of contents plus introduction — into Claude or ChatGPT.

"You are a senior terminologist. I am translating a [DOMAIN] document from [SOURCE] to [TARGET] for a [CLIENT TYPE] audience. Below is the executive summary of the source. Extract the 40 most important domain-specific terms and proper nouns that will need consistent translation throughout the project. Exclude common everyday words. For each term, list:

The exact source-language term

Why it is important (concept / product name / regulation / acronym / technical jargon)

Whether it should be left untranslated, translated literally, or adapted

Format as a Markdown table."

A typical output is a table of roughly 40 terms (expect 30–50 in practice), each with a rationale and a handling recommendation. Now you have the spine of your glossary.
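If you want to carry that table into later steps without retyping it, a small parser can help. The following is a minimal sketch, assuming the AI returned a standard pipe-delimited Markdown table; the function name, the column headers, and the sample rows are illustrative and not part of any tool's API.

```python
def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Turn a pipe-delimited Markdown table into a list of row dicts keyed by header."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip any prose the model wrote around the table
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if len(rows) < 2:
        return []
    header, body = rows[0], rows[1:]
    # Drop the separator row, whose cells contain only dashes, colons, or spaces.
    body = [r for r in body if not all(set(c) <= set("-: ") for c in r)]
    # Keep only rows whose cell count matches the header.
    return [dict(zip(header, r)) for r in body if len(r) == len(header)]


# Illustrative sample, not real model output.
sample = """
| Term | Type | Handling |
|------|------|----------|
| monopile foundation | technical jargon | translate |
| LCOE | acronym | leave untranslated |
"""
terms = parse_markdown_table(sample)
```

The sketch does not handle escaped pipes inside a cell, so skim the parsed rows against the original table before relying on them.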

Step 2: Generate Target-Language Equivalents

For each extracted term, get a candidate translation with context:

"For each row in the table above, propose two candidate translations into [TARGET LANGUAGE, e.g., Brazilian Portuguese] for a [DOMAIN] audience. Include:

Suggested target term

Alternative

One sentence of usage guidance for me as the translator

A 12–15 word example sentence in the target language showing the term in use"

This is the moment where AI is most useful and most dangerous. The fluency of the suggestions can lure you into accepting plausible but incorrect terms. Always verify against authoritative sources before adding to your termbase.

Step 3: Verification Against Authoritative Sources

For every term in your AI-generated glossary, check at least one authoritative source:

EU terminology: IATE (iate.europa.eu) — multilingual, vetted, free
UN terminology: UNTERM (unterm.un.org)
Canadian government: TERMIUM Plus (free, EN/FR/ES/PT)
Spanish-specific: Fundéu, RAE
German-specific: DWDS, Duden
Medical: WHO terminology, MedDRA, ICD-11
Legal: National legal dictionaries (Black's Law Dictionary for US English; Cornu for French)
Industry: Client's existing materials, competitor websites in the target market, industry body publications

A clean workflow: keep two browser windows open. AI suggestion on the left, authoritative source on the right. Accept, reject, or modify each term in 10–30 seconds.
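The accept, reject, or modify decision is easier to audit later if each one is recorded along with the source you checked. Here is one possible sketch of such a log; the class, the field names, and the sample terms are assumptions made for illustration, not a prescribed format.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    MODIFIED = "modified"
    REJECTED = "rejected"


@dataclass
class Candidate:
    source: str
    ai_suggestion: str
    status: Status = Status.PENDING
    final_target: str = ""
    checked_in: str = ""  # which authoritative source was consulted


def review(c: Candidate, status: Status, checked_in: str, final_target: str = "") -> None:
    """Record a decision; refuse to record one without a named source."""
    if not checked_in:
        raise ValueError("name the authoritative source you checked")
    if status is Status.MODIFIED and not final_target:
        raise ValueError("a modified term needs the corrected target")
    c.status = status
    c.checked_in = checked_in
    if status is Status.ACCEPTED:
        c.final_target = c.ai_suggestion
    elif status is Status.MODIFIED:
        c.final_target = final_target
    else:
        c.final_target = ""


def termbase_ready(candidates: list[Candidate]) -> list[Candidate]:
    """Only accepted or modified terms may go into the termbase."""
    return [c for c in candidates if c.status in (Status.ACCEPTED, Status.MODIFIED)]


# Illustrative entries only.
queue = [Candidate("wind farm", "granja de viento"), Candidate("nacelle", "góndola")]
review(queue[0], Status.MODIFIED, "client materials", final_target="parque eólico")
ready = termbase_ready(queue)
```

Anything still marked pending stays out of the termbase, which makes the verification habit hard to skip by accident.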

Step 4: Enrich with Context and Forbidden Terms

A working termbase is more than equivalents. Ask the AI to expand each entry:

"For each accepted term, add the following fields: (1) part of speech in the target language, (2) grammatical gender (if applicable), (3) common collocations (3 examples), (4) forbidden synonyms or false friends to avoid, (5) plural form, (6) a one-sentence definition in the target language."

In under an hour, you now have a termbase that rivals what a senior terminologist would produce in a week.
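AI output sometimes drops a requested field without saying so, and a quick completeness check can catch that. This is a rough sketch, assuming each enriched entry has been loaded into a dictionary; the key names are hypothetical and simply mirror the fields requested in the prompt.

```python
# Hypothetical key names mirroring the fields in the enrichment prompt.
# Grammatical gender is left out because the prompt asks for it only "if applicable".
REQUIRED = ("part_of_speech", "collocations", "forbidden", "plural", "definition")


def entry_problems(entry: dict) -> list[str]:
    """List what is missing or malformed in one enriched termbase entry."""
    problems = [f"missing {name}" for name in REQUIRED if not entry.get(name)]
    collocations = entry.get("collocations") or []
    # The prompt asked for exactly three collocations per term.
    if collocations and len(collocations) != 3:
        problems.append(f"expected 3 collocations, got {len(collocations)}")
    return problems


# Illustrative entry with two deliberate gaps.
entry = {
    "part_of_speech": "noun",
    "gender": "f",
    "collocations": ["góndola del aerogenerador", "cubierta de la góndola"],
    "forbidden": ["barquilla"],
    "plural": "góndolas",
    "definition": "",
}
issues = entry_problems(entry)
```

The check is deliberately blunt: it treats an empty field as missing, so a term with no forbidden synonyms should say "none" explicitly.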

Step 5: Export to Your CAT Tool

Most CAT tools (Trados Studio, memoQ, Phrase, Smartcat) accept termbases as:

TBX (TermBase eXchange, XML format)
CSV with specific column headers
Excel with mapped fields

Have the AI format it for you:

"Export the accepted terms as a CSV with these columns, ready for memoQ import: Source | Target | Part of speech | Domain | Definition | Forbidden synonyms | Notes. Quote any field that contains a comma. Use UTF-8."

Save the output as a .csv file and import into your CAT tool.
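If you already hold the accepted terms as structured data, you can also write the CSV yourself and sanity-check any CSV the AI produced before importing it. The sketch below uses Python's standard csv module; the column names are copied from the prompt above and are an assumption about what your import mapping expects, so check them against your CAT tool's import dialog.

```python
import csv
import io

# Column names taken from the export prompt; adjust to your tool's import mapping.
COLUMNS = ["Source", "Target", "Part of speech", "Domain",
           "Definition", "Forbidden synonyms", "Notes"]


def to_csv(entries: list[dict]) -> str:
    """Serialize entries as CSV text, quoting any field that contains a comma."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, quoting=csv.QUOTE_MINIMAL)
    writer.writeheader()
    for e in entries:
        writer.writerow({col: e.get(col, "") for col in COLUMNS})
    return buf.getvalue()


def check_csv(text: str) -> list[int]:
    """Return the line numbers of rows whose column count differs from the header."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return []
    width = len(rows[0])
    return [i for i, row in enumerate(rows[1:], start=2) if row and len(row) != width]


# Illustrative entry; the Notes field contains a comma and so gets quoted.
out = to_csv([{"Source": "nacelle", "Target": "góndola",
               "Part of speech": "noun", "Notes": "feminine, plural: góndolas"}])
bad_rows = check_csv(out)
```

To save the result, open the file with encoding="utf-8" and newline="" so that accented characters and line endings survive the round trip.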

Multilingual Glossaries in One Pass

If you work into multiple targets — common for software localization or institutional clients — you can build a multilingual glossary in one prompt:

"For each of the 40 source terms above, generate equivalents in: French (France), Spanish (Latin America), Italian, German, Brazilian Portuguese, Simplified Chinese, Japanese. Format as one row per source term, with one column per language. Highlight any term where you are less than 80% confident in the equivalent and explain why."

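The "highlight uncertainty" instruction makes the output easier to triage: you know which rows to scrutinize first. The model's self-reported confidence is not a measurement, so unflagged rows still need verification.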

When the Client Has an Existing Glossary

Don't throw it away. Feed it to the AI as authoritative context:

"Below is the client's existing English-French glossary (50 terms). Below that is the source text I will translate (3,000 words). Identify: (a) any source terms not yet in the glossary that should be added, (b) any glossary terms that appear in the source so I know they're relevant, (c) any inconsistencies in the existing glossary I should raise with the client."

This gap-analysis prompt is gold for repeat clients. It surfaces missing terms, flags drift in the existing termbase, and gives you something to send the client to demonstrate diligence.
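Part (b) of that prompt, finding which glossary terms actually occur in the source, can also be cross-checked deterministically. The following is a minimal sketch, assuming plain-text input and exact, case-insensitive matching; the function names and sample text are illustrative.

```python
import re


def count_glossary_hits(glossary_terms: list[str], source_text: str) -> dict[str, int]:
    """Count case-insensitive whole-term occurrences of each glossary term."""
    counts = {}
    for term in glossary_terms:
        # Lookarounds stop "grid" from matching inside "gridlock".
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        counts[term] = len(re.findall(pattern, source_text, flags=re.IGNORECASE))
    return counts


def split_by_presence(counts: dict[str, int]) -> tuple[list[str], list[str]]:
    """Return (terms present, most frequent first) and (terms absent)."""
    present = sorted((t for t, n in counts.items() if n > 0), key=lambda t: -counts[t])
    absent = [t for t, n in counts.items() if n == 0]
    return present, absent


# Illustrative data only.
text = "The wind farm uses monopile foundations. Each wind farm feeds the grid."
counts = count_glossary_hits(["wind farm", "nacelle", "grid"], text)
present, absent = split_by_presence(counts)
```

Exact matching misses plurals and other inflected forms, so treat the absent list as a prompt for a second look, not as proof that a term is unused.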

Interpreter-Specific Glossaries

Interpreters need glossaries that look different from translators':

Phonetic guides for difficult names
Acronym expansions
Quick equivalents arranged for booth-readability, not alphabetically

"I will simultaneous-interpret EN → ES at a 4-hour conference on offshore wind farms. Here is the program. Generate a glossary of 40 likely terms with: (1) English term, (2) Spanish equivalent, (3) IPA pronunciation for any English name or acronym, (4) Spanish pronunciation note where Spanish-speaking listeners may stumble. Order by likely frequency in the agenda."

A Cautionary Tale

In 2024 a freelance translator working on a Mongolian-English medical glossary trusted ChatGPT's "veterinary equivalent" output for a complex pharmacological term — and shipped the term to a client who used it in regulatory paperwork. The term was wrong. The client lost the filing window. Verification matters. AI gives you speed; your authoritative sources give you safety. Use both.

Key Takeaways

AI extracts and proposes terminology dramatically faster than manual work — but is not authoritative.
Always verify AI suggestions against IATE, UNTERM, TERMIUM, client materials, or industry sources.
Enrich termbases with collocations, forbidden synonyms, and definitions for richer CAT-tool support.
Build interpreter glossaries differently from translator glossaries — booth-readability and pronunciation matter.
