Machine Translation, Translation Memory & Generative AI

Translators encounter three closely related but very different technologies every day: translation memory (TM), machine translation (MT), and generative AI (LLMs). Clients, project managers, and even tool vendors confuse them constantly. If you can speak about them precisely, you'll set better expectations, charge appropriately, and choose the right tool for each task.

What You'll Learn

The fundamental differences between TM, MT, and LLM-based generative AI
When to use each — and when not to
How modern CAT tools combine all three
The pricing and quality implications of each technology

Translation Memory (TM): The Database

Translation memory is, at its core, a database of segment pairs. Every time you translate "Click the green button" into "Haz clic en el botón verde", the pair is stored. The next time the same or a similar English segment appears, the TM offers the stored translation as a match.

TMs are:

Deterministic — same input, same output, every time
100% controlled by the linguist — you decide what gets stored
Project-specific — your TM for Client A doesn't pollute your work for Client B
Auditable — you can see exactly where every match came from

A 100% TM match means the segment is identical to a previously translated one and (assuming context matches) can be reused. A fuzzy match (typically 75–99%) needs editing.
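
To make the match percentages concrete, here is a minimal sketch of a TM lookup. It is not how any particular CAT tool scores matches: the `TM` dictionary, the `similarity` and `tm_lookup` helpers, and the use of Python's `difflib` ratio as a stand-in for a fuzzy-match score are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Toy TM: source segment -> stored translation (hypothetical data).
TM = {
    "Save your changes": "Guarda los cambios",
    "Open the settings menu": "Abre el menú de configuración",
}

def similarity(a: str, b: str) -> float:
    """Score 0-100 from difflib; real CAT tools use their own formulas."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()

def tm_lookup(segment: str, threshold: float = 75.0):
    """Return (score, stored_source, stored_target), or None below the threshold."""
    best_source = max(TM, key=lambda src: similarity(segment, src))
    score = similarity(segment, best_source)
    if score < threshold:
        return None
    return round(score), best_source, TM[best_source]

print(tm_lookup("Save your changes"))      # exact match: reusable as stored
print(tm_lookup("Save all your changes"))  # fuzzy match: stored translation needs editing
print(tm_lookup("xyz"))                    # nothing useful stored
```

The same input always returns the same stored translation, which is the deterministic behavior described above.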

TM is the bedrock of professional translation pricing. Discounted rates for TM matches are industry-standard.
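
As a rough illustration of how TM discounts feed into a quote, the sketch below computes a weighted word count. The discount grid, the word counts, and the per-word rate are invented for the example; actual grids vary by agency and contract.

```python
# Hypothetical discount grid: share of the full per-word rate paid per match band.
RATE_SHARE = {"100%": 0.30, "fuzzy": 0.60, "new": 1.00}

def weighted_words(counts: dict) -> float:
    """Convert raw word counts per match band into full-rate word equivalents."""
    return sum(words * RATE_SHARE[band] for band, words in counts.items())

def quote(counts: dict, full_rate: float) -> float:
    """Price of the job at the given full per-word rate, rounded to cents."""
    return round(weighted_words(counts) * full_rate, 2)

counts = {"100%": 1000, "fuzzy": 500, "new": 2000}
# 1000 * 0.30 + 500 * 0.60 + 2000 * 1.00 = 2600 weighted words
print(weighted_words(counts))
print(quote(counts, full_rate=0.10))
```

With these assumed numbers, 3,500 raw words are billed as 2,600 weighted words.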

Machine Translation (MT): The Neural Drafter

Machine translation engines (DeepL, Google Translate, Microsoft Translator, ModernMT, Amazon Translate) use neural networks trained on huge bilingual corpora to produce a draft translation of any sentence.

Modern MT is:

Probabilistic — output can vary slightly between runs and engines
Trained on third-party data — output reflects the training corpus, not your past work
Domain-agnostic by default — generic engines don't know your client's terminology unless you customize
Fast and cheap at scale

MT is excellent for: high-volume, lower-stakes content; gisting (understanding what a foreign document says); first drafts in language pairs and content types where the engine is strong (EN ↔ DE, EN ↔ ES, EN ↔ FR for general business content).

MT is poor for: creative work, low-resource languages, brand voice, legal nuance, anything where a hallucinated number or dropped negation has consequences.
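
A changed number is one of the few MT errors that a simple script can flag. The sketch below is a deliberately naive check, not a full QA tool: `numbers_in` and `number_mismatches` are hypothetical helpers that compare the digit groups found in source and target, and they will not catch numbers written out as words.

```python
import re
from collections import Counter

def numbers_in(text: str) -> Counter:
    """Count every run of digits, so '1,000' and '1.000' compare as equal."""
    return Counter(re.findall(r"\d+", text))

def number_mismatches(source: str, target: str) -> dict:
    """List digit groups that appear on one side but not the other."""
    src, tgt = numbers_in(source), numbers_in(target)
    return {
        "missing": sorted((src - tgt).elements()),
        "unexpected": sorted((tgt - src).elements()),
    }

report = number_mismatches(
    "Take 2 tablets every 12 hours.",
    "Tome 2 comprimidos cada 24 horas.",
)
print(report)
```

An empty report does not prove the translation is right; it only means the digits line up.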

Generative AI (LLMs): The Reasoning Assistant

LLMs (ChatGPT, Claude, Gemini) are different beasts. They were not built specifically to translate, but to predict text. Because their training data is massive and multilingual, they can translate — and they can do far more.

LLMs are:

Conversational — you can ask follow-up questions
Reasoning — you can ask why a translation choice was made
Customizable on the fly — paste a glossary, a style guide, and a register instruction in the prompt
Bad at consistency at scale — they will translate the same term two different ways across a long document if not pinned down
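
"Customizable on the fly" in practice means assembling the glossary, style guide, and register into the prompt itself. The sketch below only builds the prompt text; `build_prompt` and its wording are assumptions for illustration, and sending the prompt to a model is left out on purpose.

```python
def build_prompt(segment, glossary, style_notes, register, target_lang="Spanish"):
    """Assemble one plain-text prompt from the pieces a linguist would paste in."""
    lines = [
        f"Translate the text below into {target_lang}.",
        f"Register: {register}.",
        "Use these glossary terms exactly:",
    ]
    lines += [f"- {src} => {tgt}" for src, tgt in glossary.items()]
    lines.append("Style guide:")
    lines += [f"- {note}" for note in style_notes]
    lines += ["Text:", segment]
    return "\n".join(lines)

prompt = build_prompt(
    "Please pay the invoice within 30 days.",
    glossary={"invoice": "factura"},
    style_notes=["Use 'usted', not 'tú'."],
    register="formal",
)
print(prompt)
```

Reusing the same glossary block in every request is one way to pin terms down, though it does not guarantee the model will follow it.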

An LLM is not a faster MT engine. It is a junior linguist you can converse with. That distinction changes how you use it.

The Quality Comparison at a Glance

Task	TM	MT	LLM
Repeat sentence from past project	Perfect	Good	Decent
Brand-new sentence in EN→ES	Empty	Good	Good
Brand-new sentence in EN→Swahili	Empty	Mediocre	Mediocre
Explain a tricky source term	No	No	Excellent
Adapt tone for a new audience	No	No	Excellent
Build a domain glossary from scratch	No	No	Excellent
Guarantee consistency across 10,000 segments	Yes	Partial	No (unless tightly prompted)
Translate a marketing slogan	No	Poor	Good
Catch a dropped negation	Maybe	No	Yes, if asked

How CAT Tools Are Combining All Three

Modern CAT tools such as Trados Studio 2024+, memoQ 11+, Phrase TMS, Smartcat, and MateCat now wire all three together inside the editor:

TM lookup first. If a 100% match exists, use it.
Fuzzy match second. If a 75–99% match exists, propose it for editing.
MT third. If no useful TM match, fetch an MT draft from DeepL, Google, or a custom engine.
LLM polish. Some tools now route the MT draft through an LLM with the project's glossary and style guide for in-place refinement before showing it to you.
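
The four steps above can be sketched as a simple cascade. This is an illustration under stated assumptions, not any vendor's implementation: `best_fuzzy`, `suggest`, and the stand-in `fake_mt` function are hypothetical, and `difflib` replaces a real fuzzy-match algorithm.

```python
from difflib import SequenceMatcher

def best_fuzzy(segment: str, tm: dict):
    """Return (score 0-100, stored target) for the closest TM source."""
    best = (0.0, "")
    for src, tgt in tm.items():
        score = 100 * SequenceMatcher(None, segment, src).ratio()
        if score > best[0]:
            best = (score, tgt)
    return best

def suggest(segment: str, tm: dict, mt, polish=None):
    """Return (origin, suggestion) so the linguist can see what produced it."""
    if segment in tm:                      # 1. exact TM match
        return "tm-exact", tm[segment]
    score, target = best_fuzzy(segment, tm)
    if score >= 75:                        # 2. fuzzy TM match, to be edited
        return "tm-fuzzy", target
    draft = mt(segment)                    # 3. MT draft
    if polish is not None:                 # 4. optional LLM polish
        return "mt+llm", polish(draft)
    return "mt", draft

def fake_mt(text: str) -> str:
    """Stand-in for a real MT engine call."""
    return "[MT] " + text

tm = {"Save your changes": "Guarda los cambios"}
print(suggest("Save your changes", tm, fake_mt))
print(suggest("Save all your changes", tm, fake_mt))
print(suggest("Restart the device now", tm, fake_mt))
```

Returning the origin alongside the suggestion matters because each source calls for a different level of scrutiny.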

For you as the linguist, this means:

Pricing models are evolving. Many agencies now charge a single "post-edit per word" rate that bundles MT + LLM, with TM matches still discounted.
Quality control matters more than ever. The fluent surface of LLM-polished MT can hide errors that crude MT would have made obvious.
You need a verification habit (which we cover in lesson 4).
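
One small piece of a verification habit can be automated: checking that glossary terms were actually used. The sketch below is a simplified example with a hypothetical `glossary_violations` helper; it does plain substring matching, so inflected forms and plurals may produce false alarms or slip through.

```python
def glossary_violations(pairs, glossary):
    """Flag segments whose source contains a term but whose target lacks its translation."""
    issues = []
    for number, (source, target) in enumerate(pairs, start=1):
        for term, required in glossary.items():
            if term.lower() in source.lower() and required.lower() not in target.lower():
                issues.append((number, term, required))
    return issues

pairs = [
    ("Pay the invoice today.", "Pague la factura hoy."),
    ("Send the invoice by email.", "Envíe el recibo por correo electrónico."),
]
print(glossary_violations(pairs, {"invoice": "factura"}))
```

Here the second segment is flagged because "invoice" was rendered with a different term than the glossary requires.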

When to Use What

Use only TM and pure human translation when:

The translation is sworn, certified, or legally binding
The text is highly creative (literary, transcreation, brand voice)
The client has explicitly forbidden MT (very common in legal, medical, and defense work)

Use TM + MT post-edit when:

The volume is high
Stakes are moderate (internal documentation, support articles, product descriptions)
You have a good MT engine for the pair

Use TM + MT + LLM polish when:

You're a freelance generalist working on varied content
You want to apply a style guide and glossary at scale
You can verify outputs efficiently

Use LLM only (no TM, no MT) when:

You're prepping for an interpreting assignment
You're drafting client emails
You're researching terminology
You're explaining a source-text passage to yourself
You're doing a one-off QA pass on a finished translation

A Note on Confidentiality

TMs live on your machine or your client's server. MT engines and LLMs send your text to a vendor. Before pasting any client content into ChatGPT, Claude, or Gemini:

Check your NDA and the client's data-handling clauses
Prefer enterprise versions (ChatGPT Enterprise, Claude for Work, Gemini for Workspace) that contractually do not train on your inputs
For highly sensitive work, use on-prem or local LLMs (Llama 3, Mistral) — or just don't use AI at all

Key Takeaways

TM is a deterministic database; MT is a neural drafter; LLMs are conversational reasoning assistants.
Modern CAT tools combine all three — you need to understand which one is responsible for which behavior.
LLMs are great for terminology, style, and reasoning, but poor for cross-document consistency without strong prompting.
Confidentiality matters: don't paste sensitive client content into consumer AI tools.
