AI-Powered Translation Quality Assurance
Quality assurance is where AI gives translators the most immediate, measurable win. A 4,000-word translation that used to take 45 minutes of QA can now take 10 minutes — and catch more errors. The trick is knowing what AI is good at flagging and what still requires your eye.
What You'll Learn
- Which QA checks AI handles well (and which it doesn't)
- A reusable prompt template for end-of-project QA
- How to use AI for consistency, numbers, dates, names, and forbidden terms
- The right verification habit so you don't trust AI blindly
The Five QA Categories
Every professional translation QA pass covers five categories:
- Accuracy — Did the translation capture all source meaning, no additions, no omissions?
- Consistency — Are recurring terms, brand names, and product names translated the same way every time?
- Numeric and locale data — Are numbers, dates, currencies, units, and formats correct for the target locale?
- Compliance — Are forbidden terms avoided? Are required disclaimers present?
- Surface quality — Spelling, grammar, punctuation, spacing, tags.
AI is excellent at consistency, numeric and locale data, and compliance. It is decent at surface quality and only a helper for accuracy. Accuracy of meaning is still primarily your job, because you understand the source.
A Reusable AI QA Prompt
Paste this into Claude or ChatGPT with your source and target. Substitute the bracketed parts.
You are a senior bilingual editor reviewing a translation from [SOURCE LANGUAGE] to [TARGET LANGUAGE] for a [DOMAIN] client. Below you have the source segments on the left and my translation on the right.
Run the following QA checks and produce a numbered list of issues, each with: severity (critical / major / minor), segment number, and a suggested correction.
- Consistency: any term that is translated differently across segments?
- Numbers: any digit, percentage, or amount that differs between source and target?
- Dates: any date format that does not match [TARGET LOCALE] conventions?
- Currencies and units: correctly converted and formatted?
- Named entities: people, companies, products spelled identically to source?
- Forbidden terms: any of these terms appear? [LIST]
- Omissions and additions: any sentence in source missing from target, or any target sentence with no source equivalent?
- Register: any segment in the wrong formality level for [AUDIENCE]?
Do not rewrite the translation. Only report issues. If no issues exist in a category, write "No issues." Do not invent issues.
The final two sentences are critical. Without them, the AI will helpfully invent problems to seem useful.
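If you run this prompt on every project, filling the brackets by hand gets error-prone. The sketch below shows one way to fill a shortened version of the prompt programmatically. The names `QA_PROMPT` and `build_qa_prompt` and the "source | target" row layout are assumptions made for illustration, not part of any tool.

```python
from string import Template

# Hypothetical, shortened version of the QA prompt above.
QA_PROMPT = Template(
    "You are a senior bilingual editor reviewing a translation from "
    "$source_lang to $target_lang for a $domain client.\n"
    "Check date formats against $target_locale conventions and register "
    "for $audience.\n"
    "Forbidden terms: $forbidden\n"
    "Do not rewrite the translation. Only report issues. If no issues exist "
    'in a category, write "No issues." Do not invent issues.\n\n'
    "$segments"
)


def build_qa_prompt(pairs, *, source_lang, target_lang, domain,
                    target_locale, audience, forbidden):
    """Fill the template. `pairs` is a list of (source, target) strings."""
    # Number the segments so the model can cite them in its report.
    rows = [f"{i}. {src} | {tgt}" for i, (src, tgt) in enumerate(pairs, start=1)]
    return QA_PROMPT.substitute(
        source_lang=source_lang,
        target_lang=target_lang,
        domain=domain,
        target_locale=target_locale,
        audience=audience,
        forbidden=", ".join(forbidden) or "(none)",
        segments="\n".join(rows),
    )


prompt = build_qa_prompt(
    [("Hello", "Bonjour")],
    source_lang="English", target_lang="French", domain="legal",
    target_locale="fr-FR", audience="consumers", forbidden=[],
)
print(prompt)
```

`Template.substitute` raises a `KeyError` when a placeholder is left unfilled, so a forgotten bracket fails loudly instead of reaching the AI as literal text.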
Consistency Checks
Inconsistency is one of the most common defects in mid- and large-volume translation. The same English term, say "user", gets translated three different ways across a 30,000-word manual, sometimes drifting into the equivalents of "customer", "client", or "account holder". Traditional QA tools (Xbench, Verifika, the QA modules in Trados and memoQ) check terminology mainly against your termbase. AI can also catch inconsistencies in terms that are not in your termbase.
Prompt template:
Below are 40 segments of a translation EN → FR. Identify any English term that has been translated with more than one French equivalent across the segments. List each term and all the French variants used, with segment numbers. Recommend which French variant should be standardized on.
Run this once per project. You'll be surprised what surfaces.
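Before acting on the AI's list, it is worth confirming that each reported variant really appears in the segment it cites, since segment numbers can be misreported. The following is a minimal sketch of that check. The `report` structure is a hypothetical format you would transcribe from the AI's answer, and the French sample segments are invented for illustration.

```python
def verify_variant_report(targets, report):
    """Return the (term, variant, segment_no) claims that do not check out.

    targets: dict mapping segment number -> target text.
    report:  dict mapping source term -> {variant: [segment numbers]}
             (hypothetical shape, transcribed from the AI's answer).
    """
    unconfirmed = []
    for term, variants in report.items():
        for variant, seg_numbers in variants.items():
            for seg_no in seg_numbers:
                text = targets.get(seg_no, "")
                # Case-insensitive substring match. Inflected forms of the
                # variant will not match and need a manual look.
                if variant.casefold() not in text.casefold():
                    unconfirmed.append((term, variant, seg_no))
    return unconfirmed


targets = {
    1: "L'utilisateur ouvre le menu.",
    2: "Le client ferme la session.",
    3: "L'utilisateur confirme.",
}
report = {"user": {"utilisateur": [1, 3], "client": [2, 3]}}
print(verify_variant_report(targets, report))  # [('user', 'client', 3)]
```

Anything this returns is a claim to re-check by hand, not proof that the AI was wrong.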
Numeric and Locale QA
Numbers, dates, and units are where translators silently introduce damaging errors. AI catches them faster than the human eye.
"Check the following bilingual table. For each row, confirm: the numeric value in the target matches the source; the date format follows DD/MM/YYYY for fr-FR; currency symbol position follows fr-FR convention (1 234,56 €); units are converted only if the source asked for conversion. Flag anything wrong."
For locale-specific conventions, point the AI at a reference: "Follow the Microsoft Style Guide for fr-FR for number, date, and currency formatting."
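Because an LLM can occasionally misread or transpose digits, a deterministic cross-check is a useful companion to the AI pass. The sketch below extracts numbers from an English source and a French target, normalizes separators, and compares the two sets. The function names and default separator settings are assumptions for an EN → fr-FR project. The check does not validate date order, unit conversion, or leading zeros.

```python
import re
from collections import Counter


def extract_numbers(text, group_seps, decimal_sep):
    """Count the numbers in `text`, normalized to strings like '1234.56'."""
    groups = "[" + re.escape(group_seps) + "]"
    dec = re.escape(decimal_sep)
    pattern = re.compile(
        rf"\d{{1,3}}(?:{groups}\d{{3}})+(?:{dec}\d+)?"  # grouped: 1,234.56
        rf"|\d+(?:{dec}\d+)?"                            # ungrouped: 2024, 12.5
    )
    found = Counter()
    for match in pattern.finditer(text):
        raw = match.group()
        for sep in group_seps:
            raw = raw.replace(sep, "")
        found[raw.replace(decimal_sep, ".")] += 1
    return found


def number_mismatches(source, target,
                      src_fmt=(",", "."),
                      tgt_fmt=(" \u00a0\u202f", ",")):
    """Compare numbers in an en source and an fr-FR target (assumed defaults).

    Treating a plain space as a group separator can merge two adjacent
    numbers such as "5 123", so review those hits manually.
    """
    src = extract_numbers(source, *src_fmt)
    tgt = extract_numbers(target, *tgt_fmt)
    # Counter subtraction keeps only the numbers left over on each side.
    return {
        "missing_in_target": sorted((src - tgt).elements()),
        "extra_in_target": sorted((tgt - src).elements()),
    }


source = "Revenue rose 12.5% to $1,234.56 in 2024."
target = "Le chiffre d'affaires a atteint 1\u00a0243,56 € en 2024, soit +12,5 %."
print(number_mismatches(source, target))
# {'missing_in_target': ['1234.56'], 'extra_in_target': ['1243.56']}
```

A clean result only means the digits line up. Formatting and locale conventions still need the AI pass or your own eye.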
Forbidden Terms and Brand Voice
Many clients maintain a "do-not-use" list — outdated product names, banned competitor names, trademark issues, or culturally inappropriate words. Feed that list into the AI as part of your QA prompt:
"Scan the target text for any occurrence of these forbidden terms or their close variants: [list]. Report each occurrence with segment number."
This is a 30-second pass that used to take 10 minutes of manual searching, one term at a time.
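Exact matches on a do-not-use list can also be confirmed without AI, which leaves the model to handle the "close variants" part. Here is a minimal sketch of a whole-word, case-insensitive scan. The function name and the product name "SuperSync" are hypothetical.

```python
import re


def find_forbidden(targets, forbidden_terms):
    """Return (segment_no, term, matched_text) for every exact hit.

    targets: dict mapping segment number -> target text.
    """
    hits = []
    for seg_no, text in sorted(targets.items()):
        for term in forbidden_terms:
            # Lookarounds enforce whole-word matches even when the term
            # itself starts or ends with punctuation.
            pattern = re.compile(
                rf"(?<!\w){re.escape(term)}(?!\w)", re.IGNORECASE
            )
            for match in pattern.finditer(text):
                hits.append((seg_no, term, match.group()))
    return hits


targets = {
    1: "Ouvrez SuperSync Pro pour commencer.",
    2: "Le mode supersync est activé.",
    3: "Synchronisez vos fichiers.",
}
print(find_forbidden(targets, ["SuperSync"]))
# [(1, 'SuperSync', 'SuperSync'), (2, 'SuperSync', 'supersync')]
```

This only finds the listed spellings. Inflected forms, misspellings, and paraphrases are where the AI scan earns its place.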
Catching Omissions and Additions
A senior reviewer's hardest task is the line-by-line read for things that are missing in the target. AI helps:
"Compare these two columns. Identify any source segment whose meaning is partially or fully missing from the corresponding target segment. Identify any target segment that contains information not present in the source."
LLMs are good at catching dropped clauses, dropped negations ("not", "no", "without" silently lost), and accidental additions. Always still do a human spot-check on critical content, but use the AI as a first sweep.
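One cheap way to decide where to spot-check is to look for segments whose target is unusually short or long relative to the source. The sketch below is a rough heuristic under stated assumptions: the function name and the 40% tolerance are illustrative choices, and a flagged segment is only a candidate for review, not a confirmed omission.

```python
from statistics import median


def flag_length_outliers(pairs, tolerance=0.4):
    """Return 1-based segment numbers with an unusual target/source ratio.

    pairs: list of (source, target) strings.
    tolerance: assumed threshold; 0.4 flags ratios more than 40% away
               from the median ratio of the batch.
    """
    ratios = {
        i: len(tgt) / len(src)
        for i, (src, tgt) in enumerate(pairs, start=1)
        if src  # skip empty source segments to avoid dividing by zero
    }
    if not ratios:
        return []
    typical = median(ratios.values())
    return [i for i, r in ratios.items() if abs(r - typical) > tolerance * typical]


pairs = [
    ("Press the button to start.", "Appuyez sur le bouton pour démarrer."),
    ("Do not remove the cover while the unit is running.", "Ne retirez pas le capot."),
    ("Save your changes.", "Enregistrez vos modifications."),
]
print(flag_length_outliers(pairs))  # [2]
```

Segment 2 is flagged because the clause "while the unit is running" was dropped. A lost negation changes the length very little, so this heuristic will not catch it.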
Tone and Register Checks
For marketing, customer support, UX copy, and brand voice work, ask the AI:
"Read the target text. Rate it on a 1–5 scale for: (a) friendliness, (b) formality, (c) clarity. The client wants 4/4/5. Identify any segments that score low and suggest rewrites."
What AI Will Not Catch
Be honest about the limits. AI QA struggles with:
- Cultural appropriateness. AI may not know that a phrase that is fine in pan-Hispanic Spanish is offensive in Argentina.
- Domain-specific accuracy. AI does not know your client's internal terminology unless you supply it, for example as a glossary in the prompt.
- Subtle register shifts within long passages of literary or creative work.
- Source-text errors. If the source has an error, AI may silently "fix" it in its suggested corrections instead of telling you the source is wrong.
For sworn, certified, medical, or legal work, AI QA is a helper — never a replacement for a qualified human reviewer.
A Sample QA Workflow
- Finish your translation in your CAT tool.
- Export bilingual review file (CSV, XLIFF, or a side-by-side table).
- Paste 50–100 segments at a time into Claude (long context) or ChatGPT.
- Run the QA prompt above.
- Triage flagged issues by severity. Fix critical and major. Decide on minor.
- Run the consistency-only prompt across the whole document.
- Do a final 10-minute human read of the first 200 and last 200 words. These are where errors hide.
This routine catches roughly 80–90% of objectively findable defects in a fraction of the time pure-human QA takes.
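The export, batching, and triage steps of this workflow can be sketched in a few lines. The code assumes a two-column CSV export (source, target) with no header row and an issue list you have transcribed into dictionaries. Both shapes are assumptions for illustration, and the default batch size of 75 is simply a midpoint of the 50–100 range.

```python
import csv
import io

SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


def read_pairs(csv_file):
    """Read (source, target) rows from a two-column bilingual CSV file object."""
    return [(row[0], row[1]) for row in csv.reader(csv_file) if len(row) >= 2]


def batches(pairs, size=75):
    """Yield (first_segment_number, batch) so numbering stays global."""
    for start in range(0, len(pairs), size):
        yield start + 1, pairs[start:start + size]


def triage(issues):
    """Sort issue dicts (assumed keys: severity, segment, note) critical-first."""
    return sorted(
        issues,
        key=lambda issue: (SEVERITY_ORDER.get(issue["severity"], 3), issue["segment"]),
    )


# In practice you would open the exported file instead of using StringIO.
export = io.StringIO("Hello,Bonjour\nBye,Au revoir\nYes,Oui\n")
pairs = read_pairs(export)
for first_no, batch in batches(pairs, size=2):
    print(first_no, batch)

issues = [
    {"severity": "minor", "segment": 1, "note": "double space"},
    {"severity": "critical", "segment": 3, "note": "number mismatch"},
    {"severity": "major", "segment": 2, "note": "term inconsistency"},
]
print([issue["segment"] for issue in triage(issues)])  # [3, 2, 1]
```

Carrying the first segment number with each batch keeps the AI's segment references aligned with your CAT tool when you paste batches one after another.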
Key Takeaways
- AI excels at consistency, numerics, and forbidden-terms QA, and makes a strong first sweep for omissions. In these areas it is faster than the human eye.
- A reusable QA prompt with explicit categories, severity levels, and a "do not invent issues" instruction beats ad-hoc requests.
- Always provide locale references (Microsoft Style Guide, client style guide) so AI knows the rules.
- AI QA is a first sweep, not a final sign-off. Critical content still needs a human last look.

