Preparing Close Data Safely: Redaction, Anonymization, and Tenant Rules
Before any close data goes near an AI tool, you need a simple, repeatable preparation step. Get this wrong and you will leak customer names, salaries, or unreleased earnings into a system you do not control. Get it right and AI becomes safe to use for the rest of the course.
This lesson is not legal advice. It is the practical, defensible workflow that finance teams in regulated US and UK environments use today.
What You'll Learn
- The four data risk categories in a typical close pack
- A 60-second redaction workflow for spreadsheets and PDFs
- The difference between consumer, Business/Team, and Enterprise AI tenants
- When you can safely paste real numbers and when you cannot
The Four Data Risk Categories
When you look at the documents and files that flow through your close, they fall into four buckets.
Category 1 — Public or near-public. Last quarter's published 10-Q, your audited financial statements, your filed UK statutory accounts. These are already public. You can paste freely into any tool.
Category 2 — Internal but non-sensitive. Management charts of accounts (without account numbers), your close checklist template, your policy memos. Generally safe in Business or Team tier AI tools. Avoid consumer tiers as a rule.
Category 3 — Sensitive financial data. Unreleased earnings figures, draft variances, draft commentary, internal forecasts, M&A-related figures. These can move stock prices or competitive position. Only paste into a Business, Team, or Enterprise tier with training-exclusion confirmed, or redact first.
Category 4 — Personal or restricted data. Employee names with salaries, customer-level revenue, supplier-level spend tied to negotiation positions, anything covered by GDPR or HIPAA. Default to redaction even on Enterprise tiers.
The instinct most finance professionals have ("I'll just paste this trial balance into ChatGPT to ask a quick question") is exactly the one to resist. Spend 30 seconds classifying the data first.
The 60-Second Redaction Workflow
You do not need redaction software. You need a discipline. Here is the workflow that works for 95 percent of close tasks.
Step 1 — Copy to a scratch sheet. Never edit the source file. Copy the relevant data into a new tab labeled "AI scratch".
Step 2 — Replace names with codes. Customer A, Customer B, Customer C. Employee 1, Employee 2. Supplier ALPHA, Supplier BETA. This takes 30 seconds with Find and Replace.
Step 3 — Round or rebase numbers if needed. For genuinely sensitive figures (unreleased earnings, M&A), rebase to a percentage of total: "Revenue is 100. Cost of goods sold is 62. Gross profit is 38." The proportions are still informative. The absolute numbers never leave your firewall.
Step 4 — Strip metadata from PDFs. When you upload a PDF, save it as a flattened copy first. Open in your PDF tool, choose "Print to PDF" or export, and remove author metadata. This avoids leaking your colleague's name and your firm's internal file paths.
Step 5 — Read once before you hit send. The single most important step. Read the prompt as if it were a tweet you were about to publish. If it contains anything you would not want screenshotted, redact more.
For 80 percent of close work, Steps 1 and 2 plus the Step 5 read-through are enough. For board-level numbers and pre-earnings data, do all five steps.
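If Find and Replace becomes tedious across many tabs, Step 2 can be scripted. The following is a minimal Python sketch, not a vetted redaction tool. The function names `letter_code` and `code_names` are hypothetical, and it assumes you supply the list of real names yourself. It matches names case-insensitively and tries longer names first, so a short name never overwrites part of a longer one.

```python
import re
from string import ascii_uppercase


def letter_code(index):
    """Spreadsheet-style letters: 0 -> 'A', 25 -> 'Z', 26 -> 'AA'."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = ascii_uppercase[remainder] + letters
    return letters


def code_names(text, names, label="Customer"):
    """Replace each real name in `text` with a neutral code.

    Returns the redacted text and the name-to-code key. The key is
    what lets you map the AI's answer back to real names, so it stays
    in the scratch workbook and is never pasted into a prompt.
    """
    if not names:
        return text, {}
    key = {name: f"{label} {letter_code(i)}" for i, name in enumerate(names)}
    lookup = {name.lower(): code for name, code in key.items()}
    # Longest alternatives first so the regex prefers the fullest match.
    alternatives = sorted(key, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in alternatives), re.IGNORECASE)
    redacted = pattern.sub(lambda m: lookup[m.group(0).lower()], text)
    return redacted, key


note = "Acme Industries paid late; ACME INDUSTRIES and Globex Ltd are up for renewal."
redacted, key = code_names(note, ["Acme Industries", "Globex Ltd"])
print(redacted)
# Customer A paid late; Customer A and Customer B are up for renewal.
```

A script like this only catches the names you list. Misspellings, abbreviations, and ticker symbols will slip through, which is why the Step 5 read-through still matters.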
Consumer, Business, and Enterprise Tenants
The biggest factor in how much you need to redact is which AI tier you are using.
Consumer tiers (ChatGPT Free and Plus, Claude Free and Pro). These are personal accounts. Even where the provider states data is not used for training, you have no organisational audit trail and no centralised IT control. Treat as untrusted for any sensitive data. Use for Category 1 only.
Business and Team tiers (ChatGPT Business, Claude Team). These exclude your prompts from model training by default and add SSO, admin controls, and basic audit logs. ChatGPT Business sits around $20 to $25 per seat per month on annual billing and Claude Team sits around $25 to $30 per seat per month. These are the right baseline for finance teams. Safe for Categories 1, 2, and most of 3.
Enterprise tiers (ChatGPT Enterprise, Claude Enterprise, Microsoft 365 Copilot Enterprise). Add data residency, full audit logs, longer retention controls, and contractual indemnities. Required for regulated industries and the default for FTSE 250 / S&P 500 finance functions. Safe for all categories with normal redaction discipline.
Microsoft 365 Copilot inside your tenant. When you use Copilot inside your own Microsoft 365 tenant on a workbook stored in your OneDrive or SharePoint, the data does not leave your tenant for training. This is the lowest-friction option for in-Excel work because the redaction step largely disappears — the data was already in your Microsoft tenant.
Confirm tier classifications with your IT and security team. Vendor terms and pricing change often, so what is true today may not hold next quarter. The principle stays the same: know your tenant before you paste.
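The category and tier rules in this lesson can be written down as a small lookup table, which some teams find easier to review with IT than prose. The Python sketch below is one reading of those rules, not a policy. The names `Tier`, `RULES`, and `handling` are hypothetical. Blocking Category 4 on Business and Team tiers is an assumption drawn from "Safe for Categories 1, 2, and most of 3", and Copilot inside your own tenant is not modelled.

```python
from enum import Enum


class Tier(Enum):
    CONSUMER = "consumer"
    BUSINESS = "business_or_team"
    ENTERPRISE = "enterprise"


PASTE = "paste"
REDACT = "redact first"
BLOCK = "do not use this tier"

# Rows are tiers, columns are the four data risk categories.
RULES = {
    Tier.CONSUMER:   {1: PASTE, 2: BLOCK, 3: BLOCK, 4: BLOCK},
    Tier.BUSINESS:   {1: PASTE, 2: PASTE, 3: PASTE, 4: BLOCK},   # Category 4 block is an assumption
    Tier.ENTERPRISE: {1: PASTE, 2: PASTE, 3: PASTE, 4: REDACT},
}


def handling(category, tier, training_exclusion_confirmed=False):
    """Return the suggested handling for a data category on a given tier."""
    action = RULES[tier][category]
    # Category 3 goes in unredacted only once training exclusion is confirmed.
    if category == 3 and action == PASTE and not training_exclusion_confirmed:
        return REDACT
    return action


print(handling(3, Tier.BUSINESS))                                     # redact first
print(handling(3, Tier.BUSINESS, training_exclusion_confirmed=True))  # paste
print(handling(4, Tier.ENTERPRISE))                                   # redact first
```

The table does not capture the judgment behind "most of 3". Treat its output as a starting point for the conversation with your security team.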
A Worked Redaction Example
You want to ask AI to draft commentary on a sensitive customer concentration metric. Original data:
"Customer Acme Industries: revenue $2,341,508 (prior month $1,890,221, prior year $1,420,000). Customer is in oil and gas. Contract renewal December 31. Risk: contract not yet signed for renewal."
After 60-second redaction:
"Customer A (oil and gas sector, large enterprise): current month revenue 100, prior month 81, prior year 61. Renewal date is end of Q4. Renewal contract not yet signed."
The AI can still draft useful commentary about year-over-year growth, sector concentration, and renewal risk. Acme's name and absolute revenue figure never left your firewall.
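The rebasing in this example is simple arithmetic: divide every figure by the current-month figure and multiply by 100. Here is a minimal Python sketch, assuming a hypothetical helper named `rebase` and rounding to whole numbers.

```python
def rebase(figures, base_key):
    """Express every figure as a whole-number index where the base equals 100."""
    base = figures[base_key]
    if base == 0:
        raise ValueError("cannot rebase against a zero base figure")
    return {label: round(value / base * 100) for label, value in figures.items()}


acme = {
    "current_month": 2_341_508,
    "prior_month": 1_890_221,
    "prior_year": 1_420_000,
}
print(rebase(acme, "current_month"))
# {'current_month': 100, 'prior_month': 81, 'prior_year': 61}
```

Rounding to whole numbers is deliberate. A rebased figure with several decimal places lets anyone who knows one absolute number back out the rest more precisely.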
When You Cannot Use AI at All
Be honest about the cases where AI is not the answer this month.
- Hard pre-announcement window. Two days before earnings release: no AI on draft numbers, even on Enterprise tiers. The risk of an inadvertent disclosure outweighs the time saving.
- Litigation hold or investigation. Anything subject to legal hold should not flow through external tools without legal sign-off.
- Healthcare PHI without HIPAA-compliant tooling. Use only providers with a signed BAA.
- EU personal data without a clear lawful basis. Default to redaction and consult Data Protection.
This is the unglamorous part of the course, but it is the part that lets you defend your use of AI to your audit committee. Skip it and your AI program will be shut down after the first incident.
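If your team wraps AI access in an internal script or form, the hard stops above can be checked before anything is sent. The Python sketch below is an illustration only. The function name `hard_stop_reasons` and its parameters are hypothetical, the two-day window is taken from the pre-announcement rule above, and blocking through the release date itself is an assumption. The EU personal data case is left out because the rule there is redaction and consultation, not a hard stop.

```python
from datetime import date, timedelta


def hard_stop_reasons(today, earnings_release, *, blackout_days=2,
                      legal_hold=False, legal_signoff=False,
                      contains_phi=False, baa_signed=False):
    """Return the reasons AI should not be used at all; an empty list means no hard stop."""
    reasons = []
    window_start = earnings_release - timedelta(days=blackout_days)
    # Assumption: the window runs through the release date itself.
    if window_start <= today <= earnings_release:
        reasons.append("pre-announcement window")
    if legal_hold and not legal_signoff:
        reasons.append("legal hold without legal sign-off")
    if contains_phi and not baa_signed:
        reasons.append("PHI without a signed BAA")
    return reasons


print(hard_stop_reasons(date(2025, 1, 28), date(2025, 1, 30)))
# ['pre-announcement window']
print(hard_stop_reasons(date(2025, 1, 20), date(2025, 1, 30)))
# []
```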
Building a Reusable Redaction Snippet
Save this snippet inside your prompt library:
"Before I paste any data, I confirm: (1) Source is my AI scratch tab, not the live workbook. (2) Customer, employee, and supplier names are coded. (3) For pre-release earnings data, figures are rebased to ratios. (4) PDF metadata is stripped. (5) I have read the prompt once as if it were public."
Pin it to the top of your prompt library. Run it through your head every single time. It will become muscle memory in two weeks.
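Checklist items (2) and (3) can be partly backed up by a quick automated scan of the prompt text. The Python sketch below is a rough aid, not a substitute for reading the prompt yourself. The name `preflight` and the `RAW_FIGURE` pattern are hypothetical, and the pattern only recognises dollar amounts and comma-grouped numbers.

```python
import re

# Dollar amounts ("$2,341,508", "$950") or comma-grouped numbers ("1,890,221").
RAW_FIGURE = re.compile(
    r"\$\s?\d+(?:,\d{3})*(?:\.\d+)?|\b\d{1,3}(?:,\d{3})+(?:\.\d+)?\b"
)


def preflight(prompt, real_names):
    """Return warnings for real names or absolute figures still in the prompt."""
    warnings = []
    lowered = prompt.lower()
    for name in real_names:
        if name.lower() in lowered:
            warnings.append(f"real name still present: {name}")
    for figure in RAW_FIGURE.findall(prompt):
        warnings.append(f"possible absolute figure: {figure}")
    return warnings


draft = "Customer A (oil and gas sector): current month revenue 100, prior month 81, prior year 61."
print(preflight(draft, ["Acme Industries"]))
# []
```

An empty list means only that these two checks found nothing. Scratch-tab sourcing, PDF metadata, and the final read-through still need a human.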
Key Takeaways
- Classify every close file into one of four risk categories before you touch AI
- Use a 60-second redaction workflow: scratch tab, code names, rebase numbers, strip metadata, read once
- Match data sensitivity to AI tenant tier — consumer tiers for public data only
- Microsoft 365 Copilot inside your own tenant has the lowest friction for in-Excel work
- Hard no on AI during pre-announcement windows, legal holds, and PHI without HIPAA tooling

