What this is
Auditor scans a case folder for problems before files leave the office. It is a sister tool to Filer: it uses the same canonical reference rules, and runs independently or wired into Filer's Compile pre-flight (v0.8).
Workflow
- Case Details: set the case's identifiers (case ref, client ref, HO refs, client name). Import case summary populates everything from a gmiau-case-summary/1 JSON, or type the refs by hand.
- Scan: drop the case folder (or specific files). Tick Scan PDF text so Auditor can search document bodies, not just filenames. Hit Run audit.
- Report: review the findings. The headline question is per-file: does this document belong in this client's folder, yes or no?
The checks (v0.7)
- C1 Files that belong to this case: for each file, Auditor searches the filename + the body text for any of this case's known identifiers (case ref, client ref with context cue, HO refs, client name, client email). First hit wins: once a file's ownership is confirmed, Auditor stops scanning it. Files where no identifier appears anywhere are flagged for review (they may be misfiled). For emails saved to PDF, a client email match in the subject / From / To line is enough.
- C2 Third-party recipient allow-list: for each PDF named Letter to <X> or Email to <X>, Auditor extracts the recipient and checks it against gmiau-specs/parties.yaml (canonical names + aliases), then the Case Details ad-hoc parties textarea. Unknown → WARN. CCL / OAL client letters are exempted; their first-page salutation is cross-checked against the case summary's client name (mismatch → WARN).
- C3 Three-monthly client-letter gap: finds client-facing letters by filename keyword (CCL / OAL / Opening Advice Letter / Client Care Letter), takes the newest date (filename YYMMDD or YYYY-MM-DD → PDF /CreationDate → file mtime) and flags WARN at 91+ days old, INFO at 81-90 days, INFO if no client letter is present. Thresholds + keywords are tunable in Config.
- C4 Missing key documents: per funding flag (LH / CLR / CLR2 / PFC / …), Auditor checks that the expected key documents are present. v1.1 of auditor-keydocs.yaml extends beyond CCL/OAL: identity verification, eligibility/means evidence, and CW4 merits forms on CLR matters.
- C5-C8 (stubs): regulator-aligned checks scaffolded per gmiau-specs/AUDITOR-REGULATORY-MAP.md: C5 CW4 merits assessment, C6 closure letter, C7 retention/destruction, C8 attendance-note presence. Each emits a pending row in the Report until the matching v0.x lands.
The tabs
Case Details
The case's identifiers: case ref, client ref, HO umbrella refs, client name. Anything that positively identifies a document as belonging to this case. Import a Case Summary JSON to populate.
Scan
Drop zone, file list, Run audit button.
Report
Findings from the last audit run, per check.
Config
Per-caseworker thresholds + keywords (Phase 0.5+).
Settings
App appearance, font, tab visibility, embedded spec.
Confidentiality: this tool runs entirely in your browser; nothing is uploaded. The files you load, the case references and the client names are confidential, so keep audit reports on this machine.
Scan
Drop a case folder, or individual files. Filenames are always scanned; tick Scan PDF text to scan body content too (uses pdf.js, so it is slower).
Diagnostics: run a smoke test against the shared check library
Proves @@AUDITOR_CHECKS@@ resolved and the loader registered window.gmiauAuditor. Useful after a rebuild.
Case Details
The case's identifiers. C1 checks each scanned file for at least one of these: case ref, client ref (digits with a cue word like "Our ref:"), Home Office umbrella ref, client name, or client email. A file with none of them is flagged for review.
Config
Per-caseworker preferences, shared with other GMIAU tools via the One File projection in localStorage['ifyi_config_bundle'] under auditor.*. Persists across reloads; sync between machines by exporting/importing from Filer's Config tab.
C3 · Three-monthly client-letter gap
- C2 parties allow-list editor (currently sourced from gmiau-specs/parties.yaml at build time; the Case Details ad-hoc textarea is the runtime override).
- C1 bare-digit context cue list: surface the existing heuristic for editing.
- Severity overrides per check.
- Override marker text for Filer pre-flight (Compiled with GMIAU Auditor), used by v0.8.
App font
App appearance
Dracula themes are dark-only: switching to Light keeps them dark.
Tab visibility
Syncs across every GMIAU Shell tool via ifyi_hide_guide_tab + ifyi_hide_config_tab.
Canonical spec
The current AUDITOR-SPEC.md is inlined at build time. Edit the spec at gmiau-specs/AUDITOR-SPEC.md and run immigrationfyi-tools update to refresh every tool that embeds it.
# GMIAU Auditor Spec
**Version 0.4 · 2026-05-19 · v0.7 C2 shipped**
v0.4 (2026-05-19): C2 recipient allow-list check shipped.
- `gmiau-specs/parties.yaml` bumped to v1.1 with HMCTS + GMIAU additions; inlined into auditor.html via `<!-- @@DATA:parties.yaml@@ -->` (resolves to `window.IFYI_PARTIES`).
- C2 extracts recipient from `Letter to <X>` / `Email to <X>` filenames; looks up against parties.yaml canonical names + aliases, then the Case Details ad-hoc textarea. Unknown → WARN.
- CCL / OAL client letters exempted from the recipient check; their first page is salutation-checked against `caseSummary.client_name` (mismatch → WARN).
- Database experts/barristers layer deferred: left as a TODO in `auditor-checks.js` until the encrypted Database's unlock state is surfaced into the audit ctx.
v0.3 (2026-05-19): C3 three-monthly gap check shipped.
- C3 threshold reverted to **91 days, comparison `>=`** (one day strictly past three months). Functionally equivalent to v0.2's "90 days, strict `>`"; this form is clearer in the Config UI ("WARN at 91+ days").
- Added a second knob, `auditor.approachDays` (default **81**). Gap 81-90 days → INFO "approaching"; 91+ → WARN. Gives caseworkers a ten-day heads-up before the gap actually trips.
- Config persisted via `localStorage['ifyi_config_bundle'].auditor.*`, the same bundle Filer's Config tab reads/writes; cross-tab live-sync via the `storage` event.
v0.2 (2026-05-14): user feedback on §10 open questions:
- C2 simplified: recipient extraction now reads filenames (`Letter to X.pdf` / `Email to X.pdf`), not body salutations. Client letters identified by salutation `Dear <FirstName>(?:\s+<MiddleName>)?` matched against `case_summary.client_name`.
- Parties allow-list is **system-wide**, not per-case-summary. Sourced from a new canonical `gmiau-specs/parties.yaml` + the encrypted GMIAU Database (experts, barristers) + caseworker ad-hoc list per audit run. CASE-SUMMARY-IMPORT extension dropped.
- Threshold tightened to **90 days** (was 91). _(Reverted to 91 in v0.3 with `>=` comparison.)_
- Client-letter keywords narrowed to `CCL`, `OAL`, `Opening Advice Letter`, `Client Care Letter` only.
- Override marker text: `Compiled with GMIAU Auditor` (typo-correction reading; alternative "Complied with GMIAU Auditor" still open; see §10).
- Auditor is a **sibling tool**, not a Filer mode (user delegated; see §10).
Sister tool to **Filer**. Auditor scans a case's files (folder, sub-tree, or staged-for-filing set) and reports compliance + data-protection issues *before* anything leaves the office. Filer borrows Auditor's checks as a pre-flight before Compile.
Companion specs: [`FILER-SPEC.md`](./FILER-SPEC.md), [`REFERENCES-SPEC.md`](./REFERENCES-SPEC.md), [`CASE-SUMMARY-IMPORT.md`](./CASE-SUMMARY-IMPORT.md), [`GMIAU-STYLE-GUIDE.md`](./GMIAU-STYLE-GUIDE.md).
---
## §1 Purpose
A caseworker about to transfer a file to a third party (counsel, new firm, LAA, peer reviewer, court) needs to know:
1. Are there any **stray references** to other clients in this case's files? (Data protection: the first reason for the tool to exist.)
2. Are there **letters addressed to recipients not on the case summary's allow-list**? (Same concern, different vector.)
3. Has a **client-facing letter** been sent in the last three months? (LAA expectation on live matters.)
4. Are any **key documents** missing for this matter / funding type? (CW1, COI, CCL, OAL etc.)
Auditor runs these checks. Filer runs them automatically before Compile. A caseworker can also run Auditor ad-hoc on any case at any time.
---
## §2 Architecture
- **Standalone HTML tool** at `immigrationfyi-tools/source/auditor.html`. Standard GMIAU Shell canon (header + tabs + Settings tab + sentinel order per [[ref_ifyi_shell]]).
- **Shared check library** at `immigrationfyi-tools/scripts/auditor-checks.js`. Single source of truth for the check algorithms; both `auditor.html` and `filer.html` inline it via a new build-time sentinel `<!-- @@AUDITOR_CHECKS@@ -->` (same pattern as `@@IFYI_CONFIG@@`, `@@DATABASE@@`). One edit propagates to both tools on `immigrationfyi-tools update`.
- **No backend.** Pure-client. Inputs come from file pickers + case summary JSON; outputs render in-tab + as a downloadable HTML report.
- **No client data leaves the machine.** Same offline contract as Filer.
---
## §3 Inputs
### 3.1 Case context
- **Case Summary JSON** (`gmiau-case-summary/1`). Auditor uses:
  - `case_reference`: the allow-list of refs that belong to this case (Case ref, Client ref, all Home Office umbrella refs)
  - `client_name`: first name(s) used to identify client-addressed letters (`Dear <FirstName>`); also for the report header
- **Parties allow-list (system-wide, not per-case)**: three layers, in this lookup order:
  1. **Canonical bodies** from `gmiau-specs/parties.yaml` (NEW spec; see §7). Government bodies, courts, tribunals: UKVI, Home Office (alias HO), FtTIAC (aliases FtT, First-tier Tribunal), UTIAC (aliases UT, Upper Tribunal), EHRC, Court of Appeal, etc.
  2. **GMIAU Database**: experts, barristers, interpreters from the encrypted register, read via the existing `@@DATABASE@@` sentinel (`window.databaseGet('experts')`, `…('barristers')`, …). Cross-tool single source of truth per [[ref_ifyi_database_sot]].
  3. **Caseworker ad-hoc list**: a textarea in Auditor's Case Details tab, one name per line. Per audit run; not persisted. Covers one-off contacts: new instructing firm, the client's GP this matter, etc.
- **Ref allow-list** populates from the imported case summary; editable per run. **Parties allow-list** is read-only for layers 1-2 (canonical + database); editable for layer 3.
### 3.2 Files
- Drop a **folder** (whole case root, or a specific sub-tree, or Filer's staged set).
- Drop **individual files**.
- Auditor scans:
  - **PDFs**: text content via pdf.js (already inlined in the suite) + filename
  - **Filenames only** for everything else (`.docx`, `.xlsx`, images, audio)
- Auditor does NOT open Office files in v0.1 (left to v0.2; OOXML text extraction via JSZip is plausible).
### 3.3 Config
`ifyiConfig.auditor.*` (additive within `gmiau-config/1`, no schema bump). Persisted in `localStorage['ifyi_config_bundle'].auditor.*`:
- `threeMonthDays`: C3 WARN threshold; default **91** (gap ≥ this many days → WARN)
- `approachDays`: C3 INFO threshold; default **81** (gap in [`approachDays`, `threeMonthDays`) → INFO "approaching")
- `clientLetterKeywords`: list of filename tokens that count as a client-facing letter (defaults `CCL`, `OAL`, `Opening Advice Letter`, `Client Care Letter`; configurable)
- `keyDocsByFunding`: per funding flag (`LH`/`CLR`/`CLR2`/`PFC`/...), the list of required document keywords. Seeds from `gmiau-cases/gmiau.py:_REVIEW_DEFAULT_KEY_DOCS`; see §7.
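A minimal sketch of how these settings might be read, assuming the bundle is stored as a JSON string with an `auditor` key. The function name `loadAuditorConfig` and the defaults object are illustrative, not names taken from `auditor-checks.js`. The caller would pass in `localStorage.getItem('ifyi_config_bundle')`.

```javascript
// Documented defaults from the list above (object name is hypothetical).
const AUDITOR_DEFAULTS = {
  threeMonthDays: 91,
  approachDays: 81,
  clientLetterKeywords: ["CCL", "OAL", "Opening Advice Letter", "Client Care Letter"],
};

// rawBundle: the stored string, or null when nothing has been saved yet.
// Assumption: the bundle is JSON and the Auditor settings live under `auditor`.
function loadAuditorConfig(rawBundle) {
  let stored = {};
  try {
    const parsed = rawBundle ? JSON.parse(rawBundle) : {};
    stored = (parsed && parsed.auditor) || {};
  } catch (err) {
    stored = {}; // unreadable bundle: fall back to the defaults
  }
  const cfg = { ...AUDITOR_DEFAULTS, ...stored };
  // Keep the INFO threshold at or below the WARN threshold so the bands never invert.
  cfg.approachDays = Math.min(cfg.approachDays, cfg.threeMonthDays);
  return cfg;
}
```

Taking the raw string as a parameter, instead of touching `localStorage` inside the function, keeps the sketch testable outside a browser.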
---
## §4 Checks (v0.1 scope)
Four checks. Each emits zero or more findings; each finding has `{checkId, severity, file, evidence, suggestion}`. Severity ceiling per check:
### C1 - Files that belong to this case *(severity ceiling: WARN)*
**The question is per-file:** does this document belong in this client's folder, yes or no? (Reframed 2026-05-14 from the v0.4-v0.5.2 per-token approach. The user reported the old algorithm produced noise from PGP signatures, base64 / data: URIs, URLs and long hashes; the new algorithm searches for *specific known strings* so noise can't match by accident.)
For each scanned file, Auditor searches the **filename** and (if PDF text was extracted) every **body page** for any of the case's known identifiers. **First hit wins**: once a file's ownership is confirmed, scanning stops. Files with no hit anywhere get a WARN finding: *"Couldn't find any identifier for case <ref> in this file. Review whether this file belongs here."*
**Identifiers** built from `caseSummary` + `refsAllowList`:
| Kind | Source | Search rule |
|---|---|---|
| `case_ref` | `case_reference`, `refsAllowList.caseRef` | Substring (case-insensitive). The funding-tail stem (`RB12345` from `RB12345-LH-GF`) is also added. |
| `client_ref` | `client_reference`, `refsAllowList.clientRef` | Word-boundary digit match, but **only counts as a hit if a cue word** ("Our ref", "Client ref", "Your ref", "Matter ref", "File ref", "Case ref", "Reference") appears within 30 chars before. Avoids matching page numbers, post-codes etc. |
| `ho_ref` | `home_office_reference`, `uan`, `gwf`, `port_reference`, `case_id`, `refsAllowList.homeOfficeRefs[]` | Substring. |
| `appeal_ref` | `appeal_reference`, `refsAllowList.appealRefs[]` | Substring. |
| `name` | `client_name` | Substring of the full name as imported. (First name alone is too weak: it gives false positives on common forenames.) |
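The search rules in the table can be sketched roughly as follows. This is an illustration under assumptions, not the code in `auditor-checks.js`: the function names, the `{kind, token, needsContext}` identifier shape and the `{name, pages}` file shape are chosen here for the example.

```javascript
// Cue words from the client_ref row; a digit-only ref counts only after one of these.
const CUE_WORDS = ["Our ref", "Client ref", "Your ref", "Matter ref", "File ref", "Case ref", "Reference"];

function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True if id.token appears in text under the rule for its kind.
function searchIdentifier(text, id) {
  const haystack = text.toLowerCase();
  const needle = id.token.toLowerCase();
  if (!needle) return false;
  if (!id.needsContext) return haystack.includes(needle); // plain substring rule
  // Word-boundary match, accepted only if a cue word sits in the 30 chars before it.
  const re = new RegExp("\\b" + escapeRegExp(needle) + "\\b", "g");
  let m;
  while ((m = re.exec(haystack)) !== null) {
    const before = haystack.slice(Math.max(0, m.index - 30), m.index);
    if (CUE_WORDS.some((cue) => before.includes(cue.toLowerCase()))) return true;
  }
  return false;
}

// First hit wins: filename first, then each extracted page in order.
function findFirstIdentifier(file, identifiers) {
  const sources = [file.name, ...(file.pages || [])];
  for (const text of sources) {
    for (const id of identifiers) {
      if (searchIdentifier(text, id)) return { kind: id.kind, token: id.token };
    }
  }
  return null; // no identifier anywhere: the caller would raise the WARN finding
}
```

Checking the filename before any page text keeps the common case cheap, since a correctly named file is confirmed without reading its body.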
**What was dropped from earlier drafts:** the `REF_FINDER` alternation, the `BARE_DIGIT_FINDER`+`BARE_DIGIT_CONTEXT_RE` open-universe sweep, `NOISE_PATTERNS` / `_noiseSpans` / `_inAnySpan`, the allow-list "anything else is flagged" model, the "stray reference" findings shape. The cross-reference detection ("this case's file mentions ANOTHER case's ref") is **not** part of v0.5.3 C1: a file that contains its own ref is treated as belonging, irrespective of what else it mentions. That cross-reference question is left for a future check (possibly part of C2 or a new C5).
### C2 - Recipient not on the system parties allow-list *(severity ceiling: WARN)*
Reads filenames, not body content: GMIAU's naming convention carries the recipient.
**Identification rules:**
- `^.*\bLetter to (.+?)(?:,\s*\d|\.\w+$)` captures `<X>` from `Letter to <X>.pdf` or `Letter to <X>, 12 May 2026.pdf` (Filer-style trailing date).
- `^.*\bEmail to (.+?)(?:,\s*\d|\.\w+$)` has the same shape for the "Email" variant.
- Files matching client-letter keywords (`CCL`, `OAL`, `Opening Advice Letter`, `Client Care Letter`) are **not** subject to C2: they're for the client, who is always allowed.
- Files that match neither pattern are not outbound letters; C2 skips them.
**Lookup (`<X>` from filename):**
- Case-insensitive match against `parties.yaml` canonical name OR any alias.
- Then against database experts + barristers (name + aliases if stored).
- Then against caseworker's ad-hoc list for this run.
- Not found in any layer → WARN finding `{file, capturedRecipient, suggestion: "Confirm <X> is a permitted recipient; add to ad-hoc list or update parties.yaml."}`.
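A rough sketch of the extraction and lookup described above. It folds the two filename patterns into one regex and covers only the canonical and ad-hoc layers. The function names and the shape of `parties` (a `bodies` map of `{name, aliases}` entries) are assumptions made for this example.

```javascript
// Both patterns from the identification rules, merged into one alternation.
const RECIPIENT_RE = /^.*\b(?:Letter|Email) to (.+?)(?:,\s*\d|\.\w+$)/;

function extractRecipient(filename) {
  const m = RECIPIENT_RE.exec(filename);
  return m ? m[1].trim() : null;
}

// Case-insensitive match against canonical names and aliases, then the ad-hoc list.
function isKnownParty(recipient, parties, adHocNames) {
  const wanted = recipient.toLowerCase();
  const canonical = Object.values(parties.bodies || {}).some((body) =>
    [body.name, ...(body.aliases || [])].some((n) => n.toLowerCase() === wanted)
  );
  return canonical || adHocNames.some((n) => n.trim().toLowerCase() === wanted);
}

// Returns a WARN finding, or null when the file is exempt, not a letter, or allowed.
function checkRecipient(filename, parties, adHocNames, clientLetterKeywords) {
  const lower = filename.toLowerCase();
  if (clientLetterKeywords.some((k) => lower.includes(k.toLowerCase()))) return null;
  const recipient = extractRecipient(filename);
  if (recipient === null || isKnownParty(recipient, parties, adHocNames)) return null;
  return {
    checkId: "C2",
    severity: "WARN",
    file: filename,
    capturedRecipient: recipient,
    suggestion: `Confirm ${recipient} is a permitted recipient; add to ad-hoc list or update parties.yaml.`,
  };
}
```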
**Body-content cross-check (defensive):**
- For PDFs that *look* like client letters by filename (CCL / OAL), verify the salutation `^Dear (<client_first_name>)(?:\s+<client_middle_name>)?,` is present and matches `case_summary.client_name`'s first name(s) on the first page. Mismatch → WARN ("Filed as client letter but salutation reads 'Dear <other>'").
- For PDFs that *look* like third-party letters (`Letter to X`), no body cross-check in v0.1: filename is authoritative.
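The salutation cross-check might look something like this. The spec does not say how the first and middle names are derived from `client_name`, so this sketch takes them as explicit arguments. The function names, the case-insensitive flag and the finding's `evidence` wording for a missing salutation are assumptions.

```javascript
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Builds `^Dear <first>(?:\s+<middle>)?,` and tests it against the first page's text.
// The "m" flag lets ^ match at any line start; "i" (an assumption) ignores case.
function salutationMatches(firstPageText, firstName, middleName) {
  const middle = middleName ? "(?:\\s+" + escapeRegExp(middleName) + ")?" : "";
  const re = new RegExp("^Dear " + escapeRegExp(firstName) + middle + ",", "im");
  return re.test(firstPageText);
}

// Returns null on a match, otherwise a WARN finding quoting whatever salutation was found.
function checkClientSalutation(fileName, firstPageText, firstName, middleName) {
  if (salutationMatches(firstPageText, firstName, middleName)) return null;
  const other = /^Dear ([^,\n]+),/m.exec(firstPageText);
  return {
    checkId: "C2",
    severity: "WARN",
    file: fileName,
    evidence: other
      ? `Filed as client letter but salutation reads 'Dear ${other[1]}'`
      : "No 'Dear <name>,' salutation found on the first page",
  };
}
```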
Severity is WARN (not FAIL): caseworker may have a legitimate new contact that just needs adding to the ad-hoc list or the canonical YAML.
### C3 - Three-monthly client-letter gap *(severity ceiling: WARN)*
- Default WARN threshold: **91 days** (`ifyiConfig.auditor.threeMonthDays`); comparison `today - newest >= 91`.
- Default INFO ("approaching") threshold: **81 days** (`ifyiConfig.auditor.approachDays`). Triggers when the gap is in `[approachDays, threeMonthDays)`. Clamped to `<= threeMonthDays` so the two never invert.
- Client-facing letters identified by filename keyword (case-insensitive substring): **`CCL`**, **`OAL`**, **`Opening Advice Letter`**, **`Client Care Letter`**. List is config-editable (`ifyiConfig.auditor.clientLetterKeywords`).
- For each match, extract a date; first hit wins:
  1. Filename `YYYY-MM-DD` (anywhere; year 19xx/20xx) or `YYMMDD` (six consecutive digits at a word boundary; YY pivots on current year + 5).
  2. PDF `/CreationDate` (or `/ModDate` fallback) via pdf.js `getMetadata()`. Already loaded for text extraction, so no extra dependency.
  3. OS `lastModified` (`File.lastModified`).
- Take the most recent letter date. If `today - newest >= threeMonthDays` → WARN. If `approachDays <= gap < threeMonthDays` → INFO. Otherwise PASS.
- If **no** client-facing letter exists at all → separate INFO finding ("no client letter"). A brand-new matter legitimately has none, so this is not a WARN escalation.
- If letters exist but none has a usable date → INFO listing each undated file ("rename with a YYMMDD prefix").
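The filename-date step and the threshold comparison can be sketched as below. This covers only step 1 of the date lookup (no PDF metadata or `lastModified` fallback). The function names are illustrative, and the reading of "pivots on current year + 5" (two-digit years up to that value become 20YY, the rest 19YY) is an assumption.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Build a UTC date, rejecting impossible values such as month 13 or 31 February.
function makeDate(year, month, day) {
  const dt = new Date(Date.UTC(year, month - 1, day));
  const valid =
    dt.getUTCFullYear() === year && dt.getUTCMonth() === month - 1 && dt.getUTCDate() === day;
  return valid ? dt : null;
}

// YYYY-MM-DD anywhere in the name, else six digits at a word boundary read as YYMMDD.
function dateFromFilename(name, today) {
  const iso = /((?:19|20)\d{2})-(\d{2})-(\d{2})/.exec(name);
  if (iso) return makeDate(Number(iso[1]), Number(iso[2]), Number(iso[3]));
  const short = /\b(\d{2})(\d{2})(\d{2})\b/.exec(name);
  if (!short) return null;
  // Assumed pivot rule: YY up to (current year + 5) is 20YY, anything later is 19YY.
  const pivot = (today.getUTCFullYear() + 5) % 100;
  const yy = Number(short[1]);
  return makeDate((yy <= pivot ? 2000 : 1900) + yy, Number(short[2]), Number(short[3]));
}

// WARN at threeMonthDays or more, INFO in [approachDays, threeMonthDays), else PASS.
function classifyGap(newest, today, cfg) {
  const gapDays = Math.floor((today.getTime() - newest.getTime()) / MS_PER_DAY);
  if (gapDays >= cfg.threeMonthDays) return { severity: "WARN", gapDays };
  if (gapDays >= cfg.approachDays) return { severity: "INFO", gapDays };
  return { severity: "PASS", gapDays };
}

// Example: a CCL dated 17 Feb 2026, audited on 19 May 2026, is 91 days old.
const exampleToday = new Date(Date.UTC(2026, 4, 19));
const exampleNewest = dateFromFilename("260217 CCL.pdf", exampleToday);
console.log(classifyGap(exampleNewest, exampleToday, { threeMonthDays: 91, approachDays: 81 }));
// → { severity: 'WARN', gapDays: 91 }
```

Working in UTC throughout avoids a gap that shifts by a day when daylight saving changes between the letter date and the audit date.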
### C4 - Missing key documents *(severity ceiling: WARN)*
- Read `case_summary.funding_flags` (e.g. `LH`, `CLR-ECF`), or whichever field the case summary uses to mark the funding stage. For each flag present, look up the expected key-doc keywords in `keyDocsByFunding`.
- For each expected document, search filenames for the keyword (case-insensitive substring). Missing → finding.
- The matter-type → expected-docs table is the v0.1 scope risk; lift the GMIAU-canonical table from `gmiau-cases/gmiau.py:_REVIEW_DEFAULT_KEY_DOCS` and promote to `gmiau-specs/auditor-keydocs.yaml`. Auditor reads the YAML at build time via a sentinel; the caseworker edits via Auditor's Config tab.
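A minimal sketch of the keyword lookup, assuming `keyDocsByFunding` is a plain map from funding flag to an array of required keywords. The real `auditor-keydocs.yaml` may carry aliases or richer entries. The function name and the wording of the finding are illustrative.

```javascript
// One WARN finding per expected keyword that appears in no filename.
function findMissingKeyDocs(fundingFlags, filenames, keyDocsByFunding) {
  const lowerNames = filenames.map((n) => n.toLowerCase());
  const findings = [];
  for (const flag of fundingFlags) {
    const expected = keyDocsByFunding[flag] || []; // unknown flag: nothing to require
    for (const keyword of expected) {
      const present = lowerNames.some((n) => n.includes(keyword.toLowerCase()));
      if (!present) {
        findings.push({
          checkId: "C4",
          severity: "WARN",
          file: null, // the finding is about an absent file
          evidence: `No filename contains "${keyword}" (funding flag ${flag})`,
          suggestion: `Add the ${keyword} document or rename the existing file.`,
        });
      }
    }
  }
  return findings;
}
```

With `{ LH: ["CCL", "OAL"] }` as the table, an empty file set yields two findings and a folder holding only a CCL yields one, which lines up with the smoke-test outcomes recorded in the provenance notes.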
---
## §5 Output
### 5.1 In-tab Report pane
Per check, a section with:
- Title + status badge (PASS · INFO · WARN · FAIL)
- One-line summary ("3 stray refs in 2 files")
- Expandable rows per finding: file path, evidence excerpt (highlighted token / matched line), suggestion
- Overall verdict at top: green / amber / red
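One way the overall verdict could be derived from the findings is sketched below. The spec names the colours and the badge levels but not the mapping between them, so the rule here (any FAIL is red, any WARN is amber, otherwise green) is an assumption, as are the function names.

```javascript
const SEVERITY_RANK = { PASS: 0, INFO: 1, WARN: 2, FAIL: 3 };

// Highest severity across all findings; an empty list counts as PASS.
function worstSeverity(findings) {
  return findings.reduce(
    (worst, f) => (SEVERITY_RANK[f.severity] > SEVERITY_RANK[worst] ? f.severity : worst),
    "PASS"
  );
}

// Assumed mapping: FAIL → red, WARN → amber, INFO or PASS → green.
function overallVerdict(findings) {
  const worst = worstSeverity(findings);
  if (worst === "FAIL") return "red";
  if (worst === "WARN") return "amber";
  return "green";
}
```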
### 5.2 Downloadable report
- `<Case> Audit Report <YYYY-MM-DD>.pdf`: bound report mirroring the in-tab view. PRIVATE & CONFIDENTIAL header, GMIAU metadata, optional qpdf-wasm lock (reuse Filer's `_gmiauMaybeLockBytes`).
- `<Case> Audit Report <YYYY-MM-DD>.json`: machine-readable findings for archival or for piping into other tools.
### 5.3 Persistence
- Audit runs are ephemeral. No history kept by default.
- Saving the JSON report is the caseworker's responsibility (drop into `~/Documents/Work/<case>/Reviews/`).
---
## §6 Filer integration
Filer's Compile button gains a pre-flight:
1. On click, before `buildFilerBundle` runs, Filer invokes `window.gmiauAuditor.runChecks({caseSummary, fileState})`.
2. If any **FAIL**: modal shows the report inline, "Cancel" / "Compile anyway" buttons. "Compile anyway" requires a one-time confirm.
3. If only WARN/INFO: status bar shows the count; Compile proceeds without blocking.
4. If everything passes: silent.
**Compile marker in PDF metadata.** Every Filer compile that ran the pre-flight stamps a marker into the bound PDF's `Producer` field (appended to `GMIAU Toolkit · `):
- **Clean compile** (no FAIL, or only WARN/INFO accepted): `GMIAU Toolkit · Compiled with GMIAU Auditor (YYYY-MM-DD)`
- **Override compile** (FAIL accepted via Compile-anyway): `GMIAU Toolkit · Compiled with GMIAU Auditor (YYYY-MM-DD) - overrides accepted`
The marker means "audit ran"; the suffix flags non-clean. The audit JSON sidecar (when written; see §5.2) carries the full finding list for a reviewer who wants more detail than the marker.
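The pre-flight decision and the marker string could be sketched like this. The function names and the `action` values are hypothetical, and the separator before "overrides accepted" is taken here to be a plain hyphen, which is an assumption about the exact character Filer stamps.

```javascript
// Builds the Producer value for an audited compile.
// Assumption: the date is the audit date rendered as YYYY-MM-DD in UTC.
function producerMarker(auditDate, overridesAccepted) {
  const stamp = auditDate.toISOString().slice(0, 10);
  const marker = `GMIAU Toolkit · Compiled with GMIAU Auditor (${stamp})`;
  return overridesAccepted ? `${marker} - overrides accepted` : marker;
}

// Steps 2-4 of the pre-flight: FAIL needs a confirm, WARN/INFO only report, clean is silent.
function preflightOutcome(findings) {
  const count = (sev) => findings.filter((f) => f.severity === sev).length;
  const failCount = count("FAIL");
  if (failCount > 0) return { action: "confirm", failCount };
  const notices = count("WARN") + count("INFO");
  return notices > 0 ? { action: "proceed", notices } : { action: "silent" };
}
```

A caller would stamp `producerMarker(date, true)` only after the caseworker has confirmed "Compile anyway" on a `confirm` outcome, and `producerMarker(date, false)` in the other two cases.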
A caseworker can run the full audit independently any time via Auditor's own UI; Filer's pre-flight is just a convenient hook.
---
## §7 Dependencies (need to land before / alongside v0.2 build)
1. **`gmiau-specs/parties.yaml`** *(NEW)*: canonical recipients. Cross-consumed by Auditor (C2 allow-list) and eventually by `letter` / `court-doc` CLIs for filename autocomplete. Shape:
```yaml
name: parties
version: "1.0"
updated: 2026-05-14
bodies:
  home_office:
    name: "Home Office"
    aliases: ["HO", "UKVI"]
    role: government
  fttiac:
    name: "FtTIAC"
    aliases: ["FtT", "First-tier Tribunal", "First-tier Tribunal IAC"]
    role: tribunal
  utiac:
    name: "UTIAC"
    aliases: ["UT", "Upper Tribunal", "Upper Tribunal IAC"]
    role: tribunal
  # … Court of Appeal, EHRC, AIT, LAA, etc.
```
2. **`gmiau-specs/auditor-keydocs.yaml`** *(NEW)*: promoted from `gmiau-cases/gmiau.py:_REVIEW_DEFAULT_KEY_DOCS`. Cross-consumed by Auditor + the gmiau-cases CLI.
3. **`immigrationfyi-tools/scripts/auditor-checks.js`** *(NEW)*: shared check library. New build-time sentinel `<!-- @@AUDITOR_CHECKS@@ -->` (extends [[ref_spec_sentinel_pattern]]).
4. **`@@DATABASE@@`** sentinel: already exists; Auditor pulls experts/barristers from the encrypted register via `window.databaseGet('experts')` / `…('barristers')`. No new build wiring.
5. **`@@SPEC:AUDITOR-SPEC@@`** sentinel in `auditor.html`'s Settings tab: embeds this spec, per [[ref_spec_sentinel_pattern]] (already wired).
6. **No CASE-SUMMARY-IMPORT change.** The previous v0.1 plan to add a `parties:` field to `gmiau-case-summary/1` is dropped: parties are system-wide (via #1 + #4), not per-case.
---
## §8 Out of scope (v0.1)
- **Named-entity recognition** on document body content (would need an ML model, which violates the offline + airgap rules).
- **`.docx` / `.xlsx` text extraction.** v0.2 candidate via JSZip + OOXML parse, the same pattern as Filer's `_filerScrubOfficeXml`.
- **Email / SharePoint scanning** (out of scope; Auditor scans the on-disk case folder only).
- **Auto-redact / auto-fix.** Auditor reports; the caseworker fixes.
- **Cross-case scanning** ("are any files from case X showing up in case Y?"). Possible v0.3 with the encrypted GMIAU register's case list.
- **Three-monthly auto-letter generation.** Different tool; Auditor only flags the gap.
- **History / trend reports** (audit results over time). v0.3+ if asked for.
---
## §9 Build phases
| Phase | Scope | Output |
|---|---|---|
| **v0.1 → v0.2 (this doc)** | Design + user-confirmed scope. C2 simplified to filename-based; parties source moved system-wide. | This file, for user re-review of §10 (only Q4 remains) |
| **v0.3** | Scaffold `auditor.html` (canon shell, 5 tabs incl. Guide + Index back-link) · `auditor-checks.js` skeleton · `@@AUDITOR_CHECKS@@` sentinel wired into `build-all-offline.py` · seed `gmiau-specs/parties.yaml` + `auditor-keydocs.yaml` from existing GMIAU sources | tool boots; no real audit yet |
| **v0.4** | Check C1 (stray refs): highest-value, lowest-ambiguity | scan a folder; see flagged refs |
| **v0.5** | Check C4 (missing key docs): reuses well-defined table | full findings UI |
| **v0.6** (shipped 2026-05-19) | Check C3 (three-monthly gap): filename date / PDF `/CreationDate` / mtime; WARN at 91+, INFO at 81-90, INFO if no letter; Config tab live for `threeMonthDays` + `approachDays` + `clientLetterKeywords`; PDF metadata extraction folded into existing pdf.js pass | gap detection live |
| **v0.7** (shipped 2026-05-19) | Check C2 (recipient allow-list): `Letter to <X>` / `Email to <X>` filename capture, lookup against `parties.yaml` + ad-hoc list, CCL/OAL salutation cross-check. Database experts layer deferred. | parties lookup live |
| **v0.8** | Filer pre-flight integration + PDF metadata Producer marker | Compile-time blocking |
| **v0.9** | PDF report + JSON report download | shareable artefact |
| **v0.10** | Vault docs (Cheatsheet + Reference + Guide tab export) + tool-audit clean | feature-complete |
Each phase = a single backup snapshot to `gmiau-backup/auditor-vNN-YYYY-MM-DD/`, in line with [[feedback_ifyi_snapshot_every_change]].
---
## §10 Open questions
### Resolved 2026-05-14 (user reply)
1. ✅ **Recipient extraction (C2)**: GMIAU letters use `Dear FIRSTNAME,` / `Dear FIRSTNAME MIDDLENAME,` for client letters; third-party letters are identified by filename (`Letter to X.pdf` / `Email to X.pdf`). Spec rewritten in C2.
2. ✅ **Client-letter keywords (C3)**: narrowed to `CCL`, `OAL`, `Opening Advice Letter`, `Client Care Letter`. User's proposed additions (Letter to Client / Client Update / Follow-up) dropped.
3. ✅ **Three-monthly threshold**: 90 days. (Reverted to 91 days with a `>=` comparison in v0.3.)
5. ✅ **Parties source**: system-wide (canonical YAML + GMIAU Database experts/barristers + caseworker ad-hoc list per run). CASE-SUMMARY-IMPORT extension dropped. See §3 + §7.
6. ✅ **Sibling tool, not Filer mode**: user delegated; rationale: distinct purpose (check correctness vs package outbound), distinct cadence (ad-hoc audit vs file-transfer moment), aligns with GMIAU Shell pattern (Bundle Builder + Evidence Exhibitor are sibling tools, not one tool with modes), separate icon in launcher makes "run an audit" discoverable.
### Resolved 2026-05-14 (user reply, Reading A)
4. ✅ **Override marker: Reading A confirmed.** `Compiled with GMIAU Auditor (YYYY-MM-DD)` stamped on every audited compile; overrides append ` - overrides accepted` after the date. Lives in the bound PDF's metadata `Producer` field, appended to `GMIAU Toolkit · …`. Marker = "audit ran"; the suffix flags non-clean compiles.
---
**Status:** v0.7 shipped 2026-05-19: C2 recipient allow-list live via `parties.yaml` + ad-hoc textarea + CCL/OAL salutation cross-check. C1 + C2 + C3 + C4 all live. C5-C8 scaffolded as stubs per `AUDITOR-REGULATORY-MAP.md` (CW4 merits / closure letter / retention / attendance notes). Database experts/barristers layer for C2 deferred. Browser-test still pending. Next: v0.8 (Filer pre-flight integration + Producer marker), and the C5-C8 implementations in subsequent v0.x.
Regulatory framework cross-reference: [`AUDITOR-REGULATORY-MAP.md`](./AUDITOR-REGULATORY-MAP.md) maps every file-level evidence requirement from SQM v3, LAA Peer Review IA, LAA Imm Common Errors (2022 + 2024), and IAA Code of Standards 2024 to a live / partial / planned / out-of-scope status.
**Provenance:**
- v0.1 · 2026-05-14: initial design doc; written by Claude with user (conversational thread "Filer + Auditor + spec sentinel", 2026-05-14). No build started.
- v0.2 · 2026-05-14: same-day revision after user's §10 answers. C2 simplified (filename-based recipient extraction), parties moved system-wide (`parties.yaml` + database + ad-hoc), threshold tightened to 90 days, client-letter keywords narrowed, sibling-tool decision confirmed. Snapshot of v0.1 at `gmiau-backup/auditor-v0.1-to-v0.2-2026-05-14/`. Still no build.
- v0.3 build · 2026-05-14: scaffold shipped. `parties.yaml` + `auditor-keydocs.yaml` seeded, `auditor-checks.js` skeleton, `@@AUDITOR_CHECKS@@` sentinel wired into `build-all-offline.py`, `source/auditor.html` (GMIAU Shell, 7 tabs) booting; all 4 checks PASS-stubbed. Snapshot at `gmiau-backup/auditor-v0.3-scaffold-2026-05-14/`.
- v0.4 build · 2026-05-14: C1 (stray case references) implemented. `auditor-checks.js` gains `REF_FINDER` alternation + bare-digit context heuristic + `findRefsIn` + `buildAllowList` + `classifyRef` + `canonicalRef` (strips funding-flag tail). `auditor.html` gains drop zone (folder+files), file list, pdf.js text extraction (CDN 3.11.174 + worker blob), Run audit button, full Case Details form (Case ref · Client ref · HO refs · Appeal refs · Funding flag · Client name · Ad-hoc parties) + Import Case Summary, Report tab rendering (verdict pill + per-check cards + findings table). Smoke-tested against 3-file scenario: filename stray ref ✓, in-body stray ref ✓, bare-digit with context ✓, own refs not flagged ✓. **Not yet browser-tested**: `node --check` ✓, build ✓ (1.88 MB), tool-audit ✓. Snapshot at `gmiau-backup/auditor-v0.4-c1-2026-05-14/`.
- v0.5 build · 2026-05-14: C4 (missing key documents) implemented + generic `@@DATA:<filename>@@` sentinel infrastructure added. `auditor-keydocs.yaml` (v1.0, 11 funding-flag keys → CCL/OAL aliases) is now consumed at runtime via `window.IFYI_AUDITOR_KEYDOCS`. New `inline_data` + `_parse_yaml_subset` + `_split_flow_items` in `build-all-offline.py` (hand-rolled YAML parser for the gmiau-specs shape: top-level scalars, nested maps, flat lists, inline flow arrays). `auditor.html` gains the `@@DATA:auditor-keydocs.yaml@@` sentinel + `fundingFlag` passed into `ctx` by `readCaseDetails`. C4 also derives the flag from a hyphenated case-ref tail (`RB12345-CLR-ECF` → `CLR-ECF`) when the explicit field is blank. Smoke-tested against 6 scenarios: empty fileset + LH → 2 missing (WARN), CCL present no OAL → 1 missing, both present → PASS, no flag → INFO with hint, unknown flag → INFO with hint, flag-from-case-ref → PASS. **Not yet browser-tested**: `node --check` ✓, build ✓ (1.89 MB), tool-audit ✓. Snapshot at `gmiau-backup/auditor-v0.5-c4-2026-05-14/`. The `@@DATA@@` sentinel is reusable; `parties.yaml` (v0.7) will plug into it the same way.
- v0.5.1 fixtures + test runner · 2026-05-14: `gmiau-testing/auditor/` added: `build.py` generates 8 stdlib-only PDFs covering every C1+C4 path (own refs clean · stray filename ref · stray body ref · bare-digit with context · bare-digit without context · clean CCL+OAL satisfying C4 · WARN scenario by removal). `test-auditor.js` is a Node-based automated runner: 18 scenarios, no browser needed. The runner caught a v0.4-shipping bug: `REF_FINDER` was missing the FtTIAC appeal pattern (`[A-Za-z]{2,3}-\d{4,5}-\d{4}`) so PA/HU/EA/DA appeal references in stray-file text wouldn't have been flagged. Fixed inline; 18/18 pass post-fix.
- v0.5.2 false-positive filter + Report UX · 2026-05-14: user reported "lots of false flags from my email signature's PGP" + "doesn't tell me what the issue is". Two fixes + a tab reorder. (1) `NOISE_PATTERNS` + `_noiseSpans` + `_inAnySpan` added to `auditor-checks.js`: any C1 finding whose index falls inside a PGP/PEM armored block, a `data:…;base64,…` URI, an `http(s)`/`ftp`/`mailto` URL, an email address, or a long (40+) unbroken alnum run is dropped. Original-text snippet still drawn from un-stripped source so context stays human-readable. (2) Report tab rewritten: each finding now renders as a small card with explicit `Where`, `Token`, `Kind`, `Snippet`, and a visible **Why** line (was tooltip-only). Each check card carries a `check-card-scope` subtitle: C1 lists the actual allow-list tokens scanned against (so caseworker sees what counted as own-case); C4 names the funding flag + requirement count. (3) Tab order reshuffled per user: Guide · Case Details · Scan · Report · Config · Settings (Case Details moved before Scan since you set up the allow-list before scanning; Scan stays the default-active per §2C). Three new regression tests added (PGP block, stray-ref-then-PGP-block, URL/email/data:URI/hash). 21/21 pass. Snapshot at `gmiau-backup/auditor-v0.5.2-noise-filter-2026-05-14/`.
- v0.5.3 C1 redesign (**per-file ownership, not per-token**) · 2026-05-14: user feedback: "the auditor isn't working. If the auditor verifies that a document contains information matching the client's data, then it does not need to review the entire document. The issue is does this document belong in the client's file, yes or no." This is a fundamental reframe: C1 is no longer "scan every token and flag those not in the allow-list" (which produced PGP / base64 / URL noise). It's now "search the file for any of the case's known identifiers; if at least one appears, the file belongs; if none, flag the file for review." Implementation: `_collectIdentifiers(ctx)` builds a small list of `{kind, token, needsContext}` entries (case ref + funding-tail stem, client ref with cue, every HO umbrella ref, appeal refs, full client name). `_searchIdentifier` does substring search (case-insensitive), with the bare-digit cue test for client refs. `_findFirstIdentifier` returns the first hit and stops. The check iterates files, checks filename first (cheap, almost always carries the ref under GMIAU naming), then body pages until a hit. Files with no hit get a WARN finding. Title renamed: "Stray case references" → "Files that belong to this case". NOISE_PATTERNS and the `_noiseSpans` filter dropped as no longer needed (specific-string search can't false-match base64). The Report scope-line now lists the identifiers being searched for (grouped by kind: case ref, client ref with cue, HO ref, appeal ref, client name). Test runner rewritten: 14 C1 scenarios cover every identifier kind + cue-gated bare-digit + no-identifier WARN + pages:null fallback + PGP-content-doesn't-block-ownership; plus 5 C4 unchanged + 1 E2E = 20/20 pass. Fixtures README rewritten: expected outcomes change from "3 stray-ref findings" to "1 mis-filing flag on the RB67890 file". Snapshot at `gmiau-backup/auditor-v0.5.3-ownership-2026-05-14/`.