apmp · 7 min read · 2026-04-29
How do you run an APMP Color Review on an AI-drafted proposal?
Pink, Red, Gold hat reviews adapted for proposals where the first draft came from an AI agent. The review pattern that prevents both AI hallucinations and human compliance gaps.
What APMP Color Reviews actually are
APMP (Association of Proposal Management Professionals) defines a sequence of "color hats" for proposal reviews. The standard sequence:
| Color | Timing (% complete) | Focus | Output |
|---|---|---|---|
| Blue | 5-10% | Strategy validation | Win/no-bid decision |
| Pink | 50-60% | Compliance + responsiveness | Punch list of gaps |
| Red | 85-90% | Final compliance + scoring | Sign-off or rework |
| Gold | 95-100% | Win theme + discriminator polish | Submit-ready |
(Some firms also run Black Hat for competitor analysis and White Glove for final production polish.)
The reviews are adversarial. Pink team reads the draft as a contracting officer. Red team reads it as the evaluation board. Gold team reads it as the customer.
How AI-drafted proposals break the standard sequence
When an LLM drafts the first 40% of a proposal:
- Pink team finds fewer compliance gaps (the agent is good at hitting Section L instructions).
- Red team finds more hallucinations (the agent invents specific past performance details, NAICS justifications, contracting officer names).
- Gold team finds less differentiation (the agent's prose is bland by design — it averages industry tone).
The fix is to add an AI Pink review before standard Pink, and to weight Red toward hallucination detection.
The new sequence: AI Pink → Pink → Red → Gold
AI Pink (5-10% of total review time; mandatory for any AI draft)
Run a 30-minute review focused entirely on:
- Specific claims that need a source. Every dollar amount, contract number, agency name, and contracting officer name — read the source it came from. If the agent didn't cite one, the claim is suspect.
- Past performance attribution. Did the agent claim a contract you don't actually have? Cross-check against your CPARS records.
- Generic prose flagged as "AI-feel". Phrases like "leveraging best-in-class capabilities" without a specific claim. These should be cut, not edited.
- Plausibly-real but fabricated details. "Awarded to Booz Allen via PIID HC102814C0001" — is that a real PIID? Search USAspending (a lookup sketch follows below). If the PIID doesn't exist, the entire paragraph is suspect.
This review is not optional. AI agents trained on federal contracting data hallucinate at a rate of 8-15% on specific identifiers (FAR clauses, PIIDs, CPARS ratings). Without AI Pink, those hallucinations propagate to submission.
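The PIID check is automatable. A minimal sketch, assuming the public USAspending award-search endpoint; the payload shape below is our reading of that API, not part of this workflow, so verify it against the API docs before wiring it into a pipeline:

```ts
// Check whether a PIID resolves to a real award on USAspending.
// Endpoint and payload shape are assumptions -- confirm against
// https://api.usaspending.gov before relying on this.
async function piidExists(piid: string): Promise<boolean> {
  const res = await fetch("https://api.usaspending.gov/api/v2/search/spending_by_award/", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      filters: {
        award_ids: [piid],
        award_type_codes: ["A", "B", "C", "D"], // contract award types
      },
      fields: ["Award ID"],
      limit: 1,
    }),
  });
  if (!res.ok) throw new Error(`USAspending query failed: ${res.status}`);
  const body = await res.json();
  return Array.isArray(body.results) && body.results.length > 0;
}
```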
Pink (50-60% complete, 60-90 minutes)
Standard APMP Pink — read against Section L and Section M. The compliance matrix is your guide. Every requirement must have a section that addresses it.
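A minimal sketch of how the matrix becomes the Pink punch list, assuming a hypothetical row shape (the real compose_compliance_matrix output has nine columns; only the fields the gap check needs are modeled here):

```ts
// Hypothetical compliance-matrix row; field names are illustrative.
interface ComplianceRow {
  requirementId: string;           // e.g. "L.4.2.1"
  requirementText: string;
  answeringSection: string | null; // proposal section that addresses it
  status: "addressed" | "partial" | "missing";
}

// The Pink punch list: every requirement without a fully answering section.
function pinkPunchList(matrix: ComplianceRow[]): ComplianceRow[] {
  return matrix.filter(
    (row) => row.answeringSection === null || row.status !== "addressed"
  );
}
```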
Red (85-90% complete, 4-6 hours)
Standard APMP Red, with one addition: a second hallucination pass. After the rewrites between Pink and Red, the agent (or human) may have introduced new specifics. Re-validate every dollar, date, and identifier added since Pink.
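A hedged sketch of that second pass: extract every dollar amount, date, and PIID-shaped identifier from both drafts, then flag whatever appeared after Pink. The patterns are illustrative, not a complete identifier grammar.

```ts
// Flag dollar amounts, dates, and PIID-shaped identifiers present in the
// Red-stage draft but absent from the Pink-stage draft.
const IDENTIFIER_PATTERNS: RegExp[] = [
  /\$[\d,]+(?:\.\d+)?[MBK]?/g,     // dollar amounts, e.g. $4.2M
  /\b\d{4}-\d{2}-\d{2}\b/g,        // ISO dates
  /\b[A-Z]{2}\d{6}[A-Z]\d{4}\b/g,  // PIID-shaped, e.g. HC102814C0001
];

function extractIdentifiers(draft: string): Set<string> {
  const found = new Set<string>();
  for (const pattern of IDENTIFIER_PATTERNS) {
    for (const match of draft.matchAll(pattern)) found.add(match[0]);
  }
  return found;
}

// Everything Red must re-validate: identifiers added since the Pink draft.
function addedSincePink(pinkDraft: string, redDraft: string): string[] {
  const seenAtPink = extractIdentifiers(pinkDraft);
  return [...extractIdentifiers(redDraft)].filter((id) => !seenAtPink.has(id));
}
```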
Gold (95-100% complete, 1-2 hours)
Standard APMP Gold — win themes, discriminators, executive summary. Specifically:
- Are the win themes named explicitly (not just implied)?
- Are the discriminators sourced (not just claimed)?
- Does the executive summary preview every M factor in the order they're scored?
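The third check is mechanical enough to script. A naive sketch that tests whether each Section M factor name appears in the executive summary in scored order; substring matching is a stand-in for matching on headings, and the factor names are whatever your matrix carries.

```ts
// Does the executive summary mention every M factor, in the order scored?
// Naive substring matching; swap in heading-level matching for real use.
function previewsFactorsInOrder(execSummary: string, mFactors: string[]): boolean {
  const text = execSummary.toLowerCase();
  let cursor = 0;
  for (const factor of mFactors) {
    const at = text.indexOf(factor.toLowerCase(), cursor);
    if (at === -1) return false; // factor missing, or out of scored order
    cursor = at + factor.length;
  }
  return true;
}

// e.g. previewsFactorsInOrder(draft, ["Technical Approach", "Past Performance", "Price"]);
```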
What our agent does for each color
For Pink:
```ts
reflect_and_critique({
  verdict: "patch",
  selfScore: 78,
  findings: [
    { kind: "uncited_claim", claim: "Booz Allen won the prior VA cloud contract", severity: "high" },
    { kind: "missed_step", stepId: "incumbent-check", label: "Did not verify on USAspending" }
  ],
  patchSummary: "Will pull award by PIID before final draft"
})
```
This is a structured Pink output. The agent self-flags anything that looks uncited. A human Pink team validates the self-flag (catches false positives) and looks for what the agent missed.
For Red, the same tool runs with a stricter rubric — severity: "high" for any uncited claim, no exceptions.
For Gold, the agent's reflection focuses on win themes and discriminators rather than compliance.
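To make that rubric shift concrete, a hypothetical per-color configuration; this is a sketch of the idea, not the tool's actual interface:

```ts
// Hypothetical per-color rubrics for reflect_and_critique. Red escalates
// every uncited claim to "high", per the rule above; Gold shifts focus
// from compliance to win themes. Not the tool's real config surface.
type ReviewColor = "pink" | "red" | "gold";

interface Rubric {
  focusKinds: string[];                    // finding kinds the pass looks for
  uncitedClaimSeverity: "medium" | "high"; // floor severity for uncited claims
}

const RUBRICS: Record<ReviewColor, Rubric> = {
  pink: { focusKinds: ["uncited_claim", "missed_step"], uncitedClaimSeverity: "medium" },
  red:  { focusKinds: ["uncited_claim", "missed_step"], uncitedClaimSeverity: "high" },
  gold: { focusKinds: ["win_theme", "discriminator"],   uncitedClaimSeverity: "medium" },
};
```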
The 80/20 of AI-augmented Color Reviews
If you can only run two reviews (small firm, tight deadline):
- AI Pink at 30% draft (mandatory — catches hallucinations early).
- Red at 90% (combines Pink + Red checks).
- Skip Gold; trust the agent's reflection if its win-theme self-score is ≥ 90.
If you run three:
- AI Pink (20-30% draft)
- Standard Pink (50-60% draft)
- Red (85-90% draft)
Skip Gold only if the executive summary already names the win themes and discriminators explicitly.
If you run all four (resource-rich, large pursuit):
Run AI Pink → Pink → Red → Gold as separate teams. Different reviewers per color. If each pass independently catches, say, 60% of the defects still present, four passes leave roughly 0.4^4 ≈ 2.6% of the originals. That compounding catch-rate is high enough to justify the investment for any pursuit > $5M total contract value.
Common pitfalls
- Letting the agent's reflection substitute for human Red. The reflection is a Pink-equivalent self-check. Red must be a different reader.
- Skipping AI Pink because "the agent looked good." Hallucination rates are silent. They show up in debriefs, not drafts.
- Running Gold as a copy-edit pass. Gold is content review, not grammar. Send the draft to a copy-editor separately if needed.
- No reviewer for the executive summary alone. It's the only section the contracting officer is guaranteed to read. Worth its own 30-minute pass.
A 2-hour total review for a 30-page proposal
- 30 min AI Pink (one experienced capture lead, hallucination focus)
- 60 min Red (one capture lead + one tech SME, scoring focus)
- 30 min Gold (one BD lead, win theme focus)
This compresses APMP doctrine into a small-team-friendly format without losing the essential checks.
Tooling
Our agent supports the workflow via:
- `compose_compliance_matrix` — a 9-column matrix that becomes the Pink/Red checklist.
- `reflect_and_critique` — runs a Pink-equivalent self-check on every draft.
- `compose_proposal_section` — produces sections that can be reviewed independently.
- A `color_review_state` field on the conversation — tracks which color the draft is at.
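A plausible shape for that last field (hypothetical; the real field may carry more):

```ts
// Hypothetical shape of the color_review_state conversation field.
type ColorReviewState = {
  color: "ai_pink" | "pink" | "red" | "gold";
  draftPercentComplete: number; // e.g. 55 when entering standard Pink
  signedOffBy: string | null;   // reviewer of record, null until sign-off
};
```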
The agent handles the structure. Humans hold the judgment.