Most “AI writing” breaks when facts aren’t in the model’s head. Retrieval-Augmented Generation (RAG) fixes that by grounding answers in your sources—and by refusing to guess when the evidence isn’t there. Below is a practical, buyer-first prompt system that forces citations, dates, and a clear “not in corpus” fallback. (RAG = LLM + retrieval from your knowledge base.)
TL;DR
- Ground every answer in your sources; require dates + citations on claims.
- Add a strict fallback: if data isn’t found, say “not in corpus” and stop (no speculation).
- Use operational verbs (list/compare/extract/show) and a deliverable shape (table/checklist/example).
- Improve precision with retrieval + reranking (e.g., BM25 → Cohere Rerank).
- Evaluate regularly; untested RAG creates “silent failures.”
What “RAG writing” means for content teams
RAG answers with passages pulled from your corpus, then generates copy around that evidence—so definitions, numbers, and examples can be verified or fixed at the source. Use cases: product explainers, comparisons, FAQs, and refreshes where correctness matters. (See Google Cloud’s RAG overview and quickstart for canonical definitions and setup.)
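In code, that flow is short enough to sketch. The example below is a minimal illustration, not a reference implementation: `search_index` and `generate` are hypothetical stand-ins for your retrieval layer and LLM client, and the passage fields are assumed for the sake of the demo.

```python
NOT_IN_CORPUS = "not in corpus"

def answer_from_corpus(question, search_index, generate, k=5):
    # 1) Retrieve candidate passages from your indexed sources only.
    #    `search_index` is a hypothetical callable returning dicts with
    #    "text", "source", and "date" keys.
    passages = search_index(question, top_k=k)
    if not passages:
        return NOT_IN_CORPUS  # refuse to guess when the evidence is missing

    # 2) Ground generation in the retrieved evidence, with dates and sources
    #    inlined so the model can cite them.
    evidence = "\n".join(
        f"[{i + 1}] ({p['date']}) {p['source']}: {p['text']}"
        for i, p in enumerate(passages)
    )
    prompt = (
        "Answer only from the passages below. Cite passage numbers and dates. "
        f"If the answer is not present, reply '{NOT_IN_CORPUS}' and stop.\n\n"
        f"Passages:\n{evidence}\n\nQuestion: {question}"
    )
    # `generate` is a hypothetical callable wrapping your LLM client.
    return generate(prompt)
```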
Objectives that shape good prompts
- Buyer-first coverage: Fit, Cost/Limits, Performance, Migration, Reliability/Safety, Change Awareness, Hands-on Scenarios, Rigor/Stress-tests.
- Corpus-bound truth: Answers must come from your indexed domains/files; show snippet-level citations with dates.
- Actionable outputs: Prefer comparisons, checklists, snippet examples.
- Error-tolerant: If info is missing, reply “not in corpus.”
- AEO-ready: Short definitions, FAQ-style bullets, and answer blocks.
These map to what cloud providers and enterprise RAG guides recommend: scope the retrieval target, structure outputs, and make provenance explicit.
Prompt mechanics that work (and why)
- Scope gate: “Only using {domains}/{folders}…” narrows retrieval, reducing off-topic chunks.
- Operational verbs: list, compare, extract, summarize, show example; concrete verbs that map to tasks your retriever and chunked passages can actually answer.
- Guardrails inside the prompt: “Provide snippet citations with dates; if a value isn’t present, say not in corpus and stop.”
- Output shape: ask for a table/checklist/example to force structured synthesis.
- Context knobs: add small scenarios (e.g., “300 in / 500 out”) when estimating throughput/limits.
- Retrieval quality: combine a keyword retriever (BM25) with a semantic reranker for precision on ambiguous queries.
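Put together, these knobs compose into a single prompt. The sketch below is illustrative only; the function name, field names, and wording are assumptions, not a required schema.

```python
def build_prompt(domains, verb, facets, output_shape, scenario=None):
    """Compose a corpus-bound prompt from the mechanics listed above."""
    parts = [
        f"Only using {', '.join(domains)}:",                        # scope gate
        f"{verb} {', '.join(facets)}.",                             # operational verb + facets
        f"Return a {output_shape}.",                                # deliverable shape
        "Add snippet-level citations with dates. "
        "If a value isn't present, say 'not in corpus' and stop.",  # guardrails
    ]
    if scenario:
        parts.append(f"Scenario: {scenario}.")                      # context knob
    return " ".join(parts)

# Example: a Cost/Limits prompt pinned to two documentation domains.
print(build_prompt(
    domains=["docs.example.com", "developers.example.com"],
    verb="extract",
    facets=["billing model", "non-token charges", "rate limits (RPM/TPM)"],
    output_shape="table",
    scenario="300 input / 500 output tokens per request",
))
```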
Failure modes (and fixes)
- Unbounded scope → noisy retrieval. Fix: restrict to named domains/collections and the exact facets you want (endpoints, limits, billing).
- Missing data → hallucinations. Fix: the “not in corpus” clause + add the source to your index, then retry.
- Vague verbs → rambling. Fix: use list/compare/extract/show + a deliverable.
- No provenance → opinion drift. Fix: require dates + citations on claims.
Reusable prompt templates (copy/paste)
Use {domains} or {collections} to pin scope.
- Fit (API surface)
“Only using {domains}, list supported endpoints, base URLs, required headers, and streaming options. Return a table. Add snippet-level citations with dates.”
- Cost / Limits
“From {domains}, extract billing model(s) and any non-token charges. Include units (tokens/GB/RPM/TPM). If a price isn’t present, say ‘not in corpus’ and stop. Add citations + dates.”
- Performance
“Summarize documented rate limits, max request sizes, timeouts, and retry/backoff guidance. Provide a short checklist with citations.”
- Migration
“If code uses {SDK}, list changes to run on {provider}. Provide a short diff/checklist and note param/schema differences. Cite docs.”
- Reliability/Safety
“Summarize data retention/privacy statements for API use (training opt-out, log retention, PII guidance). If any area isn’t stated, say so. Cite.”
- Change awareness
“List changes in the last {N} days that affect buyers (date — change — why it matters). Cite release notes/changelogs.”
- Hands-on scenario (see the worked example after these templates)
“For {workload}, estimate whether defaults suffice (RPM/TPM/streaming). Flag where documented request-based increases apply. Cite limits pages.”
- Rigor / De-dup
“Within {domains}, identify duplicate/mirrored claims and keep one canonical citation. Explain the canonical choice.”
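For the hands-on scenario template, the estimate itself is plain arithmetic. The sketch below uses placeholder limit values; read the real RPM/TPM numbers from your provider's limits page before trusting the verdict.

```python
# Worked example: does a workload fit documented defaults?
REQUESTS_PER_MIN = 120             # expected workload
TOKENS_IN, TOKENS_OUT = 300, 500   # the "300 in / 500 out" scenario anchor

RPM_LIMIT = 500                    # placeholder default requests/minute
TPM_LIMIT = 200_000                # placeholder default tokens/minute

tokens_per_min = REQUESTS_PER_MIN * (TOKENS_IN + TOKENS_OUT)

print(f"RPM: {REQUESTS_PER_MIN}/{RPM_LIMIT} -> "
      f"{'OK' if REQUESTS_PER_MIN <= RPM_LIMIT else 'request an increase'}")
print(f"TPM: {tokens_per_min}/{TPM_LIMIT} -> "
      f"{'OK' if tokens_per_min <= TPM_LIMIT else 'request an increase'}")
# 120 requests * 800 tokens = 96,000 tokens/minute, under the placeholder 200,000 TPM.
```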
Mini checklist (printable)
- Domains pinned
- Task verbs chosen (list/compare/extract/show)
- Output shape defined (table/checklist/example)
- Dates + citations required
- “Not in corpus” rule included
- Units specified (tokens/GB/RPM/TPM)
- Scenario anchor added (optional)
FAQs
Do I need fine-tuning as well?
Not always. RAG covers freshness and grounding; fine-tuning helps with style or domain shorthand. Many teams combine both: RAG for facts, light customization for tone and structure. See AWS’s Nova case study comparing RAG vs. customization.
How do I start fast with a corpus-first assistant?
Spin up an assistant that answers only from your sources and always shows citations, then drop in the templates above. To get started, sign up for a free trial or compare plans on the pricing page.
What does “not in corpus” mean in prompts?
It’s a guardrail that tells the model to stop (and admit the gap) when the requested fact isn’t in your indexed sources—preventing guesses. Pair this with routine evaluation to catch “silent failures.”
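A lightweight check can run over every generated answer to flag responses that neither cite evidence nor use the agreed fallback phrase. The sketch below assumes citations are rendered as bracketed numbers like [1]; adjust the pattern to your citation format, and treat it as a floor, not a full RAG evaluation.

```python
import re

FALLBACK = "not in corpus"
CITATION_PATTERN = re.compile(r"\[\d+\]")  # assumes citations look like [1], [2], ...

def flag_silent_failure(answer: str) -> bool:
    """True if the answer makes claims without citations or the fallback phrase."""
    has_citation = bool(CITATION_PATTERN.search(answer))
    has_fallback = FALLBACK in answer.lower()
    return not (has_citation or has_fallback)

assert flag_silent_failure("Pricing is $0.002 per 1K tokens.")                       # claim, no evidence
assert not flag_silent_failure("Pricing is $0.002 per 1K tokens. [2] (2024-05-01)")  # cited claim
assert not flag_silent_failure("not in corpus")                                      # honest gap
```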
How do I improve retrieval precision?
Use a two-stage setup: keyword retrieval (e.g., BM25) followed by a semantic reranker (e.g., Cohere Rerank) to put the best passages on top.
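As a concrete, hedged sketch, the snippet below uses the rank_bm25 package for stage one and the Cohere Python SDK’s rerank endpoint for stage two. The model name and response fields reflect Cohere’s v3 rerank models as of this writing; confirm both against the current docs before relying on them.

```python
import cohere
from rank_bm25 import BM25Okapi

docs = [
    "Rate limits: 500 RPM and 200,000 TPM on the default tier.",
    "Billing is per token; storage is billed per GB-month.",
    "Streaming responses are supported via server-sent events.",
]

# Stage 1: keyword retrieval with BM25 over whitespace-tokenized docs.
bm25 = BM25Okapi([d.lower().split() for d in docs])
query = "what are the default rate limits?"
candidates = bm25.get_top_n(query.lower().split(), docs, n=3)

# Stage 2: semantic reranking of the BM25 candidates.
co = cohere.Client("YOUR_API_KEY")  # placeholder key
reranked = co.rerank(
    model="rerank-english-v3.0",
    query=query,
    documents=candidates,
    top_n=1,
)
best = candidates[reranked.results[0].index]
print(best)  # expected: the rate-limits passage
```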