Why this checklist?
Most audits flag issues but don’t tell you what to fix first. This one does. It prioritizes verifiable content (facts + citations) and duplicate control (canonicals/301s), so improvements stick and your editors can ship with confidence.
TL;DR — fast audit flow
- Inventory: crawl/export URLs; group by topic/template; note owners & “last updated.”
- Quality (people-first): clarify purpose, audience, depth, authorship, methods.
- Facts & citations: build a claim table; verify each claim; add citations or mark Not in corpus.
- Originality: run a plagiarism check; keep quotes short + cited; check media rights.
- Duplicates & canonicals: cluster near-duplicates; choose a primary URL; apply 301 or rel=canonical.
- On-page technicals: titles/meta, internal links to the primary, eligible schema (FAQ/HowTo).
- AEO (answer engine optimization) block (optional): H1 → 50-word answer → FAQ.
- Refresh plan: 90-day recheck of stats, links, and duplicates.
The Checklist
1) Inventory & scope
Export all indexable URLs from your CMS/crawler. Tag each with: template, topic, owner, last updated, word count, traffic/conversions. Flag thin, outdated, and orphan pages for merge or removal.
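The flagging step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `Page` fields, the 300-word and 365-day thresholds, and the "zero inbound internal links means orphan" rule are all assumptions to tune for your own site.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Page:
    url: str
    word_count: int
    last_updated: date
    inbound_links: int  # internal links pointing at this page (assumed field)


def flag_page(page, today, min_words=300, max_age_days=365):
    """Return the inventory flags that apply to one page.

    The thresholds are illustrative assumptions, not recommendations.
    """
    flags = []
    if page.word_count < min_words:
        flags.append("thin")
    if (today - page.last_updated).days > max_age_days:
        flags.append("outdated")
    if page.inbound_links == 0:
        flags.append("orphan")
    return flags


# Hypothetical inventory rows, as they might come from a crawler export.
pages = [
    Page("/guide", 1800, date(2024, 5, 1), 12),
    Page("/old-promo", 120, date(2021, 3, 10), 0),
]
today = date(2024, 9, 1)
report = {p.url: flag_page(p, today) for p in pages}
print(report)
```

Pages with any flag become candidates for merge or removal; pages with an empty list move on to the quality review.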
2) Quality (people-first)
Score each page on purpose clarity, audience fit, depth, and usefulness. Add author, methods, and sources where missing. Align with Google’s guidance on helpful, people-first content (useful purpose, expertise signals, and a good page experience).
Reference: Google Search Central — Creating helpful, reliable content
3) Facts & citations (the claim table)
Extract factual claims (numbers, definitions, comparisons). For each claim:
- Found in your corpus? Link the best source; include date/version.
- Not in corpus? Leave the claim out or add a trusted source to your corpus and retry.
- Ambiguous? Rewrite or remove.
This single table prevents “looks right” content from slipping through and trains teams to expect evidence.
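One possible shape for a claim-table row is sketched below. The `Claim` fields and the wording of the returned actions are assumptions chosen to mirror the three cases above; the example claims and URLs are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    text: str
    source_url: Optional[str] = None   # best source found in the corpus
    source_date: Optional[str] = None  # date or version of that source
    ambiguous: bool = False


def next_action(claim):
    """Map one claim to the editorial action the checklist calls for."""
    if claim.ambiguous:
        return "rewrite or remove"
    if claim.source_url is None:
        return "not in corpus: drop the claim or add a trusted source and retry"
    if claim.source_date is None:
        return "add the source date/version"
    return "verified: cite the source"


# Placeholder claims for illustration only.
claims = [
    Claim("The starter plan includes five seats.",
          "https://example.com/pricing", "2024-06"),
    Claim("Most teams finish an audit in a week."),
    Claim("Our tool is faster.", ambiguous=True),
]
actions = [next_action(c) for c in claims]
for claim, action in zip(claims, actions):
    print(f"{claim.text} -> {action}")
```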
4) Originality & rights
Run a plagiarism check on suspect pages. Keep quotations short and attributed; add links to originals. Validate usage rights for images, charts, and logos. Document sensitive terms and disclaimers (legal/medical/financial) in the page notes.
5) Duplicates, cannibalization & “off-topic”
Cluster near-duplicates (title similarity + embeddings). For each cluster:
- Pick one primary URL.
- If the content is truly moving: 301-redirect the old URL to the primary.
- If close variants must stay live: point rel=canonical at the primary and unify internal links.
- Update the XML sitemap to list primaries; remove outdated alternates.
Reference: Google Search Central — Consolidate duplicate URLs
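A rough sketch of the clustering step follows. It uses only title similarity via the standard library's `difflib` and leaves embeddings out, so treat it as a first pass. The 0.8 threshold, the greedy "compare to the first page in each cluster" strategy, and picking the primary by traffic are all assumptions.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """True when two titles are near-duplicates by character similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_by_title(pages, threshold=0.8):
    """Greedy clustering: join the first cluster whose first title matches."""
    clusters = []
    for page in pages:
        for cluster in clusters:
            if similar(page["title"], cluster[0]["title"], threshold):
                cluster.append(page)
                break
        else:
            clusters.append([page])
    return clusters


def pick_primary(cluster):
    """Assumed rule: the highest-traffic URL in the cluster is the primary."""
    return max(cluster, key=lambda p: p["traffic"])["url"]


# Hypothetical inventory rows.
pages = [
    {"url": "/audit-checklist", "title": "Content Audit Checklist", "traffic": 900},
    {"url": "/audit-checklist-2023", "title": "Content Audit Checklist 2023", "traffic": 150},
    {"url": "/pricing", "title": "Pricing Plans", "traffic": 400},
]
clusters = cluster_by_title(pages)
primaries = [pick_primary(c) for c in clusters]
print(primaries)
```

Every non-primary URL in a cluster then gets either a 301 or a rel=canonical pointing at the primary, per the rules above.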
6) On-page technicals
- Tight, descriptive titles and helpful meta descriptions.
- Internal links point to the primary (avoid linking to duplicates).
- Add FAQ/HowTo schema only when it genuinely helps readers.
- Make sure canonical tags, hreflang (if used), and sitemaps agree.
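The "canonicals and sitemaps agree" check can be automated once you have both exports. The sketch below assumes a hypothetical `canonicals` mapping (page URL to the URL in its rel=canonical tag) and a set of sitemap URLs; hreflang is left out for brevity.

```python
def find_conflicts(canonicals, sitemap_urls):
    """Report pages whose canonical tag and sitemap entry disagree.

    canonicals: dict mapping each page URL to its rel=canonical target.
    sitemap_urls: set of URLs listed in the XML sitemap.
    """
    problems = []
    for url, target in canonicals.items():
        if target not in canonicals:
            problems.append((url, "canonical target missing from inventory"))
        elif canonicals[target] != target:
            problems.append((url, "canonical points at a non-primary page"))
        if url in sitemap_urls and target != url:
            problems.append((url, "in sitemap but canonicalized elsewhere"))
    return problems


# Hypothetical exports.
canonicals = {
    "/guide": "/guide",
    "/guide-print": "/guide",
    "/guide-old": "/guide-print",
}
sitemap_urls = {"/guide", "/guide-print"}
problems = find_conflicts(canonicals, sitemap_urls)
for url, issue in problems:
    print(url, "->", issue)
```

Here `/guide-print` should be dropped from the sitemap, and `/guide-old` should point straight at `/guide` rather than at another variant.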
7) AEO readiness (optional but powerful)
For your most important pages, adopt the H1 → 50-word answer → FAQ pattern. It gives readers who skim a quick, trustworthy answer and clarifies the page’s focus for both machines and humans.
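As an illustration, the snippet below checks the short answer against the 50-word target and builds FAQ markup using the schema.org `FAQPage` vocabulary. The helper names and the sample question are hypothetical, and the word count is a simple whitespace split.

```python
import json


def answer_too_long(answer, max_words=50):
    """True when the short answer exceeds the word target."""
    return len(answer.split()) > max_words


def faq_jsonld(questions):
    """Build schema.org FAQPage markup from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in questions
        ],
    }


# Sample content for illustration.
short_answer = (
    "A claim table lists each factual claim on a page next to its "
    "best source, with a date or version."
)
faq = [("What is a claim table?", short_answer)]
markup = faq_jsonld(faq)

print(answer_too_long(short_answer))
print(json.dumps(markup, indent=2))
```

The JSON output would go inside a `<script type="application/ld+json">` element on the page, and only where the FAQ genuinely helps readers.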
Use a corpus-first assistant to make this fast
Spin up a research assistant that answers only from your sources and always shows citations. It turns audits into fixes you can verify in minutes.
Start now: start your free trial • compare plans: pricing.
FAQ
1) What’s the difference between a 301 redirect and canonicalization in a content audit?
Use a 301 redirect when an old or overlapping page is being fully replaced: it sends users and ranking signals to the primary URL. Use rel=canonical when closely similar pages must remain live (regional variants, UTM-tagged URLs, print pages); signals consolidate to the primary while both URLs stay accessible.
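To make the distinction concrete, here is a small sketch of what each option produces: a 301 is an HTTP response with a `Location` header, while a canonical is a `<link>` element in the page's `<head>`. The function name and the `must_stay_live` flag are hypothetical.

```python
from html import escape


def consolidation_action(old_url, primary_url, must_stay_live):
    """Describe how one duplicate URL should point at the primary."""
    if must_stay_live:
        # The variant keeps serving its own content and declares the primary.
        tag = f'<link rel="canonical" href="{escape(primary_url, quote=True)}">'
        return {"url": old_url, "type": "canonical", "html": tag}
    # The old URL stops serving content and permanently redirects.
    return {
        "url": old_url,
        "type": "301",
        "status": 301,
        "headers": {"Location": primary_url},
    }


redirect = consolidation_action(
    "https://example.com/guide-2022", "https://example.com/guide", False
)
canonical = consolidation_action(
    "https://example.com/guide/print", "https://example.com/guide", True
)
print(redirect)
print(canonical["html"])
```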
2) How do I find and fix duplicate content (cannibalization) fast?
Cluster pages by title/topic + similarity (embeddings or near-dup checks), pick one primary, then:
- Merge content where possible.
- Apply 301 to deprecated versions or rel=canonical if variants must remain.
- Update internal links and the XML sitemap to point at the primary only.
3) What should an AI content audit template include?
At minimum: URL, Template, Owner, Last Updated, Purpose, Word Count, Traffic/Conv, Claims Verified (Y/N), Citations Added, Duplicate Cluster ID, Primary URL, Canonical/301 Applied, Title/Meta Updated, Schema Added (Y/N), Next Action, Due Date.
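If you keep the template as a spreadsheet, a starter CSV with these columns can be generated as below. This is only a sketch: the `new_row` helper is hypothetical, and filling Due Date with "today plus 90 days" is an assumption borrowed from the 90-day refresh plan in the TL;DR.

```python
import csv
import io
from datetime import date, timedelta

COLUMNS = [
    "URL", "Template", "Owner", "Last Updated", "Purpose", "Word Count",
    "Traffic/Conv", "Claims Verified (Y/N)", "Citations Added",
    "Duplicate Cluster ID", "Primary URL", "Canonical/301 Applied",
    "Title/Meta Updated", "Schema Added (Y/N)", "Next Action", "Due Date",
]


def new_row(url, today, fields=None, recheck_days=90):
    """Create a blank audit row; Due Date defaults to the next recheck."""
    row = {column: "" for column in COLUMNS}
    row["URL"] = url
    row["Due Date"] = (today + timedelta(days=recheck_days)).isoformat()
    row.update(fields or {})
    return row


# Write the header and one sample row to an in-memory CSV.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
row = new_row(
    "/guide",
    date(2024, 9, 1),
    {"Owner": "content team", "Next Action": "verify claims"},
)
writer.writerow(row)
print(buffer.getvalue())
```

Swap `io.StringIO()` for an open file handle to save the template to disk.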