Blog Fact-Check

Verify statistics, claims, and source attributions in blog posts. Pure Claude pipeline with no external NLP dependencies.

Workflow

Step 1: Read the Blog Post

Read the target file and identify all sections containing data claims.

Step 2: Extract Statistical Claims

Scan the full text for every claim that includes a number, percentage, dollar amount, or named source. Build a claims list with these fields:

Field	Description
claim_text	The exact sentence or phrase containing the statistic
value	The numeric value (e.g., "42%", "$1.2M", "3x")
attribution	Named source if present (e.g., "HubSpot", "Gartner 2025")
url	Cited URL if present (from markdown link or parenthetical)
location	Heading or line number where the claim appears

Step 3: Verify Cited Claims

For each claim that includes a URL:

Fetch the source page via WebFetch
Search the returned content for the specific numeric value
If exact value found, check surrounding context matches the claim topic
Assign a confidence score (see Verification Scoring below)

Process claims sequentially to avoid rate-limiting source sites.

Step 4: Flag Uncited Claims

For claims without a URL:

Mark status as UNVERIFIED
Suggest a search query the user can run to find a source
If the attribution names a specific organization, suggest their domain

Step 5: Generate Verification Report

Output the full results table, summary statistics, and recommended actions.

Claim Extraction Patterns

Identify claims matching these structures:

Fully cited (highest priority):

[Number]% [claim] ([Source], [Year]) - parenthetical citation
[claim] [Number]% ... [markdown link to source] - inline link
According to [Source], [Number]... - attribution lead

Uncited statistics (flag for sourcing):

[Number]% of [noun phrase] - standalone percentage
[Number]x more/less/higher/lower - multiplier claims
$[Number] [claim] - dollar figures without attribution

Weak signals (check context before extracting):

studies show, research indicates, data suggests + nearby number
survey found, report reveals, analysis shows + nearby number
Round numbers in isolation (e.g., "millions of users") - skip unless specific

Verification Scoring

Score	Status	Criteria
1.0	VERIFIED	Exact number found on cited page in matching context
0.7-0.9	PARAPHRASE	Similar data found but with different wording, rounding, or timeframe
0.3-0.6	WEAK	Source page exists and covers the topic but the specific statistic is not visible
0.0	NOT FOUND	Cited page does not contain the claimed data anywhere
N/A	UNVERIFIED	No source URL provided for the claim

Scoring guidance:

A claim of "43%" when the source says "nearly half" scores 0.8
A claim of "2024" data when the source only has "2023" scores 0.7
A claim citing a homepage when the stat lives on a subpage scores 0.3
A 404 or unreachable URL scores 0.0

Output Format

Verification Report: [Post Title]

File: [path] Claims found: [total] Verified: [count] | Paraphrase: [count] | Weak: [count] | Not Found: [count] | Unverified: [count]

#	Claim	Source URL	Score	Status	Notes
1	"73% of marketers..."	https://example.com/report	1.0	VERIFIED	Exact match found in section 3
2	"5x ROI improvement"	https://example.com/study	0.8	PARAPHRASE	Source says "nearly 5x"
3	"60% prefer video"	(none)	N/A	UNVERIFIED	Try: "video preference statistics 2025"

Recommended Actions

[List claims that need source URLs]
[List claims with weak or not-found scores that need replacement sources]
[List claims where the source data may be outdated]

Integration

This skill can be called from blog-analyze as an optional deep-verification step. When invoked from the analyzer, only claims scoring below 0.7 are flagged in the analysis report.

Standalone usage: /blog factcheck path/to/post.md

Limitations

Paywalled content: WebFetch cannot access content behind login walls. These score as WEAK (0.5) with a note about paywall detection.
Dynamic pages: JavaScript-rendered content may not be available via WebFetch. If the page returns minimal content, note this in the status.
PDF sources: WebFetch may not extract PDF text reliably. Flag PDF URLs for manual verification.
Archived pages: If a URL returns 404, suggest checking web.archive.org.
Rate limits: Process no more than 10 URLs per run to avoid overwhelming source servers. If a post has more than 10 cited URLs, verify the first 10 and list the remainder as SKIPPED.

blog-factcheck

Installation

Summary

SKILL.MD