## Installation

```bash
npx skills add lionkiii/claude-seo-skills --skill seo-internal-links
```

## Summary
An agent can crawl a domain to build its internal link graph, identify orphan pages, underlinked pages (fewer than 3 inbound links), and broken internal links, and generate anchor text suggestions for improving link distribution. Invoke when a user needs to audit internal linking health or improve site interconnection.
## SKILL.md

### Internal Link Audit

Crawls a domain to build an internal link graph. Identifies orphan pages, underlinked pages, and broken internal links. Suggests anchor text improvements.
### Inputs

- `domain`: Target domain URL (e.g., `https://example.com`). Include the protocol.
- `max_pages` (optional): Maximum number of pages to crawl. Default: 100. Maximum: 200.
### Execution

#### Step 1: Crawl Site

Use `scripts/fetch_page.py` and `scripts/parse_html.py` to crawl the site:
```bash
# Fetch homepage
python3 scripts/fetch_page.py <domain>

# Parse HTML to extract links
python3 scripts/parse_html.py <html_content>
```
Follow internal links only (same domain hostname). Normalize URLs as follows (see the sketch after this list):

- Remove trailing slashes (treat `/about` and `/about/` as the same page)
- Remove URL fragments (`#section`)
- Preserve query strings only if they appear to be content, not tracking: strip `utm_*`, `ref=`, `source=`
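A minimal normalization sketch, assuming only the Python standard library; the helper name `normalize_url` is illustrative, not part of the skill's scripts:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_KEYS = {"ref", "source"}  # exact-match tracking params to strip

def normalize_url(url: str) -> str:
    """Canonicalize an internal URL per the rules above (illustrative helper)."""
    parts = urlsplit(url)
    # Drop trailing slashes from the path; keep "/" for the homepage.
    path = parts.path.rstrip("/") or "/"
    # Keep only query params that look like content, not tracking.
    query = urlencode([
        (k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not k.startswith("utm_") and k not in TRACKING_KEYS
    ])
    # Rebuild with the fragment removed entirely.
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, query, ""))

# /about/ with a tracking param and a fragment collapses to the bare page:
assert normalize_url("https://example.com/about/?utm_source=x#team") == "https://example.com/about"
```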
Respect `robots.txt`: fetch `<domain>/robots.txt` first and skip disallowed paths.

Cap the crawl at `max_pages`, and track the crawl queue in BFS order from the homepage, as in the sketch below.
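A sketch of the crawl loop under these constraints, reusing `normalize_url` from above. The crude `href` regex stands in for `parse_html.py`; a real run would call the skill's scripts instead:

```python
import re
from collections import deque
from urllib.parse import urljoin, urlsplit
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

HREF = re.compile(r'href="([^"]+)"', re.I)  # crude stand-in for parse_html.py

def crawl(domain: str, max_pages: int = 100) -> dict[str, int]:
    """BFS from the homepage; returns {url: depth}, where depth 1 = homepage."""
    host = urlsplit(domain).netloc
    robots = RobotFileParser(domain.rstrip("/") + "/robots.txt")
    robots.read()

    start = normalize_url(domain)
    depths = {start: 1}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except OSError:
            continue  # skip unreachable pages
        for href in HREF.findall(html):
            link = normalize_url(urljoin(url, href))
            if urlsplit(link).netloc != host:
                continue  # internal links only
            if link in depths or len(depths) >= max_pages:
                continue  # already discovered, or crawl cap reached
            if not robots.can_fetch("*", link):
                continue  # respect robots.txt disallow rules
            depths[link] = depths[url] + 1  # BFS level = link depth
            queue.append(link)
    return depths
```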
#### Step 2: Build Link Graph

For each crawled page, record:

- Source URL
- Target URL (every internal link found)
- Anchor text for each link

Build an adjacency structure (sketched after this list):

- `outbound[url]` = list of (target, anchor_text)
- `inbound[url]` = list of (source, anchor_text)
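A minimal sketch of the adjacency build; `page_links` is an assumed intermediate collected during the crawl, with one example entry shown:

```python
from collections import defaultdict

# Assumed shape, collected during the crawl: {source: [(target, anchor_text), ...]}
page_links = {
    "https://example.com/": [("https://example.com/about", "About us")],
}

outbound: dict[str, list] = defaultdict(list)  # url -> [(target, anchor_text), ...]
inbound: dict[str, list] = defaultdict(list)   # url -> [(source, anchor_text), ...]

for source, links in page_links.items():
    for target, anchor in links:
        outbound[source].append((target, anchor))
        inbound[target].append((source, anchor))
```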
#### Step 3: Calculate Per-Page Metrics

For each crawled page, compute (see the sketch below):

- Inbound internal link count (links from other crawled pages pointing here)
- Outbound internal link count (links from this page to other pages)
- Link depth from homepage (the BFS level at which the page was first discovered)
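Continuing with the illustrative names from the sketches above (`depths`, `inbound`, `outbound`), the metrics reduce to a dictionary comprehension:

```python
# Per-page metrics from the crawl depths and the link graph.
metrics = {
    url: {
        "inbound": len(inbound.get(url, [])),    # links pointing here
        "outbound": len(outbound.get(url, [])),  # links leaving this page
        "depth": depth,                          # BFS level; 1 = homepage
    }
    for url, depth in depths.items()
}
```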
#### Step 4: Identify Issues

Using the graph, the sitemap, and the metrics above, flag the following (sketched after this list):

- Orphan pages: URLs known from the sitemap (`<domain>/sitemap.xml`) but with 0 inbound internal links from crawled pages. Fetch the sitemap first to identify all known URLs.
- Underlinked pages: Pages with fewer than 3 inbound internal links (high priority for internal linking).
- Excessive outbound: Pages with more than 100 outbound internal links (PageRank dilution).
- Broken internal links: For the 20 most-linked pages, verify HTTP status; flag 4xx/5xx responses.
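The checks are straightforward given `metrics` and `inbound` from the earlier sketches plus a `sitemap_urls` set parsed from the sitemap (an assumed input); `status_of` is a hypothetical helper shown inline:

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

def status_of(url: str) -> int:
    """HTTP status for a URL; 0 means unreachable (illustrative helper)."""
    try:
        return urlopen(url, timeout=10).status
    except HTTPError as e:
        return e.code
    except URLError:
        return 0

# `sitemap_urls` is assumed: a set of normalized URLs from <domain>/sitemap.xml.
orphans = [u for u in sitemap_urls if not inbound.get(u)]
underlinked = [u for u, m in metrics.items() if m["inbound"] < 3]
excessive = [u for u, m in metrics.items() if m["outbound"] > 100]

# Verify HTTP status for the 20 most-linked pages; flag 4xx/5xx.
most_linked = sorted(metrics, key=lambda u: metrics[u]["inbound"], reverse=True)[:20]
broken = [u for u in most_linked if status_of(u) >= 400]
```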
#### Step 5: Anchor Text Suggestions for Top 5 Underlinked Pages

For each of the 5 most underlinked pages (fewest inbound links, excluding orphans), as sketched after this list:

- Fetch the page and extract its H1 and `<title>` tag
- Identify the top 3 relevant anchor text options based on H1 noun phrases, title keywords, and the page URL slug
- Find pages in the crawl that mention related terms and could link to this page
- Output: target URL, suggested anchors (ranked), recommended source pages
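A sketch of the anchor derivation, using naive regex extraction in place of `parse_html.py`; both helpers below are illustrative, not part of the skill:

```python
import re

def anchor_candidates(html: str, url: str) -> list[str]:
    """Top anchor text options from the page's H1, <title>, and URL slug."""
    h1 = re.search(r"<h1[^>]*>(.*?)</h1>", html, re.I | re.S)
    title = re.search(r"<title[^>]*>(.*?)</title>", html, re.I | re.S)
    slug = url.rstrip("/").rsplit("/", 1)[-1].replace("-", " ")
    options = [m.group(1).strip() for m in (h1, title) if m] + [slug]
    return options[:3]

def candidate_sources(term: str, page_html: dict[str, str]) -> list[str]:
    """Crawled pages whose content mentions `term` and could host a new link."""
    return [url for url, html in page_html.items() if term.lower() in html.lower()]
```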
#### Step 6: Optional Ahrefs Enrichment

If Ahrefs is available (ToolSearch `+ahrefs`):

- Fetch `site-explorer-pages-by-internal-links` for the domain
- Cross-reference crawl findings with the Ahrefs data (sketched below)
- Add an `### Ahrefs Internal Link Data` section showing discrepancies
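For the cross-reference, a hypothetical sketch; it assumes the Ahrefs response has already been parsed into an `ahrefs_inbound` mapping of `{url: internal_link_count}` (the tool's actual response shape is not documented here):

```python
# Pages where the crawl's inbound count disagrees with Ahrefs.
discrepancies = {
    url: {"crawl": metrics[url]["inbound"], "ahrefs": count}
    for url, count in ahrefs_inbound.items()
    if url in metrics and metrics[url]["inbound"] != count
}
```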
### Output Format

```markdown
## Internal Link Audit: [domain]

### Summary Stats
| Metric | Value |
|--------|-------|
| Pages crawled | N / max_pages |
| Total internal links | N |
| Avg inbound links per page | N.N |
| Orphan pages found | N |
| Underlinked pages (<3 inbound) | N |
| Pages with broken internal links | N |

### Orphan Pages
Pages with 0 inbound internal links:
| URL | Depth | Found via |
|-----|-------|-----------|
| /page | — | sitemap |

### Top 5 Underlinked Pages — Suggested Anchors
**1. [URL]** — N inbound links
- H1: "[heading]"
- Suggested anchors: "[anchor 1]", "[anchor 2]", "[anchor 3]"
- Recommended source pages: [list pages that mention related terms]

### Broken Internal Links
| Source Page | Broken Link | Status Code |
|---|---|---|

### Link Depth Distribution
| Depth | Pages | % of Total |
|-------|-------|------------|
| 1 (homepage) | 1 | X% |
| 2 | N | X% |
| 3+ | N | X% |

### Pages with Excessive Outbound Links (>100)
| Page | Outbound Links |
|------|----------------|

## Data Sources
- Internal crawl via fetch_page.py + parse_html.py ([N] pages, [date])
- [Ahrefs MCP — if used]
```