seo-llms-txt

Installation

$npx skills add lionkiii/claude-seo-skills --skill seo-llms-txt

Summary

This skill enables an agent to generate, validate, or audit llms.txt files that help LLMs understand a website's structure. The agent can crawl site hierarchies, produce spec-compliant Markdown indexes, check existing files against the llms.txt standard, or validate files for compliance—supporting three distinct workflows triggered by user language cues.

SKILL.MD

llms.txt — Generate, Validate & Audit

Analyze, generate, or validate llms.txt files per the llms.txt specification (proposed by Jeremy Howard / Answer.AI). The llms.txt standard provides a Markdown file at /llms.txt that helps LLMs understand a website's content structure — similar to how robots.txt guides traditional crawlers.

Adoption: 950+ sites including Anthropic, Cloudflare, Docker, Vercel, Stripe, FastHTML, Cursor, Hugging Face, and many more.

Inputs

Three modes, auto-detected from the user's input:

Input PatternModeExample
/seo llms-txt <url>Audit/seo llms-txt example.com
/seo llms-txt generate <url>Generate/seo llms-txt generate example.com
/seo llms-txt validate <url-or-path>Validate/seo llms-txt validate /path/to/llms.txt
  • If user provides a URL without sub-command → Audit mode (fetch and check existing file)
  • If user says "generate" → Generate mode (crawl site, produce llms.txt)
  • If user says "validate" → Validate mode (parse and check against spec)

llms.txt Spec Reference

@skills/seo/references/llms-txt-spec.md

Quick Spec Summary

The llms.txt file must be valid Markdown with this structure:

  1. H1 heading (required) — Project/site name
  2. Blockquote (optional) — Short site summary
  3. H2 sections — Categorized link lists with descriptions
  4. Link format: - [Name](url): Description (colon + space after URL)
  5. ## Optional section — Links useful but not essential for understanding
  6. Companion files (optional):
    • llms-full.txt — Comprehensive version with full content
    • Individual .md page mirrors for key pages

Execution: Audit Mode

When user provides a URL (no sub-command):

  1. Fetch llms.txt: Use WebFetch to retrieve <url>/llms.txt
  2. Run 8-point check:
#CheckPass Criteria
1File existsHTTP 200 at /llms.txt
2Valid MarkdownParseable as Markdown, no broken syntax
3Has H1Exactly one # heading
4Has blockquote summary> line present after H1 (recommended)
5Has H2 sectionsAt least one ## section with links
6Link format correctLinks follow - [Name](url): description pattern
7Has ## OptionalContains an Optional section (recommended)
8Companion llms-full.txtllms-full.txt exists at same path (recommended)
  1. Link spot-check: WebFetch up to 5 linked URLs to verify they resolve (HTTP 200)
  2. Generate score card and recommendations

Execution: Generate Mode

When user says "generate":

  1. Crawl site structure:

    • WebFetch the homepage — extract title, meta description, nav links
    • WebFetch /sitemap.xml (or /sitemap_index.xml) — extract all URLs
    • If no sitemap, follow nav links from homepage (up to 50 pages)
  2. Categorize pages into sections based on URL patterns and content:

    • ## Docs/docs/, /documentation/, /guides/, /reference/
    • ## Blog/blog/, /articles/, /posts/, /news/
    • ## API/api/, /developers/, /reference/
    • ## Products/products/, /features/, /pricing/, /solutions/
    • ## About/about/, /team/, /company/, /contact/
    • ## Legal/privacy/, /terms/, /legal/
    • ## Optional — Catch-all for remaining useful pages
  3. Extract metadata per page:

    • Title (from <title> or <h1>)
    • Description (from meta[name=description] or first paragraph)
    • Truncate description to ~100 characters
  4. Generate llms.txt:

    # Site Name
    
    > Brief site description from meta or homepage hero text.
    
    ## Docs
    
    - [Getting Started](https://example.com/docs/start): Introduction and setup guide
    - [API Reference](https://example.com/docs/api): Complete API documentation
    
    ## Blog
    
    - [Latest Post](https://example.com/blog/post): Description of the post
    
    ## Optional
    
    - [About](https://example.com/about): Company background
    - [Contact](https://example.com/contact): Get in touch
    
  5. Generate llms-full.txt skeleton (optional — note to user):

    Suggest that llms-full.txt should contain the full Markdown content
    of all key pages concatenated. Provide the structure but note that
    generating full content requires crawling every page.
    
  6. Output the generated file — display it in a code block and offer to write to disk

Execution: Validate Mode

When user says "validate":

  1. Load the file:

    • If URL → WebFetch <url>/llms.txt
    • If local path → Read the file
  2. Run spec compliance checks:

#RuleSeverityCheck
1Has exactly one H1FAILCount # lines (not ##)
2H1 is first headingFAILFirst line starting with # must be H1
3Valid MarkdownFAILNo broken link syntax, unclosed brackets
4Links use correct formatWARN- [Name](url): description pattern
5Has at least one H2 sectionFAILAt least one ## heading
6No empty sectionsWARNEvery H2 should have at least one link
7Blockquote presentINFO> after H1 is recommended
8## Optional section existsINFORecommended by spec
9No deep heading levelsWARNOnly H1 and H2 allowed (no H3+)
10Descriptions presentWARNLinks should have : description after URL
  1. Output pass/fail per rule with line numbers for failures

Output Format

Audit Mode Output

## llms.txt Audit: [domain]

**Score: X/8 checks passed**

| # | Check | Result | Details |
|---|---|---|---|
| 1 | File exists | PASS/FAIL | [status code or error] |
| 2 | Valid Markdown | PASS/FAIL | [issue if any] |
| ... | ... | ... | ... |

### Link Spot-Check
| URL | Status |
|---|---|
| [url] | 200 OK / 404 / timeout |

### Recommendations
- [Prioritized list of improvements]

### What is llms.txt?
> llms.txt is a proposed standard (llmstxt.org) that provides LLMs with
> a structured Markdown overview of your site. 950+ sites have adopted it.
> While AI platform adoption of the standard is still emerging, early
> adoption positions your site for AI search visibility — similar to how
> early schema.org adoption paid off for rich snippets.

Generate Mode Output

## Generated llms.txt for [domain]

[The generated llms.txt content in a code block]

### Generation Notes
- Pages crawled: [count]
- Sections created: [list]
- Pages excluded: [count and reason]

### Next Steps
1. Review and edit the generated file
2. Upload to your site root as `/llms.txt`
3. Consider creating `llms-full.txt` with full page content
4. Add `<link rel="help" href="/llms.txt">` to your HTML head (optional)

Validate Mode Output

## llms.txt Validation: [source]

**Result: X/10 rules passed** | Y warnings | Z info

| # | Rule | Result | Line | Details |
|---|---|---|---|---|
| 1 | Has H1 | PASS/FAIL | [line#] | [details] |
| ... | ... | ... | ... | ... |

### Issues to Fix
- [FAIL items with specific fix instructions]

### Recommendations
- [WARN items with suggestions]