Installation
```
$ npx skills add AndacGuven/technical-seo-skill --skill skill-md
```
Summary
This skill enables an agent to execute full-scope technical SEO audits on domains and subdomains, producing a structured Excel workbook with broken links, orphaned URLs, metadata issues, schema analysis, and image SEO findings. The agent can manage checkpointed crawl state, resume interrupted audits, and scale crawl parameters for large sites.
SKILL.MD
technical-seo-skill
This skill is the operating contract for production-grade technical SEO crawls.
Use it when the user asks to:
- crawl a domain or website
- generate a technical SEO audit file
- inspect broken internal links
- detect orphaned pages
- audit metadata, canonicals, H1s, hreflang, headings, schema, and image SEO
- produce an execution-ready workbook for SEO operations
Primary Entry Point
Run the skill through:
technical_seo_skill.py
This same script is also the underlying crawl engine; there is no separate engine module.
Hard Output Contract
These requirements are mandatory:
- The final deliverable must be an .xlsx workbook written under reports/.
- A markdown report is never an acceptable substitute for the workbook.
- The crawl must create a checkpoint directory under checkpoints/<domain>/.
- The checkpoint must contain the normal crawl state files used by this project.
- The workbook must follow the same structure used by reports/fal-ai-final-fresh.xlsx:
  - Issues must be the first sheet.
  - Schema must be a single combined sheet.
  - Broken Internal and Orphaned URLs must be present.
- Do not generate a separate AI corrections workbook.
If these conditions are not met, the job is incomplete.
Expected Crawl Behavior
This skill should behave like a real technical SEO specialist.
That means it should:
- discover URLs across the main domain and relevant subdomains
- normalize noisy parameter variants (see the sketch after this list)
- keep auth wrappers from polluting SEO task URLs
- validate internal targets before reporting broken internal links
- calculate orphaned URLs from the internal crawl graph
- inspect image inventory and image quality issues
- inspect structured data and produce a schema summary plus schema samples
- convert findings into action-ready tasks
Do not treat this skill like a single-page metadata checker.
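For illustration, parameter normalization might look like the sketch below. This is a hypothetical helper, not the engine's actual code; the set of parameters treated as noise (share, utm_*, ref) is an assumption to adapt per site.

```python
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

# Hypothetical noise list -- the real engine's rules may differ.
NOISE_PARAMS = {"share", "utm_source", "utm_medium", "utm_campaign", "ref"}

def normalize_url(url: str) -> str:
    """Collapse noisy parameter variants onto one canonical task URL."""
    parts = urlparse(url)
    # Drop tracking/share parameters, keep the rest in stable order.
    kept = sorted(
        (k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in NOISE_PARAMS
    )
    # Strip fragments and normalize a trailing slash on non-root paths.
    path = parts.path.rstrip("/") or "/"
    return urlunparse((parts.scheme, parts.netloc.lower(), path,
                       parts.params, urlencode(kept), ""))

assert normalize_url("https://Example.com/page/?share=x&id=2") == \
       "https://example.com/page?id=2"
```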
Core Audit Scope
The audit should check, when available:
- robots.txt
- llms.txt
- sitemap discovery and parsing
- page status codes
- load time
- title presence, length, and duplication
- meta description presence, length, and duplication
- H1 presence, count, and duplication
- heading hierarchy issues
- canonical presence and non-self canonicals
- hreflang presence
- structured data presence
- broken internal links
- orphaned URLs (see the set-difference sketch after this list)
- crawl depth
- low text-to-HTML ratio
- thin content
- image inventory
- image file sizes and formats
- missing alt text
- weak alt text
- broken internal images
- duplicate patterns
- remediation tasks
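The orphaned-URL check above reduces to a set difference over the internal crawl graph: any crawled page with no inbound internal link, and which is not a designated root, is an orphan. A minimal sketch, assuming edges are (source, target) pairs of already-normalized internal URLs:

```python
def find_orphans(crawled: set[str], edges: list[tuple[str, str]],
                 roots: set[str]) -> set[str]:
    """Orphans = crawled pages with no inbound internal link.

    Assumes `edges` come from the internal link graph and that
    `roots` (home page, sitemap entry points) are exempt.
    """
    linked = {target for source, target in edges if source != target}
    return crawled - linked - roots

pages = {"https://example.com/", "https://example.com/a", "https://example.com/b"}
links = [("https://example.com/", "https://example.com/a")]
print(find_orphans(pages, links, roots={"https://example.com/"}))
# -> {'https://example.com/b'}
```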
Standard Workflow
- Confirm whether the user wants a fresh crawl or a resumed crawl.
- If the user wants a fresh crawl, delete the existing checkpoint directory for that domain first.
- Run technical_seo_skill.py. If the user does not specify --output, the skill should automatically create a domain-based Excel filename under reports/.
- For large sites, prefer explicit --max-urls, --workers, --batch-size, and --timeout values.
- Monitor checkpoints/<domain>/state.json during long runs (a polling sketch follows this list).
- Do not claim the report is complete until the discovery, link validation, image audit, and workbook phases are all complete.
- After completion, verify that the workbook opens and the expected sheets exist.
- Summarize the results from the workbook, not from assumptions.
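For the monitoring step, a polling sketch like the following can be used. The state.json field names (phase, discovered, fetched) are assumptions about the checkpoint format; adjust them to whatever the engine actually writes.

```python
import json
import time
from pathlib import Path

def watch_state(domain: str, interval: int = 30) -> None:
    """Print crawl progress from the checkpoint until interrupted.

    Field names below are assumptions about state.json, not a
    documented schema.
    """
    state_file = Path("checkpoints") / domain / "state.json"
    while True:
        if state_file.exists():
            state = json.loads(state_file.read_text())
            print(f"phase={state.get('phase')} "
                  f"discovered={state.get('discovered')} "
                  f"fetched={state.get('fetched')}")
        time.sleep(interval)

# watch_state("example.com")
```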
Command Patterns
Basic run:
```
python technical_seo_skill.py https://example.com
```
Recommended large-site run:
```
python technical_seo_skill.py https://example.com ^
  --max-urls 12000 ^
  --workers 12 ^
  --batch-size 50 ^
  --timeout 10
```
Resume an interrupted crawl:
```
python technical_seo_skill.py https://example.com ^
  --max-urls 12000 ^
  --workers 12 ^
  --batch-size 50 ^
  --timeout 10 ^
  --resume
```
If --output is omitted, the skill should default to:
reports/<domain>_audit.xlsx
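One plausible way to derive that default from the target URL, assuming dots in the host map to dashes as in reports/fal-ai-final-fresh.xlsx (the authoritative naming rule lives in the engine):

```python
from pathlib import Path
from urllib.parse import urlparse

def default_output(url: str) -> Path:
    """Derive reports/<domain>_audit.xlsx from the crawl target.

    Assumption: the host's dots become dashes; verify against
    the engine before relying on this.
    """
    host = urlparse(url).netloc.removeprefix("www.")
    return Path("reports") / f"{host.replace('.', '-')}_audit.xlsx"

print(default_output("https://www.example.com"))
# -> reports/example-com_audit.xlsx
```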
Workbook Structure Contract
The workbook should match this structure:
- Issues
- Sayfa Bilgileri
- Blog Pages
- Images
- Large Images (100KB+)
- Error Pages
- Robots.txt & Sitemaps
- N-gram Analysis
- Duplicate Content Issues
- Meta Tag Issues
- Image SEO Issues
- Heading Structure Issues
- Internal Link Issues
- Page Weight Issues
- Content Quality Issues
- URL Structure Issues
- Structured Data Issues
- Keyword Analysis
- Tasks
- Excluded Auth
- Orphaned URLs
- High Depth URLs
- Missing Canonical
- Slow URLs
- Broken Internal
- Site Checks
- Schema
- Duplicates
- Links
Issues should combine:
- overview metrics
- raw crawl summary metrics
- consolidated issue inventory
Schema should combine:
- schema summary
- schema samples
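The post-run verification step from the workflow can be scripted. A sketch using openpyxl that spot-checks a subset of the contract sheets and the Issues-first rule:

```python
from openpyxl import load_workbook

# Spot-check subset of the sheet contract above.
REQUIRED = ["Issues", "Tasks", "Broken Internal", "Orphaned URLs",
            "Schema", "Sayfa Bilgileri"]

def verify_workbook(path: str) -> None:
    wb = load_workbook(path, read_only=True)
    missing = [s for s in REQUIRED if s not in wb.sheetnames]
    if missing:
        raise SystemExit(f"workbook incomplete, missing sheets: {missing}")
    if wb.sheetnames[0] != "Issues":
        raise SystemExit("Issues must be the first sheet")
    print(f"{path}: {len(wb.sheetnames)} sheets, contract satisfied")

verify_workbook("reports/example-com_audit.xlsx")
```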
Checkpoint Expectations
During a valid crawl, the domain checkpoint should contain the normal state files for this project, including:
- state.json
- pages.csv
- links_raw.csv
- images_raw.csv
- target_status.csv
- image_meta.csv
If these files are not being created during the crawl, the skill is not operating correctly.
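A quick sanity check for those files while a crawl is running (note that some files may only appear once their phase has started):

```python
from pathlib import Path

EXPECTED = ["state.json", "pages.csv", "links_raw.csv",
            "images_raw.csv", "target_status.csv", "image_meta.csv"]

def check_checkpoint(domain: str) -> bool:
    """Report which expected state files are missing, if any."""
    root = Path("checkpoints") / domain
    missing = [name for name in EXPECTED if not (root / name).exists()]
    if missing:
        print(f"checkpoint incomplete, missing: {missing}")
    return not missing

check_checkpoint("example.com")
```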
Guardrails
- Keep all generated documentation and workbook-facing text in English unless the user explicitly asks otherwise.
- Do not present ?share= variants as separate SEO task URLs.
- Do not present login or auth wrapper URLs as final SEO task URLs.
- Do not remove Broken Internal or Orphaned URLs from the workflow.
- Do not say the crawl is complete if only discovery has finished.
- Do not silently narrow scope to the apex domain if relevant subdomains are in scope.
- Do not deliver prose or markdown when the user asked for an audit file.
- Do not change the workbook structure casually.
Review Order
After the workbook is generated, review in this order:
1. Issues
2. Tasks
3. Broken Internal
4. Orphaned URLs
5. High Depth URLs
6. Schema
7. Images
8. Sayfa Bilgileri
Alignment Rule
If the engine changes materially, update this file so it stays aligned with:
- real sheet names
- real crawl behavior
- real checkpoint behavior
- real output files
Do not document features that do not exist in the codebase.