docs-seeker

Searches the internet for technical documentation using the llms.txt standard, GitHub repository analysis via Repomix, and parallel exploration. Use when the user needs: (1) the latest documentation for a library or framework, (2) documentation in llms.txt format, (3) GitHub repository analysis, (4) documentation for a project without direct llms.txt support, (5) multiple documentation sources explored in parallel.
Last updated: 6 months ago
Repository
mrgoonie/claudekit-skills
All powerful skills of ClaudeKit.cc!
1,200 stars
Skill content

# Documentation Discovery & Analysis

## Overview

Intelligent discovery and analysis of technical documentation through multiple strategies:

1. **llms.txt-first**: Search for standardized AI-friendly documentation
2. **Repository analysis**: Use Repomix to analyze GitHub repositories
3. **Parallel exploration**: Deploy multiple Explorer agents for comprehensive coverage
4. **Fallback research**: Use Researcher agents when the other methods are unavailable

## Core Workflow

### Phase 1: Initial Discovery

1. **Identify target**
   - Extract library/framework name from user request
   - Note version requirements (default: latest)
   - Clarify scope if ambiguous
   - Determine whether the target is a GitHub repository or a website

2. **Search for llms.txt (PRIORITIZE context7.com)**

   **First: Try context7.com patterns**

   For GitHub repositories:
   ```
   Pattern: https://context7.com/{org}/{repo}/llms.txt
   Examples:
   - https://github.com/imagick/imagick → https://context7.com/imagick/imagick/llms.txt
   - https://github.com/vercel/next.js → https://context7.com/vercel/next.js/llms.txt
   - https://github.com/better-auth/better-auth → https://context7.com/better-auth/better-auth/llms.txt
   ```

   For websites:
   ```
   Pattern: https://context7.com/websites/{normalized-domain-path}/llms.txt
   Examples:
   - https://docs.imgix.com/ → https://context7.com/websites/imgix/llms.txt
   - https://docs.byteplus.com/en/docs/ModelArk/ → https://context7.com/websites/byteplus_en_modelark/llms.txt
   - https://docs.haystack.deepset.ai/docs → https://context7.com/websites/haystack_deepset_ai/llms.txt
   - https://ffmpeg.org/doxygen/8.0/ → https://context7.com/websites/ffmpeg_doxygen_8_0/llms.txt
   ```

   **Topic-specific searches** (when user asks about specific feature):
   ```
   Pattern: https://context7.com/{path}/llms.txt?topic={query}
   Examples:
   - https://context7.com/shadcn-ui/ui/llms.txt?topic=date
   - https://context7.com/shadcn-ui/ui/llms.txt?topic=button
   - https://context7.com/vercel/next.js/llms.txt?topic=cache
   - https://context7.com/websites/ffmpeg_doxygen_8_0/llms.txt?topic=compress
   ```

   **Fallback: Traditional llms.txt search**
   ```
   WebSearch: "[library name] llms.txt site:[docs domain]"
   ```
   Common patterns:
   - `https://docs.[library].com/llms.txt`
   - `https://[library].dev/llms.txt`
   - `https://[library].io/llms.txt`

   → Found? Proceed to Phase 2
   → Not found? Proceed to Phase 3
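The GitHub and topic patterns above can be sketched as a small URL builder. This is a minimal illustration, not part of the skill itself: the function names `context7_url_for_github` and `official_llms_txt_candidates` are hypothetical, and the code only constructs URLs without checking that they resolve. The website pattern is deliberately left out, because the examples do not pin down a single normalization rule for `{normalized-domain-path}`.

```python
from typing import List, Optional
from urllib.parse import quote, urlparse

CONTEXT7_BASE = "https://context7.com"


def context7_url_for_github(repo_url: str, topic: Optional[str] = None) -> str:
    """Map a GitHub repository URL to the context7.com llms.txt pattern."""
    parsed = urlparse(repo_url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {repo_url}")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected an org and a repo in the path: {repo_url}")
    org, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]  # tolerate clone-style URLs
    url = f"{CONTEXT7_BASE}/{org}/{repo}/llms.txt"
    if topic:
        # Topic-specific search: append ?topic={query}
        url += f"?topic={quote(topic)}"
    return url


def official_llms_txt_candidates(library: str) -> List[str]:
    """Common official locations to probe when context7.com has no entry."""
    return [
        f"https://docs.{library}.com/llms.txt",
        f"https://{library}.dev/llms.txt",
        f"https://{library}.io/llms.txt",
    ]
```

For instance, `context7_url_for_github("https://github.com/shadcn-ui/ui", topic="date")` yields the same URL as the first topic example above. The candidate list is only a set of guesses to try in order; each URL still has to be fetched to see whether it exists.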

### Phase 2: llms.txt Processing

**Single URL:**
- WebFetch to retrieve content
- Extract and present information

**Multiple URLs (4+, per the Agent Distribution Guidelines):**
- **CRITICAL**: Launch multiple Explorer agents in parallel
- One agent per major documentation section (max 5 in first batch)
- Each agent reads assigned URLs
- Aggregate findings into consolidated report

Example:
```
Launch 3 Explorer agents simultaneously:
- Agent 1: getting-started.md, installation.md
- Agent 2: api-reference.md, core-concepts.md
- Agent 3: examples.md, best-practices.md
```
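To assign one agent per major documentation section, the links in a retrieved llms.txt file first need to be grouped by section. The sketch below assumes the file follows the usual llms.txt layout (`## Section` headings followed by Markdown link lists) and that links are absolute `http(s)` URLs. The function name `group_links_by_section` is hypothetical.

```python
import re
from typing import Dict, List

# Matches Markdown links of the form [title](https://...)
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def group_links_by_section(llms_txt: str) -> Dict[str, List[str]]:
    """Group the URLs found in an llms.txt file under their H2 section."""
    sections: Dict[str, List[str]] = {}
    current = "(top)"  # links that appear before the first H2 heading
    for line in llms_txt.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            current = stripped[3:].strip()
            continue
        for _title, url in LINK_RE.findall(stripped):
            sections.setdefault(current, []).append(url)
    return sections
```

Each key of the returned dictionary is a candidate assignment for one Explorer agent. Sections with no links are omitted, so the number of keys is an upper bound on the number of agents needed.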

### Phase 3: Repository Analysis

**When llms.txt not found:**

1. Find GitHub repository via WebSearch
2. Use Repomix to pack repository:
   ```bash
   npm install -g repomix  # if needed
   git clone [repo-url] /tmp/docs-analysis
   cd /tmp/docs-analysis
   repomix --output repomix-output.xml
   ```
3. Read repomix-output.xml and extract documentation

**Repomix benefits:**
- Entire repository in single AI-friendly file
- Preserves directory structure
- Optimized for AI consumption
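The clone-and-pack steps can also be driven from a script. The following is a sketch under a few assumptions: `git` and `repomix` are on `PATH`, the helper names are hypothetical, and the `docs/**` glob passed to Repomix's `--include` option is only a guess at where a repository keeps its documentation. `--depth 1` makes a shallow clone, and `--branch` accepts either a tag or a branch name.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_pack_commands(
    repo_url: str,
    workdir: str = "/tmp/docs-analysis",
    ref: Optional[str] = None,
    docs_only: bool = False,
) -> List[List[str]]:
    """Return the git and repomix command lines without running them."""
    clone = ["git", "clone", "--depth", "1"]
    if ref:
        clone += ["--branch", ref]  # specific tag or branch
    clone += [repo_url, workdir]
    pack = ["repomix", "--output", "repomix-output.xml"]
    if docs_only:
        pack += ["--include", "docs/**"]  # assumed docs location
    return [clone, pack]


def pack_repository(
    repo_url: str,
    workdir: str = "/tmp/docs-analysis",
    ref: Optional[str] = None,
    docs_only: bool = False,
) -> Path:
    """Clone the repository and pack it; raises CalledProcessError on failure."""
    clone, pack = build_pack_commands(repo_url, workdir, ref, docs_only)
    subprocess.run(clone, check=True)
    subprocess.run(pack, check=True, cwd=workdir)
    return Path(workdir) / "repomix-output.xml"
```

Separating command construction from execution keeps the command lines easy to inspect or log before anything is cloned.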

### Phase 4: Fallback Research

**When no GitHub repository exists:**
- Launch multiple Researcher agents in parallel
- Focus areas: official docs, tutorials, API references, community guides
- Aggregate findings into consolidated report

## Agent Distribution Guidelines

- **1-3 URLs**: Single Explorer agent
- **4-10 URLs**: 3-5 Explorer agents (2-3 URLs each)
- **11+ URLs**: 5-7 Explorer agents (prioritize most relevant)
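One way to turn these guidelines into a concrete plan is sketched below. The guidelines give ranges rather than exact counts, so the choice of roughly three URLs per agent within each range is an assumption, as are the names `agent_count` and `distribute`. URLs are assumed to arrive ordered by relevance, and round-robin assignment spreads the most relevant ones across agents.

```python
from math import ceil
from typing import List


def agent_count(n_urls: int) -> int:
    """Pick a number of Explorer agents within the guideline ranges."""
    if n_urls <= 0:
        return 0
    if n_urls <= 3:
        return 1
    wanted = ceil(n_urls / 3)  # aim for about 3 URLs per agent
    if n_urls <= 10:
        return min(5, max(3, wanted))  # 4-10 URLs: 3-5 agents
    return min(7, max(5, wanted))  # 11+ URLs: 5-7 agents


def distribute(urls: List[str]) -> List[List[str]]:
    """Assign URLs to agents round-robin, preserving relevance order."""
    n = agent_count(len(urls))
    batches: List[List[str]] = [[] for _ in range(n)]
    for i, url in enumerate(urls):
        batches[i % n].append(url)
    return batches
```

With four or five URLs the three-agent minimum leaves some agents with a single URL, so the "2-3 URLs each" target is only met from six URLs upward. With very large sets, each agent's batch grows beyond three URLs, which is where trimming to the most relevant entries first would apply.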

## Version Handling

**Latest (default):**
- Search without version specifier
- Use current documentation paths

**Specific version:**
- Include version in search: `[library] v[version] llms.txt`
- Check versioned paths: `/v[version]/llms.txt`
- For repositories: checkout specific tag/branch
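The version rules for searches and paths can be expressed as two small helpers. This is an illustrative sketch: the function names are hypothetical, and the `/v[version]/llms.txt` layout is only one convention that a documentation site may or may not follow, so the resulting URL still needs to be fetched and checked.

```python
from typing import Optional


def llms_txt_search_query(library: str, version: Optional[str] = None) -> str:
    """Build the search string, adding a version specifier when given."""
    if version is None:
        return f"{library} llms.txt"
    return f"{library} v{version.lstrip('vV')} llms.txt"


def versioned_llms_txt_url(docs_base_url: str, version: Optional[str] = None) -> str:
    """Build the llms.txt URL to probe under a docs base URL."""
    base = docs_base_url.rstrip("/")
    if version is None:
        return f"{base}/llms.txt"
    # Accept both "2.1" and "v2.1" as input
    return f"{base}/v{version.lstrip('vV')}/llms.txt"
```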

## Output Format

```markdown
# Documentation for [Library] [Version]

## Source
- Method: [llms.txt / Repository / Research]
- URLs: [list of sources]
- Date accessed: [current date]

## Key Information
[Extracted relevant information organized by topic]

## Additional Resources
[Related links, examples, references]

## Notes
[Any limitations, missing information, or caveats]
```
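If reports are assembled programmatically, the template above can be filled from a small data structure. The `DocReport` class below is a hypothetical helper, not part of the skill. It assumes sources are joined on one line and the access date is written in ISO format.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class DocReport:
    library: str
    version: str
    method: str  # "llms.txt", "Repository", or "Research"
    urls: List[str]
    key_information: str
    additional_resources: str = ""
    notes: str = ""
    accessed: date = field(default_factory=date.today)

    def render(self) -> str:
        """Render the report in the output format shown above."""
        lines = [
            f"# Documentation for {self.library} {self.version}",
            "",
            "## Source",
            f"- Method: {self.method}",
            f"- URLs: {', '.join(self.urls)}",
            f"- Date accessed: {self.accessed.isoformat()}",
            "",
            "## Key Information",
            self.key_information,
            "",
            "## Additional Resources",
            self.additional_resources,
            "",
            "## Notes",
            self.notes,
        ]
        return "\n".join(lines)
```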

## Quick Reference

**Tool selection:**
- WebSearch → Find llms.txt URLs, GitHub repositories
- WebFetch → Read single documentation pages
- Task (Explore) → Multiple URLs, parallel exploration
- Task (Researcher) → Scattered documentation, diverse sources
- Repomix → Complete codebase analysis

**Popular llms.txt locations (try context7.com first):**
- Astro: https://context7.com/withastro/astro/llms.txt
- Next.js: https://context7.com/vercel/next.js/llms.txt
- Remix: https://context7.com/remix-run/remix/llms.txt
- shadcn/ui: https://context7.com/shadcn-ui/ui/llms.txt
- Better Auth: https://context7.com/better-auth/better-auth/llms.txt

**Fallback to official sites if context7.com unavailable:**
- Astro: https://docs.astro.build/llms.txt
- Next.js: https://nextjs.org/llms.txt
- Remix: https://remix.run/llms.txt
- SvelteKit: https://kit.svelte.dev/llms.txt

## Error Handling

- **llms.txt not accessible** → Try alternative domains → Repository analysis
- **Repository not found** → Search official website → Use Researcher agents
- **Repomix fails** → Try /docs directory only → Manual exploration
- **Multiple conflicting sources** → Prioritize official → Note versions
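Each of the first three rows describes a fallback chain: try one strategy, and move to the next when it fails. A generic version of that chain might look like the sketch below. The name `first_successful` is hypothetical, and each strategy is assumed to be a zero-argument callable that returns the documentation text, returns nothing, or raises.

```python
from typing import Callable, List, Optional, Tuple

Strategy = Tuple[str, Callable[[], Optional[str]]]


def first_successful(
    strategies: List[Strategy],
) -> Tuple[Optional[str], Optional[str], List[str]]:
    """Run strategies in order and return (method name, result, errors)."""
    errors: List[str] = []
    for name, attempt in strategies:
        try:
            result = attempt()
        except Exception as exc:  # any failure moves on to the next strategy
            errors.append(f"{name}: {exc}")
            continue
        if result:
            return name, result, errors
        errors.append(f"{name}: no result")
    return None, None, errors
```

Returning the name of the strategy that succeeded, along with the failures collected on the way, provides what is needed for the Method and Notes fields of the output report.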

## Key Principles

1. **Prioritize context7.com for llms.txt** — Most comprehensive and up-to-date aggregator
2. **Use topic parameters when applicable** — Enables targeted searches with ?topic=...
3. **Use parallel agents aggressively** — Faster results, better coverage
4. **Verify official sources as fallback** — Use when context7.com unavailable
5. **Report methodology** — Tell user which approach was used
6. **Handle versions explicitly** — Don't assume latest

## Detailed Documentation

For comprehensive guides, examples, and best practices:

**Workflows:**
- [WORKFLOWS.md](./WORKFLOWS.md) — Detailed workflow examples and strategies

**Reference guides:**
- [Tool Selection](./references/tool-selection.md) — Complete guide to choosing and using tools
- [Documentation Sources](./references/documentation-sources.md) — Common sources and patterns across ecosystems
- [Error Handling](./references/error-handling.md) — Troubleshooting and resolution strategies
- [Best Practices](./references/best-practices.md) — 8 essential principles for effective discovery
- [Performance](./references/performance.md) — Optimization techniques and benchmarks
- [Limitations](./references/limitations.md) — Boundaries and success criteria