About llms.txt
The llms.txt file is an emerging standard for providing LLM-friendly documentation of your website's content. Similar to how robots.txt tells search engine crawlers which pages they may or may not crawl, llms.txt helps AI language models understand your site structure and content priorities.
Why Use llms.txt?
- Better AI Understanding: Provides context about your pages that helps LLMs give more accurate answers about your business
- Content Prioritization: Highlight your most important pages for AI crawlers
- Structured Information: Organize content by category (core, product, docs, blog) for better comprehension
- SEO for AI: As AI-powered search becomes more prevalent, llms.txt helps your content appear in AI-generated responses
How It Works
- Generate: Use this tool to create your llms.txt file from a sitemap or URL list
- Customize: Review and edit the generated file to match your needs
- Deploy: Place the file at https://yoursite.com/llms.txt
- Update: Regenerate periodically as your content evolves
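For reference, a minimal llms.txt file follows this shape (the site name, URLs, and descriptions below are illustrative placeholders, not output of this tool):

```
# Example Site

> One-sentence summary of what the site offers.

## Core Pages

- [About](https://example.com/about): Who we are and what we do
- [Pricing](https://example.com/pricing): Plans and pricing details

## Documentation

- [Getting Started](https://example.com/docs/getting-started): Install and configure the product
```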
Best Practices
- Focus on evergreen content rather than time-sensitive posts
- Include clear, descriptive titles and summaries for each page
- Organize content logically by category
- Keep the file under 100 entries for optimal performance
- Update quarterly or when major content changes occur
Related Tools
Check out our SEO Analyzer for comprehensive on-page SEO analysis and optimization recommendations.
Frequently Asked Questions
What is llms.txt?
llms.txt is a standardized markdown file that provides LLM-friendly documentation of your website's content. It lists important pages with titles and descriptions, organized by category (core pages, product pages, documentation, resources, support, blog posts, company info, legal pages). This helps AI crawlers and language models understand your site structure and prioritize content when answering questions about your business.
Where should I place the llms.txt file?
Place the generated llms.txt file in your website's root directory (e.g., https://example.com/llms.txt). This makes it easily discoverable by AI crawlers following the llms.txt standard.
How does automatic categorization work?
When you provide a sitemap, the tool automatically categorizes URLs based on common patterns:
- Core Pages: /about, /contact, /pricing, /features, /services
- Product Pages: /product/, /solutions/, /plans/
- Documentation: /docs/, /guides/, /api/, /reference/, /tutorial/
- Resources: /resources/, /case-studies/, /whitepapers/, /ebooks/, /templates/, /downloads/
- Support: /help/, /faq/, /support/, /knowledge-base/, /troubleshooting/
- Blog Posts: /blog/, /posts/, /articles/, /news/
- Company: /company/, /careers/, /jobs/, /team/, /press/, /investors/
- Legal: /privacy/, /terms/, /cookies/, /gdpr/, /compliance/
You can provide custom regex patterns for your specific URL structure.
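The pattern matching above can be sketched in a few lines of Python. This is an illustrative sketch, not the tool's actual implementation; the category names and path patterns come from the list above, but the exact regexes are assumptions:

```python
import re

# First match wins; order mirrors the category list above.
# The exact regexes the tool uses are an assumption.
CATEGORY_PATTERNS = [
    ("Core Pages", r"/(about|contact|pricing|features|services)(/|$)"),
    ("Product Pages", r"/(product|solutions|plans)/"),
    ("Documentation", r"/(docs|guides|api|reference|tutorial)/"),
    ("Resources", r"/(resources|case-studies|whitepapers|ebooks|templates|downloads)/"),
    ("Support", r"/(help|faq|support|knowledge-base|troubleshooting)/"),
    ("Blog Posts", r"/(blog|posts|articles|news)/"),
    ("Company", r"/(company|careers|jobs|team|press|investors)/"),
    ("Legal", r"/(privacy|terms|cookies|gdpr|compliance)/"),
]

def categorize(url: str) -> str:
    """Return the first matching category, or "Other" when nothing matches."""
    for name, pattern in CATEGORY_PATTERNS:
        if re.search(pattern, url):
            return name
    return "Other"
```

A custom pattern for your own URL structure would simply be another `(name, regex)` pair in that list.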
What if my sitemap has more than 100 URLs?
The llms.txt standard recommends keeping your file under 100 entries for optimal performance. When your sitemap exceeds this limit, our tool automatically prioritizes URLs in this order: Core → Product → Documentation → Resources → Support → Blog → Company → Legal → Other. Use the URL exclusion filters in "Advanced Recipe for Professional Juicemakers" to remove test pages, staging environments, or legacy content before parsing. This gives you more control over which pages make the final cut.
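The prioritization step can be sketched as a stable sort over category rank followed by truncation (a simplified illustration, assuming each URL has already been assigned a category as described above):

```python
# Priority order from the answer above; "Other" ranks last.
CATEGORY_PRIORITY = [
    "Core Pages", "Product Pages", "Documentation", "Resources",
    "Support", "Blog Posts", "Company", "Legal", "Other",
]

def prioritize(entries: list[tuple[str, str]], limit: int = 100) -> list[tuple[str, str]]:
    """Keep at most `limit` (url, category) entries, higher-priority categories first.

    Python's sort is stable, so the original order within a category is preserved.
    """
    rank = {name: i for i, name in enumerate(CATEGORY_PRIORITY)}
    ordered = sorted(entries, key=lambda entry: rank.get(entry[1], len(rank)))
    return ordered[:limit]
```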
How do URL exclusion filters work?
URL exclusion filters help you remove unwanted pages before analysis. Since llms.txt files work best with under 100 entries, filtering out test pages, staging sites, or outdated content ensures your most valuable pages are included. You can use default filters (removes test, demo, sample, draft, staging, dev, wip, tmp, placeholder pages) or add custom regex patterns to match your site's structure. Filters are applied during sitemap parsing, before the 100-page limit kicks in.
How do I handle multiple title suffix variations?
The Title Suffix field supports both plain text and regular expressions. If your site uses different suffixes (like " | FreshJuice DEV Docs", " | FreshJuice Tools", " | FreshJuice"), you can use a regex pattern like \| FreshJuice( DEV Docs| Tools)? to match all variations. The tool automatically detects whether your pattern contains regex metacharacters and applies it accordingly.
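The detect-and-apply logic might look roughly like this. This is a hypothetical sketch, not the tool's actual code; in particular, the metacharacter check and the whitespace trimming are assumptions:

```python
import re

def strip_suffix(title: str, suffix: str) -> str:
    """Remove a trailing title suffix, treating it as a regex when it
    contains regex metacharacters (an assumed heuristic)."""
    if re.search(r"[\\|()\[\]*+?^$]", suffix):
        # Regex mode: anchor the pattern to the end of the title.
        return re.sub(suffix + r"\s*$", "", title).rstrip()
    # Plain-text mode: strip a literal trailing suffix.
    if title.endswith(suffix):
        return title[: -len(suffix)].rstrip()
    return title
```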
Can I edit the generated file?
Yes! The llms.txt file is just formatted markdown text. After downloading, you can edit it in any text editor to add custom sections, reorder pages, or update descriptions. The tool provides a starting point that you can customize to your needs.
How is this different from robots.txt or sitemap.xml?
While robots.txt tells bots what not to crawl and sitemap.xml lists all URLs for search engines, llms.txt provides human-readable context about your content specifically for LLMs. It includes descriptions and categorization that help AI understand what each page is about, making it easier to provide accurate answers about your business.