---
title: "llms.txt Generator"
description: "Generate llms.txt files for your website to guide AI crawlers. Create a structured file from your sitemap or URL list that helps LLMs like Claude, GPT, and Gemini understand your site content."
url: "https://freshjuice.dev/tools/llmstxt-generator/"
---
## About this tool

The [llms.txt standard](https://llmstxt.org/) proposes a Markdown file at the root of your domain that tells AI crawlers and assistants which pages of your site matter most, and what each page is about. Think of it as `robots.txt` for the LLM era — preferences instead of rules.

The flow:

1.  **Step 1: Input.** Paste your `sitemap.xml` URL, or just a domain or subdomain (we'll auto-detect `/sitemap.xml`). Optionally filter out test/staging URLs.
2.  **Step 2: Select.** We auto-categorize discovered URLs into Core / Product / Docs / Blog / Support / etc. Uncheck anything you don't want exposed to AI crawlers.
3.  **Step 3: Generate.** Add an optional site title and tagline, hit Generate, then download the file. Upload it so that it is served at `/llms.txt` on your domain.
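
As a rough illustration of what Step 3 produces, the sketch below renders a file in the layout described at llmstxt.org: an H1 title, an optional blockquote summary, then one H2 section per category containing a bulleted list of links. The function name `render_llms_txt` and the shape of its input are assumptions made for this example, not the tool's actual code.

```python
def render_llms_txt(title, tagline, sections):
    """Render llms.txt text from a title, an optional tagline, and sections.

    `sections` maps a category name to a list of
    (page_title, url, description) tuples. This input shape is hypothetical.
    """
    lines = [f"# {title}", ""]
    if tagline:
        lines += [f"> {tagline}", ""]
    for category, pages in sections.items():
        if not pages:
            continue  # leave out categories with no selected pages
        lines += [f"## {category}", ""]
        for page_title, url, description in pages:
            entry = f"- [{page_title}]({url})"
            if description:
                entry += f": {description}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


example = render_llms_txt(
    "Example Co",
    "Widgets for everyone",
    {"Core Pages": [("Pricing", "https://example.com/pricing", "Plans and prices")]},
)
print(example)
```

The result is plain Markdown, so it can be saved as `llms.txt` and edited by hand afterwards.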

## Frequently Asked Questions

### What is llms.txt?

llms.txt is a Markdown file, defined by a proposed standard, that gives LLMs a concise, readable overview of your website's content. It lists important pages with titles and descriptions, grouped into sections. This tool groups them by category (core pages, product pages, documentation, resources, support, blog posts, company info, legal pages). This helps AI crawlers and language models understand your site structure and prioritize content when answering questions about your business. See the [llms.txt spec](https://llmstxt.org/).

### Where should I place the llms.txt file?

Place the generated `llms.txt` file in your website's root directory (e.g., `https://example.com/llms.txt`). This makes it easily discoverable by AI crawlers following the llms.txt standard.

### How does automatic categorization work?

When you provide a sitemap, the tool assigns each URL to a category by matching its path against common patterns:
  
**Core Pages**: `/about`, `/contact`, `/pricing`, `/features`, `/services`  
**Product Pages**: `/product/`, `/solutions/`, `/plans/`  
**Documentation**: `/docs/`, `/guides/`, `/api/`, `/reference/`, `/tutorial/`  
**Resources**: `/resources/`, `/case-studies/`, `/whitepapers/`, `/ebooks/`, `/templates/`, `/downloads/`  
**Support**: `/help/`, `/faq/`, `/support/`, `/knowledge-base/`, `/troubleshooting/`  
**Blog Posts**: `/blog/`, `/posts/`, `/articles/`, `/news/`  
**Company**: `/company/`, `/careers/`, `/jobs/`, `/team/`, `/press/`, `/investors/`  
**Legal**: `/privacy/`, `/terms/`, `/cookies/`, `/gdpr/`, `/compliance/`
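
The pattern list above can be sketched as a simple lookup. The path fragments are copied from the list; the matching rule (case-insensitive substring match on the URL path, first category wins, `Other` as the fallback) and the names `CATEGORY_PATTERNS` and `categorize` are assumptions for illustration, so the tool's real matching may differ.

```python
from urllib.parse import urlparse

# Path fragments per category, taken from the list above.
# Dict order doubles as match order: the first category that matches wins.
CATEGORY_PATTERNS = {
    "Core Pages": ["/about", "/contact", "/pricing", "/features", "/services"],
    "Product Pages": ["/product/", "/solutions/", "/plans/"],
    "Documentation": ["/docs/", "/guides/", "/api/", "/reference/", "/tutorial/"],
    "Resources": ["/resources/", "/case-studies/", "/whitepapers/", "/ebooks/",
                  "/templates/", "/downloads/"],
    "Support": ["/help/", "/faq/", "/support/", "/knowledge-base/", "/troubleshooting/"],
    "Blog Posts": ["/blog/", "/posts/", "/articles/", "/news/"],
    "Company": ["/company/", "/careers/", "/jobs/", "/team/", "/press/", "/investors/"],
    "Legal": ["/privacy/", "/terms/", "/cookies/", "/gdpr/", "/compliance/"],
}


def categorize(url):
    """Return the first category whose fragment occurs in the URL path."""
    path = urlparse(url).path.lower()
    if not path.endswith("/"):
        path += "/"  # lets "/docs" match the "/docs/" fragment
    for category, fragments in CATEGORY_PATTERNS.items():
        if any(fragment in path for fragment in fragments):
            return category
    return "Other"


print(categorize("https://example.com/docs/install"))
print(categorize("https://example.com/pricing"))
```

Because this sketch uses substring matching, a path such as `/blog/about-us/` would land in Core Pages rather than Blog Posts; a stricter rule would compare whole path segments.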

### What if my sitemap has more than 100 URLs?

llms.txt files work best when they stay short, so the tool limits the generated file to 100 entries. When your sitemap exceeds this limit, the tool prioritizes URLs in this order: Core → Product → Documentation → Resources → Support → Blog → Company → Legal → Other. Use the URL exclusion filters under **Advanced** in Step 1 to drop test pages, staging environments, or legacy content while the sitemap is parsed, before the limit is applied. This gives you more control over which pages make the final cut.
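
A minimal sketch of that prioritization, assuming URLs have already been paired with a category name. The names `PRIORITY`, `MAX_ENTRIES`, and `select_entries` are hypothetical, and how the tool orders URLs within a category is not documented here; this sketch simply keeps sitemap order.

```python
# Category order from the paragraph above; earlier categories are kept first.
PRIORITY = ["Core Pages", "Product Pages", "Documentation", "Resources",
            "Support", "Blog Posts", "Company", "Legal", "Other"]
MAX_ENTRIES = 100


def select_entries(categorized, limit=MAX_ENTRIES):
    """Keep at most `limit` (url, category) pairs, highest priority first."""
    rank = {name: index for index, name in enumerate(PRIORITY)}
    # sorted() is stable, so URLs keep their original order within a category.
    # Unknown categories sort last.
    ordered = sorted(categorized, key=lambda pair: rank.get(pair[1], len(PRIORITY)))
    return ordered[:limit]


pairs = [(f"https://example.com/blog/{i}", "Blog Posts") for i in range(120)]
pairs += [(f"https://example.com/p{i}", "Core Pages") for i in range(30)]
picked = select_entries(pairs)
print(len(picked), picked[0][1], picked[-1][1])
```

With 30 core pages and 120 blog posts, all core pages survive and the blog posts fill the remaining 70 slots.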

### How do URL exclusion filters work?

URL exclusion filters remove unwanted pages before analysis. Since llms.txt files work best with under 100 entries, filtering out test pages, staging sites, or outdated content leaves more room for your most valuable pages. You can use the default filters (which remove test, demo, sample, draft, staging, dev, wip, tmp, and placeholder pages) or add custom regex patterns that match your site's structure. Filters are applied during sitemap parsing, before the 100-page limit kicks in.
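
The sketch below shows one way such filters could work. The keyword list comes from the paragraph above; everything else is an assumption for illustration: the whole-token matching rule, the decision to ignore the top-level domain (so a `.dev` site is not caught by the `dev` keyword), and the names `is_excluded` and `apply_filters`.

```python
import re
from urllib.parse import urlparse

DEFAULT_WORDS = ["test", "demo", "sample", "draft", "staging",
                 "dev", "wip", "tmp", "placeholder"]

# Match a keyword only as a whole token, bounded by "/", ".", "_", "-"
# or the start/end of the string, so "developers" or "latest" are not caught.
DEFAULT_PATTERN = re.compile(
    r"(?:^|[/._-])(?:" + "|".join(DEFAULT_WORDS) + r")(?=$|[/._-])",
    re.IGNORECASE,
)


def is_excluded(url, custom_patterns=()):
    """Return True if a default keyword or any custom regex matches the URL."""
    parsed = urlparse(url)
    # Drop the last host label (the TLD) before applying the default keywords.
    host = (parsed.hostname or "").rpartition(".")[0]
    if DEFAULT_PATTERN.search(host + parsed.path):
        return True
    return any(re.search(pattern, url) for pattern in custom_patterns)


def apply_filters(urls, custom_patterns=()):
    """Return the URLs that survive the default and custom filters."""
    return [url for url in urls if not is_excluded(url, custom_patterns)]


urls = [
    "https://freshjuice.dev/tools/",
    "https://staging.example.com/",
    "https://example.com/blog/draft-post",
    "https://example.com/developers",
    "https://example.com/legacy/old",
]
print(apply_filters(urls, custom_patterns=[r"/legacy/"]))
```

Custom patterns are matched against the full URL here, which keeps them easy to reason about when pasted from a regex tester.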

### Can I edit the generated file?

Yes. The llms.txt file is just formatted markdown text. After downloading, you can edit it in any text editor to add custom sections, reorder pages, or update descriptions. The tool provides a starting point that you can customize to your needs.

### How is this different from robots.txt or sitemap.xml?

While `robots.txt` tells bots what *not* to crawl and `sitemap.xml` lists all URLs for search engines, `llms.txt` provides human-readable context about your content specifically for LLMs. It includes descriptions and categorization that help AI models understand what each page is about, which makes it easier for them to give accurate answers about your business.

### What's the difference between Sitemap mode and Manual mode?

**Sitemap mode** is the recommended path: you give us a sitemap URL, we discover and auto-categorize every URL inside, then you choose which to include in Step 2. **Manual mode** skips discovery — you paste a list of URLs you already curated, and we go straight to fetching titles/descriptions and generating the file. Use Manual when you don't have a sitemap or you've already done the curation work.
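
To make the two input paths concrete, here is a sketch of how each could produce a URL list. It assumes the sitemap XML has already been downloaded as a string and does not follow sitemap index files; `urls_from_sitemap` and `urls_from_manual_list` are hypothetical names, not the tool's API.

```python
import xml.etree.ElementTree as ET


def urls_from_sitemap(xml_text):
    """Sitemap mode: collect the text of every <loc> element."""
    root = ET.fromstring(xml_text)
    # Sitemap tags are namespaced ("{namespace}loc"), so compare only the
    # local part of the tag name.
    return [
        element.text.strip()
        for element in root.iter()
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text
    ]


def urls_from_manual_list(text):
    """Manual mode: one URL per line, blank lines and duplicates dropped."""
    seen = set()
    urls = []
    for line in text.splitlines():
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc> https://example.com/pricing </loc></url>"
    "</urlset>"
)
print(urls_from_sitemap(sample_sitemap))
print(urls_from_manual_list("https://example.com/\n\nhttps://example.com/\nhttps://example.com/docs/\n"))
```

Either function yields a plain list of URLs, so the later steps (filtering, categorizing, generating) can stay the same in both modes.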
