---
title: "LLM Readiness Checker"
description: "Is your page actually readable by ChatGPT, Claude, Perplexity, and friends? Drop a URL and we'll grade it — llms.txt, content negotiation, .md variants, text ratio, semantics, JS-rendering warnings."
url: "https://freshjuice.dev/tools/llm-readiness-checker/"
---
## What this tool checks

LLM crawlers don't all behave the same, but they share a few preferences. They love clean markdown. They love files that tell them what's worth reading. They tolerate HTML, but only if there's actual content under the pile of nav, footer, scripts, and dialog overlays. Most don't execute JavaScript, so if your content only appears after client-side rendering, you're invisible to them.

This checker grades a single page against the signals that actually move the needle, then runs the same checks against the site's homepage so you can spot whether the problem is the page or the site.

### The signals we check

-   **Markdown via `Accept: text/markdown`** — the strongest signal. If the server returns real markdown when asked, crawlers skip HTML parsing entirely.
-   **`/llms.txt`** — present at the root domain with non-trivial content.
-   **`/llms-full.txt`** — optional companion file with full content.
-   **`.md` URL variant** — `/page.md` or `/page/index.md` alongside `/page`.
-   **`<link rel="alternate" type="text/markdown">`** — declared in `<head>`, telling crawlers where the markdown lives.
-   **Text-to-markup ratio** — visible text bytes vs total HTML bytes. 25%+ is healthy.
-   **Heading hierarchy** — exactly one `<h1>`, no skipped levels.
-   **Semantic landmarks** — `<main>` or `<article>`.
-   **JS-rendering penalty** — applied when the raw HTML body has minimal text and contains SPA markers (`#root`, `#app`, `#__next`). Most crawlers don't run JavaScript.
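
To make the text-to-markup and heading-hierarchy signals concrete, here is a minimal sketch using only Python's standard library. The names `PageStats`, `text_ratio`, and `heading_problems` are hypothetical, and the list of tags treated as invisible is an assumption; the checker's own rules may differ.

```python
from html.parser import HTMLParser

# Tags whose contents a reader never sees. This set is an assumption
# about what "visible text" means, not the checker's actual rule.
HIDDEN_TAGS = {"script", "style", "noscript", "template"}


class PageStats(HTMLParser):
    """Collects visible text and heading levels from raw HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.hidden_depth = 0
        self.text_parts = []
        self.heading_levels = []

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self.hidden_depth += 1
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS and self.hidden_depth > 0:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.hidden_depth == 0:
            self.text_parts.append(data)


def analyze(html: str) -> PageStats:
    parser = PageStats()
    parser.feed(html)
    parser.close()
    return parser


def text_ratio(html: str) -> float:
    """Visible text bytes divided by total HTML bytes (0.0 for empty input)."""
    total = len(html.encode("utf-8"))
    if total == 0:
        return 0.0
    # Collapse whitespace so indentation does not count as content.
    visible = " ".join(" ".join(analyze(html).text_parts).split())
    return len(visible.encode("utf-8")) / total


def heading_problems(html: str) -> list:
    """Report a missing or duplicate <h1> and any skipped heading level."""
    levels = analyze(html).heading_levels
    problems = []
    if levels.count(1) != 1:
        problems.append(f"expected exactly one <h1>, found {levels.count(1)}")
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:  # e.g. <h1> directly followed by <h3>
            problems.append(f"<h{prev}> is followed by <h{cur}>")
    return problems
```

A page would pass the ratio signal in this sketch when `text_ratio(html) >= 0.25`, matching the 25% guideline above.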

### Grade thresholds

-   **A (90+)** — Excellent. The site delivers markdown directly to LLMs.
-   **B (75–89)** — Good. A few easy wins from a perfect score.
-   **C (55–74)** — Fair. Crawlers can read you, but inefficiently.
-   **D (35–54)** — Poor. Most signals are missing.
-   **F (<35)** — Failing. Crawlers get little to nothing.
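
The bands above translate into a simple lookup. This is only a sketch: the function name `letter_grade` is hypothetical, and treating fractional scores by their lower bound (so 89.5 is still a B) is an assumption.

```python
def letter_grade(score: float) -> str:
    """Map a 0-100 score to the letter bands listed above."""
    if score >= 90:
        return "A"
    if score >= 75:
        return "B"
    if score >= 55:
        return "C"
    if score >= 35:
        return "D"
    return "F"
```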

## How to actually improve your score

The cheapest wins, in order:

1.  Publish `/llms.txt`. Use our [llms.txt generator](https://freshjuice.dev/tools/llmstxt-generator/) if you don't have one — it'll pull pages from your sitemap and produce the file in minutes.
2.  Serve a `.md` variant of each page. Most static-site generators can output both formats during build. If you're on a CMS, a route handler that responds with rendered markdown is a one-day job.
3.  Honor the `Accept: text/markdown` header. When a crawler asks for markdown, return markdown — same content, smaller payload, no parsing required.
4.  Add `<link rel="alternate" type="text/markdown" href="…">` to `<head>`. Five seconds of work, declares your markdown variant.
5.  Cut the bloat. Inline styles, mega-menus, footer link farms, and marketing dialogs all eat byte budget that should be your content.
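
Step 3 is the one people tend to find abstract, so here is a minimal, framework-free sketch of the decision a route handler has to make. The function names (`parse_accept`, `wants_markdown`, `negotiate`) are hypothetical. One design choice here is an assumption rather than a rule: wildcards such as `*/*` are ignored, so ordinary browsers keep getting HTML and only clients that name `text/markdown` explicitly get markdown.

```python
def parse_accept(header: str) -> dict:
    """Parse an Accept header into {media_type: q_value}."""
    prefs = {}
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_type] = q
    return prefs


def wants_markdown(accept_header: str) -> bool:
    """True when text/markdown is named explicitly and ranks at least as high as text/html."""
    prefs = parse_accept(accept_header)
    md_q = prefs.get("text/markdown", 0.0)
    html_q = prefs.get("text/html", 0.0)
    return md_q > 0 and md_q >= html_q


def negotiate(accept_header: str, html_body: str, markdown_body: str):
    """Return (headers, body) for the representation the client asked for."""
    if wants_markdown(accept_header):
        content_type, body = "text/markdown; charset=utf-8", markdown_body
    else:
        content_type, body = "text/html; charset=utf-8", html_body
    # Vary tells caches that the response depends on the Accept header.
    return {"Content-Type": content_type, "Vary": "Accept"}, body
```

In a real handler you would pass the incoming request's `Accept` header to `negotiate` and copy the returned headers onto the response.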

## FAQ

### Why does my page score so low? I can see content in my browser.

Your browser runs JavaScript. Most AI crawlers don't. GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, Meta-ExternalAgent and the rest fetch your URL the way `curl` does, then read whatever HTML comes back, raw.

If your site is built as a JavaScript app (the page only fills in *after* JavaScript runs on the client), the raw HTML response is basically empty: a `<div id="root">` and some script tags. That is what crawlers see. The browser fills in the rest. The crawler never gets that far.

This is the default behaviour of **most no-code AI app builders** — Lovable, Base44, Bolt, v0, Replit Agent, and pretty much every React/Vue/Svelte starter without server-side rendering. Pages look perfectly normal in your browser. They are *invisible* to ChatGPT, Claude, Perplexity, and Gemini.

This isn't just an LLM problem. **It tanks SEO too.** Google, Bing, and DuckDuckGo all fetch the raw HTML first. Google has a second rendering pass that runs JavaScript, but it is queued separately, can lag behind the initial crawl, and is unreliable enough that Google's own documentation still recommends server-side rendering or pre-rendering. Bing renders JavaScript only to a limited extent, and DuckDuckGo draws largely on Bing's index. If you built your site with one of those AI app builders and you're wondering why nobody finds it in search results, this is the most likely reason. Your site looks great to humans visiting it directly. To most crawlers it is an empty page.

Don't believe us? Try it yourself. Open a terminal and run:

```
curl -A "GPTBot/1.0" https://your-site.example/your-page
```

If the body is mostly `<div>` shells and `<script>` tags with no actual paragraphs, headings, or content text, that is the exact thing every AI crawler sees. Your browser is making the page look real. The crawler never runs that code.
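
If you'd rather script that eyeball test, the sketch below fetches a page the same way and applies a rough version of the heuristic: a known SPA root id plus very little visible text. Everything here is illustrative. The names `fetch_raw_html` and `looks_like_js_shell` are hypothetical, and the 200-character threshold is an assumption, not the checker's real cutoff.

```python
import urllib.request
from html.parser import HTMLParser

SPA_ROOT_IDS = {"root", "app", "__next"}
SKIPPED_TAGS = {"script", "style", "noscript"}


class ShellProbe(HTMLParser):
    """Counts visible characters and looks for common SPA root ids."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.text_chars = 0
        self.has_spa_marker = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        if dict(attrs).get("id") in SPA_ROOT_IDS:
            self.has_spa_marker = True

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.text_chars += len(data.strip())


def looks_like_js_shell(html: str, min_text_chars: int = 200) -> bool:
    """True when the raw HTML has an SPA root id and almost no text."""
    probe = ShellProbe()
    probe.feed(html)
    probe.close()
    return probe.has_spa_marker and probe.text_chars < min_text_chars


def fetch_raw_html(url: str, user_agent: str = "GPTBot/1.0") -> str:
    """Fetch a URL without running any JavaScript, like the curl command above."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `looks_like_js_shell(fetch_raw_html("https://your-site.example/your-page"))` gives a quick yes or no. A server-rendered Next.js page also contains `#__next`, which is why the text threshold matters as much as the marker.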

**The fix** is server-side rendering (SSR) or static generation (SSG). The server returns real HTML with your content; JavaScript still hydrates on top for interactivity, but crawlers and humans get the same content in the initial response. Frameworks that do this out of the box: **Astro** (this site), Next.js (SSR/SSG modes), Nuxt, SvelteKit, Remix, Eleventy, plus every traditional server-rendered stack (Rails, Django, Laravel, Phoenix, WordPress).

No fix exists at the configuration level for an app that was generated without SSR. The fix is rebuilding on a framework that does it, or adding a pre-rendering step. There is no robots.txt tweak that magically makes JavaScript-rendered content visible to bots that don't execute JavaScript.

### The content negotiation check returned markdown. Do the other signals still matter?

Less than you'd think. If your server returns real markdown when asked, crawlers can skip HTML parsing entirely. The HTML quality signals become "nice to have" rather than load-bearing.

### How is the score calculated?

It's a weighted percentage: the share of relevant LLM-readiness checks your page passes, with some checks counting for more than others (markdown via `Accept` negotiation is the strongest signal; semantic landmarks are a nice-to-have). You don't need to know the internal mix to read the number.

Only signals that actually apply to your page count. If your server serves markdown via `Accept: text/markdown`, the alternate-link, `.md` variant, and text-to-markup signals become informational (gray dash) and stop affecting the score, because they're redundant. If your page is a JavaScript shell, the HTML quality signals are demoted the same way: they're symptoms, not independent gaps.

A page detected as JavaScript-rendered takes a noticeable hit (because most crawlers see an empty shell). Meet every signal that's in scope for your setup and you land at 100%. The check, X, and dash on each row tell you the rest.
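
For intuition only, here is a sketch of how a score of this shape could be computed. The weights, the signal names, and the size of the JavaScript penalty are all invented for illustration; the checker's real internal mix isn't published here.

```python
# Hypothetical weights: NOT the checker's real values.
HYPOTHETICAL_WEIGHTS = {
    "accept_markdown": 5,
    "llms_txt": 3,
    "md_variant": 2,
    "landmarks": 1,
}


def readiness_score(results: dict, weights: dict, js_penalty: float = 0.0) -> float:
    """Weighted share of in-scope checks that passed, as a 0-100 score.

    `results` maps a signal name to "pass", "fail", or "info".
    "info" rows (the gray dash) are left out of the score entirely.
    """
    in_scope = {name: status for name, status in results.items() if status != "info"}
    total = sum(weights[name] for name in in_scope)
    if total == 0:
        return 0.0
    earned = sum(weights[name] for name, status in in_scope.items() if status == "pass")
    # js_penalty is an assumed flat deduction for JavaScript-rendered pages.
    return max(0.0, round(100 * earned / total - js_penalty, 1))
```

Because informational rows drop out of both the numerator and the denominator, a page that passes everything still in scope lands on 100 in this sketch, which matches the behaviour described above.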

### Why am I seeing old results after updating my page?

Results are cached for 5 minutes to keep response times snappy. If you just made changes to your website, wait a few minutes and analyze again to see the updated results.
