
How We Build Websites AI Can Actually Find
The way people find businesses online is shifting. Instead of typing a query into Google and clicking through ten blue links, more and more people are asking AI assistants like ChatGPT, Perplexity, and Google's AI Overviews for direct answers. When someone asks "who builds websites in Buckinghamshire?", the AI pulls from websites it can understand, and ignores the ones it cannot.
This changes the game for every business with a website. If your site is built in a way that AI models struggle to read, you are invisible to a growing chunk of your potential audience.
At All Trouser Digital, we have been paying close attention to this shift. Every site we build is now structured to be readable by both humans and machines. Here is how we do it, and why it matters.
Why Most Websites Are Invisible to AI
Most websites are built to look good in a browser. That is fine for humans, but AI models do not "see" your website the way you do. They read the raw HTML, and if that HTML is a mess of JavaScript bundles, empty div tags, and client-side rendering, there is very little for an AI to work with.
Single-page applications (SPAs) are the worst offenders. The browser receives an almost empty HTML file and JavaScript builds the page after it loads. Google's crawler can handle this (with delays), but most AI crawlers cannot. They see a blank page and move on.
This is the single biggest reason a beautifully designed website can be completely invisible to AI.
Server-Side Rendering Changes Everything
The foundation of every site we build is Next.js, a React framework that renders pages on the server before sending them to the browser. This means that when any crawler, whether it is Google, ChatGPT, or Perplexity, requests a page, it receives fully formed HTML with all the content already in place.
No waiting for JavaScript to execute. No empty shells. Just clean, complete, readable content from the very first request.
Next.js gives us three rendering strategies, and we pick the right one for each page:
- Static Generation (SSG) pre-builds pages at deploy time. Blog posts, service pages, and landing pages are ready to serve instantly. This is the fastest option and the most crawler-friendly.
- Server-Side Rendering (SSR) generates pages on each request. We use this for pages with dynamic data that changes frequently, like dashboards or admin panels.
- Incremental Static Regeneration (ISR) gives us the best of both worlds. Pages are served statically but revalidate in the background after a set time, so content stays fresh without sacrificing speed.
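As a sketch, here is how SSG and ISR might be combined on a blog page. The file path and the data-layer functions (getAllPosts, getPost) are placeholders, not a real API:

```typescript
// app/blog/[slug]/page.tsx — a sketch of combining SSG with ISR in Next.js.
// getAllPosts and getPost are placeholder stand-ins for a real data layer.

export const revalidate = 3600; // ISR: regenerate in the background at most once an hour

type Post = { slug: string; title: string; publishedAt: string; body: string };

// Placeholder data layer — swap in a CMS or database query
async function getAllPosts(): Promise<Post[]> {
  return [];
}
async function getPost(slug: string): Promise<Post> {
  return { slug, title: "", publishedAt: "", body: "" };
}

// SSG: pre-build one page per post at deploy time
export async function generateStaticParams() {
  return (await getAllPosts()).map((post) => ({ slug: post.slug }));
}

export default async function BlogPost({ params }: { params: { slug: string } }) {
  const post = await getPost(params.slug);
  return (
    <article>
      <h1>{post.title}</h1>
      <time dateTime={post.publishedAt}>{post.publishedAt}</time>
      <p>{post.body}</p>
    </article>
  );
}
```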
The result is a website that loads fast for visitors and delivers its full content to every crawler that asks.
Semantic HTML: Giving Content Structure
AI models are good at reading text, but they are much better when that text is wrapped in meaningful HTML. We use semantic elements throughout every site:
- <main> marks the primary content, so AI can skip the navigation and footer noise
- <article> wraps standalone pieces of content like blog posts
- <section> groups related content with clear headings
- <nav> identifies navigation blocks
- <header> and <footer> define the page boundaries
- <time> tags dates so AI knows exactly when content was published
These elements are not just good practice. They are signals that tell AI models what each part of the page actually is, rather than leaving it to guess from a sea of generic <div> tags.
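Put together, the skeleton of a page looks something like this (the content and date are placeholders):

```html
<!-- A sketch of the page skeleton we aim for; headings and dates are placeholders -->
<body>
  <header>
    <nav><!-- site navigation --></nav>
  </header>
  <main>
    <article>
      <h1>How We Build Websites AI Can Actually Find</h1>
      <time datetime="2024-06-01">1 June 2024</time>
      <section>
        <h2>Why Most Websites Are Invisible to AI</h2>
        <p>…</p>
      </section>
    </article>
  </main>
  <footer><!-- contact details, legal --></footer>
</body>
```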
Structured Data with JSON-LD
Beyond the visible HTML, we embed structured data using JSON-LD (JavaScript Object Notation for Linked Data). This is a block of machine-readable metadata that sits in the page source and explicitly tells AI models and search engines what the page is about.
For a business website, this might include:
- Organization schema with the company name, logo, address, contact details, and social profiles
- WebSite schema describing the site structure and search capabilities
- Article schema on blog posts with the headline, author, publish date, and description
- FAQPage schema on FAQ pages, which AI assistants love because the question-and-answer format maps directly to how they respond to queries
- BreadcrumbList schema showing the page hierarchy
This structured data is invisible to visitors but invaluable to machines. When an AI assistant is looking for information about your business, JSON-LD gives it clean, unambiguous answers rather than forcing it to parse and interpret your page content.
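As an illustration, here is the shape of an Organization schema. The business details are placeholders, and the commented-out script tag shows how it would typically be embedded in a Next.js layout:

```typescript
// Sketch of the kind of Organization JSON-LD we embed on a site.
// All field values below are placeholders — swap in the real business details.
export function organizationJsonLd() {
  return {
    "@context": "https://schema.org",
    "@type": "Organization",
    name: "Example Business Ltd",
    url: "https://example.com",
    logo: "https://example.com/logo.png",
    address: {
      "@type": "PostalAddress",
      addressRegion: "Buckinghamshire",
      addressCountry: "GB",
    },
    sameAs: ["https://www.linkedin.com/company/example"],
  };
}

// In a Next.js layout or page, this would be rendered into the page source as:
// <script
//   type="application/ld+json"
//   dangerouslySetInnerHTML={{ __html: JSON.stringify(organizationJsonLd()) }}
// />
```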
Clean URLs and Logical Routing
Next.js uses file-system-based routing, which means the URL structure mirrors the project structure. A page at /services/website-build-packages is exactly what it sounds like. There are no query strings, no hash fragments, no cryptic IDs.
This matters because AI models use URLs as strong signals for content relevance. A clean, descriptive URL like /blog/how-we-build-websites-ai-can-actually-find tells an AI far more than /p?id=4832&ref=blog.
Every route we build is a real, crawlable path. No JavaScript-dependent navigation. No single-page-app hash routing. Just straightforward URLs that both humans and machines can understand at a glance.
The Metadata API: Titles, Descriptions, and Open Graph
Next.js has a built-in Metadata API that generates all the meta tags a page needs. For every page we build, we define:
- Title and description that appear in search results and AI summaries
- Open Graph tags for social sharing previews (also consumed by AI systems)
- Canonical URLs so AI knows the definitive version of each page
- Robot directives controlling which crawlers can access which pages
This metadata is generated at build time for static pages and at request time for dynamic ones. It is never an afterthought. Every page ships with a complete set of signals that tell AI models exactly what it is looking at.
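A page-level metadata export might look like the following sketch. The URLs and copy are placeholders:

```typescript
// Sketch of page-level metadata via the Next.js Metadata API.
// Titles, descriptions, and URLs below are placeholders.
import type { Metadata } from "next";

export const metadata: Metadata = {
  title: "Website Build Packages | Example Business",
  description: "Fixed-price website builds, server-rendered and AI-readable.",
  // Canonical URL: the definitive version of this page
  alternates: {
    canonical: "https://example.com/services/website-build-packages",
  },
  // Open Graph tags for social previews (and AI systems that read them)
  openGraph: {
    title: "Website Build Packages",
    description: "Fixed-price website builds, server-rendered and AI-readable.",
    url: "https://example.com/services/website-build-packages",
    type: "website",
  },
  // Robot directives
  robots: { index: true, follow: true },
};
```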
Sitemaps and robots.txt
Every site we build includes a programmatically generated XML sitemap that lists every page, its last modification date, and its priority. This gives AI crawlers a complete index of the site without having to discover pages through links alone.
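The generation step itself is straightforward. Here is a minimal sketch of a function that turns a list of pages into sitemap XML (the paths and dates are illustrative, not our actual build output):

```typescript
// Sketch: build a minimal XML sitemap from a list of pages.
// Entry paths and dates are illustrative placeholders.
interface SitemapEntry {
  path: string; // e.g. "/services/website-build-packages"
  lastModified: string; // ISO date, e.g. "2024-06-01"
}

export function buildSitemap(baseUrl: string, entries: SitemapEntry[]): string {
  const urls = entries
    .map(
      (e) =>
        `  <url>\n    <loc>${baseUrl}${e.path}</loc>\n    <lastmod>${e.lastModified}</lastmod>\n  </url>`
    )
    .join("\n");
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${urls}\n</urlset>`
  );
}
```

In practice we generate this via the framework's sitemap conventions rather than by hand, but the output is the same kind of document.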
We also configure robots.txt with awareness of the growing list of AI-specific crawlers. Bots like GPTBot (OpenAI), ClaudeBot (Anthropic), and PerplexityBot all have their own user agents, and we make sure they are explicitly allowed to access the content.
If a site owner wants to be found by AI assistants (and most do), blocking these crawlers is one of the worst things you can do. We make sure the door is open.
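A robots.txt that explicitly welcomes the major AI crawlers might look like this sketch (the domain in the sitemap line is a placeholder):

```txt
# Explicitly allow the main AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Everyone else
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```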
llms.txt: The New Standard for AI Discovery
This is where things get really interesting. We have started adding llms.txt files to the sites we build. This is an emerging standard, proposed by Jeremy Howard of Answer.AI, that gives AI models a structured, plain-text summary of everything important on your website.
Think of it as the opposite of robots.txt. Where robots.txt tells crawlers what they cannot access, llms.txt tells them what they should pay attention to. It sits at the root of your domain (e.g. alltrouser.digital/llms.txt) and is written in Markdown with a specific structure:
- A heading with the site or business name
- A short description of what the business does
- Sections listing key pages with brief descriptions
- Links to the most important content
When an AI model encounters an llms.txt file, it gets a clean, curated overview of the entire site in a format it can process in seconds. No crawling dozens of pages. No parsing complex HTML. Just a straightforward summary of who you are, what you do, and where to find the details.
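A minimal llms.txt might look like this sketch, with a placeholder domain and pages:

```markdown
# Example Business

> Fixed-price website design and build for small businesses in Buckinghamshire.

## Services

- [Website Build Packages](https://example.com/services/website-build-packages): What's included and what it costs
- [Ongoing Care Plans](https://example.com/services/care-plans): Hosting, updates, and support

## Blog

- [How We Build Websites AI Can Actually Find](https://example.com/blog/how-we-build-websites-ai-can-actually-find)
```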
We have already added llms.txt to our own site and we are rolling it out to client sites as part of our standard build process.
Performance Matters Too
A fast website is not just better for visitors. It is better for AI discovery. Google's Core Web Vitals (LCP, INP, CLS) are ranking factors, and better rankings mean more visibility to AI systems that use search data as a signal.
Next.js helps here with automatic image optimisation, font optimisation that eliminates layout shift, code splitting that loads only what each page needs, and edge-ready deployment for fast response times globally.
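Two of those optimisations, next/image and next/font, can be sketched in a few lines. The component name and image path are placeholders:

```typescript
// Sketch: next/image and next/font, two of the optimisations mentioned above.
// The Hero component and /hero.jpg are placeholders.
import Image from "next/image";
import { Inter } from "next/font/google";

// Self-hosted Google font with fallback metrics applied, avoiding layout shift
const inter = Inter({ subsets: ["latin"] });

export default function Hero() {
  return (
    <section className={inter.className}>
      {/* width/height let the browser reserve space, preventing CLS;
          Next.js resizes the image and serves modern formats automatically */}
      <Image src="/hero.jpg" alt="Team at work" width={1200} height={630} priority />
    </section>
  );
}
```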
Every site we deliver scores well on Core Web Vitals because performance is baked into the framework, not bolted on after the fact.
What This Means for Your Business
If your website was built five or ten years ago, there is a good chance it was not designed with AI readability in mind. That was not a problem then. It is becoming one now.
The businesses that will benefit most from AI-driven search are the ones whose websites clearly communicate what they do, in formats that machines can read just as easily as humans. That means server-rendered content, structured data, semantic HTML, clean URLs, proper metadata, and emerging standards like llms.txt.
This is not about replacing traditional SEO. Everything that makes a site AI-friendly also makes it better for Google. It is about making sure your website is ready for the next wave of search, not just the current one.
Want to Future-Proof Your Website?
If you are wondering how your current site stacks up, or if you are planning a new build and want to make sure it is structured for both human visitors and AI discovery, get in touch. We will take a look and give you an honest assessment of where you stand and what would make the biggest difference.