
Meta Robots

Definition

An HTML meta tag that controls how search engines index and serve a page, using directives like noindex, nofollow, and noarchive.

Meta robots tags are HTML meta elements that provide specific instructions to search engine crawlers about how to handle individual web pages. These tags use directives like noindex, nofollow, noarchive, and nosnippet to control indexing and serving behavior at the page level, offering more granular control than the site-wide robots.txt file.

The meta robots tag appears in the <head> section of HTML documents and directly communicates with search engine bots during the crawling process. Unlike robots.txt, which provides crawling instructions before bots access a page, meta robots tags are read only after the page has been crawled, making them essential for managing indexing decisions on already-accessible content.
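For example, a minimal document carrying a meta robots tag in its head could look like this (the directive values are illustrative):

```html
<!DOCTYPE html>
<html>
<head>
  <title>Example page</title>
  <!-- Read by search engine bots after the page has been crawled -->
  <meta name="robots" content="noindex, nofollow">
</head>
<body>
  <p>Page content that should not be indexed or have its links followed.</p>
</body>
</html>
```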

Why It Matters for AI SEO

AI-powered search engines and traditional crawlers both respect meta robots directives, but AI systems introduce new considerations for their implementation. Large language models that power AI Overviews and chatbots may still access content marked with certain meta robots tags during their training or inference processes, even if that content doesn't appear in traditional search results. Modern AI crawlers from companies like OpenAI, Anthropic, and Google use sophisticated parsing mechanisms that can understand meta robots tags alongside other signals. This means your meta robots strategy directly impacts whether your content contributes to AI training data, appears in AI-generated responses, or gets indexed for traditional search results.
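As a sketch of how page-level signals might be aimed at AI systems, a page could combine a standard robots directive with a bot-specific tag. Note that this is illustrative, not guaranteed behavior: whether a given AI crawler honors a meta tag addressed to its own bot token varies, and several AI crawlers (OpenAI's GPTBot, for instance) document robots.txt rules rather than meta tag support.

```html
<!-- Applies to all compliant crawlers -->
<meta name="robots" content="noarchive, max-snippet:50">
<!-- Bot-specific tag addressed to an AI crawler's token; support for
     this pattern is an assumption and differs per crawler -->
<meta name="GPTBot" content="noindex">
```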

How It Works

The basic syntax places the meta robots tag in the HTML head: <meta name="robots" content="noindex, nofollow">. Common directives include index/noindex (control indexing), follow/nofollow (control link following), noarchive (prevent cached copies), nosnippet (prevent snippets), and max-snippet:[number] (limit snippet length). You can target specific bots using their names: <meta name="googlebot" content="noindex"> or <meta name="bingbot" content="noindex">. Tools like Screaming Frog and Google Search Console help audit meta robots implementation across your site, while SEO plugins like Yoast SEO and RankMath provide interfaces for setting these tags without manual HTML editing. For AI-specific considerations, you might combine meta robots tags with new directives or HTTP headers that specifically target AI crawlers, though the ecosystem is still evolving as AI companies develop their own crawling protocols.
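Putting these directives together, a page might declare a default rule for all bots and then override it for one crawler (values illustrative):

```html
<!-- Default for all bots: index the page but don't follow its links,
     don't serve a cached copy, and cap text snippets at 50 characters -->
<meta name="robots" content="nofollow, noarchive, max-snippet:50">
<!-- Stricter override addressed to Google's crawler only -->
<meta name="googlebot" content="noindex">
```

For non-HTML resources such as PDFs or images, where no meta element exists, the same directives can be sent via the X-Robots-Tag HTTP response header (e.g. X-Robots-Tag: noindex).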

Common Mistakes

The most frequent error is using meta robots tags when robots.txt is more appropriate: if you don't want bots to crawl a page at all, block it in robots.txt rather than letting crawls happen just to serve a noindex tag. The reverse combination is an equally common trap, because meta robots tags only work if the page is crawlable. A page blocked in robots.txt can never have its meta robots tags read, so a noindex directive on a blocked page is silently ignored and the URL can remain indexed through external links. Another mistake is assuming that noindex prevents crawling; it only prevents indexing after the page has been crawled and processed. Finally, assuming that meta robots tags will completely prevent AI systems from accessing content is incorrect: these tags primarily affect traditional search indexing, while AI training-data collection may follow different rules entirely.
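The robots.txt conflict described above can be sketched concretely (paths here are illustrative):

```html
<!-- robots.txt at the site root blocks crawling of /private/ :

     User-agent: *
     Disallow: /private/
-->

<!-- /private/page.html -->
<!DOCTYPE html>
<html>
<head>
  <!-- This noindex is never read: the robots.txt rule above stops
       compliant crawlers before they fetch the page, so the URL can
       still appear in the index via external links, just without
       its content. To deindex the page, allow crawling and keep
       only the noindex tag. -->
  <meta name="robots" content="noindex">
</head>
<body>...</body>
</html>
```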