
Google Says LLMs.Txt Comparable To Keywords Meta Tag

Google’s John Mueller answered a question about LLMs.txt, a proposed standard for showing website content to AI agents and crawlers, downplaying its usefulness and comparing it to the useless keywords meta tag, confirming the experience of others who have used it.

LLMs.txt

LLMs.txt has been compared to a robots.txt for large language models, but that comparison is incorrect. The main purpose of robots.txt is to control how bots crawl a website. The LLMs.txt proposal is not about controlling bots; that would be superfluous because robots.txt already serves as the standard for that.

The LLMs.txt proposal is about showing content to LLMs through a text file in markdown format, so that they can consume just the main content of a web page, free of advertising and site navigation. Markdown is a human- and machine-readable format that indicates headings with the pound sign (#) and lists with the minus sign (-). LLMs.txt does a few other things along those lines, and that’s all it’s about.
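To make the markdown description concrete, here is a minimal Python sketch that builds a small markdown document with one heading and one list. The function name `render_markdown`, the site name, and the URLs are all hypothetical and used only for illustration. The output shows the heading and list syntax described above; it is not meant to reproduce the full format in the proposal.

```python
def render_markdown(title, summary, links):
    """Build a small Markdown document: one heading, a summary, and a link list."""
    # A heading is marked with the pound sign.
    lines = [f"# {title}", "", summary, ""]
    for label, url in links:
        # Each list item starts with a minus sign.
        lines.append(f"- [{label}]({url})")
    return "\n".join(lines) + "\n"


# Hypothetical site content, used only for illustration.
example = render_markdown(
    "Example Blog",
    "Articles about search and AI crawlers.",
    [
        ("About", "https://example.com/about.md"),
        ("Latest post", "https://example.com/posts/latest.md"),
    ],
)
print(example)
```

The result is plain text that a person can read as easily as a program can parse, which is the appeal of the format.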

What LLMs.txt is and is not:

  • LLMs.txt is not a way to control AI bots.
  • LLMs.txt is a way to show the main content to AI bots.
  • LLMs.txt is just a proposal and not a widely used and accepted standard.

That last part is important because it relates to what Google’s John Mueller said:

LLMs.txt Is Comparable To Keywords Meta Tag

Someone started a discussion on Reddit about LLMs.txt to ask if anyone else shared their experience that the AI bots were not checking their LLMs.txt files.

They wrote:

“I’ve submitted to my blog’s root an LLM.txt file earlier this month, but I can’t see any impact yet on my crawl logs. Just curious to know if anyone had a tracking system in place, or just if you picked up on anything going on following the implementation.

If you haven’t implemented it yet, I am curious to hear your thoughts on that.”

One person in that discussion shared that they host over 20,000 domains and that no AI agents or bots are downloading the LLMs.txt files; only niche bots, such as one from BuiltWith, are grabbing them.

The commenter wrote:

“Currently host about 20k domains. Can confirm that no bots are really grabbing these apart from some niche user agents…”

John Mueller answered:

“AFAIK none of the AI services have said they’re using LLMs.TXT (and you can tell when you look at your server logs that they don’t even check for it). To me, it’s comparable to the keywords meta tag – this is what a site-owner claims their site is about … (Is the site really like that? well, you can check it. At that point, why not just check the site directly?)”

He’s right: none of the major AI services (Anthropic, OpenAI, and Google) have announced support for the proposed LLMs.txt standard. If none of them are actually using it, then what’s the point?
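Mueller’s remark about server logs is easy to test on your own site. The following is a rough Python sketch that counts requests for /llms.txt per user agent. It assumes the access log uses the common "combined" format written by Apache and Nginx; the function name `count_llms_txt_requests` and the sample log lines are made up for illustration, so adjust the pattern if your server logs differently.

```python
import re
from collections import Counter

# Assumes the "combined" access log format:
# ... "METHOD /path HTTP/x.y" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_llms_txt_requests(log_lines):
    """Count requests for /llms.txt, grouped by user agent string."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # Skip lines that do not fit the assumed format.
        # Drop any query string and compare case-insensitively.
        path = match.group("path").split("?", 1)[0].lower()
        if path == "/llms.txt":
            counts[match.group("agent")] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    '203.0.113.5 - - [01/Jan/2025:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 404 153 "-" "ExampleBot/1.0"',
    '203.0.113.9 - - [01/Jan/2025:10:00:05 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "OtherBot/2.0"',
    '203.0.113.5 - - [01/Jan/2025:10:01:00 +0000] "GET /llms.txt HTTP/1.1" 404 153 "-" "ExampleBot/1.0"',
]
print(count_llms_txt_requests(sample))
```

To run it against a real log, pass an open file object instead of the sample list. An empty result over a period of weeks would match what the commenters in the discussion reported.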

Mueller also raises the point that an LLMs.txt file is redundant: why use the markdown file if the original content (and structured data) has already been downloaded? A bot that uses LLMs.txt would still have to check the other content to make sure the file isn’t spam, so why bother?

Lastly, what’s to stop a publisher or SEO from showing one set of content in LLMs.txt to spam AI agents and another set of content for users and search engines? It’s too easy to generate spam this way, essentially cloaking for LLMs.
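The cloaking concern can be illustrated with a crude check: compare the words in the LLMs.txt content against the words on the actual page. The Python sketch below uses Jaccard overlap (shared words divided by total distinct words). This is only an illustrative heuristic under simple assumptions, not how any AI service is known to validate content, and the function names and sample strings are invented for the example.

```python
import re


def word_set(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard_overlap(text_a, text_b):
    """Return shared words / total distinct words, between 0.0 and 1.0."""
    a, b = word_set(text_a), word_set(text_b)
    if not a and not b:
        return 1.0  # Two empty texts are treated as identical.
    return len(a & b) / len(a | b)


# Made-up strings standing in for the page text and the LLMs.txt text.
page_text = "Our shop sells handmade oak furniture in Portland."
honest_file = "Our shop sells handmade oak furniture in Portland."
spam_file = "Best cheap loans casino bonus crypto winner guaranteed."

print(jaccard_overlap(page_text, honest_file))
print(jaccard_overlap(page_text, spam_file))
```

A check like this needs the real page in hand, which is Mueller’s point: once the page has been fetched to verify the file, the file adds little.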

In that regard it is very similar to the keywords meta tag, which no search engine uses: it would be too sketchy to trust a site’s claim that it is really about those keywords, and search engines are now more sophisticated at parsing content to understand what it’s about.

Read the Reddit discussion here:

LLM.txt – where are we at?

