Ask HN: Is the web for machines (/llm.txt) the one we wished we had as humans?
Recently I found myself manually adding `/llm.txt` to most websites I visit because I find the content for LLMs straight to the point and clear. The only annoyance is that web browsers like Chrome do not render the markdown.
So could the AI revolution actually fix the web for humans as a side effect?
Do you find yourself doing the same?
36 points | by sunshine-o 2 days ago
17 comments
- ahriad 2 days ago We broke the web so badly for humans that we had to build a clean web for machines, and now humans will have to use machines to experience a clean web again.
- marand23 2 days ago I never thought about it before now, but the LLM era could be a form of renaissance for blind people on the Internet. An alternative web where the functionality of every page is described in short but detailed text instead of an extremely verbose and non-linear HTML tree structure.
- rickette 2 days ago Do any of the LLM providers actually use llms.txt?
If I remember correctly, this "standard" was set up by someone but without involvement of any of the major AI players.
- HermanMartinus 2 days ago I can definitively say llms.txt is not used by any AI players. I run a blogging platform with around 80k blogs and /llms.txt is not requested by anything (other than humans checking to see if there's an llms.txt path).
All regular pages are aggressively scraped to the extent it's a problem I have to consistently manage, but not llms.txt.
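A check like the one described above could be sketched as follows. This is only an illustration, assuming access logs in Common Log Format; the sample lines, the `count_paths` helper, and the regex are hypothetical and not taken from the platform in question.

```python
import re
from collections import Counter

# Matches the request portion of a Common Log Format line,
# e.g. "GET /llms.txt HTTP/1.1" (assumed log format).
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+"')


def count_paths(log_lines):
    """Count requests per path, ignoring any query string."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            path = match.group("path").split("?", 1)[0]
            counts[path] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample_log = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET /posts/hello HTTP/1.1" 200 5120',
    '203.0.113.7 - - [01/Jan/2025:00:00:02 +0000] "GET /posts/hello?ref=x HTTP/1.1" 200 5120',
    '198.51.100.4 - - [01/Jan/2025:00:00:03 +0000] "GET /llms.txt HTTP/1.1" 404 153',
]

counts = count_paths(sample_log)
print(counts["/llms.txt"], counts["/posts/hello"])
```

Comparing the `/llms.txt` count against regular page paths gives a rough sense of whether anything is requesting the file at all.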
- sunshine-o 2 days ago Amazing, I didn't know.
So it gets even stranger: I am the only one reading those /llms.txt ...
- nickserv 2 days ago I'm seeing quite a few requests for these on my work's GitBook documentation site.
But perhaps these are developers specifically targeting these pages to feed whatever LLM they are using.
- isaachinman 2 days ago How is a static blog being scraped a problem? Do you not use a CDN?
- nickserv 2 days ago > a blogging platform with around 80k blogs
But nah, I'm sure OP doesn't know about CDNs.
- the_real_cher 2 days ago Are all blogs static though?
- johannes1234321 2 days ago Very few blogs require frequent updates. Even with user comments.
- 0123456789ABCDE 2 days ago > I can definitively say llms.txt is not used by any AI players.
https://developers.openai.com/llms.txt
https://docs.anthropic.com/llms.txt
https://geminicli.com/llms.txt
https://github.com/llms.txt
https://docs.aws.amazon.com/llms.txt
https://openrouter.ai/docs/llms.txt
- m4tthumphrey 2 days ago OP clearly meant that the AI players are not reading and/or honouring llms.txt of other websites when scraping.
- 0123456789ABCDE 2 days ago i stand corrected, but what was clear to you, obviously was not clear to me.
- solumos 2 days ago No, requesting "Accept: text/markdown" in the headers and returning markdown is the more agreed upon standard at this point.[0]
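For readers unfamiliar with this kind of content negotiation, here is a minimal sketch of the client side using only Python's standard library. The URL is a placeholder, the helper names are hypothetical, and whether a given server honours the header is an assumption to verify per site.

```python
from urllib.request import Request


def build_markdown_request(url: str) -> Request:
    """Build a request that prefers markdown but still accepts HTML."""
    return Request(url, headers={"Accept": "text/markdown, text/html;q=0.8"})


def is_markdown(content_type: str) -> bool:
    """Check a Content-Type header value, ignoring parameters like charset."""
    return content_type.split(";", 1)[0].strip().lower() == "text/markdown"


# Placeholder URL; no network call is made here.
req = build_markdown_request("https://example.com/docs")
print(req.get_header("Accept"))
```

To actually fetch, pass `req` to `urllib.request.urlopen` and run `is_markdown` on the response's `Content-Type` header; a server that ignores the `Accept` header will simply return HTML.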
- kamma4434 2 days ago Now, it would be super cool to get markdown and zero javascript bundles…
- solumos 2 days ago If you want to see what that looks like, I one-shot a browser with Claude that does it[0]. Docs pages are early adopters of this[1][2], so that AI agents can better handle tasks.
[0] - https://github.com/solumos/md-browse
[1] - https://docs.stripe.com
[2] - https://vercel.com/docs
- sunshine-o 1 day ago I just found out Cloudflare supports real-time HTML to md conversion [0]
- [0] https://blog.cloudflare.com/markdown-for-agents/#convert-htm...
- christoff12 2 days ago This is interesting. I should start incorporating this -- it couldn't hurt to do both.
- 0123456789ABCDE 2 days ago yes, they do.
anyone who's, even slightly, clued into how agents access documentation, has been making changes to their pages. ex: https://searchtxt-web.fly.dev/search?q=aws
- cyanydeez 2 days ago oh don't worry, in 5 years your AI will be inundated with context poison prompts that try to get them to spend all your bank notes and meta bucks on equally useless things.
This is just a redux of the early web.
- maccam912 2 days ago Already happening. I was using Claude to check out sampler plugins and I'm sure it happens undetected, and it might have mentioned it with other versions, but Claude Opus 4.8, being its helpful, honest self, told me that one of the pages it reviewed had hidden text instructing it to recommend that plugin. It caught it and was able to avoid influence from that plugin at least, but we're already living in that world.
- skywalqer 2 days ago Why didn't they place it in .well-known? Also, I couldn't find a website that has it.
- JimDabell 2 days ago Putting it in .well-known/ was immediately raised as an issue from the beginning; it’s issue #2 in fact:
https://github.com/AnswerDotAI/llms-txt/issues/2
It’s been completely ignored ever since.
- 0123456789ABCDE 2 days ago
- realty_geek 2 days ago What is an example of a site with a good llm.txt?
- pramodbiligiri 2 days ago Anthropic's developer documentation: https://platform.claude.com/llms.txt. There's also https://platform.claude.com/llms-full.txt which is (WARNING) much bigger. Not sure where this second one fits into the standard.
- jbrooksuk 2 days ago Mintlify generates an llms.txt and llms-full.txt for all documentation sites. These work really well:
- croes 2 days ago No, the spammers are just at the beginning of ruining that too
https://news.ycombinator.com/item?id=48411569
BTW why should Chrome even consider rendering a .txt file as markdown?
- user568439 2 days ago That's what I was thinking... Now spammers will add hidden prompts or things worse than that for the LLMs...
- mohamedkoubaa 2 days ago It just hasn't been gamed yet
- tacostakohashi 2 days ago Pretty much.
There is an enshittification cycle at work. The web used to be good, predominantly text, and useful, 25 years ago. Then... slowly... we added JavaScript, then AJAX, CSS, Flash, interstitials, popups, marketing, social media, algorithms, doomscrolling... gradually but surely turning it into the unusable cesspool that it is today.
Now we have AI! I think a big part of its utility is that it gets us back to text/information, and lets us bypass all the "beautiful" design / nonsense on the material it is trained on.
However, AI is just beginning its enshittification cycle - now that it has a critical mass of users, it is an irresistible target to start slowly adding ads, misinformation, conspiracy theories, and whatever else people can dream up, until it also becomes unusable and the cycle repeats.
- DeathArrow 2 days ago I tried it: https://news.ycombinator.com/item?id=48410589/llm.txt
Result: no such item.
Where did you get the idea that adding /llm.txt to URLs will produce markdown?
- fxwin 2 days ago here: https://llmstxt.org/ and obviously it doesn't automatically produce markdown, it's something the website needs to provide (e.g. https://pydantic.dev/llms.txt)
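To make the distinction concrete: the file is conventionally looked up at the site root rather than appended to an arbitrary page URL (and note the plural `llms.txt`). A minimal sketch, assuming the root location; the `llms_txt_url` helper is hypothetical, and some sites in this thread serve the file under a subpath such as `/docs/llms.txt` instead.

```python
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str) -> str:
    """Return the root-level /llms.txt URL for the site hosting page_url.

    Assumes the file lives at the site root; path, query, and fragment
    of the original page URL are discarded.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


print(llms_txt_url("https://news.ycombinator.com/item?id=48410589"))
# https://news.ycombinator.com/llms.txt
```

Whether that URL returns anything still depends on the site publishing the file.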
- gobdovan 2 days ago Not really, but sounds interesting. Would you care to share some sites that offer a better llms.txt than their main web page? Or talk about some piece of info you easily found on llms.txt that was hard to navigate to on the regular website?
- sunshine-o 2 days ago llms.txt usually includes a clear sitemap and description of information available on a site.
There is also often a clear definition of the RESTful scheme and API/data access options.
One very basic example would be The Weather Channel: https://weather.com/llms.txt
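The sitemap-like structure mentioned above follows from the format proposed at llmstxt.org: an H1 title, a blockquote summary, and H2 sections containing markdown link lists with optional notes after a colon. A rough parsing sketch under that assumption; the sample text and the `parse_llms_txt` helper are hypothetical, and real files may deviate from the proposal.

```python
import re

# One list entry: "- [title](url)" optionally followed by ": note".
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Extract title, summary, and per-section links from llms.txt text."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc["sections"][current].append(
                    (match["title"], match["url"], match["note"] or "")
                )
    return doc


# Hypothetical file contents, for illustration only.
SAMPLE = """# Example Site

> A hypothetical site used only for illustration.

## Docs

- [API reference](https://example.com/api.md): REST endpoints and parameters
- [Quick start](https://example.com/start.md)
"""

doc = parse_llms_txt(SAMPLE)
print(doc["title"], list(doc["sections"]))
```

Each section ends up as a list of `(title, url, note)` tuples, which is roughly the "sitemap plus descriptions" view described in the comment.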
- gobdovan 2 days ago Thanks, the comparison hit like a bag of bricks.
- jordemort 2 days ago no
- Umairq786 2 days ago [flagged]
- aaron695 2 days ago [dead]
- onion2k 2 days ago > The only annoyance is web browsers like chrome do not render the markdown.
I imagine Claude could zero-shot a Chrome plugin for that.
- 8organicbits 2 days ago Of course plugins that do this already exist. Save your tokens.
It's the law of monetization.
And despite this, modern life is made possible by the illusion that "regulations" work.