robots.txt
robots.txt is the filename used for implementing the Robots Exclusion Protocol, a standard used by websites to indicate to visiting web crawlers and other web robots which portions of the website they are allowed to visit.
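As a quick illustration, here is a minimal sketch of a robots.txt that blocks one crawler entirely while leaving the rest of the site open (GPTBot is used as an example user-agent; swap in whichever bots you want to block):

```txt
# Block this crawler from the whole site
User-agent: GPTBot
Disallow: /

# Everyone else may crawl everything
User-agent: *
Allow: /
```

Crawlers that honor the protocol read this file from the site root before fetching anything else; badly behaved bots may simply ignore it.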
AI bot crawlers: how to handle them
There are a couple of projects that aim to help against AI companies' bots that crawl and scrape our content. Among them:
I'm going to set up an update recipe in my Justfile for the robots.txt file that lives in my static folder.
https://github.com/ai-robots-txt/ai.robots.txt/releases/latest/download/robots.txt
curl -L --fail --url https://github.com/ai-robots-txt/ai.robots.txt/releases/latest/download/robots.txt -o static/robots.txt
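Wrapped in a Justfile, that command could look like the sketch below (the recipe name `update-robots` is my choice, not anything the project prescribes; `-L` is needed because GitHub's `releases/latest/download` URL is a redirect to the actual asset):

```just
# Refresh static/robots.txt from the latest ai.robots.txt release.
update-robots:
    curl -L --fail --url https://github.com/ai-robots-txt/ai.robots.txt/releases/latest/download/robots.txt -o static/robots.txt
```

Then `just update-robots` pulls the current blocklist into the static folder whenever you want to refresh it.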