The default robots.txt file

Description

If you visit your website's root URL and add /robots.txt on the end, you'll find our default set of bot blocks and rate-limiting rules. The goal of this default is to stop bots from crawling irrelevant dynamic URLs, so that resources like CPU and RAM aren't wasted on unnecessary crawling. Bot crawling is an extremely common source of resource overuse on other hosts, so this is one component of our bot protection that helps ensure bots aren't a problem for your sites' performance.
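
For example, you can check what is currently being served for a given URL with a short script like the one below. It's a minimal sketch using Python's standard library; the domain, the user agent, and the cart URL are placeholders for illustration, not values taken from our default rules.

    from urllib.robotparser import RobotFileParser

    # Point the parser at the robots.txt served for your site.
    # "example.com" is a placeholder -- substitute your own domain.
    rp = RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()

    # Ask whether a given crawler would be allowed to fetch a URL.
    # The user agent and URL below are illustrative only.
    print(rp.can_fetch("Googlebot", "https://example.com/cart.php?a=add"))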

Details & Overriding

If you need to use your own custom robots.txt, you can simply create one in your website root directory and it will override ours.

Some notes about this:

  1. If you're on shared hosting and we find that, with your custom robots.txt in place, bots are unnecessarily accessing dynamic processed resources that our robots.txt would have blocked, we may remove your file in favour of ours. We therefore strongly recommend copying the contents of our existing robots.txt into your replacement before adding your own rules to it (see the sketch after this list).
  2. You will not see an actual robots.txt file anywhere in your web root -- this 'document' is served by the web server's configuration rather than from an actual file.
  3. You cannot use a dynamic robots.txt file. This is true for two reasons: 1) a dynamically generated robots.txt file consumes dynamic resources when it's totally unnecessary, and 2) it's not possible to detect when a dynamic robots.txt file should be used and serve it up instead. Therefore, if you have software that tries to serve robots.txt dynamically, you should ask the software developer for a write-to-disk mode where it actually creates the robots.txt file, as shown in the sketch below. Doing it this way is both better for performance AND works with our optimizations.
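
To follow notes 1 and 3 together, one approach is to capture the default rules currently being served for your site, append your own rules, and save the result as a static robots.txt in your web root. The sketch below is a minimal illustration using Python's standard library; the domain, web-root path, and extra rules are placeholders, not values taken from our configuration.

    import urllib.request

    # Placeholder values -- adjust them for your own site and hosting account.
    SITE = "https://example.com"
    WEB_ROOT = "/home/youruser/public_html"

    # Extra rules to layer on top of the default ones (illustrative only).
    CUSTOM_RULES = "\n".join([
        "# --- custom additions ---",
        "User-agent: SomeSpecificBot",
        "Disallow: /",
        "",
    ])

    # 1. Grab the default robots.txt currently being served for the site.
    with urllib.request.urlopen(SITE + "/robots.txt") as resp:
        default_rules = resp.read().decode("utf-8")

    # 2. Write a static robots.txt to the web root: the default rules first,
    #    then your own additions. Because the result is an ordinary file on
    #    disk, no dynamic processing happens when bots request it.
    with open(WEB_ROOT + "/robots.txt", "w", encoding="utf-8") as f:
        f.write(default_rules.rstrip("\n") + "\n\n" + CUSTOM_RULES)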

 

