Robots.txt Blocking Tests

CheckView respects robots.txt rules and noindex directives by default. If your site’s robots.txt file blocks CheckView’s user agent, or if the target page carries a noindex directive, tests will fail before they run.

How It Works

Before running a test, CheckView checks the site’s robots.txt file and the target page’s meta tags. If the page is disallowed or marked as noindex, the test will fail with one of the following error codes:

  • user-agent-disallowed: The site’s robots.txt file blocks CheckView’s user agent from accessing the page.
  • noindex: The page has a noindex meta tag or HTTP header, indicating it should not be indexed or tested.
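The pre-test check described above can be illustrated with a short Python sketch. This is a hypothetical reconstruction, not CheckView's actual implementation: it uses the standard library's urllib.robotparser for the robots.txt rule and a simple regex for the meta tag, and it omits the X-Robots-Tag HTTP header check (a full implementation would also inspect response headers). The function and user-agent names are placeholders.

```python
import re
from urllib import robotparser

def check_access(robots_txt: str, user_agent: str, url: str, page_html: str):
    """Mimic the pre-test checks: return an error code, or None if allowed."""
    # 1. robots.txt check: is this user agent allowed to fetch the URL?
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    if not rp.can_fetch(user_agent, url):
        return "user-agent-disallowed"
    # 2. noindex check: look for <meta name="robots" content="...noindex...">.
    #    (The X-Robots-Tag HTTP header is not checked in this sketch.)
    meta = re.search(
        r'<meta[^>]+name=["\']robots["\'][^>]+content=["\']([^"\']*)["\']',
        page_html, re.IGNORECASE)
    if meta and "noindex" in meta.group(1).lower():
        return "noindex"
    return None

robots = "User-agent: TestBot\nDisallow: /checkout/\n"
print(check_access(robots, "TestBot", "https://example.com/checkout/", ""))
# prints: user-agent-disallowed
```

With the default setting, either returned error code aborts the test before it runs; with bot restrictions disabled, both checks would simply be skipped.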

The Respect Bot Restrictions Setting

You can control this behavior in Organization Settings > Platform Configuration > Respect Bot Restrictions:

  • Yes (default): CheckView will respect robots.txt and noindex directives. Tests will fail if the page is blocked.
  • No: CheckView will ignore robots.txt and noindex directives and proceed with the test regardless.

When to Disable Bot Restrictions

Consider disabling bot restrictions if:

  • Your robots.txt broadly blocks crawlers but you still want to test your pages.
  • Pages have noindex tags for SEO reasons but should still be tested for functionality.
  • You are testing staging or development environments that block all bots by default.

Fixing robots.txt Issues

If you want to keep bot restrictions enabled but still allow testing, you can relax your robots.txt. The simplest (and broadest) option is to allow all user agents site-wide:

User-agent: *
Allow: /

Alternatively, keep the rest of your robots.txt restrictive and add Allow rules only for the specific paths you want to test, or add a rule block scoped to CheckView’s user agent.
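As a sketch of the path-scoped approach, a restrictive robots.txt with targeted Allow rules might look like the following (the paths are placeholders; substitute the URLs you actually test):

User-agent: *
Disallow: /
Allow: /checkout/
Allow: /contact/

Under the robots.txt standard (RFC 9309), the most specific (longest) matching rule wins, so the Allow entries take precedence over the site-wide Disallow for those paths.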