All About the Robots.txt File, Search Engine Visibility Checkbox, and “noindex” Tag


What is Robots.txt?

Robots.txt is a plain text file that resides in the root directory of a website and gives directions to search engines about which website content should and should not be crawled.

The directions within the robots.txt file are aptly referred to as “directives.”

There are several different types of directives that can be used in a robots.txt file. They include:

  • Allow
  • Disallow
  • Sitemap

A robots.txt file also specifies which “user-agent” (that is, which search engine crawler) each group of directives applies to. An asterisk means the directives apply to all crawlers.
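Directives can also be scoped to a single crawler. For example, this hypothetical file (the /private/ directory is just a placeholder) blocks only Google’s crawler from one directory while leaving all other crawlers unrestricted:

User-agent: Googlebot
Disallow: /private/

User-agent: *
Allow: /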

A standard robots.txt file for WordPress often looks like this:

User-agent: *
Disallow: /wp-admin/ 
Allow: /wp-admin/admin-ajax.php 

Sitemap: https://yoursite.com/sitemap_index.xml

The above robots.txt file is saying that all search engine crawlers should NOT crawl the /wp-admin/ directory, with the exception of the admin-ajax.php file. All other files are allowed to be crawled.

The sitemap line tells search engine crawlers where to find the XML sitemap, a “table of contents” of sorts listing all of the website content that you DO want search engines to crawl.

You may or may not have a sitemap line in your robots.txt file. It’s okay if you don’t; search engines will likely locate your sitemap file anyway (if you have one). But including the line is a good way to ensure that crawlers will find the sitemap easily.
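If your site has more than one sitemap, the robots.txt standard allows multiple Sitemap lines. For example (the news sitemap URL here is just a placeholder):

Sitemap: https://yoursite.com/sitemap_index.xml
Sitemap: https://yoursite.com/news-sitemap.xml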

Another common robots.txt configuration is one that doesn’t block any files, allowing all website content to be crawled. That would look like this:

User-agent: *
Allow: /

The single slash following the Allow directive indicates all website content.

If a single slash follows a Disallow directive, that means that ALL website content is blocked from search engine crawlers. That would look like this:

User-agent: *
Disallow: /

This is a very dangerous configuration. Having a robots.txt file in this state for even just a short time can wipe out YEARS’ worth of Search Engine Optimization (SEO) progress!

What is the Search Engine Visibility checkbox?

The robots.txt file is not the only way that a WordPress site can get blocked from search engines. There is also a simple, yet dangerous, checkbox under the WordPress “Reading” settings that is called the “Search Engine Visibility” checkbox. It is followed by text that reads “Discourage search engines from indexing this site.”

[Screenshot of the WordPress “Reading” settings screen]

If this box is checked, WordPress will generate a meta robots “noindex” tag (line of code) on all pages of the website, which looks like this:

<meta name='robots' content='noindex,follow' />

The “noindex” directive being present in the meta robots tag on all pages of the website is just as destructive as the “Disallow: /” directive being present in the robots.txt file. The mechanism is different (“Disallow” blocks crawling, while “noindex” tells search engines not to include pages in their index), but the practical result is the same: the entire site disappears from search results.

Why are these things so important?

Having a whole site blocked from search engine crawlers can have devastating effects. Every moment that a site remains blocked compounds the damage: growing losses of keyword rankings and the continued loss of incredibly valuable inbound links. These SEO losses can be extremely difficult, or even impossible, to fully recover from.

That’s why we refer to the Search Engine Visibility checkbox as the “Checkbox of Death.” Countless clients contact us every year because their websites suddenly vanished from all search engines.

Inevitably, we find out that someone either accidentally checked that box, or accidentally modified the robots.txt file, or launched a new version of the website that still had one or both of those things configured to block search engines.

That’s why we created the Cross Check plugin.

Many SEO plugins help you set up important SEO settings properly so that your website can be indexed by search engines, but they don’t alert you if those settings get undone!

Cross Check’s Robots.txt and Search Engine Visibility SEO Monitoring WordPress Plugin acts as an “insurance policy” to protect you from catastrophic SEO losses.

How do I view and edit my robots.txt file?

There are several ways to view your robots.txt file:

  1. In a browser, visit yoursite.com/robots.txt
  2. Use the Robots.txt viewer in the Cross Check plugin’s settings screen
  3. Other plugins may also allow you to view the robots.txt file.
    • For example, Yoast will show you the contents of the robots.txt file under SEO -> Tools -> File Editor

Options 1 and 2 above only let you view the file, not edit it. The robots.txt file can be edited in several different ways:

  1. There are plugins that exist solely for the purpose of managing the robots.txt file.
  2. Other plugins, especially SEO plugins, may also be able to edit the robots.txt file, if the file is writable.
    • For example, in the Yoast SEO plugin, the robots.txt file is editable under SEO -> Tools -> File Editor
  3. You can use FTP to upload a new version of the file to the root directory of a website’s public files
  4. If you have a “file manager” feature in your hosting control panel, you can edit the robots.txt file or upload a new version that way
  5. On WordPress versions prior to 5.3, checking the “Discourage search engines from indexing this site” checkbox in the WordPress “Reading” settings also added a “Disallow: /” directive to the virtual robots.txt file that WordPress generates (when no physical robots.txt file is present).

Regardless of how you edit your robots.txt file, ALWAYS do the following:

  • Check to ensure that your changes actually saved to the file itself, by viewing the file in a browser (or with a quick script like the sketch after this list). Go to yourwebsite.com and add /robots.txt to the end of the URL (“yourwebsite.com/robots.txt”) to view the file. If you don’t see your changes, then either:
    • You need to clear your cache (which may need to be done in your performance/speed plugin settings, at your hosting company, at your CDN, or all of the above), or
    • Your robots.txt file is not writable. That will need to be fixed by changing the file’s permissions. If you’re not sure how to do this, contact your hosting company for help.
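If you’d rather check from the command line, here is a minimal sketch (Python 3, standard library only) that fetches your live robots.txt file and warns if it contains the dangerous “Disallow: /” directive. The yoursite.com domain is a placeholder; substitute your own:

import urllib.request

SITE_URL = "https://yoursite.com"  # placeholder -- use your own domain

# Fetch the live robots.txt file, exactly as crawlers will see it
with urllib.request.urlopen(SITE_URL + "/robots.txt") as response:
    body = response.read().decode("utf-8", errors="replace")

print(body)  # confirm that your saved changes are actually being served

# Warn if the site-wide "block everything" directive is present
for line in body.splitlines():
    if line.strip().lower().replace(" ", "") == "disallow:/":
        print("WARNING: 'Disallow: /' found -- all content is blocked from crawlers!")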

BE CAREFUL when editing your robots.txt file. One small mistake in this file can render your whole website inaccessible to search engines. If you’re not confident working with this file, have an SEO expert or your hosting provider’s support staff assist you. Or reach out to us for further assistance.

How do I view and edit the meta robots “noindex” tag?

To view the meta robots tag, and see if it has a “noindex” directive in it, use your browser’s “view page source” function and search through the code for “robots” and/or “noindex.”
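If you prefer a scripted check, here is a minimal sketch (Python 3, standard library only) that fetches a page and flags a “noindex” meta robots tag. The URL is a placeholder, and the regular expression is a simplified check that assumes the tag’s name attribute appears before its content attribute (as in WordPress’s own output):

import re
import urllib.request

PAGE_URL = "https://yoursite.com/"  # placeholder -- use a page from your own site

with urllib.request.urlopen(PAGE_URL) as response:
    html = response.read().decode("utf-8", errors="replace")

# Simplified match for a tag like: <meta name='robots' content='noindex,follow' />
match = re.search(r"<meta[^>]*name=['\"]robots['\"][^>]*>", html, re.IGNORECASE)
if match and "noindex" in match.group(0).lower():
    print("WARNING: noindex found:", match.group(0))
elif match:
    print("Meta robots tag found, no noindex:", match.group(0))
else:
    print("No meta robots tag found on this page.")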

While the meta robots tag can theoretically be added or edited manually in page templates (themes), it is typically controlled by the Search Engine Visibility checkbox (in the WordPress “Reading” settings) or by an SEO plugin (like Yoast).

For example, in Yoast, the noindex tag can be applied to all content of a certain post type under SEO -> Search Appearance -> Content Types, Taxonomies, and Archives.

Yoast also allows you to add a noindex tag to a single post in the “advanced” Yoast SEO settings for that post or page.

I need more help.

No problem! Just contact us and let us know what you need.