Everything You Need To Know About The X-Robots-Tag HTTP Header


SEO, in its most basic sense, relies on one thing above all others: search engine spiders crawling and indexing your site.

However, nearly every website has pages that you don’t want included in this exploration.

For instance, do you actually want your privacy policy or internal search pages appearing in Google results?

In a best-case scenario, these pages do nothing to actively drive traffic to your site, and in a worst-case scenario, they could be diverting traffic away from more important pages.

Fortunately, Google allows webmasters to tell search engine bots which pages and content to crawl and which to ignore. There are several ways to do this, the most common being the use of a robots.txt file or the meta robots tag.

We have an excellent and detailed explanation of the ins and outs of robots.txt, which you should definitely read.

But in high-level terms, it’s a plain text file that lives in your site’s root and follows the Robots Exclusion Protocol (REP).

Robots.txt provides crawlers with instructions about the site as a whole, while meta robots tags contain directives for specific pages.

Some meta robots tags you might use include index, which tells search engines to add the page to their index; noindex, which tells them not to add a page to the index or include it in search results; follow, which instructs a search engine to follow the links on a page; nofollow, which tells it not to follow links; and a whole host of others.
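For instance, a minimal meta robots tag that keeps a page out of the index while still letting crawlers follow its links would sit in the page’s HTML head and look like this:

<meta name="robots" content="noindex, follow">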

Both robots.txt and meta robots tags are useful tools to keep in your toolbox, but there’s also another way to instruct search engine bots to noindex or nofollow: the X-Robots-Tag.

What Is The X-Robots-Tag?

The X-Robots-Tag is another way for you to control how your webpages are crawled and indexed by spiders. Sent as part of the HTTP header response for a URL, it controls indexing for an entire page, as well as for specific elements on that page.
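To illustrate, here is a sketch of what that looks like in practice (the status line and content type are placeholders for a hypothetical PDF response):

HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex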

And whereas using meta robots tags is fairly straightforward, the X-Robots-Tag is a bit more complicated.

But this, of course, raises the question:

When Should You Use The X-Robots-Tag?

According to Google, “Any directive that can be used in a robots meta tag can also be specified as an X-Robots-Tag.”

While you can apply the same directives with both the meta robots tag and the X-Robots-Tag, there are certain situations where you would want to use the X-Robots-Tag, the two most common being when:

  • You want to control how your non-HTML files are being crawled and indexed.
  • You want to serve directives site-wide instead of at a page level.

For example, if you want to block a specific image or video from being crawled, the HTTP response method makes this easy.

The X-Robots-Tag header is also useful because it allows you to combine multiple tags within an HTTP response, or use a comma-separated list of directives.

Maybe you don’t want a certain page to be cached and also want it to be unavailable after a specific date. You can use a combination of the “noarchive” and “unavailable_after” directives to instruct search engine bots to follow these instructions.
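As a sketch of that combination, the directives can be sent as a comma-separated list in a single header (the date below is only a placeholder):

X-Robots-Tag: noarchive, unavailable_after: 25 Jun 2023 15:00:00 PST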

Essentially, the power of the X-Robots-Tag is that it is much more flexible than the meta robots tag.

The advantage of using the X-Robots-Tag with HTTP responses is that it allows you to use regular expressions to apply crawl directives to non-HTML content, as well as apply directives on a larger, global level.

To help you understand the difference between these directives, it’s helpful to categorize them by type. That is, are they crawler directives or indexer directives?

Here’s a convenient cheat sheet to explain:

Crawler Directives

  • Robots.txt: uses the user-agent, allow, disallow, and sitemap directives to specify where on-site search engine bots are allowed and not allowed to crawl.

Indexer Directives

  • Meta robots tag: allows you to specify and prevent search engines from showing particular pages on a site in search results.
  • Nofollow: allows you to specify links that should not pass on authority or PageRank.
  • X-Robots-Tag: allows you to control how specified file types are indexed.

Where Do You Put The X-Robots-Tag?

Let’s say you want to block specific file types. An ideal approach would be to add the X-Robots-Tag to the Apache configuration or a .htaccess file.


Real-World Examples And Uses Of The X-Robots-Tag

So that sounds great in theory, but what does it look like in the real world? Let’s take a look.

Let’s say we want search engines not to index .pdf file types. This configuration on Apache servers would look something like the below:

# Apply to all PDF files (requires mod_headers)
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>

In Nginx, it would look like the below:

location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex, nofollow";
}

Now, let’s look at a different scenario. Let’s say we want to use the X-Robots-Tag to block image files, such as .jpg, .gif, .png, etc., from being indexed. You could do this with an X-Robots-Tag that would look like the below:

<FilesMatch "\.(png|jpe?g|gif)$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>

Please note that understanding how these directives work and the impact they have on one another is crucial.

For example, what happens if both the X-Robots-Tag and a meta robots tag are present when crawler bots discover a URL?

If that URL is blocked by robots.txt, then the crawler never fetches the page, so any indexing and serving directives cannot be discovered and will not be followed.

In other words, if you want indexing and serving directives to be followed, the URLs containing them cannot be disallowed from crawling.
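As a quick sketch of that pitfall (the /downloads/ directory is hypothetical): with the robots.txt below, crawlers never fetch anything under /downloads/, so an X-Robots-Tag: noindex header on those files is never seen, and the URLs could still end up indexed if other pages link to them.

# robots.txt: blocks crawling of /downloads/, so HTTP headers on those files go unread
User-agent: *
Disallow: /downloads/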

How To Check For An X-Robots-Tag

There are a few different methods that can be used to check for an X-Robots-Tag on a site.

The easiest way to check is to install a browser extension that will show you X-Robots-Tag information for a URL.

Screenshot of Robots Exclusion Checker, December 2022

Another plugin you can use to determine whether an X-Robots-Tag is being used is the Web Developer plugin.

By clicking on the plugin in your browser and navigating to “View Response Headers,” you can see the various HTTP headers being used.
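If you prefer the command line, you can also fetch just the response headers with curl (the URL is a placeholder):

curl -I https://example.com/document.pdf

Any X-Robots-Tag directives set for that file will appear in the output alongside the other response headers.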

Another method that can be used to scale in order to pinpoint issues on websites with a million pages is Screaming Frog.

After running a site through Screaming Frog, you can navigate to the “X-Robots-Tag” column.

This will show you which sections of the site are using the tag, along with which specific directives.

Screenshot of Screaming Frog report, X-Robots-Tag column, December 2022

Using X-Robots-Tags On Your Website

Understanding and controlling how search engines interact with your website is the cornerstone of search engine optimization. And the X-Robots-Tag is a powerful tool you can use to do just that.

Just be aware: It’s not without its perils. It is very easy to make a mistake and deindex your entire site.

That said, if you’re reading this piece, you’re probably not an SEO newbie. As long as you use it wisely, take your time, and check your work, you’ll find the X-Robots-Tag to be a useful addition to your arsenal.