Dec 19, 2012
 

Yesterday we reviewed how to redirect image attachment pages to help visitors and fix duplicate content SEO issues. Today we look at how to tell search engines not to index other potential duplicate pages such as date archives and tags.

The reasons for doing so are best described in an awesome post over at SEOmoz – well worth a read first to understand how default settings in WordPress may affect SEO.

The article illustrates how to use Yoast’s plugin to configure more SEO-friendly indexing. However, such plugins do so much that they may be overkill if all you want is to configure noindexing.

We’ll provide a code snippet below which does this basic job (although you may find that the extra features of an SEO plugin would prove very useful). First, let’s consider what the article suggests as best practice for a single author blog – for visitors and search engines:

  • Pages, Posts and Categories – accessible to visitors, index and follow.
  • Tags, Dated Archives and Pagination – accessible to visitors, noindex and follow. [An example of pagination is Tips/page/2 etc i.e. older posts in the Tips category]
  • Author (default) – 301 redirect visitors to homepage, noindex and follow.
  • Author (custom) – accessible to visitors, index and follow.

We would add the following too:

  • Search Result Page Archives – accessible to visitors, noindex and follow.
  • 404 Page Not Found – accessible to visitors, noindex and follow (should be noindexed by search engines anyway but worth adding as a backup)

Assuming that these SEO criteria exactly match what you want for your WordPress site, the following code snippet should do the job. Insert the snippet into your theme’s header.php file – between the <head> and </head> tags:

<?php
// Noindex duplicates – author, date archive, pagination, tags, 404, search
if ( is_author() || is_date() || is_paged() || is_tag() || is_404() || is_search() ) {
  echo '<meta name="robots" content="noindex, follow" />';
} else {
  echo '<meta name="robots" content="index, follow" />';
} ?>

The code logic is simple – if a page matches one of the selected criteria, a robots meta tag is placed in the page header telling search engine robots to follow any page links but not to index the page.

For all other pages, a robots meta tag is placed in the page header telling search engines to index the page and follow any links. [According to Google, you don’t actually need to add an index instruction because the default behaviour is to index all pages unless there is a specific noindex; however, it does no harm and makes it easier to double-check your changes]
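If you prefer not to edit theme files, the same logic can be attached to the wp_head hook from your theme’s functions.php – WordPress then prints the tag in the header for you, and the change survives edits to header.php. (A sketch: the function name myprefix_robots_meta is our own choice, not a WordPress API.)

```php
<?php
// Print the robots meta tag via the wp_head hook instead of
// editing header.php directly. Priority 1 outputs it early in <head>.
function myprefix_robots_meta() {
	if ( is_author() || is_date() || is_paged() || is_tag() || is_404() || is_search() ) {
		echo '<meta name="robots" content="noindex, follow" />' . "\n";
	} else {
		echo '<meta name="robots" content="index, follow" />' . "\n";
	}
}
add_action( 'wp_head', 'myprefix_robots_meta', 1 );
```

If you use this version, remove the snippet from header.php so the tag is not printed twice.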

In addition, to 301 redirect the default author page to the site homepage, you would need to add the following line to your .htaccess file:

Redirect 301 /author/yourname http://yoursite.com/
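On a single author blog where no author page should remain reachable, mod_alias’s RedirectMatch accepts a regular expression and can catch every author slug in one line (a sketch – test the pattern against your own permalink structure before relying on it):

```apache
# Redirect every /author/... page to the homepage (mod_alias)
RedirectMatch 301 ^/author/ http://yoursite.com/
```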

Testing The Changes

To avoid unintentional noindexing by search engines, it is vital to test that the changes do what you actually intended. Browse to each type of page on your site and view the page source – check that one of the following statements is included in the header and that the decision to index or not is correct for that page type:

<meta name="robots" content="index, follow" />  or  <meta name="robots" content="noindex, follow" />

To test the author page redirect, browse to your author page and you should be automatically taken to your site homepage.

Customizing The Changes

It is important to note that the code snippet reflects the suggested best practice of the SEOmoz article – this may not be suitable for your own site e.g. you may prefer to index tags and/or categories etc.

You must make the right decision for your site as to what content search engines should index but, if you prefer not to use a plugin, the above code should at least get you started. It’s easy enough to modify the conditional IF statement to suit your own needs.
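For example, if you wanted to keep tags indexed but noindex category archives instead, the condition might become (a sketch – swap in whichever WordPress conditional tags match your own indexing policy):

```php
<?php
// Variant: noindex category archives instead of tags
if ( is_author() || is_date() || is_paged() || is_category() || is_404() || is_search() ) {
  echo '<meta name="robots" content="noindex, follow" />';
} else {
  echo '<meta name="robots" content="index, follow" />';
} ?>
```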

Conclusion

Many users will find that an SEO plugin is ideal for resolving duplicate content in WordPress.

However, for those not wishing to take the plunge, or who prefer the more streamlined manual approach, a code snippet can do an effective job.

  2 Responses to “How To Noindex Duplicates In WordPress Without A Plugin”

  1. Hi.. I pasted this code snippet into my header.php a week ago but I still see 404 crawl errors under my Google Webmaster account. How long does it usually take for the 404 errors to disappear from the webmaster account?

    • @Satish – this snippet includes noindex of your 404 “page” i.e. the standard page displayed when a visitor follows a dead/incorrect link.

      The 404 errors you refer to in your account are not the 404 page itself – they’re the actual webpages which are not present (e.g. perhaps you deleted a few articles?) – you don’t need to worry about those as Google will remove them from their Search index and your webmaster account eventually – it can take months.

      Google do say that 404 errors don’t hurt your site though – assuming you don’t have 100s more of them than you have actual pages. (Obviously if you have 404s because of incorrect links then you should correct these links so that visitors can browse the site properly)
