How can you avoid duplicate content in WordPress?


The way WordPress arranges and structures content is convenient for users, but it becomes a problem when search engines index roughly three times more pages than they should. On average, for a fresh WordPress installation with 10 published posts, Google ends up indexing close to 30 URLs, all pointing back to just 10 unique articles.

You might assume that having more pages in Google's index means more traffic, but you are more likely to attract a duplicate-content penalty than a boost in traffic. Clearing up this mess has obvious advantages: better crawling results, better PageRank distribution and an index that contains only your original content.

The problem can be solved quite simply by adding a "noindex,follow" robots meta tag to the unwanted pages. Spiders can still crawl these pages and follow their links; however, the search engines will no longer include them in the index.

Using the built-in WordPress conditional tags, you can add the code below to your theme's header.php file, just before the </head> tag.

<?php
// Paginated archive pages (page 2, 3, ...) – keep them out of the index.
if ( is_paged() ) {
  echo '<meta name="robots" content="noindex,follow" />';
}

// Author archive pages.
if ( is_author() ) {
  echo '<meta name="robots" content="noindex,follow" />';
}

// Trackback URLs.
if ( is_trackback() ) {
  echo '<meta name="robots" content="noindex,follow" />';
}
?>
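
If you would rather not edit header.php directly, the same checks can be hooked into wp_head from your theme's functions.php. Here is a minimal sketch, assuming a child theme is in place; the function name is just for illustration.

<?php
// Minimal sketch: output the robots meta tag through the wp_head hook instead of
// editing header.php directly (assumes this lives in a child theme's functions.php).
function lacey_noindex_duplicate_pages() {
  if ( is_paged() || is_author() || is_trackback() ) {
    echo '<meta name="robots" content="noindex,follow" />' . "\n";
  }
}
add_action( 'wp_head', 'lacey_noindex_duplicate_pages' );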

I am allowing Googlebot and the other search engine spiders to crawl and index my categories, because they contain only the excerpts of my blog posts, so their content is not the same as the content on the single post pages. A short sketch of how that looks in a theme follows below.
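
As a rough illustration, this is what an excerpt-only loop might look like in a theme's category.php or archive.php (a minimal sketch, assuming your theme lets you edit those templates).

<?php
// Minimal archive loop sketch: printing excerpts instead of full post content
// keeps category pages from duplicating the single post pages.
if ( have_posts() ) {
  while ( have_posts() ) {
    the_post();
    the_title( '<h2>', '</h2>' );
    the_excerpt(); // excerpt only, not the_content()
  }
}
?>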

You can use the following code if you want to exclude category pages from Google's index as well.

<?php
// Category archive pages – only needed if you want them out of the index too.
if ( is_category() ) {
  echo '<meta name="robots" content="noindex,follow" />';
}
?>

Google tends to favour pages with large amounts of content, so pages such as category or archive pages will most likely receive more credit than, for example, single post pages.

After adding the code above you might notice a small drop in traffic, but it should be temporary, lasting only until the link juice is redistributed among the pages that remain in the index.

You can check that everything went well by running a site:example.com search query and seeing whether the pages you wanted to remove from Google's index are still there. Depending on your website's crawl rate, it can take anywhere from a couple of hours to a few days for the changes to appear in the results pages.
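
In the meantime, you can also confirm that a given page is actually serving the tag. A quick sketch (a hypothetical helper, not part of WordPress; it assumes allow_url_fopen is enabled, and the author URL is just an example):

<?php
// Fetch a URL and report whether its HTML contains a "noindex" robots meta tag.
function has_noindex_tag( $url ) {
  $html = file_get_contents( $url ); // requires allow_url_fopen
  if ( false === $html ) {
    return null; // could not fetch the page
  }
  return (bool) preg_match( '/<meta[^>]+name=["\']robots["\'][^>]+noindex/i', $html );
}

var_dump( has_noindex_tag( 'https://example.com/author/admin/' ) );
?>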

Ben Lacey is the Managing Director of Lacey Tech Solutions. He is passionate about everything to do with websites, from design and development through to search optimisation and hosting. He started the company blog as a platform to help educate current and prospective customers about the ever-changing website development industry.

What are your thoughts?

We hope you found our article useful and look forward to answering your questions.