5 Ways to Deal with Duplicate Content in WordPress

What is Duplicate Content?

Plenty of people have opinions about duplicate content, and I could give you my own definition, but I think it is best to start with what Google classifies as duplicate content.

“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin. Examples of non-malicious duplicate content could include:

  • Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices
  • Store items shown or linked via multiple distinct URLs
  • Printer-only versions of web pages”

Google goes on to state that you should “Understand your content management system: Make sure you’re familiar with how content is displayed on your web site. Blogs, forums, and related systems often show the same content in multiple formats. For example, a blog entry may appear on the home page of a blog, in an archive page, and in a page of other entries with the same label.”

Understanding How WordPress Duplicates Content

WordPress is an awesome and powerful content management system. Unlike other open source content management systems such as Joomla or Drupal, WordPress is a lot easier to use (I banged my head against the wall using Joomla a few times and quickly started looking for a different solution). WordPress is also more automated than either Joomla or Drupal; for example, it has one-click updates for both the core and plugins. Furthermore, WordPress lets you easily schedule content to be published so that you can set it and forget it (this can also be done in Joomla, but the WordPress interface is much easier to use).

Even though WordPress is both powerful and easy to use, it does have some downsides. For instance, with its default out-of-the-box settings, WordPress publishes and organizes content in a way that creates duplicate content issues. These can confuse the search engines, which is something we want to avoid. Remember, Google is your best friend if you give it what it wants, which frankly is great content and user-friendly websites.

Duplicate content confuses the search engines like this: if the same content is published and organized on your website in several different ways, as WordPress does, then Googlebot (Google’s search engine spider/robot) will not know which page to index. That may make your site appear to be of low value, or make it look like you are trying to trick Googlebot (something that Google doesn’t like). Let’s take a closer look at how WordPress creates duplicate content.

Here’s how duplicate content is generated in a typical WordPress setup. Say you or someone else creates a Post titled How to Find the Best Roofing Company in Edmonton, selects two categories (Edmonton Roofer and Edmonton Roofing) for the Post and one Tag for the Post (finding a roofer). When the Post is published, WordPress then does the following:

The Post is published as www.edmontonroofingpros.com/how-to-find-the-best-roofing-company-in-edmonton.

  1. Category pages are created. An entry for the Post containing its title, a link to the Post, and a 55-word summary (called the excerpt in WordPress parlance) is created on two category pages, at www.edmontonroofingpros.com/category/edmonton-roofer and www.edmontonroofingpros.com/category/edmonton-roofing.

  2. A Tag page is created. Another duplicate entry of the Post with a title, link, and 55-word excerpt is created on a tag page here: www.edmontonroofingpros.com/tags/finding-a-roofer

  3. If that wasn’t enough repetition, WordPress also creates yearly and monthly archives that contain another copy of the title, a link to the Post, and the excerpt. The archives display in this format: www.edmontonroofingpros.com/2011 and www.edmontonroofingpros.com/2011/07.

  4. Furthermore, WordPress also creates an author archive, publishing the exact same post under the author’s name.
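To make the repetition concrete, here is a rough sketch that lists every URL on which the example Post, or an entry for it, ends up. It only mirrors the URL patterns described above. The `archive_urls` helper, the `/author/` path and the author slug `matt` are assumptions made for illustration, and the real paths depend on your permalink settings.

```python
from datetime import date


def archive_urls(base, slug, categories, tags, author, published):
    """List every URL on which a post (or an entry for it) would appear."""
    urls = [f"{base}/{slug}"]                              # the Post itself
    urls += [f"{base}/category/{c}" for c in categories]   # one page per category
    urls += [f"{base}/tags/{t}" for t in tags]             # one page per tag
    urls.append(f"{base}/{published.year}")                # yearly archive
    urls.append(f"{base}/{published.year}/{published.month:02d}")  # monthly archive
    urls.append(f"{base}/author/{author}")                 # author archive (assumed path)
    return urls


urls = archive_urls(
    base="www.edmontonroofingpros.com",
    slug="how-to-find-the-best-roofing-company-in-edmonton",
    categories=["edmonton-roofer", "edmonton-roofing"],
    tags=["finding-a-roofer"],
    author="matt",               # hypothetical author slug
    published=date(2011, 7, 1),  # the day is arbitrary; only year and month are used
)
for url in urls:
    print(url)
```

One Post with two categories and one tag shows up at seven addresses, and every extra category or tag adds one more.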

We don’t want to disable these pages because they help users find content, but we have to find a way to control duplicate content to maximize our SEO efforts. There is one other wrinkle at work in this scenario. The 55-word excerpt that WordPress generates automatically for each Post (unless we write an excerpt manually) is taken from the first 55 words of the Post! So the excerpt itself is a partial duplicate of the actual content. That leaves us with two problems:

  • The excerpt is a partial duplicate of existing content

  • Multiple archive pages with duplicate content

If this duplicate content isn’t controlled, a search engine might index your Tag index page instead of the Post page itself, which is not an ideal scenario.
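The excerpt problem is easy to see in a small sketch. This is not WordPress’s own code, just an approximation of the behaviour described above: take the first 55 words of the Post and append a “more” marker. The `auto_excerpt` name and the `[...]` marker are my own placeholders.

```python
def auto_excerpt(post_text, word_limit=55, more=" [...]"):
    """Return the first `word_limit` words of a post, followed by a marker."""
    words = post_text.split()
    if len(words) <= word_limit:
        return " ".join(words)  # short posts are returned whole, with no marker
    return " ".join(words[:word_limit]) + more


# Stand-in for a 200-word Post.
post = " ".join(f"word{i}" for i in range(1, 201))

excerpt = auto_excerpt(post)
print(excerpt)

# With the marker left off, the excerpt is literally the start of the Post.
print(post.startswith(auto_excerpt(post, more="")))
```

Because the excerpt is a straight copy of the opening of the Post, every archive page that shows it repeats the same text. Writing a manual excerpt is the only way to make it differ.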

How to Solve Duplicate Content Issues with WordPress

So what is the answer to the WordPress duplicate content problem?  Read on to discover five ways to solve it.

1.  Use Canonical URLs

This is something you should do not only with WordPress but with all of your other websites as well, whether they are powered by WordPress or not. Perhaps you’re reading this and wondering what I mean by canonical URLs (don’t feel that you’re alone; I had no idea what they were either). A canonical URL basically tells the search engines and everyone else which URL structure you want to use. For instance, you may not know this, but Google sees www.edmontonroofingpros.com and edmontonroofingpros.com as two different websites (referred to as the www and non-www versions). Therefore you need to tell Google, users and other search engines which URL structure you prefer. Some people claim it is easier for users not to have to type www. I prefer to use it, but that is just personal preference.
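As a rough illustration of the idea, the sketch below rewrites the www and non-www variants of an address to one preferred host and builds the `rel="canonical"` link element that search engines read. The function names, the `http` scheme and the `/about` path are assumptions for the example. It also expects full URLs that include the scheme.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.edmontonroofingpros.com"  # the www form, as preferred above


def _strip_www(host):
    """Drop a leading 'www.' so both variants of a host compare equal."""
    return host[4:] if host.startswith("www.") else host


def canonical_url(url, preferred_host=PREFERRED_HOST):
    """Rewrite the www / non-www variants of a URL to the preferred host."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if _strip_www(host) == _strip_www(preferred_host):
        host = preferred_host
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, ""))


def canonical_link_tag(url):
    """Build the link element that goes in the head of the page."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}" />'


print(canonical_url("http://edmontonroofingpros.com/about"))
print(canonical_link_tag("http://edmontonroofingpros.com/about"))
```

URLs on other hosts pass through unchanged, so the helper only ever collapses the two variants of your own domain. In practice you would pair the tag with a redirect from the version you don’t want to the one you do.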

2.  Structure Your Website Using Categories

This is really the ideal way to structure your WordPress website for SEO and usability. Using categories to organize your posts creates keyword silos, that is, areas of your site whose content is related by keyword theme. When the content on your site is related to a specific category, it appears more relevant to the search engines and to your users.

3.  Have Page Excerpts on Your Home Page Rather Than Full Posts

If you have WordPress configured so that your home page is where posts are published, or you have set up a dedicated page for blog posts, you will want to show excerpts on that page rather than full posts, so that the search engine spiders follow the link and index each post in its proper place.  You can read more about this in my post, How to Easily Have Page Excerpts with the Headway Theme.

4.  Use Only One Category per Post

This should go without saying, but I’ll mention it anyway. Only assign one category to each post. If you assign multiple categories, you will further confuse the search engines and create even more duplicate content issues to resolve. Assigning one category per post makes things easier on you, your readers and the search engines. Coming up with the proper categories should, again, be part of your keyword research. Also, focus on ranking for one keyword at a time.

5.  Use “noindex” to Tell the Search Engines Not to Index Pages

There is a robots meta tag that you can use to tell the search engines that you do not want them to index a certain page, which helps you avoid duplicate content. For instance, you can tell the search engine robots not to index your page or follow the links on it by putting a robots meta tag with the value “noindex, nofollow” in the head section of the page. With WordPress, the best way to accomplish this is to use either the Headway theme or Yoast’s WordPress SEO plugin. I prefer using Yoast’s plugin, so I’ve shown the settings to change in the screen capture below.

[Screenshot: meta robots settings in Yoast’s WordPress SEO plugin]
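If you are curious what such a setting boils down to, here is a minimal sketch that picks a robots meta tag by page type. It is not how Headway or Yoast’s plugin implements it. The `robots_meta_tag` helper, the page type names and the choice of which archives to exclude are assumptions for illustration. I’ve left category pages indexable to match the category-based structure recommended above.

```python
# Page types to keep out of the index. This selection is an assumption made
# for illustration; category pages are deliberately left out of it.
NOINDEX_PAGE_TYPES = {"tag", "date", "author"}


def robots_meta_tag(page_type):
    """Return the robots meta tag to print in the head of a page."""
    if page_type in NOINDEX_PAGE_TYPES:
        content = "noindex, nofollow"  # the value mentioned in the text
    else:
        content = "index, follow"
    return f'<meta name="robots" content="{content}" />'


for page_type in ("post", "category", "tag", "date", "author"):
    print(page_type, robots_meta_tag(page_type))
```

A common variation is “noindex, follow”, which keeps the archive page out of the index but still lets the spiders follow its links through to the posts themselves.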

Tools To Solve WordPress Duplicate Content Issues

1.  Duplicate Content Cure Plugin

There is a simple yet powerful plugin that you can install and then set and forget. It tells the search engines not to follow or index your archive, paged and category pages. The benefit of this plugin is that you can simply install it and it will do all the work for you. The downside is that you have no control over what gets indexed. For instance, I wrote earlier that I organize my websites by category, and the Duplicate Content Cure Plugin would not let me keep those category pages indexed, which is why I don’t use it.

2.  Use an SEO Optimized Theme Such as Headway

I make no secret of the fact that I love Headway and use it to build and power a majority of my sites. Headway is so powerful and flexible that you can build almost any type of site with it. For the reasons why I use Headway, check out my post on 7 Reasons Why I Use Headway (and you should too).

One of those reasons is that Headway has SEO options built in. I can choose which pages to tell the search engines not to index, I can add my own custom titles and meta descriptions, and Headway will remove a list of predefined words such as “the”, “of” and “a” from my WordPress URL slugs.

Headway is a great theme, not only for its built in SEO options but also for its powerful capabilities (such as the Visual Editor and its Drag and Drop Framework).

3. WordPress SEO Plugin by Yoast

This is my method of choice for optimizing my sites for the search engines. I used to simply use Headway’s built-in options until Joost released his WordPress SEO plugin. It has so many features and options that I now use it on all my sites.

If you’re familiar with WordPress and SEO then you’ve more than likely heard of Joost De Valk, a WordPress developer and SEO consultant from the Netherlands. Joost has worked with companies such as KLM, eBay and Salesforce. He is also currently ranked #1 for the keyword “WordPress SEO”, and I don’t mind sharing that I learn a lot from him and listen to what he says (and you should too).

That being said, while there are numerous WordPress SEO plugins to choose from, in my humble opinion, Yoast’s WordPress SEO Plugin is the best. While I’ll be doing a full review on the plugin in a future post I’ll share three reasons why I like it.

i.    It has a lot of options for me to customize how I want the Search Engine Spiders to crawl my site

ii.   It has an xml sitemap built into it. Thus removing the need for another plugin.

iii.  It has the option to remove the word “category” from my category URLs, which results in better SEO (who needs the word category in the URL? It is annoying in my opinion, and I’m glad that he included this feature)

I could go on about the plugin but I’ll save that for another post.

So I hope this blog post has been both helpful and useful to you.  If you have any questions or comments about duplicate content please leave a comment below.

About the Author
Matt Fraser is a graduate of www.sesssions.edu, an online design school based out of New York. He is fanatic about WordPress, web design and SEO. When not surfing the internet and reading about the aforementioned topics, you can find him walking his dog, a Border Collie Beagle named Daisy, or riding his bike in the river valley.
  • sagiv

    Very useful! thanks :)

  • Zimbrul

    I’d like to ask some questions regarding duplicate content that WordPress creates:

    1. Paginated home page. Each page has a different URL such as http://website.com/page/3/ or http://website.com/page/20/ but the title tag and description seem to stay the same. How does this affect Google’s understanding of the home page?

    2. Same problem with the paginated Category archives: /page/3/ or …page/20/ is added to the URL by WordPress but the description and title tag stay the same… how can you alter these?

  • http://www.saharaservice.com/?page_id=468 Computer & Laptop

    Well, I am not familiar with WordPress since I don’t use prebuilt CMSs, but I can tell you that you will need a server-side scripting language such as PHP, ASP, or JSP and a database such as MySQL, with knowledge of the SQL language as well.Thanks for sharing the information.

  • http://www.saharaservice.com/?page_id=468 Computer & Laptop

    I am trying to set up a children’s website using word press, and in one of the sections in my website i would like to store the students answers such as for example answers to a comprehension.Thanks for sharing the information.

  • http://www.saharaservice.com/?page_id=468 Computer & Laptop

    I really like this post, it is very nice blog post. I don’t want to use the 2 already provided and I alreadly downloaded a theme from word press. I need specific step-by-step instructions on how to change my skin.Thanks for sharing the information.

  • Marc Reece

    Hey Matt, I agree with you about Headway, love that theme. So in your opinion, how do you deal with duplicate content issues in Headway? In order to customize certain pages, sometimes I create a page only to query it from another. If I create a page only for the purpose of querying it from another, should I just noindex that content?

  • Marc

    One more point I forgot to mention you have a ton of spammy comments posted, you might want to consider manually approving them they are diluting your page value.

  • Guest

    Hello, I have a quick question for you. I was trying to install a new theme and discovered that the pages were replicating. The sample page continues to appear along with every page. So if create an about us page, the sample page will appear along with it. When I create Privacy Policy page, the sample page appears with it. I need some help, please. I am not sure what to do any more