How to Use the Canonical Tag to Avoid Duplicate Content
By: Helen M. Overland, February 16, 2009, 3:05am
Canonicalization issues on your website can be a frustrating barrier to ranking well in search engines. For years, SEOs have relied on 301 redirects and strict internal linking conventions to avoid them. But as any experienced SEO knows, canonicalization can be an ongoing issue for many clients' websites, depending on how many people update the website and which CMS they use.
A few days ago, Google, Yahoo and Microsoft all announced that their engines will support the "canonical" tag to help webmasters declare which of their pages is the "canonical" or "official" version of the page.
For those of you who are new to this, let's get up to speed on what canonical URLs are, and then explore what you can do to fix your website and avoid duplicate content issues.
What is Duplicate Content and Why is it a Problem?
When search engines visit your website, they index the content along with your URL. However, if the same page is available at multiple URLs, or if other factors make two pages appear very similar, then one or more of those pages may be flagged as "duplicate content".
For example, your home page could be accessed at the following URLs:
- http://www.example.com/
- http://example.com
- http://www.example.com/index.php
- http://example.com/index.php?disp=home
To the search engines, these could appear to be different pages, despite the fact that they all simply show your home page. Therefore, the search engine has to "choose" which version of the page is the most authoritative, and then filter out the others so that it doesn't show multiple duplicate pages in the search results. In addition, if the search engine has allotted your site 25 pages per visit to spider, and it spends 21 of them on pages that are just copies of a few others, some of your real pages will not get spidered.
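To make the problem concrete, here is a minimal Python sketch of the kind of normalization a search engine (or your own site) would need in order to see that the four home page URLs above are one page. The rules are assumptions made for this illustration only: that www.example.com is the preferred host, that index.php is the default document, and that disp=home merely selects the home page.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumptions for this sketch: the preferred host is www.example.com,
# index.php is the default document, and disp=home only selects the home page.
PREFERRED_HOST = "www.example.com"
DEFAULT_DOCUMENTS = {"/index.php"}
REDUNDANT_PARAMS = {("disp", "home")}


def canonicalize(url: str) -> str:
    """Map known variants of a URL onto one preferred form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "example.com":
        host = PREFERRED_HOST  # add the missing "www."
    path = parts.path or "/"  # treat an empty path as the root
    if path in DEFAULT_DOCUMENTS:
        path = "/"
    # Drop query parameters that do not change what the page shows.
    kept = [pair for pair in parse_qsl(parts.query) if pair not in REDUNDANT_PARAMS]
    return urlunsplit((parts.scheme, host, path, urlencode(kept), ""))


variants = [
    "http://www.example.com/",
    "http://example.com",
    "http://www.example.com/index.php",
    "http://example.com/index.php?disp=home",
]
# All four variants collapse to a single canonical form.
canonical_forms = {canonicalize(u) for u in variants}
```

A search engine cannot know these rules for your site in advance, which is why it may treat each variant as a separate page.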
Even making sure that your website is only accessible at all-lowercase URLs on www.example.com does not guarantee that you will avoid canonicalization issues. For example, if you run an e-commerce website with an affiliate program, you may find your site being accessed at the following URLs:
- http://www.example.com/products/shoes/white-runner
- http://www.example.com/products/shoes/white-runner?affiliate=123
If the search engine spiders both of these pages, either one may be chosen as the authoritative page, and it may not be the one you want.
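One way to work out the URL you want treated as authoritative is to strip the tracking parameter before deciding on the canonical form. The Python sketch below assumes that "affiliate" is the only tracking parameter and that it never changes what the page displays; your own parameter names will differ.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: "affiliate" only tracks referrals and never changes the content.
TRACKING_PARAMS = {"affiliate"}


def strip_tracking(url: str) -> str:
    """Return the URL with tracking-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


clean = strip_tracking(
    "http://www.example.com/products/shoes/white-runner?affiliate=123"
)
```

Parameters that do change the content, such as a size or colour filter, are left in place, so only the affiliate variants fold back into the main product URL.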
Finally, there is always the issue of pages where people can append whatever they like to the URL. If the URL is indexed, this can be a problem. Consider the following somewhat unfortunate URL for change.gov seen in Google:
In order to avoid these issues, SEOs have relied on comprehensive and sometimes complicated 301 redirect schemes.
You can read more about what canonicalization issues are and how they can affect your website on Matt Cutts' blog and on SEOBook.
What is the "Canonical" Tag?
The "Canonical" tag is a way for you to declare what the "official" URL of a page is. It means that you (apparently) no longer need to create 10,000-line .htaccess files full of complicated 301 redirects to control your large website.
For each page on your site, you simply add the following tag to the <head> area of the page, using of course the correct URL for that specific page:
<link rel="canonical" href="http://www.example.com/products/" />
This tells the search engines that, regardless of what URL parameters they find, the expressly declared URL is the one that should be indexed.
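If your pages are generated by a script, the tag can be generated too. Here is a small Python sketch; the function name canonical_link_tag is made up for this example, and the only real work it does is escaping the URL so that characters such as & or a stray quote cannot break the attribute.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Render the canonical <link> element for a page's <head>."""
    # quote=True also escapes double quotes so the href attribute stays well-formed.
    return '<link rel="canonical" href="{}" />'.format(
        escape(canonical_url, quote=True)
    )


tag = canonical_link_tag("http://www.example.com/products/")
```

Calling it with the products URL reproduces the tag shown above.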
What are the Limitations of the Canonical Tag?
There are a few limitations to the Canonical tag that you should be aware of before adding it to your website. As announced, it does not work across domains: you cannot declare the canonical version of a page to be on a different domain, so you couldn't post an article on another website and declare that the canonical version resides on your own website. You also can't declare a page to be the canonical version of a page with different content. For example, you can't write a number of articles on your website and then concentrate the "link juice" on your home page, unless of course the content is very similar.
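If you generate canonical URLs automatically, a quick sanity check can catch the first limitation before the tag goes live. The Python sketch below is a deliberately simple assumption of what "same domain" means: it compares hostnames and ignores only a leading "www.". It does not know about other subdomains, and it says nothing about whether the two pages have similar content.

```python
from urllib.parse import urlsplit


def _site(url: str) -> str:
    """Return the lowercase hostname without a leading "www."."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def same_site(page_url: str, canonical_url: str) -> bool:
    """True when both URLs are on the same host, ignoring "www."."""
    return _site(page_url) == _site(canonical_url)


ok = same_site("http://example.com/products/", "http://www.example.com/products/")
cross_domain = same_site(
    "http://www.another-site.com/article", "http://www.example.com/article"
)
```

Here ok is True and cross_domain is False, so the second pairing is one you would not want to emit as a canonical tag.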
How to Add the New Canonical Tag to Your Website
If you are using WordPress, Magento, or Drupal, there are already some modules available or in development from Yoast.
If you are using a different CMS, you may be able to make a small database call to determine the canonical version of each URL, and then add the tag automatically in your website's header template.
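As a rough illustration of that database call, here is a Python sketch using an in-memory SQLite database. The pages table, its columns, and the head_canonical function are all hypothetical; your CMS will have its own schema and templating, so treat this as the shape of the idea and not a drop-in implementation.

```python
import sqlite3
from html import escape

# Hypothetical schema: one row per page, holding its preferred URL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (id INTEGER PRIMARY KEY, canonical_url TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO pages (id, canonical_url) VALUES (?, ?)",
    (1, "http://www.example.com/products/"),
)


def head_canonical(db: sqlite3.Connection, page_id: int) -> str:
    """Return the canonical <link> tag for a page, or "" if none is stored."""
    row = db.execute(
        "SELECT canonical_url FROM pages WHERE id = ?", (page_id,)
    ).fetchone()
    if row is None:
        return ""  # no canonical URL recorded, so emit nothing
    return '<link rel="canonical" href="{}" />'.format(escape(row[0], quote=True))


header_snippet = head_canonical(conn, 1)
```

The header template would then print header_snippet inside the <head> of every page, whichever URL variant the visitor or spider happened to request.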
Further Coverage
- Matt Cutts' Blog - Plenty of helpful information here
- Canonical URL Tag - The Most Important Advancement in SEO Practices Since Sitemaps - SEOMoz
- Google, Yahoo & Microsoft Unite On “Canonical Tag” To Reduce Duplicate Content Clutter - Search Engine Land
©2009 Helen M. Overland. All Rights Reserved.
Want to republish this article on your site?
This article may be republished on your personal, non-profit, or commercial website, blog or E-Zine free of charge provided there is an active link to www.MsSEM.com and that this copyright notice is included. The article must be publicly and freely available - without a charge for the content.
Copyright 2006 Helen M. Overland, All Rights Reserved
www.MsSEM.com