Tips, tricks, and pokes, just WebTrends Analytics

Canonical URLs – Why You Should Care

Until Google recently published their Canonical URL link-tag standard, we hadn’t seen the word “canonical” in actual written use since grad school.  It was very exciting to see it again after all this time.  You know how much we enjoy jargon …  We were even more thrilled when we saw that Google uses “canonical” as a noun as well!   As always, Google is technically correct!  Whew!

Anyway, one meaning of the word canonical is “the simplest form.”  In other words, a “canonical” mathematical model is the model with the fewest possible rules and variables, out of all possible mathematical models for a thing. 

We love the irony of using an obscure multisyllabic word that means “simple.”

They might have called it the “Standard for Preventing URL Variations (for a Given Page) from Being Indexed.”  The SFPUVFAGPFBI.

You should care about Canonical URLs when you have a page on your site that is reached through multiple URLs due to tracking parameters.

Example: 

Your page http://www.yoursite.com/promo.asp can be reached through the plain URL.

But:

On your home page you have a promotions panel that points to that page.  And of course you want WebTrends to correctly report on the number of /promo.asp visits that came from that home page promo without resorting to path analysis.  So you’ve hard-coded that link on the home page to point to the marked URL
/promo.asp?WT.ac=fromhomepage

As a result, that promo page can be reached through two different URLs, one with and one without WT.ac in it.

Or:

There are affiliates or friendly sites with links to your promo.asp page, and they have cooperatively helped your tracking by hard-coding their links as
http://www.yoursite.com/promo.asp?WT.mc_id=friendlysitename&WT.mc_ev=affiliate,  
or
http://www.yoursite.com/promo.asp?source=friendlysitename

Because of these marker parameters, your WebTrends reports will give accurate numbers.  But on the other hand you’ve created a search engine problem:  Yahoo, Google, MSN, Ask and other spiders that follow those links will end up with, in their indexes, two or more URLs for the same page:  a plain one, and one with the extra parameter WT.ac or WT.mc_id or whatever.

At best, this will water down the ranks or PageRank for these pages.  At worst, the search engine will assume it’s seeing spamming, i.e. duplicate content on multiple pages.
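
To make the problem concrete, here is a minimal Python sketch that collapses the example URLs above back to the page they all represent.  The list of tracking parameters and the function name strip_tracking are our own assumptions for illustration, not anything WebTrends or the search engines provide; adjust the list to match your own tagging scheme.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the only tracking-only parameters in use on the site.
TRACKING_PARAMS = {"wt.ac", "wt.mc_id", "wt.mc_ev", "source"}


def strip_tracking(url: str) -> str:
    """Return the URL with tracking-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    # Rebuild the URL without the tracking parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# The URL variations from the examples above.
variants = [
    "http://www.yoursite.com/promo.asp",
    "http://www.yoursite.com/promo.asp?WT.ac=fromhomepage",
    "http://www.yoursite.com/promo.asp?WT.mc_id=friendlysitename&WT.mc_ev=affiliate",
    "http://www.yoursite.com/promo.asp?source=friendlysitename",
]

unique_pages = {strip_tracking(u) for u in variants}
print(len(variants), "crawlable URLs,", len(unique_pages), "actual page")
# prints: 4 crawlable URLs, 1 actual page
```

A spider has no such list to work from, so without further help it may treat each of those four URLs as a separate page.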

All of this can be avoided by adding one line of code to your page’s <head> section.  This bit of code was announced by Google back in February 2009 and has since been adopted by other search engines such as MSN-LiveSearch, Yahoo and Ask.

The code is aimed at the search engine spiders.  It tells them the one and only way that you want the page to appear in their indexes.

<link rel="canonical" href="http://www.yoursite.com/subdirectory/promo.asp" />

Remember, it goes inside the  <head> section of the page.

It doesn’t affect what WebTrends sees at all.  It’s only used by search engine spiders.

Think about how many of your pages might be affected by this problem, considering your banners, pay-per-click, affiliates, and on-site advertising.  If you have a lot of them, like we do, you might want to program your content management system to automatically put the canonical link-tag into the header of every page.
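
As a rough idea of what that automation might look like, here is a minimal Python sketch of a helper a template could call for every page.  The function name canonical_link_tag and the canonical_host default are hypothetical, and the sketch assumes that no page on the site depends on its query string for content, so it drops the query string entirely.  If some of your pages do (say, a product ID parameter), you would need to keep those parameters.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(requested_url: str, canonical_host: str = "www.yoursite.com") -> str:
    """Build the canonical link-tag for the page behind requested_url.

    Assumptions: one preferred host name for the whole site, and the whole
    query string is tracking-only, so it can be discarded.
    """
    parts = urlsplit(requested_url)
    canonical = urlunsplit((parts.scheme or "http", canonical_host, parts.path, "", ""))
    # Escape the URL so it is safe inside an HTML attribute.
    return '<link rel="canonical" href="%s" />' % escape(canonical, quote=True)


print(canonical_link_tag("http://yoursite.com/promo.asp?WT.ac=fromhomepage"))
# prints: <link rel="canonical" href="http://www.yoursite.com/promo.asp" />
```

Forcing a single host name also folds the yoursite.com and www.yoursite.com variations into one, which is a side benefit rather than something the tracking problem strictly requires.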


    8 comments

    1 Bryan Cristina { 04.21.09 at 9:47 am }

    Loved the concept when I first heard of it, however, people around here are still paranoid about it affecting SEO efforts. For shame.

    Someday they’ll buy into it, and on that day I can track things a bit better without people being so scared. :)

    2 rocky { 04.21.09 at 10:10 am }

    Google is careful to say it is not a “directive” but instead is “a hint that we honor strongly.” Which leaves room for doubt for those who want to doubt.

    Here’s a link to a Google blog with some more information:

    http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html

    3 JD { 05.01.09 at 1:26 pm }

    You, sir, are no geek at all if you don’t have an easy familiarity with the word canonical.

    4 rocky { 05.02.09 at 6:09 am }

    Truer words were never spoken. I am flattered whenever anybody accidentally calls me a geek, but I am a mere social systems researcher by training, and “canonical” until now has only been, to me, a type of correlation analysis.

    5 Mike { 06.17.09 at 11:24 am }

    How would the parameter affect the path analysis? Basically, if you had two identical links on the same page, would they come up separately in the path analysis?

    6 Nestor { 06.18.09 at 9:52 am }

    I found your site yesterday and it has been very informative. Mad props!

    In terms of this post, a possible alternative that avoids hurting your SEO rankings is to pass your WT parameters via the hash tag instead of adding them to the regular query parameter list. For example, “index.htm#WT.mc_id=cid” instead of “index.htm?WT.mc_id=cid”. This can be done by adding a small snippet of code to your WT base tag that parses the hash tag parameters and passes them to the page tag image.

    7 rocky { 06.23.09 at 6:53 am }

    That is a really interesting idea because to the best of my knowledge the hash tag stuff is ignored when a page is indexed.

    At the same time, that trick cannot be used in URLs used in paid search, at least not for Google AdWords. The redirect in Google AdWords will strip out the hash tag portion during the redirect and the visit will not be identified as a paid search visit.

    8 rocky { 06.23.09 at 7:00 am }

    Mike – any parameter will affect path analysis only if you’ve allowed the parameter to be displayed in reports. If you’ve specified that the parameter should be suppressed in reports, path analysis will ignore it (the differentiating parameter). The suppress/display of specific parameters is controlled by WebTrends’ “URL Rebuilding” function which is found in the part of the config menu called “Administration.”
