What is canonicalisation?
“Canonicalisation is the process of picking the best URL when there are several choices…”
Quite often a web page will have several URLs that all load
the same page, for example:
http://www.cpbhaiseo.com
http://cpbhaiseo.com (notice: without the dub dub dub)
Both of these URLs load the same page, the homepage! Other versions of the URL can also load the same
page with an additional file name such as /index.php or even /home.php. In
addition, the owner of a website might have bought several domains (TLDs); for
example, I also own the .co.uk TLD: http://www.cpbhaiseo.co.uk. If this
additional domain is simply pointed at the website, it will again load the
same page. So potentially I could have 8 different URLs loading the homepage
for Verve Search.
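To make the idea concrete, here is a minimal Python sketch that maps the URL variants described above onto one canonical URL. The choice of www.cpbhaiseo.com as the preferred host, the list of alias hosts and the list of default file names are assumptions made for illustration, not settings taken from any real server.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: which host is "the" canonical one,
# which hosts are aliases of it, and which file names serve the same
# page as the bare directory.
CANONICAL_HOST = "www.cpbhaiseo.com"
ALIAS_HOSTS = {"cpbhaiseo.com", "www.cpbhaiseo.co.uk", "cpbhaiseo.co.uk"}
DEFAULT_DOCUMENTS = {"index.php", "home.php"}


def canonical_url(url: str) -> str:
    """Return the single preferred URL for any known variant of a page."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ALIAS_HOSTS:
        host = CANONICAL_HOST
    path = parts.path or "/"
    # Drop a trailing default document, e.g. /index.php becomes /
    directory, _, filename = path.rpartition("/")
    if filename in DEFAULT_DOCUMENTS:
        path = directory + "/"
    # The scheme is forced to http and any fragment is dropped in this sketch.
    return urlunsplit(("http", host, path, parts.query, ""))


# Four hosts times two paths gives eight variants of the homepage.
hosts = [CANONICAL_HOST, *sorted(ALIAS_HOSTS)]
variants = [f"http://{h}{p}" for h in hosts for p in ("", "/index.php")]
print(len(variants), {canonical_url(v) for v in variants})
```

All eight variants collapse to http://www.cpbhaiseo.com/, which is the URL you would want the search engines to index.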
What is the canonical issue?
A canonical issue arises when 301 redirects are not properly in place. Your website can then be reached by search engines through several different URLs, so they may index it under each of them, and the site ends up looking as though it is full of duplicated content.
What can be done to resolve the canonical issue?
The best and most effective way to resolve the canonical issue is with a permanent 301 redirect. This can be implemented in a number of ways, as detailed below. The method you use to implement the redirect depends on which server your website is hosted on.
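As a server-agnostic illustration of what a permanent redirect does, here is a minimal sketch written as a Python WSGI application. It is not the configuration for any particular web server; the canonical host name and the placeholder page body are assumptions for the example.

```python
# Assumption for illustration: the host we want every visitor to end up on.
CANONICAL_HOST = "www.cpbhaiseo.com"


def redirect_app(environ, start_response):
    """Answer with a 301 to the canonical host, or serve the page."""
    host = environ.get("HTTP_HOST", "").lower()
    path = environ.get("PATH_INFO", "") or "/"
    query = environ.get("QUERY_STRING", "")

    if host != CANONICAL_HOST:
        # Keep the path and query so deep links land on the same page.
        location = f"http://{CANONICAL_HOST}{path}"
        if query:
            location += "?" + query
        start_response(
            "301 Moved Permanently",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]

    body = b"homepage"  # placeholder for the real page content
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

A request for http://cpbhaiseo.com/index.php would receive a 301 with a Location header of http://www.cpbhaiseo.com/index.php, so both users and spiders end up on the canonical host.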
This is a problem for several reasons, fundamentally because
when a search engine spider visits your website it is likely to be faced with
several URLs for every page and no indication of which one is the main URL.
It would be even more complicated for the search engine spiders if, in addition to all these URLs, your website also used URL-based session IDs (a session ID dynamically generates a separate URL for each user in each session, including the spiders), for example http://www.cpbhaiseo.com/?PHPSESSID=123. Each page would then be likely to have hundreds, maybe even thousands, of separate URLs. The real problem comes when the spiders index one of these session ID URLs instead of your main URL. Yes, it will look rubbish, BUT more importantly this URL is unlikely to have any link authority, as it is a unique URL created just for the session in which the spider crawled the site. When loads of these URLs find their way into the search engine index, and you are trying to rank within a competitive market, this could be holding your site back significantly.
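To show how many session ID URLs collapse back into one page, here is a small Python sketch that strips session parameters from a URL. The list of parameter names is an assumption for illustration; a real site would need to list whatever names its own software uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameter names that carry a session ID (lower case).
SESSION_PARAMS = {"phpsessid", "sessionid", "sid"}


def strip_session_id(url: str) -> str:
    """Return the URL with any session ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_session_id("http://www.cpbhaiseo.com/?PHPSESSID=123"))
print(strip_session_id("http://www.cpbhaiseo.com/page?id=7&PHPSESSID=abc"))
```

The first call prints http://www.cpbhaiseo.com/ and the second prints http://www.cpbhaiseo.com/page?id=7, so every visitor's session URL maps back to the same address.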
Note: the reason some sites use session IDs is usually to be
able to do in-depth tracking of each session. For those of you who do this, I
would recommend using cookie-based sessions instead of URL-based session IDs.
Yes, cookie-based tracking might not be as accurate if users disable cookies,
but I believe it’s better in the long run, as session-based URLs could
potentially harm your SEO efforts and overcomplicate things.
How canonicalisation issues affect link authority!
In your mind http://www.yourdomain.com/ is usually your
main URL, but don’t assume this is obvious to users and search engines. If you
haven’t chosen a canonical URL (and implemented the appropriate redirects or
rel=canonical tags; don’t worry, the explanation will come), it is likely that some
links will go to one of the other URLs. For example, a user types my website
address directly into the browser but uses the .co.uk TLD, finds the page they wanted to
link to and links to it using the .co.uk. Another example could be a user
following an internal link that goes to /page/index.php while
your link builders are getting links to the main URL; now you have links going
to both URLs and the link authority is being diluted. Are you still
following me? Now imagine you also have session IDs on your site, and a user
visits your site, gets a session ID, bookmarks the page (with the session ID)
and then links to it from his/her blog. Now you have 3 different URLs to the same
page with links. Imagine how much more powerful the page would be if all of the
links went to one URL!
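The dilution can be put into numbers with a small Python sketch. The link counts below are invented for illustration, and a plain count of links is only a crude stand-in for link authority, which search engines calculate in ways this sketch does not attempt to model.

```python
from collections import Counter

MAIN_URL = "http://www.yourdomain.com/page/"

# Hypothetical numbers of inbound links pointing at each URL of one page.
inbound_links = {
    "http://www.yourdomain.com/page/": 12,
    "http://www.yourdomain.com/page/index.php": 5,
    "http://www.yourdomain.com/page/?PHPSESSID=123": 1,
}

# Assumption: every URL listed above is a duplicate of the main URL.
canonical_of = {url: MAIN_URL for url in inbound_links}


def consolidate(links, canonical_map):
    """Add up link counts per canonical URL."""
    totals = Counter()
    for url, count in links.items():
        totals[canonical_map.get(url, url)] += count
    return dict(totals)


print(max(inbound_links.values()))                        # strongest single URL today
print(consolidate(inbound_links, canonical_of)[MAIN_URL])  # after consolidation
```

With these made-up figures the strongest single URL has 12 links, while the consolidated page would have all 18.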
How to fix canonicalisation problems
There are now 2 different ways of fixing canonicalisation
issues on your site. Quite recently Google announced support for a new “canonical
tag” that lets you specify in the HTML header that the URL in question
should be treated as a “copy”, and that names the canonical URL that all link
authority and content metrics should flow back to.
Example:
Within the HTML header of the page loading on this URL http://www.vervesearch.com/index.php there would be a tag like this:
<link rel="canonical" href="http://www.vervesearch.com/" />
This would “tell” the search engines that they should index
the canonical URL specified in this tag and also pass any link authority from
the /index.php URL on to the canonical URL. The rel=canonical tag should be
implemented on every URL you have that is loading the same page (except
for the main canonical URL you want to use, of course).
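If you want to check which canonical URL a page declares, the tag can be read back out of the HTML. Here is a minimal sketch using Python's standard html.parser module; the class and function names are made up for this example, and it is an illustration of reading the tag, not of how any search engine processes it.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = '<html><head><link rel="canonical" href="http://www.vervesearch.com/" /></head></html>'
print(find_canonical(page))
```

For the example tag above this prints http://www.vervesearch.com/, and a page without the tag gives None.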
This tag is really easy to implement and can solve a lot of canonicalisation issues, BUT it has its limitations. For example, you can’t use it for your country-specific TLDs (which are essentially separate domains) or for other additional domains you might have bought. The tag also only “redirects” the engines’ attention to the correct URL: users will still be able to use all the different URLs, and within your analytics these are likely to come up as different pages.