There is no better way to create an infinite amount of duplicate content on your site than to force session IDs onto each visitor. Typically, session IDs are used for tracking a single visitor’s navigation path through the site, including adding or removing products from the shopping cart. They are great for tracking purposes, but really, really bad for search engines and inbound linking.
Ok, first of all, that’s a bad URL shown above, but aside from that, tacked on at the end there is the session ID. Both URLs pull up the same page, just opened via a different browsing session. The bad stuff happens if the session IDs also get attached when the search engines come for a visit.
Since a new session ID is attached with each new visit, each time the search engines come around they are essentially fed all new URLs. If you have only a ten-page site, the second time the search engines visit they add the “new” 10 pages to the index, for a total of 20 pages. When they come around a third time they now have 30 pages in their index. Once they start analyzing these pages they find page after page after page of duplication.
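To make the arithmetic concrete, here is a rough sketch of what that looks like from the search engine's side. Everything in it is made up for illustration: the page paths, the `sid` parameter name, and the idea that the index is a simple set of URLs. Real crawlers are far more complicated.

```python
from itertools import count

# Hypothetical ten-page site; the paths are placeholders.
PAGES = [f"/page{i}" for i in range(1, 11)]

# Stand-in for the server handing out a fresh session ID on every visit.
_next_session = count(1)

def crawl(index):
    """Simulate one crawler visit: every page URL carries a new session ID."""
    session_id = next(_next_session)
    for path in PAGES:
        # "sid" is an assumed parameter name, not taken from any real system.
        index.add(f"{path}?sid={session_id}")

index = set()          # the search engine's list of known URLs
sizes = []
for _ in range(3):     # three separate crawler visits
    crawl(index)
    sizes.append(len(index))

print(sizes)  # [10, 20, 30]
```

The site never grows past ten pages, yet the set of "known" URLs grows by ten with every visit, because each URL is unique once the session ID is attached.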
An additional problem arises as site visitors start bookmarking and linking to your site. Every link they add contains their very own session ID. The search engines follow that link to your site and now you’ve got another 10 pages of duplication. If they follow another link to your site, that’s 10 more. Starting to see where this is going? Essentially you can turn a 10-page site into endless duplications.
Even with a small site you can see why the search engines would stop coming around. But if you have a site with hundreds, or even thousands of products, you find two things happen. 1) The search engines will stop spidering new pages because there is just too much duplication. 2) The engines will start dropping pages out of the index altogether.
Now this is where my lack of programming skills shows. I know there are some systems that will withhold the session IDs from search engines. This still has the potential of creating problems with inbound links. I can’t say for sure how search engines handle incoming links with session IDs in the URLs, even if those IDs get stripped once the engine hits the site. I would think the link value will pass as if the ID isn’t there, but I don’t know.
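For the curious, stripping a session ID from a URL might look something like the sketch below. It is only an illustration and not how any particular system does it. The parameter names in `SESSION_PARAMS` and the function name `strip_session_id` are assumptions, and it only handles IDs passed in the query string, not ones embedded in the path.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; a real site would list its own.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}

def strip_session_id(url):
    """Return the URL with any assumed session ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    # Rebuild the URL with only the remaining parameters.
    return urlunsplit(parts._replace(query=urlencode(kept)))

clean = strip_session_id("http://www.example.com/product?id=7&sid=abc123")
print(clean)  # http://www.example.com/product?id=7
```

A site could use something along these lines to redirect any session-tagged URL to its clean version, so that two inbound links with different IDs end up at the same address. Whether the link value carries over is still up to the search engines.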
Like sex, the only guaranteed protection here is not to do it at all. There are alternate means of tracking users for whatever reason. Avoiding session IDs completely ensures that you don’t open yourself up to inadvertent site duplication.