This series is pulled from a presentation given at SMX East. Part I of this series covered the problems duplicate content creates. Part II covered some of the causes of duplicate content. This post covers some of the solutions that will help you fix your duplicate content problems.
Part I: Duplicate Content Causes Problems. Duh!
Part II: There is No Single Cause of Duplicate Content. Don’t collect them all!
Great! Now let’s move on.
Only You Can Prevent Duplicate Content
Finally! Now we can address some of the solutions to the problems duplicate content creates.
Not all duplicate content issues are easily fixable, and some may be outside of your control. But those that are within your control need to be addressed sooner rather than later. Or, you could just sit back and wait for Google to figure it all out. Don’t worry, it’s all good. Google’s got your back!
But, while you’re praying to Google for lavish blessings, I’ll be working with my clients to fix problems that are holding them back in the search results.
Search engine friendly links
If you’re going to link to a page, be consistent about it. We covered how the same page can be linked in several different ways. You can implement redirects and canonical tags (which I’ll cover below), but regardless of the other solutions you put in place, be sure to be consistent in how you link to all pages in your site.
If you want to use the “www.”, then use it on every link. If you want to link to the default page of a directory or sub-directory without using its file name, then do that consistently as well. Half of the problem with duplicate content is pages being linked inconsistently throughout the site. Fix your link structure first, then work on the rest of the solutions.
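As a rough illustration of what “consistent” can mean in practice, here is a minimal Python sketch that rewrites any internal URL into one agreed-upon form. The host names and the list of default file names are assumptions made up for this example, not something every site shares, so swap in your own.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: replace with your site's own values.
PREFERRED_HOST = "www.example.com"
BARE_HOST = "example.com"
DEFAULT_FILES = ("index.html", "index.htm", "index.php", "default.aspx")


def normalize_link(url):
    """Return the one consistent form of an internal URL."""
    scheme, host, path, query, fragment = urlsplit(url)
    host = host.lower()
    # Use the "www." form on every link.
    if host == BARE_HOST:
        host = PREFERRED_HOST
    # Link to the directory itself, not to its default file name.
    for name in DEFAULT_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    # An empty path and "/" are the same page; always write "/".
    if not path:
        path = "/"
    return urlunsplit((scheme, host, path, query, fragment))
```

Running every link in your templates through something like `normalize_link` before publishing is one way to make sure `http://example.com/shoes/index.html` and `http://www.example.com/shoes/` never both show up in your navigation.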
Secure shopping path
In Part II, I talked about the problems that happen when visitors move into the secure area of your site. Oftentimes these secure areas contain links back out, but maintain the secure “https” in the URL. This creates both a secure and non-secure version of the same page. A dupe. The solution here is two-fold.
First, don’t let the search engines enter into your shopping cart area. Secure or not, keep them out! There is nothing there for them to see. Second, once visitors are in the secure area, be sure that any links back out of the checkout area go to the non-secure site, not secure URLs of the same pages. It’s OK for visitors to move in and out of the secure area, but what you don’t want is them (or the search engines) accessing secure versions of pages that aren’t meant to be secure.
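One common way to keep well-behaved crawlers out of the cart is a robots.txt file. That mechanism is my assumption for this sketch, as are the `/cart/` and `/checkout/` paths and the example host. The snippet uses Python’s standard `urllib.robotparser` to confirm that a set of rules really does block the cart while leaving the rest of the site open.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the paths are assumptions for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /checkout/
"""


def crawlable(path, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if the given robots.txt rules let `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)
```

With these rules, `crawlable("/shoes/")` is true and `crawlable("/cart/view")` is false, which is the split you are after.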
Hard code all of your links out of your secure area to be sure they are not using the secure “https” in the URL. Problem solved.
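To find the links that still need hard coding, you could scan each page in the secure area for links that would land on an “https” URL outside of it. The sketch below is one way to do that with Python’s standard library. The secure path prefixes are hypothetical, and relative links are flagged on purpose, because on a secure page they inherit the “https”.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Hypothetical path prefixes that are supposed to be secure on this site.
SECURE_PREFIXES = ("/cart/", "/checkout/")


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def stray_secure_links(page_url, html):
    """List hrefs that resolve to https URLs outside the secure area."""
    collector = LinkCollector()
    collector.feed(html)
    stray = []
    for href in collector.hrefs:
        # Resolve relative links the way a browser on this page would.
        resolved = urlsplit(urljoin(page_url, href))
        outside = not resolved.path.startswith(SECURE_PREFIXES)
        if resolved.scheme == "https" and outside:
            stray.append(href)
    return stray
```

Anything this returns is a link that should be rewritten as a full, non-secure URL.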
The canonical tag (or attribute. Whatever.) is the ultimate band-aid solution for duplicate content. The search engines released this as a way to give them a “hint” about which page of all your duplicates is the one that is supposed to be the genuine URL.
This solution is only necessary if you can’t get your pages properly redirected, or duplicate URLs eliminated, via smart linking and content management implementation. It’s the ultimate “if I can’t do anything else” solution. And really, I wouldn’t worry about it unless you can’t implement any other type of fix.
The idea here is to put the tag in the head section of each duplicate page, pointing to the URL of the “proper” page. The search engines are supposed to treat it as if it were a redirect when assigning link and other values to the page.
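If your pages are built from templates, the tag can be generated rather than typed by hand. Here is a minimal sketch in Python. The function name and the example URL are mine, and how you get the “proper” URL for each page depends entirely on your own system.

```python
from html import escape


def canonical_tag(proper_url):
    """Build the canonical link element for the head of a duplicate page."""
    # Escape the URL so characters like & and " cannot break the markup.
    return '<link rel="canonical" href="{}">'.format(escape(proper_url, quote=True))
```

For example, `canonical_tag("http://www.example.com/shoes/")` produces `<link rel="canonical" href="http://www.example.com/shoes/">`, which goes in the head of every duplicate of that page.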
Link only to the canonical page
If you can’t eliminate your duplicate pages and must use the canonical tag, I would also do my best to link only to the canonical version of each page. I wouldn’t rely on the search engines to transfer all your link values from the incorrect URL to the correct one. Maybe they will, maybe they won’t. But if you make sure your internal links point only to the canonical page, you’ve accounted for half the problem.
The other half will be external links, which redirects (see below) will handle. Linking to the canonical page ensures that all internal linking value will be passed to the proper page without relying on the search engines to get the “hint”. “Don’t make them [the search engines] think” is still the best play.
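A quick way to check your own internal links is to compare them against a list of known duplicates. This sketch assumes you already have two things that the post doesn’t give you: a list of the links found on your site, and a mapping from each duplicate URL to its canonical version. Both are hypothetical inputs here.

```python
def links_to_fix(internal_links, canonical_for):
    """Return (found, canonical) pairs for links that miss the canonical URL.

    `canonical_for` maps each known duplicate URL to its canonical URL.
    """
    fixes = []
    for link in internal_links:
        canonical = canonical_for.get(link)
        if canonical is not None and canonical != link:
            fixes.append((link, canonical))
    return fixes
```

Every pair it returns is an internal link you can correct yourself, with no “hint” required.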
The absolute best solution for maintaining link value to the pages that are supposed to receive it is the redirect. Whether you have deleted or moved old pages, or have duplicates with a single canonical page, using the 301 redirect (along with linking to the correct page) is the best solution available.
This doesn’t require any thinking on behalf of the search engine or the visitor, and you never have to worry about what URLs are being used in links to your site, because only the correct URL is being served. This is the Big Kahuna (along with linking to the correct page) of duplicate content and bad URL solutions.
If you don’t know how to implement redirects, talk to your developers. They should know the best solution for you, but be sure they implement a 301 redirect, and nothing less.
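So you know what to ask for, here is a bare-bones sketch of a 301 redirect written as a Python WSGI application. It is an illustration under assumptions, not a recommendation for your stack: the redirect map and its paths are invented, and most sites would set this up in the web server or CMS configuration instead.

```python
# Hypothetical map of old or duplicate paths to the one correct path.
REDIRECTS = {
    "/index.html": "/",
    "/shoes/index.html": "/shoes/",
    "/old-page.html": "/new-page.html",
}


def redirect_for(path):
    """Return (status, headers) for a mapped path, or None if none applies."""
    target = REDIRECTS.get(path)
    if target is None:
        return None
    # 301 means "moved permanently", as opposed to a temporary 302.
    return "301 Moved Permanently", [("Location", target)]


def app(environ, start_response):
    """Minimal WSGI app: 301 for mapped paths, a plain 200 otherwise."""
    result = redirect_for(environ.get("PATH_INFO", "/"))
    if result is not None:
        status, headers = result
        start_response(status, headers + [("Content-Length", "0")])
        return [b""]
    body = b"OK"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The part to verify with your developers is the status line: the response for the old URL should say 301, with a Location header pointing at the correct page.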
Duplicate content can be problematic, but implementing these solutions will do wonders for eliminating the problems and reducing the amount of online clutter your site may be producing. Once the duplicates are eliminated, your site should perform significantly better in the search engines, which is the goal we should all be shooting for.