How to prevent duplicate content within this complex website?
-
I have a complex SEO issue I've been wrestling with and I'd appreciate your views on this very much. I have a sports website and most visitors are looking for the games that are played in the current week (I've studied this - it's true). We're creating a new website from scratch and I want to do this as well as possible, in the most elegant way we can. We do not want to use work-arounds such as iframes, hiding text using AJAX etc. We need a solid solution for both users and search engines.
Therefore I have written down three options:
- Using a canonical URL;
- Using 301-redirects;
- Using 302-redirects.
Introduction
The page 'website.com/competition/season/week-8' shows the soccer games that are played in game week 8 of the season. The next week users are interested in the games that are played in that week (game week 9). So the content a visitor is interested in, is constantly shifting because of the way competitions and tournaments are organized. After a season the same goes for the season of course.
The website we're building has the following structure:
- Competition (e.g. 'premier league')
  - Season (e.g. '2011-2012')
    - Playweek (e.g. 'week 8')
      - Game (e.g. 'Manchester United - Arsenal')
This is the most logical structure one can think of. This is what users expect.
Now we're facing the following challenge: when a user goes to http://website.com/premier-league he expects to see a) the games that are played in the current week and b) the current standings. When someone goes to http://website.com/premier-league/2011-2012/ he expects to see the same: the games that are played in the current week and the current standings. When someone goes to http://website.com/premier-league/2011-2012/week-8/ he expects to see the same: the games that are played in the current week and the current standings.
So essentially, for every active season of every competition, there are three places on the website where logically the same information has to be shown.
To deal with this from a UX and SEO perspective, we have the following options:
Option A - Use a canonical URL
Using a canonical URL could solve this problem. You could use a canonical URL from the current week page and the Season page to the competition page:
So:
- the page on 'website.com/$competition/$season/playweek-8' would have a canonical tag that points to 'website.com/$competition/'
- the page on 'website.com/$competition/$season/' would have a canonical tag that points to 'website.com/$competition/'
The next week however, you want to have the canonical tag on 'website.com/$competition/$season/playweek-9' and the canonical tag from 'website.com/$competition/$season/playweek-8' should be removed.
So then you have:
- the page on 'website.com/$competition/$season/playweek-9' would have a canonical tag that points to 'website.com/$competition/'
- the page on 'website.com/$competition/$season/' would still have a canonical tag that points to 'website.com/$competition/'
In essence the canonical tag is constantly traveling through the pages.
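To make the traveling tag concrete, here is a minimal Python sketch of the rule described above. The function name `canonical_tag`, the `BASE` constant and the convention that `week=None` stands for the season page are assumptions made for illustration; they are not part of the actual site.

```python
from typing import Optional

BASE = "http://website.com"  # placeholder domain used in this post


def canonical_tag(competition: str, week: Optional[int], current_week: int) -> Optional[str]:
    """Return the canonical <link> tag for a season or playweek page, or None.

    week=None means the season page; an integer means that playweek's page.
    """
    is_season_page = week is None
    is_current_week = week == current_week
    if is_season_page or is_current_week:
        # Season page and current playweek both point at the competition page.
        return f'<link rel="canonical" href="{BASE}/{competition}/">'
    # Past and future playweeks carry no cross-page canonical tag.
    return None
```

With `current_week=8` the tag appears on the playweek 8 page; once `current_week` becomes 9 the same call for week 8 returns `None`, which is exactly the weekly change I'm worried about.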
Advantages:
- UX: for a user this is a very neat solution. Wherever a user goes, he sees the information he expects. So that's all good.
- SEO: the search engines get very clear guidelines as to how the website functions and we prevent duplicate content.
Disadvantages:
- I have some concerns regarding the weekly changing canonical tag from an SEO perspective. Every week, within every competition, the canonical tags are updated. How often do Search Engines update their index for canonical tags? I mean, say it takes a Search Engine a week to visit a page, crawl a page and process a canonical tag correctly, then the Search Engines will be a week behind on figuring out the actual structure of the hierarchy. On top of that: what do the changing canonical URLs do to the 'quality' of the website? In theory this should all work, but I have some reservations about it.
- If there is a canonical tag from 'website.com/$competition/$season/week-8', what does this do to the indexation and ranking of its subpages (the actual match pages)?
Option B - Using 301-redirects
With 301-redirects, the user and the Search Engine are essentially treated the same. When the season page or the competition page is requested, both are redirected to the game week page.
The same applies here as applies for the canonical URL: every week there are changes in the redirects.
So in game week 8:
- the page on 'website.com/$competition/' would have a 301-redirect that points to 'website.com/$competition/$season/week-8'
- the page on 'website.com/$competition/$season' would have a 301-redirect that points to 'website.com/$competition/$season/week-8'
A week goes by, so then you have:
- the page on 'website.com/$competition/' would have a 301-redirect that points to 'website.com/$competition/$season/week-9'
- the page on 'website.com/$competition/$season' would have a 301-redirect that points to 'website.com/$competition/$season/week-9'
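The weekly change can be sketched as a small redirect table that is regenerated whenever the game week rolls over. This is only an illustration in Python: `weekly_redirects` is a made-up helper name, and how the table would be fed into the web server (and served with status 301) is left open.

```python
from typing import Dict


def weekly_redirects(competition: str, season: str, current_week: int) -> Dict[str, str]:
    """Map each redirecting path to the current game week page."""
    target = f"/{competition}/{season}/week-{current_week}"
    return {
        f"/{competition}/": target,          # competition page
        f"/{competition}/{season}": target,  # season page
    }


# Rolling from game week 8 to 9 changes the target of every entry.
week_8 = weekly_redirects("premier-league", "2011-2012", 8)
week_9 = weekly_redirects("premier-league", "2011-2012", 9)
changed = [path for path in week_8 if week_8[path] != week_9[path]]
```

Here `changed` contains both paths, so every competition contributes two redirects that flip each week.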
Advantages
- There is no loss of link authority.
Disadvantages
- Before a playweek starts the playweek in question can be indexed. However, in the current playweek the playweek page 301-redirects to the competition page. After that week the page's 301-redirect is removed again and it's indexable.
- What do all the (changing) 301-redirects do to the overall quality of the website for Search Engines (and users)?
Option C - Using 302-redirects
Most SEOs will refrain from using 302-redirects. However, a 302-redirect can be put to good use: for serving a temporary redirect.
Within my website, the content that's most important to users (and therefore search engines) is constantly moving. In most cases, after a week a different piece of the website is most interesting for a user. So let's take our example above. We're in playweek 8.
If you want 'website.com/$competition/' to redirect to 'website.com/$competition/$season/week-8/', you can use a 302-redirect, because the redirect is temporary.
The next week the 302-redirect on 'website.com/$competition/' will be adjusted. It'll be pointing to 'website.com/$competition/$season/week-9'.
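For illustration, this is roughly what the response for 'website.com/$competition/' would look like on the wire. It is a hedged Python sketch: `temporary_redirect_head` is a hypothetical helper, and a real site would let its web server or framework build the response instead of formatting it by hand.

```python
from http import HTTPStatus


def temporary_redirect_head(competition: str, season: str, current_week: int) -> str:
    """Build the status line and Location header of a 302 response."""
    status = HTTPStatus.FOUND  # 302: the target is expected to change again
    location = f"/{competition}/{season}/week-{current_week}/"
    return (
        f"HTTP/1.1 {status.value} {status.phrase}\r\n"
        f"Location: {location}\r\n"
        "\r\n"
    )


head = temporary_redirect_head("premier-league", "2011-2012", 8)
```

Only the `Location` value changes from week to week; the status code stays 302 to signal that the redirecting URL itself is the permanent one.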
Advantages
- We're putting the 302-redirect to its actual use.
- The pages that 302-redirect (for instance 'website.com/$competition' and 'website.com/$competition/$season') will remain indexed.
Disadvantages
- Not quite sure how Google will handle this, they're not very clear on how they exactly handle a 302-redirect and in which cases a 302-redirect might be useful. In most cases they advise webmasters not to use it.
I'd very much like your opinion on this. Thanks in advance guys and gals!
-
Hi Andy and Peter, thanks for your response.
@Andy: the rel=next and rel=prev markup won't really help in solving the problem we had. We will use it though because it's very helpful.
@Peter: yeah it's been something we've been struggling with for a while but we've finally made a decision on it.
The /current solution wasn't really a good solution because at the start of a season all the gameweeks are planned and created, so it would become quite complex. We've done some calculations on how much duplicate content we would have if we did not use any of the redirects or canonical tags, and the percentage of DC is very small (below 1%), so we're going to put our faith in Google's hands and let them figure it out. It's a good quality website with loads of links we're talking about, so I don't expect too many issues. We'll monitor it closely though and stand by to intervene when needed.
Anyways, thanks for your suggestions. Although it didn't solve my problem 1:1 it did make me think and make a decision.
Bye, Steven
-
Yeah, time-sensitive information is always tough. I think you're dead on about the disadvantages - the timing of Google's application of these rotating tags would always be off, and you could end up with some really weird search results that are not only bad for SEO but could create bad UX (people landing on old pages thinking they're new).
What about another option - could you take more of a news/blog approach and have a "/current" page that is always the current week? As the current week changes, roll that content into an archive page ("/week8", etc.). That way, the content lives on, but the current URL never changes.
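As a rough sketch of how "/current" could pick its content, assuming each playweek has a known date range: the schedule below is entirely made up, and `resolve_current` is just an illustrative name, written in Python.

```python
from datetime import date
from typing import Dict, Tuple

# Hypothetical schedule: playweek number -> (first day, last day).
SCHEDULE: Dict[int, Tuple[date, date]] = {
    8: (date(2011, 10, 15), date(2011, 10, 21)),
    9: (date(2011, 10, 22), date(2011, 10, 28)),
}


def resolve_current(today: date, schedule: Dict[int, Tuple[date, date]]) -> int:
    """Return the playweek whose content '/current' should show today."""
    for week, (start, end) in sorted(schedule.items()):
        if start <= today <= end:
            return week
    raise LookupError("no playweek covers this date")
```

The "/current" URL stays the same all season; only the playweek returned here changes, and the older weeks keep their own archive URLs.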
In terms of duplication, is this really full duplication? It sounds like some pages (like the season) just have snippets of the current week. That's not necessarily a problem. If they are very similar, could you "widgetize" it somehow? Could be straight HTML, but use a condensed format for the season page that links to the full version on the current week page. This would be much like a snippet of a blog post - instead of repeating everything on all 3 pages, have one main chunk of content and two summaries.
-
Hi,
Does the rel=next, rel=prev markup help you out with this problem? See http://googlewebmastercentral.blogspot.co.uk/2011/09/pagination-with-relnext-and-relprev.html
I've used it a couple of times to help stop pages being seen as dupe content where those pages are duplicates (meta, main content, images etc) except for, for example, reviews or comments, e.g. /product_x /product_x_review_page1 /product_x_review_page2 /product_x_review_page_3
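A small Python sketch of the tags for a paginated series like the review pages above. The helper name `pagination_links` and the assumption that page URLs are a fixed prefix plus the page number are mine, purely for illustration.

```python
from typing import List


def pagination_links(url_prefix: str, page: int, last_page: int) -> List[str]:
    """Return the rel=prev / rel=next <link> tags for one page of a series."""
    links = []
    if page > 1:
        # Every page except the first points back to its predecessor.
        links.append(f'<link rel="prev" href="{url_prefix}{page - 1}">')
    if page < last_page:
        # Every page except the last points forward to its successor.
        links.append(f'<link rel="next" href="{url_prefix}{page + 1}">')
    return links


middle = pagination_links("/product_x_review_page", 2, 3)
```

The first page of a series gets only a `next` tag and the last page only a `prev` tag.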
Andy