How to prevent duplicate content within this complex website?
-
I have a complex SEO issue I've been wrestling with and I'd appreciate your views on this very much. I have a sports website and most visitors are looking for the games that are played in the current week (I've studied this - it's true). We're creating a new website from scratch and I want to do this as well as possible. We want the most elegant solution, not work-arounds such as iframes or hiding text using AJAX. We need a solid solution for both users and search engines.
Therefore I have written down three options:
- Using a canonical URL;
- Using 301-redirects;
- Using 302-redirects.
Introduction
The page 'website.com/competition/season/week-8' shows the soccer games that are played in game week 8 of the season. The next week, users are interested in the games that are played in that week (game week 9). So the content a visitor is interested in is constantly shifting because of the way competitions and tournaments are organized. The same applies at season level: once a season ends, interest moves to the next season.
The website we're building has the following structure:
- Competition (e.g. 'premier league')
- Season (e.g. '2011-2012')
- Playweek (e.g. 'week 8')
- Game (e.g. 'Manchester United - Arsenal')
- Playweek (e.g. 'week 8')
- Season (e.g. '2011-2012')
This is the most logical structure one can think of. This is what users expect.
Now we're facing the following challenge: when a user goes to http://website.com/premier-league he expects to see a) the games that are played in the current week and b) the current standings. When someone goes to http://website.com/premier-league/2011-2012/ he expects to see the same: the games that are played in the current week and the current standings. When someone goes to http://website.com/premier-league/2011-2012/week-8/ he expects to see the same: the games that are played in the current week and the current standings.
So essentially there are three places on the website, for every active season within a competition, where logically the same information has to be shown.
To deal with this from a UX and SEO perspective, we have the following options:
Option A - Use a canonical URL
Using a canonical URL could solve this problem. You could use a canonical URL from the current week page and the Season page to the competition page:
So:
- the page on 'website.com/$competition/$season/playweek-8' would have a canonical tag that points to 'website.com/$competition/'
- the page on 'website.com/$competition/$season/' would have a canonical tag that points to 'website.com/$competition/'
The next week however, you want to have the canonical tag on 'website.com/$competition/$season/playweek-9' and the canonical tag from 'website.com/$competition/$season/playweek-8' should be removed.
So then you have:
- the page on 'website.com/$competition/$season/playweek-9' would have a canonical tag that points to 'website.com/$competition/'
- the page on 'website.com/$competition/$season/' would still have a canonical tag that points to 'website.com/$competition/'
In essence the canonical tag is constantly traveling through the pages.
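To make the "traveling" canonical tag concrete, here is a minimal sketch of how a page could decide which canonical URL to declare. The function names, the `BASE` constant and the idea of passing in the current season and week are my own assumptions for illustration, not part of any existing system.

```python
from typing import Optional

BASE = "http://website.com"  # placeholder domain taken from the examples above


def canonical_for(competition: str, season: str, week: Optional[int],
                  current_season: str, current_week: int) -> Optional[str]:
    """Return the canonical URL a page should point to, or None if it needs none.

    week is None for the season page itself. Only the season page and the
    current playweek page of the active season point to the competition page.
    """
    if season != current_season:
        return None  # archived seasons keep their own URLs
    if week is None or week == current_week:
        return f"{BASE}/{competition}/"
    return None  # other playweeks are not duplicates of the competition page


def canonical_tag(url: Optional[str]) -> str:
    """Render the <link> element, or an empty string when nothing is declared."""
    return f'<link rel="canonical" href="{url}" />' if url else ""
```

In week 8 the week-8 page gets the tag; once `current_week` becomes 9 the same call for week 8 returns None, so the tag disappears there and shows up on week 9 without any manual editing.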
Advantages:
- UX: for a user this is a very neat solution. Wherever a user goes, he sees the information he expects. So that's all good.
- SEO: the search engines get very clear guidelines as to how the website functions and we prevent duplicate content.
Disadvantages:
- I have some concerns regarding the weekly changing canonical tag from an SEO perspective. Every week, within every competition, the canonical tags are updated. How often do Search Engines update their index for canonical tags? Say it takes a Search Engine a week to visit a page, crawl it and process a canonical tag correctly; then the Search Engines will always be a week behind on figuring out the actual structure of the hierarchy. On top of that: what do the changing canonical URLs do to the 'quality' of the website? In theory this should all work, but I have some reservations.
- If there is a canonical tag from 'website.com/$competition/$season/week-8', what does this do to the indexation and ranking of its subpages (the actual match pages)?
Option B - Using 301-redirects
With 301-redirects, the user and the Search Engine are essentially treated the same. When the Season page or the competition page is requested, both are redirected to the game week page.
The same applies here as applies for the canonical URL: every week there are changes in the redirects.
So in game week 8:
- the page on 'website.com/$competition/' would have a 301-redirect that points to 'website.com/$competition/$season/week-8'
- the page on 'website.com/$competition/$season' would have a 301-redirect that points to 'website.com/$competition/$season/week-8'
A week goes by, so then you have:
- the page on 'website.com/$competition/' would have a 301-redirect that points to 'website.com/$competition/$season/week-9'
- the page on 'website.com/$competition/$season' would have a 301-redirect that points to 'website.com/$competition/$season/week-9'
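The weekly change in redirects could be generated rather than maintained by hand. Below is a rough sketch, assuming a hypothetical helper `redirect_map` that is re-run whenever the current week changes; how the result is fed into the web server is left open.

```python
from typing import Dict, Tuple


def redirect_map(competition: str, season: str, current_week: int,
                 status: int = 301) -> Dict[str, Tuple[int, str]]:
    """Build source path -> (status code, target path) for one active season."""
    target = f"/{competition}/{season}/week-{current_week}"
    return {
        f"/{competition}/": (status, target),
        f"/{competition}/{season}": (status, target),
    }


# Game week 8, then game week 9: only the target changes.
week8 = redirect_map("premier-league", "2011-2012", 8)
week9 = redirect_map("premier-league", "2011-2012", 9)
```

The `status` parameter is there only so the same helper can be reused with a different redirect type.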
Advantages
- There is no loss of link authority.
Disadvantages
- Before a playweek starts the playweek in question can be indexed. However, in the current playweek the playweek page 301-redirects to the competition page. After that week the page's 301-redirect is removed again and it's indexable.
- What do all the (changing) 301-redirects do to the overall quality of the website for Search Engines (and users)?
Option C - Using 302-redirects
Most SEOs will refrain from using 302-redirects. However, a 302-redirect can be put to good use: for serving a temporary redirect.
Within my website, the content that's most important to the users (and therefore to search engines) is constantly moving. In most cases a different piece of the website is the most interesting one for a user after a week. So let's take our example above. We're in playweek 8.
If you want 'website.com/$competition/' to redirect to 'website.com/$competition/$season/week-8/' you can use a 302-redirect, because the redirect is temporary.
The next week the 302-redirect on 'website.com/$competition/' will be adjusted. It'll be pointing to 'website.com/$competition/$season/week-9'.
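For illustration, this is roughly what the temporary redirect could look like as a tiny WSGI application using only the Python standard library conventions. The `CURRENT_WEEK` lookup table and the function names are hypothetical; a real site would read the current week from its own data.

```python
# Hypothetical lookup: competition slug -> (active season, current playweek).
CURRENT_WEEK = {"premier-league": ("2011-2012", 8)}


def current_week_target(path):
    """Return the playweek URL for a competition or season path, else None."""
    parts = path.strip("/").split("/")
    entry = CURRENT_WEEK.get(parts[0])
    if entry is None:
        return None
    season, week = entry
    if len(parts) == 1 or (len(parts) == 2 and parts[1] == season):
        return f"/{parts[0]}/{season}/week-{week}/"
    return None  # playweek and match pages are served normally


def app(environ, start_response):
    """WSGI app: answer 302 for competition and season pages of the active season."""
    target = current_week_target(environ.get("PATH_INFO", "/"))
    if target is not None:
        start_response("302 Found", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not handled in this sketch"]
```

Changing the tuple in `CURRENT_WEEK` to week 9 is the only weekly adjustment this sketch needs.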
Advantages
- We're putting the 302-redirect to its actual use.
- The pages that 302-redirect (for instance 'website.com/$competition' and 'website.com/$competition/$season') will remain indexed.
Disadvantages
- I'm not quite sure how Google will handle this. They're not very clear on how exactly they handle a 302-redirect and in which cases a 302-redirect might be useful. In most cases they advise webmasters not to use it.
I'd very much like your opinion on this. Thanks in advance guys and gals!
-
Hi Andy and Peter, thanks for your response.
@Andy: the rel=next and rel=prev markup won't really help in solving the problem we had. We will use it though because it's very helpful.
@Peter: yeah it's been something we've been struggling with for a while but we've finally made a decision on it.
The /current solution wasn't really a good solution because at the start of a season all the gameweeks are planned and created, so it would become quite complex. We've done some calculations on how much duplicate content we would have if we did not use any of the redirects or canonical tags, and the percentage of DC is very small (below 1%), so we're going to put our faith in Google and let them figure it out. It's a good quality website with loads of links we're talking about, so I don't expect too many issues. We'll monitor it closely though and stand by to intervene when needed.
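For anyone wanting to run a similar estimate, here is a back-of-the-envelope sketch. It assumes the duplicates are exactly the two extra URLs per active season described in the question (the competition page and the season page repeating the current week page); the page counts in the example are made-up numbers, not figures from our site.

```python
def duplicate_share(total_pages: int, active_seasons: int,
                    duplicates_per_season: int = 2) -> float:
    """Fraction of indexable pages that repeat the current playweek page."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return duplicates_per_season * active_seasons / total_pages


# Hypothetical site: 20 active seasons, 10,000 indexable pages in total.
share = duplicate_share(total_pages=10_000, active_seasons=20)
print(f"{share:.2%}")
```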
Anyways, thanks for your suggestions. Although it didn't solve my problem 1:1 it did make me think and make a decision.
Bye, Steven
-
Yeah, time-sensitive information is always tough. I think you're dead on about the disadvantages - the timing of Google's application of these rotating tags would always be off, and you could end up with some really weird search results that are not only bad for SEO but could create bad UX (people landing on old pages thinking they're new).
What about another option - could you take more of a news/blog approach and have a "/current" page that is always the current week? As the current week changes, roll that content into an archive page ("/week8", etc.). That way, the content lives on, but the current URL never changes.
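A rough sketch of that idea, with made-up names: the stable "/current" URL is resolved on the server to whatever week is live and rendered in place (no redirect), while each week keeps its own archive URL.

```python
from typing import Dict, Optional, Tuple


def resolve_current(path: str,
                    current_week: Dict[str, Tuple[str, int]]
                    ) -> Optional[Tuple[str, str, int]]:
    """Map '/<competition>/current' to (competition, season, week), else None.

    current_week is a hypothetical lookup: competition -> (season, week).
    """
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[1] == "current" and parts[0] in current_week:
        season, week = current_week[parts[0]]
        return (parts[0], season, week)
    return None


live = {"premier-league": ("2011-2012", 8)}
page_key = resolve_current("/premier-league/current", live)
```

The tuple in `page_key` would then be used to load the same content that the archive page for that week shows later.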
In terms of duplication, is this really full duplication? It sounds like some pages (like the season) just have snippets of the current week. That's not necessarily a problem. If they are very similar, could you "widgetize" it somehow? Could be straight HTML, but use a condensed format for the season page that links to the full version on the current week page. This would be much like a snippet of a blog post - instead of repeating everything on all 3 pages, have one main chunk of content and two summaries.
-
Hi,
Does the rel=next, rel=prev markup help you out with this problem? See http://googlewebmastercentral.blogspot.co.uk/2011/09/pagination-with-relnext-and-relprev.html
I've used it a couple of times to help stop pages being seen as dupe content where those pages are duplicates (meta, main content, images etc.) except for, for example, reviews or comments, e.g. /product_x /product_x_review_page1 /product_x_review_page2 /product_x_review_page_3
Andy
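As a small illustration of the markup Andy mentions, the sketch below builds the rel=prev and rel=next link elements for one page in a paginated series. The helper name and the `{n}` URL pattern are assumptions made for the example.

```python
from typing import List


def pagination_links(url_pattern: str, page: int, last_page: int) -> List[str]:
    """Return the <link rel="prev"/"next"> elements for page `page` of a series.

    url_pattern contains '{n}' where the page number goes.
    """
    links = []
    if page > 1:
        links.append(f'<link rel="prev" href="{url_pattern.format(n=page - 1)}" />')
    if page < last_page:
        links.append(f'<link rel="next" href="{url_pattern.format(n=page + 1)}" />')
    return links


# Middle page of a three-page review series.
tags = pagination_links("/product_x_review_page{n}", page=2, last_page=3)
```

The first page only gets a "next" element and the last page only a "prev" element.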