Duplication, pagination and the canonical
-
Hi all, and thank you in advance for your assistance.
We have an issue of paginated pages being seen as duplicates by pro.moz crawlers.
The paginated pages do have duplicated content, but are not duplicates of each other. Rather, they pull through a summary of the product descriptions from other landing pages on the site.
I was planning to use rel=canonical to deal with them; however, I am concerned because the paginated pages are not identical to each other, but each features its own set of duplicate content!
We have a similar issue with pages that are not paginated but feature tabs that alter the URL parameters like so:
?st=BlueWidgets
?st=RedSocks
?st=Offers
These are being seen as duplicates of the main URL, and again all feature duplicate content pulled from elsewhere in the site, but are not duplicates of each other. Would a canonical tag be suitable here?
Many Thanks
-
The rel next prev is not for duplicated content - it just shows Google how the parts relate to the whole.
An alternative to the rel next prev is the "Classic Pagination for SEO" approach that uses noindex, covered in another article by Adam:
http://searchengineland.com/the-latest-greatest-on-seo-pagination-114284
If you have a duplicate issue, this would solve it as you would noindex all the duplicate pages.
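As a rough illustration of the noindex approach, here is a minimal Python sketch that picks the robots meta tag for each page in a series. It assumes page 1 is the only page you want indexed and that later pages get "noindex, follow" so their links are still crawled; the function name `robots_meta` is made up for this example and would need adapting to your own templates.

```python
def robots_meta(page_number: int) -> str:
    """Return a robots meta tag for one page of a paginated series.

    Assumption (not from the thread): only page 1 is indexed; every later
    page is kept out of the index while its links remain followable.
    """
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    content = "index, follow" if page_number == 1 else "noindex, follow"
    return f'<meta name="robots" content="{content}">'


# Page 1 stays indexable, page 2 onwards is noindexed.
for n in (1, 2, 3):
    print(n, robots_meta(n))
```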
What you need to do (and I can't do this for you), is to look at all the crawl paths that you are providing Google. As I mention above, you are not doing any favors to Google or to your site when you show Google an infinite number of paths to get to the same content. It just wastes Google's time and you don't want to do that when Google also has to crawl the rest of the internet. If you solve this issue, you will solve your duplicate issue.
AJ Kohn just posted an article on the concept of crawl budget that talks about this. I think the article is quite good and it explains why we need to look at all the topics of noindex, nofollow, robots, canonical and rel next prev: http://www.blindfiveyearold.com/crawl-optimization
-
Thanks CleverPhD,
That's a very interesting read by Adam Audette too, thanks.
I should say that there's no internal search; each tab has a series of duplicated 'blurbs' taken from the product's unique landing page, while the body copy remains the same across the slight variations in the URL. So with:
example.com/example/?st=BlueWidgets
example.com/example/?st=RedSocks
all of these will feature the same body copy, while the last two will have a series of small descriptions from other landing pages in the site. Would the canonical tag be appropriate in this case? We only need to index 'example.com/example'.
Also, does the rel next prev take duplicate content into account? We want only the main URL indexed, as all the paginated pages feature duplicate content; there is no view-all page, however.
Many thanks
-
If I am understanding the question - I think pulling in some body copy from each search result (and not just the whole page) would be fine. I think Google will see that this is a search result and that you are pointing to other pages. You are probably going to pull in text from the title too. This is common practice in search results - heck Google does it!
If you are still concerned about the pulled-in descriptions, your option is to set up the system to have an alternate description for each page. Use the alternate description when you pull it into your main page. It is more work, but it will eliminate this issue.
Separately, paginated pages no longer need to be canonicaled to the index page. You can use rel next and prev.
http://googlewebmastercentral.blogspot.com/2011/09/pagination-with-relnext-and-relprev.html
https://support.google.com/webmasters/answer/1663744?hl=en
It explains to Google the relationship between P1 and P2,3,4,5,n etc.
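To make that relationship concrete, here is a small Python sketch that builds the rel prev / rel next link tags for one page of a series. It is only an illustration: the `?page=` parameter and the function name `pagination_links` are assumptions for the example, not something taken from your site or from Google's documentation.

```python
from typing import List


def pagination_links(base_url: str, page: int, last_page: int) -> List[str]:
    """Build the <link rel="prev"/"next"> tags for one page in a series.

    Assumption: page 1 lives at base_url and later pages at base_url?page=N.
    """
    if not 1 <= page <= last_page:
        raise ValueError("page must be between 1 and last_page")

    def url_for(n: int) -> str:
        return base_url if n == 1 else f"{base_url}?page={n}"

    tags = []
    if page > 1:  # the first page has no predecessor
        tags.append(f'<link rel="prev" href="{url_for(page - 1)}">')
    if page < last_page:  # the last page has no successor
        tags.append(f'<link rel="next" href="{url_for(page + 1)}">')
    return tags


# Page 2 of 3 points back to page 1 and forward to page 3.
print("\n".join(pagination_links("http://example.com/example/", 2, 3)))
```

The first page only gets a "next" tag and the last page only gets a "prev" tag, which is how the chain tells Google where the series starts and ends.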
Beyond that, you need to watch that you do not end up with too many sets of paginated pages that all lead to the exact same product pages. Let's say you had 1,000 widgets that were blue, red and green and also were Free, Expensive or Cheap. You would have several sets of paginated pages (one set for Blue, one each for Red, Green, Free, Cheap and Expensive, one for Red and Expensive, etc.). It gets to be a little crazy as they all lead to the same set of widget product pages. You need to manage how Google crawls all that and not have your paginated category pages look like duplicates. Adam Audette writes great stuff on this. Look here for things to consider:
http://www.rimmkaufman.com/blog/site-search-dynamic-content-and-seo/01032013/
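To put a number on the widget example, here is a quick Python sketch that enumerates the filter combinations. It assumes a visitor can pick at most one colour and at most one price band at a time (that rule and the function name `listing_filters` are my assumptions for the illustration); each combination is its own paginated listing over the same products.

```python
from itertools import product


def listing_filters(colors, prices):
    """Enumerate every (color, price) filter combination.

    None means "no filter on this facet", so (None, None) is the
    unfiltered listing. Each tuple is a separate set of paginated pages.
    """
    color_options = [None, *colors]
    price_options = [None, *prices]
    return list(product(color_options, price_options))


combos = listing_filters(["Blue", "Red", "Green"], ["Free", "Cheap", "Expensive"])
print(len(combos))  # 16 listings: (3 colours + "any") x (3 prices + "any")
```

Two small facets already give 16 separate paginated sets pointing at the same widgets, and every extra facet multiplies that again.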
-
Thank you Robert, and for the helpful link.
You did read my question correctly; however, I failed to ask it entirely correctly. Just to complicate matters, I neglected to mention that there is body copy on each page, which technically will be duplicated.
It sits above the tabs and does not change, while the tabbed pages - under new URL parameters - pull in a sentence or two of product description from elsewhere (a unique landing page).
So,
?st=BlueWidgets
?st=RedSocks
?st=Offers
will all feature the same body copy and different duplicate content. For obvious reasons, we only want search engines to index the main URL.
Any ideas?
Thanks again
-
Hi
It doesn't sound like rel=canonical is the solution, as each one of your pages might feature multiple pieces of content from various other parts of your website (if I've read your question correctly) - so which would be the canonical version of the page?
You could use Parameter Handling in Webmaster Tools to ensure Google knows what to do with your various parameters. Moz doesn't matter here, as long as search engines are aware of how to handle your pages correctly.
There's a good overview here.
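Before you configure anything, it can help to check which crawled URLs collapse onto the same main URL once the tab parameter is ignored. Here is a rough Python sketch for that, using only the standard library. The `st` parameter comes from your question; the function names are made up for the example, and the script does not talk to Webmaster Tools in any way.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_params(url, ignored=("st",)):
    """Return the URL with the ignored query parameters removed.

    Other parameters are kept in their original order; any #fragment
    is dropped.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_by_main_url(urls, ignored=("st",)):
    """Group crawled URLs under the main URL they reduce to."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_params(url, ignored)].append(url)
    return dict(groups)


crawled = [
    "http://example.com/example/?st=BlueWidgets",
    "http://example.com/example/?st=RedSocks",
    "http://example.com/example/?st=Offers",
]
for main_url, variants in group_by_main_url(crawled).items():
    print(main_url, len(variants))
```

Feeding it a crawl export should show at a glance how many parameter variants sit behind each main URL.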
I hope that's helpful