Am I doing enough to get rid of duplicate content?
-
I'm in the middle of a massive cleanup of old duplicate content on my site, and I'm trying to make sure I'm doing enough.
My main concern now is a large group of landing pages. For example:
http://www.boxerproperty.com/lease-office-space/office-space/dallas
http://www.boxerproperty.com/lease-office-space/executive-suites/dallas
http://www.boxerproperty.com/lease-office-space/medical-space/dallas
And these are just the tip of the iceberg. For now, I've put canonical tags on each sub-page pointing to the main market page (for example, the second and third URLs above both point to the first, http://www.boxerproperty.com/lease-office-space/office-space/dallas). However, the same situation exists in many other cities, and each city has a main page like the first one above. For instance:
http://www.boxerproperty.com/lease-office-space/office-space/atlanta
http://www.boxerproperty.com/lease-office-space/office-space/chicago
http://www.boxerproperty.com/lease-office-space/office-space/houston
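For illustration, the canonical setup described above could be sketched roughly like this in Python. This is only a sketch under assumptions: the helper names `main_market_url` and `canonical_tag` are hypothetical, and it assumes every landing page follows the `/lease-office-space/<category>/<city>` pattern seen in the URLs above, with `office-space` always being the main market page.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the main market page for every city uses this category segment.
MAIN_CATEGORY = "office-space"


def main_market_url(url: str) -> str:
    """Return the main market page URL for a landing-page URL.

    Assumes the path looks like /lease-office-space/<category>/<city>.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if len(segments) != 3 or segments[0] != "lease-office-space":
        raise ValueError(f"unexpected landing-page path: {parts.path}")
    # Swap the category segment and keep the city as-is.
    segments[1] = MAIN_CATEGORY
    return urlunsplit((parts.scheme, parts.netloc, "/" + "/".join(segments), "", ""))


def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for a landing page."""
    return f'<link rel="canonical" href="{main_market_url(url)}" />'
```

Calling `canonical_tag` on the medical-space or executive-suites Dallas URL yields a tag pointing at the office-space Dallas page, and the same mapping applies city by city.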
Obviously the previous SEO was pretty heavy-handed with all of these, but my question for now is should I even bother with canonical tags for all of the sub-pages to the main pages (medical-space or executive-suites to office-space), or is the presence of all these pages problematic in itself? In other words, should http://www.boxerproperty.com/lease-office-space/office-space/chicago and http://www.boxerproperty.com/lease-office-space/office-space/houston and all the others have canonical tags pointing to just one page, or should a lot of these simply be deleted?
I'm continually finding more and more sub-pages that use the same template, so I'm just not sure of the best way to handle all of them. Looking back historically in Analytics, it appears many of these did drive significant organic traffic in the past, so I'm going to have a tough time justifying deleting a lot of them.
Any advice?
-
Heather,
I'm confused as to what the duplicate content is. The three Dallas pages you mentioned have different content. Sure, a decent amount is the same because of the site-wide content (nav menus, etc.), but each page has different text and information about the different locations that are available. How is it duplicate?
Kurt Steinbrueck
OurChurch.Com
-
Heather,
First things: 1. Are they still driving traffic? 2. Rel=canonicals are supposed to be used on identical pages or on a page whose content is a subset of the canonical version.
Those pages have very thin content and I certainly wouldn't leave them as they are. If they're still driving traffic, I'd keep them, but for fear of Panda, I'd 302 them to the main pages while I work steadily on putting real content on them, and then remove the redirects as the content goes up.
If they're not still driving traffic, it seems to me that it wouldn't be very hard to justify their removal (or 301 redirection to their main pages). Panda is a tough penalty and you don't want to get caught in that.
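As a rough sketch of that triage, something like the following Python could sort the sub-pages into the two buckets. The `LandingPage` fields, the `triage` function, and the `traffic_threshold` default are all hypothetical names and values chosen for illustration; what counts as "still driving traffic" is a judgment call you'd make from your own Analytics data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class LandingPage:
    url: str                     # the thin sub-page
    main_url: str                # the main market page it belongs to
    monthly_organic_visits: int  # hypothetical figure taken from Analytics


def triage(page: LandingPage, traffic_threshold: int = 1) -> Tuple[int, str]:
    """Return (HTTP redirect status, redirect target) for a thin sub-page.

    302: the page still drives traffic, so redirect temporarily while
         real content is written, then remove the redirect.
    301: the page no longer drives traffic, so redirect permanently.
    """
    if page.monthly_organic_visits >= traffic_threshold:
        return 302, page.main_url
    return 301, page.main_url
```

For example, a page with 120 monthly organic visits would come back as `(302, main_url)`, and a page with none would come back as `(301, main_url)`.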