Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
No-index pages with duplicate content?
-
Hello,
I have an e-commerce website selling about 20 000 different products. For the most used of those products, I created unique, high-quality content. The content was written by a professional player who describes how and why those products are useful, which is of huge interest to buyers.
It would cost too much to write that high-quality content for 20 000 different products, but we still have to sell them. Therefore, our idea was to no-index the products that only have the same copy-paste descriptions all other websites have.
Do you think it's better to do that, or to just leave everything indexed normally, since we might get search traffic from those pages?
Thanks a lot for your help!
-
We recommend to such clients that they apply the robots noindex,follow meta tag on the duplicated pages until they get rewritten. We aim for 20% of all products on the site to be completely unique in content, and indexable. The other 80% can be rewritten gradually over time and released back into the index as they are rewritten.
So to answer your question: Yes, I think your plan is perfectly acceptable, and is what I would do myself if I were in the same situation.
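As a rough illustration of the noindex,follow approach described above, here is a minimal sketch in Python. The function name `robots_meta` and the `has_unique_content` flag are hypothetical and not part of any particular shop platform; the only fixed part is the standard robots meta tag itself.

```python
def robots_meta(has_unique_content: bool) -> str:
    """Return the robots meta tag for a product page (hypothetical helper)."""
    # Pages with unique copy stay indexable. Duplicated ones are kept out of
    # the index, but their links can still be followed.
    content = "index,follow" if has_unique_content else "noindex,follow"
    return f'<meta name="robots" content="{content}">'


# A product page that still uses the copy-paste manufacturer description:
print(robots_meta(False))
```

Once a page has been rewritten, flipping the flag is enough to release it back into the index on the next crawl.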
-
Duplicate content is not a penalty, it's a filter. Deindexing those pages will ensure that they never rank; leave them indexed and they have a chance of ranking. The worst-case scenario is that they don't rank well because of the duplication.
-
I think Devanur gives some good advice regarding the gradual improvement of the content, though you're stuck in a bit of a catch-22 with regard to how Google views websites: you want to be able to sell lots of products, but your company doesn't have the resources to present them in a unique or engaging fashion. This is something that Google wants webmasters to do, but the reality of your situation paints a completely different picture of what will give your company decent ROI for updating vast amounts of product content.
If there isn't an obvious Panda problem, I wouldn't just noindex lots of pages without some thought and planning first. Before noindexing the pages, I would look at what SEO traffic they're getting. Noindexing alone seems like a tried and tested method of bypassing potential Panda penalties, and although PageRank will still be passed, there's a chance that you will remove pages from the index that are driving traffic (even if it's long tail).
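The traffic check could be sketched like this, assuming you can export organic visits per URL from your analytics tool. The function name, the `min_visits` threshold, and the sample URLs and numbers are all made up for illustration.

```python
def split_noindex_candidates(organic_visits, duplicate_pages, min_visits=1):
    """Split duplicate pages into (safe_to_noindex, keep_indexed).

    organic_visits: dict of URL -> organic search visits in the period
    duplicate_pages: set of URLs that carry copy-paste descriptions
    min_visits: hypothetical threshold; pages at or above it stay indexed
    """
    safe, keep = [], []
    for url in sorted(duplicate_pages):
        visits = organic_visits.get(url, 0)  # no data counts as no traffic
        if visits >= min_visits:
            keep.append(url)   # still driving traffic, review before noindexing
        else:
            safe.append(url)   # no measurable search traffic
    return safe, keep


visits = {"/p/1": 40, "/p/2": 0, "/p/3": 3}      # made-up numbers
duplicates = {"/p/1", "/p/2", "/p/3", "/p/4"}
safe, keep = split_noindex_candidates(visits, duplicates)
print(safe)  # ['/p/2', '/p/4']
print(keep)  # ['/p/1', '/p/3']
```

Pages in `keep` are the long-tail earners worth rewriting first rather than removing.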
In addition to prioritising content production for indexed pages per Devanur's advice, I would also do some keyword analysis and prioritise the production of new content for terms which people are actually searching for before they purchase.
There's a Moz discussion here which might help you: http://moz.com/community/q/noindex-vs-page-removal-panda-recovery.
Regards
George
@methodicalweb
-
Hi, the suggestion was not to have quality articles written that take an hour each. I meant changing the product descriptions that were copied and pasted with little variation, so that they don't look like a copy-and-paste job.
Now, coming to the de-indexing part, let us look at a scenario:
Suppose I built a website to promote Amazon products through the Amazon Associates program. I populated its pages using the Amazon API through a plugin like WProbot or Protozon. In this case, the content will be purely scraped from Amazon and other places. After a while, I realize that my site has not been performing well in the search engines because of the scraped content, but I haven't seen any penalty levied or manual action taken. As of now, I have about 3000 pages in Google's index. Now I want to tackle the duplicate content issue. This is what I would do to be on the safe side against a possible future penalty such as Panda:
1. First, I would make the top pages unique.
2. Add noindex to the rest of the duplicate content pages.
3. Keep making the pages unique in phases, removing the noindex tag from the ones that have been updated with unique content.
4. Repeat the above step until all the duplicate content pages on the website are fixed.
It greatly depends on the level of content duplication and a few other things, so we will be able to suggest something better if we can have a look at the website in question. You can send a private message if you want any of us to have a look at it.
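The four steps above can be sketched as a simple phased plan. This is only an illustration: `plan_phases`, the page names, and the batch size are hypothetical, and the pages are assumed to be already sorted by traffic, highest first.

```python
def plan_phases(pages_by_traffic, unique_pages, batch_size):
    """Yield (phase, batch_to_rewrite, still_noindexed) for each phase.

    pages_by_traffic: URLs sorted by traffic, highest first
    unique_pages: URLs that already have unique content (stay indexed)
    batch_size: how many pages can be rewritten per phase
    """
    # Everything without unique content starts out noindexed.
    pending = [p for p in pages_by_traffic if p not in unique_pages]
    phase = 0
    while pending:
        phase += 1
        # Rewrite the next batch and drop its noindex tag.
        batch, pending = pending[:batch_size], pending[batch_size:]
        yield phase, batch, list(pending)


for phase, batch, noindexed in plan_phases(["a", "b", "c", "d", "e"], {"a"}, 2):
    print(phase, batch, noindexed)
# 1 ['b', 'c'] ['d', 'e']
# 2 ['d', 'e'] []
```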
-
Hello,
Like I said in my first post, this has already been done. I was asking a specific question.
On another topic, 300 quality pages of content is not possible in a month. We're talking about articles that take at least an hour each to write.
That being said, I'll ask my question again: once I have done, let's say, 750 pages of unique content, should I no-index the rest or not? Is there something better to do that doesn't involve writing content for 20 000 pages?
Thanks.
-
Very true, my friend. If you look at your top pages for the last 30 days, there won't be more than approximately 2000. So you can make the content unique on these over a period of six months or a bit more, going at 300 per month. Trust me, this would be an effort well spent.
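For reference, the arithmetic behind "six months or a bit more" works out as follows. This is a trivial sketch; the function name is made up.

```python
import math


def months_to_rewrite(pages: int, pages_per_month: int) -> int:
    """Whole months needed to rewrite `pages` at a fixed monthly pace."""
    return math.ceil(pages / pages_per_month)


print(months_to_rewrite(2000, 300))  # 7 (2000 / 300 is about 6.7)
```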
-
Hello,
I agree with you that it would be best, but like I said, writing content for 20 000 pages is not an option. Thanks for your answer!
-
Going off what Devanur said, giving your product pages unique content is the way to go. But this can include pictures, sizes, materials, etc. I am in the rug business, and this is how we pull it off, and how RugsUSA does it as well. If you cannot do that, however, I would do what Devanur suggested and change the descriptions of your top-selling products first.
All the best!
-
Hi,
While it's not recommended to have duplicate content on your pages that is found elsewhere, it is also not a good thing to de-index pages from Google. If I were you, I would try to beef up these duplicate pages a little with unique content, or at least rewrite the existing content so that it becomes unique.
Please go ahead and start rewriting the product descriptions in phases, beginning with the ones that get the most traffic according to your web analytics data. Those were my two cents, my friend.
Best regards,
Devanur Rafi