Is it worth de-duplicating a large e-commerce website?
-
Hi all,
Most e-commerce websites use the same product descriptions as the manufacturers, and we know duplicate content is a huge negative for SEO. We are thinking about de-duplicating ours, but our website is so big (tens of thousands of products) that doing so would require a ton of resources. Do you think it's worth it to go ahead and de-duplicate our website? Have you worked on a website where de-duplication was done, and did you see any positive result? (If so, what kind of percentage increase?)
Thank you in advance
-
Hi Marie,
Thank you for responding. I'd be happy to share it with you if we do see a positive result.
-
Hi Matthew,
Thank you for your suggestion.
From what I can see in our analytics, the product detail page, which is the page with the duplicate description, isn't getting any organic traffic. The page is indexed but just not ranking; other websites, including the manufacturer's, are outranking it.
Good suggestion; we are planning on doing the top 250 SKUs, and hopefully we'll see some positive results.
On the websites that you've worked on, did you see an increase after you de-duplicated them? The reason I'm asking is that the higher-ups are asking me what kind of percentage increase we should expect from this. I have no answer, but they want a ballpark figure, e.g., a 10% or 20% increase.
-
Ok mate, you are in trouble and there is no shortcut. Sorry, I am going to disagree. You must have unique content on your pages, and there are no two ways about it. If you are still getting visitors even while harbouring duplicate content, you should thank your lucky stars.
Now, why not start generating unique content rather than wasting your time worrying about the size and complexity of the task?
-
I was going to respond to this, but Matthew and Marie are spot on. First order of business is to check positioning for your products to see if you are being marginalized behind other pages with the same descriptions. (While you're in there, check those pages' Page Authority and backlinks to compare against yours as well.) Check to see how many of your product pages are indexed. You can also do a litmus test on a few products by adding unique descriptions and waiting to see if there is improvement.
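If you want to script that indexation check rather than spot-checking by hand, here's a rough sketch using the Search Console URL Inspection API. It assumes a service account with access to the property; the site URL, credentials file, and product URLs are all placeholders.

```python
# Rough sketch: check indexation for a list of product URLs via the Search
# Console URL Inspection API. Assumes a service account with access to the
# property; the site URL, credentials file, and product URLs are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SITE = "https://www.example.com/"  # hypothetical property URL
creds = service_account.Credentials.from_service_account_file(
    "creds.json", scopes=["https://www.googleapis.com/auth/webmasters.readonly"]
)
service = build("searchconsole", "v1", credentials=creds)

product_urls = [
    "https://www.example.com/product/widget-123",  # hypothetical URLs
    "https://www.example.com/product/widget-456",
]

for url in product_urls:
    result = service.urlInspection().index().inspect(
        body={"inspectionUrl": url, "siteUrl": SITE}
    ).execute()
    state = result["inspectionResult"]["indexStatusResult"]["coverageState"]
    print(f"{url}: {state}")  # e.g. "Submitted and indexed"
```

The URL Inspection API is quota-limited, so inspect a representative sample of the catalogue rather than all tens of thousands of URLs.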
-
Matthew's got good advice for you. I have a few thoughts to add.
How well are you ranking for these products right now? Anywhere near the top? If not, it's worth a try. And I really like the idea of trying it with a small sample first and seeing if your rankings improve.
I wanted to share the experience of a real estate site I work with. In real estate, almost every realtor uses the MLS listing description for each listing, which means there is a lot of duplication going on. For our listings we create a unique description and title for each one, and we usually end up ranking #1 for address searches for those listings.
If you do de-duplicate and see an improvement, I would love to write about your site on my blog, as it would probably be considered a Panda recovery of sorts. Let me know!
-
I agree with Matthew completely. Test it on a small scale; there are so many other metrics that influence rankings that it's best to test first and, if it works, roll it out on a larger scale. Quoting from Matthew's response: "Another way to approach this would be to take a sample of ~50 products that get some traffic from search right now. Take unique pictures and write unique content for those products. Then, measure the results. Did organic traffic increase after 30, 60, 90 days? If so, you know that you've got a problem with duplicate content worth correcting."
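If it helps to pick which pages go into that small-scale test, one option is to score how closely each of your descriptions matches the manufacturer's copy and start with the worst offenders. A minimal sketch, assuming you can export both sets of descriptions (the dicts below are placeholders):

```python
# Rough sketch: score how closely each product description matches the
# manufacturer's copy, to pick the worst offenders for the rewrite test.
# The dicts below are placeholders; in practice, load them from your
# catalogue export and the manufacturer's feed.
from difflib import SequenceMatcher

our_descriptions = {
    "SKU-001": "High-quality widget with a durable steel frame...",
    "SKU-002": "Compact gadget designed for everyday use...",
}
manufacturer_descriptions = {
    "SKU-001": "High-quality widget with a durable steel frame...",
    "SKU-002": "A compact gadget, designed for everyday use...",
}

scores = {
    sku: SequenceMatcher(None, desc, manufacturer_descriptions.get(sku, "")).ratio()
    for sku, desc in our_descriptions.items()
}

# The highest ratios are closest to pure duplication; rewrite those first.
for sku, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{sku}: {score:.0%} similar")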
-
How much traffic do you get to those duplicated pages right now from Google/Bing? Are all of those pages indexed? Can you tell if you are losing out to the manufacturer websites? If the pages are indexed, you are getting traffic, and you seem to be "beating" the manufacturer's website, I wouldn't worry too much.
On sites I've worked on, I've seen duplicated pages get no traffic (or only one page gets traffic, but not all pages), or they get some traffic but the manufacturer's website ranks considerably higher. That is when you know you have a problem worth correcting.
Another way to approach this would be to take a sample of ~50 products that get some traffic from search right now. Take unique pictures and write unique content for those products. Then, measure the results. Did organic traffic increase after 30, 60, 90 days? If so, you know that you've got a problem with duplicate content worth correcting.
I hope that helps.
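For anyone wanting to run that 30/60/90-day comparison programmatically, here's a rough sketch pulling organic clicks per product page from the Search Console Search Analytics API. The site URL, credentials file, date ranges, and the "/product/" path filter are all placeholder assumptions.

```python
# Rough sketch: pull organic clicks per product page from the Search Console
# Search Analytics API for two windows, so you can compare traffic before and
# after the rewrite. Site URL, credentials file, date ranges, and the
# "/product/" path filter are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SITE = "https://www.example.com/"  # hypothetical property URL
creds = service_account.Credentials.from_service_account_file(
    "creds.json", scopes=["https://www.googleapis.com/auth/webmasters.readonly"]
)
service = build("searchconsole", "v1", credentials=creds)

def clicks_by_page(start, end, path_fragment="/product/"):
    body = {
        "startDate": start,
        "endDate": end,
        "dimensions": ["page"],
        "dimensionFilterGroups": [{
            "filters": [{
                "dimension": "page",
                "operator": "contains",
                "expression": path_fragment,
            }]
        }],
        "rowLimit": 25000,
    }
    resp = service.searchanalytics().query(siteUrl=SITE, body=body).execute()
    return {row["keys"][0]: row["clicks"] for row in resp.get("rows", [])}

# Hypothetical 90-day windows either side of the rewrite date.
before = clicks_by_page("2024-01-01", "2024-03-31")
after = clicks_by_page("2024-04-01", "2024-06-30")

for page in sorted(set(before) | set(after)):
    print(f"{page}: {before.get(page, 0)} -> {after.get(page, 0)} clicks")
```

Restricting the pull to the ~50 rewritten URLs keeps the comparison clean, since site-wide trends and seasonality would otherwise muddy the result.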