Does this site have a duplicate content issue?
-
Google WMT is showing me only 2 short meta descriptions under "HTML Improvements", but I believe http://www.customgia.com may have a content duplication issue. Numerous keywords are used repeatedly across many product descriptions. To make matters worse, every product page has a "Design It!" button that sends the user to a Flash-based jewelry designer in which they can edit the product's appearance. I'm not sure if these "designer pages" are adding unnecessary and potentially damaging duplicate content, but it's certainly a possibility.
There are many items on this site that are similar to one another but not the same. The product descriptions tend to use the same phrases over and over again - words like crystal, Swarovski, beaded, design it, customize, change, pearl, glass beads, iridescent, and drop earrings are used a lot. What I'm stuck on is whether I should be focusing on content duplication as the primary SEO problem or if there is something bigger. Thank you for any assistance you can provide!
-
This is where things get a bit dicey - I'm not 100% sure that won't remove the main page, too (and how Google handles the trailing "/"). You might need a "/*" wild-card in the Robots.txt. Frankly, I'd ease into it with just one directory. These things never seem to work quite the way in practice that we all say they should in theory.
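One way to reason about the trailing "/" question is to test a single Disallow rule against a few URLs offline before touching the live file. The sketch below uses Python's standard-library robots.txt parser, which does plain prefix matching and does not interpret "*" wildcards inside paths. Googlebot's matching may differ in details, so treat the result as an approximation, not a statement of how Google behaves. The helper name `is_blocked` and the "CODE" placeholder for a design code are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Start with just one directory, as suggested above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /design-your-own-jewelry/classic-necklace/
"""

def is_blocked(path, robots_txt=ROBOTS_TXT):
    """Return True if the standard-library parser treats `path` as disallowed."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("*", "http://www.customgia.com" + path)

if __name__ == "__main__":
    for path in (
        "/design-your-own-jewelry/classic-necklace",       # main page, no trailing slash
        "/design-your-own-jewelry/classic-necklace/",      # main page, trailing slash
        "/design-your-own-jewelry/classic-necklace/CODE",  # hypothetical designer page
    ):
        print(path, "blocked" if is_blocked(path) else "allowed")
```

With this parser, the rule covers the trailing-slash form of the main page and everything beneath it, but not the slash-less form, so which form the homepage links to matters.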
-
Okay, last question on this (I hope). As far as I can tell, Google's URL removal tool does not support the use of wildcards. And according to their removal requirements, I can't remove an entire directory unless that directory is already blocked in the robots.txt. So before I submit the removal request for: http://www.customgia.com/design-your-own-jewelry/classic-necklace/, I have to add: Disallow: /design-your-own-jewelry/classic-necklace/ to robots.txt. Is this the right way to do this? And again, thanks for helping me with this.
-
Oh, sorry - yeah, this is why these questions can be dangerous in the scope of Q&A. If some of the pages in that virtual folder are main nav pages/links, then it's definitely going to look weird to block them (it's a mixed signal, at best). I'm not sure I fully understand the site structure, but my gut reaction is to leave those indexed. The wild-cards should work - the other option would be to give them each their own shorter URLs and not put them in the "/design-your-own-jewelry" folder, but that can be a ton of work, depending on how your site is built (plus, you'd have to 301-redirect the old URLs, which opens up a whole new mess).
-
Thank you for the excellent advice. We put the META NOINDEX tags into place this morning. The URL removal request is next but I have a slight change to what Everett outlined above.
The six designer pages that are accessible from the homepage: /design-your-own-jewelry/classic-necklace, /design-your-own-jewelry/simple-necklace, /design-your-own-jewelry/classic-bracelet, /design-your-own-jewelry/simple-bracelet, /design-your-own-jewelry/pendant-necklace, /design-your-own-jewelry/drop-earrings are not duplicates and the Moz crawl did not show them as duplicates. All the other designer pages are considered duplicates of each other or duplicates of one of these six pages. So we put INDEX, NOFOLLOW on these six pages to keep them indexed. I think the removal request should follow suit.
What's your opinion on placing a removal request for each of the following? - /design-your-own-jewelry/classic-necklace/, /design-your-own-jewelry/simple-necklace/, /design-your-own-jewelry/classic-bracelet/, /design-your-own-jewelry/simple-bracelet/, /design-your-own-jewelry/pendant-necklace/, /design-your-own-jewelry/drop-earrings/. Correct me if I'm wrong, but this should remove all the designer pages from the index except the six that are accessible from the homepage, giving those six pages a chance to rank.
-
That sounds like a winning plan Dr. Pete, though I'd append 2.1 "Request removal of directory in Bing and Google webmaster tools".
-
I've had a lot of issues where, if pages were already indexed, Robots.txt did a poor job of removing them. Absolutely agree on the crawl budget issue and it's a whole lot easier to remove a folder in Robots.txt, but I've just had a bunch of odd problems with Robots.txt at large scale. If I actually had to do it on my own site, I'd probably:
(1) META NOINDEX the pages
(2) Monitor removal
(3) Once removal was progressing well (80%+), then add to Robots.txt
-
I agree with Dr. Pete here, though I think the easiest solution would be to simply block the entire /design-your-own-jewelry/* directory from being indexed using robots.txt and, to Dr. Pete's point, you'll want to remove that directory from the index in both Bing and Google webmaster tools, as discussed here:
http://googlewebmastercentral.blogspot.com/2007/04/requesting-removal-of-content-from-our.html (see the section under "entire directory")
Something I think about with regard to a robots.txt block vs. a meta robots block is crawl budget. Google has to access a page to see the meta noindex tag, while a single disallow statement in the robots.txt file can save Googlebot the hassle of visiting potentially thousands of unnecessary pages.
If down the road you figure out a way to put custom content on those pages and want to try and rank for things like "Custom Garnet Pearl Bracelet" or "Design Your Own Beaded Bracelet" then I'd look into some of the other options discussed here. Until then I feel they would just be complicating something as simple as the need to remove very thin, mostly duplicate content from the index.
-
Each of these problems may have a unique solution, so it gets complicated. Regarding the "design your own" pages, I'm seeing over 5K of those URLs in the search index, and they do probably look very similar. Since these are not the core product pages, I'd strongly consider using META NOINDEX on them. I find that Robots.txt does not do a good job of blocking content that has already been indexed, in most cases. You can add the meta tag dynamically in your code, hopefully, so that just a few lines of code will serve all of these pages.
While these pages aren't "true" duplicates, they look similar enough that, at the scale of your site, they really are diluting your ability to rank. In extreme cases, if you're also serving up product variations, paginated search results, etc., you could even run into Panda issues. Whether or not this is your core problem, from an SEO perspective, cleaning it up can't hurt, and may make it easier to find other problems.
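The "few lines of code" idea might look something like the sketch below. It is written in Python purely for illustration, since the thread does not say what the site runs on. The function name, the assumed URL shape, and the `noindex, follow` value are all assumptions, not the site's actual code. It emits a noindex tag only for designer URLs that carry a design code and leaves the bare pattern pages and product pages alone.

```python
import re

# Assumed URL shape: /design-your-own-jewelry/<pattern>/<design-code>
DESIGNER_WITH_CODE = re.compile(r"^/design-your-own-jewelry/[^/]+/[^/]+/?$")

NOINDEX_TAG = '<meta name="robots" content="noindex, follow">'

def robots_meta_tag(path):
    """Return a meta robots tag for designer pages with a design code, else ''."""
    if DESIGNER_WITH_CODE.match(path):
        return NOINDEX_TAG
    return ""

if __name__ == "__main__":
    print(robots_meta_tag(
        "/design-your-own-jewelry/classic-bracelet/AADE17E9BB964352B7A3912294BB5DF8"
    ))
```

The page template would then print whatever this returns inside the head of each page, so one rule serves all of the designer URLs.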
-
Even if the 'design' part were not Flash, the textual content is pretty much identical. There is no benefit to having it indexed, so a canonical pointing to the main directory URL would make sense. Then add some good text to those main pages.
Personally, I would write the H1 for user experience rather than stuffing it with keywords, as H1s don't carry much weight.
-
Yes, the page http://www.customgia.com/fashion-beaded-jewelry/shop-for/beaded-bracelets/classic-bracelet/AADE17E9BB964352B7A3912294BB5DF8 is a unique product page with a unique description. The designer page you referenced: http://www.customgia.com/design-your-own-jewelry/classic-bracelet/AADE17E9BB964352B7A3912294BB5DF8 is also unique because it loads the jewelry design in the product page.
On every product page the "Design It!" button opens a Flash-based designer page that lets the user edit that particular design. Unfortunately, the Moz crawler (and I assume Google) considers these pages duplicates of http://www.customgia.com/design-your-own-jewelry/classic-bracelet/ (or one of the other five jewelry patterns). The fact that every designer page loads a unique jewelry design does not seem to matter. The best (and most costly) solution, I suppose, would be to change all the Flash code to HTML, but that isn't happening anytime soon.
The H1 tag on all the designer pages is either "Classic Bracelet", "Classic Necklace", "Simple Necklace", "Simple Bracelet", "Pendant Necklace" or "Drop Earrings" (depending on which pattern was used to create the design). Maybe changing the H1 to something like: Redesigning "Garnet Pearl Bracelet with Silver Hearts" would help tell Google these pages indeed differ from one another - but I think the content below the H1 will still be considered duplicate. If I use canonical tags on these pages, is there any point in creating dynamic H1 tags if it doesn't improve the user experience? Thanks again!
-
On a brief view:
http://www.customgia.com/fashion-beaded-jewelry/shop-for/beaded-bracelets/classic-bracelet/AADE17E9BB964352B7A3912294BB5DF8 looks like a unique product page with a potentially unique description. If so, leave it as is.
http://www.customgia.com/design-your-own-jewelry/classic-bracelet/AADE17E9BB964352B7A3912294BB5DF8 looks like a duplicate of http://www.customgia.com/design-your-own-jewelry/classic-bracelet/ so I would set rel canonical on any http://www.customgia.com/design-your-own-jewelry/classic-bracelet/CODE page, pointing to ...design-your-own-jewelry/classic-bracelet/
I hope this helps.
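As a rough sketch of that mapping, the Python below strips the design code from a designer URL to get the pattern page it would point to, and leaves every other URL unchanged. The function names and the three-segment assumption about designer URLs are illustrative only and not taken from the site's code.

```python
from urllib.parse import urlsplit, urlunsplit

DESIGNER_ROOT = "design-your-own-jewelry"

def canonical_url(url):
    """Map a designer URL with a design code to its pattern page; leave others unchanged."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumed shape: /design-your-own-jewelry/<pattern>/<design-code>
    if len(segments) == 3 and segments[0] == DESIGNER_ROOT:
        path = "/" + "/".join(segments[:2]) + "/"
        return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return url

def canonical_link_tag(url):
    """Build the link element to place in the head of the page served at `url`."""
    return '<link rel="canonical" href="%s">' % canonical_url(url)

if __name__ == "__main__":
    print(canonical_link_tag(
        "http://www.customgia.com/design-your-own-jewelry/classic-bracelet/"
        "AADE17E9BB964352B7A3912294BB5DF8"
    ))
```

Product pages under /fashion-beaded-jewelry/ pass through untouched, which matches the "leave as is" advice for them.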
-
Initially I thought canonical tags would work best. If we use them, should the canonical tag for the page: http://www.customgia.com/design-your-own-jewelry/classic-bracelet/AADE17E9BB964352B7A3912294BB5DF8 point to the "parent" product page: http://www.customgia.com/fashion-beaded-jewelry/shop-for/beaded-bracelets/classic-bracelet/AADE17E9BB964352B7A3912294BB5DF8 or do you think it should point to the appropriate designer page: http://www.customgia.com/design-your-own-jewelry/classic-bracelet/? - (The design that appears on these pages is based on the six products that appear on the homepage.)
The first solution, I suppose, would pass authority/PageRank to the "parent" product page, whereas the second solution would pass authority/PageRank to one of the six designer pages. I'm not sure which is the better solution, but I'm favoring pointing each page to its "parent" product page.
The priority is to fix the duplicate content issue, but a bump in ranking for any of these pages is obviously a bonus. Thanks for your help!
-
If you add rules to robots.txt that does not mean those directories will be removed from the index. You will also need to remove them in Webmaster Tools >> Google Index >> Remove URLs >> set Reason to Remove Directory.
Having said that, why not use canonical tags on pages like /design-your-own-jewelry/classic-bracelet/AADE17E9BB964352B7A3912294BB5DF8? See http://moz.com/blog/rel-confused-answers-to-your-rel-canonical-questions
-
You are so welcome, and as I said, technical SEO is something I've been thrust into learning because of my current situation as an in-house SEO.
I think you may be on the right track, but there are other, very talented technical SEOs here whom I would ask for a second opinion on your decision. Perhaps Dr. Pete, Ian Lurie, or Everett Sizemore could chime in with a much more accurate response on the robots.txt.
Good luck!
-
Thanks for the quick response. The Crawl Diagnostics report I just rec'd identified all the designer pages as duplicate content (these are the pages that load when the "Design It!" button is clicked).
The current robots.txt file does not disallow the designer pages. There are six different types of designer pages. Each of the 967 products loads one type of designer page, depending on the jewelry pattern the item was created with. Do you think the correct solution is to disallow these designer pages in the robots.txt file? I think the new robots.txt file should look like this:
User-agent: *
Disallow: /admin/
Disallow: /design-your-own-jewelry/classic-bracelet/
Disallow: /design-your-own-jewelry/classic-necklace/
Disallow: /design-your-own-jewelry/drop-earrings/
Disallow: /design-your-own-jewelry/pendant-necklace/
Disallow: /design-your-own-jewelry/simple-bracelet/
Disallow: /design-your-own-jewelry/simple-necklace/
Your help with this is very much appreciated.
-
First off, welcome!
Second, I would say don't worry at all about your call to action button. Every eCommerce site has call to action buttons on every page (i.e. "Add to Cart"). The pages that happen after that don't matter to search engines with one caveat...just make sure you have a properly configured Robots.txt file.
Third. If all of your product pages are indeed trying to capitalize on the same key terms....yes, you have a duplicate content problem. Don't wait for a tool to tell you what you already know in your heart!
Take some time to prioritize, then start rewriting your product pages to use a wider variety of keywords (particularly long-tail) that better describe the products you offer.
I hope this helps!