Duplicate Page Content
-
Hey Moz Community,
Newbie here. On my second week of Moz and I love it but have a couple questions regarding crawl errors. I have two questions:
1. I have a few pages with duplicate content but it say 0 duplicate URL's. How do I know what is duplicated in this instance?
2. I'm not sure if anyone here is familiar with an IDX for a real estate website. But I have this setup on my site and it seems as though all the links it generates for different homes for sale show up as duplicate pages.
For instance, http://www.handyrealtysa.com/idx/mls...tonio_tx_78258 is listed as having duplicate page content compared with 7 duplicate URLS:
http://www.handyrealtysa.com/idx/mls...tonio_tx_78247
http://www.handyrealtysa.com/idx/mls...tonio_tx_78253
http://www.handyrealtysa.com/idx/mls...tonio_tx_78245
http://www.handyrealtysa.com/idx/mls...tonio_tx_78261
http://www.handyrealtysa.com/idx/mls...tonio_tx_78258
http://www.handyrealtysa.com/idx/mls...tonio_tx_78260
http://www.handyrealtysa.com/idx/mls...tonio_tx_78260I've attached a screenshot that shows 2 of the pages that state duplicate page content but have 0 duplicate URLs. Also you can see somewhat about the idx duplicate pages.
rel="canonical" is functioning on these pages, or so it seems when I view the source code from the page.
Any help is greatly appreciated.
-
The contact-us page re-directs to a different URL (about-us/contact-us) but the original source code for just www.handyrealtysa.com/contact-us matches http://www.handyrealtysa.com/community & http://www.handyrealtysa.com/resources which has no content in the main area.
While a high percentage can be considered duplicates, our crawler will also take into account the main content area to see if anything matches there as well which in the above links are different outside of the navigation and header.
-
-
Can you provide me with a couple of pages that are similar but not flagged as a duplicate?
-
Thanks for the responses.
I used the page checker and is shows most of the IDX pages are 98% similar. This can't be good. I've posed the question to my IDX provider and await their answer.
With regards to the similar pages that show 0 duplicate URLs, what can I do to look into this? These seem to be non-IDX pages, so I could likely do more to fix the error in these pages.
Thanks again!
-
Campaigns have a 90% tolerance for duplicate content. This includes all the source code on the page and not just the viewable text. So if a URL is at least 90% similar in code to another URL, this warning will appear. Although the pages in question are may appear to be different on the front end, they are actually duplicates based on this percentage (at least the example URLs I checked in your campaigns.)
You can run your own tests using this tool: http://www.webconfs.com/similar-page-checker.php
We don't know what standard Google uses, but it's safe to say they are a bit more sophisticated than us - so you might be okay in this regard as long as you have a couple hundred words of unique text per page. Google won't say how much duplicate content is too much, so we like to be better safe than sorry.
Hope this helps!
-
Seeing your problem in an SEO viewpoint, it’s always best for a website not to have any duplicate content. So maybe try linking to the source of the listing on the IDX website.
Your rel="canonical" is in place and in the section where it needs to be.
The duplicate content maybe coming from what you are not doing, but what other similar sites are doing. How many other real-estate sites use the same identical keyword and description for the same listing as you? These similar listings on "other sites", could be the cause for the duplicate content issues on your site. I guess my question would be how many other sites have a house listed @ 20615 Wild Springs Dr, San Antonio, TX 78258 (MLS # 1034019) using the same address and description as you?
My understanding this is a common problem with IDX, not sure if this solves your problem, but may solve why you are having a duplicate content issue.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is it better to create more pages of content or expand on current pages of content?
I am assuming that one way of improving the rankings of current pages will be to create more content on the keywords used... should this be an expansion of the content on current pages I am optimising for a keyword or is it better to keep creating new pages and if we are creating new pages is it best to use an extension of the keyword on the new page – for example if we are optimising one page for ‘does voltage optimisation work’ would it then be worth creating a page optimised for ‘does voltage optimisation work in hotels’ for example and so on? I am guessing maybe both might help, this is just a question I have had from one of my clients.
On-Page Optimization | | TWSI1 -
Duplicated Content with joomla multi language website
Dear Seomoz Community I am running a multi language joomla website (www.siam2nite.com) with 2 active languages. The first and primary language is english. the second language is thai. Most of the content (articles, event descriptions ...) is in english only. What we did is a thai translation for the navigation bars, headers, titles etc (translation of all joomla language files) those texts are static and only help the user navigate / understand our site in their thai language. Now I facing a problem with duplicated content. Lets take our Q&A component as example. the url structure looks like this: english - www.siam2nite.com/en/questions/ thai - www.siam2nite.com/th/questions/ Every question asked will create two URL, one for each language. The content itself (user questions & answers) is identical on both URL's. Only the GUI language is different. If you take a look at this question you will understand what i mean: ENGLISH VERSION: http://www.siam2nite.com/en/questions/where-to-celebrate-halloween-in-bangkok THAI VERSION: http://www.siam2nite.com/th/questions/where-to-celebrate-halloween-in-bangkok As you can see each page has a unique title (H1) and introduction text in the correct language (same for menu, buttons, etc.) but the questions and answers are only available in one language. Now my question 😉 I guess Google will see this pages as duplicated content. How should I proceed with this problem: put all thai links /th/questions/ in the robots.txt and block them or make a canonical tag for the english versions? Not sure if I set a canonical tag google will still index the thai title and introduction texts (they have important thai keywords in them) Would really appreciate your help on this 😉 Regards, Menelik
On-Page Optimization | | menelik0 -
Duplicate content because of content scrapping - please help
We manage brands websites in a very competitive industry that have thousands of affiliate links We see that more and more websites (mainly affiliates websites) are scrapping our brand websites content and it generate many duplicate content (but most of them link to us back with an affiliate link). Our brand websites still rank for any sentence in brackets you search in Google, Will this duplicate content hurt our brand websites ? If yes, should we take some preventive actions ? We are not able to add ongoing UGC or additional text to all our duplicate content and trying to stop those websites of stealing our content is like playing cat and mouse... Thanks for your advices
On-Page Optimization | | Tit0 -
Why a page with an On Page A grade has a less good rank than a page with a F grade?
Why a page with an On Page A grade is ranked 17 in Google when the home page with a F grade is ranked 9 ? Thanks
On-Page Optimization | | Amadeus_eBC0 -
Number of characters to duplicate content
I wonder how much characters in a page title so it can be characterized for Googleas duplicate content?
On-Page Optimization | | imoveiscamposdojordao
Sorry for the English, I used Google Translator.
I'm from Brazil 😄
Thanks.0 -
Localised content/pages for identical products
I've got a question about localising the website of a nationwide company. We're a small dance school with nationwide (40 cities) coverage for around 40 products. Currently, we have one page for each product (style of dance), and one page for each city; the product pages cover keywords like 'cheerleading dance class' while the city pages target the 'london dance classes'-type keywords. To make 'localised product pages', I feel like we should make a page for every city/product combo 'London cheerleading classes' - but that seems like a nightmare for both writing sexy & original content, and link building/social stats. The other thing I can think of (which I refuse to do because it would look stupid & flag the page as keyword stuffed) is filling the page with the keyword phrases which are appropriate for every city. Is there another way to let google know 'this page is appropriate for these cities...'? We do currently list the cities a product is available in, but it doesn't seem to help local rankings very much. Would this just be a link building job, using hyper-targeted anchor texts (inc. city names) for each product? How do the pro's tackle this problem?
On-Page Optimization | | AlecPR0 -
Home Page Content - In a Div?
Is putting content in a div so it doesn't muck up the look of the home page create a problem in doing well organically? Example - http://www.callawaygardens.com. We have lots of clients that want no text on the home page and we are trying to figure out how to do this while still ranking well organically. What are your thoughts? Can we get in trouble? Are there negative impacts with SEO doing it like this? Thank you!
On-Page Optimization | | RezStream80 -
Filtered Navigation, Duplicate content issue on an Ecommerce Website
I have navigation that allows for multiple levels of filtering. What is the best way to prevent the search engine from seeing this duplicate content? Is it a big deal nowadays? I've read many articles and I'm not entirely clear on the solution. For example. You have a page that lists 12 products out of 100: companyname.com/productcategory/page1.htm And then you filter these products: companyname.com/productcategory/filters/page1.htm The filtered page may or may not contain items from the original page, but does contain items that are in the unfiltered navigation pages. How do you help the search engine determine where it should crawl and index the page that contains these products? I can't use rel=canonical, because the exact set of products on the filtered page may not be on any other unfiltered pages. What about robots.txt to block all the filtered pages? Will that also stop pagerank from flowing? What about the meta noindex tag on the filitered pages? I have also considered removing filters entirely, but I'm not sure if sacrificing usability is worth it in order to remove duplicate content. I've read a bunch of blogs and articles, seen the whiteboard special on faceted navigation, but I'm still not clear on how to deal with this issue.
On-Page Optimization | | 13375auc30