Duplicate Pages , Do they matter ?
-
I have been told buy the company who created my site that duplicate page warning are not a problem ? my site is small and only has 50 pages ( including product pages etc ) yet the crawl shows over 6500 duplicate pages am I right to be concerned?
-
Hey Stephen
You have every right to be concerned here and duplicate content is a big problem but don't take my word for it:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66359
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1235687Lets try and forget any SEO mumbo jumbo and just think about this in common sense terms:
- 50 pages with 6500 duplicates means there are 130 versions of each page - which one should Google return in search?
- If Google is crawling your page and has 6500 pages to crawl how are they supposed to know which page is the right one?
I am guessing there are not actually 6500 versions of your pages so in all likelihood, this is some kind of URL based duplication with the page being returned on several URLs (or 130 URLs). What is more worrying is that if the Moz crawler found these pages they are all obviously crawlable so the internal link structure and information architecture is all out of whack.
Without examples it is tough to give advice but you have various options
- Tidy up the URLs - remove any unnecessary URL parameters (this can be done in webmaster tools if your development company are unhelpful (as they certainly seem unknowledgeable and happy to hand out duff advice).
- Implement a canonical URL
- Improve the internal link structure so you only link to one version of each page (ideally combined with a canonical)
Ultimately we want just the 50 pages, a canonical URL, consistent internal linking and any confusing parameters dealt with in webmaster tools.
This kind of problem should be easily resolved and happy to give specific feedback if you wanted to post a link showing some examples.
Another good read:
http://moz.com/blog/fat-pandas-and-thin-contentHope that helps!
Marcus
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does Title Tag location in a page's source code matter?
Currently our meta description is on line 8 for our page - http://www.paintball-online.com/Paintball-Guns-And-Markers-0Y.aspx The title tag, however sits below a bunch of code on line 237 Does the location of the title tag, meta tags, and any structured data have any influence with respect to SEO and search engines? Put another way, could we benefit from moving the title tag up to the top? I "surfed 'n surfed" and could not find any articles about this. I would really appreciate any help on this as our site got decimated organically last May and we are looking for any help with SEO. NIck
Technical SEO | | Istoresinc0 -
Joomla creating duplicate pages, then the duplicate page's canonical points to itself - help!
Using Joomla, every time I create an article a subsequent duplicate page is create, such as: /latest-news/218-image-stabilization-task-used-to-develop-robot-brain-interface and /component/content/article?id=218:image-stabilization-task-used-to-develop-robot-brain-interface The latter being the duplicate. This wouldn't be too much of a problem, but the canonical tag on the duplicate is pointing to itself.. creating mayhem in Moz and Webmaster tools. We have hundreds of duplicates across our website and I'm very concerned with the impact this is having on our SEO! I've tried plugins such as sh404SEF and Styleware extensions, however to no avail. Can anyone help or know of any plugins to fix the canonicals?
Technical SEO | | JamesPearce0 -
Duplicate Page Content but where?
Hi All Moz is telling me I have duplicate page content and sure enough the PA MR mT are all 0 but it doesnt give me a link to this content! This is the page: http://www.orsgroup.com/index.php?page=Scanning-services But I cant find where the duplicate content is other than on our own youtube page which I will get removed here: http://www.youtube.com/watch?v=Pnjh9jkAWuA Can anyone help please? Andy
Technical SEO | | ORS-Group0 -
Duplicate Content
SEOmoz is reporting duplicate content for 2000 of my pages. For example, these are reported as duplicate content: http://curatorseye.com/Name=“Holster-Atlas”---Used-by-British-Officers-in-the-Revolution&Item=4158
Technical SEO | | jplill
http://curatorseye.com/Name=âHolster-Atlasâ---Used-by-British-Officers-in-the-Revolution&Item=4158 The actual link on the site is http://www.curatorseye.com/Name=“Holster-Atlas”---Used-by-British-Officers-in-the-Revolution&Item=4158 Any insight on how to fix this? I'm not sure where the second version of the URL is coming from. Thanks,
Janet0 -
Duplicate pages in Google index despite canonical tag and URL Parameter in GWMT
Good morning Moz... This is a weird one. It seems to be a "bug" with Google, honest... We migrated our site www.three-clearance.co.uk to a Drupal platform over the new year. The old site used URL-based tracking for heat map purposes, so for instance www.three-clearance.co.uk/apple-phones.html ..could be reached via www.three-clearance.co.uk/apple-phones.html?ref=menu or www.three-clearance.co.uk/apple-phones.html?ref=sidebar and so on. GWMT was told of the ref parameter and the canonical meta tag used to indicate our preference. As expected we encountered no duplicate content issues and everything was good. This is the chain of events: Site migrated to new platform following best practice, as far as I can attest to. Only known issue was that the verification for both google analytics (meta tag) and GWMT (HTML file) didn't transfer as expected so between relaunch on the 22nd Dec and the fix on 2nd Jan we have no GA data, and presumably there was a period where GWMT became unverified. URL structure and URIs were maintained 100% (which may be a problem, now) Yesterday I discovered 200-ish 'duplicate meta titles' and 'duplicate meta descriptions' in GWMT. Uh oh, thought I. Expand the report out and the duplicates are in fact ?ref= versions of the same root URL. Double uh oh, thought I. Run, not walk, to google and do some Fu: http://is.gd/yJ3U24 (9 versions of the same page, in the index, the only variation being the ?ref= URI) Checked BING and it has indexed each root URL once, as it should. Situation now: Site no longer uses ?ref= parameter, although of course there still exists some external backlinks that use it. This was intentional and happened when we migrated. I 'reset' the URL parameter in GWMT yesterday, given that there's no "delete" option. The "URLs monitored" count went from 900 to 0, but today is at over 1,000 (another wtf moment) I also resubmitted the XML sitemap and fetched 5 'hub' pages as Google, including the homepage and HTML site-map page. The ?ref= URls in the index have the disadvantage of actually working, given that we transferred the URL structure and of course the webserver just ignores the nonsense arguments and serves the page. So I assume Google assumes the pages still exist, and won't drop them from the index but will instead apply a dupe content penalty. Or maybe call us a spam farm. Who knows. Options that occurred to me (other than maybe making our canonical tags bold or locating a Google bug submission form 😄 ) include A) robots.txt-ing .?ref=. but to me this says "you can't see these pages", not "these pages don't exist", so isn't correct B) Hand-removing the URLs from the index through a page removal request per indexed URL C) Apply 301 to each indexed URL (hello BING dirty sitemap penalty) D) Post on SEOMoz because I genuinely can't understand this. Even if the gap in verification caused GWMT to forget that we had set ?ref= as a URL parameter, the parameter was no longer in use because the verification only went missing when we relaunched the site without this tracking. Google is seemingly 100% ignoring our canonical tags as well as the GWMT URL setting - I have no idea why and can't think of the best way to correct the situation. Do you? 🙂 Edited To Add: As of this morning the "edit/reset" buttons have disappeared from GWMT URL Parameters page, along with the option to add a new one. There's no messages explaining why and of course the Google help page doesn't mention disappearing buttons (it doesn't even explain what 'reset' does, or why there's no 'remove' option).
Technical SEO | | Tinhat0 -
Duplicate Content
Many of the pages on my site are similar in structure/content but not exactly the same. What amount of content should be unique for Google to not consider it duplicate? If it is something like 50% unique would it be preferable to choose one page as the canonical instead of keeping them both as separate pages?
Technical SEO | | theLotter0 -
Duplicate Page Issue
Dear All, I am facing stupid duplicate page issue, My whole site is in dynamic script and all the URLs were in dynamic, So i 've asked my programmer make the URLs user friendly using URL Rewrite, but he converted aspx pages to htm. And the whole mess begun. Now we have 3 different URLs for single page. Such as: http://www.site.com/CityTour.aspx?nodeid=4&type=4&id=47&order=0&pagesize=4&pagenum=4&val=Multi-Day+City+Tours http://www.tsite.com/CityTour.aspx?nodeid=4&type=4&id=47&order=0&pagesize=4&pagenum=4&val=multi-day-city-tours http://www.site.com/city-tour/multi-day-city-tours/page4-0.htm I think my programmer messed up the URL Rewrite in ASP.net(Nginx) or even didn't use it. So how do i overcome this problem? Should i add canonical tag in both dynamic URLs with pointing to pag4-0.htm. Will it help? Thanks!
Technical SEO | | DigitalJungle0 -
Duplicate page content errors in SEOmoz
Hi everyone, we just launched this new site and I just ran it through SEOmoz and I got a bunch of duplicate page content errors. Here's one example -- it says these 3 are duplicate content: http://www.alicealan.com/collection/alexa-black-3inch http://www.alicealan.com/collection/alexa-camel-3inch http://www.alicealan.com/collection/alexa-gray-3inch You'll see from the pages that the titles, images and small pieces of the copy are all unique -- but there is some copy that is the same (after all, these are pretty much the same shoe, just a different color). So, why am I getting this error and is there any best way to address? Thanks so much!
Technical SEO | | ketanmv
Ketan0