"Issue: Duplicate Page Content " in Crawl Diagnostics - but these pages are noindex
-
I saw a question from 2011 about this and I'm experiencing the same issue: http://moz.com/community/q/issue-duplicate-page-content-in-crawl-diagnostics-but-these-pages-are-noindex
We have pages whose robots meta tag blocks everything for bots (noindex, nofollow), but they are still being reported as duplicate content. Any suggestions on how to exclude them from the Moz bot?
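For reference, a "no-everything" robots meta tag like the one described would presumably look something like this (hypothetical markup, not the poster's actual code):

```html
<!-- Tells compliant bots not to index the page, follow its links, or cache it -->
<meta name="robots" content="noindex, nofollow, noarchive">
```

Note that these directives control indexing behavior; they don't stop a crawler from fetching the page in the first place.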
-
Technically that could be done in your robots.txt file, but I wouldn't recommend a blanket disallow if you still want Google to crawl those pages. I'm not sure whether Rogerbot can be targeted that way. Sorry I couldn't be more help.
If one of the staffers doesn't respond here in the next few days, I would send them a support ticket for clarification.
If you decide to go with robots.txt, here is a resource from Google on implementing and testing it: https://support.google.com/webmasters/answer/156449?hl=en
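Because robots.txt rules are grouped by user-agent, you can in principle block only Moz's crawler while leaving Googlebot unaffected. A minimal sketch, assuming Rogerbot's user-agent token is `rogerbot` (per the Moz help docs) and that the slide pages live under a hypothetical `/slides/` path:

```txt
# Block only Moz's crawler from the slide pages.
User-agent: rogerbot
Disallow: /slides/

# All other crawlers (including Googlebot) may crawl everything.
User-agent: *
Disallow:
```

Crawlers use the most specific matching group, so Googlebot falls through to the `*` group and ignores the Rogerbot rule.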
-
Thanks for the information on Rogerbot. I understand the difference between the bots from Google and Moz.
Some errors reported in Moz are not real issues. For example, we use a responsive slider on the home page that generates its slides from specific pages. Those pages are tagged to block everything so they are invisible to bots, yet they still generate errors in the reports.
Is there any way to exclude some pages from the reports?
-
Don't forget that Rogerbot (Moz's crawler) is just a robot, not an index like Google. Google uses robots to gather the data, but the results we see come from an index. Rogerbot will crawl pages regardless of noindex or nofollow, since those meta directives control indexing, not crawling.
Here is more info on Rogerbot: http://moz.com/help/pro/rogerbot-crawler
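If you do go the robots.txt route, you can sanity-check which user-agents are blocked before deploying, using Python's standard-library robots.txt parser. This is a sketch under the same assumptions as above: the `rogerbot` user-agent token and the `/slides/` path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only Moz's crawler from /slides/.
rules = """
User-agent: rogerbot
Disallow: /slides/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot falls through to the "*" group and may fetch the slide pages;
# rogerbot matches its own group and may not.
print(parser.can_fetch("Googlebot", "http://example.com/slides/slide-1"))
print(parser.can_fetch("rogerbot", "http://example.com/slides/slide-1"))
```

Running this should show the Googlebot fetch allowed and the rogerbot fetch denied, confirming the rules only affect Moz's crawler.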