Crawl Diagnostics Updates
-
I have several page types on my sites that I have blocked using the robots.txt file (e.g. emailafriend.asp, shoppingcart.asp, login.asp), but they are still showing up in crawl diagnostics as issues (e.g. duplicate page content, duplicate title tag). Is there a way to filter these issues out, or am I doing something wrong that causes them to show up?
- Ryan
-
Hi Ryan,
Try moving the sitemap line to the end and leave a blank line before it. Something like this:
User-agent: *
Disallow: /cgi-bin/
Disallow: /ShoppingCart.asp
Disallow: /SearchResults.asp...
...
Disallow: /mailinglist_subscribe.asp
Disallow: /mailinglist_unsubscribe.asp
Disallow: /EmailaFriend.asp
-
I added the pages that it was suggesting to the robots.txt file:
http://www.naturalrugco.com/robots.txt
Most of the pages listed in the high-priority errors within the Moz Analytics crawl diagnostics are the EmailaFriend.asp pages, which I've disallowed. Ex: http://www.naturalrugco.com/EmailaFriend.asp?ProductCode=AMB0012-parent
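One way to sanity-check rules like these locally is Python's standard `urllib.robotparser`. This is only a rough sketch: it parses the excerpt of rules quoted earlier in this thread (not the live file, and without the elided lines) and asks whether a crawler identifying itself as `rogerbot` may fetch the example URL. The user-agent token `rogerbot` and the helper name `is_blocked` are assumptions for illustration, and Moz's crawler may interpret the file differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of the rules quoted in this thread (elided lines are left out).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /ShoppingCart.asp
Disallow: /mailinglist_subscribe.asp
Disallow: /mailinglist_unsubscribe.asp
Disallow: /EmailaFriend.asp
"""


def is_blocked(url, user_agent="rogerbot"):
    """Return True if the rules above disallow `url` for `user_agent`.

    `rogerbot` is an assumed user-agent token, used here for illustration.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.naturalrugco.com/EmailaFriend.asp?ProductCode=AMB0012-parent"
    print(is_blocked(url))
    # This parser matches on a case-sensitive path prefix, so a lowercase
    # variant of the same path is not covered by the rule.
    print(is_blocked("http://www.naturalrugco.com/emailafriend.asp"))
```

If the first check reports the URL as blocked, the rule itself is probably fine, and the reported issues may simply date from a crawl that ran before the file was updated.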
-
Hi Ryan,
At the end of this page you will find several ways to block rogerbot from crawling pages: http://moz.com/help/pro/rogerbot-crawler
I hope it helps,
Istvan
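One thing to watch if you add a group addressed to rogerbot by name: under the usual robots.txt rules, a crawler follows only the most specific group that matches it, so the `*` rules are not inherited. I can't confirm exactly what the linked help page recommends, so the sketch below uses a hypothetical file and Python's standard `urllib.robotparser` just to illustrate that behaviour. The `rogerbot` token, the file contents, and the helper name `allowed` are assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: a general group plus a group addressed to rogerbot only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ShoppingCart.asp
Disallow: /EmailaFriend.asp

User-agent: rogerbot
Disallow: /EmailaFriend.asp
"""


def allowed(user_agent, path):
    """Return True if `user_agent` may fetch `path` under the file above."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # rogerbot uses only its own group, so /ShoppingCart.asp is not blocked for it.
    print(allowed("rogerbot", "/ShoppingCart.asp"))
    print(allowed("rogerbot", "/EmailaFriend.asp"))
    # A crawler with no group of its own falls back to the * group.
    print(allowed("otherbot", "/ShoppingCart.asp"))
```

So if you do add a rogerbot-specific group, repeat every Disallow line you want that crawler to respect.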