Impact of "restricted by robots" crawler error in WT
-
I have been wondering about this for a while now with regards to several of my sites. I am getting a list of pages that I have blocked in the robots.txt file. If I restrict Google from crawling them, then how can they consider their existence an error? In one case, I have even removed the urls from the index.
And do you have any idea of the negative impact associated with these errors.
And how do you suggest I remedy the situation.
Thanks for the help
-
Google is just showing you a warning that hey, these are excluded, make sure that you want them excluded. They're not passing a judgement on whether or not they should be excluded. So, as long as they're excluded on purposes, no worries.
-
Hi Patrick,
That section is simply there to advice on any URLs that Google feels are wrongly excluded within the robots.txt
If the URLs are not wrongly excluded, don't worry about it showing in WMT's - it's there just as an advisory.
Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Are links still considered reciprocal if the link from one website is rel="nofollow" and the other isnt ?
Im working on a site that has some press coverage due in the next couple of days from quite a big site in the niche. The press outlet has requested that we link back to the content they post about us, they said the link can be rel="nofollow" if we'd prefer. Id really like to get the full benefit of the link back to our website, obviously if i did a straight link back to the 3rd party press site the links would be reciprocal and cancel each other out in terms of "link juice", but i was wandering if we make our link back to the 3rd party rel="nofollow" will we still get the full benefit of their link to us in terms of link juice ? ie. having the link back to them, but nofollow wouldn't been seen as a reciprocal link. ? (Obviously either way there is still benefit of having the link even if it reciprocal as it will send traffic to our site, but just no "link juice") Note - Ive used the phrase"Link Juice" for lack of a better term, any ideas on how else to refer to this ?
Technical SEO | | Sam-P1 -
Missing "Mobile Friendly" Tag in Google
Hi All, I have noticed that Google are not displaying a mobile friendly tag next to our website (www.wombatwebdesign.com). We made it responsive over a year ago and it is running on Joomla 3.X, as recommended by Google. I have run it through google checking tool and it confirms it is mobile friendly. So why no mobile friendly tag? Any ideas gratefully received. Thanks Fraser
Technical SEO | | fraserhannah0 -
If I want clean up my URLs and take the "www.site.com/page.html" and make it "www.site.com/page" do I need a redirect?
If I want clean up my URLs and take the "www.site.com/page.html" and make it "www.site.com/page" do I need a redirect? If this scenario requires a 301 redirect no matter what, I might as well update the URL to be a little more keyword rich for the page while I'm at it. However, since these pages are ranking well I'd rather not lose any authority in the process and keep the URL just stripped of the ".html" (if that's possible). Thanks for you help! [edited for formatting]
Technical SEO | | Booj0 -
The use of robots.txt
Could someone please confirm that if I do not want to block any pages from my URL, then I do not need a robots.txt file on my site? Thanks
Technical SEO | | ICON_Malta0 -
Fixing Crawl Errors
Hi! I moved my Wordpress blog back in August, and lost much of my site traffic. I recently found over 1000 crawl errors in Webmaster Tools because some of my redirects weren't transferred, so we are working on fixing the errors and letting Google know. I'm wondering how long I should expect for Google to recognize that the errors have been fixed and for the traffic to start returning? Thanks! Jodi - momsfavoritestuff.com
Technical SEO | | JodiFTM0 -
Google's "cache:" operator is returning a 404 error.
I'm doing the "cache:" operator on one of my sites and Google is returning a 404 error. I've swapped out the domain with another and it works fine. Has anyone seen this before? I'm wondering if G is crawling the site now? Thx!
Technical SEO | | AZWebWorks0 -
Google causing Magento Errors
I have an online shop - run using Magento. I have recently upgraded to version 1.4, and I installed a extension called Lightspeed, a caching module which makes tremendous improvements to Magento's performance. Unfortunately, a confoguration problem, meant that I had to disable the module, because it was generating errors relating to the session, if you entered the site from any page other than the home page. The site is now working as expected. I have Magento's error notification set to email - I've not received emails for errors generated by visitors. However over a 72 hour period, I received a deluge of error emails, which where being caused by Googlebot. It was generating an erro in a file called lightspeed.php Here is an example: URL: http://www.jacksgardenstore.com/tahiti-vulcano-hammock IP Address: 66.249.66.186 Time: 2011-06-11 17:02:26 GMT Error: Cannot send headers; headers already sent in /home/jack/jacksgardenstore.com/user/jack_1.4/htdocs/lightspeed.php, line 444 So several things of note: I deleted lightspeed.php from the server, before any of these error messages began to arrive. lightspeed.php was never exposed in the URL, at anytime. It was referred to in a mod_rewrite rule in .htaccess, which I also commented out. If you clicked on the URL in the error message, it loaded in the browser as expected, with no error messages. It appears that Google has cached a version of the page which briefly existed whilst Lightspeed was enabled. But I though that Google cached generated HTML. Since when does cache a server-side PHP file ???? I've just used the Fetch as Googlebot facility on Webmaster Tools for the URL in the above error message, and it returns the page as expected. No errors. I've had to errors at all in the last 48 hours, so I'm hoping it's just sorted itself out. However I'm concerned about any Google related implications. Any insights would be greatly appreciated. Thanks Ben
Technical SEO | | atticus70 -
Rel="canonical" for PFDs?
Hello there, We have a lot of PDFs that seem to end up on other websites. I was wondering if there was a way to make sure that our website gets the credit/authority as the original creator. Besides linking directly from the PDF copy to our pages, is anyone aware of strategy for letting Google know that we are the original publishers? I know search engines can index HTML versions of PDFs, so is there anyway to get them to index a rel="canonical" tag as well? Thoughts/Ideas?
Technical SEO | | Tektronix0