Crawl errors in GWT!
-
I have been seeing a large number of "access denied" and "not found" crawl errors. I have since fixed the issues causing these errors; however, I am still seeing them in Webmaster Tools.
At first I thought the data was outdated, but it's tracked on a daily basis!
Does anyone have experience with this? Does GWT really re-crawl all those pages/links every day to see if the errors still exist?
Thanks in advance for any help/advice.
-
Neither "access denied" nor "not found" crawl errors are dealbreakers as far as Google is concerned. A "not found" error usually just means you have links pointing to pages that don't exist (which is also how you can receive more errors than pages crawled: a "not found" error means a link to a page was crawled, but since no page exists at that URL, no page was crawled). "Access denied" is usually caused by either requiring a login or blocking search bots with robots.txt.
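If you suspect robots.txt is behind the "access denied" errors, you can check a sample URL against your rules locally before digging further. A minimal sketch using Python's standard library, with a hypothetical `Disallow` rule and placeholder URLs standing in for your own site:

```python
# Check whether a robots.txt rule would block a given URL for a given bot.
# The rules and example.com URLs below are placeholders for illustration.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",  # everything under /private/ is blocked for all bots
])

# A blocked URL vs. an allowed one:
print(rp.can_fetch("Googlebot", "https://www.example.com/private/page"))  # False
print(rp.can_fetch("Googlebot", "https://www.example.com/public/page"))   # True
```

In practice you would point `rp.set_url()` at your live robots.txt instead of parsing inline rules; if `can_fetch` returns `False` for URLs showing up as "access denied", robots.txt is the likely culprit.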
If the links causing the 404 errors aren't on your site, it's certainly possible that the errors would still be appearing. One thing you can do is double-check your 404 page to make sure it really returns a 404 status at the URL level. A common mistake I've seen all over the place is sites that 302-redirect every missing URL to a single "not found" page (like www.example.com/notfound). Because the original URL never returns a 404, bots will sometimes just keep crawling those links over and over again.
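You can verify this yourself by requesting a made-up URL on your site and looking at the raw status code without following redirects. A minimal sketch using Python's standard library; the host and path are placeholders for your own site:

```python
# Verify that a missing URL really returns a 404 at the URL level,
# rather than 302-redirecting to a catch-all "not found" page.
import http.client

def fetch_status(host, path):
    """Return the raw HTTP status code, without following redirects."""
    conn = http.client.HTTPSConnection(host, timeout=10)
    conn.request("HEAD", path)
    status = conn.getresponse().status
    conn.close()
    return status

def is_proper_404(status):
    """True not-found response; a 301/302 here usually signals a soft 404."""
    return status == 404

# Usage against a live site (placeholder URL):
#   status = fetch_status("www.example.com", "/definitely-not-a-page")
#   print(status, is_proper_404(status))
```

If the made-up URL comes back as a 301 or 302 instead of a 404, that's the soft-404 pattern described above, and it's worth fixing so bots stop re-crawling those links.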
Google doesn't necessarily crawl everything every day or update everything every day. If your traffic isn't being affected by these errors, I would try to minimize them as best you can and otherwise not worry too much about it.
-
Crawl errors can also come from links to those pages on other sites, or from URLs in Google's own index. When Google revisits those URLs and doesn't find them, it flags them as 404 errors.
-
BTW, the crawl stats show Google crawling about 3-10K pages a day, yet the daily errors number over 100K. Is this even possible? How can it find so many errors if the spiders aren't even crawling that many pages?
Thanks again!