Crawl errors in GWT!
-
I have been seeing a large number of access denied and not found crawl errors. I have since fixed the issues causing these errors; however, I am still seeing them in Webmaster Tools.
At first I thought the data was outdated, but the data is tracked on a daily basis!
Does anyone have experience with this? Does GWT really re-crawl all those pages/links every day to see if the errors still exist?
Thanks in advance for any help/advice.
-
Neither access denied nor not found crawl errors are dealbreakers as far as Google is concerned. A not found error usually just means you have links pointing to pages that don't exist (this is how you can be receiving more errors than pages crawled - a not found error means that a link to that page was crawled, but since there's no page there, no page was crawled). Access denied is usually caused by either requiring a login or blocking the search bots with robots.txt.
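To rule out the robots.txt cause, here is a minimal sketch using Python's standard library. The robots.txt content, the URLs, and the helper name is_blocked_by_robots are made-up examples for illustration, not taken from the poster's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_blocked_by_robots(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text disallows user_agent from url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/private/page.html",
        "http://www.example.com/public/page.html",
    ):
        print(url, "blocked:", is_blocked_by_robots(SAMPLE_ROBOTS_TXT, url))
```

Pasting in your real robots.txt and a few of the URLs reported as access denied should show whether a Disallow rule is responsible. This only covers robots.txt; it will not detect pages that sit behind a login.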
If the links causing 404 errors aren't on your site it's certainly possible that errors would still be appearing. One thing you can do is double-check your 404 page to make sure it really is returning an error of 404: not found at the URL level. One common thing I've seen all over the place is that sites will institute a 302 redirect to one 404 page (like www.example.com/notfound). Because the actual URL isn't returning a 404, bots will sometimes just keep crawling those links over and over again.
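One way to run that check is sketched below, assuming Python and its standard library. The function names and the example URL are hypothetical. The key point is that the request must not follow redirects, so that a 302 to a shared "not found" page is visible instead of being hidden behind the final response.

```python
from http.client import HTTPConnection, HTTPSConnection
from urllib.parse import urlsplit


def classify_missing_page(status, location=None):
    """Judge the response a server gives for a URL that should not exist."""
    if status in (404, 410):
        return "ok: real not-found status"
    if status in (301, 302, 303, 307, 308):
        return f"problem: redirects to {location}"
    if status == 200:
        return "problem: soft 404 (200 returned for a missing page)"
    return f"check manually: status {status}"


def fetch_status(url):
    """Request url without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    conn_cls = HTTPSConnection if parts.scheme == "https" else HTTPConnection
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # http.client never follows redirects, so we see the first response.
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical URL that should not exist on the site being tested.
    status, location = fetch_status("http://www.example.com/this-page-should-not-exist")
    print(classify_missing_page(status, location))
```

Running it against a handful of the URLs listed in the crawl error report shows whether they return a real 404 at the URL itself or bounce through a redirect first.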
Google doesn't necessarily crawl everything every day or update everything every day. If your traffic isn't being affected by these errors, I would just try as best you can to minimize them, and otherwise not worry too much about it.
-
Crawl errors can also come from links to those pages on other sites or in Google's own index. When Google revisits those pages and does not find them, it flags them as 404 errors.
-
BTW, the crawl stats show Google crawling about 3-10K pages a day. The daily errors are numbering over 100K. Is this even possible? How can it find so many errors if the spiders are not even crawling that many pages?
Thanks again!
Related Questions
-
Does google still not crawl forms with a method=post?
I know that back in '08 Google started crawling forms using method=get, but not method=post. What's the latest? Is this still valid?
Intermediate & Advanced SEO | | Turkey0 -
Wordpress to HubSpot CMS - I had major crawl issues post launch and now traffic is down 400%
Hi there, good-looking person! Our traffic went from 12k visitors in July to 3k visitors in July. << www.thedsmgroup.com >> When we moved our site from WordPress to the HubSpot COS (their CMS), I didn't submit a new sitemap to Google Webmaster Tools. I didn't know that I had to... and to be honest, I've never submitted or re-submitted a sitemap to GWT. I have always built clean sites with fresh content and good internal linking and never worried about it. Yoast kind of took care of the rest, as all of my sites and our clients' sites were always on WordPress. Well, lesson learned. I got this message in GWT on June 27th: "http://www.thedsmgroup.com/: Increase in not found errors. Google detected a significant increase in the number of URLs that return a 404 (Page Not Found) error. Investigating these errors and fixing them where appropriate ensures that Google can successfully crawl your site's pages." One month after our site launched we had 1,000 404s on our website. Ouch. Google thought we had a 1,200-page website with only 200 good pages and 1,000 error pages. Not very trustworthy... We never had a 404 before this, as we added a plugin to WordPress that would 301 any 404 to the homepage, so we never had a broken link on our site. That is not ideal for UX, but as far as Google was concerned, our site was always clean. I submitted a new sitemap to GWT a few weeks ago, and we are moving in the right direction... but have I taken care of everything I need to? I'm not sure. Our traffic is still around 100 visitors per day, not 400 per day as it was before we launched the new site. Thoughts? I'm not totally freaking out or anything, but a month ago we ranked #1 and #2 for "marketing agency nj", and now we aren't in the top 100. I've never had a problem like this.
I added a few screen grabs from Google Webmaster Tools that should be helpful. Bottom line: have I done everything I need to, or do I need to do something with all of these "not found" error details that I have in GWT? None of these "not found" pages have any value and I'm not sure how Google even found them... For example: http://www.thedsmgroup.com/supersize-page-test/screen-shot-2012-11-06-at-2-33-22-pm Help! -Jason
Intermediate & Advanced SEO | | Charlene-Wingfield0 -
Strange 404s in GWT - "Linked From" pages that never existed
I’m having an issue with Google Webmaster Tools saying there are 404 errors on my site. When I look into my “Not Found” errors I see URLs like this one: Real-Estate-1/Rentals-Wanted-228/Myrtle-Beach-202/subcatsubc/ When I click on that and go to the “Linked From” tab, GWT says the page is being linked from http://www.myrtlebeach.com/Real-Estate-1/Rentals-Wanted-228/Myrtle-Beach-202/subcatsubc/ The problem here is that page has never existed on myrtlebeach.com, making it impossible for anything to be “linked from” that page. Many more strange URLs like this one are also showing as 404 errors. All of these contain “subcatsubc” somewhere in the URL. My Question: If that page has never existed on myrtlebeach.com, how is it possible to be linking to itself and causing a 404?
Intermediate & Advanced SEO | | Fuel0 -
How should I go about repairing 400,000 404 error pages?
My thinking is to make a list of the most linked-to and most trafficked error pages and just redirect those, but I don't know how to get all that data because I can't even download all the error pages from Webmaster Tools. Even then, how would I get backlink data except by checking each link manually? Are there any detailed step-by-step instructions on this that I missed in my Googling? Thanks for reading!!
Intermediate & Advanced SEO | | DA20130 -
Does Google make continued attempts to crawl an old page once it has followed a 301 to the new page?
I am curious about this for a couple of reasons. We have all dealt with a site that switched platforms without planning properly and now has 1,000s of crawl errors. Many of the developers I have talked to have stated very clearly that the .htaccess file should not be used for 1,000s of single redirects. I figured that if I only needed them in there temporarily, it wouldn't be an issue. Once Google follows a 301 from an old page to a new page, will it stop crawling the old page?
Intermediate & Advanced SEO | | RossFruin0 -
Correct URL Parameters for GWT?
Hi, I am just double-checking to see if these parameters are OK - I have added an attachment to this post. We are using an e-commerce store and dealing with faceted navigation, so I excluded a lot of parameters from being crawled as I didn't want them indexed. (They got indexed anyway!) Advice and recommendations on the use of GWT would be very helpful - please check my screenshot. thanks, B0gSmRu
Intermediate & Advanced SEO | | bjs20100 -
How concerning is a message from Google about an increase in server errors?
In the past few weeks I have started getting messages from Google Webmaster Tools about an increase in server errors. According to our R&D team, these messages come at times when our site has been down, and Google is not an accurate measure of the site's health. 1 - Are they correct, and is there a better tool to be using? 2 - Could we be harmed by Google occasionally running into this problem, which is then fixed within a few hours? Thanks!
Intermediate & Advanced SEO | | theLotter0