Magento - Google Webmaster Crawl Errors
-
Hi guys,
Started my free trial - very impressed - just thought I'd ask a question or two while I can.
I've set up the website for http://www.worldofbooks.com (large bookseller in the UK), using Magento.
I'm getting a huge amount of not found crawl errors (27,808), I think this is due to URL rewrites, all the errors are in this format (non search friendly): http://www.worldofbooks.com/search_inventory.php?search_text=&category=&tag=Ure&gift_code=&dd_sort_by=price_desc&dd_records_per_page=40&dd_page_number=1
As oppose to this format: http://www.worldofbooks.com/arts-books/history-of-art-design-styles/the-art-book-by-phaidon.html (the re-written URL).
This doesn't seem to really be affecting our rankings, we targeted 'cheap books' and 'bargain books' heavily - we're up to 2nd for Cheap Books and 3rd for Bargain Books.
So my question is - are these large amount of Crawl errors cause for concern or is it something that will work itself out? And secondly - if it is cause for concern will it be affecting our rankings negatively in any way and what could we do to resolve this issue?
Any points in the right direction much appreciated.
If you need any more clarification regarding any points I've raised just let me know.
Benjamin Edwards
-
I've added a picture of my crawl errors in SEOMoz
-
Keep in mind that it sounds like you're bloating your index. If anything about your URL is different (even one character) then spiders see that as a new page entirely. Even if you used canonical tags to reduce your index down to just single pages, you're still going to have bots spider all your pages to realize that you don't need that many indexed.
Did you look at the specific crawl errors in seomoz or are these in Google Webmaster?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Will Google crawl and rank our ReactJS website content?
We have 250+ products dynamically inserted and sorted on our site daily (more specifically our homepage... yes, it's a long page). Our dev team would like to explore rendering the page server-side using ReactJS. We currently use a CDN to cache all the content, which of course we would like to continue using. SO... will Google be able to crawl that content? We've read some articles with different ideas (including prerendering): http://andrewhfarmer.com/react-seo/
Technical SEO | | Jane.com
http://www.seoskeptic.com/json-ld-big-day-at-google/ If we were to only load the schema important to the page (like product title, image, price, description, etc.) from the server and then let the client render the remaining content (comments, suggested products, etc.), would that go against best practices? It seems like that might be seen as showing the googlebot 1 version and showing the site visitor a different (more complete) version.0 -
404 errors
Hi I am getting these show up in WMT crawl error any help would be very much appreciated | ?escaped_fragment=Meditation-find-peace-within/csso/55991bd90cf2efdf74ec3f60 | 404 | 12/5/15 |
Technical SEO | | ReSEOlve
| | 2 | mobile/?escaped_fragment= | 404 | 10/26/15 |
| | 3 | ?escaped_fragment=Tips-for-a-balanced-lifestyle/csso/1 | 404 | 12/1/15 |
| | 4 | ?escaped_fragment=My-favorite-yoga-spot/csso/5598e2130cf2585ebcde3b9a | 404 | 12/1/15 |
| | 5 | ?escaped_fragment=blog/c19s6 | 404 | 11/29/15 |
| | 6 | ?escaped_fragment=blog/c19s6/Tag/yoga | 404 | 11/30/15 |
| | 7 | ?escaped_fragment=Inhale-exhale-and-once-again/csso/2 | 404 | 11/27/15 |
| | 8 | ?escaped_fragment=classes/covl | 404 | 10/29/15 |
| | 9 | m/?escaped_fragment= | 404 | 10/26/15 |
| | 10 | ?escaped_fragment=blog/c19s6/Page/1 | 404 | 11/30/15 | | |0 -
Is there a way to see Crawl Errors older than 90 days in Webmaster Tools?
I had some big errors show up in November, but I can't see them anymore as the history only goes back 90 days. Is there a way to change the dates in Webmaster Tools? If not, is there another place I'd be able to get this information? We migrated our hosting to a new company around this time and the agency that handled it for us never downloaded a copy of all the redirects that were set-up on the old site.
Technical SEO | | b4cab0 -
My site is not being regularly crawled?
My site used to be crawled regularly, but not anymore. My pages aren't showing up in the index months after they've been up. I've added them to the sitemap and everything. I now have to submit them through webmaster tools to get them to index. And then they don't really rank? Before you go spouting off the standard SEO resolutions... Yes, I checked for crawl errors on Google Webmaster and no, there aren't any issues No, the pages are not noindex. These pages are index,follow No, the pages are not canonical No, the robots.txt does not block any of these pages No, there is nothing funky going on in my .htaccess. The pages load fine No, I don't have any URL parameters set What else would be interfereing? Here is one of the URLs that wasn't crawled for over a month: http://www.howlatthemoon.com/locations/location-st-louis
Technical SEO | | howlusa0 -
Strange Webmaster Tools Crawl Report
Up until recently I had robots.txt blocking the indexing of my pdf files which are all manuals for products we sell. I changed this last week to allow indexing of those files and now my webmaster tools crawl report is listing all my pdfs as not founds. What is really strange is that Webmaster Tools is listing an incorrect link structure: "domain.com/file.pdf" instead of "domain.com/manuals/file.pdf" Why is google indexing these particular pages incorrectly? My robots.txt has nothing else in it besides a disallow for an entirely different folder on my server and my htaccess is not redirecting anything in regards to my manuals folder either. Even in the case of outside links present in the crawl report supposedly linking to this 404 file when I visit these 3rd party pages they have the correct link structure. Hope someone can help because right now my not founds are up in the 500s and that can't be good 🙂 Thanks is advance!
Technical SEO | | Virage0 -
.htaccess and error 404
Hi, I permit to contact the community again because you have good and quick answer ! Yesterday, I lost the file .htaccess on my server. Right now, only the home page is working and the other pages give me this message : Not Found The requested URL /freshadmin/user/login/ was not found on this server Could you help me please? Thanks
Technical SEO | | Probikeshop0 -
404 Errors in Google Webmaster Tools
Hello, Google webmaster tools is returning our URLs as 404 errors: http://www.celebritynetworth.com/watch/D5GrrPEN9Oc/tom-mccarthy-floating/ When we enter the URL into the browser it loads the page just fine. Is there a way to determine why Google Webmaster Tools is returning a 404 error when the link loads perfectly fine in a browser? Thanks, Alex
Technical SEO | | Anti-Alex0 -
Google & Separators
This is not a question but something to share. If you click on all of these links and compare the results you will see why _ is not a good thing to have in your URLs. http://www.google.com/search?q=blue http://www.google.com/search?q=b.l.u.e http://www.google.com/search?q=b-l-u-e http://www.google.com/search?q=b_l_u_e http://www.google.com/search?q=b%20l%20u%20e If you have any other examples of working separators please comment.
Technical SEO | | Dan-Petrovic3