High Number of Crawl Errors for Blog

Level2Designs

Hello All,

We have been seeing a very high number of crawl errors on websites that contain blogs. Here is a screenshot of one of the sites we are dealing with: http://cl.ly/image/0i2Q2O100p2v

Looking through the links that turn up in the crawl errors, the majority of them (roughly 90%) are auto-generated by the blog's system. This includes category/tag links, archive links, etc. A few examples:

http://www.mysite.com/2004/10/

http://www.mysite.com/2004/10/17/

http://www.mysite.com/tagname

As far as I know (please correct me if I'm wrong!), search engines will not penalize you for things like this that appear on auto-generated pages. Also, even if search engines did penalize you, I do not believe we can create a unique meta tag for auto-generated pages. Regardless, our client is very concerned about seeing this high number of errors in the reports, even though we have explained the situation to him.

Would anyone have any suggestions on how to either 1) tell Moz to ignore these types of errors or 2) adjust the website so that these errors no longer appear in the reports?

Thanks so much!

Rebecca

evolvingSEO

Hi Rebecca

What are the crawl errors exactly? From that report screenshot it looks like you have a variety of them, so the fixes will all be different.

Let me know, and in the meantime you might want to check out my article on Moz about setting up WordPress.

-Dan

Schwaab

It is true that you will most likely not be penalized for these pages. In my opinion, Google is pretty good at figuring out common canonicalization problems and would most likely not penalize you for having duplicate content. I would still encourage you to dig a little deeper and see what additional problems these pages could create.

Consider that Google will waste valuable crawl bandwidth on these meaningless pages rather than focusing on the important content you want it to crawl. If Google is crawling them, you can bet that PageRank is most likely flowing through these pages as well, diluting the link equity of your site.

Are you using WordPress? There are a lot of great plugins that can help you manage these pages. You could control how Google crawls these pages with your robots.txt file, by placing meta robots tags on the pages using a plugin, or by placing rel=canonical tags on the pages pointing back to the page that is the original source.
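To make the robots.txt and meta robots options a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `Disallow: /2004/` rule, the URL pattern, and the helper names (`is_date_archive`, `robots_meta_for`, `blocked_by_robots`) are hypothetical and based only on the example URLs in this thread, not on the real site. It uses Python's standard `urllib.robotparser`, which does simple prefix matching, so treat it as a sanity check rather than a model of how any particular crawler behaves.

```python
import re
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules. The path is an assumption taken from the
# example URLs in this thread, not the site's real configuration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /2004/
"""

# Matches date-archive paths such as /2004/10/ or /2004/10/17/
DATE_ARCHIVE = re.compile(r"^/\d{4}/\d{2}(/\d{2})?/?$")


def is_date_archive(url: str) -> bool:
    """Return True if the URL path looks like a year/month(/day) archive."""
    return bool(DATE_ARCHIVE.match(urlparse(url).path))


def robots_meta_for(url: str) -> str:
    """Return a meta robots tag for archive pages, or "" for normal pages."""
    if is_date_archive(url):
        # noindex keeps the page out of the index; follow still lets
        # crawlers follow the links on it.
        return '<meta name="robots" content="noindex, follow">'
    return ""


def blocked_by_robots(url: str, agent: str = "rogerbot") -> bool:
    """Check whether the sample robots.txt above disallows the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.mysite.com/2004/10/",
        "http://www.mysite.com/2004/10/17/",
        "http://www.mysite.com/tagname",
    ):
        print(url, is_date_archive(url), blocked_by_robots(url))
```

Note that a tag URL sitting at the site root, like http://www.mysite.com/tagname, is not caught by either rule here. Pages like that are usually easier to handle with a meta robots or rel=canonical tag set by a plugin than with a robots.txt path rule.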
