404 Error on Spider Emulators
-
I recently began working at a company called Uncommon Goods. I ran a few different spider emulators on our homepage (uncommongoods.com) and saw a 404 error on SEO-browser.com, as well as URL errors on Summit Media's emulator and SEOmoz's crawler. It seems there is a serious problem here. How is this affecting our site from an SEO standpoint? What are the repercussions?
Also, I know we have a lot of JavaScript on our homepage. Is this causing the 404? Any advice would be much appreciated.
Thanks!
-Zack
-
Hey Zack,
It seems your website is now returning a 200, so you apparently managed to fix the problem.
Was the problem coming from the server configuration as I suggested?
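For what it's worth, here's a quick Python 3 sketch (just an illustrative check, using your homepage URL) that requests the page the way a plain, non-JavaScript client would and prints the status code the server sends back; you could rerun it after any deploy:

import urllib.error
import urllib.request

URL = "http://www.uncommongoods.com/"

# Request the page the way a plain crawler would (no JavaScript is executed)
# and report the status code the server sends back.
try:
    with urllib.request.urlopen(URL, timeout=10) as resp:
        print(URL, "->", resp.status, resp.reason)   # expect "200 OK" now that it's fixed
except urllib.error.HTTPError as err:
    # urllib raises for 4xx/5xx responses; the code is still on the exception
    print(URL, "->", err.code, err.reason)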
Best regards,
Guillaume Voyer. -
Hi Zack,
Yes, having the homepage return a 404 error is a HUGE problem. It tells the engines that the page doesn't exist, so they will stop crawling it and eventually drop it from their index, even if it actually returns content.
You should solve this problem ASAP!
Best regards,
Guillaume Voyer. -
Hi Guillaume,
Your comments about JavaScript on the client side make complete sense to me now, and I will examine our Resin config with my IT team. Thanks for explaining. Also, per Beneeb's advice above, I'm going to try making some changes to robots.txt.
From a bigger-picture perspective, though, do you think this 404 error is even that big of a deal? Are we likely to be penalized for it in terms of PageRank, Domain Authority, etc.?
Thanks for your help!
-Zack
-
Hi Zack,
The 404 error has nothing to do with the robots.txt file; it has to do with your server configuration, as I said in my answers below.
About the robots.txt file, I would remove the Disallow: line if you don't need to block anything.
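If you want to double-check what crawlers actually take away from your robots.txt, here is a small Python 3 sketch using the standard-library parser (the paths are only examples, not a full audit):

from urllib.robotparser import RobotFileParser

# What a crawler reads from robots.txt: with no Disallow rule (or an empty
# one) under "User-agent: *", every path should come back as allowed.
rp = RobotFileParser()
rp.set_url("http://www.uncommongoods.com/robots.txt")
rp.read()

for path in ("/", "/sitemap.xml"):   # example paths only
    print(path, "allowed for *:", rp.can_fetch("*", path))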
Best regards,
Guillaume Voyer. -
Hi Beneeb,
That tool is awesome! It definitely helps, thanks! I'm going to show that report to my IT guys today. I think your guess is a very good one. Hopefully I can persuade them to make the changes and we'll see if it resolves the error.
Best Regards,
-Zack
-
Hi Zack,
To be honest with you, it was just a guess. I used a robots.txt syntax checker and saw several issues. You can check out that same tool here & run your current robots.txt file through it:
http://tool.motoricerca.info/robots-checker.phtml
I hope that points you in the right direction. I'm very new to SEO, but I've worked in the technical support world forever, so my suggestion is only worth what you paid for it.
-
Hi Beneeb,
Thank you for your insight. I think this makes sense, as I see there is some redundancy in robots.txt as it stands now. I'm curious, however: why do you think that changing robots.txt will resolve the 404 error?
Best Regards,
-Zack
-
Hi Zack,
Quick follow-up: your website always returns a 500 to HTTP/1.0 requests. With HTTP/1.1, the homepage returns a 404 and subpages return a 200.
I saw the website is running on a Resin server rather than an Apache server, so you might want to look into your Resin server's configuration.
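If you want to reproduce what I'm seeing, here is a rough Python 3 sketch that sends a bare GET once as HTTP/1.0 and once as HTTP/1.1 and prints the status line from each reply (the commented output is only an example of the pattern described above):

import socket

def status_line(host, version, path="/"):
    # Send a bare GET over a fresh socket and return the first line of the reply.
    request = (
        f"GET {path} {version}\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    with socket.create_connection((host, 80), timeout=10) as sock:
        sock.sendall(request.encode("ascii"))
        return sock.makefile("rb").readline().decode("ascii", "replace").strip()

host = "www.uncommongoods.com"
for version in ("HTTP/1.0", "HTTP/1.1"):
    print(version, "->", status_line(host, version))
# Illustrative output only (real results depend on the server):
#   HTTP/1.0 -> HTTP/1.1 500 Internal Server Error
#   HTTP/1.1 -> HTTP/1.1 404 Not Found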
Best regards,
Guillaume Voyer. -
Hi Zack,
Actually, when I use this HTTP header tool and enter http://www.uncommongoods.com/, I see that the header returned is in fact a 500 Internal Server Error.
The HTTP header is returned by the server even before the browser can know that there is JavaScript on the page, so it has nothing to do with JavaScript.
You'll have to look at the server side: an Internal Server Error and the HTTP headers are returned by the server, as opposed to JavaScript, which is executed client-side.
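As a quick illustration, this small Python 3 sketch does roughly what an HTTP header tool does: it asks the server for the headers only, so nothing client-side is ever involved (the values shown in comments are examples, not guaranteed output):

import http.client

# Ask the server for headers only (HEAD). Everything printed here is produced
# by the server before any JavaScript on the page could run.
conn = http.client.HTTPConnection("www.uncommongoods.com", 80, timeout=10)
conn.request("HEAD", "/")
resp = conn.getresponse()

print(resp.status, resp.reason)          # e.g. 500 Internal Server Error
for name, value in resp.getheaders():    # Server:, Content-Type:, and so on
    print(name + ": " + value)
conn.close()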
Best regards,
Guillaume Voyer. -
Hi Zack,
Looking at your robots.txt file, you have several errors. I would replace your current robots.txt file with the following:
User-Agent: *
Disallow:

Sitemap: http://www.uncommongoods.com/sitemap.xml
(not sure why the message truncated your sitemap file, but you get the picture)