Google crawl index issue with our website...
-
Hey there. We've run into a mystifying issue with Google's crawl index of one of our sites. When we do a "site:www.burlingtonmortgage.biz" search in Google, we're seeing lots of 404 errors on pages that don't exist on our site or, seemingly, on the remote server.
In the search results, Google is showing nonsensical folders off the root domain and then the actual page is within that non-existent folder.
An example:
Google shows this in its index of the site (as a 404 Error page): www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp
The actual page on the site is: www.burlingtonmortgage.biz/idaho-mortgage-rates.asp
Google is showing the folder MQnjO, which doesn't exist anywhere on the remote server. Other pages it's showing have different folder names that are just as wacky.
We called our hosting company who said the problem isn't coming from them...
Has anyone had something like this happen to them?
Thanks so much for your insight!
Megan -
Hi Keri. Thanks for following up. This turned out to be an issue with an auto-generated breadcrumbs script. I don't know what the intricacies of that were but we were able to remove it and get this issue straightened out.
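For anyone who finds this thread later: we don't know exactly what the breadcrumbs script was doing, but here is a minimal, hypothetical Python sketch of how a script that emits relative links (instead of root-relative ones like "/idaho-mortgage-rates.asp") can produce phantom-folder URLs once a crawler resolves them against the current page:

```python
from urllib.parse import urljoin

# The real page Googlebot is crawling (it lives in the site root).
page_url = "http://www.burlingtonmortgage.biz/idaho-mortgage-rates.asp"

# A buggy breadcrumb script emits a relative href containing a junk
# path segment instead of a root-relative path.
bad_href = "MQnjO/idaho-mortgage-rates.asp"

# A crawler resolves the relative link against the current page,
# yielding a URL inside a folder that never existed on the server.
resolved = urljoin(page_url, bad_href)
print(resolved)
# http://www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp
```

Googlebot then requests that resolved URL, and if the server answers it with anything other than a 404, the phantom URL ends up in the index.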
Thanks again!
Megan
-
Hi Megan,
I'm following up on older questions that are marked unanswered. Did you ever get this figured out?
-
Megan,
Please check with your hosting company about including this code in your .htaccess file:
ErrorDocument 404 /404.shtml
(/404.shtml is your 404 page.)
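One caveat worth adding to the line above (a sketch of the Apache directive, assuming an Apache host since .htaccess is in play): the error page must be given as a local path, not a full URL.

```apache
# In .htaccess — serve a custom error page while still returning
# the real 404 status code to crawlers.
ErrorDocument 404 /404.shtml

# Note: use a local path. Pointing ErrorDocument at a full URL
# ("http://example.com/404.shtml") makes Apache redirect instead,
# and the original 404 status code is lost.
```

If the hosting company uses a full URL here, Googlebot will see a redirect rather than a 404, which is exactly the kind of "live page" signal causing trouble in this thread.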
-
Thanks for your help on this, Wissam. Is this something that we need to have the hosting company set up on the server to ensure that these pages get returned as 404s?
-
Megan,
See here:
http://markup.io/v/fyd9w4w9wmjr
When Googlebot crawls this page, your remote server is telling it that the page is live and exists.
Solving that may help you fix the actual problem: if the pages with the mystery folders don't exist, your remote server should show Googlebot a 404 Not Found (HTTP header).
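You can check what status the server actually sends with something like `curl -I <url>` and reading the first line of the response. As a rough sketch of the distinction being described here (a "soft 404": the body says "not found" but the status is 200) — the helper functions and inputs below are illustrative, not a real site check:

```python
def parse_status(status_line: str) -> int:
    """Extract the numeric code from an HTTP status line."""
    # e.g. "HTTP/1.1 200 OK" -> 200
    return int(status_line.split()[1])

def is_soft_404(status_line: str, body: str) -> bool:
    """True when the body claims 'not found' but the status is 200.

    Crawlers mostly trust the status code, so a page like this is
    treated as a live, indexable page despite its error message.
    """
    return parse_status(status_line) == 200 and "not found" in body.lower()

# The broken situation in this thread: the error page returns 200 OK.
print(is_soft_404("HTTP/1.1 200 OK", "<h1>Page Not Found</h1>"))
# The correct behaviour: a missing page returns a real 404 status.
print(is_soft_404("HTTP/1.1 404 Not Found", "<h1>Page Not Found</h1>"))
```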
-
Are we talking about one problem or two?
http://www.burlingtonmortgage.biz/contact.htm does not exist on the remote server (it was removed over a year ago). I see that there are similar errors for other old pages that were also previously removed. Should we have redirected those to the 404 page, since there are no related pages on the existing site?
I'm not sure if the two problems have anything to do with one another. The pages with the "mystery folders" are existing pages; they just live in the root. Why would Google treat them as if they were inside a subfolder?
-
Megan,
I noticed something else. Take this page, for example: http://www.burlingtonmortgage.biz/contact.htm. Its title and content show a 404 error, but the HTTP header returns 200 OK. You need to fix that.
I'd assume that may be why Google started indexing weird URLs generated from your site. Even if it really is a 404 page, Google isn't picking that up, because the server says it's a live page (200 OK).
-
We use Dreamweaver.
-
Which CMS are you using?