Google crawl index issue with our website...
-
Hey there. We've run into a mystifying issue with Google's crawl index of one of our sites. When we do a "site:www.burlingtonmortgage.biz" search in Google, we're seeing lots of 404 Errors on pages that don't exist on our site or seemingly on the remote server.
In the search results, Google is showing nonsensical folders off the root domain and then the actual page is within that non-existent folder.
An example:
Google shows this in its index of the site (as a 404 Error page): www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp
The actual page on the site is: www.burlingtonmortgage.biz/idaho-mortgage-rates.asp
Google is showing the folder MQnjO that doesn't exist anywhere on the remote. Other pages they are showing have different folder names that are just as wacky.
We called our hosting company who said the problem isn't coming from them...
Has anyone had something like this happen to them?
Thanks so much for your insight!
Megan
-
Hi Keri. Thanks for following up. This turned out to be an issue with an auto-generated breadcrumbs script. I don't know what the intricacies of that were but we were able to remove it and get this issue straightened out.
Thanks again!
Megan
-
Hi Megan,
I'm following up on older questions that are marked unanswered. Did you ever get this figured out?
-
Megan,
Please check with your hosting company about adding this line to your .htaccess file:
ErrorDocument 404 /404.shtml
Here /404.shtml is your 404 page.
-
Thanks for your help on this, Wissam. Is this something that we need to have the hosting company set up on the server to ensure that these pages get returned as 404s?
-
Megan,
See here
http://markup.io/v/fyd9w4w9wmjr
When Googlebot crawls this page, your remote server tells it that the page is live and exists.
Solving the problem above might help you fix the original problem.
If the pages with the mystery folder do not exist, your remote server should return a 404 Not Found HTTP status to Googlebot.
-
Are we talking about one problem or two?
http://www.burlingtonmortgage.biz/contact.htm does not exist on the remote server (it was removed over a year ago). I see that there are similar errors for other old pages which were also previously removed. Should we have redirected those to the 404 page, since there are no related pages on the existing site?
I am not sure if the two problems have anything to do with one another. The pages with the "mystery folders" are existing pages. They just exist in the root. Why would Google be looking at them as if they were inside a subfolder?
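To make the "mystery folder" pattern easier to audit, here is a minimal Python sketch. It assumes you have a list of URLs copied from the site: search results and a list of paths that really exist in the root. The function name find_phantom_folder_urls and both input lists are hypothetical, not part of any tool mentioned in this thread. It flags indexed URLs that only match a real page once the first folder is dropped.

```python
from urllib.parse import urlsplit


def find_phantom_folder_urls(indexed_urls, real_paths):
    """Return (indexed_url, real_path) pairs where dropping the first
    folder of the indexed URL yields a page that really exists."""
    real = set(real_paths)
    matches = []
    for url in indexed_urls:
        # Accept URLs with or without a scheme, as written in the thread.
        full = url if "://" in url else "http://" + url
        path = urlsplit(full).path
        segments = [s for s in path.split("/") if s]
        if len(segments) < 2:
            continue  # Already a root-level page: no folder to strip.
        candidate = "/" + "/".join(segments[1:])
        if path not in real and candidate in real:
            matches.append((url, candidate))
    return matches


# Example data taken from the URLs quoted earlier in this thread.
indexed = [
    "www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp",
    "www.burlingtonmortgage.biz/idaho-mortgage-rates.asp",
]
real = ["/idaho-mortgage-rates.asp"]
phantoms = find_phantom_folder_urls(indexed, real)
```

A list like this only shows which indexed URLs are affected. It does not explain where the extra folder comes from, which still has to be traced in the site's own links or scripts.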
-
Megan,
I also noticed something. For example, the page http://www.burlingtonmortgage.biz/contact.htm shows a 404 error in its title and content, but the HTTP header returns 200 OK. You need to fix that.
I would assume that may be why Google started indexing the weird URLs generated from your site. And if it truly is a 404 page, Google is not picking that up, because the server reports it as a live page (200 OK).
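A page that reads like an error page but returns 200 OK is often called a "soft 404". As a rough way to check for it, here is a minimal Python sketch using only the standard library. The names ERROR_HINTS, looks_like_soft_404 and fetch_status_and_body are made up for illustration, and the hint phrases are guesses that would need adjusting to the wording of the real error page.

```python
from typing import Tuple
from urllib.error import HTTPError
from urllib.request import urlopen

# Hypothetical phrases that suggest an error page; adjust for the real site.
ERROR_HINTS = ("404", "not found", "page cannot be found")


def looks_like_soft_404(status: int, body: str) -> bool:
    """True when the server answers 200 but the page text reads like an error page."""
    text = body.lower()
    return status == 200 and any(hint in text for hint in ERROR_HINTS)


def fetch_status_and_body(url: str) -> Tuple[int, str]:
    """Fetch a URL and return (HTTP status, decoded body), also for error responses."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except HTTPError as err:
        # urllib raises on 4xx/5xx; the status and body are still available.
        return err.code, err.read().decode("utf-8", errors="replace")
```

To use it, pass the result of fetch_status_and_body(url) into looks_like_soft_404 for each removed page. A True result means the server still needs to be configured to send a real 404 status for that URL. The text match is crude, so a legitimate page that happens to contain "404" would be flagged as well.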
-
We use Dreamweaver.
-
Which CMS are you using?