Google crawl index issue with our website...
-
Hey there. We've run into a mystifying issue with Google's crawl index of one of our sites. When we do a "site:www.burlingtonmortgage.biz" search in Google, we're seeing lots of 404 Errors on pages that don't exist on our site or seemingly on the remote server.
In the search results, Google is showing nonsensical folders off the root domain and then the actual page is within that non-existent folder.
An example:
Google shows this in its index of the site (as a 404 Error page): www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp
The actual page on the site is: www.burlingtonmortgage.biz/idaho-mortgage-rates.asp
Google is showing the folder MQnjO that doesn't exist anywhere on the remote. Other pages they are showing have different folder names that are just as wacky.
We called our hosting company who said the problem isn't coming from them...
Has anyone had something like this happen to them?
Thanks so much for your insight!
Megan -
Hi Keri. Thanks for following up. This turned out to be an issue with an auto-generated breadcrumbs script. I don't know what the intricacies of that were but we were able to remove it and get this issue straightened out.
Thanks again!
Megan
-
Hi Megan,
I'm following up on older questions that are marked unanswered. Did you ever get this figured out?
-
Megan ,
Please check with your hosting company,
about this code to be included in htaccess
ErrorDocument 404 /404.shtml
/404.shtml its your 404 page
-
Thanks for your help on this Wissam. Is this something that we need to have the hosting company set-up on the server to ensure that these pages get returned as 404s?
-
Megan,
See here
http://markup.io/v/fyd9w4w9wmjr
Googlebot when It crawls this page, you remote server is telling Google Bot that its a Live page and this page Exists
The solution to the upper problem, might help you in fixing the actual problem.
If the Pages with the mystery folder Does not Exist .. your remote server should show google bot a 404 not found (http header).
-
Are we talking about one problem or two?
http://www.burlingtonmortgage.biz/contact.htm does not exist on the remote server (as it was removed over a year ago). I see that there are similar errors for other old pages which were also previously removed. Should we have redirected those to the 404 page since there are not related pages on the existing site?
I am not sure if the two problems have anything to do with one another. The pages with the "mystery folders" are existing pages. They just exist in the root. Why would google be looking at them as if they are inside sub folder?
-
Megan,
noticed something also for example this page http://www.burlingtonmortgage.biz/contact.htm . its showing a 404 error from title and content ... but the HTTP header is showing 200 ok. u need to fix that.
and would assume maybe thats why google started indexing weird URLs generating from your site... and if its true is a 404 page ..google is not picking it up because its showing its a Live page (200ok)
-
We use Dreamweaver.
-
Which CMS are you using?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Escort directory page indexing issues
Re; escortdirectory-uk.com, escortdirectory-usa.com, escortdirectory-oz.com.au,
Technical SEO | | ZuricoDrexia
Hi, We are an escort directory with 10 years history. We have multiple locations within the following countries, UK, USA, AUS. Although many of our locations (towns and cities) index on page one of Google, just as many do not. Can anyone give us a clue as to why this may be?0 -
IP Redirect causing Indexing Issue
Hi, I am trying to redirect any IP from outside India that comes to Store site (https://store.nirogam.com/) to Global Store site (https://global.nirogam.com/) using this methodThis is causing various indexing issues for Store site as Googlebot from US also gets redirected!- Very few pages for "store.nirogam.com/products/" are being indexed. Even after submission of sitemap it indexed ~50 pages and then went back to 1 page etc. Only ~20 pages indexed for now.- After this I tried manually indexing via "Crawl -> Fetch as Google" - but then it showed me a redirect to global.nirogam.com. All have their "status -> Redirected" - This is why bots are not able to index the site.What are possible solutions for this? How can we tell bots to index these pages and not get redirected?Will a popup method where we ask user if they are outside India help in solving this issue?All approaches/suggestions will be highly appreciated.
Technical SEO | | pks3330 -
Google only indexed 19/94 images
I'm using Yoast SEO and have images (attachments) excluded from sitemaps, which is the recommended method (but could this be wrong?). Most of my images are in my posts; here's the sitemap for posts: https://edwardsturm.com/post-sitemap.xml I also appear on p1 for some good keywords, and my site is getting organic traffic, so I'm not sure why the images aren't being indexed. Here's an example of a well performing article: https://edwardsturm.com/best-games-youtube-2016/ Thanks!
Technical SEO | | Edward_Sturm0 -
How to display the full structure of website on Google serps
I have been searching around but unable to gather information as to how we can control or list top pages of a website on Google's first page , i.e. if we type seomoz in google , we can see the main listing with 6 subdomain listings , which link to Blog , Seo tool , Beginner Seo guide , Learn Seo , Pricing & Plans and login My question is can we control these listings i.e. what to display and what not , and if yes how can we make this type of visibility on first page , by using html or xml sitemaps or theirs something mostly websites are missing. Cause this type of data is coming up for very less websites and mostly websites are with single urls. c43Ki.jpg
Technical SEO | | ngupta10 -
How do I get google to index the right pages with the right key word?
Hello I notice that even though I have a site map google is indexing the wrong pages under the wrong key words. As a result its not as relevant and is not ranking properly.
Technical SEO | | ursalesguru0 -
What to do about Google Crawl Error due to Faceted Navigation?
We are getting many crawl errors listed in Google webmaster tools. We use some faceted navigation with several variables. Google sees these as "500" response code. It looks like Google is truncating the url. Can we tell Google not to crawl these search results using part of the url ("sort=" for example)? Is there a better way to solve this?
Technical SEO | | EugeneF0 -
Removing some of the indexed pages from my website
I am planning to remove some of the webpages from my website and these webpages are already indexed with search engine. Is there any way by which I need to inform search engine that these pages are no more available.
Technical SEO | | ArtiKalra0 -
Every time google caches our site it shows no website.
Our site <cite>www.skaino.co.uk/</cite> seems to be having real issues with being picked up with Google. The site has been around for a long time but no longer even ranks on google if you search for the word 'Skaino'. This is odd as its hardly a competitive keyword. If I do a site:www.skaino.co.uk then it shows all the pages proving the site has been indexed. But if I do cache:www.skaino.co.uk it shows a blank cache. I'm starting to worry that Google isn't able to crawl our site properly. If it helps to clarify we have a flash site with a HTML site running underneath for those who cant view flash. Im wandering if I've missed something glaringly obvious. Is it normal to have a blank google cache? Thanks AJ
Technical SEO | | handygammon0