Google refuses to index our domain. Any suggestions?
-
A very similar question was asked previously. (http://www.seomoz.org/q/why-google-did-not-index-our-domain) We've done everything in that post (and comments) and then some.
The domain is http://www.miwaterstewardship.org/ and, so far, we have:
- put "User-agent: * Allow: /" in the robots.txt (We recently removed the "allow" line and included a Sitemap: directive instead.)
- built a few hundred links from various pages including multiple links from .gov domains
- properly set up everything in Webmaster Tools
- submitted site maps (multiple times)
- checked the "fetch as googlebot" display in Webmaster Tools (everything looks fine)
- submitted a "request re-consideration" note to Google asking why we're not being indexed
Webmaster Tools tells us that it's crawling the site normally and is indexing everything correctly. Yahoo! and Bing have both indexed the site with no problems and are returning results. Additionally, many of the pages on the site have PR0 which is unusual for a non-indexed site. Typically we've seen those sites have no PR at all.
If anyone has any ideas about what we could do I'm all ears. We've been working on this for about a month and cannot figure this thing out.
Thanks in advance for your advice.
-
You make excellent points. I'll escalate this to "the pros" and see if they're able to bring their guru powers to bear on the trouble.
Thanks again Ryan for all your advice. It is greatly appreciated.
-
Looking at the site I can confirm the following:
-
the home page is tagged index follow
-
the status code for the home page is 200, an OK response
-
the robots.txt file is valid and clear
-
your crawl reports look fine to me
-
you stated your sitemap is received and 73 of 75 pages are indexed
-
your site is clearly not in Google's index as a site:miwaterstewardship.org search shows nothing.
-
I looked at your sitemap. I am not familiar with .aspx sitemaps but it does contain valid html links which apparently is enough for Google to utilize
-
you stated your site is not under penalty, as per Google
The possibilities are:
-
one of the pieces of information we are depending on is incorrect
-
we are overlooking a key piece of information
-
our understanding of SEO in this case is not complete
-
there is an issue with Google which is preventing your site from being indexed.
At this point I would suggest using your 1 PRO question on this issue, and reference this Q&A thread. While I don't believe we missed anything we should get the team to look at this issue and rule out every last possibility.
-
-
Thanks for the suggestions, Ryan. All of the previous changes were made before Google did it's last crawl. Here's the other info...
URLs Submitted = 78 / URLs in Web Index = 75 / The sitemap status is green check mark and it was downloaded today--June 14.
Geographic target has not been selected. Google has always been able to determine the crawl rate. I just changed the preferred domain to www.miwaterstewardship.org. That's the only change made recently.
This is another piece that baffles me. Check out the crawl stats for the site here: http://netvantagemarketing.com/temp/miws-crawl-stats.png
The bot is crawling an average of 63 pages per day and there's crawling activity since the middle of March. STILL, though...the domain absolutely will not appear in the index.
We're working with the client right now to see if we can get the site changed to a new IP address. The thought is that perhaps Google has somehow historically blocked the IP that it lives on now and changing to a new IP might get us out of jail. SUPER long shot...but those are the kinds of things we're trying now.
Thanks again for the help.
-
I was able to verify your robots.txt file is correctly set up. Did you make the changes before Google crawled the site on Saturday? You are correct that your site is not in the index. I would guess the robots.txt file was not modified until after the crawl. If it was adjusted pre-crawl, below are the next steps you can take.
Let's take a fresh look at your site:
-
you have a valid robots.txt file
-
your home page has a valid "index, follow" tag (not necessary, but it doesn't hurt either)
-
you have checked WMT and confirmed your site was crawled
In WMT, Site Configuration > Sitempas there is a "URLs submitted" field and a "URLs in web index field". What are those numbers please?
- It's a bit far reaching but while you are there please go over to your Settings tab in WMT. Geographic Target should not be checked, or if it is then Target Users in US should be selected. Preferred domain should be "www.miwaterstewardship.org" and I would recommend "Let Google determine my crawl rate" option unless you have a specific reason for doing otherwise.
-
-
Well...we're still in the virtual doghouse so-to-speak...
I made the changes that Ryan suggested on Friday. Webmaster Tools reports that the GoogleBot crawled the site on Saturday and that everything is OK in its eyes. Google responded to our reconsideration request over the weekend and stated very specifically that no actions were taken by the Google Spam team which would affect the rankings of our site or domain.
Still, if you search info:miwaterstewardship.org in The Ol' Goog, it continues to report that the site does not exist in the Google database.
Are there any other ideas of what we might be able to try?
This is a Dot Net Nuke site. Is there a DNN setting somewhere which might indicate to Google that it should not report our domain in search results?
Thanks for the looks and the help.
-
I agree with Ryan, your backlinks look great, website is structured well and has a great looking design. No signs of you breaking any of googles TOS.
-
Thank you EGOL. I consider myself a student of SEO, definitely not a master.
The Q&A forums here have given me a great opportunity to learn about SEO. I have been spending all my time this past month un-learning all the bad information I gathered from the internet, and learning SEO the right way.
The Matt Cutts videos, SEOmoz webinars, blogs and pro Q&A have all been immensely helpful. Those resources, along with the replies you and other mozzers offer have provided me an incredibly rich learning experience.
Thanks for noticing.
-
Thanks, Ryan. I noticed this as well but figured I'd leave it alone since Webmaster Tools was telling me "all is well" with the robots.txt file. I'll add the code like you suggest and we'll see what happens.
Thanks again.
-
Ryan, you have been giving some really valuable answers. Nice. Keep up the great work!
-
Fix your robots.txt. It is not set up as you suggested.
http://www.miwaterstewardship.org/robots.txt
User-agent: * Sitemap: http://www.miwaterstewardship.org/SiteMap.aspx I am unsure of what action a search engine would take upon encountering your code, but based on your post it seems that it blocks all agents. The correct code would be:
User-agent: *
Disallow:Sitemap: http://www.miwaterstewardship.org/SiteMap.aspx For more information about robots.txt you can take a look at: [http://www.robotstxt.org/robotstxt.html](http://www.robotstxt.org/robotstxt.html)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Vanity URLs are being indexed in Google
We are currently using vanity URLs to track offline marketing, the vanity URL is structured as www.clientdomain.com/publication, this URL then is 302 redirected to the actual URL on the website not a custom landing page. The resulting redirected URL looks like: www.clientdomain.com/xyzpage?utm_source=print&utm_medium=print&utm_campaign=printcampaign. We have started to notice that some of the vanity URLs are being indexed in Google search. To prevent this from happening should we be using a 301 redirect instead of a 302 and will the Google index ignore the utm parameters in the URL that is being 301 redirect to? If not, any suggestions on how to handle? Thanks,
Technical SEO | | seogirl221 -
How to know how much pages are indexed on Google?
I have a big site, there are a way to know what page are not indexed? I know that you can use site: but with a big site is a mess to check page by page. This is a tool or a system to check a entire site and automatically find non-indexed pages?
Technical SEO | | markovald0 -
Clients domain expired - rankings lost - repurchased domain - what next?
Its only been 10 days and i have repurchased the domain name/ renewed. The who is info, website and contact information is all still the same. However we have lost all rankings and i am hoping that our top rankings come back. Does anyone have experience with such a crappy situation?
Technical SEO | | waqid0 -
New domain's Sitemap.xml file loaded to old domain - how does this effect SEO?
I have a client who recently changed their domain when they redesigned their site. The client wanted the old site to remain live for existing customers with links to the new domain. I guess as a workaround, the developer loaded the new domain's sitemap.xml file to the old domain. What SEO ramifications would this have if any on the primary (new) domain?
Technical SEO | | julesae0 -
I am trying to figure out why a website is not getting fully indexed by google. Any ideas?
I am trying to figure out why a website is not getting fully indexed by google. The website was built with Godaddy's website designer so maybe this is the problem. Originally, the internal links throughout the navigation were linked to “pages” within the site. I went in and changed all of these navigation links to point to the actual url links throughout the site instead of relative links pointing to pages on the server. I thought this would have solved the problem because I thought that perhaps google was not able to follow the original relative links. When I check to see how many pages are in the google index I still see the same #. What is going on? Should this website be rebuilt using more search engine friendly code like wordpress? Is there a simple fix that will enable google to find all of this content created by Godaddy design software? I appreciate any help offered. Here is the site- http://www.securehomeusa.com/
Technical SEO | | ULTRASEM0 -
Can JavaScrip affect Google's index/ranking?
We have changed our website template about a month ago and since then we experienced a huge drop in rankings, especially with our home page. We kept the same url structure on entire website, pretty much the same content and the same on-page seo. We kind of knew we will have a rank drop but not that huge. We used to rank with the homepage on the top of the second page, and now we lost about 20-25 positions. What we changed is that we made a new homepage structure, more user-friendly and with much more organized information, we also have a slider presenting our main services. 80% of our content on the homepage is included inside the slideshow and 3 tabs, but all these elements are JavaScript. The content is unique and is seo optimized but when I am disabling the JavaScript, it becomes completely unavailable. Could this be the reason for the huge rank drop? I used the Webmaster Tolls' Fetch as Googlebot tool and it looks like Google reads perfectly what's inside the JavaScrip slideshow so I did not worried until now when I found this on SEOMoz: "Try to avoid ... using javascript ... since the search engines will ... not indexed them ... " One more weird thing is that although we have no duplicate content and the entire website has been cached, for a few pages (including the homepage), the picture snipet is from the old website. All main urls are the same, we removed some old ones that we don't need anymore, so we kept all the inbound links. The 301 redirects are properly set. But still, we have a huge rank drop. Also, (not sure if this important or not), the robots.txt file is disallowing some folders like: images, modules, templates... (Joomla components). We still have some html errors and warnings but way less than we had with the old website. Any advice would be much appreciated, thank you!
Technical SEO | | echo10 -
How long does it take for Google to de-index urls?
Added the noindex meta tag to some pages on my site and I am wondering if anyone has any idea how long it will take to deindex the urls?
Technical SEO | | nicole.healthline0 -
Domain Alias
I have a client that picked up a bunch of keyword rich domain names and he wants to point them to his current corporate site as domain aliases. Could this in anyway negatively or positively effect his SEO? or ranking? Thanks - Kyle Chandler
Technical SEO | | kchandler1