Why isn't our new site being indexed?
-
We built a new website for a client recently.
Site: https://www.woofadvisor.com/
It's been live for three weeks. Robots.txt isn't blocking Googlebot or anything.
Submitted a sitemap.xml through Webmasters but we still aren't being indexed.
Anyone have any ideas?
-
Hey Dirk,
No worries - I visited the question first time today and considered it unanswered as the site is perfectly accessible in California. I like to confirm what Search Console says as that is 'straight from the horses mouth'.
Thanks for confirming that the IP redirect has changed, that is interesting. It is impossible for us to know when that happened - I would have expected thing to get indexed quite fast when it changed.
With the extra info I'm happy to mark this as answered, but would be good to hear from the OP.
Best,
-Tom
-
Hi Tom,
I am not questioning your knowledge - I re-ran the test on webpagetest.org and I see that the site is now accessible for Californian ip (http://www.webpagetest.org/result/150911_6V_14J6/) which wasn't the case a few days ago (check the result on http://www.webpagetest.org/result/150907_G1_TE9/) - so there has been a change on the ip redirection. I also checked from Belgium - the site is now also accessible from here.
I also notice that if I now do a site:woofadvisor.com in Google I get 19 pages indexed rather than 2 I got a few days ago.
Apparently removing the ip redirection solved (or is solving) the indexation issue - but still this question remains marked as "unanswered"
rgds,
Dirk
-
I am in California right now, and can access the website just fine, which is why I didn't mark the question as answered - I don't think we have enough info yet. I think the 'fetch as googlebot' will help us resolve that.
You are correct that if there is no robots.txt then Google assumes the site is open, but my concern is that the developers on the team say that there IS a robots.txt file there and it has some contents. I have, on at least two occasions, come across a team that was serving a robots.txt that was only accessible to search bots (once they were doing that 'for security', another time because they mis-understood how it worked). That is why I suggested that Search Console is checked to see what shows up for robots.txt.
-
To be very honest - I am quite surprised that this question is still marked as "Unanswered".
The owners of the site decided to block access for all non UK / Ireland adresses. The main Googlebot is using a Californian ip address to visit the site. Hence - the only page Googlebot can see is https://www.woofadvisor.com/holding-page.php which has no links to the other parts of the site (this is confirmed by the webpagetest.org test with Californian ip address)
As Google indicates - Googlebot can also use other IP adresses to crawl the site ("With geo-distributed crawling, Googlebot can now use IP addresses that appear to come from other countries, such as Australia.") - however it's is very likely that these bots do not crawl with the same frequency/depth as the main bot (the article clearly indicates " Google might not crawl, index, or rank all of your locale-adaptive content. This is because the default IP addresses of the Googlebot crawler appear to be based in the USA).
This can easily be solved by adding a link on /holding-page.php to the Irish/UK version which contains the full content (accessible for all ip adresses) which can be followed to index the full site (so - only put the ip detection on the homepage - not on the other pages)
The fact that the robots.txt gives a 404 is not relevant: if no robots.txt is found Google assumes that the site can be indexed (check this link) - quote: "You only need a
robots.txt
file if your site includes content that you don't want Google or other search engines to index." -
I'd be concerned about the 404ing robots.txt file.
You should check in Search Console:
-
What does Search Console show in the robots.txt section?
-
What happens if you fetch a page that is no indexed (e.g. https://www.woofadvisor.com/travel-tips.php) with the 'Fetch as Googlebot' tool?
I checked and do not see any obvious indicators of why the pages are not being indexed - we need more info.
-
-
I just did a quick check on your site with Webpagetest.org with California IP address http://www.webpagetest.org/result/150907_G1_TE9/ - as you can see here these IP's also go to the holding page - which is logically the only page which can be indexed as it's the only one Googlebot can access.
rgds,
Dirk
-
Hi,
I can't access your site in Belgium - I guess you are redirecting your users based on ip address. If , like me, they are not located in your target country they are 302 redirected to https://www.woofadvisor.com/holding-page.php and there is only 1 page that is indexed.
Not sure which country you are actually targeting - but could it be that you're accidentally redirecting Google bot as well?
Check also this article from Google on ip based targeting.
rgds
Dirk
-
Strangely, there are two pages indexed on Google Search.
The homepage and one other
-
I noticed the robots.txt file returned a 404 and asked the developers to take a look and they said the content of it is fine.
Sometimes developers say this stuff. If you are getting a 404, demonstrate it to them.
-
I noticed the robots.txt file returned a 404 and asked the developers to take a look and they said the content of it is fine.
But yes, I'll doublecheck the WordPress settings now.
-
Your sitemap all looked good, but when I tried to view the robots.txt file in your root, it returned a 404 and so was unable to determine if there was an issue. Could any of your settings in your WordPress installation also be causing it to trip over.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Country and Language tags-Running an SEO audit on a site that definitely has more than one language, but nothing is pulling up. I don't quite understand href lang or how to go about it. HELP HELP!
Ran an SEO audit and I don't really understand country and language tags. For example, sony.com definitely has more than one language, but how do I seo check href lang ? Do I inspect the page? etc?
Technical SEO | | Mindgruver0 -
Site architecture? I've got a free user report, that shoots back a page with their data for them to share with co-workers and friends.
Hi, I have a site about to go online that users can run a free report that connects to their calendar app to get 12 months of statistics for their meetings, and then it shoots out a report. So they go to a.com/freereport and they get back a.zom/freereport/report/xxxxxx The content of those reports is different, but the structure is the same as it is a fun way to show off meeting stats to co-workers and friends. I don't see the point of Google indexing those as the traffic to those pages is going to be from social networks and viral, but I do want the backlink credit. Will I get backlink credit if I nofollow that folder? I am having a hard time deciding what to do seo wise and would love some thoughts and advice, what would you recommend? Do nothing fancy. Mark the report folder no follow. Try to do something with rel=cannonical to point those pages to the root page? Thoughts?
Technical SEO | | bwb0 -
Can the Hosting location of image files have a negative effect if 'off-site' such as on the devs own media server ?
Hi Can the Hosting location of image files have a negative effect if 'off-site' such as if they are on the developers own media server ? As opposed to on the actual websites server or file structure ? In the case i'm looking at the image files are hosted on a totally separate server (a media subdomain of the developers site server) from the subject sites dedicated server. Will engines still attribute the properties of files hosted in this manner to the main website (such as file name, alt attributes, etc etc) ? Or should they really be on the subject sites server own media folder ? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
.me vs .com for new personal blog site
Hi guys, this is my first ever post on SEOMoz (woo!) I have researched and I did see someone else ask something similar but I still wasn't clear, so i hope this question is not considered a duplicate and can go on to help other people too Enough waffle For various reasons I am moving our company blog to startup a personal blog instead and I have bought a couple of appropriate domain names in a firstname/lastname format for the new blog, basically: myname.me and iammyname.com My question is, which would you consider 'better', if either, for SEO? (bonus point: are there any other non-SEO factors I should consider?) Obviously the second name is longer, but it is a .com and I hear all the time that .com is king and .me is waaaay behind Ultimately I want to rank #1 for my name If it was your site and your blog and you had my choices which one would you go for? Many thanks for your help. I'm looking forward to being part of the SEOMoz community and learning a lot from you guys, cheers, Nick
Technical SEO | | NickDavis0 -
Why hasn't my sites indexed on opensiteexplorer.org changed in weeks?
Why hasn't my sites indexed on opensiteexplorer.org changed in weeks, even though I've done link-building like crazy?
Technical SEO | | AccountKiller0 -
Development site accidentally got indexed and now appears in SERPs. How to fix?
I work at a design firm, and we just redesigned a website for a client. When it came time for the coding, we initially built a development site to work out all the kinks before going live. Then we relaunched the actual site about a week ago. Here's the problem: Somehow, the developer who coded the site for us (a freelancer) allowed the development site to be indexed by Google. Now, when you enter the client's name into Google, the development site appears higher in the results pages than the real site! In fact, the real site isn't even in the top 50 search results. The client is understandably angry about this for multiple reasons. We quickly added a robots.txt file to the development site and a 301 redirect to the real site. However, that did seemed to have no effect on the problem. Any ideas on how to fix this mess? Thank you in advance!
Technical SEO | | matt-145670 -
Switching Site to a Domain Name that's in Use
I'm comfortable with the steps of moving a site to a new domain name as recommended by Google. However, in this case, the domain name I'm asked to move to is not really "new" ... meaning it's currently hosting a website and has been for a long time. So my question is, do I do this in steps and take the old website down first in order to "free up" the domain name in they eyes of search engines to avoid large numbers of 404s and then (in step 2) switch to the "new" domain in a few months? Thanks.
Technical SEO | | R2iSEO0 -
Duplicate titles OK if page don't need to rank well?
I know It is not a good idea to have duplicate titles across a website on pages as Google does not like this. Is it ok to have duplicate titles on pages that aren't being optimised with SERP's in mind? or could this have a negative effect on the pages that are being optimised?
Technical SEO | | iSenseWebSolutions0