Why isn't our new site being indexed?
-
We built a new website for a client recently.
Site: https://www.woofadvisor.com/
It's been live for three weeks. Robots.txt isn't blocking Googlebot or anything.
Submitted a sitemap.xml through Webmasters but we still aren't being indexed.
Anyone have any ideas?
-
Hey Dirk,
No worries - I visited the question for the first time today and considered it unanswered, as the site is perfectly accessible in California. I like to confirm what Search Console says, as that is 'straight from the horse's mouth'.
Thanks for confirming that the IP redirect has changed; that is interesting. It is impossible for us to know when that happened - I would have expected things to get indexed quite fast once it changed.
With the extra info I'm happy to mark this as answered, but it would be good to hear from the OP.
Best,
-Tom
-
Hi Tom,
I am not questioning your knowledge - I re-ran the test on webpagetest.org and I see that the site is now accessible from a Californian IP (http://www.webpagetest.org/result/150911_6V_14J6/), which wasn't the case a few days ago (check the result on http://www.webpagetest.org/result/150907_G1_TE9/) - so there has been a change to the IP redirection. I also checked from Belgium - the site is now accessible from here as well.
I also notice that if I now do a site:woofadvisor.com search in Google I get 19 pages indexed, rather than the 2 I got a few days ago.
Apparently removing the IP redirection solved (or is solving) the indexation issue - but this question still remains marked as "unanswered".
rgds,
Dirk
-
I am in California right now and can access the website just fine, which is why I didn't mark the question as answered - I don't think we have enough info yet. I think the 'Fetch as Googlebot' tool will help us resolve that.
You are correct that if there is no robots.txt then Google assumes the site is open, but my concern is that the developers on the team say that there IS a robots.txt file there and that it has some contents. I have, on at least two occasions, come across a team that was serving a robots.txt that was only accessible to search bots (once they were doing that 'for security', another time because they misunderstood how it worked). That is why I suggested checking Search Console to see what shows up for robots.txt.
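For anyone who wants a quick sanity check before opening Search Console, here is a rough Python sketch that compares the HTTP status of robots.txt for a browser-like User-Agent and a Googlebot-like one. The function names are made up for illustration. It only changes the User-Agent header, not the IP address the request comes from, so it cannot reproduce what Googlebot really sees; Search Console stays the authoritative source.

```python
import urllib.error
import urllib.request

# Illustrative User-Agent strings; the second is the commonly published Googlebot token.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def robots_status(site, user_agent):
    """Return the HTTP status code of <site>/robots.txt for the given User-Agent."""
    request = urllib.request.Request(
        site.rstrip("/") + "/robots.txt",
        headers={"User-Agent": user_agent},
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        return err.code


def served_differently(statuses):
    """True when the status codes collected per User-Agent are not all the same."""
    return len(set(statuses.values())) > 1
```

To use it, collect something like `{"browser": robots_status(site, BROWSER_UA), "googlebot": robots_status(site, GOOGLEBOT_UA)}` and pass it to `served_differently`. A mismatch (say 404 for the browser and 200 for the bot) would be something concrete to show the developers.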
-
To be very honest - I am quite surprised that this question is still marked as "Unanswered".
The owners of the site decided to block access for all non-UK/Ireland IP addresses. The main Googlebot uses a Californian IP address to visit the site. Hence, the only page Googlebot can see is https://www.woofadvisor.com/holding-page.php, which has no links to the other parts of the site (this is confirmed by the webpagetest.org test with a Californian IP address).
As Google indicates, Googlebot can also use other IP addresses to crawl the site ("With geo-distributed crawling, Googlebot can now use IP addresses that appear to come from other countries, such as Australia."). However, it is very likely that these bots do not crawl with the same frequency/depth as the main bot (the article clearly indicates: "Google might not crawl, index, or rank all of your locale-adaptive content. This is because the default IP addresses of the Googlebot crawler appear to be based in the USA").
This can easily be solved by adding a link on /holding-page.php to the Irish/UK version, which contains the full content and is accessible from all IP addresses. Googlebot can then follow that link to index the full site (so only put the IP detection on the homepage, not on the other pages).
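The "IP detection on the homepage only" idea could look roughly like the Python sketch below. I don't know how the site actually implements its redirect, so the function name, the country codes and the way the visitor's country is obtained are all assumptions for illustration.

```python
# Assumed values: the thread only says UK/Ireland visitors get the full site.
ALLOWED_COUNTRIES = {"GB", "IE"}
HOLDING_PAGE = "/holding-page.php"


def redirect_target(path, country_code):
    """Return the path to redirect to, or None to serve the requested page.

    Only the homepage is geo-gated. Every other URL is served to any visitor,
    so a crawler arriving from a US IP can still follow a link from the
    holding page into the rest of the site.
    """
    if path != "/":
        return None  # deep pages stay reachable from any IP address
    if country_code in ALLOWED_COUNTRIES:
        return None  # target-country visitors see the normal homepage
    return HOLDING_PAGE
```

With this logic a US visitor requesting `/` is sent to the holding page, while a request for `/travel-tips.php` from the same IP is served normally.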
The fact that the robots.txt gives a 404 is not relevant: if no robots.txt is found Google assumes that the site can be indexed (check this link) - quote: "You only need a robots.txt file if your site includes content that you don't want Google or other search engines to index."
-
I'd be concerned about the 404ing robots.txt file.
You should check in Search Console:
- What does Search Console show in the robots.txt section?
- What happens if you fetch a page that is not indexed (e.g. https://www.woofadvisor.com/travel-tips.php) with the 'Fetch as Googlebot' tool?
I checked and do not see any obvious indicators of why the pages are not being indexed - we need more info.
-
I just did a quick check on your site with Webpagetest.org using a California IP address (http://www.webpagetest.org/result/150907_G1_TE9/) - as you can see there, these IPs also go to the holding page, which is logically the only page that can be indexed, as it's the only one Googlebot can access.
rgds,
Dirk
-
Hi,
I can't access your site from Belgium - I guess you are redirecting your users based on IP address. If, like me, they are not located in your target country, they are 302 redirected to https://www.woofadvisor.com/holding-page.php, and only 1 page is indexed.
Not sure which country you are actually targeting - but could it be that you're accidentally redirecting Googlebot as well?
Also check this article from Google on IP-based targeting.
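If you want to see the 302 for yourself, a small Python sketch like the one below requests a URL without following redirects and checks whether the first response points at the holding page. The function names are hypothetical, and the result depends entirely on where the script runs: from inside the target country you would not see the redirect at all.

```python
import http.client
from urllib.parse import urlsplit

HOLDING_PATH = "/holding-page.php"  # path reported in this thread
REDIRECT_CODES = (301, 302, 303, 307, 308)


def first_response(url, user_agent="Mozilla/5.0"):
    """Return (status, Location header) for url without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("GET", target, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def lands_on_holding_page(status, location):
    """True when the response is a redirect whose target is the holding page."""
    if status not in REDIRECT_CODES or not location:
        return False
    return urlsplit(location).path == HOLDING_PATH
```

Calling `lands_on_holding_page(*first_response("https://www.woofadvisor.com/"))` from a server outside the UK/Ireland would show whether visitors from that location are still being bounced.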
rgds
Dirk
-
Strangely, there are two pages indexed on Google Search: the homepage and one other.
-
I noticed the robots.txt file returned a 404 and asked the developers to take a look and they said the content of it is fine.
Sometimes developers say this stuff. If you are getting a 404, demonstrate it to them.
-
I noticed the robots.txt file returned a 404 and asked the developers to take a look and they said the content of it is fine.
But yes, I'll doublecheck the WordPress settings now.
-
Your sitemap all looked good, but when I tried to view the robots.txt file in your root it returned a 404, so I was unable to determine whether there was an issue. Could any of the settings in your WordPress installation be causing it to trip over?