Why would our server return a 301 status code when Googlebot visits from one IP, but a 200 from a different IP?
-
I have begun a daily process of analyzing a site's web server log files and have noticed something that seems odd. For several of the IP addresses Googlebot crawls from, our server returns a 301 status code for every request, consistently, day after day. In nearly all cases, these are not URLs that should 301. When Googlebot visits from other IP addresses, the exact same pages are returned with a 200 status code.
Is this normal? If so, why? If not, why not?
I am concerned that our server returning an inaccurate status code is interfering with the site being effectively crawled as quickly and as often as it might be if this weren't happening.
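For anyone who wants to reproduce this kind of check, here is a minimal sketch of tallying status codes per IP for requests that claim to be Googlebot. It assumes the server writes the common Apache/nginx "combined" log format; the function name, the sample lines, and the IP addresses (taken from the documentation range) are made up for illustration and are not from the real logs.

```python
import re
from collections import Counter, defaultdict

# Assumption: "combined" log format. Adjust the pattern for any other format.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Made-up sample lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [10/Apr/2015:11:28:43 +0000] "GET /a HTTP/1.1" 301 0 "-" "%s"' % GOOGLEBOT_UA,
    '192.0.2.10 - - [10/Apr/2015:11:29:02 +0000] "GET /b HTTP/1.1" 301 0 "-" "%s"' % GOOGLEBOT_UA,
    '192.0.2.20 - - [10/Apr/2015:11:30:15 +0000] "GET /a HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT_UA,
    '192.0.2.30 - - [10/Apr/2015:11:31:40 +0000] "GET /a HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]


def tally_googlebot_statuses(lines):
    """Return {ip: Counter({status: count})} for user agents mentioning Googlebot."""
    per_ip = defaultdict(Counter)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if "googlebot" not in match.group("agent").lower():
            continue  # only requests that claim to be Googlebot
        per_ip[match.group("ip")][match.group("status")] += 1
    return dict(per_ip)


if __name__ == "__main__":
    for ip, statuses in sorted(tally_googlebot_statuses(SAMPLE).items()):
        print(ip, dict(statuses))
```

An IP whose tally contains only 301s, next to another IP with only 200s for the same URLs, is the pattern described above.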
Thanks guys!
-
Howdie,
Yes, I believe we got this sorted out. Interestingly, it wasn't any of the suggestions made here causing the 301 status code responses. I posted a thread in Google Webmaster Tools Forum regarding the issue and received a response that I am 99.5% sure is the correct answer.
Here is a link to that thread for future readers' reference: https://productforums.google.com/forum/#!mydiscussions/webmasters/zOCDAVudxNo
I believe the underlying issue has to do with incorrect handling of a redirect for this domain: ccisound.com
I am currently pursuing getting it corrected with our IT Director. Once the remedy is in place, I should know right away if it solves the issue I am seeing in the server logs. I'll post back here once I am 100% certain that was the issue.
Thanks all! This has been an interesting one for me!
-
Hi Dana, have you definitively sorted this out?
-
They are pretty detailed. I'll send you yesterday's in a zip file so you can take a look. I'm certain they have everything needed. Thanks Eric!
-
Right, a DNS manager could do a redirect, but that would not be visible in the web server log. It would only be visible in whatever is managing the DNS.
-
Depends what kind of DNS manager you are using. A redirect via DNS can still be possible.
In my experience DNS managing software can redirect users with 301 or 302 headers depending on what settings you have. If your DNS manager has a security protocol along with redirect rules, it could be causing the issue.
Examples of DNS redirects:
-
The request headers will also show which cookies, if any, the user has set. It looks like that is how your server determines whether to serve the client the desktop or mobile version.
-
How detailed are your log files? Can you see the user-agent (browser name)? Maybe you could ask your IT department to log request headers. If that would make the log files too big, they can probably do it only for the 'problem' IPs, or only for cases where the webserver returns a 301. I'll take a look if you like. Email is in my profile.
Best,
-Eric
-
Thanks so much Eric. Yes, I was thinking about the mobile version of our site being related to what I'm seeing too. However, I am unaware that we 301 redirect anything from the main site to the mobile site. In fact, users can actually switch to the mobile site via desktop by clicking "Mobile Site" in the footer and then browse the mobile version of the site via desktop. All of the URLs are identical.
Just out of curiosity I browsed to the mobile version of our site, grabbed a URL and then plugged it into "Fetch as Googlebot" in GWT. For all options, including desktop and the three mobile options, a status code of 200 was returned.
-
The problem can't be related to DNS. If the problem was related to DNS, the request would never make it to your server, and you would never see anything related to the request in your log files.
Because you can see it in your log file, it is definitely happening on your own webserver (not some external problem).
The requesting IP is probably not the problem, but it could be if your server automatically adds to a banned list any IP that requests more than X pages in Y time; your server might think this is a DoS (denial of service) attack. But if your server was set up to do this, your IT guys would probably know about it. This isn't something that is normally enabled 'out of the box'; someone would need to intentionally activate a behavior like that.
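To make the "more than X pages in Y time" idea concrete, here is a rough sketch of how such a ban list could work. The class name, thresholds, and the permanent-ban behavior are all assumptions for illustration; real modules differ in the details and in what they send back to a banned client.

```python
from collections import defaultdict, deque


class RateBan:
    """Hypothetical ban list: ban an IP once it exceeds max_requests per window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests      # the "X pages"
        self.window_seconds = window_seconds  # the "Y time"
        self.recent = defaultdict(deque)      # ip -> timestamps of recent requests
        self.banned = set()

    def allow(self, ip, now):
        """Record a request at time `now` (seconds); return False if the IP is banned."""
        if ip in self.banned:
            return False
        times = self.recent[ip]
        times.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        if len(times) > self.max_requests:
            self.banned.add(ip)  # assumption: the ban never expires
            return False
        return True
```

With `RateBan(3, 10)`, a fourth request from the same IP inside ten seconds gets that IP banned, and every later request from it is refused regardless of the URL. That would match "every request from this IP gets the same response", which is why it is worth ruling out.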
More likely, there is another common denominator besides the requester IP. I would guess that it's the user agent string (the browser or device the user is using).
Taking a quick look at what I think is your site, you have a mobile version available. Google of course would be interested in what your site looks like to a mobile browser, and would send a 'fake' user agent string pretending to be a mobile device (a cell phone, a tablet, etc.). If your server sees this request and tries to automatically redirect the browser to the mobile version of the site, then you would have your 301 code (which in this case is exactly what you intended, so you're all set!).
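Here is a sketch of the kind of server-side rule I have in mind. Everything in it is hypothetical: the token list, the `site_version` cookie name, and the `m.example.com` host are placeholders, not how your server is actually configured, and the user agent strings are simplified.

```python
# Assumption: a deliberately crude list of substrings that mark a mobile device.
MOBILE_TOKENS = ("mobile", "iphone", "android")

# Simplified user agent strings, for illustration only.
DESKTOP_BOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
MOBILE_BOT = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) Mobile Safari "
    "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def respond(path, user_agent, cookies=None):
    """Return (status, location) under a hypothetical mobile-redirect rule."""
    cookies = cookies or {}
    # A visitor who explicitly chose the desktop site is never redirected.
    if cookies.get("site_version") == "desktop":  # hypothetical cookie name
        return 200, None
    if any(token in user_agent.lower() for token in MOBILE_TOKENS):
        return 301, "https://m.example.com" + path  # hypothetical mobile host
    return 200, None


if __name__ == "__main__":
    print(respond("/page", DESKTOP_BOT))  # the desktop crawler gets the page
    print(respond("/page", MOBILE_BOT))   # the smartphone crawler gets redirected
```

Under a rule like this, the same URL returns 200 to the desktop crawler and 301 to the smartphone crawler. If the two crawlers happen to come from different IPs, the logs would look like an IP problem when the user agent is the real difference.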
There are probably a few other cases that could cause a 301 for just some IPs, but this is the only one that comes to mind at the moment.
Good Luck!
-
Here is the response from my IT Director regarding the possibility that this is being done by our DNS manager:
"I do not believe so. Our DNS does translation of human readable names to IP address. It has nothing to do with the status being returned to a browser, and even if it did it could not write to the log file."
Is this accurate? I understand that the DNS cannot write to the log file, but if the DNS can flag a request to receive a certain status code from the server, then this scenario would still be a possibility.
-
According to our IT Director we have no spam filters, no mod_security module, absolutely nothing on our server to prevent it from being crawled by bot, human or spider from any IP address, including black-listed IPs.
To me, other than the obvious (no security is probably not a good idea at all), that means the 301 status codes are being returned because of a problem with server setup.
I do have server logs that I'd be willing to share privately with anyone who's willing to take a gander. Don't worry, I won't send you a month's worth. 1-2 days should be plenty.
In the meantime I am going to dive in and take a look further. It's entirely possible that IPs from Google are not the only ones receiving nothing but 301 status codes in response to requests.
-
Thanks William. Good suggestion. I am on it! I'll post back here once I know more.
-
I would not be surprised if this was done by your DNS. If you use a DNS manager, they could possibly redirect certain users or IPs based on patterns of visits.
I suggest finding out more about any server configurations from the admin and seeing who they use as a DNS provider or manager.
-
Excellent thoughts! Yes, they are consistently the same IP addresses every time. There are several producing the same phenomenon, so I looked at this one: 66.249.79.174
According to what I can find online this is definitely Google and the data center is located in Mountain View, California. We are a USA company, so it seems unlikely that it is a country issue. It could be that this IP (and the others like it) are inadvertently being blocked by a spam filter.
It doesn't matter the day or time, every time Googlebot attempts to crawl from this IP address our server returns 301 status codes for every request, with no exceptions.
I am thinking I need to request a list of IP addresses being blocked by the server's spam filter. I am not a server administrator...would this be something reasonable for me to ask the people who set it up?
Is returning a 301 status code the best scenario for handling a bot attempting to disguise itself as Googlebot? I would think setting the server up to respond with a 304 would be better? (Sorry, that's kind of a follow-up "side" question)
Let me know your thoughts and I'm going to go see if I can find out more about the spam filter.
-
Where are the 301s taking Googlebot on those IP addresses? And are they the same IP addresses every time? Have you narrowed those IP addresses down to any particular datacenter/country? It could be possible there is some configuration with your server that treats IP addresses differently depending on the country... it could also be that the IP addresses getting the 301s are known blacklisted spam IP addresses but are masking themselves as Googlebot so your server's blacklist software is keeping them out. It's really hard to say without looking into the data myself but I'm definitely interested in what you find out.
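One way to check whether an IP in the logs really is Googlebot and not something masking itself: do a reverse DNS lookup on the IP, confirm the hostname ends in googlebot.com or google.com, then do a forward lookup on that hostname and confirm it returns the same IP. A minimal sketch follows; the function name and the injectable lookup parameters are my own, added so the logic can be tried without network access.

```python
import socket

GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve `ip`, check the domain, then forward-resolve to confirm.

    The two lookup parameters are hypothetical hooks for testing; by default
    the standard library resolver is used (IPv4 only in this sketch).
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False  # hostname is not in a Google crawler domain
        # The forward lookup must map back to the same IP, otherwise the
        # reverse record could have been set by anyone.
        return ip in forward_lookup(host)
    except OSError:
        return False  # lookup failed, so the IP cannot be verified
```

Calling `is_verified_googlebot("66.249.79.174")` on a machine with network access would run the real lookups. An IP that fails this check while sending a Googlebot user agent string is a likely impostor.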