Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How do I complete a reverse DNS check when completing log file analysis?
-
I'm doing some log file analysis and need to run a reverse DNS check to ensure that I'm analysing logs from Google and not any imposters. Is there a command I can use in terminal to do this?
If not, what's the best way to verify Googlebot?
Thanks
-
That's awesome! Glad to know there's a bulk tool out there!
-
Hi Tyler,
Thanks for your reply. I managed to get down to 98 unique IPs and ran a bulk reverse DNS/IP Look-up using this tool:
https://www.infobyip.com/ipbulklookup.php
Thanks for your help though!
-
Hey Daniel,
If you want to verify that a user-agent is actually Googlebot, you'll want to use a log file analysis tool to aggregate all of the IP addresses associated with Googlebot. Once you have a list of IP addresses, you can perform a reverse DNS lookup to verify whether the IP addresses are actually associated with Googlebot or not.
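If you don't have a log file analysis tool handy, the aggregation step can be roughly sketched in Python. This is only a minimal sketch: it assumes your server writes the common "combined" log format, and the sample lines and the `googlebot_ips` name are made up for illustration. Adjust the pattern to whatever your logs actually look like.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# IP ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical sample lines; in practice, iterate over your own log file.
SAMPLE_LINES = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /a HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Oct/2023:13:56:01 +0000] "GET /b HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2023:13:57:12 +0000] "GET /a HTTP/1.1" 200 2326 "-" '
    '"Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '198.51.100.4 - - [10/Oct/2023:13:58:40 +0000] "GET /a HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]


def googlebot_ips(lines):
    """Count requests per IP for lines whose user-agent claims to be Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # The user-agent is only a claim; the IPs still need a DNS check.
        if match and "googlebot" in match.group("agent").lower():
            counts[match.group("ip")] += 1
    return counts


for ip, hits in googlebot_ips(SAMPLE_LINES).most_common():
    print(ip, hits)
```

The output is just the list of IPs that say they are Googlebot, with a hit count for each, which is the list you would then run through reverse DNS.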
If you're on Windows, these steps should work:
https://www.serverintellect.com/support/dns/reverse-dns/
If you're on a Mac, try these steps:
1. Open Terminal.
2. Type "host" followed by the IP address, for example: "host 66.249.66.1"
3. Hit Enter.
4. View the results. For example: "1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com"
If the hostname in the results ends in google.com or googlebot.com, the IP most likely belongs to Google. To be sure, run "host" again on that hostname and check that it points back to the same IP address, since whoever controls an IP block can set its reverse DNS record to anything. Unfortunately, I don't know of any faster ways to achieve these results. I'm sure there's a tool out there, I just haven't found it yet.
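For more than a handful of IPs, the same check can be scripted. Below is a rough sketch using Python's standard `socket` module: it does the reverse lookup, checks the hostname suffix, then does a forward lookup to confirm the hostname resolves back to the original IP. The `is_googlebot` name and the suffix list are my own assumptions (the suffixes are just the two domains mentioned above), and `gethostbyname_ex` only handles IPv4, so treat this as a starting point rather than a finished tool.

```python
import socket

# Suffixes assumed to be Google-owned; check Google's documentation
# for the current list before relying on this.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_googlebot(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a Google host that resolves back to ip.

    The reverse/forward parameters default to the real DNS lookups and can be
    swapped for stand-ins when testing offline.
    """
    try:
        hostname = reverse(ip)[0]  # reverse (PTR) lookup, like "host <ip>"
    except OSError:
        return False  # no PTR record, or the lookup failed
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]  # forward lookup, like "host <hostname>"
    except OSError:
        return False
    # Only trust the hostname if it points back to the IP we started from.
    return ip in addresses
```

Calling `is_googlebot("66.249.66.1")` runs both lookups against your configured DNS resolver. To check a whole list, loop over the IPs and keep the ones that return True.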
This might also be a good resource for you: https://support.google.com/webmasters/answer/80553?hl=en
Good luck!
-Tyler