Should we use Google's crawl delay setting?
-
We’ve been noticing a huge uptick in Google’s spidering lately, and along with it a notable worsening of render times.
Yesterday, for example, Google spidered our site at a rate of 30:1 (google spider vs. organic traffic.) So in other words, for every organic page request, Google hits the site 30 times.
Our render times have lengthened to an avg. of 2 seconds (and up to 2.5 seconds). Before this renewed interest Google has taken in us we were seeing closer to one second average render times, and often half of that.
A year ago, the ratio of Spider to Organic was between 6:1 and 10:1.
Is requesting a crawl-delay from Googlebot a viable option?
Our goal would be only to reduce Googlebot traffic, and hopefully improve render times and organic traffic.
Thanks,
Trisha
-
Unfortunately you can't change crawl settings for Google in a robots.txt file, they just ignore it. The best way to rate limit them is using custom Crawl settings in Google Webmaster Tools. (look under Site configuration > Settings)
You also might want to consider using your loadbalancer to direct Google (and other search engines) to a "condomised" group of servers (app, db, cache, search) thereby ensuring your users arent inadvertantly hit by perfomance issues caused by over zealous bot crawling.
-
We're a publisher, which means that as an industry our normal render times are always at the top of the chart. Ads are notoriously slow to load, and that's how we earn our keep. These results are bad, though, even for publishing.
We're serving millions of uniques a month, on a bank of dedicated servers hosted off site, load balanced, etc.
-
more info on that here: http://www.robotstxt.org/
-
Wow! those are really high render times. Have you considered perhaps moving to another webserver? NginX is pretty damm fast, and could probably get those render times down. Also, are you on a shared host? or is this a dedicated server?
What you're looking for is the robots.txt file though, and you want to add some lines like this:
User-agent: * Disallow: Crawl-Delay: 10 User-agent: ia_archiver Disallow: / User-agent: Ask Jeeves Crawl-Delay: 120 User-agent: Teoma Disallow: /html/ Crawl-Delay: 120
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Geo ip filtering / Subdomain can't be crawled
My client has "load balancing" site traffic in the following way: domain: www.example.com traffic from US IP redirected to usa.example.com traffic from non-US IP redirected to www2.example.com The reason for doing this is that site contents on the www2 contains herbal medicine info banned by FDA."usa.example.com" is a "cleaned" site. Using HK IP, when I google an Eng keyword, I can see that www.example.com is indexed. When googling a Chi keyword, nothing is indexed - neither the domain or www2 subdomain. From Google Search Console, it shows a Dell Sonicwall geo ip filtering alert for www2 (Connection initiated from country: United States). GSC data also confirms that www2 has never been indexed by Google. Questions: Is geo ip filtering the very reason why www2 isn't indexed? What should I do in order to get www2 to be indexed? Thanks guys!
Technical SEO | | irene7890 -
MOZ says I'm better, but google lists me lower
I have a competitor that is ranking higher for a keyword in Google. But when I look at the MOZ On-Page Grader I get an 'A' for the keyword and they get an 'F'. Then when I look at OSE and Compare Link Metrics, I rank significantly higher on every single metric except google +1's. Any idea as to why or where I should be looking as to why I'm ranking lower in actual search results? Keyword: Tiny House Trailers
Technical SEO | | dlouche
My page: http://www.tinyhomebuilders.com/tiny-house-trailers
Competitor Page: [Removed] Thanks Dan0 -
How can I get the most out of uploading a print magazine to my client's website?
Hi Mozers, My client is just about to launch a print magazine for her watch business. There is so much valuable content in the magazine and we want to feature it on the website both for SEO purposes and also for those who prefer to read articles online instead of reading a physical magazine. My question is: what is the best method of displaying the magazine to get the most from search rankings and also to capitalise on the beautiful imagery from the magazine. The best option that I can think of is to upload the magazine as a flipbook and create a separate page on the website to display each article so that search engine crawlers can index the content. I do understand that this could be problematic if users are only spending time reading the flipbook and not so much time on the article pages. Do you guys have any suggestions about how to get the most out of this opportunity for my client? THANK YOU IN ADVANCE. Meaghan
Technical SEO | | StoryScout0 -
How should I close my forum in a way that's best for SEO?
Hi Guys, I have a forum on a subdomain and it is no longer used. (like forum.mywebsite.com) It kind of feels like a dead limb and I don't know what's best to do for SEO. Should I just leave it as it is and let it stagnate? There is a link in the nav menu to the main domain so users have a chance to find the main domain. Or should I remove it and just redirect the whole subdomain to the main domain? I don't know if redirects would work as I doubt most of the threads would match our articles, plus there are 700 of them. The main domain is PR3 and so is the forum subdomain. Please help!
Technical SEO | | HCHQ0 -
Hard-working newbie question: benefit of moving my blog to my online store's domain?
Hi all, I've been running an online wine store in Switzerland for a month and have been working hard on SEO (I love learning about it). Anyway, for a couple of years prior to launching the store, I had been running a wine blog whose articles are ranking well in Google. I now want to link the two. My questions are: A) will the addition of the blog (store.com/blog) contribute to the store's domain authority (currently, the blog authority is higher than the site authority)? B) technically, can I 301 the whole blog to store.com/blog? Any help and tips would be appreciated. Thank you!
Technical SEO | | fkupfer0 -
Google doesn't rank the best page of our content for keywords. How to fix that?
Hello, We have a strange issue, which I think is due to legacy. Generally, we are a job board for students in France: http://jobetudiant.net (jobetudiant == studentjob in french) We rank quite well (2nd or 3rd) on "Job etudiant <city>", with the right page (the one that lists all job offers in that city). So this is great.</city> Now, for some reason, Google systematically puts another of our pages in front of that: the page that lists the jobs offers in the 'region' of that city. For example, check this page. the first link is a competitor, the 3rd is the "right" link (the job offers in annecy), but the 2nd link is the list of jobs in Haute Savoie (which is the 'departement'- equiv. to county) in which Annecy is... that's annoying. Is there a way to indicate Google that the 3rd page makes more sense for this search? Thanks
Technical SEO | | jgenesto0 -
What to do about Google Crawl Error due to Faceted Navigation?
We are getting many crawl errors listed in Google webmaster tools. We use some faceted navigation with several variables. Google sees these as "500" response code. It looks like Google is truncating the url. Can we tell Google not to crawl these search results using part of the url ("sort=" for example)? Is there a better way to solve this?
Technical SEO | | EugeneF0 -
Removing a site from Google's index
We have a site we'd like to have pulled from Google's index. Back in late June, we disallowed robot access to the site through the robots.txt file and added a robots meta tag with "no index,no follow" commands. The expectation was that Google would eventually crawl the site and remove it from the index in response to those tags. The problem is that Google hasn't come back to crawl the site since late May. Is there a way to speed up this process and communicate to Google that we want the entire site out of the index, or do we just have to wait until it's eventually crawled again?
Technical SEO | | issuebasedmedia0