Suggested Screaming Frog configuration to mirror default Googlebot crawl?
-
Hi All,
Does anyone have a suggested Screaming Frog (SF) configuration to mirror default Googlebot crawl? I want to test my site and see if it will return 429 "Too Many Requests" to Google.
I have set the User Agent as Googlebot (Smartphone). Is the default SF Menu > Configuration > Speed > Max Threads 5 and Max URLs 2.0 comparable to Googlebot?
Context:
I had tried NetPeak SEO Spider which did a nice job and had a cool feature that would pause a crawl if it got to many 429. Long Story short, B2B site threw 429 Errors when there should have been no load on a holiday weekend at 1:00 AM.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google crawling 200 page site thousands of times/day. Why?
Hello all, I'm looking at something a bit wonky for one of the websites I manage. It's similar enough to other websites I manage (built on a template) that I'm surprised to see this issue occurring. The xml sitemap submitted shows Google there are 229 pages on the site. Starting in the beginning of December Google really ramped up their intensity in crawling the site. At its high point Google crawled 13,359 pages in a single day. I mentioned I manage other similar sites - this is a very unusual spike. There are no resources like infinite scroll that auto generates content and would cause Google some grief. So follow up questions to my "why?" is "how is this affecting my SEO efforts?" and "what do I do about it?". I've never encountered this before, but I think limiting my crawl budget would be treating the symptom instead of finding the cure. Any advice is appreciated. Thanks! *edited for grammar.
Intermediate & Advanced SEO | | brettmandoes0 -
[Very Urgent] More 100 "/search/adult-site-keywords" Crawl errors under Search Console
I just opened my G Search Console and was shocked to see more than 150 Not Found errors under Crawl errors. Mine is a Wordpress site (it's consistently updated too): Here's how they show up: Example 1: URL: www.example.com/search/adult-site-keyword/page2.html/feed/rss2 Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword/page2.html Example 2 (this surprised me the most when I looked at the linked from data): URL: www.example.com/search/adult-site-keyword-2.html/page/3/ Linked From: www.example.com/search/adult-site-keyword-2.html/page/2/ (this is showing as if it's from our own site) http://a-spammy-adult-site.com/search/adult-site-keyword-2.html Example 3: URL: www.example.com/search/adult-site-keyword-3.html Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword-3.html How do I address this issue?
Intermediate & Advanced SEO | | rmehta10 -
Screaming frog Advice
Hi I am trying to crawl my site and it keeps crashing. My sys admins keeps upgrading the virtual box it sits on and it now currently has 8GB of memory, but still crashes. It gets to around 200k pages crawl and dies. Any tips on how I can crawl my whole site, can u use screaming frog to crawl part of a site. Thanks in advance for any tips. Andy
Intermediate & Advanced SEO | | Andy-Halliday0 -
Pages with rel "next"/"prev" still crawling as duplicate?
Howdy! I have a site that is crawling as "duplicate content pages" that is really just pagination. The rel next/prev is in place and done correctly but Roger Bot and Google are both showing duplicated content + duplicate page titles & meta's respectively. The only thing I can think of is we have a canonical pointing back at the URL you are on - we do not have a view all option right now and would not feel comfortable recommending it given the speed implications and size of their catalog. Any experience, recommendations here? Something to be worried about? /collections/all?page=15"/>
Intermediate & Advanced SEO | | paul-bold0 -
Website is not indexed in Google, please help with suggestions
Our client website was removed from Google index. Anybody could recommend how to speed up process of re index: Webmaster tools done SM done (Twitter, FB) sitemap.xml done backlinks in process PPC done Robots.txt is fine Guys any recommendations are welcome, client is very unhappy. Thank you
Intermediate & Advanced SEO | | ThinkBDW0 -
Implementation of AJAX Crawling Specifications
My URL is: http://www.redfin.com/TX/Austin/8413-Navidad-Dr-78735/home/31224372 We're using Google's AJAX crawling system, per the documentation here. https://developers.google.com/webmasters/ajax-crawling/The example page above requires JavaScript to display content; it includes in the source. We have a lot of pages like this on our site.We expect Google to query us at this URL:http://www.redfin.com/TX/Austin/8413-Navidad-Dr-78735/home/31224372?escaped_fragment=This page renders correctly with JavaScript disabled.Are we doing this correctly? There are some small differences between the escaped_fragment HTML snapshot and the JavaScript-generated content. Will this cause any problems for us?We ask because there was a period of about two months (from October 4th to Dec 29th) during which Google's crawler radically decreased the hits to our escaped_fragment URLs; it's maybe recovering now, but maybe it isn't, and I wanted to be absolutely sure we're doing this correctly.
Intermediate & Advanced SEO | | RyanOD0 -
Website up for 3 months still not crawl
Hi, I had this website www.reliabledegree.com up for three months, but it is still not crawl by Google. All my on page optimization is grade A but I cannot see any single page crawl by Google. What can I do ? Please advise. Thanks. David
Intermediate & Advanced SEO | | Stevejobs20110 -
Googlebot + Meta-Refresh
Quick question, can Googlebot (or other search engines) follow meta refresh tags? Does it work anything like a 301 in terms of passing value to the new page?
Intermediate & Advanced SEO | | kchandler1