Roger bot taking a long time to crawl site
-
Hi all, I've noticed Roger bot is taking a long time to crawl my new site. It started on the 28th Feb 2013 and is still going. There aren't many pages at the moment. Any ideas please?
thanks a lot, Mark.
-
Hi Peter
Thanks for your reply. The crawl has now completed and given me some more areas to work on. It's a great tool.
I was so preoccupied with 'hiding' the site over the last couple of months with the easy code:
User-agent: *
Disallow: /
I hadn't thought beyond this.
I've noticed Google has now recognised the new robots.txt, which has allowed the sitemap to be accepted.
I'll look at your notes, thank you, and work out my next move. I'll let you know how I get on too.
I know (well, think) I have to set noindex, follow on 'sorted' category pages...
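A rough sketch of that idea, assuming the sorted views are identified by query parameters. The parameter names `dir` and `order`, the function name `robots_meta`, and the example.com URLs are all assumptions for illustration; substitute whatever the category pages actually use.

```python
from urllib.parse import parse_qs, urlparse

# Assumption: a category page is a "sorted" view when its URL carries
# one of these query parameters. The names are hypothetical.
SORT_PARAMS = {"dir", "order"}


def robots_meta(url: str) -> str:
    """Return the robots meta tag to emit for the given page URL."""
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    # Any overlap between the URL's parameters and SORT_PARAMS marks a sorted view.
    if SORT_PARAMS.intersection(query):
        return '<meta name="robots" content="noindex, follow">'
    return '<meta name="robots" content="index, follow">'


print(robots_meta("http://www.example.com/shoes?dir=asc&order=price"))
print(robots_meta("http://www.example.com/shoes"))
```

The sorted view stays out of the index while its links can still be followed; the plain category URL is left indexable.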
all the best, Mark.
-
Hi Mike
The crawl has now completed, thank you. I think the results will keep me occupied.
all the best, Mark.
-
Hi Mark,
Sorry it's taking a while to crawl your new site.
While I'm not exactly sure what the delay is, one possible reason is your robots.txt. Here's a short snippet of what I see in it:
# Crawlers Setup
User-agent: *
Crawl-delay: 30
# Allowable Index
Allow: /*?p=
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
Allow: /media/
# Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /pkginfo/
Disallow: /report/
From here, the formatting looks a little awkward. What's going on is that you're telling Roger bot to only look at these:
# Allowable Index
Allow: /*?p=
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
Allow: /media/
While the syntax is OK, not every crawler out there will follow the Allow directive. Here's an example of something you can use:
# Crawlers Setup
User-agent: *
Crawl-delay: 30
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
From here you're telling the crawler to disallow nothing except these directories. Please let us know whether implementing this fixes the crawl.
Thanks for reaching out!
Best,
Peter Li
SEOmoz Help Team
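For anyone who wants to check a rule set before waiting on a full crawl, here is a minimal sketch using Python's standard `urllib.robotparser`. It compares a directory-only rule set with a blanket `Disallow: /`. The user-agent string `rogerbot`, the example.com host, the test paths, and the helper name `parse_rules` are assumptions for illustration, and real crawlers may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text: str) -> RobotFileParser:
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Directory rules of the kind discussed in this thread.
DIRECTORY_RULES = """\
User-agent: *
Crawl-delay: 30
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
"""

# A blanket rule that hides the whole site from compliant crawlers.
BLOCK_ALL_RULES = """\
User-agent: *
Disallow: /
"""

directory_parser = parse_rules(DIRECTORY_RULES)
block_all_parser = parse_rules(BLOCK_ALL_RULES)

# Hypothetical paths; the host is a placeholder.
for path in ("/", "/index.php/blog/", "/app/etc/", "/js/site.js"):
    url = "http://www.example.com" + path
    print(
        path,
        directory_parser.can_fetch("rogerbot", url),
        block_all_parser.can_fetch("rogerbot", url),
    )

# The delay (in seconds) this parser reads for the matching user-agent group.
print(directory_parser.crawl_delay("rogerbot"))
```

With the directory-only rules, the home page and blog are fetchable while `/app/` and `/js/` are not; with the blanket rule, nothing is.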
-
Hi Mark,
This sounds like a bug or issue with the SEOmoz software.
Contact help@seomoz.org and ask one of the help associates to look into this for you.
If you do not have many pages, it definitely shouldn't take that long.
The help team responds extremely quickly!
Good luck.
Mike