Still Can't Crawl My Site
-
I've removed all but two of the blocks from our .htaccess. The two that remain are for amazonaws.com, to block Amazon from crawling us.
I ran Fetch as Google in Webmaster Tools on our robots.txt, and it succeeded.
The SEOmoz crawler still hits our site and gets a 403. I've looked in our blocked-request logs, and Amazon is the only one in there.
What is going on here?
-
Hey Joel,
Happy Friday!
Sha
-
Hi Dana,
No problem. Glad you have sorted the problem now.
Have an awesome weekend
Sha
-
Hey Dana,
We've been corresponding in email, but I just wanted to update your thread here as well.
We don't use Amazon's bot; we use Amazon Web Services (AWS) to host our crawler. If you are no longer blocking AWS, you should be able to crawl OK moving forward.
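To illustrate why a block on amazonaws.com also catches a crawler that is merely hosted on AWS, here is a minimal Python sketch. It assumes the block is applied by hostname suffix (for example, after a reverse DNS lookup of the client IP); the function name `is_blocked` and the sample hostnames are hypothetical and are not taken from the actual .htaccess rules or from the crawler's real addresses.

```python
# Hypothetical illustration only: the suffix list, the function name, and the
# sample hostnames below are assumptions, not the site's real rules.
BLOCKED_SUFFIXES = ("amazonaws.com",)


def is_blocked(hostname, blocked_suffixes=BLOCKED_SUFFIXES):
    """Return True if hostname equals, or is a subdomain of, a blocked suffix."""
    host = hostname.strip().lower().rstrip(".")
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in blocked_suffixes
    )


if __name__ == "__main__":
    samples = (
        # Any client running on an AWS-hosted machine resolves to a name like this.
        "ec2-203-0-113-10.compute-1.amazonaws.com",
        # A client hosted elsewhere is unaffected by the rule.
        "crawl-66-249-66-1.googlebot.com",
    )
    for host in samples:
        verdict = "403 (blocked)" if is_blocked(host) else "allowed"
        print(host, "->", verdict)
```

Under that assumption, the rule cannot tell Amazon's own bot apart from any other service running on AWS, which would explain why a fetch from Google succeeds while an AWS-hosted crawler gets a 403.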
Thanks!
Joel
-
Wish someone would've pointed that out days ago.
Thank you soooooo much for your great answer.
I don't understand, though, how or why SEOmoz is using Amazon's bot...
What if I don't want Amazon accessing our site (I don't)? Does that mean we can't use SEOmoz then?
-
We'll see how this goes. I've removed the blocks for amazonaws...
Thanks.
-
Hi Dana,
I believe SEOmoz uses Amazon Web Services (amazonaws.com) for crawling, or at least they did a few months ago, so that may well be your problem.
The best (and quickest) way to confirm this is to go to the SEOmoz Help Hub and click the button at the top of the page to contact the Help Team directly.
Hope that helps,
Sha
-
What's the web address?
Issa
Related Questions
-
Complex Rankings Issue For A Law Firm Site
Be warned, this is a complex issue and will require someone with advanced knowledge of 301s and link penalties. I have a law firm client whose site is having some issues. There are some very complex details here, so I'm going to articulate them in bullet points in hopes of making the issues easy to understand.
So here's my root problem: we have poor organic rankings (4th, 5th, 6th page for most terms) despite a Domain Authority of 32 (the average 1st-page competitor is 28) and some very strong white hat link building over the last 60 days or so.
How does their backlink profile look, you ask? When you look at their backlink profile in OSE, their spam score is 1/17 (not sure if that's credible in any way). Links that score 5 on the spam score make up about 10% of their OSE links. Here's where it gets tricky: those links are not directed at the client's New URL. They go to some old URLs the client used to have, for which an SEO guy built all those crappy links. Those URLs with the crappy links (we'll call them The Crappy URLs) were 301'd (can we all agree 301'd is a verb?) to The New URL for just a couple of months. Shortly after that, The New URL dropped almost completely out of Google, so the client turned off the 301s.
Despite those 301s being turned off, OSE still shows all the links going to The Crappy URLs but is giving The New URL credit for them. Keep in mind, the 301s were turned off about 6 months ago, so it's a little strange that OSE still shows them. This has led me to the conclusion that the Domain Authority of 32 that OSE shows is not a "real" number, since it is seemingly based on links inherited from 301s that no longer exist.
So now I'm trying to create an action plan for this client that will hopefully help us start to make some real progress in our rankings. This client does not have the budget to wait another 6 months for some sign of hope, so time is of the essence.
Here are the theoretical action plans I'm choosing from. I would like the community's input on which, if any, they feel is best (also, if I'm missing something or you have an idea, I'm all ears):
**Potential Action Plans:**
1. Do nothing: keep building quality links, creating quality content, and monitoring crawl reports/GWT for issues. That strategy is going to win long term.
2. #1 + Create one-page sites on The Crappy URLs, set up GWT for them, and submit sitemaps, thus forcing Google, OSE and other web crawlers to index them and removing any potential residual penalties from the 301s. NOTE: Currently The Crappy URLs are just landing on GoDaddy's default landing page, which is of course not being indexed by Google or OSE.
3. #2 + Disavow all the bad links going to The Crappy URLs. Then, once the bad links no longer appear in the OSE profile for each of The Crappy Sites, 301 them again, thus inheriting the good links but not the bad.
4. #1 + 301 The Crappy URLs back to The New URL, while also disavowing any links going to The Crappy URLs. The logic here is that if the road to recovery is going to be a few months away no matter what, then it is worth noting that when the 301 knocked them back 6 months ago, no reputable link building was being done. I am cautiously optimistic the link building we are doing will eventually offset any penalties coming from the 301s. Plus, now we'll know the Domain Authority of 32 that OSE is giving us is real. This is the one I'm leaning towards, quite frankly, because I think it will reduce the recovery time and we'll know somewhat quickly (30-60 days) if it's actually working. Plans 1-3 could each take 90 days before we know if they are working.
So please, if you have any expertise with any of this, your help or advice would be appreciated. I'd rather not share The New URL for obvious reasons, but if you must know, simply message me and, as long as you're legit, I'll share it with you.
Moz Pro | | BrianJGomez0 -
How to fix the Crawl Diagnostics error and warnings
Hi, I'm new to the SEO world and I don't know a lot about it. After my site got crawled I found 1 error, 151 warnings, and 96 notices. Is that bad? Can someone please explain to me how to fix those problems? I will be very thankful.
Moz Pro | | medlife0 -
Crawl Diagnosis only crawling 250 pages not 10,000
My crawl diagnosis has suddenly dropped from 10,000 pages to just 250. I've been tracking and working on an ecommerce website with 102,000 pages (www.heatingreplacementparts.co.uk) and the history for this was showing some great improvements. Suddenly the CD report today is showing only 250 pages! What has happened? Not only is this frustrating to work with as I was chipping away at the errors and warnings, but also my graphs for reporting to my client are now all screwed up. I have a pro plan and nothing has (or should have!) changed.
Moz Pro | | eseyo0 -
Amount of Pages Crawled Dropped Significantly
I am just wondering if something changed with the SEOmoz crawler. I was always getting 10,000 or near 10,000 pages crawled. After the last two crawls I am ending up around 2,500 pages. Has anything changed that I would need to look at to see if I am blocking the crawler, or is it something else?
Moz Pro | | jeffmace0 -
How can I get SEOmoz to crawl a campaign on demand
Hi, how can I get SEOmoz to crawl a campaign on demand instead of on a weekly basis? For example, I have corrected some error warnings and on-page elements and would like it to re-crawl the site sooner to see how the corrections have worked. Thanks
Moz Pro | | Bristolweb0 -
Sub-domain not crawled
One of our sites was recently redesigned. The home page is a landing page (www.labadieauto.com), and I moved the blog to this domain (labadieauto.com/blog/) and put a link in the bottom left of the home page. Since the change, the SEOmoz campaign overview is showing only 1 page crawled. This is not set up as a sub-domain, so why isn't it showing in the crawl? Help!
Moz Pro | | LabadieAuto0 -
Crawl slow again
Once again the weekly crawl on my site is very slow. I have around 441 pages in the crawl and this has been running for over 12 hours. This last happened two weeks ago (ran for over 48 hours). Last week's crawl was much quicker (not sure exactly how long but guessing an hour or so). Is this a known issue and is there anything that can be done to unblock it? Weekends are the best time for me to assess and respond to changes I have made to my site so having this (small) crawl take most of the weekend is really quite problematic. Thanks. Mark
Moz Pro | | MarkWill0 -
Still Getting Keyword Self-Cannibalization
I used the onsite optimization and made some changes to my post, but am still getting the Self-Cannibalization mention in my results. It lists this: "Real Estate Careers | Keller Williams | Career In Real Estate" as the link that is not supposed to be there. This is not a link, but the title of the post. What am I missing?
Moz Pro | | brentmitchell0