Why did Moz crawl our development site?
-
In our Moz Pro account we have one campaign set up to track our main domain. This week Moz threw up around 400 new crawl errors, 99% of which were meta noindex issues.
What happened was that somehow Moz found the development/staging site and decided to crawl that. I have no idea how it was able to do this - the robots.txt is set to disallow all and there is password protection on the site. It looks like Moz ignored the robots.txt, but I still don't have any idea how it was able to do a crawl - it should have received a 401 Unauthorized and not gone any further.
How do I a) clean this up without going through and manually ignoring each issue, and b) stop this from happening again?
Thanks!
-
@multitimemachine A noindex tag only really applies to crawlers such as Bing and Google. You said you blocked all robots via a wildcard; are you sure you don't have, for example, a meta robots tag that says something different?
help@moz.com might be your best bet for a quick resolution on 'cleaning' the report. I'm still slightly lost as to how your main domain and dev/staging site were confused, as in my experience there is normally a subdomain separating them. It's even stranger given that bots can't bypass passwords, unless the URLs came from your sitemap.xml? Sorry I can't give you a direct answer, but without seeing the site or similar it's hard to diagnose. I'm sure the team at Moz can point you in the right direction.
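One way to double-check the robots.txt side of this is to run the file's contents through a parser and ask whether a given crawler may fetch a URL. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt text, the `staging.example.com` host, and the helper name `is_blocked` are assumptions for illustration; `rogerbot` is the crawler name used in a related question on this page. It only shows what the rules say, not how any particular crawler actually behaved.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of a "disallow all" robots.txt on the staging site.
STAGING_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules disallow user_agent from url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical staging URL; substitute the real one.
    url = "https://staging.example.com/some-page"
    print(is_blocked(STAGING_ROBOTS_TXT, "rogerbot", url))
```

If the real file differs from the assumed one (for example, an empty `Disallow:` line, which permits everything), the same helper will show it.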
Related Questions
-
Is this owned by moz?
We continue to get hundreds of spam links per month, and just found a link from here. I do not believe this is owned by moz. https://www.wapz.net/ Maybe I'm wrong? Thanks.
Moz Pro | | plahpoy0 -
Inherited a site - no progress being made, no authority?
Hi all, would really appreciate some input on this one. I've been given the task of optimising a website which was built by another company. It's a Wordpress site using Jigoshop. It wasn't in the best of states, and I don't claim that it is now, though we've done quite a bit of work on it. However, after months of work on it, I feel it's not going anywhere. Naturally my customer wants to see results, and they're not getting much. I've submitted the website to a couple of highly regarded directories, and I believe it's listed on some speciality directories (Asian clothing directories). I've also started them on the task of blogging, and I'm trying to get them involved with guest blogging. There's a DA of 1, OSE reports No Data Available for this URL. The Competitive Domain Analysis shows no links at all. I could understand if I started the campaign last week, but we got going in around January this year. One thing that's crossed my mind is that it's quite a slow-loading website. It's with GoDaddy on some cheap hosting plan (not my doing!!). I understand loading times can affect results - could this be the case? It's a bit disheartening really, because we've had really good success with a lot of local businesses' websites but with this one we just can't seem to get it anywhere! 😞 If anyone could shine a bit of light I'd really appreciate it. The URL is bangle-box.co.uk. Thanks in advance!
Moz Pro | | s7media0 -
Lag time between MOZ crawl and report notification?
I did a lot of work to one of my sites last week and eagerly awaited this week's MOZ report to confirm that I had achieved what I was trying to do, but alas I still see the same errors and warnings in the latest report. This was supposedly generated five days AFTER I made the changes, so why are they not apparent in the new report? I am mainly referring to missing metadata, long page titles, duplicate content and duplicate title errors (due to crawl and URL issues). Why would the new crawl not have picked up that these have been corrected? Does it rely on some other crawl having updated (e.g. Google or Bing)?
Moz Pro | | Gavin.Atkinson0 -
1 page crawled ... and other errors
1. Why is only one (1) page crawled every second time you crawl my site? 2. Why does your bot not obey the rules specified in the robots.txt? 3. Why does your site constantly lose its connection to my Facebook account/page? This means that whenever I want to compare performance I need to re-authorize, and therefore cannot see any data until next time. Next time I also need to re-authorize ... 4. Why can't I add a competitor Twitter account? Whatever I type I get an "uh oh account cannot be tracked" - and if I randomly succeed, the account added never shows up with any data. It has been like this for ages. I have reported these issues over and over again. We are part of a large Scandinavian company represented by Denmark, Sweden, Norway and Finland. The companies are also part of a larger worldwide company spreading across England, Ireland, Continental Europe and Northern Europe. I count at least 10 accounts on Seomoz.org. We, the Northern Europe group (4 accounts), are now reconsidering our membership at seomoz.org. We have recently expanded our efforts and established an SEO community in the larger-scale business spanning all our countries. In this community, too, we are now discussing the quality of your services. We'll be meeting next on 27-28 June in London. I hope I can bring some answers that clarify the problems we have seen here on seomoz.org. As I have written before: I love your setup and your tools - when they work. Regrettably, that is only occasionally the case!
Moz Pro | | alsvik1 -
Crawl Errors from URL Parameter
Hello, I am having this issue within SEOmoz's Crawl Diagnosis report. There are a lot of crawl errors happening with pages associated with /login. I will see site.com/login?r=http://.... and have several duplicate content issues associated with those URLs. Seeing this, I checked WMT to see if the Google crawler was showing this error as well. It wasn't. So what I ended up doing was going to the robots.txt and disallowing rogerbot. It looks like this:
User-agent: rogerbot
Disallow:/login
However, SEOmoz has crawled again and is still picking up those URLs. Any ideas on how to fix? Thanks!
Moz Pro | | WrightIMC0 -
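The rule quoted in the question above can be checked offline before waiting for the next crawl. This is a minimal sketch using Python's standard `urllib.robotparser`. The full login URL (the `r` value is elided in the post, so the one here is made up) and the helper name `may_crawl` are assumptions for illustration. It reports how Python's parser reads the rule, which is not necessarily how rogerbot or any other crawler interprets it.

```python
from urllib.robotparser import RobotFileParser

# The two directives quoted in the post, one per line.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow:/login
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical redirect target; the real value is not given in the post.
    login_url = "http://site.com/login?r=http://site.com/account"
    for agent in ("rogerbot", "some-other-bot"):
        print(agent, may_crawl(ROBOTS_TXT, agent, login_url))
```

If the parser agrees that the login URL is blocked for rogerbot, the rule itself is at least readable as intended, and the timing of the crawl relative to the robots.txt change becomes the next thing to look at.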
Is there any way to view crawl errors historically?
One of the websites we monitor has been getting a high number of duplicate page titles. As we work through the pages, we see changes and the number of duplicate page titles decreasing. However, lately the number has gone up again. I wanted to ask if there's any way to view the new errors and the old errors separately, or sorted in a way that can help me identify why we are getting new page crawl errors. Any advice would be great. Thanks!
Moz Pro | | TheNorthernOffice790 -
Canonical tags and SEOmoz crawls
Hi there. Recently, we've made some changes to http://www.gear-zone.co.uk/ to add canonical tags to some dynamically generated pages to stop duplicate content issues. Previously, these were blocked with robots.txt. In Webmaster Tools, everything looks great - pages crawled has shot up, and overall traffic and sales have seen a positive increase. However, the SEOmoz crawl report is now showing a huge increase in duplicate content issues. What I'd like to know is whether SEOmoz registers a canonical tag as preventing a piece of duplicate content, or just adds it to the notices report. That is, if I have 10 pages of duplicate content all with correct canonical tags, will I still see 10 errors in the crawl, but also 10 notices showing a canonical has been found? Or should it be 0 duplicate content errors, but 10 notices of canonicals? I know it's a small point, but it could potentially make a big difference. Thanks!
Moz Pro | | neooptic0
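To sanity-check the canonical tags themselves, independently of how any crawl report counts them, the pages can be parsed and grouped by their canonical target. This is a minimal sketch using Python's standard `html.parser`. The sample URLs, the markup, and the helper names are invented for illustration, and nothing here reflects how the SEOmoz crawler actually treats canonicals.

```python
from collections import defaultdict
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, or be missing.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def group_by_canonical(pages):
    """Map each canonical target to the page URLs that point at it.

    Pages without a canonical tag are grouped under their own URL.
    """
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[find_canonical(html) or url].append(url)
    return dict(groups)


# Invented sample pages: two sorted views of one listing, plus one plain page.
CANONICAL_HEAD = '<head><link rel="canonical" href="http://www.example.com/jackets"></head>'
PAGES = {
    "http://www.example.com/jackets?sort=price": CANONICAL_HEAD,
    "http://www.example.com/jackets?sort=name": CANONICAL_HEAD,
    "http://www.example.com/boots": "<head><title>Boots</title></head>",
}

if __name__ == "__main__":
    for target, members in group_by_canonical(PAGES).items():
        print(target, len(members))
```

Pages that share a canonical target end up in one group, which gives a baseline to compare against whatever the crawl report shows.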