A question about Mozbot and a recent crawl on our website.
-
Hi All,
Rogerbot has been reporting errors on our websites for over a year now, and we correct the issues as soon as they are reported.
However, I have 2 questions regarding the recent crawl report we got on the 8th.
1.) Pages with a "no-index" tag are being crawled by Roger and are being reported as duplicate page content errors. I can ignore these as Google doesn't see these pages, but surely Roger should ignore pages with "no-index" instructions as well? Also, these errors won't go away in our campaign until Roger ignores the URLs.
2.) What bugs me most is that resource pages that have been around for about 6 months have only just been reported as being duplicate content. Our weekly crawls have never picked up these resource pages as being a problem, so why now all of a sudden? (Makes me wonder how extensive each crawl is?)
Anyone else had a similar problem?
Regards
GREG
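
Until the report handles this itself, the flagged URLs can be sorted locally into "has no-index" and "genuinely indexable". The following is a minimal sketch, not anything Moz provides: `has_noindex` and `drop_noindex` are hypothetical helper names, the HTML is assumed to be already downloaded, and only the robots meta tag is checked (an `X-Robots-Tag` HTTP header is not).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # content looks like "noindex, follow"; split it into directives
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def drop_noindex(pages: dict) -> list:
    """Given {url: html}, keep only URLs that are not marked noindex."""
    return [url for url, html in pages.items() if not has_noindex(html)]
```

Passing the duplicate-content URLs from the crawl report through `drop_noindex` would leave only the pages a search engine could actually index, which are the ones worth fixing first.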
-
It's pretty big.
Over 1000 pages in the index, and many more internal URLs to crawl that have a no-index tag (booking forms etc.).
I'll see if we can archive our other campaigns and let Roger crawl our main site properly.
-
How big is your website, Greg?
-
Thanks Nakul,
I do a weekly scan with Xenu which doesn't have a URL limit like SF.
I was under the impression a full scan of the site was done each week, but as you say, it's being scanned in chunks, divided across our 3 other websites.
If this is the case, it would be great to let Mozbot know where to crawl, to avoid using up resources unnecessarily when it could be scanning our most important pages.
Greg
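
One possible way to steer the crawler is a robots.txt rule aimed at its user agent. The sketch below uses Python's standard `urllib.robotparser` to check what such a rule would block before it goes live. It assumes the crawler identifies itself as `rogerbot` and honors robots.txt (worth confirming against Moz's own documentation); the `/booking/` path and `example.com` URLs are placeholders, not taken from the site in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep rogerbot out of the booking forms only,
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /booking/

User-agent: *
Disallow:
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("rogerbot", "https://www.example.com/booking/form"),
        ("rogerbot", "https://www.example.com/resources/guide"),
        ("Googlebot", "https://www.example.com/booking/form"),
    ]:
        print(agent, url, allowed(ROBOTS_TXT, agent, url))
```

Note that a `Disallow` rule stops crawling, whereas a no-index tag stops indexing, so the two are not interchangeable. Scoping the rule to one user agent leaves other crawlers' behaviour unchanged.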
-
Greg, the crawl is limited to 10,000 pages in total across all 5 of your campaigns.
As for whether Rogerbot should ignore noindex, here's what I think: the intent of the tool is to find issues. In this scenario, Rogerbot is making sure you are aware that some of those pages have a noindex tag. Roger does not know whether it's intentional or not.
You can also do a deeper crawl of your website using Screaming Frog SEO Spider: http://www.screamingfrog.co.uk/seo-spider/ It does a great job of crawling deep when you want it to, since it's desktop software and you can set all sorts of options to identify issues.