A question about Mozbot and a recent crawl on our website.
-
Hi All,
Rogerbot has been reporting errors on our website for over a year now, and we correct the issues as soon as they are reported.
However, I have two questions regarding the recent crawl report we got on the 8th.
1.) Pages with a "noindex" tag are being crawled by Roger and reported as duplicate page content errors. I can ignore these, since Google doesn't see these pages, but surely Roger should respect "noindex" instructions as well? Also, these errors won't go away in our campaign until Roger ignores those URLs.
2.) What bugs me most is that resource pages that have been around for about six months have only just been reported as duplicate content. Our weekly crawls have never flagged these resource pages as a problem, so why now, all of a sudden? (It makes me wonder how extensive each crawl is.)
Anyone else had a similar problem?
Regards
GREG
-
It's pretty big:
Over 1,000 pages in the index, and many more internal URLs to crawl that have a noindex tag (booking forms, etc.).
I'll see if we can archive our other campaigns and let Roger crawl our main site properly.
-
How big is your website, Greg?
-
Thanks Nakul,
I do a weekly scan with Xenu, which doesn't have a URL limit like Screaming Frog does.
I was under the impression a full scan of the site was done each week but, as you say, it's being scanned in chunks, with the limit divided across our 3 other websites.
If this is the case, it would be great to be able to tell Mozbot where to crawl, to avoid unnecessary resources being used up when it could be scanning our most important pages.
Greg
-
Greg, the crawl is limited to 10,000 pages total across all 5 of your campaigns. As for whether Rogerbot should ignore noindex, here's what I think: the intent of the tool is to find issues. In this scenario, Rogerbot is making sure you are aware that some of those pages have a noindex tag; Roger doesn't know whether that's intentional or not. You can also do a deeper dive into your website with Screaming Frog SEO Spider (http://www.screamingfrog.co.uk/seo-spider/). It does a great job of a deep crawl when you want one, since it's desktop software and you can set all sorts of options to identify issues.
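For reference, the "noindex" instruction being discussed is the standard robots meta tag in a page's head section; a minimal sketch of the usual form:

<meta name="robots" content="noindex, follow">

Crawlers that honor the directive drop the page from their index, but whether a tool such as Rogerbot also excludes the page from its issue reports is a separate product decision, which is what Greg is running into.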
-
Related Questions
-
Best tools for an initial website health check?
Hi,
I'd like to offer free website health checks (basic audits) and am wondering what tools other people use for this. It would be good to use something that presents the data well. Moz is great, but it gets expensive if I want to offer these to many businesses in the hope of taking on just a few as clients and doing a full manual audit for them. So far I've tried seositecheckup.com (it just checks a single page, though), metaforensics.io and mysiteauditor. Thanks!
-
Block Moz (or any other robot) from crawling pages with specific URLs
Hello! Moz reports that my site has around 380 duplicate page content issues. Most of them come from dynamically generated URLs that have some specific parameters. I have sorted this out for Google in Webmaster Tools (the new Google Search Console) by blocking the pages with these parameters. However, Moz is still reporting the same number of duplicate content pages and, to stop it, I know I must use robots.txt. The trick is that I don't want to block every page, just the pages with specific parameters. I want to do this because among these 380 pages there are some other pages with no parameters (or different parameters) that I need to take care of. Basically, I need to clean this list to be able to use the feature properly in the future. I have read through the Moz forums and found a few topics related to this, but there is no clear answer on how to block only pages with specific URLs. Therefore, I have done my research and come up with these lines for robots.txt:

User-agent: dotbot
Disallow: /*numberOfStars=0
User-agent: rogerbot
Disallow: /*numberOfStars=0

My questions: 1. Are the above lines correct, and would they block Moz (dotbot and rogerbot) from crawling only pages that have the numberOfStars=0 parameter in their URLs, leaving other pages intact? 2. Do I need an empty line between the two groups (between "Disallow: /*numberOfStars=0" and "User-agent: rogerbot"), or does it even matter? I think this would help many people, as there is no clear answer on how to block crawling only pages with specific URLs. Moreover, this should be valid for any robot out there. Thank you for your help!
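For what it's worth, the original robots.txt convention separates user-agent groups with a blank line, and most modern parsers accept either form. A sketch of the conventional layout, assuming both crawlers honor the * wildcard extension (the major ones do):

User-agent: dotbot
Disallow: /*numberOfStars=0

User-agent: rogerbot
Disallow: /*numberOfStars=0

-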
Unable to view crawl test
After doing a crawl test I get a download report. It then downloads in CSV form, and when I go to view it there is a corruption error or just a load of gibberish characters. Can I not see the report on the site?
-
Website data updates
Hello all, I have downloaded some reports of "followed" links from SEOmoz, and no doubt the report is very comprehensive. However, I know some links listed in the reports have a rel="nofollow" tag attached to them, but the list still shows them as followed links. Can someone please guide me to understand why this has happened? Is it possible that SEOmoz has old data for our site, because these changes were made recently (2-3 weeks ago)? Or have I misunderstood "following" as "dofollow"? Any help would be highly appreciated. Thanks, moosa.
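As background, a nofollowed link just carries the rel attribute on the anchor tag; a minimal sketch with a hypothetical URL:

<a href="http://example.com/some-page" rel="nofollow">anchor text</a>

A link index that last crawled the page before the attribute was added would still show the link as followed until its data refreshes, which may explain the 2-3 week lag described above.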
-
Multi-language websites need proper reporting
Come on, guys! It cannot be THAT difficult to add 3 characters to your reports...
Instead of seeing "Google BE" three times, I would like to see the language code too. Example: Google BE-NL, Google BE-FR, Google BE-EN. The current display is really not helpful for me or my clients (screenshot: https://docs.google.com/file/d/0B4BB1WWdu2ncYWZ4TTRqaWR1YVk/edit?usp=drivesdk). There's more in the world than the US and English... Can you do this please?
-
SEO on-demand crawl
What happened to the on-demand crawl you could do in Pro when they switched to the new Moz site?
-
Does linking to relevant high-authority websites affect your MozTrust or Rank?
Basically what the title says. I am having a hard time understanding why a competitor with fewer linking domains, none of any real quality (they're all membership sites or partnerships, nothing relevant to the industry), is outranking us, while we have links and articles about us from multiple magazines in our industry, as well as from relevant directories with high domain ranks. The only thing I noticed is that they're linking out to their clients' websites, which are all high-authority websites. So do external (outbound) links count towards your MozTrust or Rank?
-
Some questions on Canonical tag AND 301 redirect
Hi everyone, I'm new here - always loved SEOmoz and glad to be part of the Pro community now. I have two questions regarding the canonical URL tag. Some background info: we used to run an osCommerce store and recently migrated to Magento. In doing so, we right away created 301 redirects from the old category pages (osCommerce) to the new category pages (Magento) via the Magento admin. Example:

www.example.com/old-widget-category.html
301 redirected to
www.example.com/new-widget-category.html

In the Magento admin, we have enabled the canonical tag for all product and category pages. Here's how Magento sets up the canonical tag. The URL of interest, which we want to rank, is:

www.example.com/new-widget-category.html

However, Magento sets up the canonical tag on this page to point to:

www.example.com/old-widget-category.html

When using the SEOmoz On-Page Report Card, it picks this up as an error because the canonical tag is pointing to a different URL. However, if we dig a little deeper, we see that the URL being pointed to,

www.example.com/old-widget-category.html

has a 301 redirect to

www.example.com/new-widget-category.html

which is the URL we want to rank. So because we set up a 301 redirect from the old page to the new page, on the new page the canonical tag points to the old page.

Question 1) What are your opinions on this? Do you think this method of setting up the canonical tag is acceptable?

Second question... We use pagination for category pages, so if we have 50 products in one category, we would have 5 pages of 10 products. The URLs would be:

www.example.com/new-widget-category.html (which is the SAME as ?p=1)
www.example.com/new-widget-category.html?p=1
www.example.com/new-widget-category.html?p=2
www.example.com/new-widget-category.html?p=3
www.example.com/new-widget-category.html?p=4
www.example.com/new-widget-category.html?p=5

Now ALL the URLs above have the canonical tag set as:

<link rel="canonical" href="http://www.example.com/new-widget-category" />

However, the content of each page (pages 1, 2, 3, 4, 5) is different because different products are displayed. So far, most of what I have read regarding the canonical tag is that it is used for pages that have the same content but different URLs. I would hope that Google would combine the content of all 5 pages and view the result as a single URL, www.example.com/new-widget-category.

Question 2) Is using the canonical tag appropriate in the case described above? Thanks!
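For illustration on both questions (a sketch, not a definitive recommendation): the usual best practice is for the canonical tag to point at the final, non-redirecting URL, so the new category page would normally reference itself:

<link rel="canonical" href="http://www.example.com/new-widget-category.html" />

And for the paginated series, the pattern Google documented for this situation was rel="prev"/rel="next" annotations rather than canonicalizing every page to page 1; for example, on ?p=2:

<link rel="prev" href="http://www.example.com/new-widget-category.html?p=1" />
<link rel="next" href="http://www.example.com/new-widget-category.html?p=3" />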