Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Should I set blog category/tag pages as "noindex"? If so, how do I prevent "meta noindex" Moz crawl errors for those pages?
-
From what I can tell, SEO experts recommend setting blog category and tag pages (e.g. "http://site.com/blog/tag/some-product") as "noindex, follow" in order to keep the page quality of indexable pages high. However, I just received a slew of critical crawl warnings from Moz for having these pages set to "noindex." Should the pages be indexed? If not, why am I receiving critical crawl warnings from Moz, and how do I prevent this?
-
In the situation outlined by the OP, these pages are noindexed. There’s no value in cluttering up crawl reports with these pages. Block rogerbot from non-critical parts of your site. If you do want to be alerted to issues on those pages, then don’t block it.
-
Thanks. I'm not concerned about the crawl depth of the search engine bots, since there is nothing in your fix that would affect that. I'm curious about any decrease in the crawl depth of the site with Moz, as we use that to spot issues with the site.
One of the clients I implemented the fix on went from 4.6K crawled pages to 3.4K and the fix would have removed an expected 1.2K pages.
The other client went from 5K to 3.7K and the fix would have removed an expected 1.3K pages.
TL;DR - Good news everybody, the robots.txt fix didn't reduce the crawl depth of the Moz crawler beyond the pages it was meant to exclude!
-
I agree. Unfortunately, Moz doesn't have an internal disallow feature that gives you the option to feed them info on where rogerbot can and can't go. I haven't come across any issues with this approach. Crawl depth by search engine bots will not be affected, since the user-agent is specified.
-
Thanks for the solution! We have been coming across a similar issue with some of our sites, and although I'm not a big fan of this type of workaround, I don't see any other options and we want to focus on the real issues. You don't want to ignore the rule in case other pages that should be indexed are marked noindex by mistake.
Logan, are you still getting the depth of crawls after making this type of fix? Have any other issues arisen from this approach?
Let us know
-
Hi Nichole,
You're correct in noindexing these pages; they serve little to no value from an SEO perspective. Moz is always going to alert you to noindex tags when it finds them, since it's such a critical issue if that tag shows up in unexpected places. If you want to remove these issues from your crawl report, add the following directive to your robots.txt file. This will prevent Moz from crawling these URLs and therefore from reporting on them:
User-agent: rogerbot
Disallow: /tag/
Disallow: /category/
*edit - do not prevent all user-agents from crawling these URLs, as it will prevent search engines from seeing your noindex tag; they can't obey what they aren't permitted to see. If you want, once all tag & category pages have been removed from the index, you can update your robots.txt to remove the rogerbot directive and add the disallows for tag & category to the * user agent.
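One way to sanity-check directives like these before deploying them is Python's standard-library `urllib.robotparser`. The sketch below is only an illustration: the helper name `can_crawl`, the example URLs, and the second rule set (the "* user agent" version described in the edit note) are assumptions made for demonstration. The standard-library parser does plain prefix matching on the URL path, which may differ from how rogerbot or a given search engine interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Stage 1: the rules from the answer; only rogerbot is restricted.
ROGERBOT_ONLY = """\
User-agent: rogerbot
Disallow: /tag/
Disallow: /category/
"""

# Stage 2 (assumed wording): the same disallows moved to all user agents,
# for use once tag and category pages have dropped out of the index.
ALL_AGENTS = """\
User-agent: *
Disallow: /tag/
Disallow: /category/
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, modeled on the example in the question.
    tag_url = "http://site.com/tag/some-product"
    nested_tag_url = "http://site.com/blog/tag/some-product"

    # rogerbot is kept out of /tag/, while other crawlers can still see the noindex tag.
    print("rogerbot, /tag/:", can_crawl(ROGERBOT_ONLY, "rogerbot", tag_url))
    print("Googlebot, /tag/:", can_crawl(ROGERBOT_ONLY, "Googlebot", tag_url))

    # Prefix matching: "/tag/" does not cover "/blog/tag/...".
    print("rogerbot, /blog/tag/:", can_crawl(ROGERBOT_ONLY, "rogerbot", nested_tag_url))

    # Stage 2 blocks every crawler from /tag/.
    print("Googlebot, stage 2:", can_crawl(ALL_AGENTS, "Googlebot", tag_url))
```

Under this prefix-matching model, "Disallow: /tag/" only covers paths that begin with /tag/. If tag pages live under /blog/tag/, as in the URL from the question, the path in the directive would likely need to be "/blog/tag/" (and similarly for category pages) for the rule to apply.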
Related Questions
-
How crucial are H1 tags and descriptions in wordpress categories?
Hi all Trying to improve SEO for my (mostly) local site, www.nectarbridge.com, and recently got back on Moz Pro account. First crawl of my site by Moz, a manageable number of issues that I've mostly sorted, but the category with the largest number of problems is missing or invalid tags. My content pages and blog posts are not missing the tags. It's category, archives, etc., including multiple pages, ex: https://www.nectarbridge.com/category/blog/page/4/ A smaller number of pages are being flagged by Moz as missing descriptions, and they are also category pages and the like. So the question is - how hard should I pursue fixing these issues? I'm using the divi theme, which apparently doesn't display the category description by default (if it did, that would kill two birds with one stone). There is a fix to add the category description, but before I get into that I'm trying to discern whether this issue really matters greatly to SEO or if I should spend that time just working on more content.
Moz Pro | Jun 24, 2024, 11:19 AM | gary_nectarbridge0 -
My "tag" pages are showing up as duplicate content. Is this harmful?
Hi. I ran a Moz sitecrawl. I see "Yes" under "Duplicate Page Content" for each of my tag pages. Is this harmful? If so, how do I fix it? This is a Wordpress site. Tags are used in both the blog and ecommerce sections of the site. Ecommerce is a very small portion. Thank you.
Moz Pro | Jul 28, 2016, 5:55 PM | dlmilli1 -
Block Moz (or any other robot) from crawling pages with specific URLs
Hello! Moz reports that my site has around 380 duplicate page content. Most of them come from dynamic generated URLs that have some specific parameters. I have sorted this out for Google in webmaster tools (the new Google Search Console) by blocking the pages with these parameters. However, Moz is still reporting the same amount of duplicate content pages and, to stop it, I know I must use robots.txt. The trick is that, I don't want to block every page, but just the pages with specific parameters. I want to do this because among these 380 pages there are some other pages with no parameters (or different parameters) that I need to take care of. Basically, I need to clean this list to be able to use the feature properly in the future. I have read through Moz forums and found a few topics related to this, but there is no clear answer on how to block only pages with specific URLs. Therefore, I have done my research and come up with these lines for robots.txt:
User-agent: dotbot
Disallow: /*numberOfStars=0
User-agent: rogerbot
Disallow: /*numberOfStars=0
My questions: 1. Are the above lines correct and would block Moz (dotbot and rogerbot) from crawling only pages that have numberOfStars=0 parameter in their URLs, leaving other pages intact? 2. Do I need to have an empty line between the two groups? (I mean between "Disallow: /*numberOfStars=0" and "User-agent: rogerbot")? (or does it even matter?) I think this would help many people as there is no clear answer on how to block crawling only pages with specific URLs. Moreover, this should be valid for any robot out there. Thank you for your help!
Moz Pro | Jul 21, 2015, 11:43 AM | Blacktie0 -
What to do with a site of >50,000 pages vs. crawl limit?
What happens if you have a site in your Moz Pro campaign that has more than 50,000 pages? Would it be better to choose a sub-folder of the site to get a thorough look at that sub-folder? I have a few different large government websites that I'm tracking to see how they are faring in rankings and SEO. They are not my own websites. I want to see how these agencies are doing compared to what the public searches for on technical topics and social issues that the agencies manage. I'm an academic looking at science communication. I am in the process of re-setting up my campaigns to get better data than I have been getting -- I am a newbie to SEO and the campaigns I slapped together a few months ago need to be set up better, such as all on the same day, making sure I've set it to include www or not for what ranks, refining my keywords, etc. I am stumped on what to do about the agency websites being really huge, and what all the options are to get good data in light of the 50,000 page crawl limit. Here is an example of what I mean: To see how EPA is doing in searches related to air quality, ideally I'd track all of EPA's web presence. www.epa.gov has 560,000 pages -- if I put in www.epa.gov for a campaign, what happens with the site having so many more pages than the 50,000 crawl limit? What do I miss out on? Can I "trust" what I get? www.epa.gov/air has only 1450 pages, so if I choose this for what I track in a campaign, the crawl will cover that subfolder completely, and I am getting a complete picture of this air-focused sub-folder ... but (1) I'll miss out on air-related pages in other sub-folders of www.epa.gov, and (2) it seems like I have so much of the 50,000-page crawl limit that I'm not using and could be using. (However, maybe that's not quite true - I'd also be tracking other sites as competitors - e.g. non-profits that advocate in air quality, industry air quality sites - and maybe those competitors count towards the 50,000-page crawl limit and would get me up to the limit? How do the competitors you choose figure into the crawl limit?) Any opinions on which I should do in general on this kind of situation? The small sub-folder vs. the full humongous site vs. is there some other way to go here that I'm not thinking of?
Moz Pro | Jul 22, 2015, 4:26 PM | scienceisrad0 -
How to find missing or incorrect title tags with a site hosting lots of pages.
I have a website that features more than 9,000 pages. I'm trying to figure out which ones have missing or incorrect title tags. Should I start with Screaming Frog?
Moz Pro | Jul 2, 2015, 11:09 AM | SapphireCo0 -
What's the best way to eliminate "429 : Received HTTP status 429" errors?
My company website is built on WordPress. It receives very few crawl errors, but it does regularly receive a few (typically 1-2 per crawl) "429 : Received HTTP status 429" errors through Moz. Based on my research, my understanding is that my server is essentially telling Moz to cool it with the requests. That means it could be doing the same for search engines' bots and even visitors, right? This creates two questions for me, which I would greatly appreciate your help with: Are "429 : Received HTTP status 429" errors harmful for my SEO? I imagine the answer is "yes" because Moz flags them as high priority issues in my crawl report. What can I do to eliminate "429 : Received HTTP status 429" errors? Any insight you can offer is greatly appreciated! Thanks, Ryan
Moz Pro | Oct 14, 2014, 11:21 AM | ryanjcormier0 -
Need help understanding search filter URL's and meta tags
Good afternoon Mozzers, One of our clients is a real estate agent, and on that site there is a search field that allows a person to search by filtered categories. Currently, the URL structure makes a new URL for each filter option, and in my Moz reports I get the report that there is missing meta data. However, the page is the same; only the filter options are different, so I am at a loss as to how to properly tag our site to optimize those URLs. Can I rel canonical the URLs or alt rel them? I have been looking for a solution for a few days now and, like I said, I am at a loss as to how to properly resolve these warning messages, or whether I should even be concerned with the warning messages from Moz (obviously I should be concerned, they are warning messages for a reason). Thank you for your assistance in advance!
Moz Pro | Mar 14, 2014, 1:38 PM | Highline_Ideas0 -
What Exactly Does "Linking Root Domains" mean??
What Exactly Does "Linking Root Domains" mean?? And how does it affect your ranking for certain Keywords?? Thanks
Moz Pro | Jan 30, 2014, 2:54 PM | Caseman57