Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Should I use meta noindex and robots.txt disallow?
-
Hi, we have an alternate "list view" version of every one of our search results pages.
The list view has its own URL, indicated by a URL parameter.
I'm concerned about wasting our crawl budget on all these list view pages, which effectively double the number of pages that need crawling.
When they were first launched, I had the noindex meta tag placed on all list view pages, but I'm concerned that they are still being crawled.
Should I therefore also apply a robots.txt disallow on that parameter to ensure that no crawling occurs? Or will Googlebot/Bingbot stop crawling those pages over time anyway? I assume that noindex still means "crawl"...
Thanks
-
Hi,
Thanks, I will do some testing to confirm that this behaves the way I would like it to.
-
If all pages are 100% not indexed, then I would block them in robots.txt. Google's John Mueller confirmed to me that Googlebot will continue to crawl every link to check whether a nofollow or noindex has changed status.
So as a result, we blocked our pages with robots.txt and saw a great increase in index/crawl rates on the pages we want Google to pay attention to. It also reduces wasted server resources.
However, if any of the pages are indexed and you block them in robots.txt, Googlebot will never be able to crawl them to see that they should be noindexed. This means they could stay in a permanent state of being indexed.
I hope that answers all your questions?
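The ordering above (let the noindex be seen first, only then disallow) can be written down as a small Python sketch. This is just an illustration of the reasoning, not anything Google publishes: the `Page` fields and the `next_step` helper are hypothetical names, and you would have to work out the indexed status of each page yourself.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Page:
    """Hypothetical record for one list view URL."""
    url: str
    indexed: bool      # does the page currently appear in the search index?
    has_noindex: bool  # does the page serve a meta robots noindex tag?


def next_step(pages: Iterable[Page]) -> str:
    """Suggest what to do with a group of pages you want out of the index."""
    pages = list(pages)
    still_indexed = [p for p in pages if p.indexed]

    # Nothing is indexed any more, so blocking the crawl cannot strand
    # a page in the index.
    if not still_indexed:
        return "disallow in robots.txt"

    # Indexed pages already carry noindex: leave them crawlable so the
    # crawler can actually see the tag.
    if all(p.has_noindex for p in still_indexed):
        return "wait until the noindex has been processed"

    # Some indexed pages have no noindex yet; a disallow now would hide
    # any tag you add later.
    return "add noindex to the indexed pages first"
```

For example, `next_step([Page("/search?view=list", indexed=True, has_noindex=True)])` suggests waiting, because a disallow at that point would stop the crawler from ever seeing the tag. The `view=list` parameter is a made-up placeholder.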
-
When you say:
nofollow will tell the crawlers to not crawl the page
I believe you mean that this will tell the crawlers not to crawl the links on the page; the page itself is still "crawled", is it not?
But yes, you are right to say that once a robots.txt disallow is in place, the meta tag will not be seen and will thus be moot (at which point I may as well take it off).
It would be nice to be able to say "don't crawl this and don't put it in the index"... but is there a way?
-
noindex only tells the search crawlers not to include the page in the index, but still allows them to crawl the page. nofollow will tell the crawlers to not crawl the page.
robots.txt will accomplish this as well, but I think using both would be overkill.
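The noindex and nofollow values discussed in this thread live in the page's meta robots tag, so it can help to check what a page is really serving. Below is a minimal sketch using only Python's standard library `html.parser`. The `robots_directives` helper is a hypothetical name, and it only reads `<meta name="robots">`; it does not look at crawler-specific meta names or at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of lowercase meta robots directives found in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

Calling `robots_directives('<meta name="robots" content="noindex, follow">')` gives a set containing `noindex` and `follow`. A crawler can only read this tag if it is allowed to fetch the page in the first place.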