Robots.txt vs. meta noindex, follow
-
Hi guys,
I wander what your opinion is concerning exclution via the robots.txt file.
Do you advise to keep using this? For example:User-agent: *
Disallow: /sale/*
Disallow: /cart/*
Disallow: /search/
Disallow: /account/
Disallow: /wishlist/*Or do you prefer using the meta tag 'noindex, follow' instead?
I keep hearing different suggestions.
I'm just curious what your opinion / suggestion is.Regards,
Tom Vledder -
Hi Tom
Agree with Martijn that it depends for example, the robots.txt is generally the first port of call for bots as it allows them to understand where you want them to spend their finite time crawling your site. You can aslo give direction to all bots at once or specify a subset. It is generally the best option for blocking pages such as you /cart/ etc were they don't need crawling.
The problem with robots.txt is that it doesn't always keep pages from being indexed especially if there are other external sources linking to the pages in question.
The meta tag noindex on the other hand can be applied to individual pages and you are actually commanding the robots to NOT Index the relevant page in serps, use this option if you have pages you don't want appearing in Google (or other search engines) but the page may still be relevant for authority or able to acquire links (make sure to use Noindex follow) as you still want the robots to crawl the page. Otherwise use Noindex Nofollow hope that this helps.
-
Hi Tom,
It depends, for the /sale/ I would make an exception to make sure that it could be sales pages. But for the other pages I wouldn't want a search engine to waste any crawl budget by looking at these pages for a start. That's why I would go there with a robots.txt implementation instead of META robots as then they'll still visit the page to figure out there they won't index the page.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
"Url blocked by robots.txt." on my Video Sitemap
I'm getting a warning about "Url blocked by robots.txt." on my video sitemap - but just for youtube videos? Has anyone else encountered this issue, and how did you fix it if so?! Thanks, J
Technical SEO | | Critical_Mass0 -
Best way to create robots.txt for my website
How I can create robots.txt file for my website guitarcontrol.com ? It is having login and Guitar lessons.
Technical SEO | | zoe.wilson170 -
Why is robots.txt blocking URL's in sitemap?
Hi Folks, Any ideas why Google Webmaster Tools is indicating that my robots.txt is blocking URL's linked in my sitemap.xml, when in fact it isn't? I have checked the current robots.txt declarations and they are fine and I've also tested it in the 'robots.txt Tester' tool, which indicates for the URL's it's suggesting are blocked in the sitemap, in fact work fine. Is this a temporary issue that will be resolved over a few days or should I be concerned. I have recently removed the declaration from the robots.txt that would have been blocking them and then uploaded a new updated sitemap.xml. I'm assuming this issue is due to some sort of crossover. Thanks Gaz
Technical SEO | | PurpleGriffon0 -
Are robots.txt wildcards still valid? If so, what is the proper syntax for setting this up?
I've got several URL's that I need to disallow in my robots.txt file. For example, I've got several documents that I don't want indexed and filters that are getting flagged as duplicate content. Rather than typing in thousands of URL's I was hoping that wildcards were still valid.
Technical SEO | | mkhGT0 -
'External nofollow' in a robots meta tag? (advertorial links)
I believe this has never worked? It'd be an easy way of preventing any penalties from Google's recent crackdown on paid links via advertorials. When it's not possible to nofollow each external link individually, what are people doing? Nofollowing and/or noindexing the whole page?
Technical SEO | | Alex-Harford0 -
Same URL in "Duplicate Content" and "Blocked by robots.txt"?
How can the same URL show up in Seomoz Crawl Diagnostics "Most common errors and warnings" in both the "Duplicate Content"-list and the "Blocked by robots.txt"-list? Shouldnt the latter exclude it from the first list?
Technical SEO | | alsvik0 -
Can I Disallow Faceted Nav URLs - Robots.txt
I have been disallowing /*? So I know that works without affecting crawling. I am wondering if I can disallow the faceted nav urls. So disallow: /category.html/? /category2.html/? /category3.html/*? To prevent the price faceted url from being cached: /category.html?price=1%2C1000
Technical SEO | | tylerfraser
and
/category.html?price=1%2C1000&product_material=88 Thanks!0 -
We are still seeing duplicate content on SEOmoz even though we have marked those pages as "noindex, follow." Any ideas why?
We have many pages on our website that have been set to "no index, follow." However, SEOmoz is indexing them as duplicate content. Why is that?
Technical SEO | | cmaseattle0