How to Disallow Tag Pages With Robots.txt
-
Hi, I'm working on a site that has tag pages, for instance:
http://www.domain.com/news/?tag=choice
How can I use robots.txt to exclude these tag pages (20+ of them are being crawled and indexed by the search engines)?
Also, they're sometimes created dynamically, so I want something that automatically excludes tag pages from being crawled and indexed.
Any suggestions?
Cheers,
Mark
-
Hi Nakul, it's Drupal.
Mark
-
What CMS is it, Mark?
-
Thanks. Is there a way to test it out before actually implementing it on the site?
The site is non-WordPress as well.
Cheers,
Mark
-
I agree. I would suggest adding noindex to the pages and letting the bots crawl them. Blocking them would prevent future crawling of these pages, but I am guessing you also want to remove the pages that are already indexed, and a bot that is blocked from a page never gets to see its noindex.
Therefore add the noindex first, wait a few days, and then add the disallow (although technically, if they are noindex, you don't really need the disallow).
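Before adding the disallow, it can help to confirm that the tag pages really are serving a noindex. Here is a minimal Python sketch of that check, assuming the noindex is delivered either as a robots meta tag or as an X-Robots-Tag response header (the thread does not say which). The names `RobotsMetaParser` and `has_noindex` are made up for illustration, and fetching the page is left out so you can pass in HTML and headers from whatever tool you already use.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # content looks like "noindex, follow"; split it into single directives
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html, headers=None):
    """Return True if the page asks not to be indexed (hypothetical helper).

    html    -- the page source as a string
    headers -- optional dict of HTTP response headers
    """
    parser = RobotsMetaParser()
    parser.feed(html)

    header_value = ""
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            header_value = value
    header_tokens = {token.strip().lower() for token in header_value.split(",")}

    return "noindex" in parser.directives or "noindex" in header_tokens


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))  # True
```

Run it against a few of the tag URLs once the noindex is in place; if it reports False for any of them, a later disallow would leave those pages stuck in the index.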
-
Hi Mark
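If you're using WordPress, I would recommend the Yoast SEO plugin to resolve the tag issue. If not, I suggest you amend the robots.txt file instead. Rules are matched against the start of the URL path, so for a URL like /news/?tag=choice you would need Disallow: /news/?tag= or a wildcard rule such as Disallow: /*?tag= rather than Disallow: /?tag=.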
Here is an example:
Disallow: /?tag=
Disallow: /?subcats=
Disallow: /*?features_hash=
NOTE: Be very careful when blocking search engines. Test and test again!
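On testing before going live: one option is to run the rules against a list of real URLs offline. The Python sketch below is a simplified matcher, not a full robots.txt parser. It assumes the common matching behaviour where a rule is compared against the path plus query string from the start, `*` matches any run of characters, and a trailing `$` anchors the end. It ignores User-agent groups and Allow rules, and the function names are made up for illustration.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(pattern):
    """Translate one Disallow pattern into a compiled regex.

    '*' becomes "any run of characters", a trailing '$' anchors the end,
    and everything else is treated literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(url, disallow_rules):
    """Return True if any Disallow pattern matches the URL's path and query."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match only matches at the start of the string, like a robots.txt prefix rule
    return any(rule_to_regex(rule).match(target) for rule in disallow_rules if rule)


url = "http://www.domain.com/news/?tag=choice"
example_rules = ["/?tag=", "/?subcats=", "/*?features_hash="]

print(is_disallowed(url, example_rules))    # False: /?tag= only matches at the site root
print(is_disallowed(url, ["/*?tag="]))      # True
print(is_disallowed(url, ["/news/?tag="]))  # True
```

Because dynamically created tag pages share the same ?tag= parameter, a single wildcard rule covers new ones automatically. It is also worth checking a few URLs you do want crawled, such as http://www.domain.com/news/, to make sure they come back False.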