Should I disallow all URL query strings/parameters in Robots.txt?
-
Webmaster Tools correctly identifies the query strings/parameters used in my URLs, but still reports duplicate title tags and meta descriptions for the original URL and the versions with parameters. For example, Webmaster Tools would report duplicates for the following URLs, despite it correctly identifying the "cat_id" and "kw" parameters:
/Mulligan-Practitioner-CD-ROM
/Mulligan-Practitioner-CD-ROM?cat_id=87
/Mulligan-Practitioner-CD-ROM?kw=CROM
Additionally, these pages have self-referential canonical tags, so I would think I'd be covered, but I recently read that another Mozzer saw a great improvement after disallowing all query/parameter URLs, despite Webmaster Tools not reporting any errors.
As I see it, I have two options:
- Manually tell Google that these parameters have no effect on page content via the URL Parameters section in Webmaster Tools (in case Google is unable to automatically detect this, and I am being penalized as a result).
- Add "Disallow: *?" to hide all query/parameter URLs from Google. My concern here is that most backlinks include the parameters, and in some cases these parameter URLs outrank the original.
Any thoughts?
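For reference, option 2 expressed as an actual robots.txt file would look like the sketch below. Google documents the leading-slash wildcard form ("/*?") rather than the bare "*?"; it is worth verifying the rule against your own URLs in Webmaster Tools' robots.txt tester before deploying:

```
User-agent: *
# Option 2 above: block every URL that contains a query string
Disallow: /*?
```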
-
Correct. They won't be indexed but are still followed.
-
The statement was in a response to a question I asked earlier.
"I was having an issue like this where moz was showing a lot more duplicate content than webmaster tools was, actually webmaster tools showed none, but I was being penalized. I realized this when I added an exclusion to robots.txt to exclude any query strings on my site. After I did this I saw my rankings shoot through the roof."
Thanks for the info. I did edit the settings in the URL parameters section to tell Google that these parameters do not change the page content, so it should now index only one representative URL. My only concern was that the kw (keyword) parameter does change page content for search result pages, but I just read that Matt Cutts encourages disallowing those pages anyway.
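As an illustration, a narrower rule limited to that case might look like the sketch below, assuming the kw parameter only ever appears on those search-result URLs:

```
User-agent: *
# Block only internal search results driven by the kw parameter;
# cat_id URLs stay crawlable and are consolidated via the URL Parameters settings
Disallow: /*?*kw=
```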
Just to verify, disallowing those pages with parameters won't affect the "link juice" passed from external links?
-
Hi there
I recently answered a similar question in the Q+A that references resources to help you help Google understand and categorize these parameters. You can read that here.
That being said, blocking these parameters in your robots.txt will not affect your rankings, especially if those parameter/query-string URLs are properly canonicalized to the correct product page.
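For illustration, "properly canonicalized" here means each parameter variation points at the clean product URL; a minimal sketch using the asker's example path (domain hypothetical):

```
<!-- Served identically on /Mulligan-Practitioner-CD-ROM, ?cat_id=87, and ?kw=CROM -->
<link rel="canonical" href="https://www.example.com/Mulligan-Practitioner-CD-ROM">
```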
With that in mind, I would make sure you understand the resources above and your options - you know your users and website better than anyone - so test on a few pages to see what happens and go from there.
Hope this helps! Good luck!
-
"I recently read that another Mozzer saw a great improvement after disallowing all query/parameter URLs" - do you have a link for this?
Canonicals should be enough, but Google does mess up, and the more clues you can give them, the better.
You can also manually tell Google what the parameters mean (if you check your parameters page in Search Console now, you should see all of the parameters they've detected for you - you can just change their meaning).
I don't see any harm in disallowing parameters via robots.txt. They will still be crawled and internal links followed, just not indexed in SERPs.
Related Questions
-
Organization Schema/Microformat for a content/brand website | Travel
Hi, One of our clients has a website specific to a place, e.g. California Tourism, on which they publish local information related to tourism, blogs & other useful content. I want to understand how useful it is to publish Organization schema on such a website mentioning the actual organization, which in this case is a travel agency? Or would any other schema fit better for such websites?
Intermediate & Advanced SEO | ds9.tech
-
Site-wide Canonical Rewrite Rule for Multiple Currency URL Parameters?
Hi Guys, I am currently working with an eCommerce site which has site-wide duplicate content caused by currency URL parameter variations. Example: https://www.marcb.com/ https://www.marcb.com/?setCurrencyId=3 https://www.marcb.com/?setCurrencyId=2 https://www.marcb.com/?setCurrencyId=1 My initial thought is to create a bunch of canonical tags to pass link equity on to the core URL version. However, I was wondering whether there is a rule that could be implemented within the .htaccess file to apply the canonical site-wide without it being so labour-intensive. I also noticed that these URLs are being indexed in Google, so would it be worth setting a site-wide noindex on these variations as well? Thanks
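For what it's worth, one way to approximate a site-wide canonical from .htaccess is an HTTP Link header, which Google treats like an in-page canonical tag. A sketch, assuming Apache 2.4 with mod_rewrite and mod_headers available - test on a handful of URLs before rolling it out site-wide:

```
RewriteEngine On
# When setCurrencyId is present, stash the clean canonical URL in an env var
RewriteCond %{QUERY_STRING} (^|&)setCurrencyId= [NC]
RewriteRule ^(.*)$ - [E=CANONICAL:https://www.marcb.com/$1]

# Emit it as an HTTP Link header, but only on those requests
Header set Link "<%{CANONICAL}e>; rel=\"canonical\"" env=CANONICAL
```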
Intermediate & Advanced SEO | NickG-123
-
Disallow URLs ENDING with certain values in robots.txt?
Is there any way to disallow URLs ending in a certain value? For example, if I have the following product page URL: http://website.com/category/product1, and I want to disallow /category/product1/review, /category/product2/review, etc. without disallowing the product pages themselves, is there any shortcut to do this, or must I disallow each review page individually?
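A sketch of one shortcut, relying on the "$" end-of-URL anchor that Google's robots.txt parser supports:

```
User-agent: *
# "$" anchors the match to the end of the URL, so /category/product1 itself
# stays crawlable while its /review sub-page is blocked
Disallow: /*/review$
```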
Intermediate & Advanced SEO | jmorehouse
-
Linking to URLs With Hash (#) in Them
How does link juice flow when linking to URLs with a hash (#) in them? If I link to this page, which generates a pop-over on my homepage that gives info about my special offer, where will the link juice go? homepage.com/#specialoffer Will the link juice go to the homepage? Will it go nowhere? Will it go to the hash URL above? I'd like to publish an annual/evergreen sort of offer that will generate lots of links, and instead of driving those links to homepage.com/offer, I was hoping to get that link juice to flow to the homepage, or maybe even a product page, instead, just updating the pop-over information each year as the offer changes. I've seen competitors do it this way but wanted to see what the community here thinks about linking to URLs with a hash in them. Could this also be a use case for using hashes in URLs for tracking purposes?
Intermediate & Advanced SEO | MiguelSalcido
-
Change url structure and keeping the social media likes/shares
Hi guys, We're thinking of changing the URL structure of the tutorials (we call it knowledgebase) section on our website. We want to make the URLs shorter so they sit closer to the root domain. For convenience, we'll call them the old page (www.domain.com/profiles/profile_id/kb/article_title) and the new page (www.domain.com/kb/article_title). What I'm looking to do is change the URL structure but keep the likes/shares we got from Facebook. I thought of two ways to do it and would love to hear which one the community members think is better. 1. Use rel=canonical. I thought we might add a rel=canonical to the new page and a "noindex" tag to the old page. That way, users will still be able to reach the old page, but the link juice will flow to the new page, the old pages will disappear from the Google SERP, and the new pages will start to appear. I understand it will be a pretty long process, but that's the only way the likes will stay. 2. Play with the og:url property. Do the 301 redirect to the new page, but change the og:url property inside that page to the old page URL. It's a bit more tricky but might work. What do you think? Which way is better, or maybe there is a better way I'm not familiar with yet? Thanks so much for your help! Shaqd
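For reference, option 2 boils down to head markup like this on the new page; a sketch using the question's placeholder URLs (note that Facebook's crawler must be able to fetch whatever og:url it is handed):

```
<!-- New page: www.domain.com/kb/article_title -->
<link rel="canonical" href="http://www.domain.com/kb/article_title">
<!-- og:url deliberately left on the old address so existing Facebook
     like/share counts keep accruing to it (option 2 in the question) -->
<meta property="og:url" content="http://www.domain.com/profiles/profile_id/kb/article_title">
```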
Intermediate & Advanced SEO | ShaqD
-
"noindex, follow" or "robots.txt" for thin content pages
Does anyone have any testing evidence on what is better to use for pages with thin content that are nevertheless important pages to keep on a website? I am referring to content shared across multiple websites (such as e-commerce, real estate, etc.). Imagine a website with 300 high-quality pages indexed and 5,000 thin product-type pages - pages that would not generate relevant search traffic. The question is: does the interlinking value achieved by "noindex, follow" outweigh the negative of Google having to crawl all those "noindex" pages? With robots.txt, Google's crawling focuses on just the important pages that are indexed, which may give rankings a boost. Any experiments with insight into this would be great. I do get the story about "make the pages unique", "get customer reviews and comments", etc., but the above question is the important one here.
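For clarity, the noindex option is a single tag in the head of each thin page, as in the minimal sketch below; the robots.txt route would instead be a Disallow rule covering those 5,000 URLs, which stops Google from fetching them at all, so it never sees their content or the links on them:

```
<!-- Crawled and its links followed, but kept out of the index -->
<meta name="robots" content="noindex, follow">
```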
Intermediate & Advanced SEO | khi5
-
Soft 404s from pages blocked by robots.txt -- cause for concern?
We're seeing soft 404 errors appear in our Google Webmaster Tools account on pages that are blocked by robots.txt (our search result pages). Should we be concerned? Is there anything we can do about this?
Intermediate & Advanced SEO | nicole.healthline
-
Block an entire subdomain with robots.txt?
Is it possible to block an entire subdomain with robots.txt? I write for a blog that has its root domain as well as a subdomain pointing to the exact same IP. Getting rid of the subdomain is not an option, so I'd like to explore other ways to avoid duplicate content. Any ideas?
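For what it's worth, robots.txt is fetched per hostname, so a subdomain can serve its own file even when both hosts resolve to the same IP - assuming the server can vary the response by Host header. A sketch (hostname hypothetical):

```
# Served only at http://blog.example.com/robots.txt;
# the root domain keeps its own, normal robots.txt
User-agent: *
Disallow: /
```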
Intermediate & Advanced SEO | kylesuss