Can I Disallow Faceted Nav URLs - Robots.txt
-
I have been disallowing /*? So I know that works without affecting crawling. I am wondering if I can disallow the faceted nav urls.
So disallow: /category.html/? /category2.html/? /category3.html/*?
To prevent the price faceted url from being cached:
/category.html?price=1%2C1000
and
/category.html?price=1%2C1000&product_material=88Thanks!
-
If you can no-index , follow all but the default, then you will send link juice to the pages but it will return the link juice because it is follow, but they will not index because they are no-index.
If you use robots, then it can not read the page to follow the links.
-
Hey Tyler! haven't seen you on SEOmoz in a while. Hope you are good!
Check to see if this would make sense for you. GWT > Site Configuration > URL Perameters. It says "Only use this feature if you feel confident about how parameters work for your site. Telling Googlebot to exclude URLs with certain parameters could result in large numbers of your pages disappearing from our index."
-
If I can, then I disallow hundreds of pages that are duplicate content and should not be crawled.
If I don't then I send link juice to urls that I don't want seen.
This is a good answer though, thanks. Any other thoughts?
-
You can, but then you have links passing link juice to non followed pages. it would be better if you used canonical. even better would be to add no-index, follow meta tag when non canonical page is displayed, but this requres some codeing.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I'm struggling to understand (and fix) why I'm getting a 404 error. The URL includes this "%5Bnull%20id=43484%5D" but I cannot find that anywhere in the referring URL. Does anyone know why please? Thanks
Can you help with how to fix this 404 error please? It appears that I have a redirect from one page to the other, although the referring page URL works, but it appears to be linking to another URL with this code at the end of the the URL - %5Bnull%20id=43484%5D that I'm struggling to find and fix. Thanks
Technical SEO | | Nichole.wynter20200 -
One server, two domains - robots.txt allow for one domain but not other?
Hello, I would like to create a single server with two domains pointing to it. Ex: domain1.com -> myserver.com/ domain2.com -> myserver.com/subfolder. The goal is to create two separate sites on one server. I would like the second domain ( /subfolder) to be fully indexed / SEO friendly and have the robots txt file allow search bots to crawl. However, the first domain (server root) I would like to keep non-indexed, and the robots.txt file disallowing any bots / indexing. Does anyone have any suggestions for the best way to tackle this one? Thanks!
Technical SEO | | Dave1000 -
URL Understanding -
Hello everyone! Can anyone help me understanding this url? Product.asp?PID=1236 cheers
Technical SEO | | PremioOscar0 -
What is the best practice to seperate different locations and languages in an URL? At the moment the URL is www.abc.com/ch/de. Is there a better way to structure the URL from an SEO perspective?
I am looking for a solution for using a new URL structure without using www.abc.com**/ch/de** in the URL to deliver the right languages in specific countries where more than one language are spoken commonly. I am looking forward to your ideas!
Technical SEO | | eviom0 -
Is there any value in having a blank robots.txt file?
I've read an audit where the writer recommended creating and uploading a blank robots.txt file, there was no current file in place. Is there any merit in having a blank robots.txt file? What is the minimum you would include in a basic robots.txt file?
Technical SEO | | NicDale0 -
How can I make Google Webmaster Tools see the robots.txt file when I am doing a .htacces redirec?
We are moving a site to a new domain. I have setup an .htaccess file and it is working fine. My problem is that Google Webmaster tools now says it cannot access the robots.txt file on the old site. How can I make it still see the robots.txt file when the .htaccess is doing a full site redirect? .htaccess currently has: Options +FollowSymLinks -MultiViews
Technical SEO | | RalphinAZ
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www.)?michaelswilderhr.com$ [NC]
RewriteRule ^ http://www.s2esolutions.com/ [R=301,L] Google webmaster tools is reporting: Over the last 24 hours, Googlebot encountered 1 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%.0 -
301 an old URL with a ? in the URL?
I am redoing a site and the URL's are changing structure. The client's site was in magento and in the store they would get two URLs, for example: /store/categoryname/productname and /store/categoryname/productname?SID=dslkajsfdoiu947598whouieht983hg98 Do I have to 301 redirect both of these URL's to their new counterpart? Both go to the same content but magento seemed to add these SIDs into the navigation and Google has both versions in the index.
Technical SEO | | DanDeceuster0