Block session id URLs with robots.txt
-
Hi,
I would like to block all URLs with the parameter '?filter=' from being crawled by adding a rule to robots.txt.
Which directive should I use:
User-agent: *
Disallow: ?filter=
or
User-agent: *
Disallow: /?filter=
In other words, is the forward slash at the beginning of the disallow directive necessary?
Thanks!
-
Hi Martijn,
Thanks for the answer. Is the forward slash at the beginning necessary?
In the robots.txt from Zalando, for example, you can see that they don't use it for a lot of filters.
-
Uhh, that's not what the requester is looking for, and it could actually cause tons of problems if you applied it to a site you're not familiar with. I would always go with the most narrowly scoped robots.txt rule that you can, and in this case I would go with: /?filter=
-
Hi,
The following should suffice, as it will block any URL with a "?" in it:
User-agent: *
Disallow: /*?
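To compare the rules suggested in this thread, here is a minimal sketch of robots.txt path matching. It assumes the matching behaviour described in RFC 9309 and in Google's robots.txt documentation: a rule is matched from the start of the URL path (including the query string), `*` matches any run of characters, and a trailing `$` anchors the end. The function name `rule_matches` and the sample paths are made up for illustration, and individual crawlers may treat edge cases such as a rule without a leading slash differently.

```python
import re


def rule_matches(pattern: str, url_path: str) -> bool:
    """Return True if a robots.txt path pattern matches url_path.

    url_path is the path plus query string, e.g. "/shoes?filter=red".
    Hypothetical helper: a simplified model of wildcard matching,
    not any crawler's actual implementation.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for every "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the start of the string (prefix matching).
    return re.match(regex, url_path) is not None


if __name__ == "__main__":
    rules = ["?filter=", "/?filter=", "/*?filter=", "/*?"]
    paths = ["/?filter=red", "/shoes?filter=red", "/shoes?sort=price", "/shoes"]
    for rule in rules:
        for path in paths:
            print(f"{rule!r:14} {path!r:22} {rule_matches(rule, path)}")
```

Under these assumptions, `/?filter=` only covers the homepage with a filter parameter, `/*?filter=` covers any path where `filter` is the first parameter, and `/*?` covers every URL that has a query string at all, which is much broader than what was asked for. Note that none of these patterns catch a URL like `/shoes?sort=price&filter=red`, where the parameter follows an `&`.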