Best practice for disallowing URLs with robots.txt
-
Hi Everybody,
We are currently trying to tidy up the crawl errors that appear when we crawl the site. At first glance we were very worried, to say the least: 17,000+ errors. But on closer inspection of the report, we found that the majority of them were caused by bad URLs featuring:
- Currency - For example: "directory/currency/switch/currency/GBP/uenc/aHR0cDovL2NlbnR1cnlzYWZldHkuY29tL3dvcmt3ZWFyP3ByaWNlPTUwLSZzdGFuZGFyZHM9NzEx/"
- Color - For example: "?color=91"
- Price - For example: "?price=650-700"
- Order - For example: "?dir=desc&order=most_popular"
- Page - For example: "?p=1&standards=704"
- Login - For example: "customer/account/login/referer/aHR0cDovL2NlbnR1cnlzYWZldHkuY29tL2NhdGFsb2cvcHJvZHVjdC92aWV3L2lkLzQ1ODczLyNyZXZpZXctZm9ybQ,,/"
As a novice with robots.txt, my question is: what is the best practice for stopping URLs like these from being crawled?
Any advice would be appreciated!
-
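If you are looking to disallow URL parameters, you could use something like the following as a convention.
Disallow: /*? blocks every URL that contains a query string. If you want to be more precise, target specific parameters with one rule each, for example Disallow: /*?dir=, Disallow: /*?order= and Disallow: /*?p= (a plain Disallow: /? would only match a query string attached directly to the homepage). There have been a few Moz questions of this type over the last few years that are worth a look if you do decide to block the parameters.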
Also try to ensure that the product pages you have listed are well canonicalised and point to the original product. A good review on how to do this can be found here. In most cases this will be enough to remove any indexation/duplicate issues.
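As a rough illustration of the parameter rules discussed in this reply, here is a minimal Python sketch that prints a robots.txt block. It uses the `*` wildcard (supported by Google and Bing) so that each rule matches a query string on any path, not only on the homepage. The helper name `build_robots_txt` is made up for this example, and the parameter names and path prefixes are assumptions taken from the URLs in the question, including the assumption that those paths start at the site root.

```python
def build_robots_txt(params, path_prefixes, user_agent="*"):
    """Return robots.txt text that disallows the given path prefixes and query parameters."""
    lines = [f"User-agent: {user_agent}"]
    # Plain prefix rules, e.g. the login and currency-switch URLs.
    for prefix in path_prefixes:
        lines.append(f"Disallow: {prefix}")
    # Two rules per parameter: one for the first parameter in the query
    # string ("?name=") and one for any later position ("&name=").
    for name in params:
        lines.append(f"Disallow: /*?{name}=")
        lines.append(f"Disallow: /*&{name}=")
    return "\n".join(lines) + "\n"


# Assumed values, based on the examples in the question.
print(build_robots_txt(
    params=["color", "price", "dir", "order", "p"],
    path_prefixes=["/directory/currency/switch/", "/customer/account/login/"],
))
```

Writing both the `?name=` and `&name=` forms is a design choice: a looser rule such as `Disallow: /*color=` would also catch unrelated parameters whose names merely end in "color".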
-
First, I assume you have Webmaster Tools (Search Console) set up?
It has a robots.txt tester tool where you can try out different rules to make sure you get the syntax right. For example, color would be blocked by Disallow: /*?color= (rules are prefix matches, so no trailing * is needed, and the leading /* lets the rule match on any path), and the other parameters follow more or less the same format.
If you are confused, I highly recommend reading through Moz's robots.txt best practices guide before you make any changes. Be sure to test everything in Webmaster Tools (Search Console) > robots.txt tester.
Let me know if you run into any problems.
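If you want a quick local sanity check before using the tester, the wildcard matching can be approximated in a few lines of Python. This is only a sketch under stated assumptions, not a replacement for the Search Console tester: it handles `*` and a trailing `$` in Disallow rules, but ignores Allow rules, rule precedence and percent-encoding. The rule list and the `/workwear` category path are hypothetical examples.

```python
import re

# Hypothetical rule set: "?name=" and "&name=" variants for each parameter,
# plus one plain path prefix (assumed to start at the site root).
PARAMS = ["color", "price", "dir", "order", "p"]
DISALLOW_RULES = [f"/*{sep}{name}=" for name in PARAMS for sep in "?&"]
DISALLOW_RULES.append("/customer/account/login/")


def rule_to_regex(rule):
    """Translate a robots.txt path rule: '*' is any run of characters, a trailing '$' is end of URL."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, rules=DISALLOW_RULES):
    # re.match anchors at the start of the path, mirroring robots.txt prefix matching.
    return any(rule_to_regex(rule).match(path) for rule in rules)


for path in [
    "/workwear",
    "/workwear?color=91",
    "/workwear?dir=desc&order=most_popular",
    "/workwear?p=1&standards=704",
    "/workwear?bgcolor=91",  # made-up parameter, should stay crawlable
    "/customer/account/login/referer/abc/",
]:
    print(path, "->", "blocked" if is_blocked(path) else "allowed")
```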