Robots.txt, does it need preceding directory structure?
-
Does a robots.txt rule need the entire preceding path in order to match?
For example:
I know that if I add Disallow: /fish to robots.txt it will block:
/fish
/fish.html
/fish/salmon.html
/fishheads
/fishheads/yummy.html
/fish.php?id=anything
But would it also block these?
en/fish
en/fish.html
en/fish/salmon.html
en/fishheads
en/fishheads/yummy.html
en/fish.php?id=anything
(taken from Robots.txt Specifications)
I'm hoping it actually won't match; that way, writing this particular robots.txt will be much easier!
Basically, I want to block many URLs that contain BTS-, such as:
http://www.example.com/BTS-something
http://www.example.com/BTS-somethingelse
http://www.example.com/BTS-thingybob
But I have other pages that I do not want blocked, in subfolders, that also contain BTS-, such as:
http://www.example.com/somesubfolder/BTS-thingy
http://www.example.com/anothersubfolder/BTS-otherthingy
Thanks for listening
-
Yes, this is what I thought, but I wanted some second opinions.
I wouldn't actually need a wildcard after BTS-, though, as leaving the rule open-ended is the same as using a trailing wildcard:
/fish* .......... Equivalent to "/fish" -- the trailing wildcard is ignored.
https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
Thanks for the link, I'll take a look.
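To illustrate why the trailing wildcard makes no difference, here is a minimal Python sketch. It assumes the matching behaviour described in the specification quoted above (rules match from the start of the path, `*` matches any run of characters, a trailing `$` anchors the end). The helper names `rule_to_regex` and `rule_matches` are made up for this example and are not part of any crawler's actual code.

```python
import re

def rule_to_regex(rule):
    """Translate a robots.txt path rule into a regex (simplified sketch)."""
    anchored = rule.endswith("$")      # a trailing "$" means "path must end here"
    if anchored:
        rule = rule[:-1]
    # "*" matches any sequence of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def rule_matches(rule, path):
    """True if the rule matches the path, starting from the first character."""
    return rule_to_regex(rule).match(path) is not None

# Hypothetical sample paths, written with a leading slash.
paths = ["/fish", "/fish.html", "/fishheads/yummy.html",
         "/fish.php?id=anything", "/en/fish"]

for path in paths:
    # Both columns should always agree: "/fish" and "/fish*" are equivalent.
    print(path, rule_matches("/fish", path), rule_matches("/fish*", path))
```

Under these assumptions the two rules give the same answer for every path, and neither matches /en/fish, because matching is anchored at the start of the path.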
-
You're right that **Disallow: /fish** in the robots.txt file blocks all of those initial links. It won't match the URLs inside the /en/ folder, though, because rules are matched from the start of the path; to block those as well, you would need to add Disallow: /en/fish
You could use a wildcard in the robots.txt file to do something along the lines of Disallow: /BTS-*
This _should_ work, but it's always worth checking with a tool to make sure it's all implemented correctly. Distilled did a post a while back about a JS bookmarklet that lets you test whether a page is blocked by robots.txt, which can be found here - http://www.distilled.net/blog/seo/js-bookmarklet-for-checking-if-a-page-is-blocked-by-robots-txt/
In addition to this, you could also use the 'blocked URLs' tool in GWT to see if the pages are successfully blocked once you've implemented the code.
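As a rough offline sanity check before using those tools, the BTS- case can be sketched in a few lines of Python. This is a simplified model, not how any particular crawler is implemented: it assumes plain prefix matching from the start of the path and ignores Allow lines, wildcards, and user-agent groups. The function name `is_blocked` and the `DISALLOW_RULES` list are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical rule set: a single open-ended Disallow rule.
DISALLOW_RULES = ["/BTS-"]

def is_blocked(url, rules=DISALLOW_RULES):
    """Simplified check: blocked if the path (plus query) starts with any rule."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return any(path.startswith(rule) for rule in rules)

urls = [
    "http://www.example.com/BTS-something",
    "http://www.example.com/BTS-somethingelse",
    "http://www.example.com/somesubfolder/BTS-thingy",
    "http://www.example.com/anothersubfolder/BTS-otherthingy",
]

for url in urls:
    print("blocked" if is_blocked(url) else "allowed", url)
```

In this model the root-level BTS- URLs are blocked while the ones in subfolders are left alone, which is the behaviour the question is hoping for. The real tools mentioned above remain the authoritative check.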
Hope this helps!