Robots.txt, does it need preceding directory structure?
-
Do you need the entire preceding path in a robots.txt rule for it to match?
For example:
I know that if I add Disallow: /fish to robots.txt, it will block:
/fish
/fish.html
/fish/salmon.html
/fishheads
/fishheads/yummy.html
/fish.php?id=anything
But would it also block these?
en/fish
en/fish.html
en/fish/salmon.html
en/fishheads
en/fishheads/yummy.html
en/fish.php?id=anything
(Examples taken from the Robots.txt Specifications.)
I'm hoping it won't actually match; that way, writing this particular robots.txt will be much easier!
Basically, I want to block many URLs that have BTS- at the root level, such as:
http://www.example.com/BTS-something
http://www.example.com/BTS-somethingelse
http://www.example.com/BTS-thingybob
But I have other pages that I do not want blocked, in subfolders, that also have BTS- in them, such as:
http://www.example.com/somesubfolder/BTS-thingy
http://www.example.com/anothersubfolder/BTS-otherthingy
Thanks for listening.
-
Yes, this is what I thought, but I wanted some second opinions.
I wouldn't actually need a wildcard after BTS- though, as leaving the rule open-ended is the same as using a wildcard:
/fish* is equivalent to "/fish"; the trailing wildcard is ignored.
https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
Thanks for the link, I'll take a look.
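The matching behaviour described in this thread can be sketched in a few lines of Python. This is only an illustration of the prefix, `*` and `$` rules from Google's specification, not Google's actual implementation; `rule_matches` is a hypothetical helper name.

```python
import re


def rule_matches(rule: str, path: str) -> bool:
    """Return True if a Disallow rule matches a URL path.

    Matching always starts at the beginning of the path. A '*' matches
    any run of characters and a trailing '$' anchors the end of the path.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the start of the string, i.e. a prefix match.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    paths = [
        "/BTS-something",
        "/BTS-thingybob",
        "/somesubfolder/BTS-thingy",
        "/anothersubfolder/BTS-otherthingy",
    ]
    for path in paths:
        # "/BTS-" and "/BTS-*" give the same answer for every path.
        print(path, rule_matches("/BTS-", path), rule_matches("/BTS-*", path))
```

Because the rule is compared from the first character of the path, /BTS- only catches the root-level URLs and leaves the subfolder pages alone.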
-
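You're right that **Disallow: /fish** in the robots.txt file blocks all of those initial links. Rules are matched from the start of the URL path, so it won't touch anything under /en/; if you also wanted to block the versions inside the /en/ folder, you would need a separate rule: Disallow: /en/fish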
You could use a wildcard in the robots.txt file to do something along the lines of Disallow: /BTS-*
This _should_ work, but it's always worth checking with a tool to make sure it's implemented correctly. Distilled published a post a while back about a JS bookmarklet that lets you test whether a page is blocked by robots.txt, which can be found here: http://www.distilled.net/blog/seo/js-bookmarklet-for-checking-if-a-page-is-blocked-by-robots-txt/
In addition to this, you could use the 'Blocked URLs' tool in Google Webmaster Tools (GWT) to confirm that the pages are blocked once you've implemented the rule.
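For a quick local check before deploying, Python's standard-library `urllib.robotparser` can also be used. A minimal sketch, assuming the rule is written without the trailing wildcard (as far as I know that parser does plain prefix matching and does not treat `*` inside a path as a wildcard) and using the example URLs from the question; `build_parser` and `blocked_urls` are hypothetical helper names.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content for this check (no trailing wildcard).
ROBOTS_TXT = """\
User-agent: *
Disallow: /BTS-
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def blocked_urls(parser: RobotFileParser, urls, user_agent: str = "*"):
    """Return the URLs the given user agent is not allowed to fetch."""
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    urls = [
        "http://www.example.com/BTS-something",
        "http://www.example.com/BTS-somethingelse",
        "http://www.example.com/BTS-thingybob",
        "http://www.example.com/somesubfolder/BTS-thingy",
        "http://www.example.com/anothersubfolder/BTS-otherthingy",
    ]
    for url in blocked_urls(build_parser(ROBOTS_TXT), urls):
        print("blocked:", url)
```

This only shows how one parser reads the file; individual crawlers can differ, so the search engine's own testing tool remains the final check.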
Hope this helps!