Robots.txt, does it need preceding directory structure?
-
Do you need the entire preceding path in robots.txt for it to match?
For example, I know that if I add Disallow: /fish to robots.txt, it will block:
/fish
/fish.html
/fish/salmon.html
/fishheads
/fishheads/yummy.html
/fish.php?id=anything
But would it also block:
en/fish
en/fish.html
en/fish/salmon.html
en/fishheads
en/fishheads/yummy.html
en/fish.php?id=anything
(Examples taken from the Robots.txt Specifications.) I'm hoping it actually won't match; that way, writing this particular robots.txt will be much easier!
Basically, I want to block many URLs that contain BTS-, such as:
http://www.example.com/BTS-something
http://www.example.com/BTS-somethingelse
http://www.example.com/BTS-thingybob
But I have other pages that I do not want blocked, in subfolders that also contain BTS-, such as:
http://www.example.com/somesubfolder/BTS-thingy
http://www.example.com/anothersubfolder/BTS-otherthingy
Thanks for listening.
-
Yes, this is what I thought, but I wanted a second opinion.
I wouldn't actually need a wildcard after BTS-, though, as leaving the rule open-ended is the same as using a trailing wildcard:
/fish*: Equivalent to "/fish" -- the trailing wildcard is ignored.
https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
Thanks for the link, I'll take a look.
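To make the matching behaviour concrete, here is a minimal Python sketch of how a Google-style robots.txt path pattern could be compared against a URL path. The function name `rule_matches` is made up for illustration, and the sketch assumes the behaviour described in the specification linked above: the pattern is anchored at the start of the path, `*` matches any run of characters, and a trailing `$` anchors the end. It is not how any particular crawler is actually implemented.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Hypothetical helper. Assumes Google-style matching: the pattern is
    anchored at the start of the path, '*' matches any run of characters,
    and a trailing '$' anchors the end of the path.
    """
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored_end:
        regex += r"\Z"
    # re.match only matches at the start of the string, which gives the
    # "must match from the beginning of the path" behaviour.
    return re.match(regex, path) is not None


# A rule with and without a trailing wildcard matches the same paths.
print(rule_matches("/fish", "/fishheads/yummy.html"))
print(rule_matches("/fish*", "/fishheads/yummy.html"))

# The rule is anchored at the start, so a preceding folder prevents a match.
print(rule_matches("/fish", "/en/fish.html"))
print(rule_matches("/BTS-", "/somesubfolder/BTS-thingy"))
```

Under these assumptions the first two calls agree with each other, and the last two report no match because the path does not begin with the pattern.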
-
You're right that **Disallow: /fish** in the robots.txt file blocks all of those initial links. Rules are matched from the start of the URL path, so it would not block the en/ versions; if you wanted to block the same URLs inside the /en/ folder, you would need Disallow: /en/fish
You could use a wildcard in the robots.txt file to do something along the lines of Disallow: /BTS-*
This _should_ work, but it's always worth checking with a tool to make sure it's all implemented correctly. Distilled published a post a while back about a JS bookmarklet that lets you test whether a page is blocked by robots.txt, which can be found here: http://www.distilled.net/blog/seo/js-bookmarklet-for-checking-if-a-page-is-blocked-by-robots-txt/
In addition to this, you could also use the 'Blocked URLs' tool in GWT (Google Webmaster Tools) to check that the pages are successfully blocked once you've implemented the rule.
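If you'd rather sanity-check the rule locally before uploading it, a rough sketch using Python's standard-library `urllib.robotparser` is below. The `is_blocked` helper and the `ROBOTS_TXT` contents are my own illustration, not anything from the tools above. As far as I know, this parser does plain prefix matching and does not interpret `*` inside paths, so the rule is written as `Disallow: /BTS-` without the trailing wildcard; it is not guaranteed to behave exactly like Googlebot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for the BTS- case discussed above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /BTS-
"""


def is_blocked(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given robots.txt text disallows `url` for `agent`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so nothing is fetched over the network.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


urls = [
    "http://www.example.com/BTS-something",
    "http://www.example.com/BTS-somethingelse",
    "http://www.example.com/somesubfolder/BTS-thingy",
    "http://www.example.com/anothersubfolder/BTS-otherthingy",
]
for url in urls:
    print(url, "blocked" if is_blocked(url) else "allowed")
```

The root-level BTS- URLs should come back as blocked and the subfolder ones as allowed, but it's still worth confirming the live file with the tools mentioned above.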
Hope this helps!