Application & understanding of robots.txt
-
Hello Moz World!
I have been reading up on robots.txt files, and I understand the basics. I am looking for a deeper understanding of when to deploy particular tags, and when a page should be disallowed because it will affect SEO. I have been working with a software company that has a News & Events page which I don't think should be indexed. It changes every week, and is only relevant to potential customers who want to book a demo or attend an event, not so much to search engines. My initial thinking was that I should use a noindex/follow tag on that page, so the page would not be indexed, but all of its links would still be crawled.
I decided to look at some of our competitors' robots.txt files: Smartbear (https://smartbear.com/robots.txt), b2wsoftware (http://www.b2wsoftware.com/robots.txt) and labtech (http://www.labtechsoftware.com/robots.txt).
I am still confused about what type of tags I should use, and how to gauge which set of tags is best for certain pages. I figured a static page is pretty much always good to index and follow, as long as it's public, and that I should always include a sitemap file. But what about a dynamic page? What about pages that are out of date? Will this help with soft 404s?
This is a long one, but I appreciate all of the expert insight. Thanks ahead of time for all of the awesome responses.
Best Regards,
Will H.
-
Yup. Also, don't forget that robots.txt is just a "recommendation" for robots; nothing forces them to obey it.
Well-behaved crawlers such as Googlebot do respect Disallow rules when crawling, but when it comes to what gets indexed, Google basically decides for itself.
Also, if you block a folder so that its content won't be "accessed", a page in it can still be indexed if any link points to it, even a link from outside your domain. The content of the page won't be shown in the search results, but the URL can show up with a notice stating that the content is blocked by the site's robots.txt. Best of luck!
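To make the crawl-versus-index distinction concrete, here is a minimal Python sketch using the standard library's urllib.robotparser. The rules, the ExampleBot user agent and the example.com URLs are made up for illustration. It only shows what a crawler that chooses to honor robots.txt would skip; it says nothing about whether the URL can still end up indexed through external links.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not taken from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a crawler that honors robots.txt may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    # example.com is a placeholder host, assumed for this sketch.
    return parser.can_fetch(agent, "https://www.example.com" + path)


if __name__ == "__main__":
    print(allowed("/private/report.html"))  # blocked by the Disallow rule
    print(allowed("/news-and-events/"))     # no rule matches, so crawling is allowed
```

A Disallow rule like this keeps compliant crawlers out of the folder, which is exactly why they can never see a noindex tag placed on a page inside it.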
-
Great Advice Yossi & Chris. Thanks for taking the time to reply. I will have to dig into the Google Guidelines for additional information, but both of your points are valid. I think I was looking at robots.txt the wrong way. Thanks Again Guys!
-
I completely agree with Yossi here; no need to go blocking that page at all.
I can't really add any further value to the points he has covered, but one other part of your question suggested that perhaps you're looking at this the wrong way (and it's very common, don't worry!). Rather than having your site stay as-is and just obscuring the bad parts of it from search engines, the thought process should really be about creating a great website instead.
If you're ever considering blocking a page from search engines, the first step should always be "why am I blocking this page(s); could I just fix the issue instead?".
For example, you asked if this might help with soft 404s. Rather than trying to find a way to hide these soft 404s, spend that time fixing them instead!
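If it helps to find them first: a soft 404 is a page that tells the visitor "not found" while the server still answers 200 OK. Below is a rough Python sketch of a check you could run over your own crawl data. The function name and the phrase list are assumptions made for this example, not part of any tool, so adjust the phrases to whatever your error template actually says.

```python
# Hypothetical phrases; tune them to the wording your own error template uses.
NOT_FOUND_PHRASES = ("page not found", "no longer available")


def looks_like_soft_404(status_code: int, body: str) -> bool:
    """Flag responses that say 'not found' to users but 200 OK to crawlers."""
    if status_code != 200:
        # A real 404 or 410 is the correct signal, so it is not a soft 404.
        return False
    text = body.lower()
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)


if __name__ == "__main__":
    print(looks_like_soft_404(200, "<h1>Page Not Found</h1>"))
    print(looks_like_soft_404(404, "<h1>Page Not Found</h1>"))
```

Once a page is flagged, the fix is on the server side: return a real 404 or 410 for content that is gone, or redirect to the page that replaced it.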
-
Hi Will
There are some concerns that you have which I do not understand.
Why do you want to block the News & Events page? If it has unique content and, on top of that, is updated regularly, you have no reason to block access to it. If it is "relevant to potential customers who want to book a demo", that's great. I would definitely keep it indexed and followed. Google explicitly states that you should not block access to a page if you simply want to de-index or remove it. If the page should not be publicly indexed, you should remove it or password-protect it (a Google suggestion).
About tags: I assume you are talking about meta tags, correct?
There is no need to use any kind of meta tag to tell search engines to index or follow a page; that is their default. You use a meta tag only when you want to stop them from taking certain actions.
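As a rough illustration of that default, here is a small Python sketch of how a crawler might read a robots meta tag. The class and function names are invented for this example, and real search engines support more directives than the three handled here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                if token.strip():
                    self.directives.add(token.strip().lower())


def robots_policy(html: str) -> dict:
    """Return what the page allows; with no robots meta tag, everything is allowed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    blocked_all = "none" in found  # "none" is shorthand for noindex, nofollow
    return {
        "index": not (blocked_all or "noindex" in found),
        "follow": not (blocked_all or "nofollow" in found),
    }


if __name__ == "__main__":
    print(robots_policy("<html><head><title>News</title></head></html>"))
    print(robots_policy('<meta name="robots" content="noindex, follow">'))
```

A page with no robots meta tag comes back as indexable and followable, so the tag only earns its place when you want to restrict something, such as the noindex/follow combination from the original question.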
Also, there is no difference between a static and a dynamic page when it comes to tag usage. There are no rules for that. A page can perfectly well stay static for years and still get indexed and rank very well. (But, well, we all know that updating the site is a ranking signal.)
If you believe that a certain page should be tagged "noindex", it should not be because it hasn't been updated in the last month or year. Just as an example: contact us pages, about us pages and terms of use pages. These are super static pages that in many cases probably won't be changed for years.
Best,
Yossi
Related Questions
-
Robots.txt was set to disallow for 14 days
We updated our website and accidentally overwrote our robots file with a version that prevented crawling ("Disallow: /"). We realized the issue 14 days later, after our organic visits began to drop significantly, and quickly replaced the robots file with the correct version so that crawling could begin again. Given the impact on our organic visits, we have a few questions, and any help would be greatly appreciated: Will the site get back to its original status/ranking? If so, how long would that take? Is there anything we can do to speed up the process? Thanks
Intermediate & Advanced SEO | | jc42540 -
How Can I Displace a Quora Q&A in a Google Featured Snippet?
Hello all. I'm looking for ideas for displacing a Quora Q&A as the featured snippet in Google search results. I rank organically for the target term (it's a branded term, "urban airship pricing") in results 1, 2, 3 and 4. The Quora Q&A ranks 5, but is still getting the featured snippet. The Quora question, which is from 2013, is negative: essentially "why does Urban Airship cost so much." It was posed and answered before we restructured pricing and added a free starter edition, so the information in the answer is incorrect. It's causing issues for our sales teams, there's a fair amount of volume around this term for us, and worst of all, it's making me mad 😉 I've considered the tactics listed below, but would love to know if anyone's done this, and what free or low-cost tactics work/where to focus efforts. Thanks in advance for help! -Jessica
Tactics I'm Considering (Are some or all worth doing? Better ideas?)
1. Create a pricing FAQ page on my website to try to give Google a short answer to a query related to pricing that it might feature instead of the Quora Q&A.
2. Get a lot of folks to downvote the Quora question (and upvote the short answer we added). Although I'm worried that "activity" on the question might actually make things worse, not better, in terms of its visibility.
3. Buy paid Google Adwords for the term so the featured snippet isn't quite so starkly featured (we were buying for this term, looking into why our ads aren't showing up at the moment).
4. Talk about pricing on sites like Product Hunt or others (other ideas?) to see if they'll rank highly enough to add more/better content to page 1 results.
5. Contact Quora and let them know that this outdated question is being pulled into a featured snippet and see if they'll do something about it (remove it, etc.).
6. Provide feedback to Google (using the link under the snippet) that "something is wrong" or "this isn't useful".
Intermediate & Advanced SEO | | jpoundstone0 -
Proxy Servers & SEO
Does putting a blog on a proxy server (then pointed at the main site) hurt SEO? I.e. can Google tell? And if they can, does it matter? My server people won't use PHP on their servers but we want a Wordpress blog. So their suggested solution is that they put the blog on a proxy server and point it at the ourdomain.com/blog subfolder on our site. So to all intents and purposes it's hosted in the same place. They assure me this is normal practice and point out that our (main site) images are already being sourced from a CDN. Obviously we'll deal with Google not seeing two separate versions of the same site. But apart from this, is there any negative effect we could suffer from in SEO terms?
Intermediate & Advanced SEO | | abisti20 -
Geo-Targeted Sub-Domains & Duplicate Content/Canonical
For background, the sub-domain structure here is inherited and committed to due to tech restrictions with some of our platforms. The brand I work with is splitting out their global site into regional sub-sites (not too relevant, but this is in order to display seasonal product in different hemispheres and to link to stores specific to the region). All sub-domains except EU will be geo-targeted to their relevant country. Regions and sub-domains for reference: AU - Australia, CA - Canada, CH - Switzerland, EU - All Euro zone countries, NZ - New Zealand, US - United States. This will be done with Wordpress multisite. The set-up allows us to publish content on one 'master' sub-site and then decide which other sub-sites to 'broadcast' to. Some content is specific to a sub-domain/region, so there is no issue with duplicates and we can set the sub-site version as canonical. However, some content will appear on all sub-domains: au.example.com/awesome-content/ nz.example.com/awesome-content/ Now the first question is, since these domains are geo-targeted, should I just have them all canonical to the version on that sub-domain? eg Or should I still signal the duplicate content with one canonical version? Essentially the top level example.com exists as a site only for publishing purposes: if a user lands on the top level example.com/awesome-content/ they are given a pop-up to select a region and redirected to the relevant sub-domain version. So I'm also unsure whether I want that content indexed at all. I could make the top level example.com versions of all content be the canonical that all others point to eg. and rely on geo-targeting to have the right links show in the right search locations. I hope that's kind of clear? Obviously I find it confusing and therefore hard to relay! Any feedback at all gratefully received. Cheers, Steve
Intermediate & Advanced SEO | | SteveHoney0 -
Menu Structure & SEO
Hi I have been trying to decide whether we need to change our menu structure http://www.key.co.uk/en/key/ We have a lot of subcategories which are not in the menu structure and for SEO I wonder whether its best to have menu drop downs, so if a customer hovers over one category, it will display all the subcategories within this. I am concerned that sub categories we are trying to rank are many levels away from the homepage e.g If you want to find leather office chairs from the homepage, you have to go to the 'More categories' link, then choose seating > office seating > leather office seating. Users need to do a lot of navigating before seeing what we offer. I would prefer if a user could see these options in the menu when they hover over it. Does anyone think this would help SEO or just customer journey? Thank you
Intermediate & Advanced SEO | | BeckyKey0 -
Heading Tags & Content Count
Hi everyone, I am looking into this page on our site: http://www.key.co.uk/en/key/sack-trucks Just comparing it against competitors in SEMRush, the tool shows a word count for this page of over 4089 words, compared with http://www.wickes.co.uk/Wickes-Green-General-Purpose-Sack-Truck-200kg/p/500302 which only has 2658. It has a lot more written content than our page, so where is this word count coming from? Also, looking at the same page on our site, Woorank suggests we have the word 'sack truck' in the h1 and title too many times. It's only there once, but is this showing because it's an exact match keyword? I'm just wondering if there is something wrong with the HTML or how the page is being crawled?
Intermediate & Advanced SEO | | BeckyKey0 -
"Starting Over" With A New Domain & 301 Redirect
Hello, SEO Gurus. A client of mine appears to have been hit on a non-manual/algorithm penalty. The penalty appears to be Penguin-like, and the client never received any message (not that that means it wasn't manual). Prior to my working with her, she engaged in all kinds of SEO fornication: spammy links on link farms, shoddy article marketing, blog comment spam -- you name it. There are simply too many tens of thousands of these links to have removed. I've done some disavowal, but again, so much of the link work is spam. She is about to launch a new site, and I am tempted to simply encourage her to buy a new domain and start over. She competes in a niche B2B sector, so it is not terribly competitive, and with solid content and link earning, I think she'd be ok. Here's my question: If we were to 301 the old website to the new one, would the flow of page rank outperform any penalty associated with the site? (The old domain only has a PR of 2). Anyone like my idea of starting over, rather than trying to "recover?" I thank you all in advance for your time and attention. I don't take it for granted.
Intermediate & Advanced SEO | | RCNOnlineMarketing0 -
Effect duration of robots.txt file.
In my web site there is a demo site that is also indexed in Google, but there is no need for it now. So I created a robots file and uploaded it to the server yesterday. In the demo folder there are some HTML files, and I want to remove all of them from Google. But Webmaster Tools is still showing:
User-agent: *
Disallow: /demo/
How long will this take to remove from Google? And is there any alternative way of doing that?
Intermediate & Advanced SEO | | innofidelity0