Application & understanding of robots.txt
-
Hello Moz World!
I have been reading up on robots.txt files, and I understand the basics. I am looking for a deeper understanding of when to deploy particular tags, and when a page should be disallowed because it will affect SEO. I have been working with a software company that has a News & Events page which I don't think should be indexed. It changes every week, and is only relevant to potential customers who want to book a demo or attend an event, not so much to search engines. My initial thinking was that I should use a noindex/follow tag on that page, so the page would not be indexed, but all of its links would still be crawled.
I decided to look at some of our competitors' robots.txt files: Smartbear (https://smartbear.com/robots.txt), b2wsoftware (http://www.b2wsoftware.com/robots.txt) and labtech (http://www.labtechsoftware.com/robots.txt).
I am still confused about what type of tags I should use, and how to gauge which set of tags is best for certain pages. I figured a static page is pretty much always good to index and follow, as long as it's public. And I should always include a sitemap file. But what about a dynamic page? What about pages that are out of date? Will this help with soft 404s?
This is a long one, but I appreciate all of the expert insight. Thanks ahead of time for all of the awesome responses.
Best Regards,
Will H.
-
Yup. Also, don't forget that robots.txt is just a "recommendation" for robots; they are not obliged to obey it.
Basically, Google does whatever it wants to.
Also, if you block a folder so that its content won't be "accessed", a blocked page can still be indexed if any link points to it, even a link from outside your domain. Its content won't be shown in search results, but the URL will show up with a notice stating that the content is blocked by the site's robots.txt. Best of luck!
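To make the crawl-versus-index distinction concrete, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the example.com URLs and the "ExampleBot" user agent are all made up for illustration. It only shows how a compliant crawler would read a Disallow rule; it says nothing about whether a blocked URL ends up in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site; the paths are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks this question before fetching a URL.
# "ExampleBot" is a hypothetical user agent that falls under the "*" group.
blocked = parser.can_fetch("ExampleBot", "https://www.example.com/private/report.html")
allowed = parser.can_fetch("ExampleBot", "https://www.example.com/news-events/")

print(blocked)  # False: crawling is disallowed, but the URL could still be indexed via links
print(allowed)  # True: nothing in the file restricts this path
```

Note that the check is purely about fetching. Keeping a page out of the index is a separate question that robots.txt does not answer.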
-
Great Advice Yossi & Chris. Thanks for taking the time to reply. I will have to dig into the Google Guidelines for additional information, but both of your points are valid. I think I was looking at robots.txt the wrong way. Thanks Again Guys!
-
I completely agree with Yossi here; no need to go blocking that page at all.
I can't really add any further value to the points he has covered, but one other part of your question suggested that perhaps you're looking at this the wrong way (and it's very common, don't worry!). Rather than having your site stay as-is and just obscuring the bad parts of it from search engines, the thought process should really be about creating a great website instead.
If you're ever considering blocking a page from search engines, the first step should always be "why am I blocking this page(s); could I just fix the issue instead?".
For example, you asked if this might help with soft 404s. Rather than trying to find a way to hide these soft 404s, spend that time fixing them instead!
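As a rough illustration of what "fixing" can mean here: a soft 404 is a page that tells the visitor the content is missing while the server still answers "200 OK". The sketch below is a minimal WSGI app in Python that sends a real 404 status for unknown paths. The PAGES dictionary, the paths and the status_for helper are hypothetical, and a real site would do this in its own framework or CMS.

```python
# Hypothetical page store; a real site would look pages up in its CMS or database.
PAGES = {
    "/": "Welcome",
    "/news-events/": "Upcoming demos and events",
}


def app(environ, start_response):
    """Minimal WSGI app that answers unknown paths with a real 404 status."""
    path = environ.get("PATH_INFO", "/")
    body = PAGES.get(path)
    if body is None:
        # A soft 404 would send "200 OK" here together with a "not found" message.
        status, body = "404 Not Found", "Page not found"
    else:
        status = "200 OK"
    payload = body.encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ]
    start_response(status, headers)
    return [payload]


def status_for(path):
    """Call the app directly and return the status line it would send."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    app({"PATH_INFO": path}, start_response)
    return captured["status"]


print(status_for("/news-events/"))    # 200 OK
print(status_for("/old-event-2014/"))  # 404 Not Found
```

The design choice is simply that the status code and the message agree, so crawlers do not have to guess whether the page exists.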
-
Hi Will
There are some concerns that you have which I do not understand.
Why do you want to block the News & Events page? If it has unique content and, on top of that, is updated regularly, you have no reason to block access to the page. If it is "relevant to potential customers who want to book a demo", that's great. I would definitely keep it indexed and followed. Google explicitly states that you should not block access to a page if you simply want to de-index or remove it. If the page should not be publicly indexed, you should remove it or password-protect it (a Google suggestion).
About tags, I assume you are talking about meta tags, correct?
There is no need to use any kind of meta tag to tell search engines to index or follow a page; you use one only when you want to restrict them from taking certain actions.
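To illustrate that default, here is a small Python sketch that reads the robots meta tag from a page using the standard html.parser module. The class and function names and the sample HTML are assumptions made up for this example. It treats a page with no robots meta tag as indexable and only treats "noindex" (or "none", which is shorthand for "noindex, nofollow") as a restriction.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots meta directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_indexable(html):
    """No robots meta tag, or one without 'noindex'/'none', means indexing is allowed."""
    directives = robots_directives(html)
    return "noindex" not in directives and "none" not in directives


# Hypothetical pages: one with no robots meta tag, one tagged noindex, follow.
plain_page = "<html><head><title>About us</title></head><body></body></html>"
noindex_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

print(is_indexable(plain_page))    # True
print(is_indexable(noindex_page))  # False
```

For the crawler to see a noindex tag at all, the page must not be disallowed in robots.txt, which is one more reason not to combine the two on the same URL.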
Also, there is no difference between a static and a dynamic page when it comes to tag usage. There are no rules for that. A page can perfectly well stay static for years and still get indexed and rank very well. (But, well, we all know that updating the site is a ranking signal.)
If you believe that a certain page should be tagged "noindex", it should not be because the page has not been updated within the last month or year. Just for example: contact us pages, about us pages and terms of use pages. These are super static pages that in many cases probably won't be changed for years.
Best,
Yossi