Meta robots on every page rather than using robots.txt to block crawlers? How will pages get indexed if we block crawlers?
-
Hi all,
The suggestion to use the meta robots tag rather than the robots.txt file is meant to make sure pages do not get indexed if links to them are available anywhere on the internet. I don't understand how the pages would be indexed if the entire site is blocked. Even if links to the pages are available, will Google really index those pages? One of our sites has been blocked via the robots.txt file, and its internal links have been available on the internet for years, yet the pages have not been indexed. So technically the robots.txt file is quite enough, right? Please clarify and guide me if I'm wrong.
Thanks
-
I agree with Gaston's approach right up to step 4. If you add the no-indexed pages back into a block in the robots.txt file, you'll end up back where you started, because Google will still discover the no-indexed URLs elsewhere, and the robots.txt block will stop it from seeing the noindex, so the URLs will likely start to get added to the index again.
No-indexed URLs must not be blocked in robots.txt. Those two processes are mutually exclusive.
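To make the conflict concrete, here is a minimal Python sketch of what a robots.txt-compliant crawler can learn about a single URL. It uses the standard library's `urllib.robotparser`; the function name `crawl_outcome`, the example URL, and the outcome strings are made up for illustration, and this is a simplified model rather than a description of how Google actually works.

```python
from urllib.robotparser import RobotFileParser


def crawl_outcome(robots_txt: str, url: str, page_has_noindex: bool) -> str:
    """Model what a robots.txt-compliant crawler can learn about one URL."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch("*", url):
        # The page is never fetched, so its meta robots tag is never read.
        # The URL can still be discovered through links on other sites.
        return "blocked: noindex unseen"
    return "crawled: noindex seen" if page_has_noindex else "crawled: indexable"


BLOCK_ALL = "User-agent: *\nDisallow: /"
ALLOW_ALL = "User-agent: *\nDisallow:"
URL = "https://example.com/old-page"  # hypothetical URL

print(crawl_outcome(BLOCK_ALL, URL, page_has_noindex=True))
print(crawl_outcome(ALLOW_ALL, URL, page_has_noindex=True))
```

In the first case the noindex exists on the page but is invisible to the crawler, which is why the block and the tag should not be combined on the same URL.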
-
Hi there,
TLDR; The solution to deindexing and never getting indexed again:
- Allow (with robots.txt) the site to be crawlable
- Apply the meta robots tag: noindex,follow
- Wait some weeks for the pages to be completely deindexed
- Block the entire site/section with robots.txt
Robots.txt and the robots meta tag can produce a similar effect, but to understand them they must be analyzed separately.
-
Robots.txt: here you tell bots where they can go BEFORE they crawl any of the website. It is just a signal, not an enforced directive, because robots can choose to ignore what's in the file. You can block the entire site, an entire section, or just specific pages. More info: Robots.txt official page and a really cool and complete guide to robots.txt
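As a rough illustration of those scopes, the Python sketch below feeds a small robots.txt to the standard library's `urllib.robotparser` and asks which paths a compliant bot may fetch. The paths, the `example.com` host, the `ExampleBot` name, and the `allowed` helper are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one blocked section and one blocked single page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/old-post.html
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a bot that honors robots.txt may fetch this path."""
    return rules.can_fetch(agent, "https://example.com" + path)


for path in ["/", "/private/report.html", "/drafts/old-post.html", "/drafts/new-post.html"]:
    print(path, allowed(path))
```

Note that this only models a bot that chooses to obey the file; nothing in robots.txt stops a bot that ignores it.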
-
Robots meta tag: this gives you more signals to send. The most used are noindex, nofollow and follow, because they address the usual indexing issues. More info: Robots.txt official page, Google developers, Meta Robots directive - Moz and a complete guide to meta robots tag - YOAST.
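For anyone who wants to check which directives a page actually carries, here is a small Python sketch using the standard library's `html.parser`. The class `RobotsMetaParser`, the helper `robots_directives`, and the sample HTML are invented for this example; it only reads the `<meta name="robots">` tag and ignores other mechanisms such as HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma separated, e.g. "noindex,follow".
        self.directives.update(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def robots_directives(html: str) -> set:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


SAMPLE = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
print(sorted(robots_directives(SAMPLE)))
```

A page with no robots meta tag yields an empty set, which a crawler would treat as having no restrictions from this tag.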
Hope this is what you wanted.
Best of luck
GR.