Meta NOINDEX... how long before Google drops dupe pages?
-
Hi,
I have a lot of near-duplicate content caused by URL parameters, so I have applied a meta noindex tag to the affected pages.
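Roughly speaking, each of those parameterised URLs now carries the standard robots meta tag in its <head>; the exact content value below is illustrative rather than copied from the site:

```html
<!-- Robots meta tag on each near-duplicate URL; "noindex, follow" keeps the page
     out of the index while still letting its links be crawled. Whether "follow"
     was included on the real pages is an assumption. -->
<meta name="robots" content="noindex, follow">
```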
How long will it take for this to take effect? It's been over a week now; I have done some removals with the GWT removal tool, but there has still been no major drop in indexed pages.
Any ideas?
Thanks,
Ben
-
In his case, he only wants to get rid of some duplicate content.
I see what you mean, but if he is not in one of the situations listed in http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1269119 then it might still be the best and fastest bet.
For me personally it has worked very well so far, as long as robots.txt is not used as well; that won't help in the long run, because the removal tool has an expiration date of several months.
The downside of the removal tool is that same expiration date: if you change your mind, you will have some trouble getting the pages back into the index.
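One related caveat, sketched below with hypothetical parameter names: if the parameterised URLs are also disallowed in robots.txt, Googlebot cannot recrawl them and will never see the meta noindex tag, so the two should not be combined here.

```
# Hypothetical robots.txt entries. Blocking the parameterised URLs like this
# stops Googlebot from crawling them, so a meta noindex on those pages would
# never be seen and the URLs could linger in the index.
User-agent: *
Disallow: /*?sort=
Disallow: /*?filter=
```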
-
You know that I think you are the bee's knees, but I am going to have to disagree on this one. Even Google does not recommend using the removal tool for this application.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1269119
Still pals?
-
There are several things that you can do to get Google to crawl your site (or your new content) quicker and more often. You should be doing all of these, but in case you're not, here is the list.
-
Create a sitemap and submit it through Google Webmaster Tools (a minimal example is sketched after this list).
-
Install Google Analytics
-
Create social accounts/update your social accounts
-
Use Fetch as Google in Google Webmaster Tools.
-
Update your content more often (to get Google to crawl your site more frequently).
-
Adjust the crawl rate in Google Webmaster Tools.
-
Check crawl errors in Google Webmaster Tools. Are there server-side errors (500s)?
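On the sitemap point, a minimal example of the file you would submit is below; the URLs and dates are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal XML sitemap; loc and lastmod values are placeholders. -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/some-page/</loc>
    <lastmod>2015-01-15</lastmod>
  </url>
  <url>
    <loc>http://www.example.com/another-page/</loc>
    <lastmod>2015-01-10</lastmod>
  </url>
</urlset>
```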
I hope that helps!
-
-
Hi,
The best bet is the removal tool in GWT; this is the fastest way.
If your pages are static and Googlebot only visits them once a month, or once every four to six months, you will need to wait until Googlebot visits those pages again, notices the noindex and drops them from the index.
I've seen cases take six months.
Either way, you will probably see those pages drop step by step.
What you can try, although it is not very straightforward, is to build an XML sitemap containing only those files and submit it via GWT. Sometimes Googlebot will take this as a sign that something new has happened, visit those pages, see the noindex and speed up the process. It does not always work: I've seen cases where it didn't and cases where it did.
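A sketch of that approach, with a hypothetical filename and domain: list only the noindexed, parameterised URLs in the sitemap, submit it in Webmaster Tools, and optionally ping Google with its location as well.

```
# Hypothetical example; noindex-pages.xml would list only the noindexed URLs.
curl "http://www.google.com/ping?sitemap=http://www.example.com/noindex-pages.xml"
```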
Again, the best bet will be the GWT removal tool.
Cheers.
Related Questions
-
If I block a URL via the robots.txt - how long will it take for Google to stop indexing that URL?
Intermediate & Advanced SEO | | Gabriele_Layoutweb0 -
Fetch as Google -- Does not result in pages getting indexed
I run an exotic pet website which currently has several types of species of reptiles. It has done well in the SERPs for the first couple of types of reptiles, but I am continuing to add new species, and for each of these comes the task of getting ranked, so I need to figure out the best process.

We just released our 4th species, "reticulated pythons", about 2 weeks ago. I made these pages public and in Webmaster Tools did a "Fetch as Google" with "index page and child pages" for this page: http://www.morphmarket.com/c/reptiles/pythons/reticulated-pythons/index

While Google immediately indexed the index page, it did not really index the couple of dozen pages linked from this page, despite me checking the option to crawl child pages. I know this in two ways: first, in Google Webmaster Tools, if I look at Search Analytics and Pages filtered by "retic", there are only 2 listed. This at least tells me it's not showing these pages to users. More directly, if I look at Google search for "site:morphmarket.com/c/reptiles/pythons/reticulated-pythons" there are only 7 pages indexed.

More details: I've tested at least one of these URLs with the robots checker and they are not blocked. The canonical values look right. I have not really monkeyed with Crawl URL Parameters. I do NOT have these pages listed in my sitemap, but in my experience Google didn't care a lot about that; I previously had about 100 pages there and Google didn't index some of them for more than a year. Google has indexed "105k" pages from my site, so it is very happy to index, apparently just not the pages I want (this large value is due to permutations of search parameters, something I think I've since improved with canonical, robots, etc). I may have some nofollow links to the same URLs but NOT on this page, so assuming nofollow has only local effects, this shouldn't matter.

Any advice on what could be going wrong here? I really want Google to index the top couple of links on this page (home, index, stores, calculator) as well as the couple dozen gene/tag links below.
Intermediate & Advanced SEO | | jplehmann0 -
What can cause a service page to rank in Google's Answer Box?
Hello Everyone, I have recently seen a Google result for "vps hosting" showing service page details in the Answer Box. I would really like to know what can cause a service page to appear in the Answer Box. I have attached a screenshot of the result page (CaRiWtQUcAALn9n.png).
Intermediate & Advanced SEO | | eukmark0 -
How do I prevent Google and Moz from counting pages as duplicates?
I have 130,000 profiles on my site. When not connected to them, they have very few differences, so a bot (not logged in, etc.) will see a login form and "Connect to Profilename". Moz and Google call the links the same, even though they're unique, such as example.com/id/328/name-of-this-group and example.com/id/87323/name-of-a-different-group. So how do I separate them? Can I use Schema or something to help identify that these are profile pages, or that the content on them should be ignored as it's help text, etc.? Take Facebook: each Facebook profile for a name renders simple results: https://www.facebook.com/public/John-Smith https://www.facebook.com/family/Smith/ Would that be duplicate data if Facebook had a "Why to join" article on all of those pages?
Intermediate & Advanced SEO | | inmn0 -
Add noindex,nofollow prior to removing pages resulting in 404's
We're working with another site that, unfortunately, due to how their website has been programmed, creates a bit of a mess. Whenever an employee removes a page from their site through their homegrown 'content management system', rather than 301'ing it to another location on their site, the page is deleted and results in a 404. The interim question, until they implement a better solution for managing their website, is: should they first add noindex,nofollow to the pages that are scheduled to be removed, and then once they are removed they become 404s? Of note, it is possible that some of these pages will be used again in the future, and I would imagine they could submit them to Google through Webmaster Tools and add the pages to their sitemap.
Intermediate & Advanced SEO | | Prospector-Plastics0 -
Need to shorten and change site-wide meta titles (50,000 pages). OK to do all at once?
Just noticed that Google completely screws up our meta titles in the SERPs. Google has decided to show titles which are not understandable to visitors and, worst of all, even shows titles in a different language than the actual page. The words of the displayed titles are nowhere on the page (actually they are parts of old title tags that we stopped using 6 months ago and that we used on different pages). Pages are crawled weekly. All our meta titles are a bit longer than the 70-character limit, so I plan to rephrase and shorten them so that they are all 66 characters max. We dynamically choose different variations of title text based on the character length of keywords. Titles that fit into the SERPs without being cut off are supposedly less likely to be changed by Google. I have heard some people report a loss of rankings after site-wide meta title changes. Especially since we already changed title tags site-wide about 6 months ago, I am a bit concerned. How would you proceed? Just do the site-wide change all at once?
Intermediate & Advanced SEO | | lcourse0 -
Meta NoIndex tag and Robots Disallow
Hi all, I hope you can spend some time to answer the first of a few questions from me 🙂 We are running a Magento site, and the layered/faceted navigation nightmare has created thousands of duplicate URLs! Anyway, during my process to tackle the issue, I disallowed in robots.txt anything in the query string that was not a p (allowed for pagination). After checking some pages in Google, I did a site:www.mydomain.com/specificpage.html and a few duplicates came up along with the original, with "There is no information about this page because it is blocked by robots.txt".
So I had also added Meta Noindex, Follow on all these duplicates, but I guess it wasn't being read because of robots.txt. So, coming to my questions: did robots.txt block access to these pages? If so, were these already in the index, and after disallowing them with robots.txt, could Googlebot no longer read the Meta Noindex? Does Meta Noindex, Follow on pages actually help Googlebot decide to remove these pages from the index? I thought robots.txt would stop and prevent indexation? But I've read this:
"Noindex is a funny thing, it actually doesn't mean 'You can't index this', it means 'You can't show this in search results'. Robots.txt disallow means 'You can't index this' but it doesn't mean 'You can't show it in the search results'."
I'm a bit confused about how to use these, both in preventing duplicate content in the first place and then in helping to address dupe content once it's already in the index. Thanks! B
Intermediate & Advanced SEO | | bjs20100 -
MOZ crawl report says category pages blocked by meta robots but they're not?
I've just run an SEOmoz crawl report and it tells me that the category pages on my site, such as http://www.top-10-dating-reviews.com/category/online-dating/, are blocked by meta robots and have the meta robots tag noindex,follow. This was the case a couple of days ago, as I run WordPress and am using the SEO Category Updater plugin. By default it appears the plugin makes categories noindex, follow. I therefore edited the plugin so that the default was index, follow, as I want Google to index the category pages so that I can build links to them. When I open the page in a browser and view source, the tags show as index, follow, which adds up. Why then is the SEOmoz report telling me they are still noindex,follow? Presumably the crawl is in real time and should pick up the new tag, or is it perhaps because it's using data from an old crawl? As yet these pages aren't indexed by Google. Any help is much appreciated! Thanks Sam.
Intermediate & Advanced SEO | | SamCUK0