NOINDEX content still showing in SERPS after 2 months
-
I have a website that was likely hit by Panda or some other algorithm change. The hit finally occurred in September of 2011. In December my developer set the following meta tag on all pages that do not have unique content:
<meta name="robots" content="NOINDEX" />
It's been 2 months now and I feel I've been patient, but Google is still showing 10,000+ pages when I do a search for site:http://www.mydomain.com
I am looking for a quicker solution. Adding this many pages to the robots.txt does not seem like a sound option. The pages have been removed from the sitemap (for about a month now). I am trying to determine the best of the following options or find better options.
- 301 all the pages I want out of the index to a single URL based on the page type (location and product). The 301 worries me a bit because I'd have about 10,000 or so pages all 301ing to one or two URLs. However, I'd get some link juice to that page, right?
- Issue an HTTP 404 code on all the pages I want out of the index. The 404 code seems like the safest bet, but I am wondering if that will have a negative impact on my site with Google seeing 10,000+ 404 errors all of a sudden.
- Issue an HTTP 410 code on all pages I want out of the index. I've never used the 410 code, and while most of those pages are never coming back, eventually I will bring a small percentage back online as I add fresh new content. This one scares me the most, but I am interested to hear whether anyone has ever used a 410 code.
Please advise and thanks for reading.
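
To make the difference between the last two options concrete, here is a minimal sketch of how a server could choose between 404 and 410 per page type. The path prefixes and the `status_for` helper are hypothetical and only for illustration; the real URL structure of the site is not known. A 410 (Gone) signals that a page has been removed permanently, while a 404 (Not Found) makes no statement about whether it will return.

```python
from http import HTTPStatus

# Hypothetical path prefixes (assumptions, not the site's real URLs).
GONE_FOR_GOOD = ("/location/",)  # pages that will never come back
MAY_RETURN = ("/product/",)      # pages that may be republished later


def status_for(path: str) -> HTTPStatus:
    """Pick the HTTP status to send for a requested path."""
    if path.startswith(GONE_FOR_GOOD):
        return HTTPStatus.GONE        # 410: removed permanently
    if path.startswith(MAY_RETURN):
        return HTTPStatus.NOT_FOUND   # 404: missing, may come back
    return HTTPStatus.OK              # 200: serve the page normally
```

Under these assumptions, the small share of pages that will get fresh content would sit under the 404 prefix, and everything else would return 410.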
-
Just wanted to let you know that submitting all the pages I wanted removed in an XML sitemap worked. I submitted that sitemap to webmaster tools and also listed it in the robots.txt. A "site:domain.com" query showed the number of indexed pages dropping from 20k+ to 700 in a matter of days.
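
For anyone wanting to try the same thing, a sitemap like that could be generated roughly as follows. This is only a sketch: the `build_sitemap` helper and the example URLs are made up for illustration, and the list of noindexed pages would have to come from your own database or crawl.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs of pages that already carry the noindex tag.
noindexed = [
    "http://www.mydomain.com/location/example-1",
    "http://www.mydomain.com/product/example-2",
]
sitemap_xml = build_sitemap(noindexed)
```

The resulting file can be submitted in webmaster tools and referenced from robots.txt with a `Sitemap:` line pointing at its full URL. A single sitemap file may list at most 50,000 URLs, so a larger set would need to be split across several files.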
-
I could link to them then, but what about creating a custom sitemap for just content that I want removed? Would that have the same effect?
-
If they are not linked to then spiders will not find the noindex code. They could suffer in the SERPs for months and months.
-
If all these pages are under one directory, then you have the option of removing the complete directory with the URL removal tool. See if that is feasible in your case.
-
I suppose I'll wait longer. Crawl rate over the last 90 days is a high of 3,285 and average of 550 with a low of 3 according to webmaster tools.
-
Yeah the pages are low PR and are not linked to at all from the site. I've never heard of removing a page via webmaster tools. How do I do that? I also have to remove several thousand.
*edit: It looks like I have to remove them one at a time which is not feasible in my case. Is there a faster way?
-
If you want a page out of the index fast the best way is to do it through webmaster tools. It's easy and lasts for about six months. Then, if they find your page again it will register the noindex and you should be fine.
As EGOL said, if it's a page that isn't crawled very often then it could be a LONG time before it gets deindexed.
-
I removed some pages from the index and used the same line of code...
<meta name="robots" content="NOINDEX" />
My pages dropped from the index within 2 or 3 days - but this is a site that has very heavy spider activity.
If your site is not crawled very much or these are low PR pages (such as PR1, PR2), it could take Google a while to revisit and act upon your noindex instructions - but two months seems a bit long.
Is your site being crawled vigorously? Look in webmaster tools to see if crawling declined abruptly when your rankings fell. Check there also for crawl problems.
If I owned your site and the PR of these pages is low I would wait a while longer before doing anything. If my patience was wearing thin I would do the 301 redirect because that will transfer the linkjuice from those pages to the target URL of the redirect - however, you might wait quite a while to see the redirect take effect. That's why my first choice would be to wait longer.
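
While waiting, it may be worth confirming that the pages really serve the tag as intended. The snippet below is a rough sketch using only the Python standard library; `RobotsMetaParser` and `has_noindex` are names made up for this example, and it only inspects HTML you pass in (it does not fetch pages or look at HTTP headers).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            # Directives are comma separated and case-insensitive.
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Running `has_noindex` over the saved source of a few of the affected pages would show whether the tag is actually present in the markup being served.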