Panda Updates - robots.txt or noindex?
-
Hi,
I have a site that I believe has been impacted by the recent Panda updates. Assuming that Google has crawled and indexed several thousand pages that are essentially the same and the site has now passed the threshold to be picked out by the Panda update, what is the best way to proceed?
Is it enough to block the pages from being crawled in the future using robots.txt, or would I need to remove the pages from the index using the meta noindex tag? Of course, if I block the URLs with robots.txt, then Googlebot won't be able to access the pages in order to see the noindex tag.
Does anyone have any previous experience of doing something similar?
Thanks very much.
-
This is a good read: http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world. I think you should be careful with robots.txt, because blocking the bot's access will not cause Google to remove the content from its index. The listing will simply carry a message along the lines of "not quite sure what's on this page". I would use noindex to clear out the index first before attempting robots.txt exclusion.
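To make the order of operations concrete, here is a minimal sketch; the /duplicates/ folder is just a hypothetical placeholder for wherever the thin pages live. First, while the pages are still crawlable, add a meta robots tag to each of them:
<meta name="robots" content="noindex, follow">
Then, only once Google has recrawled the pages and dropped them from the index, stop the crawling in robots.txt:
User-agent: *
Disallow: /duplicates/
Using "follow" rather than "nofollow" lets link equity keep flowing through the pages while they are de-indexed.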
-
Yes, both, because if a page is linked to from another site, Google will discover the URL by spidering that other site; robots.txt only stops the page from being crawled, not from being indexed, so the URL can still end up in the index if there is no noindex tag on it.
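As a quick check for those URL-only listings, you can run a site: query against the hypothetical folder from the sketch above, for example:
site:example.com/duplicates/
Any results that appear with a URL but no snippet are typically pages Google has indexed from links alone, without being able to crawl them.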
-
Indeed, try both.
Irving +1
-
Both: first tag the lowest-quality, lowest-traffic pages with noindex, and once they have dropped out of the index, block the folder in robots.txt.
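If editing the page templates is awkward, the same noindex signal can be sent as an HTTP header instead of a meta tag. Here is a sketch for an Apache server with mod_headers enabled (the server type and file location are assumptions, not something stated in this thread); placed in an .htaccess file inside the folder you eventually plan to block, it applies the directive to every page there:
Header set X-Robots-Tag "noindex, follow"
The same caveat applies: Googlebot has to be able to fetch the pages to see this header, so hold off on the robots.txt disallow until they have dropped out of the index.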
Related Questions
-
Disavow post Penguin update
Since the recent Penguin update processes backlinks quickly and with immediate impact, does the Disavow tool also produce changes in a few days rather than in weeks, as it did earlier? How long does it now take to see the impact of a disavow? And shouldn't we still disavow some links, even though Google claims it will take care of bad backlinks without passing value from them?
Intermediate & Advanced SEO | vtmoz
-
Search engine blocked by robots-crawl error by moz & GWT
Hello everyone. For my site I am getting Error Code 605: Page Banned by robots.txt, X-Robots-Tag HTTP Header, or Meta Robots Tag. Google Webmaster Tools is also unable to fetch my site, tajsigma.com. Can any expert help, please? Thanks.
Intermediate & Advanced SEO | falguniinnovative
-
Updated My Business Profiles & Still Not Ranking in Local SEO... What Next?
I have used Get Listed and a few other services to update my profiles for my company, Health Care Associates (it's a home health care agency), http://healthcareassociates.net. I have added pictures, categories, descriptions, keywords, etc., and it doesn't seem to help. We are still not ranking in the search engines for "home care grand rapids", "home health care", etc. What else can I do to optimize for local search? Todd
Intermediate & Advanced SEO | t1kuslik
-
Panda'd - and I think I know how to fix it...
Hi, I have a non-core site that seems to have been affected by a Panda refresh in late December (http://www.seomoz.org/google-algorithm-change#2012). I couldn't figure out for the longest time why this site, which is full of high-quality, expert-level content, would get dinged. I made several moves to try and eliminate duplicate content, even though I couldn't find evidence of it, but it's a WordPress site, so there are lots of opportunities to accidentally introduce duplication through archives, tags and whatnot.

The classic SEO mistake I was making was forgetting about a type of post we were doing to facilitate one of our email campaigns. On most sites there's always something you aren't optimizing, and that's the stuff that can really create unintended issues in Google, because the decisions made on those pieces are often more operational, serving the other campaigns, than strategic to search. These posts are thin little articles, written by humans, but the text is actually submitted to an external site, published there, and then recreated as content that the email campaign links to. They are segregated from the normal feed on the WordPress site, and the last time I reviewed this content we were not using a method for creating it that involved publishing it to Facebook first.

So, I'm going to stop indexing this content; that's a given. I believe that is the Panda issue. I could be wrong, but it makes sense, since otherwise this is maybe the least likely site to be affected by Panda that I've ever been involved with. Do I do anything else after fixing a Panda issue? Is there a reconsideration request for this or something? Should I send a singing telegram to Cutts? I researched a few articles, and there wasn't much on what to do after you fixed it, other than to wait. Just wondering if anyone else who fixed a Panda thang utilized any communication channel to let Google know. Thanks!
Intermediate & Advanced SEO | reallygoodstuff
-
Will our PA be retained after URL updates?
Our web hosting company recently applied an SEO update to our site to deal with canonicalization issues and also rewrote all URLs to lower case. As a result, our PA is now 1 on all pages it affected. I took this up with them and they had this to say: "I must confess I'm still a bit lost, however I can assure you our consolidation tech uses a 301 permanent redirect for transfers. This should ensure any back link equity isn't lost. For instance this address: http://www.towelsrus.co.uk/towels-bath-sheets/aztex/egyptian-cotton-Bath-sheet_ct474bd182pd2731.htm redirects to this page: http://www.towelsrus.co.uk/towels-bath-sheets/aztex/egyptian-cotton-bath-sheet_ct474bd182pd2731.htm and the redirect returns a 301 header response, as discussed in your attached forum thread extract." Firstly, is the canonicalization fix working, given that the number of duplicate pages shot up last week? And will we get our PA back? Thanks, Craig
Intermediate & Advanced SEO | Towelsrus
-
Accidental Noindex/Mis-Canonicalisation - Please help!
Hi everybody, I was hoping somebody might be able to help, as this is an issue my team and I have never come across before. A client of ours recently migrated to a new site design. 301 redirects were properly implemented and the transition was fairly smooth. However, we realised soon after that a sub-section of pages had either one or both of the following errors:
- They featured a canonical tag pointing to the wrong page
- They featured the 'meta noindex' tag
After realising this, both the canonicals and the noindex tags were immediately removed. However, Google crawled the site while these were in place, and the pages subsequently dropped out of Google's index. We re-submitted the affected pages to Google's index and used WMT to 'Fetch' the pages as Google. We have also since 'allowed' the pages in the robots.txt file as an extra measure. We found that the pages which just had the noindex tag were immediately re-indexed, while the pages which featured both the noindex tag and the wrong canonical are still not being re-indexed. Can anyone think of a reason why this might be the case? One of the pages which featured both tags was one of our most important organic landing pages, so we're eager to resolve this. Any help or advice would be appreciated. Thanks!
Intermediate & Advanced SEO | robmarsden
-
1200 pages nofollowed and blocked by robots on my site. Is that normal?
Hi, I've got a bunch of notices saying almost 1,200 pages are nofollowed and blocked by robots. They appear to be comments and other random pages, not the actual domain and static content pages. Still, it seems a little odd. The site is www.jobshadow.com. Any idea why I'd have all these notices? Thanks!
Intermediate & Advanced SEO | astahl11
-
Search Engine Blocked by robots.txt for Dynamic URLs
Today I was checking the crawl diagnostics for my website and found the warning "search engine blocked by robots.txt". I had added the following syntax to the robots.txt file for all dynamic URLs:
Disallow: /*?osCsid
Disallow: /*?q=
Disallow: /*?dir=
Disallow: /*?p=
Disallow: /*?limit=
Disallow: /*review-form
The dynamic URLs are as follows: http://www.vistastores.com/bar-stools?dir=desc&order=position, http://www.vistastores.com/bathroom-lighting?p=2 and many more. So why does it show me a warning for this? Does it really matter, or is there any other solution for these kinds of dynamic URLs?
Intermediate & Advanced SEO | CommercePundit