Panda Updates - robots.txt or noindex?
-
Hi,
I have a site that I believe has been impacted by the recent Panda updates. Assuming that Google has crawled and indexed several thousand pages that are essentially the same and the site has now passed the threshold to be picked out by the Panda update, what is the best way to proceed?
Is it enough to block the pages from being crawled in the future using robots.txt, or would I need to remove the pages from the index using the meta noindex tag? Of course if I block the URLs with robots.txt then Googlebot won't be able to access the page in order to see the noindex tag.
Anyone have and previous experiences of doing something similar?
Thanks very much.
-
This is a good read. http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world I think you should be careful with robot.txt because blocking access to the bot will not cause them to remove the content from their index. They will simply include a message saying not quite sure what's on this page.. I would use noindex to clear out the index first before attempting robot.txt exclusion.
-
Yes, both because if a page is linked to on another site google with spider that other site and follow your link without hitting the robots.txt and the page could get indexed if there is not a noindex on it.
-
Indeed try both.
Irving +1
-
both. block the lowest quality lowest traffic pages with nodindex and block the folder in robots.txt
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Noindex / Nofollow multiple reviews pages?
I have well over a hundred pages of reviews (10 per page). I know this is solid content and I'd hate to not be able to leverage it, but I'm running into the issue of having duplicate title tags and H1s on all of the pages. What's the best way to make use of the review content without have those types of issues? Is a noindex / nofollow strategy something I should be considering here for Page 2 and beyond? Thanks! Edit: I did additional digging into pagination strategies and found this terrific article on Moz. I'm thinking it should address my questions regarding review pages as well.
Intermediate & Advanced SEO | | Andrew_Mac0 -
Can't find X-Robots tag!
Hi all. I've been checking out http://www.unthankbooks.com/ as it seems to have some indexing problems. I ran a server header check, and got a 200 response. However, it also shows the following: X-Robots-Tag:
Intermediate & Advanced SEO | | Blink-SEO
noindex, nofollow It's not in the page HTML though. Could it be being picked up from somewhere else?0 -
Robots.txt
What would be a perfect robots.txt file my site is propdental.es Can i just place: User-agent: * Or should i write something more???
Intermediate & Advanced SEO | | maestrosonrisas0 -
Sitemap contains Meta NOINDEX pages - Good or bad?
Hi, Our sitemap is created by our e-commerce software - Magento - We are probably going to make a lot of products Meta No Index for the moment, until all the content has been corrected on them - but by default, as they are enabled, they will appear in Sitemap. So, the question is: "Should pages that are Meta NOINDEX be listed in a sitemap"? Does it matter? thanks!
Intermediate & Advanced SEO | | bjs20100 -
Panda recovery. Is it possible ?
Dear all, To begin, english is not my native language, so I'm very sorry if I make some mistake. On the 23th march 2012, Panda penalized my website (a coupon website) Avec-Reduction (dot com for the url). At this date, I have lost more than 70% of my traffic. The structure of the website was like an e-commerce website. Categories -> merchant page -> coupon page. The content was to thin for Google, I'm agree wit that. So, in may, I have made a new version. Here you can see the most important modifications : A smallest header (-100px height). 2 columns website (the oldest website had 3 columns) I have deleted the category menu with the list of all categories and the alphabetical menu. less ads on the website (since few days I have also deleted the 2 adense blocks) The coupons was promoted with the thumbs of the merchant on the home listing. Now I have few top lists in text only. I have deleted all the categories pages (one page by category of merchant, with the listing of all the merchants of the category). Now I have only one page for this. Same thing with the alphabetical pages. All these deleted pages have a 301 redirect. The 2 new pages (categorie page and alphabetical page) are on noindex. I have deleted all the promo codes pages, all the coupons are now on the merchant page (301 redirect used). I have create an anti-spam system for the review codes (I have a lot of spam on these forms, even if I cleaned every day/2days). Now, I have no spam. Visitors have now the possibility to put a note and a review for each merchant. This fonctionality is new, so not a lot of reviews for the moment. All the merchants pages without promo codes have a noindex on the robot tag. Since july, I have the possibility to use the "title" of each promo code. I can personnalised the promo code. On the same time, to have more content, I have the possibility to add sales or great promos for each merchants, not only promo codes. Affiliate links are created on JS which open a new window (a redirect page with noindex). That's the end of the most important changes on my website. I have a better speed page (/2 since july) because I have optimized my images, CSS, JS... At the end of july, I had health problem and the website has no update until the first days of october. Now, the website is updated every day, but between july and october I have no panda recovery. I have no duplicate content, I try to add the most content I can. So I don't understand why Google Panda penalized me again and again. Some of my competitors have a lot of keyword stuffing (4, 5, 6, ... 10 lines of KS on each merchant pages). Some of them have only affiliate merchants, automatic script to put coupons on websites), few "same" websites... I have less than 30% of affiliated merchant, I validate all the coupons or promo manually, I personalized all my coupons... So I don't understand what to do. I will appreciate all help. If you see problems on my wesite or if you know tips to have a panda recovery, I will be very happy to have informations. Many thanks for all. Sincerely, Florent
Intermediate & Advanced SEO | | Floroger0 -
Googlebot Can't Access My Sites After I Repair My Robots File
Hello Mozzers, A colleague and I have been collectively managing about 12 brands for the past several months and we have recently received a number of messages in the sites' webmaster tools instructing us that 'Googlebot was not able to access our site due to some errors with our robots.txt file' My colleague and I, in turn, created new robots.txt files with the intention of preventing the spider from crawling our 'cgi-bin' directory as follows: User-agent: * Disallow: /cgi-bin/ After creating the robots and manually re-submitting it in Webmaster Tools (and receiving the green checkbox), I received the same message about Googlebot not being able to access the site, only difference being that this time it was for a different site that I manage. I repeated the process and everything, aesthetically looked correct, however, I continued receiving these messages for each of the other sites I manage on a daily-basis for roughly a 10-day period. Do any of you know why I may be receiving this error? is it not possible for me to block the Googlebot from crawling the 'cgi-bin'? Any and all advice/insight is very much welcome, I hope I'm being descriptive enough!
Intermediate & Advanced SEO | | NiallSmith1 -
202 error page set in robots.txt versus using crawl-able 404 error
We currently have our error page set up as a 202 page that is unreachable by the search engines as it is currently in our robots.txt file. Should the current error page be a 404 error page and reachable by the search engines? Is there more value or is it a better practice to use 404 over a 202? We noticed in our Google Webmaster account we have a number of broken links pointing the site, but the 404 error page was not accessible. If you have any insight that would be great, if you have any questions please let me know. Thanks, VPSEO
Intermediate & Advanced SEO | | VPSEO0 -
Mozpoints not updating
Added 2 comments on 2 questions and my points are still at the same level. Anyone knows why? I don't.
Intermediate & Advanced SEO | | mosaicpro0