Large-Scale Penguin Cleanup - How to prioritize?
-
We are conducting a large-scale Penguin cleanup / link cleaning exercise across 50+ properties that have been on the market mostly all for 10+ years. There is a lot of link data to sift through and we are wondering how we should prioritize the effort.
So far we have been collecting backlink data for all properties from AHref, GWT, SeoMajestic and OSE and consolidated the data using home-grown tools.
As a next step we are obviously going through the link cleaning process. We are interested in getting feedback on how we are planning to prioritize the link removal work. Put in other words we want to vet if the community agrees with what we consider are the most harmful type of links for penguin.
- Priority 1: Clean up site-wide links with money-words; if possible keep a single-page link
- Priority 2: Clean up or rename all money keyword links for money keywords in the top 10 anchor link name distribution
- Priority 3: Clean up no-brand sitewide links; if possible keep a single-page link
- Priority 4: Clean up low-quality links (other niche or no link juice)
- Priority 5: Clean up multiple links from same IP C class
Does this sound like a sound approach? Would you prioritize this list differently?
Thank you for any feedback /T
-
Your data sources are correct (AHREFs, Bing, Ose & Majestic) but I recommend including Bing as well. The data is free and you will find at least some links not shown in other sources.
The link prioritization you shared is absolutely incorrect.
"Priority 1: Clean up site-wide links with money-words; if possible keep a single-page link"
While it is true site-wide links are commonly manipulative, removing the site wide link and keeping a single one does not necessarily make it less manipulative. You have only removed one of the elements which are often used to identify manipulative links.
"Priority 2: Clean up or rename all money keyword links for money keywords in the top 10 anchor link name distribution"
A manipulative link is still manipulative regardless of the anchor text used. Based in April 2012, Google used anchor text as a means to identify manipulative links. That was over 18 months ago and Google's link identification process has evolved substantially since that time.
"Priority 3: Clean up no-brand sitewide links; if possible keep a single-page link"
Same response as #1 & 2
"Priority 4: Clean up low-quality links (other niche or no link juice)"
See below
"Priority 5: Clean up multiple links from same IP C class"
The IP address should not be given any consideration whatsoever. You are using a concept that had validity years ago and is completely outdated.
bonegear.net IP address 66.7.211.83
vitopian.com IP address 64.37.49.163
There are no commonalities between the above two IP addresses, be it C block or otherwise, yet they are both hosted on the same server.
You have identified the issue affecting your site (Step 1) and collected a solid list of your backlinks using multiple sources (Step 2). The backlink report is an excellent step which places you well above most site owners and SEOs in your situation.
Step 3 - Identify links from every linking domain.
a. Have an experienced, knowledgeable human visit each and every linking domain. Yes, that is a lot of work but it is what's necessary if you are going to accurately identify all of the manipulative links. Prior to beginning this step, be absolutely sure the person can accurately identify manipulative links with AT LEAST 95% accuracy, although 100% is strongly desired.
b. Document the effort. I have had 3 clients who approached me with a Penguin issue, we confirmed there was not any manual action in place at the time we began the clean up process, but before we finished the sites incurred a manual penalty. Solid documentation of the clean up effort is required by Google in case the Penguin issue morphs into a manual penalty. Also, it just makes sense. You mentioned 50+ web properties so clearly others will be performing these tasks.
c. Audit the effort. A wise former boss once stated "You must inspect what you expect". Unless you carefully audit the work, the process will fail. Evaluators will mis-identify links. You will lose some quality links and manipulative links will be missed as well.
d. While you are on the site, capture manipulative site's e-mail address and contact forum URL (if any). This information is helpful to contact site owners to request link removal.
Step 4 - Conduct a Webmaster Outreach Campaign. Each manipulative domain needs to be contacted in a comprehensive manner. In my experience, most SEOs and site owners do not put in the required level of effort.
a. Send a professional request to the site's WHOIS e-mail address.
b. After 3 business days if no response is received, send the same letter to the site's e-mail address found on the website.
c. After another 3 business days, if no response is received submit the e-mail via the site's contact form. Take a screenshot of the submission on the site (not required for Penguin as no documentation is, but it is helpful for the process).
All of the manipulative link penalties (Penguin and manual) I have worked with have been cleaned up manually. With that said, we use Rmoov to manage the Webmaster Outreach process. It sends and maintains a copy of every e-mail sent. It even has a place to add the Contact Form URL. A big time saver.
If a site owner responds and removes the link, that's great. CHECK IT! If there are only a few links, manually confirm link removal. If there are many URLs, use Screaming Frog or another tool to confirm link removal.
If a site owner refuses or requests money, you can often achieve link removal by having further respectful conversations.
If a site owner does not respond, you can use "extra measures". Call the phone number listed in WHOIS. Send a physical letter to the WHOIS address. Reach out to them on social media sites. Is it a .com domain with missing WHOIS information? You can report them on INTERNIC. Is it a spammy wordpress.com or blogspot site? You can report that as well.
When Matt Cutts introduced the Disavow Tool, he clearly said "...at the point where you have written to as many people as you can, multiple times, you have really tried hard to get in touch and you have only been able to get a fraction of those links down and there is still a small fraction of those links left, that's where you can use our Disavow Tool".
The above process satisfies that requirement. In my experience, not much less than the above process meets that need. The overwhelming majority of those tackling these penalties try to perform the minimal amount of work possible, which is why forums are flooded with complaints about numerous attempts to remove manipulative link penalties and failing.
Upon completion of the above, THEN upload a Disavow list of the links you could not remove after every reasonable human effort. In my experience you should have removed at least 20% of the linking DOMAINS (with rare exceptions).
It can take up to 60 days thereafter, but if you truly cleaned up the links in a quality manner, then the Penguin issues should be fully resolved.
The top factors in determining whether you succeed or fail are:
1. Your determination to follow the above process thoroughly
2. The experience, training and focus of your team
You can resolve the issue in one round of effort and have the Penguin issue resolved within a few months....or you can be one of those site owners who thinks it is impossible and be struggling with the same issue a year later. If you are not 100% committed, RUN AWAY. By that I mean change domain names and start over.
Good Luck.
TLDR - Don't try to fool Google. Anchor text and site wide links are part of the MECHANISM used to identify manipulative links. Don't confuse the mechanism with the message. Google's clear message: EARN links, don't "build" links. Polishing up the old manipulative links is a complete waste of your time. AT BEST, you will enjoy limited success for a period of time until Google catches up. Many site owners and SEOs have already been there, and it is a painful process.
-
When you say "clean up" do you mean removing the links or disavowing them?
You will never be able to get them all removed, so in the end you will need to a Disavow anyways. If your time frame is short, you may want to make Priority One be doing a Disavow for each of the 50+ sites you are working with. Then you can proceed with attempting to get the links removed. I have not heard that there is any downside to having a link removed that already appears on your disavow file...
As for the order of the Priorities, you may want to shuffle them a bit depending on the different situations on the different websites. I suggest you read this Moz Blog article called It's Penguin-Hunting Season: How to Be the Predator and Not the Prey
...and then test a few of your sub-pages that used to rank well at the program used in this article which is called the Penguin Analysis Tool. I say sub-page because it needs a single keyword phrase you want rank that particular page for so it do the anchor text analysis. And that works better on focused sub-pages than on general homepages. $10 per website will let you fully evaluate two typical pages on each and see which facet of the link profile is most valuable to attack first.
-
Have you read the post at http://moz.com/blog/ultimate-guide-to-google-penalty-removal? Matt Cutts even called it out on Twitter as a good post. That's where I'd first look for ideas.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Upgarded woocommerce 2 to 3 and saw a large drop in rank
Hi guys, I recently upgraded from woocommerce 2 to v3 and rankings have been plummeting. I have double checked all of the schema, markup, technical, speed, etc and it all seems to be the same. The on thing that I did find is that while there used to be about 400 images (products) there are now over 1500 images in the index as all of the thumbnails are not in the sitemap. Would / could this cause the issue? Can i remove all the thumbs from the sitemap? if so any suggestions on how (google returns very little to nothing regarding this). Should i alt text every thumbnail..... Thanks in advance for any suggestions.
Intermediate & Advanced SEO | | plahpoy0 -
Large robots.txt file
We're looking at potentially creating a robots.txt with 1450 lines in it. This will remove 100k+ pages from the crawl that are all old pages (I know, the ideal would be to delete/noindex but not viable unfortunately) Now the issue i'm thinking is that a large robots.txt will either stop the robots.txt from being followed or will slow our crawl rate down. Does anybody have any experience with a robots.txt of that size?
Intermediate & Advanced SEO | | ThomasHarvey0 -
How long before your rankings improved after Penguin?
Those of you that have algorithmic penalties, how long after making changes did you actually see an improvement, or have you ever? I have several sites that tanked after Penguin 2.1 and after doing Link Removal, Diasvaow files and building newer more quality links and adding content, I am STILL not seeing any change in rankings after several months. I have heard from some people it can take up to 6-months for google to even crawl a disavow file. I have also heard no matter what you do it won't matter until Google does another update. I feel like we have made a lot of changes in the right direction, but I don't want to go overboard if nothing is going to matter until another Google Update is done. What are your experiences?
Intermediate & Advanced SEO | | netviper0 -
Will using 301 redirects to reduce duplicate content on a massive scale within a domain hurt the site?
We have a site that is suffering a duplicate content problem. To help resolve this we intend to reduce the amount of landing pages within the site. There are a HUGE amount of pages. We have identified the potential to reduce the pages by half at first by combing the top level directories, as we believe they are semantically similar enough that they no longer warrant being seperated.
Intermediate & Advanced SEO | | Silkstream
For instance: Mobile Phones & Mobile Tablets (Its not mobile devices). We want to remove this directory path and 301 these pages to the others, then rewrite the content to include both phones and tablets on the same landing page. Question: Would a massive amount of 301's (over 100,000) cause any harm to the general health of the website? Would it affect the authority? We are also considering just severing them from the site, leaving them indexed but not crawlable from the site, to try and maintain a smooth transition. We dont want traffic to tank. Has anyone performed anything similar? Id be interested to hear all opinions. Thanks!0 -
Penguin Penalty On A Duplicate url
Hi I have noticed a distinct drop in traffic to a page on my web site which occurred around April of last year. Doing some analysis of links pointing to this page, I found that most were sitewide and exact match commercial anchor text. I think the obvious conclusion from this is I got slapped by Penguin although I didn't receive a warning in Webmaster Tools. The page in question was ranking highly for our targeted terms and the url was structured like this: companyname.com/category/index.php The same page is still ranking for some of those terms, but it is the duplicate url: companyname.com/category/ The sitewide problem is associated with links going to the index.php page. There aren't too many links pointing to the non index.php page. My question is this - if we were to 301 redirect index.php to the non php page, would this be detrimental to the rankings we are getting today? ie would we simply redirect the penguin effect to the non php page? If anybody has come across a similar problem or has any advice, it would be greatly appreciated. Thanks
Intermediate & Advanced SEO | | sicseo0 -
Can't seem to get traffic back post Panda / Penguin. WHY?
I have done and am doing everything I can think of to bring back lost traffic after the late 2012 updates from google hit us. I just is not working. We had some issues with our out of house web developers which screwed up our site in 2012 and after taking it in house we have Eden doing damage control form months now. We think we have fixed pretty much everything. URL structure filling up with good unique content(under way. Lots still to do) making better category descriptions redesigned homepage. Updated product pages (CMS is holding things back on that part otherwise they would be better. New CMS under construction) started more link building(its a real weak spot on our SEO as far as I can see) audited bad links from dodgy irelavent sites. hired writers to create content and link bait articles. Begun making high quality video's for both YouTube (brand awareness and viral) and on site hosting (link building and conversions) (in the pipeline not online yet). Flattened out site architecture. optimise internal link flow (got this wrong by using nofollows. In the process of thinking of a better way by reducing nun wanted Nav links on page.) i realise its not all done but I have been working ever since the drop in traffic and I'm just seeing no increase at all. I have been asking a few questions on here for the past few days but still can't put my finger on the issue. Am I just impatient and need to wait on the traffic as I am doing all the correct things? Or have I missed something and need to fix it. you anyone would like to have a quick look at my site and see if there is an obvious issue I have missed It would be great as I have been tearing my hair out trying to find the issues with my site. It's www.centralsaddlery.co.uk Criticism would me much appreciated.
Intermediate & Advanced SEO | | mark_baird0 -
Penguin: Recovery from Algorithm
Hi Mozzers, A quick question regarding Google Penguin recovery. A domain I have was hit by Penguin and we got a message in Webmaster Tools. We went to work fixing links we thought were most harmful, documented the evidence and did a reconsideration request. Google's reply was "No manual spam actions found". If reconsideration requests aren't the way to go then of course I will continue to build good natural links. But if I remove more links which I consider to be harmful will I ever know if the penalty is removed? Is there a point at which the algorithm would remove the penalty and inform me? Thanks!
Intermediate & Advanced SEO | | panini0 -
Time for a new domain post-Penguin?
Hi there, Quick overview of a client who came to us after having the vast majority of their link value slashed by Penguin. Client only has a limited 'recovery budget' Client had been outsourcing SEO to a foreign company The website was 'keyword stuffed' when we arrived Links were of poor quality, the company clearly majoring on quantity rather than quality. Client was ranking #4 or #3 for a keyword which was bringing in sales. Post penguin, dropped to page 5 and then out of the top 100 for that keyword, losing 70% of sales. The client, under our supervision has Rewritten spammy content so they work for human beings (so it now reads well) Gone through the website and is removing old/duplicate and low-quality content De-emphasised other pages for the target keyword so that only one page majors on it. After doing the above has submitted a reconsideration request (about 2 weeks ago, so I know there's time). We are focussing on ensuring her content is written well and on building decent links to the site (i.e. to put some good-uns where the bad ones were). We're into month 2 of the 'clean-up exercise' and the site is still only ranking #90 for the keyword. Given the client's budgetary limitation, could it be more beneficial to consider a new brand identity and domain name to start afresh (without a 301 redirect) or should we just continue along the track we are doing with this client? Thanks!
Intermediate & Advanced SEO | | Nobody15609869897230