Post Penguin Recovery: Link Removal Strategy for Back Link Profile Clean Ups

Since the introduction of the Penguin update and what some SEOs like to call the BLOOPER algorithm (Back Link Over Optimisation Penalty Exterminates Rankings), it has become more important than ever to ensure your site has as clean a link profile as possible. Historically, most sites would have at least a few dodgy links, and some more than others. Sadly, this has meant that a lot of businesses have been caught in the crossfire of Penguin updates. Part of the update appears to look at the type of sites linking to yours, the anchor text diversity of your profile, and the type of links. For example, an over-optimised link profile with a large share of links using the same few anchors could trigger a penalty, as could a large number of blogroll links.

This means that the SEO industry needs to make sure it has an up-to-date link removal strategy in case disaster strikes. Below is a brief strategy to get you started on this process. It covers the different levels of information you should be looking at, shows you how to acquire the information you don’t have, and gives you a few tips on removal strategies and reconsideration requests.

Getting Started

The first thing to do is run an extract from all your available link data sources, such as OSE, Majestic, Ahrefs, Blekko and Google Webmaster Tools.

Once these have been collated, you need to look into your own link building efforts, or work carried out by agencies or freelancers on your behalf, ensuring you go as far back as possible. The aim is to have your entire link building data in one place, so that you can cross-reference and analyse it all at once.

Hopefully, most of the links you have built are pre-classified into directories, guest blogs, widget links, syndicated content, paid links (!), etc. Get as much of this classification data as possible.

A few notes:

  1. OSE data is typically 2-3 months old, but sometimes it captures things you may not have through other tools.
  2. Majestic is a great tool – the data is much fresher, but the hidden win is the volume of information that it provides.
  3. Ahrefs and Blekko are good for verification but don’t use them as your sole source of information; think of them as being supplemental.
  4. Webmaster Tools is key here. A link highlighted in there means Google is telling you explicitly “We know about this link”. Your other links may be important, but these are THE most important to action.
  5. Links you have physically built are useful to have records of, and if you haven’t kept a rolling record, it’s time you started. It is easier to get a link removed if you placed it yourself – there is a contact trail, payment trail etc.

Starting Work

The first step is to take all the backlink data that you’ve gathered from the various sources and put it into one large spreadsheet. This will allow you to:

  • De-dupe links from various tools – giving you a single view
  • Create a master reference sheet
  • Compare links and link types
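If you would rather script the de-duplication than do it by hand in a spreadsheet, a minimal Python sketch might look like the following. The source names, sample URLs and the `normalise` rule (ignore the scheme, a leading `www.` and a trailing slash) are all illustrative assumptions, not the export format of any particular tool.

```python
from urllib.parse import urlsplit

def normalise(url):
    """Reduce a URL to host + path (+ query) so near-identical URLs compare equal.

    Assumption: scheme, a leading 'www.' and a trailing slash do not matter.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    query = "?" + parts.query if parts.query else ""
    return host + path + query

def build_master(exports):
    """exports maps a source name to a list of (linking_url, anchor_text) rows."""
    master = {}
    for source, rows in exports.items():
        for url, anchor in rows:
            key = normalise(url)
            entry = master.setdefault(
                key, {"url": url, "anchor": anchor, "sources": set()}
            )
            entry["sources"].add(source)  # remember every tool that reported it
    return master

# Hypothetical sample data standing in for real tool exports.
exports = {
    "majestic": [("http://www.example-blog.com/post/", "cheap widgets")],
    "ose": [
        ("http://example-blog.com/post", "cheap widgets"),
        ("http://directory.example.org/widgets", "Widget Co"),
    ],
    "webmaster_tools": [("http://example-blog.com/post", "cheap widgets")],
}

master = build_master(exports)
for key, entry in sorted(master.items()):
    flag = "in Webmaster Tools" if "webmaster_tools" in entry["sources"] else ""
    print(key, sorted(entry["sources"]), flag)
```

Keeping the set of sources per link means the same structure also tells you which links appear in Webmaster Tools, without a second pass.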

If you have records of them (and there’s no reason why you shouldn’t!), you should cross-reference all the links you and/or your SEO team have acquired in the last X years, going back as far as possible. These should also be highlighted in the master sheet.

As part of this exercise, the next step would be to highlight all the links that appear both in your master list AND in Google Webmaster Tools. As mentioned earlier, these are the links Google explicitly says it knows about, which means they should form a large part of your clean up focus. However, before you start actioning those links, you need to dig a bit deeper into the data and isolate links into groups. A common set of classifications that should be easy to run in your master sheet would be:

  1. Links from free website facilities, such as WordPress, Blogger, Weebly etc.
  2. Site wide links – easy to spot if you have too many of the same anchor links to the same page on your site from another site.
  3. Image based links – especially hot linked images or banner placements
  4. Links you have acquired yourself, whether paid for, begged for or guest blogged
  5. Directory links – (when cross referenced against your own directory submission lists if you have them)
  6. Non common domain extensions – .edu and country specific extensions
  7. Links on similar IPs – Majestic has a unique IP report that highlights these
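Some of the classifications above are mechanical enough to automate. The sketch below labels free-hosted links, uncommon domain extensions and possible site wide links. The host list, the set of "common" extensions and the site wide threshold are assumptions chosen for the example, so tune them to your own profile.

```python
from collections import Counter
from urllib.parse import urlsplit

# All three settings are illustrative assumptions, not fixed rules.
FREE_HOSTS = ("wordpress.com", "blogspot.com", "weebly.com")
COMMON_TLDS = {"com", "net", "org", "uk"}
SITEWIDE_THRESHOLD = 3  # same host + anchor + target seen this many times

def host_of(url):
    return urlsplit(url).netloc.lower()

def classify(links):
    """links: list of dicts with 'url', 'anchor' and 'target' keys.

    Returns a list of (url, set_of_labels) pairs.
    """
    repeats = Counter(
        (host_of(l["url"]), l["anchor"], l["target"]) for l in links
    )
    results = []
    for l in links:
        host = host_of(l["url"])
        labels = set()
        if any(host == h or host.endswith("." + h) for h in FREE_HOSTS):
            labels.add("free-host")
        if host.rsplit(".", 1)[-1] not in COMMON_TLDS:
            labels.add("uncommon-extension")
        if repeats[(host, l["anchor"], l["target"])] >= SITEWIDE_THRESHOLD:
            labels.add("possible-site-wide")
        results.append((l["url"], labels))
    return results

# Hypothetical sample rows.
links = [
    {"url": "http://blogroll.example.com/page1", "anchor": "cheap widgets", "target": "/"},
    {"url": "http://blogroll.example.com/page2", "anchor": "cheap widgets", "target": "/"},
    {"url": "http://blogroll.example.com/page3", "anchor": "cheap widgets", "target": "/"},
    {"url": "http://myseoblog.wordpress.com/post", "anchor": "widgets", "target": "/"},
    {"url": "http://uni.example.edu/links", "anchor": "Widget Co", "target": "/"},
]

labels = dict(classify(links))
for url, found in labels.items():
    print(url, sorted(found))
```

Image links, paid links and directory links still need your own records or a manual check, so this only covers the groups a URL and anchor can reveal.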

Micro analysing

Sometimes it is difficult to micro analyse your links – especially if you have thousands of root domains. However, with some of the steps above you would have hopefully classified a large portion if not all; but, if you get stuck, you don’t need to despair.

Ideally the kind of information you would want for each link you didn’t place would be:

  1. Type of site
  2. Type of Link (here you have to set some qualitative targets – explain what a network site looks like for example)
  3. Contact details – on-site contact form
  4. Contact details – via WHOIS
  5. Contact details – hosting company
  6. Any advertising information on the site such as “advertise with us”
  7. Details on trademark policy, DMCA policy etc if they exist

There might be further classifications that you require in order to understand your links better, but that depends on how much information you would need if a removal was necessary.

Once you’ve outlined the details you need per link, you then ought to isolate all the links you need extra information on and compile a new list. This new list would form the basis of a “task” that you could send to a remote worker (who can be hired through a number of means, such as oDesk).
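As a rough illustration, the task list could be generated from the master sheet by keeping only the links that still have blank research fields. The column names below are hypothetical labels for the details listed earlier in this section, not a required format.

```python
import csv
import io

# Hypothetical column names mirroring the details wanted per link.
FIELDS = [
    "url", "site_type", "link_type", "contact_form", "whois_contact",
    "hosting_company", "advertising_page", "policy_pages",
]

def build_task_rows(links):
    """Keep only links where at least one research field is still blank."""
    rows = []
    for link in links:
        row = {field: link.get(field, "") for field in FIELDS}
        if any(not row[field] for field in FIELDS[1:]):
            rows.append(row)
    return rows

def to_csv(rows):
    """Render the task rows as CSV text that can be handed to a worker."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical sample: the first link is fully researched, the second is not.
links = [
    {"url": "http://directory.example.org/widgets", "site_type": "directory",
     "link_type": "listing", "contact_form": "yes", "whois_contact": "yes",
     "hosting_company": "ExampleHost", "advertising_page": "no",
     "policy_pages": "none"},
    {"url": "http://blogroll.example.com/page1", "site_type": "blog"},
]

task_rows = build_task_rows(links)
print(to_csv(task_rows))
```

The blank columns in the output are the ones the remote worker is being asked to fill in.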

Why use a remote working service? To start with, this is a very manual process and could take a few days. As such, it makes more sense to split it across multiple workers who would cost a lot less than UK / US manpower would. (TIP: if you are an SEO company, you may want to develop a custom scraping tool that compiles a lot of this info together for you!)

Once you have carried this exercise out, you are ready to roll (i.e. you have as much information as possible to get a link removed)!

Risk Analysis

There are no hard and fast tools or rules for a decent back link risk analysis. However, you should prioritise links with these attributes as potential candidates for removal:

  1. It’s indicated in Google Webmaster Tools
  2. Too many links from the same domain
  3. Too many links from the same IP
  4. Authority domain extensions such as .edu
  5. Spam domain extensions such as .info / .co / .cc
  6. *Free* hosted sites such as WordPress subdomains, Weebly etc
  7. Sites with narrow match anchor texts
  8. Sites with obvious “advertise with us” footprints
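One way to turn these attributes into a working order is a simple additive score. The sketch below is only that: the flag names and weights are arbitrary assumptions chosen for illustration, since there are no hard and fast rules for what each attribute should be worth.

```python
# Arbitrary example weights; adjust to reflect your own judgement.
WEIGHTS = {
    "in_webmaster_tools": 3,
    "many_links_same_domain": 2,
    "many_links_same_ip": 2,
    "authority_extension": 1,
    "spam_extension": 2,
    "free_hosted": 2,
    "narrow_anchor": 2,
    "advertise_footprint": 2,
}

def risk_score(flags):
    """Sum the weights of the flags set on a link (unknown flags raise KeyError)."""
    return sum(WEIGHTS[name] for name in flags)

def prioritise(links):
    """Return links ordered from highest to lowest risk score."""
    return sorted(links, key=lambda link: risk_score(link["flags"]), reverse=True)

# Hypothetical sample links with the attributes already flagged.
links = [
    {"url": "http://uni.example.edu/links", "flags": {"authority_extension"}},
    {"url": "http://myseoblog.wordpress.com/post",
     "flags": {"in_webmaster_tools", "free_hosted", "narrow_anchor"}},
    {"url": "http://cheap-links.example.info/",
     "flags": {"spam_extension", "many_links_same_ip"}},
]

ranked = prioritise(links)
for link in ranked:
    print(risk_score(link["flags"]), link["url"])
```

The point of the score is only to decide which links to investigate first; it does not decide on its own that a link must go.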

Removal Tips

  1. Be Polite. Be respectful. Take the human approach first – and start with links you have identified as high risk.
  2. You may not want to remove all links. You could:
    • Ask for a nofollow attribute to be added to the link (preserving any referral traffic)
    • Ask for the anchor text to be changed if you over-used particular anchors historically
  3. Allocate a budget. Some site owners will charge you for a link removal. Be prepared for that and don’t be shocked – make sure you negotiate though.
  4. Any Blogger / WordPress / Weebly / Squidoo link can, in most cases, be taken down with a single complaint to the platform. Just make sure that your complaint is justified, and point out that the subdomain or site was built for SEO purposes and is hurting your business.
  5. Image links. See if they are using any trademarked images and use that as a good contact point – “please remove any unauthorised images and links”. If that doesn’t work, then consider sending a DMCA takedown notice.
  6. Check those authority links. In most cases, a UK site has NO reason to have EDU links. Commercial sites generally have no reason to have such links. Be realistic; if those links have been placed, you want to keep them in your high risk category.
  7. Check your IP referrers. If there is a high volume in one particular C-class subnet (Majestic is great for this data by the way), chances are a bunch are held by a link network. Try and get these checked. If those sites are still indexed, they are high risk.
  8. Some sites will have “advertising” pages on them which are obviously paid for. Say you want to advertise, but tell them to remove their current link.
  9. Redirects (again, Majestic excels at providing redirected domains inbound data). If they are domains that your clients don’t own, contact the registrar. Explain the situation.
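The IP referrer check in tip 7 can be approximated by grouping referring domains into /24 blocks, the range usually meant by a "C-class" subnet. This is a minimal sketch using Python's standard `ipaddress` module. The domains, the documentation-range IPs and the `min_domains` threshold are all made up for the example.

```python
import ipaddress
from collections import defaultdict

def group_by_subnet(referrers, prefix=24):
    """referrers: list of (domain, ip) pairs. Groups domains by their /prefix network."""
    groups = defaultdict(list)
    for domain, ip in referrers:
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        groups[str(network)].append(domain)
    return groups

def crowded_subnets(referrers, min_domains=3):
    """Return only the subnets hosting at least min_domains referring domains."""
    return {
        network: domains
        for network, domains in group_by_subnet(referrers).items()
        if len(domains) >= min_domains
    }

# Hypothetical referring domains and IPs (documentation address ranges).
referrers = [
    ("blog-a.example.com", "203.0.113.5"),
    ("blog-b.example.com", "203.0.113.17"),
    ("blog-c.example.com", "203.0.113.200"),
    ("news.example.org", "198.51.100.9"),
]

suspects = crowded_subnets(referrers)
for network, domains in suspects.items():
    print(network, domains)
```

A crowded subnet is only a prompt to look closer, because shared hosting can put unrelated sites on neighbouring IPs.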

Building a case for reconsideration

The more detailed and informative your re-inclusion request is, the higher your chances of the webspam team looking at it favourably. The party line is they want to see “a good faith effort”. This means:

  1. Give as much information as possible
  2. Actually try and get offending links removed
  3. Prove that you have taken as much action as possible to remove those links and will continue to do so

If you have already gone as far as a reconsideration request, you would probably have done some of the above. But remember to:

  1. Keep a spreadsheet on Google Docs recording your removal efforts and successful removals. Share the link to it in your reconsideration request.
  2. Take screenshots of emails sent to sites whose links you were unable to get removed. Host these on Google Drive and give the webspam team access to them.
  3. If you have used networks etc. that you can’t get links removed from, make sure you list them.

As long as you can prove that you have taken every pain and effort necessary to remove those links, you should be in good stead when trying to revive your rankings.
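If your removal log is kept in a structured form, a short script can produce the headline numbers for the request. The sketch below assumes a hypothetical log where each entry has a `url` and a `status`. The status names are invented for the example.

```python
from collections import Counter

# Assumed status values that count as a resolved link.
RESOLVED = {"removed", "nofollowed", "anchor_changed"}

def summarise(log):
    """log: list of dicts with 'url' and 'status' keys.

    Returns (counts per status, list of URLs still outstanding).
    """
    counts = Counter(entry["status"] for entry in log)
    outstanding = [e["url"] for e in log if e["status"] not in RESOLVED]
    return counts, outstanding

# Hypothetical outreach log.
log = [
    {"url": "http://blogroll.example.com/page1", "status": "removed"},
    {"url": "http://myseoblog.wordpress.com/post", "status": "removed"},
    {"url": "http://directory.example.org/widgets", "status": "nofollowed"},
    {"url": "http://cheap-links.example.info/", "status": "no_reply"},
]

counts, outstanding = summarise(log)
print(dict(counts))
print("Still outstanding:", outstanding)
```

The outstanding list doubles as the set of links you need email screenshots for.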


  • Collate all back link data
  • Classify back link data
  • Identify and isolate high risk links
  • Start contacts and record
  • Submit re-inclusion
  • Sit back and wait for response
  • Rinse and repeat


By Matthew Taylor

Matthew is one of the most experienced members of the SEOptimise team and works on a number of large clients. With a background in web design, Matthew is also responsible for the SEOptimise website as well as specialist content production, including infographics.