
Link Profile Cleaning



Preparing to remove low quality “Exact Match” anchor text links!

(Safeguarding your link building in 2013 – Part 2)


In part one of this series (Some Anchors Sink Ships), we looked at ways to identify “Exact Match” anchor text links. These links are high risk, particularly if too many of them point to your website for the same phrase.


Google will punish sites that over-optimize their inbound links with exact-match anchor text, so to safeguard your site you may have to remove a lot of these links or, at the very least, get the anchor text changed. Which option is appropriate depends on the quality of the site linking to you.


There are a number of ways and tools to complete this task, but I find the most effective method is to combine some paid tools with some free ones. The tools we will use are as follows:


  • Google Webmaster Tools (Free)
  • SEOMoz Open Site Explorer ($99 per month)
  • Majestic SEO (£39 per month)
  • Screaming Frog (£99 per annum)


If you don’t already have a free Google Webmaster Tools (GWT) account for your website, then you need to get one. Luckily, it couldn’t be easier to set up.


Simply go to www.google.com/webmasters/tools/ and log in using your Gmail account (you do have one, right?). Once logged in, you can add a website to manage. You will then need to follow the onscreen instructions to verify the website, which proves that you do in fact own the site.


Once completed, you can see all sorts of great information in regard to your website.


If you have been involved in aggressive link building campaigns in the past, then there is a good chance that a short message will be waiting for you in the GWT inbox, and no, it’s not a pleasant welcome message. More than likely it will be a warning from Google that they have noticed unnatural links pointing to your website. This is further confirmation that you have received some sort of penalty and need to take action!


Taking Action

So the first thing we are interested in seeing is the number of websites that link to your site, or at least as many as Google will show us. They do not offer information on every link, just a handful of the ones they feel are relevant.


Choose the site you want to manage by clicking on the appropriate URL; in this instance I will select the www.stevenforsyth.com website (shameless plug noted).



All the sites you manage should be listed here. If they are not, you will need to add them and verify each one by uploading a verification HTML file to the server: download the verification file from GWT, upload it to the live server, then choose the Verify Site option listed beside the site you just added in GWT.


You will now be brought to the dashboard, which is an overview screen. On the left-hand side you will notice a navigation menu. Click on the Traffic link to expand the menu, then select the Links to Your Site option, and you will see the following screen:



Here we see a snapshot of Who Links the Most, Your Most Linked Content and How Your Data Is Linked (Internal Only). Select the Who Links the Most section, then click on the More >> link below the links under that heading, and you will see this screen:



You will notice three large, friendly grey buttons along the top. The only button we are interested in is Download Latest Links; this is a new addition to GWT since the Penguin update rolled out. With so many people having to clean their profiles, Google made it easier to download only the latest, most important links to aid you with this task. Click the button to download your links into a CSV file.


Once downloaded, open the CSV file in Excel; you should see a list of links dated over the last three years or so. Select the whole of column B and delete it. Save the file as Complete-Link-Profile.xls.
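If you would rather script this step than do it by hand in Excel, the column removal can be sketched in a few lines of Python. This is only an illustration: the header names in the sample (`Links`, `First discovered`) are assumptions, and a real export may label its columns differently.

```python
import csv
import io


def drop_column(csv_text: str, index: int = 1) -> str:
    """Return the CSV text with one column removed (index 1 is column B)."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Keep everything before and after the unwanted column.
        writer.writerow(row[:index] + row[index + 1:])
    return out.getvalue()


# Hypothetical sample data; the header names are assumed, not taken from a real export.
sample = "Links,First discovered\nhttp://example.com/a,2012-05-01\n"
print(drop_column(sample))
```

To use it on a real download, read the file's contents into a string first and write the returned text back out to a new file.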

More Link Information Required

Now we need to get even more back links from our next source, SEOMoz’s Open Site Explorer.


This is a paid service, although you can sign up for a free trial initially. If you are serious about SEO, I highly recommend signing up for an annual subscription.


Enter the address of the website you want to check, in this case www.stevenforsyth.com and click Search. This will list all the links it finds pointing to the site. However, we need to do some adjusting before downloading.



In the screen above you will see four drop-down menus. You can ignore the first one. Set the second to Only External, as we only want to download external links. In the third, select Pages on this Sub domain, and leave the last one as it is.


Once these are selected, click on the Filter button. Just below the Filter button is the Download CSV link; click this to download your new file.


Open the new file and note the number of rows that contain URLs. Select all the URL rows in that column, then copy and paste them into your Complete-Link-Profile.xls file, below the links that are already there. Don’t worry about duplicate links; we’ll sort them out later.


Now we move on to our final source of back links, the ever-trusty Majestic SEO site. Again, this is a paid service, but one I highly recommend. If you don’t want to sign up for this, or budget simply doesn’t allow, don’t fret: the first two tools should still give you a profile extensive enough to fix the issue. I added Majestic only because it tends to list additional links, which means we are likely to create a more complete profile.


Once logged in, type the URL into the search box and make sure you select the Fresh Index radio button below it; we only want the latest links from the last few years. The History Index will give you details of all links built up over time, even the ones that are no longer active. Once you click search you will see this screen; click on the Backlinks tab:



Select the Remove Deleted Backlinks radio button and click on the Explore button to update the results. Now scroll right to the end, and on the left-hand side you will see a small link: Download CSV.


Download the file, open it, and copy the URL rows as before into your Complete-Link-Profile.xls file. We now have a complete list.
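The copy-and-paste merging described above can also be sketched as a short script. Treat this as a rough illustration rather than a finished tool: the column names (`Links`, `URL`) and the sample rows are assumptions, since each tool names its URL column differently and the real exports carry many more columns.

```python
import csv
import io


def extract_urls(csv_text: str, column: str) -> list[str]:
    """Return the non-empty values of one named column from CSV text."""
    urls = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(column) or "").strip()
        if value:
            urls.append(value)
    return urls


def merge_exports(exports: list[tuple[str, str]]) -> list[str]:
    """Concatenate the URL columns of several exports, duplicates included."""
    merged = []
    for csv_text, column in exports:
        merged.extend(extract_urls(csv_text, column))
    return merged


# Hypothetical exports; real files will have different headers and more columns.
gwt_export = "Links\nhttp://a.example/page\n"
ose_export = "URL,Title\nhttp://b.example/post,Post\nhttp://a.example/page,Page\n"

combined = merge_exports([(gwt_export, "Links"), (ose_export, "URL")])
print(combined)
```

Duplicates are deliberately left in at this stage, just as in the manual process.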

Removing Duplicates

Open your Complete-Link-Profile.xls file in Excel, select the entire URL column, go to the Data menu and click on the Remove Duplicates icon. This removes the duplicates you are bound to get from the crossover between three sources. Save the file.
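For anyone working outside Excel, de-duplication can be sketched as follows. Note one assumption that goes beyond the Remove Duplicates step: this version treats URLs as the same when they differ only in host capitalisation or a trailing slash, which you may or may not want.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Build a comparison key: lower-case scheme and host, no trailing slash."""
    parts = urlsplit(url.strip())
    key = f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path.rstrip('/')}"
    if parts.query:
        key += "?" + parts.query
    return key


def remove_duplicates(urls: list[str]) -> list[str]:
    """Keep the first occurrence of each URL, preserving the original order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(url.strip())
    return unique


# Hypothetical sample links.
links = ["http://Example.com/a/", "http://example.com/a", "http://example.com/b"]
print(remove_duplicates(links))
```

If you want a strict character-for-character comparison instead, replace the call to `normalize` with the URL itself.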

Removing Dead Links from the List

We now have a very big file with all the links listed. However, over time many links disappear for various reasons, so we need to test the entire list of URLs to find out which ones are still active, as these are the only ones we want listed here. To do this we will use the Screaming Frog software.


Open up Screaming Frog now. Along the top toolbar, click the file menu item called and select the option.



Next, click ‘select file’ and browse to where you saved your Complete-Link-Profile.xls file. Click open. The file reader will tell you how many URLs it found in the file you uploaded.


Now we need to configure the Custom Source Code Filter. Back up on the navigation, click on and then . In the custom filter configuration window you have several options available, select in the filter 1 dropdown, then enter your website URL in the text input field – as shown in the image below:



Click the Custom Tab. Select your chosen filter (Filter 1) from the Filter drop down on the left. Click Start.
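The idea behind this check, that a linking page only counts as active if it still loads and still contains a link to your site, can be approximated in a short script. This is a sketch under my own assumptions, not a description of how Screaming Frog works internally: it looks only at anchor `href` values rather than the raw source code, and the names `links_to_site` and `is_active_backlink` are made up for the example.

```python
from html.parser import HTMLParser
from http.client import HTTPException
from urllib.parse import urlsplit
from urllib.request import urlopen


class _HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_site(html: str, domain: str) -> bool:
    """True if any anchor in the HTML points at the domain or a subdomain of it."""
    collector = _HrefCollector()
    collector.feed(html)
    domain = domain.lower()
    for href in collector.hrefs:
        try:
            host = (urlsplit(href).hostname or "").lower()
        except ValueError:
            continue  # skip malformed hrefs
        if host == domain or host.endswith("." + domain):
            return True
    return False


def is_active_backlink(url: str, domain: str, timeout: float = 10.0) -> bool:
    """Fetch a page; treat fetch errors or a missing link as inactive."""
    try:
        with urlopen(url, timeout=timeout) as response:
            html = response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, HTTPException):
        return False
    return links_to_site(html, domain)


page = '<p>See <a href="http://www.stevenforsyth.com/">this site</a>.</p>'
print(links_to_site(page, "stevenforsyth.com"))
```

`is_active_backlink` makes a live network request, so run it over your list slowly and expect some sites to block or time out on scripted requests.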


The tool will now crawl through your link list and identify any links that are no longer active. We can remove these from our spreadsheet, leaving a complete list of active links to review and possibly remove. A quick review of what we have just done:


We gathered back link information from three different sources, copied it into one file, and removed all the duplicate and non-active links. Now we are ready to start cleaning the profile. To do this we need to identify poor quality links, gather contact information, reach out to webmasters, record the date of first contact and so on. We will use our spreadsheet to record all this information.
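To give a feel for what that tracking sheet might hold, here is one possible layout sketched in Python. The column names (`quality`, `contact`, `first_contacted`, `notes`) and the sample record are my own placeholders; use whatever columns suit your outreach process.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date
from typing import Optional


@dataclass
class LinkRecord:
    """One row of the tracking sheet (field names are illustrative)."""
    url: str
    quality: str = ""
    contact: str = ""
    first_contacted: Optional[date] = None
    notes: str = ""


def to_csv(records: list[LinkRecord]) -> str:
    """Serialise the records to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=[f.name for f in fields(LinkRecord)],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        row = asdict(record)
        # Store dates as ISO strings; leave the cell blank if not yet contacted.
        contacted = record.first_contacted
        row["first_contacted"] = contacted.isoformat() if contacted else ""
        writer.writerow(row)
    return out.getvalue()


# Hypothetical record for illustration only.
records = [
    LinkRecord(
        url="http://spam.example/links",
        quality="poor",
        contact="webmaster@spam.example",
        first_contacted=date(2013, 2, 1),
    )
]
print(to_csv(records))
```

The resulting text can be saved with a .csv extension and opened directly in Excel.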


In our next post, we will cover the actual removal/change process…what fun! 🙂


Steven Forsyth – Integrated Search Strategist, inthecompanyofhuskies.com