Any way to pull anchor text?
-
Hi guys,
So basically I have a list of URLs/domains and their backlinks (example: http://s29.postimg.org/ujxm0c4lj/screenshot_677.jpg), but I'm missing the anchor text. Can anyone recommend any tools that can scan a backlink, locate the URL/domain on the page, and then pull the anchor text?
Cheers, Chris
-
Hi Matt!
No, I have not yet found a tool that can do this.
The ScrapeBox Anchor Text plugin CleverPhD mentioned can only do this for one domain at a time. I need it for multiple domains.
Any other suggestions?
-
Hi Jay! Did you get this worked out?
-
Thanks Jay. If I look on the backlinks side, they all seem to have the same subdomain in some form or another. You would just need to set up the regex in Screaming Frog to look for that keyword in the subdomain, so it should match all the variants of it.
That said, ignore everything I just posted. I was thinking earlier, "Surely there is scraper software out there that does this already." I did not take the time to look. Your mention of Scrapebox reminded me of that.
Scrapebox has a separate addon that does this
http://www.scrapebox.com/anchor-text-checker
The ScrapeBox Anchor Text Checker allows you to enter your domain and then load a list of URLs that contain your backlink. It will scan all the URLs containing your link and extract the anchor text used by the websites that link to you.
-
I basically want the anchor text so I can easily identify the location of the link on the page without needing to view source and search for the URL.
This export is directly from the Scrapebox backlink checker, which doesn't give you anchor text: http://s29.postimg.org/ujxm0c4lj/screenshot_677.jpg
-
Ok. Can you be more specific on what you are trying to accomplish with this data? I think that may help my understanding of what you are trying to do.
-
Thanks CleverPhD. Sorry, I should have mentioned I'm looking to do this for multiple domain names, not just one. The method you describe works great for a single domain.
-
Screaming Frog can do this with custom extraction and list mode. If I am reading your question correctly, you have a list of URLs and which pages on your site they link to.
You would upload the list of URLs into Screaming Frog so it knows what pages to scan, and run it in list mode:
http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/#15
You would then use the custom extraction tool to grep for the <a href> code that links to your domain:
http://www.screamingfrog.co.uk/web-scraper/
You would need to plug in a regular expression that looks for your domain (or versions of it) and then captures the rest of the HTML tag, including the anchor text, all the way through the closing </a>.
You should then be able to import that data into a spreadsheet and use text to columns to split the anchor text into its own column.
It is a little tricky as the regular expression may have to be tweaked depending on how other sites link to your site. Run the Frog on a test group of 10 or so to make sure it works. If you have a bunch of errors, take the error examples and tweak the regular expression based on those.
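For anyone who wants to try the regex idea outside of Screaming Frog first, here is a rough Python sketch of the same approach. It is only an illustration under stated assumptions: the function names, the sample HTML, and the domains (example.com, other.test) are made up for the demo, and the pattern assumes quoted href values. It also does a plain substring match on the domain, so "example.com" would also match "notexample.com"; tighten it if that matters for your list.

```python
import re


def anchor_pattern(domain):
    """Build a regex matching <a ... href="...domain...">anchor</a>.

    Hypothetical helper: assumes the href value is wrapped in quotes.
    """
    return re.compile(
        r'<a\b[^>]*\bhref\s*=\s*["\'][^"\']*'
        + re.escape(domain)
        + r'[^"\']*["\'][^>]*>(.*?)</a>',
        re.IGNORECASE | re.DOTALL,
    )


def extract_anchors(html, domain):
    """Return the anchor text of every link in html that points at domain."""
    anchors = []
    for inner in anchor_pattern(domain).findall(html):
        text = re.sub(r"<[^>]+>", "", inner)   # drop nested tags such as <b>
        anchors.append(" ".join(text.split()))  # collapse whitespace
    return anchors


# Made-up page source standing in for one backlink page.
sample_html = """
<p>See <a href="http://www.example.com/page" rel="nofollow">Example <b>Widgets</b></a>
and <a href='https://other.test/'>Other site</a>
and <a class="x" href="http://blog.example.com/">the blog</a>.</p>
"""

print(extract_anchors(sample_html, "example.com"))
```

For multiple domains, the same function could be called once per row of a backlink/domain list, passing each page's source and the domain it is supposed to link to. As with the Screaming Frog version, test it on 10 or so pages and adjust the pattern based on whatever it misses.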