Why don't Moz OSE, Ahrefs, Majestic and so on change their user agent while crawling?
-
Some blackhat websites, PBNs and other "cheaters" use various methods to block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt, IP blocking and such.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could randomize user agent strings, URLs and IPs to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, they are only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so their crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs: those sites often get penalized by Google, and 3rd party crawlers have no way to know this, so the blocking effectively keeps penalized sites out of the indexes.
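For anyone wondering what those three rules look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot name `ExampleLinkBot`, its info URL and the helper names are made up for illustration; this is not how OSE, Ahrefs or Majestic actually implement their crawlers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical bot identity, for illustration only. A well-behaved crawler
# sends one stable, honest User-Agent so site owners can recognise it.
USER_AGENT = "ExampleLinkBot/1.0 (+https://example.com/bot)"


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(rules: RobotFileParser, url: str, user_agent: str = USER_AGENT) -> bool:
    """Obey robots.txt: True only if the rules allow this agent to fetch the URL."""
    return rules.can_fetch(user_agent, url)


def polite_delay(rules: RobotFileParser, user_agent: str = USER_AGENT,
                 default: float = 1.0) -> float:
    """Crawl politely: honour Crawl-delay if present, else use a default pause (seconds)."""
    delay = rules.crawl_delay(user_agent)
    return float(delay) if delay is not None else default
```

A crawler built this way would call `may_fetch` before every request and sleep for `polite_delay` between requests. A site that writes `Disallow: /` for the bot's name simply drops out of the index, which is exactly the blocking discussed in this thread.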
-
Well, I think bot blocking is an obvious problem even now, and it will matter more tomorrow as private networks multiply, as you can imagine.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor the wanted solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here: we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever) that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
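To illustrate the "out of sync" point: a site owner can check whether a visitor claiming to be Googlebot really comes from Google with a reverse DNS lookup followed by a forward lookup. The sketch below assumes the hostname suffixes `googlebot.com` and `google.com` for genuine Googlebot hosts; the function name and the injectable lookup parameters are hypothetical, added only so the logic can be tried without network access.

```python
import socket

# Assumed hostname suffixes for genuine Googlebot hosts.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_genuine_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True only if reverse and forward DNS both tie the IP to Google.

    reverse_lookup(ip) -> hostname and forward_lookup(hostname) -> list of IPs
    default to real DNS queries; pass fakes to test offline.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip)
        # Step 1: the PTR record must point into a Google-owned domain.
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # Step 2: that hostname must resolve back to the same IP, which
        # defeats anyone who merely fakes the PTR record.
        return ip in forward_lookup(host)
    except OSError:
        # Lookup failures (no PTR record, unknown host) count as unverified.
        return False
```

A third-party crawler that sends Googlebot's user agent string from its own IP range would fail the first step, which is why the spoofing would be noticed quickly.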