Why don't Moz OSE, Ahrefs, Majestic and so on change their user agent while crawling?
-
Some blackhat websites, PBNs and other "cheaters" use various methods to block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking and the like.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could randomize their user agent strings, URLs and IPs to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
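
To make those three rules a bit more concrete, here is a minimal Python sketch of how a site's robots.txt shuts out named backlink crawlers and how a crawler that identifies itself honestly would check it before fetching. The robots.txt content, the example.com URLs and the `allowed` helper are assumptions made up for illustration; the bot tokens are the ones these crawlers are commonly reported to use, so treat them as assumptions too.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of a site that blocks two backlink crawlers
# by name but lets every other bot in (except for /private/).
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler asks under its real name and skips the URL on False.
    for bot in ("AhrefsBot", "MJ12bot", "Googlebot"):
        print(bot, allowed(bot, "https://example.com/page.html"))
```

The point of the sketch is that the block only works because the crawler announces its real name. A crawler that lied about its user agent would pass this check, which is exactly the behaviour the three rules above exclude.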
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so their crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs. Those sites often get penalized by Google, and third-party crawlers have no way to know this, so the blocking effectively keeps penalized sites out of the third-party indexes.
-
Well, I think bot blocking is an obvious problem even now, and it will matter even more tomorrow as private networks multiply, as you can imagine.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor a welcome solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever), that could be detected, and they would be castigated. The crawler's IP address and the user agent it claims would be out of sync, and that mismatch is easy to spot.
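
For anyone wondering how that mismatch gets caught, here is a rough Python sketch of the usual reverse-then-forward DNS check a site owner can run on a visitor claiming to be Googlebot. The function names, the injectable `reverse`/`forward` resolvers and the hostname suffix list are assumptions for illustration. The suffixes follow Google's published verification guidance as I understand it, so check the current documentation before relying on them.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Google crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_dns(ip: str) -> str:
    """Return the primary hostname that the IP address resolves to."""
    return socket.gethostbyaddr(ip)[0]


def forward_dns(host: str) -> List[str]:
    """Return the IPv4 addresses that the hostname resolves to."""
    return socket.gethostbyname_ex(host)[2]


def is_genuine_googlebot(
    ip: str,
    user_agent: str,
    reverse: Callable[[str], str] = reverse_dns,
    forward: Callable[[str], List[str]] = forward_dns,
) -> bool:
    """Check that a visitor claiming to be Googlebot comes from a Google host."""
    if "googlebot" not in user_agent.lower():
        return False  # the visitor does not even claim to be Googlebot
    try:
        host = reverse(ip)
        if not host.endswith(GOOGLE_SUFFIXES):
            return False  # user agent says Google, IP says otherwise
        # The hostname must resolve back to the same IP address.
        return ip in forward(host)
    except OSError:
        return False  # DNS lookup failed: treat the visitor as unverified
```

A third-party crawler sending Googlebot's user agent from its own servers would fail the suffix check on the first lookup, which is the "out of sync" situation described above.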