Blocking Affiliate Links via robots.txt
-
Hi,
I work with a client who has a large affiliate network pointing to their domain which is a large part of their inbound marketing strategy. All of these links point to a subdomain of affiliates.example.com, which then redirects the links through a 301 redirect to the relevant target page for the link. These links have been showing up in Webmaster Tools as top linking domains and also in the latest downloaded links reports. To follow guidelines and ensure that these links aren't counted by Google for either positive or negative impact on the site, we have added a block on the robots.txt of the affiliates.example.com subdomain, blocking search engines from crawling the full subddomain. The robots.txt file is the following code:
User-agent: *
Disallow: /
We have authenticated the subdomain with Google Webmaster Tools and made certain that Google can reach and read the robots.txt file. We know they are being blocked from reading the affiliates subdomain. However, we added this affiliates subdomain block a few weeks ago to the robots.txt, but links are still showing up in the latest downloads report as first being discovered after we added the block. It's been a few weeks already, and we want to make sure that the block was implemented properly and that these links aren't being used to negatively impact the site. Any suggestions or clarification would be helpful - if the subdomain is being blocked for the search engines, why are the search engines following the links and reporting them in the www.example.com subdomain GWMT account as latest links. And if the block is implemented properly, will the total number of links pointing to our site as reported in the links to your site section be reduced, or does this not have an impact on that figure?From a development standpoint, it's a much easier fix for us to adjust the robots.txt file than to change the affiliate linking connection from a 301 to a 302, which is why we decided to go with this option.Any help you can offer will be greatly appreciated.Thanks,Mark
-
I think you did the right thing. Engines will take a while until they re-crawl your robots.txt and actually following what you commanded.
Extra steps I would take:
- 302 the redirect, probably is just a line of code doing the redirect after setting some cookies or session variables.
- Try to edit the affiliate codes to work with Javascript instead of naked URLs ( could be something like <ins class="affiliate">that is later switched to a text link or banner using JS). This will not only allow you to set a nofollow for those links, but you could be able to remove/block specific affiliates or pages where you don't want your links/banners.</ins>
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt blocking Addon Domains
I have this site as my primary domain: http://www.libertyresourcedirectory.com/ I don't want to give spiders access to the site at all so I tried to do a simple Disallow: / in the robots.txt. As a test I tried to crawl it with Screaming Frog afterwards and it didn't do anything. (Excellent.) However, there's a problem. In GWT, I got an alert that Google couldn't crawl ANY of my sites because of robots.txt issues. Changing the robots.txt on my primary domain, changed it for ALL my addon domains. (Ex. http://ethanglover.biz/ ) From a directory point of view, this makes sense, from a spider point of view, it doesn't. As a solution, I changed the robots.txt file back and added a robots meta tag to the primary domain. (noindex, nofollow). But this doesn't seem to be having any effect. As I understand it, the robots.txt takes priority. How can I separate all this out to allow domains to have different rules? I've tried uploading a separate robots.txt to the addon domain folders, but it's completely ignored. Even going to ethanglover.biz/robots.txt gave me the primary domain version of the file. (SERIOUSLY! I've tested this 100 times in many ways.) Has anyone experienced this? Am I in the twilight zone? Any known fixes? Thanks. Proof I'm not crazy in attached video. robotstxt_addon_domain.mp4
Technical SEO | | eglove0 -
Sitemap links
Hi, I´m running a sitemap using pro-sitemaps and I find several pages that shouldn´t be listed. How do I find how are these pages being generated? Can´t find the links the robot is following to get to those pages..
Technical SEO | | ceci27100 -
Too Many Links?
Search Term is Indianapolis Wedding Photographers. Site is http://www.tallandsmallphotography.com/ Their metrics are through the roof compared to everyone else's. They've dropped from 27 in May to 40 Now. 'A' Grade on-site optimization. Either there's too many links, or there's some bad links involved... I don't know which it is...
Technical SEO | | WilliamBay0 -
Internal linking disaster
Can someone help me understand what my devs have done? The site has thousands of pages but if there's an internal homepage link on all of the pages (click on the logo) shouldn't that count for internal links? Could it be because they are nonfollow? http://goo.gl/0pK5kn I've attached my competitors opensiteexplorer rankings (I'm the 2nd column) .. so despite the face the site is new you can see where I'm getting my ass kicked. Thanks! psRsQtH.png
Technical SEO | | bradmoz0 -
Fix or Block Webmaster Tools URL Errors Not Found Linked from a certain domain?
RE: Webmaster Tool "Not Found" URL Errors are strange links from webstatsdomain.com Should I continue to fix 404 errors for strange links from a website called webstatsdomain.com or is there a way to ask Google Webmaster Tools to ignore them? Most of Webmaster Tools "URL Not Found errors" I find for our website are from this domain. They refer to pages that never existed. For example, one was to www.mydomain.com/virtual. Thanks for your help.
Technical SEO | | zharriet0 -
Too Many On-Page Links?
How much would this affect my page ranks performance? There are many Too Many On-Page Links? warning on my campaign. should I address this issue right away to fix it or leave it as it would not matter seriously ? I've looked at some of the pages and think all of them are necessary. Could someone help me? Thanks!
Technical SEO | | LauraHT0 -
Links to Website Author
I'm a website developer, and in the past I have usually added a tiny backlink to the footer of my clients' websites like this: Website Design by MyCompanyName I understand that Google sees this as a low-quality backlink. However, I was wondering if such links can hurt my rankings. Does Penguin sees these links as spam? If so, should I add a rel="nofollow" to the links? Is there anything else I should change? I do not want to remove these links completely because they are good for marketing my business. I just want to minimize any negative SEO impact of the links. I appreciate your input. Thanks.
Technical SEO | | SiteWizard_LLC0 -
Different links to to the same page
Hi, Based on the user's actions we post activity into users Facebook timeline. And each activity has link back to our particular page on our website. For example if original page was: www.Domain.com from Facebook timeline it would be like this: www.Domain.com?Ffb_action_ids=101508953168 Do you think this will have a negative effect on our page rankings as we will eded up having a lot of different URL's to the same page? www.Domain.com?Ffb_action_ids=101508953168 www.Domain.com?Ffb_action_ids=456788765609 etc.. Thank you, Karen Bdoyan
Technical SEO | | showme0