Restricted by robots.txt: does this cause problems?
-
I have restricted around 1,500 links, which are links to retailers' websites and affiliate links, according to Webmaster Tools.
Is this the right approach, as I thought it would affect the link juice? Or should I take the nofollowed links out of the robots.txt restriction?
-
Hello Ocelot,
I am assuming you have a site that has affiliate links and you want to keep Google from crawling those affiliate links. If I am wrong, please let me know. Going forward with that assumption then...
That is one way to do it. So perhaps you first send all of those links through a redirect via a folder called /out/ or /links/ or whatever, and you have blocked that folder in the robots.txt file. Correct? If so, this is how many affiliate sites handle the situation.
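For example, assuming the redirect folder is /out/ (the folder name is just illustrative), the robots.txt block would look something like this:

User-agent: *
Disallow: /out/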
I would not rely on rel nofollow alone, though I would use that in addition to the robots.txt block.
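For illustration, a link marked up both ways might look like this (the /out/ path and link ID are hypothetical):

<a href="http://www.yoursite.com/out/123" rel="nofollow">Visit retailer</a>

The robots.txt block keeps compliant crawlers out of the /out/ folder, and the nofollow tells Google not to pass link equity through the link even if it discovers the URL some other way.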
There are many other ways to handle this. For instance, you could make all affiliate links JavaScript links instead of href links. Then you could put the JavaScript into a folder called /js/ or something like that, and block that in the robots.txt file. This works less and less now that Google Preview Bot seems to be ignoring the disallow statement in those situations.
You could make it all the same URL with a unique identifier of some sort that tells your database where to redirect the click. For example:
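Here is a minimal sketch of that approach; the file name, class name, and redirect path are all hypothetical:

<!-- In the page: no crawlable href, just a data attribute -->
<script src="/js/outlinks.js"></script>
<span class="outlink" data-link-id="123">Visit retailer</span>

// /js/outlinks.js (with /js/ disallowed in robots.txt)
document.addEventListener('click', function (e) {
  var link = e.target.closest('.outlink');
  if (link) {
    // Send the visitor through the server-side redirect for this link ID
    window.location.href = '/out/?id=' + link.getAttribute('data-link-id');
  }
});

Because the markup contains no href, there is nothing for a crawler to follow, and the script itself sits behind the robots.txt block.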
www.yoursite.com/outlink/mylink#123
or
www.yoursite.com/mylink?link-id=123
In which case you could then block /mylink in the robots.txt file and tell Google to ignore the link-id parameter via Webmaster Tools.
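In the second example, the corresponding robots.txt entry would be something like this (path taken from the example above):

User-agent: *
Disallow: /mylink

...and the link-id parameter would be set to be ignored in the Webmaster Tools parameter-handling settings, as described above.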
As you can see, there is more than one way to skin this cat. The problem is always going to be doing it without looking like you're trying to "fool" Google - because they WILL catch up with any tactic like that eventually.
Good luck!
Everett
-
From a coding perspective, applying the nofollow to the links is the best way to go.
With the robots.txt file, only the top-tier search engines respect the information contained within. Lesser-known bots and spammers might check your robots.txt file to see what you don't want listed, and that info gives them a starting point to dig deeper into your site.
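To illustrate that point: robots.txt is a public file anyone can fetch, so a block like this (paths hypothetical) acts as a signpost to the URLs you least want examined:

User-agent: *
Disallow: /affiliate-links/
Disallow: /private-reports/

A nofollow attribute on the link itself, by contrast, does not advertise anything in one publicly readable location.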
Related Questions
-
Is there a limit to how many URLs you can put in a robots.txt file?
We have a site that has way too many URLs caused by our crawlable faceted navigation. We are trying to purge 90% of our URLs from the indexes. We put noindex tags on the URL combinations that we do not want indexed anymore, but it is taking Google way too long to find the noindex tags. Meanwhile we are getting hit with excessive-URL warnings and have been hit by Panda. Would it help speed the process of purging URLs if we added them to the robots.txt file? Could this cause any issues for us? Could it have the opposite effect and block the crawler from finding the URLs, but not purge them from the index? The list could be in excess of 100MM URLs.
Technical SEO | kcb8178
-
.htaccess probelem causing 605 Error?
I'm working on a site, it's just a few html pages and I've added a WP blog. I've just noticed that moz is giving me the following error with reference to http://website.com: (webmaster tools is set to show the www subdomain, so it appears OK). Error Code 605: Page Banned by robots.txt, X-Robots-Tag HTTP Header, or Meta Robots Tag Here's the code from my htaccess, is this causing the problem? RewriteEngine on
Technical SEO | | Stevie-G
Options +FollowSymLinks
RewriteCond %{THE_REQUEST} ^./index.html
RewriteRule ^(.)index.html$ http://www.website.com/$1 [R=301,L]
RewriteCond %{THE_REQUEST} ^./index.php
RewriteRule ^(.)index.php$ http://www.website.com/$1 [R=301,L] RewriteCond %{HTTP_HOST} ^website.com$ [NC]
RewriteRule ^(.*)$ http://www.website.com/$1 [R=301,L] BEGIN WordPress <ifmodule mod_rewrite.c="">RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]</ifmodule> END WordPress Thanks for any advice you can offer!0 -
Are robots.txt wildcards still valid? If so, what is the proper syntax for setting this up?
I've got several URLs that I need to disallow in my robots.txt file. For example, I've got several documents that I don't want indexed, and filters that are getting flagged as duplicate content. Rather than typing in thousands of URLs, I was hoping that wildcards are still valid.
Technical SEO | mkhGT
-
Panda and unnatural links caused ranking drop
Hi, I have been approached to do some SEO work for a site that has been hit badly by the latest Panda update (3.3). They also had a warning in their Google Webmaster Tools account saying they had unnatural-looking links to their site. They received this on 26 Feb, and that prompted them to stop working with their existing SEO company and look for a new one. Apparently their rankings for the keywords they were targeting have dropped dramatically, but it looks like just those they were actively building backlinks for; other phrases do not look affected. Before I take them on, I want to be clear that it is possible to help them reclaim their rankings. I have checked the site and the on-page SEO is good, the site build is good, and there are just a few errors to fix, but the links built by the SEO company are low quality, with a lot of spun articles and the same anchor text, so I see what the Google Webmaster Tools message is referring to. I do not think these links can be removed, as there are no contact details on the sites I checked. I have not checked all of them, but a random sample does not show promise; they are from low-authority domains. So if I am to take them on as a client and help them regain their previous rankings, what is the best strategy? Obviously they want results yesterday, and from our phone call they would rather someone else did the work than them, so my initial response of "add some better-quality content that others in your industry would link to as a reference" did not go down well. To be fair, I think it is a time issue; there are only three people in the company and they are not technical at all. Thanks for your help. Sean
Technical SEO | ske11
-
New URL structure caused a HUGE drop?
I have started working with a client who did an upgrade on their e-commerce site in May of last year. It totally changed the URL structure, and they didn't redirect old URLs or do any of the things they should have. Not unexpectedly, they went from about 300 visitors a day to 0, then rose back up to maybe 50 and have remained there ever since. There were some major on-site issues, including about 15,000 internal links that 302 back to the site. In any case, I have fixed most of the on-site problems and worked on somewhat better categorization plus content optimization, etc. We have only been working on this for about 30 days, and organic traffic is up and they are ranking for much better keywords, but I expected a little quicker rise. Here is a screenshot out of GA of their descent; it's pretty rapid. I don't think it makes sense to redirect their old URLs at this point, since most of them have been deindexed for 10+ months. Anyone have any suggestions on how to get back to their previous level? The domain actually has decent authority and link profile, etc. Is this just going to be a slow climb back? Any thoughts? [Attached image: Fxz9Y.png]
Technical SEO | BlinkWeb
-
Mapping Internal Links (Which are causing duplicate content)
I'm working on a site that is throwing off a lot of duplicate content for its size. A lot of it appears to be coming from bad links within the site itself, which were created when it was ported over from static HTML to ExpressionEngine (by someone else). I'm finding EE an incredibly frustrating platform to work with, as it appears to be directing 404s on sub-pages to the page directly above that subpage, without actually serving a 404 response. It's very weird. Does anyone have any recommendations on software to clearly map out a site's internal link structure, so that I can find which bad links are pointing to the wrong pages?
Technical SEO | BedeFahey
-
Does RogerBot read URL wildcards in robots.txt
I believe that the Google and Bing crawl bots understand wildcards in the "Disallow" lines of robots.txt - does Roger?
Technical SEO | AspenFasteners