Will blocking URLs in robots.txt void any backlink benefits? - I'll explain...
-
Ok...
So I add tracking parameters to some of my social media campaigns but block those parameters via robots.txt. This helps avoid duplicate content issues (yes, I also have correct canonical tags in place), but my question is: does this cause me to miss out on any backlink magic coming my way from these articles, posts or links?
Example URL: www.mysite.com/subject/?tracking-info-goes-here-1234
- Canonical tag is: www.mysite.com/subject/
- I'm blocking anything with "?tracking-info-goes-here" via robots.txt
- The URL with the tracking info of course IS NOT indexed in Google, but the URL without the tracking parameters IS indexed.
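For anyone who wants to reproduce the setup, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `https://` scheme and the exact `Disallow` path are assumptions made for illustration, since the real file isn't shown in this thread. Note that the standard-library parser does plain prefix matching and does not implement the `*` and `$` wildcard extensions, so it only approximates how a search engine would read the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt mirroring the setup described above;
# the real file is not shown in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /subject/?tracking-info-goes-here
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The https:// scheme is assumed; the thread only gives www.mysite.com/...
TRACKED_URL = "https://www.mysite.com/subject/?tracking-info-goes-here-1234"
CANONICAL_URL = "https://www.mysite.com/subject/"

# A crawler that honors the file may fetch the canonical URL
# but not the tracked variant.
tracked_allowed = parser.can_fetch("Googlebot", TRACKED_URL)
canonical_allowed = parser.can_fetch("Googlebot", CANONICAL_URL)

print("tracked URL crawlable:", tracked_allowed)
print("canonical URL crawlable:", canonical_allowed)
```

If the tracked URL comes back as not crawlable, a crawler that respects the file never downloads that page, which also means it never reads the canonical tag sitting on it.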
What are your thoughts?
- Should I nix the robots.txt stuff since I already have the canonical tag in place?
- Do you think I'm getting the backlink "juice" from all the links with the tracking parameter?
What would you do?
Why?
Are you sure?
-
Thanks Guys...
Yeah, I figure that's the right path to take based on what we know... But I love to hear others chime in so I can blame it all on you if something goes wrong - ha!
Another note: do you think editing the robots.txt file will cause some kind of unnatural anomaly? All of a sudden these links will be counted (we assume).
The answer is likely no, because Google already knows about the links and just doesn't count them, but I still thought I'd throw that out there.
-
I agree with what Andrea wrote above, with one additional point: blocking a file via robots.txt doesn't prevent the search engine from indexing the page. It just prevents the search engine from crawling the page and seeing the content on it. The page may very well still show up in the index; in place of a normal snippet or preview, you'll just see a note that your robots.txt file is preventing Google from crawling and caching the page. If you have canonical tags put in place properly, remove the block on the parameters in your robots.txt and let the engines do things the right way, so you don't have to worry about this question.
-
If you block a URL with robots.txt, link juice can't get passed along: the crawler never fetches the page, so it never sees the canonical tag that would credit the link to the clean URL. If your canonicals are good, then ideally you wouldn't need the robots.txt block at all. The block also takes away much of the SEO value of the social media postings.
So, to your question: if you have the tracking parameter blocked via robots.txt, then no, I don't think you are getting the link juice.
http://www.rickrduncan.com/robots-txt-file-explained
When I want link juice passed on but want to avoid duplicate content, I'm more a fan of the "noindex, follow" meta tag, and of using canonicals where it makes sense, too. But since you say your URLs with the parameters aren't being indexed, you must be using tags anyway to make that happen and not just relying on robots.txt.
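To check which of those tags a page actually carries, something like the following rough sketch could help. It uses Python's standard `html.parser`; the `HeadSignals` class, the `read_signals` helper and the sample HTML are all made up for illustration, and the sample includes both tags only so the parser has something to find.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects the canonical link and the robots meta tag from an HTML page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = attributes.get("href")
        elif tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            self.robots = attributes.get("content")


def read_signals(html):
    """Return (canonical_href, robots_content); either may be None if absent."""
    parser = HeadSignals()
    parser.feed(html)
    return parser.canonical, parser.robots


# Hypothetical markup for the tracked URL; not taken from the real site.
SAMPLE_HTML = """<html><head>
<link rel="canonical" href="https://www.mysite.com/subject/">
<meta name="robots" content="noindex, follow">
</head><body>...</body></html>"""

canonical, robots = read_signals(SAMPLE_HTML)
print("canonical:", canonical)
print("robots meta:", robots)
```

Keep in mind that a crawler can only act on these tags if it is allowed to fetch the page in the first place.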
To your point of "are you sure":
http://www.evergreensearch.com/minimum-viable-seo-8-ways-to-get-startup-seo-right/
(I do like to cite sources - there's so many great articles out there!)