Using Robots.txt
-
I want to prevent certain pages from being accessed or indexed by Googlebot. Using the robots.txt below, will Googlebot stay away from every URL that consists of my domain name, followed by a question mark, followed by any string? Sample URL: http://mydomain.com/?example
User-agent: Googlebot
Disallow: /?
-
I'm not sure whether that would work, but you can test it by changing your robots.txt and running a test in GWT > Health > Blocked URLs.
You might also be interested in specifying how individual URL parameters are handled (e.g. for /?sort=name&order=asc you can block the sort and order parameters) from within GWT (Configuration > URL Parameters).
Learn more about parameters - https://support.google.com/webmasters/bin/answer.py?hl=en&answer=1235687
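As a rough way to reason about the rule before testing it in GWT, here is a minimal Python sketch of plain prefix matching, which is how a Disallow value is generally compared against the path and query string of a URL. The helper names `path_and_query` and `is_disallowed` are hypothetical, and the sketch deliberately ignores wildcards (`*`, `$`), Allow rules, and user-agent groups, so treat it as an illustration rather than a reproduction of Googlebot's behaviour.

```python
from urllib.parse import urlsplit


def path_and_query(url):
    """Return the path plus query string, the part a Disallow rule is compared against."""
    parts = urlsplit(url)
    path = parts.path or "/"  # a bare domain is treated as "/"
    return f"{path}?{parts.query}" if parts.query else path


def is_disallowed(url, disallow_rules):
    """Simplified check (assumption: prefix matching only, no wildcards or Allow rules)."""
    target = path_and_query(url)
    return any(rule and target.startswith(rule) for rule in disallow_rules)


if __name__ == "__main__":
    rules = ["/?"]  # the Disallow value from the question
    for url in (
        "http://mydomain.com/?example",
        "http://mydomain.com/page?example",
        "http://mydomain.com/",
    ):
        print(url, "blocked" if is_disallowed(url, rules) else "allowed")
```

Under these assumptions only the first URL is reported as blocked, because its path and query start with `/?`; a query string on any other path, such as `/page?example`, is not covered by this rule.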
Related Questions
-
Robots.txt & meta noindex--site still shows up on Google Search
I have set up my robots.txt like this: User-agent: * Disallow: / and I have this meta tag in my <head> on a WordPress site, set up with SEO Yoast: <meta name="robots" content="noindex,follow"/> I did "Fetch as Google" in my Google Search Console. My website is still showing up in the search results and it says this: "A description for this result is not available because of this site's robots.txt" This site has not shown up for years and now it is ranking above my site that I want to rank for this keyword. How do I get Google to ignore this site? This seems really weird and I'm confused how a site with little content, that has not been updated for years, can rank higher than a site that is constantly updated and improved.
Technical SEO | | RoxBrock1 -
Google indexing despite robots.txt block
Hi This subdomain has about 4'000 URLs indexed in Google, although it's blocked via robots.txt: https://www.google.com/search?safe=off&q=site%3Awww1.swisscom.ch&oq=site%3Awww1.swisscom.ch This has been the case for almost a year now, and it does not look like Google tends to respect the blocking in http://www1.swisscom.ch/robots.txt Any clues why this is or what I could do to resolve it? Thanks!
Technical SEO | | zeepartner0 -
What is the best way to use canonical tag
Hi, I have been researching this since yesterday and have looked at this subject many times before, but I still cannot get my head around it. I ran a report on my site, www.in2town.co.uk, using http://www.juxseo.com. It was very useful, but part of the information it gave me was that I should have a canonical tag on my home page, which would improve my SEO. I am using sh404sef for my friendly URLs on Joomla 3.0, and when I approached the makers of sh404sef to ask about the tag, they said I would need to be careful using it as it could damage my site and my rankings. I have read lots of information but still do not have a clear understanding of it. Can anyone please explain the best way to use this tag, and whether I should be using it where I may have some sort of duplicate page? Any help understanding this would be great.
Technical SEO | | ClaireH-1848860 -
I accidentally blocked Google with Robots.txt. What next?
Last week I uploaded my site and forgot to remove the robots.txt file with this text: User-agent: * Disallow: / I dropped from page 11 on my main keywords to past page 50. I caught it 2-3 days later and have now fixed it. I re-imported my site map with Webmaster Tools and I also did a Fetch as Google through Webmaster Tools. I tweeted out my URL to hopefully get Google to crawl it faster too. Webmaster Tools no longer says that the site is experiencing outages, but when I look at my blocked URLs it still says 249 are blocked. That's actually gone up since I made the fix. In the Google search results, it still no longer has my page title and the description still says "A description for this result is not available because of this site's robots.txt – learn more." How will this affect me long-term? When will I recover my rankings? Is there anything else I can do? Thanks for your input! www.decalsforthewall.com
Technical SEO | | Webmaster1230 -
How to use rel canonical?
Hi, I have some questions about this and I think you can help me. Here is an example of my problem with pagination. Suppose that I have a news item with 2 pages: http://www.espectador.com/noticias/208907/fernando-pereira-encuesta-de-cifra-prendio-una-lucecita-amarilla-en-el-pit-cnt You can access the first page in different ways: www.espectador.com/1v4_contenido.php?m=&id=250419&ipag=1 http://www.espectador.com/1v4_contenido.php?m=&id=250419 http://www.espectador.com/noticias/250419/alvaro-vega-fa-creo-que-cosmo-fue-usada-por-bqb-para-evitar-una-subasta-a-la-baja-y-asi-quedar-con-las-manos-libres Same meta description, same body, different URLs. Can I use rel canonical in the file 1v4_contenido.php pointing to the friendly URL? <link rel="canonical" href="http://www.espectador.com/noticias/250419/alvaro-vega-fa-creo-que-cosmo-fue-usada-por-bqb-para-evitar-una-subasta-a-la-baja-y-asi-quedar-con-las-manos-libres"/> Do I have a loop here? Can the rel canonical go on page 1? Thanks
Technical SEO | | informatica8100 -
Using Blogger.com
I have a client that is currently using blogger.com for their blog. I don't have much experience with this site as I have mostly used Wordpress in the past. Are there any good SEO plugins/tools for this site? Thanks!
Technical SEO | | AlightAnalytics0 -
How can I exclude display ads from robots.txt?
Google has stated that you can do this to get spiders to content only, and faster. Our IT guy is saying it's impossible. Do you know how to exclude display ads from robots.txt? Any help would be much appreciated.
Technical SEO | | GregBeddor0 -
Is blocking RSS Feeds with robots.txt necessary?
Is it necessary to block an RSS feed with robots.txt? It seems feeds are automatically not indexed (http://googlewebmastercentral.blogspot.com/2007/12/taking-feeds-out-of-our-web-search.html). And Google says here that it's important not to block RSS feeds (http://googlewebmastercentral.blogspot.com/2009/10/using-rssatom-feeds-to-discover-new.html). I'm just checking!
Technical SEO | | nicole.healthline0