Robots.txt: excluding URL
-
Hi,
Spiders crawl some dynamic URLs on my website (for example, http://www.keihome.it/elettrodomestici/cappe/cappa-vision-con-tv-falmec/714/ and http://www.keihome.it/elettrodomestici/cappe/cappa-vision-con-tv-falmec/714/open=true) as different pages, which of course results in duplicate content.
What is the syntax to disallow these kinds of URLs in robots.txt?
Thanks so much
-
You don't want to do this in robots.txt. If you serve pages with these parameters, people will inevitably link to them, and even if they're disallowed in your robots.txt file, Google may still index them, according to this: "While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web."
This is what the rel=canonical tag is designed for. You should use it to tell Google that the page duplicates another page on your site, and that it should treat that other page as the preferred version. You can read (and watch a video) about that here.
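For illustration, here is a minimal Python sketch of how the tag for the duplicate URL in the question could be generated. It assumes the only difference between the two URLs is the trailing "open=true" path segment; the helper names canonical_url and canonical_link_tag are hypothetical, and how the tag actually gets into the page's <head> depends on your CMS or templates.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str, variant_segment: str = "open=true") -> str:
    """Return the URL with a trailing variant segment removed.

    Assumption: the duplicate differs from the preferred page only by
    `variant_segment` at the end of the path, as in the question's example.
    """
    parts = urlsplit(url)
    path = parts.path
    if path.endswith(variant_segment):
        path = path[: -len(variant_segment)]
    # Drop any query string and fragment so all variants map to one URL.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element to place in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


duplicate = (
    "http://www.keihome.it/elettrodomestici/cappe/"
    "cappa-vision-con-tv-falmec/714/open=true"
)
print(canonical_link_tag(duplicate))
```

Both the original page and the "open=true" variant would then carry the same tag, pointing at the URL that ends in /714/.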