Robots.txt: excluding URL
-
Hi,
spiders crawl some dynamic URLs on my website (for example, http://www.keihome.it/elettrodomestici/cappe/cappa-vision-con-tv-falmec/714/ and http://www.keihome.it/elettrodomestici/cappe/cappa-vision-con-tv-falmec/714/open=true) as different pages, which of course results in duplicate content.
What is the syntax for disallowing these kinds of URLs in robots.txt?
Thanks so much
-
You don't want to do this in robots.txt. If you serve pages with these parameters, people will inevitably link to them, and even if they're disallowed in your robots.txt file, Google may still index them, according to this: "While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web."
This is what the rel=canonical tag is designed for. Use it to tell Google that the page duplicates another page on your site, and that Google should refer to that other page instead. You can read (and watch a video) about that here.
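For illustration, here is a minimal Python sketch of how the canonical tag for the two URLs in the question could be generated. It assumes that a trailing path segment of the form name=value (such as open=true) never changes the page's content. The helper names canonical_url and canonical_link_tag are hypothetical and not part of any library.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL without a trailing 'name=value' path segment,
    query string, or fragment (assumption: these never change the content)."""
    parts = urlsplit(url)
    segments = parts.path.rstrip("/").split("/")
    # Treat a final segment such as 'open=true' as a display parameter.
    if segments and "=" in segments[-1]:
        segments.pop()
    path = "/".join(segments) + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    variant = ("http://www.keihome.it/elettrodomestici/cappe/"
               "cappa-vision-con-tv-falmec/714/open=true")
    print(canonical_link_tag(variant))
```

The same tag would go in the head of both URL variants, so that each one points at the single preferred address.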
Related Questions
-
Link flow for multiple links to same URL
Hi there, my question is as follows: How does Google handle link flow if two links on a given page point to the same URL? (Do they pass link flow individually or not?) This seems like a newbie question, but there is actually little evidence and little consensus in the SEO community about this detail.
Answers should include a source. Information about the current state of the art at Google is preferable. The question is not about anchor text, general best practices for linking, "PageRank is dead", etc.
We do know that the "historical" PageRank was implemented (a long time ago) without special handling for multiple links, as last stated by Matt Cutts in this video: http://searchengineland.com/googles-matt-cutts-one-page-two-links-page-counted-first-link-192718 On the other hand, many people in the SEO community say that only the first link counts. So far I could not find any data to back this up, which is quite surprising.
On-Page Optimization | doctecs
-
URL Structure Category Pages -Current Moz Friday-
Hello, regarding #15 of the last Moz Friday I have a question: http://moz.com/blog/15-seo-best-practices-for-structuring-urls
What would you prefer if the length of the URL is still under 60 characters and you have an example like this? Let's call it a specific page in a category. As I like the old shoe examples: you have a page about red shoes in your shoe category. Which URL would you prefer:
a) www.mydomain.com/shoes/red-shoe
b) www.mydomain.com/shoes/red
Personally I would prefer a), or would you already consider this spammy? My real example is not as trivial as the shoe example, and the categories will be plural and the specific pages always singular (as in the example, shoes vs. shoe).
c) would be to put it independently of the site structure at www.mydomain.com/red-shoe, but in my experience a) or b) will help the rankings of the category page if you have the specific pages in the same subfolder. What's your opinion on this?
On-Page Optimization | _Heiko_
-
Properly changing title, URL and content for new keywords without harming other rankings.
Hello - We are looking to bring up some keywords in the SERPs that we currently rank fairly low for. We sell Christening clothing for children, and people use both Christening and Baptism to search for the same thing. We currently rank very high for Christening (#1 on Google for certain combinations) but fairly low for Baptism.
I am trying to figure out the best way to start getting Baptism up by changing some titles, URLs and page content to include more Baptism keywords. My concern is messing with the existing pages, because we rank so well for Christening. Since we are ecommerce we can vary this quite a bit on our products, but again I'm nervous about changing the wrong things, too many products, etc., and in the process of trying to raise one set of keywords (Baptism) harming the other set (Christening).
Any advice would be appreciated!
On-Page Optimization | BabyBeauBelle
-
URL parameters
Hello, I currently paginate a piece of content across 5 pages, e.g. http://abc.com/faqs.html?&page=2 Is this right, and how can I check whether it is correct?
On-Page Optimization | JohnHuynh
-
Mixing hyphens and underscores in a url
Hello. I am working on a site that was built with underscores in the URLs, but only in the page names, not in the subdirectories. All the subdirectories have one-word names. So a typical URL is "example.com/sub1/sub2/page_name." We would like to change the name of one of the subdirectories to a name that would be very useful for SEO, but this new name is a hyphenated word, let's call it "new-sub." If we changed "sub2" to "new-sub" then our URL would have a mix of underscores and hyphens: example.com/sub1/new-sub/page_name. But if I used "new_sub" instead, Google would read the words as connected with an underscore, instead of reading the subdirectory as a hyphenated word, which would be less useful for SEO. It seems like it might be a problem to have a hyphen in a subdirectory and underscores in the page names. But I want the SEO value of the hyphenated word. Any recommendations? Thank you!
On-Page Optimization | nyc-seo
-
Should I use www in my url when running On-Page Report Card?
When creating an On-Page Report Card I get 2 different results for my URL, one with WWW and one without. Which is best?
On-Page Optimization | thomas.wittine
-
Hierarchy and consistency in ecommerce URLs
One of the first things I remember reading about SEO and URLs, a long time ago, is that keywords are important, and hierarchy is important, for search engines and for users. Hierarchy in URLs would give the search engines an idea of the structure of the site, and users would be able to edit the URLs to continue navigating. I'm wondering about URLs, hierarchy and usability lately, since I've seen that ASOS uses a new URL structure on their site. At first glance, I thought it was brilliant, so I would like to get all of your opinions as well. For those of you that haven't seen the URLs: for categories, ASOS uses a structure as you would expect it, but for products they don't insert the category in the URL. Instead they insert the brand name as the first part of the URL, followed by the product title. Some examples:
Category: www.asos.com/women/dresses/...
Product: www.asos.com/french-connection/french-connection-tie-waist-pocket-stripe-dress/...
I can see the importance of brand name for a site like ASOS, and like how they stressed this by inserting not the category but the brand for products. I don't know how much ASOS still relies on organic non-ASOS related keyword traffic, but still. Now, for hierarchy, I guess a good internal linking structure will tell the search engines about the hierarchy of a site as well, right? So perhaps hierarchy in the URL isn't that important? Perhaps something like this would be just as good as anything, given a good internal link structure?
www.onlinestore.com/category/
www.onlinestore.com/subcategory/
www.onlinestore.com/brand/product-title/
Now, I understand that if you use this structure, you wouldn't be able to have men/shirts and women/shirts, but let's say that you don't have subcategories that use the same names. In this case, how important is hierarchy? And, what do you think about this URL structure for an ecommerce site for which brands are important?
On-Page Optimization | DocdataCommerce
-
Can duplicate content issues be solved with a noindex robot metatag?
Hi all, I have a number of duplicate content issues arising from a recent crawl diagnostics report. Would using a robots meta tag (like below) on the pages I don't necessarily mind not being indexed be an effective way to solve the problem? Thanks for any / all replies
On-Page Optimization | joeprice