Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Blocking URL's with specific parameters from Googlebot
-
Hi,
I've discovered that Googlebot's are voting on products listed on our website and as a result are creating negative ratings by placing votes from 1 to 5 for every product. The voting function is handled using Javascript, as shown below, and the script prevents multiple votes so most products end up with a vote of 1, which translates to "poor".
How do I go about using robots.txt to block a URL with specific parameters only? I'm worried that I might end up blocking the whole product listing, which would result in de-listing from Google and the loss of many highly ranked pages.
DON'T want to block:
http://www.mysite.com/product.php?productid=1234
WANT to block:
http://www.mysite.com/product.php?mode=vote&productid=1234&vote=2
Javacript button code:
onclick="javascript: document.voteform.submit();"
Thanks in advance for any advice given.
Regards,
Asim -
Good to hear, I am glad you perservered
-
Tried them all now and all come back with "Success"... May be I'll post in the WMT Forum and see if anyone can shed light on this problem. Thanks for your help Alan, it's much appreciated.
-
Yes correct, did you try the other formats?
-
Tried "Fetch as Googlebot" in Diagnostics and it came back as "Success" so I guess the robots.txt directive is not working. I'm assuming it should have reported a failure message when attempting to fetch a URL containing "?mode=vote".
-
Wrong place, go to diagnostics, then look for fetch as googlebot
-
I added "Disallow: /mode=vote" to the robots.txt file and also manually entered it on Crawler Access page, then clicked "Test" and no errors were reported. The WMT page states that robots.txt was last downloaded 16 hours ago so I'll wait until it picks the file up again and then check for any errors. Hopefully that will do trick

-
Try this in robots.txt, I did not think that Google allows wild cards but i just read that they do.
Disallow: /*mode=vote*orDisallow: /*mode=voteorDisallow: /*modeThen try in Google WMT to read with googlebot to see if it works.The first in the list seems right to me, but I have seen others do it the other ways. -
Thanks for the reply. The site was developed using PHP, mySQL and Javascript. I was hoping there was a way to do it without getting programmers involved...
-
dont think you are going to do it in robots.txt, rather do a 301 from mode=vote to non mode vote.
If you dont know how to put this into practise, tell me what your site is built with, if it is ASP.NET, i will show you how to impliment, if not someone else should be able to help.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Problem with Yoast not seeing any of this website's text/content
Hi, My client has a new WordPress site http://www.londonavsolutions.co.uk/ and they have installed the Yoast Premium SEO plug-in. They are having issues with getting the lights to go green and the main problem is that on most pages Yoast does not see any words/content – although there are plenty of words on the pages. Other tools can see the words, however Yoast is struggling to find any and gives the following message:- Bad SEO score. The text contains 0 words. This is far below the recommended minimum of 300 words. Add more content that is relevant for the topic. Readability - You have far too little content. Please add some content to enable a good analysis. They have contacted the website developer who says that there is nothing wrong, but they are frustrated that they cannot use the Yoast tools themselves because of this issue, plus Yoast are offering no support with the issue. I hope that one of you guys has seen this problem before, or can spot a problem with the way the site has been built and can perhaps shed some light on the problem. I didn't build the site myself so won't be offended if you spot problems with it. Thanks in advance, Ben
Technical SEO | | bendyman0 -
How do I deindex url parameters
Google indexed a bunch of our URL parameters. I'm worried about duplicate content. I used the URL parameter tool in webmaster to set it so future parameters don't get indexed. What can I do to remove the ones that have already been indexed? For example, Site.com/products and site.com/products?campaign=email have both been indexed as separate pages even though they are the same page. If I use a no index I'm worried about de indexing the product page. What can I do to just deindexed the URL parameter version? Thank you!
Technical SEO | | BT20090 -
Why is Google's cache preview showing different version of webpage (i.e. not displaying content)
My URL is: http://www.fslocal.comRecently, we discovered Google's cached snapshots of our business listings look different from what's displayed to users. The main issue? Our content isn't displayed in cached results (although while the content isn't visible on the front-end of cached pages, the text can be found when you view the page source of that cached result).These listings are structured so everything is coded and contained within 1 page (e.g. http://www.fslocal.com/toronto/auto-vault-canada/). But even though the URL stays the same, we've created separate "pages" of content (e.g. "About," "Additional Info," "Contact," etc.) for each listing, and only 1 "page" of content will ever be displayed to the user at a time. This is controlled by JavaScript and using display:none in CSS. Why do our cached results look different? Why would our content not show up in Google's cache preview, even though the text can be found in the page source? Does it have to do with the way we're using display:none? Are there negative SEO effects with regards to how we're using it (i.e. we're employing it strictly for aesthetics, but is it possible Google thinks we're trying to hide text)? Google's Technical Guidelines recommends against using "fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash." If we were to separate those business listing "pages" into actual separate URLs (e.g. http://www.fslocal.com/toronto/auto-vault-canada/contact/ would be the "Contact" page), and employ static HTML code instead of complicated JavaScript, would that solve the problem? Any insight would be greatly appreciated.Thanks!
Technical SEO | | fslocal0 -
How to Remove /feed URLs from Google's Index
Hey everyone, I have an issue with RSS /feed URLs being indexed by Google for some of our Wordpress sites. Have a look at this Google query, and click to show omitted search results. You'll see we have 500+ /feed URLs indexed by Google, for our many category pages/etc. Here is one of the example URLs: http://www.howdesign.com/design-creativity/fonts-typography/letterforms/attachment/gilhelveticatrade/feed/. Based on this content/code of the XML page, it looks like Wordpress is generating these: <generator>http://wordpress.org/?v=3.5.2</generator> Any idea how to get them out of Google's index without 301 redirecting them? We need the Wordpress-generated RSS feeds to work for various uses. My first two thoughts are trying to work with our Development team to see if we can get a "noindex" meta robots tag on the pages, by they are dynamically-generated pages...so I'm not sure if that will be possible. Or, perhaps we can add a "feed" paramater to GWT "URL Parameters" section...but I don't want to limit Google from crawling these again...I figure I need Google to crawl them and see some code that says to get the pages out of their index...and THEN not crawl the pages anymore. I don't think the "Remove URL" feature in GWT will work, since that tool only removes URLs from the search results, not the actual Google index. FWIW, this site is using the Yoast plugin. We set every page type to "noindex" except for the homepage, Posts, Pages and Categories. We have other sites on Yoast that do not have any /feed URLs indexed by Google at all. Side note, the /robots.txt file was previously blocking crawling of the /feed URLs on this site, which is why you'll see that note in the Google SERPs when you click on the query link given in the first paragraph.
Technical SEO | | M_D_Golden_Peak0 -
Mobile URL parameter (Redirection to desktop)
Hello, We have a parallel mobile website and recently we implemented a link pointing to the desktop website. This redirect is happening via a javascript code and results in a url followed by this paramenter: ?m=off Example:
Technical SEO | | echo1
http://www.m.website.com redirects to:
http://www.website.com/?m=off Questions: Will the "http://www.website.com/?m=off" be considered duplicate content with "http://www.website.com" since they both return the same content? Is there any possibility that Google will take into consideration the url ending in "/?m=off"? How should we treat this new url? The webmaster tools URL parameter configuration at the moment isn't experiencing problems but should we submit the parameter anyway in order not to be indexed or should we wait first and see the error response? In case we should submit this for removal... what's the best way to do it? Like this? Parameter: ?m=off Does this parameter change page content seen by the user? - doesn't affect page content Any help is much appreciated.
Thank you!0 -
Google insists robots.txt is blocking... but it isn't.
I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site. When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallowed a couple directories of files. You can view the robots.txt at http://photogeardeals.com/robots.txt Google (via Webmaster tools) is insisting that my robots.txt file contains a "Disallow: /" on line 2 and that it's preventing Google from indexing the site and preventing me from submitting a sitemap. These errors are showing both in the sitemap section of Webmaster tools as well as the Blocked URLs section. Bing's webmaster tools are able to read the site and sitemap just fine. Any idea why Google insists I'm disallowing everything even after telling it to re-fetch?
Technical SEO | | ahockley0 -
Ecommerce website: Product page setup & SKU's
I manage an E-commerce website and we are looking to make some changes to our product pages to try and optimise them for search purposes and to try and improve the customer buying experience. This is where my head starts to hurt! Now, let's say I am selling a T shirt that comes in 4 sizes and 6 different colours. At the moment my website would have 24 products, each with pretty much the same content (maybe differing references to the colour & size). My idea is to change this and have 1 main product page for the T-shirt, but to have 24 product SKU's/variations that exist to give the exact product details. Some different ways I have been considering to do this: a) have drop-down fields on the product page that ask the customer to select their Tshirt size and colour. The image & price then changes on the page. b) All product 24 product SKUs sre listed under the main product with the 'Add to Cart' open next to each one. Each one would be clickable so a page it its own right. Would I need to set up a canonical links for each SKU that point to the top level product page? I'm obviously looking to minimise duplicate content but Im not exactly sure on how to set this up - its a big decision so I need to be 100% clear before signing off on anything. . Any other tips on how to do this or examples of good e-commerce websites that use product SKus well? Kind regards Tom
Technical SEO | | DHS_SH0 -
A question about RSS feeds and nofollow's
With the nofollow tag used very widely on the internet these days I was just wondering about how an RSS feed might help me find a way around it. Basically my question is this : I post a comment on a blog, it's approved and my comment together with my link(nofollow tag applied) is there. Now when the blogs RSS feed updates, does this nofollow tag get applied to the feed? As far as I can tell it does not - but I'm not too clue'd up on how the feed is generated. Anyone want to help me understand how it works and if what I'm suggesting would be 'a way around the nofollow tag' ? Thanks 🙂
Technical SEO | | DanHill0