Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to Remove /feed URLs from Google's Index
-
Hey everyone, I have an issue with RSS /feed URLs being indexed by Google for some of our Wordpress sites. Have a look at this Google query, and click to show omitted search results. You'll see we have 500+ /feed URLs indexed by Google, for our many category pages/etc. Here is one of the example URLs: http://www.howdesign.com/design-creativity/fonts-typography/letterforms/attachment/gilhelveticatrade/feed/. Based on this content/code of the XML page, it looks like Wordpress is generating these:
<generator>http://wordpress.org/?v=3.5.2</generator>
Any idea how to get them out of Google's index without 301 redirecting them? We need the Wordpress-generated RSS feeds to work for various uses.
My first two thoughts are trying to work with our Development team to see if we can get a "noindex" meta robots tag on the pages, by they are dynamically-generated pages...so I'm not sure if that will be possible. Or, perhaps we can add a "feed" paramater to GWT "URL Parameters" section...but I don't want to limit Google from crawling these again...I figure I need Google to crawl them and see some code that says to get the pages out of their index...and THEN not crawl the pages anymore.
I don't think the "Remove URL" feature in GWT will work, since that tool only removes URLs from the search results, not the actual Google index.
FWIW, this site is using the Yoast plugin. We set every page type to "noindex" except for the homepage, Posts, Pages and Categories. We have other sites on Yoast that do not have any /feed URLs indexed by Google at all.
Side note, the /robots.txt file was previously blocking crawling of the /feed URLs on this site, which is why you'll see that note in the Google SERPs when you click on the query link given in the first paragraph.
-
I tried many different htaccess file codings (such as recommended here), but they didn't work. Had to succumb to using the outdated Meta Robots plugin by Yoast, which can add the "noindex" code to the http header of /feed/ URLs. But, at least it's a solution: http://wordpress.org/plugins/robots-meta/. Hopefully this helps someone else.
-
I believe I found the solution: implement an x-robots-tag into the HTTP header of the various feed URLs. But, I need some help with creating the code to place in my .htaccess file. Any takers?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Sudden Indexation of "Index of /wp-content/uploads/"
Hi all, I have suddenly noticed a massive jump in indexed pages. After performing a "site:" search, it was revealed that the sudden jump was due to the indexation of many pages beginning with the serp title "Index of /wp-content/uploads/" for many uploaded pieces of content & plugins. This has appeared approximately one month after switching to https. I have also noticed a decline in Bing rankings. Does anyone know what is causing/how to fix this? To be clear, these pages are **not **normal /wp-content/uploads/ but rather "index of" pages, being included in Google. Thank you.
Technical SEO | | Tom3_150 -
Pages are Indexed but not Cached by Google. Why?
Hello, We have magento 2 extensions website mageants.com since 1 years google every 15 days cached my all pages but suddenly last 15 days my websites pages not cached by google showing me 404 error so go search console check error but din't find any error so I have cached manually fetch and render but still most of pages have same 404 error example page : - https://www.mageants.com/free-gift-for-magento-2.html error :- http://webcache.googleusercontent.com/search?q=cache%3Ahttps%3A%2F%2Fwww.mageants.com%2Ffree-gift-for-magento-2.html&rlz=1C1CHBD_enIN803IN804&oq=cache%3Ahttps%3A%2F%2Fwww.mageants.com%2Ffree-gift-for-magento-2.html&aqs=chrome..69i57j69i58.1569j0j4&sourceid=chrome&ie=UTF-8 so have any one solutions for this issues
Technical SEO | | vikrantrathore0 -
Problems with WooCommerce Product Attribute Filter URL's
I am running a WordPress/WooCommerce site for a client, and Moz is picking up some issues with URL's generated from WooCommerce product attribute filters. For example: ..co.uk/womens-prescription-glasses/?filter_gender=mens&filter_style=full-rim&filter_shape=oval How do I get Google to ignore these filters?
Technical SEO | | SushiUK
I am running Yoast Premium, but not sure if this can solve the issue? Product categories are canonicalised to the root category URL. Any suggestions very gratefully appreciated. Thanks Bob0 -
New theme adds ?v=1d20b5ff1ee9 to all URL's as part of cache. How does this affect SEO
New theme I am working in ads ?v=1d20b5ff1ee9 to every URL. Theme developer says its a server setting issue. GoDaddy support says its part of cache an becoming prevalent in new themes. How does this impact SEO?
Technical SEO | | DML-Tampa0 -
Google will index us, but Bing won't. Why?
Bing is crawling our site, but not indexing it, and we cannot figure out why -- plus it's being indexed fine in Google. Any ideas on what the issue with Bing might be? Here's are some details to let you know what we've already checked/established: We have 4 301’s and the rest of our site checks out We’ve already established our Robots is ok, and that we are fixing our site map/it's in fine shape We do not see anything blocking bingbot access to the site There is no varnish or any load balancers, so nothing on that end that would be blocking the access We also don't see any rules in the apache or the .htaccess config that would be blocking the access
Technical SEO | | Alex_RevelInteractive1 -
My old URL's are still indexing when I have redirected all of them, why is this happening?
I have built a new website and have redirected all my old URL's to their new ones but for some reason Google is still indexing the old URL's. Also, the page authority for all of my pages has dropped to 1 (apart from the homepage) but before they were between 12 to 15. Can anyone help me with this?
Technical SEO | | One2OneDigital0 -
Do I use /es/, /mx/ or /es-mx/ for my Spanish site for Mexico only
I currently have the Spanish version of my site under myurl.com/es/ When I was at Pubcon in Vegas last year a panel reviewed my site and said the Spanish version should be in /mx/ rather than /es/ since es is for Spain only and my site is for Mexico only. Today while trying to find information on the web I found /es-mx/ as a possibility. I am changing my site and was planning to change to /mx/ but want confirmation on the correct way to do this. Does anyone have a link to Google documentation that will tell me for sure what to use here? The documentation I read led me to the /es/ but I cannot find that now.
Technical SEO | | RoxBrock0 -
Blocking URL's with specific parameters from Googlebot
Hi, I've discovered that Googlebot's are voting on products listed on our website and as a result are creating negative ratings by placing votes from 1 to 5 for every product. The voting function is handled using Javascript, as shown below, and the script prevents multiple votes so most products end up with a vote of 1, which translates to "poor". How do I go about using robots.txt to block a URL with specific parameters only? I'm worried that I might end up blocking the whole product listing, which would result in de-listing from Google and the loss of many highly ranked pages. DON'T want to block: http://www.mysite.com/product.php?productid=1234 WANT to block: http://www.mysite.com/product.php?mode=vote&productid=1234&vote=2 Javacript button code: onclick="javascript: document.voteform.submit();" Thanks in advance for any advice given. Regards,
Technical SEO | | aethereal
Asim0