Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Removing Dynamic "noindex" URL's from Index
-
6 months ago my clients site was overhauled and the user generated searches had an index tag on them. I switched that to noindex but didn't get it fast enough to avoid being 100's of pages indexed in Google.
It's been months since switching to the noindex tag and the pages are still indexed. What would you recommend? Google crawls my site daily - but never the pages that I want removed from the index.
I am trying to avoid submitting hundreds of these dynamic URL's to the removal tool in webmaster tools. Suggestions?
-
Hooray! Usually, I just give my advice and then run away, so it's always nice to hear I was actually right about something Seriously, glad you got it sorted out.
-
Just a follow up to your suggestion.
I created sitemaps for the pages I want removed using the google spreadsheet importXML functions, which saved a lot of time.
It took a couple weeks but all of the pages, and similar pages, have successfully been removed from the index. Even the similar pages I didn't get a chance to put in the sitemap yet (importXML limits the results to 100).
Your suggestion worked!
-
I can't 404 dynamic search pages.
-
There are a mix of search pages and old mobile pages.
The search pages I've been testing out having the canonical point to the default search page. I've seen a slight drop in these pages - but I guess I just have to be more patient.
For the other pages the path is no longer there like you were mentioning. I like the idea of setting up the XML sitemap, I never even thought of making a bad/indexed page sitemap. I will give that a shot! Thankfully this will be a quick job with the importXml function in google spreadsheets! Great tip, hopefully it'll work.
-
Is there a crawl path to them currently? One issue I see a lot is that a bunch of pages get indexed, the path is found and cut off, NOINDEX (canonical, 301, etc.) is added, but then the pages never get re-crawled. Since they don't get recrawled, the page-level directive never gets honored.
If there's a URL parameter involved, you could use parameter-handling in GWT - it's not a perfect solution, but it sometimes seems to work without a re-crawl.
The other option would be to create a new XML sitemap with all of the bad/indexed URLs. This may push Google to re-crawl them and then see the tags to deindex. It's a bit safer than re-opening the crawl paths.
If they are being crawled and Google is just ignoring the NOINDEX for some reason, I'd try to 301 or canonical those pages to a primary search page, if that's feasible (probably canonical, since you don't want the users to 301). Sometimes, if a signal isn't working for that long, you just have to shake Google and try a different signal. Even following their exact recommendations, it rarely works as planned at large scale.
-
Don't use GWMT's removal tool to remove URLs which should not be in the index (unless those expose sensitive information). Best practise is to exclude them in robots.txt and to also ensure that the pages either 404 or have a noindex,noarchive tag.
-
Change the site structure and let the pages 404, Google will deindex them if they are not being linked to.
-
You could try adding the pages you want to remove to your robots.txt file. Since you're not linking to them, and it's very unlikely that Googlebot will index those pages naturally now, this might be a better way of telling it which pages to explicitly not index.
I'm not really sure how quickly this will trigger Google to remove those pages from the index - but they do reference robots.txt on the actual "Remove URLs" page of WMT ---> "Use **robots.txt **to specify how search engines should crawl your site, or request **removal **of URLs from Google's search results ..."
For that technique, you'd want to add something like this for all of the pages you want to remove:
Disallow: /oldpage1toremove.php
That should work. If it doesn't, then I would probably just submit the requests through the "Remove URLs" tool.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should you 'noindex' Checkout Pages?
Today I was reviewing my Moz analytics and suddenly noticed 1,000 issues with pages without a meta description. I reviewed the list and learned it is 1,000 checkout pages. That's because my website has thousands of agency pages from which you can buy a product, and it reflects that difference on each version of the checkout. So, I was thinking about no-indexing (but continuing to 'follow') these checkout pages, but wondering if it has any knock-on effects I may be unaware of? Any assistance is much appreciated. Luke
Intermediate & Advanced SEO | | Luke_Proctor0 -
Should I Add Location to ALL of My Client's URLs?
Hi Mozzers, My first Moz post! Yay! I'm excited to join the squad 🙂 My client is a full service entertainment company serving the Washington DC Metro area (DC, MD & VA) and offers a host of services for those wishing to throw events/parties. Think DJs for weddings, cool photo booths, ballroom lighting etc. I'm wondering what the right URL structure should be. I've noticed that some of our competitors do put DC area keywords in their URLs, but with the moves of SERPs to focus a lot more on quality over keyword density, I'm wondering if we should focus on location based keywords in traditional areas on page (e.g. title tags, headers, metas, content etc) instead of having keywords in the URLs alongside the traditional areas I just mentioned. So, on every product related page should we do something like: example.com/weddings/planners-washington-dc-md-va
Intermediate & Advanced SEO | | pdrama231
example.com/weddings/djs-washington-dc-md-va
example.com/weddings/ballroom-lighting-washington-dc-md-va OR example.com/weddings/planners
example.com/weddings/djs
example.com/weddings/ballroom-lighting In both cases, we'd put the necessary location based keywords in the proper places on-page. If we follow the location-in-URL tactic, we'd use DC area terms in all subsequent product page URLs as well. Essentially, every page outside of the home page would have a location in it. Thoughts? Thank you!!0 -
Replace dynamic paramenter URLs with static Landing Page URL - faceted navigation
Hi there, got a quick question regarding faceted navigation. If a specific filter (facet) seems to be quite popular for visitors. Does it make sense to replace a dynamic URL e.x http://www.domain.com/pants.html?a_type=239 by a static, more SEO friendly URL e.x http://www.domain.com/pants/levis-pants.html by creating a proper landing page for it. I know, that it is nearly impossible to replace all variations of this parameter URLs by static ones but does it generally make sense to do this for the most popular facets choose by visitors. Or does this cause any issues? Any help is much appreciated. Thanks a lot in advance
Intermediate & Advanced SEO | | ennovators0 -
URL Injection Hack - What to do with spammy URLs that keep appearing in Google's index?
A website was hacked (URL injection) but the malicious code has been cleaned up and removed from all pages. However, whenever we run a site:domain.com in Google, we keep finding more spammy URLs from the hack. They all lead to a 404 error page since the hack was cleaned up in the code. We have been using the Google WMT Remove URLs tool to have these spammy URLs removed from Google's index but new URLs keep appearing every day. We looked at the cache dates on these URLs and they are vary in dates but none are recent and most are from a month ago when the initial hack occurred. My question is...should we continue to check the index every day and keep submitting these URLs to be removed manually? Or since they all lead to a 404 page will Google eventually remove these spammy URLs from the index automatically? Thanks in advance Moz community for your feedback.
Intermediate & Advanced SEO | | peteboyd0 -
Should you allow an auto dealer's inventory to be indexed?
Due to the way most auto dealership website populate inventory pages, should you allow inventory to be indexed at all? The main benefit us more content. The problem is it creates duplicate, or near duplicate content. It also creates a ton of crawl errors since the turnover is so short and fast. I would love some help on this. Thanks!
Intermediate & Advanced SEO | | Gauge1230 -
Does Google Read URL's if they include a # tag? Re: SEO Value of Clean Url's
An ECWID rep stated in regards to an inquiry about how the ECWID url's are not customizable, that "an important thing is that it doesn't matter what these URLs look like, because search engines don't read anything after that # in URLs. " Example http://www.runningboards4less.com/general-motors#!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 Basically all of this: #!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 That is a snippet out of a conversation where ECWID said that dirty urls don't matter beyond a hashtag... Is that true? I haven't found any rule that Google or other search engines (Google is really the most important) don't index, read, or place value on the part of the url after a # tag.
Intermediate & Advanced SEO | | Atlanta-SMO0 -
How to check a website's architecture?
Hello everyone, I am an SEO analyst - a good one - but I am weak in technical aspects. I do not know any programming and only a little HTML. I know this is a major weakness for an SEO so my first request to you all is to guide me how to learn HTML and some basic PHP programming. Secondly... about the topic of this particular question - I know that a website should have a flat architecture... but I do not know how to find out if a website's architecture is flat or not, good or bad. Please help me out on this... I would be obliged. Eagerly awaiting your responses, BEst Regards, Talha
Intermediate & Advanced SEO | | MTalhaImtiaz0 -
Removing dashes in our URLs?
Hi Forum, Our site has an errant product review module that is resulting in about 9-10 404 errors per day on Google Webmaster Tools. We've found that by changing our product page URLs to only include 2 dashes, the module stops causing 404 errors for that page. Does changing our URL from "oursite.com/girls-pink-yoga-capri.html" to "oursite.com/girlspink-yoga-capri.html" hurt our SEO for a search for "girls pink yoga capri"? If so, by how much (assuming everthing else on the page is optimized properly) Thanks for your input.
Intermediate & Advanced SEO | | pano0