How should I handle URL's created by an internal search engine?
-
Hi,
I'm aware that internal search result URL's (www.example.co.uk/catalogsearch/result/?q=searchterm) should ideally be blocked using the robots.txt file. Unfortunately the damage has already been done and a large number of internal search result URL's have already been created and indexed by Google. I have double checked and these pages only account for approximately 1.5% of traffic per month.
Is there a way I can remove the internal search URL's that have already been indexed and then stop this from happening in the future, I presume the last part would be to disallow /catalogsearch/ in the robots.txt file.
Thanks
-
Basic cleanup
From a procedural standpoint, you want to first add the noindex meta tag to the search results first. Google has to see that tag to then act on it and remove the URLs. You can also enter some of the URLs into the Webmaster tools removal tool.
Next you would want to add /catalogsearch/ to robots.txt once you see all the pages getting out of the index.
Advanced cleanup
If any of these search result URLs are ranking and are landing pages in Google. You may want to consider 301 redirecting those pages to the properly related category pages.
My 2 cents. I only use the GWT parameter handler on parameters that I have to show to the search engines. I otherwise try to hide all those URLs from Google to help with crawl efficiency.
Note that it is really important that you do the work to find what pages/urls Google has cataloged to make sure you dont delete a page that is actually generating some traffic for you. A landing page report from GA would help with this.
Cheers!
-
On top of Lesley's recommendations, both google and bing have url parameter exclusion options in webmaster tools.
-
I am guessing that you are using a system that templates pages and maybe adds a query string after the search, something like search.php?caws+cars. I would set in the header of all of the pages that use the search template a noindex, nofollow. Then I would also add it to the robots text as well to disregard the search pages. They will start dropping out of the results pages in about a week or so.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
404's after pruning old posts
Hey all, So after reading about the benefits of pruning old content I decided to give it a try on our blog. After reviewing thousands of posts I found around 2500 that were simply not getting any traffic, or if they were there was 100% bounce & exit. Many of these posts also had content with relevance that had long ago expired. After deleted these old posts, I am now seeing the posts being reported as 404's in Google Search Console. But most of them are the old url with "trashed" appended to the url. My question is: are these 404's normal? Do I now have to go through and set up 301's for all of these? Is it enough to simply add the lot to my robots.txt file? Are these 404's going to hurt my blog? Thanks, Roman
Intermediate & Advanced SEO | | Dynata_panel_marketing0 -
Webpage has bombed outside of Top 50 for search term in one week. What's the cause?
I've been monitoring the performance of some pages via the email Moz sends every week, and until this week two pages that I've managed to get ranking have ranked between 20 and 23 for the specific term. However, today on the email one of the pages for one search term has bombed out of the top 50 while the other page has remained unaffected. What could be the cause for this? I've looked at Google Webmasters for an indication of a penalty of some sort but there is nothing glaringly obvious. I've no messages on there, and I haven't bought a load of spam links at all. What else could I check?
Intermediate & Advanced SEO | | mickburkesnr0 -
What is happening with this page's rankings? (G Analytics screenprint attached) help me.
Hi, At the moment im confused. I have a page which shows up for the query 'bank holidays' first page solid for 2 years - this also applies to the terms 'mothers day', 'pancake day' and a few others (UK Google). And there still ranking. Here is the problem: Usually I would rank for 'bank holidays 2014' (the terms with the year in are the real traffic drivers) and would be position 3/5. Over the last 3 months this has decayed dropping position to 30+. From the screenprint you can see the term 'Bank Holidays' is holding on but the term 'bank holidays 2014' is slowly decaying. If you query 'bank holidays 2015' we don't appear in rankings at all. What is causing this? The content is ok, social sharing happens and the odd link is picked up hear and there. I need help, how do I start pushing this back in the other direction, its like the site is slowly dying. And what really kills me, is 2 pages are ranking on page1 off link farms. URL: followuk.co.uk/bank-holidays serp-decay.jpg
Intermediate & Advanced SEO | | followuk0 -
Is there any SEO advantage to sharing links on twitter using google's url shortener goo.gl/
Hi is there any advantage to using <cite class="vurls">goo.gl/</cite> to shorten a URL for Twitter instead of other ones? I had a thought that <cite class="vurls">goo.gl/</cite> might allow google to track click throughs and hence judge popularity.
Intermediate & Advanced SEO | | S_Curtis0 -
Brackets vs Encoded URLs: The "Same" in Google's eyes, or dup content?
Hello, This is the first time I've asked a question here, but I would really appreciate the advice of the community - thank you, thank you! Scenario: Internal linking is pointing to two different versions of a URL, one with brackets [] and the other version with the brackets encoded as %5B%5D Version 1: http://www.site.com/test?hello**[]=all&howdy[]=all&ciao[]=all
Intermediate & Advanced SEO | | mirabile
Version 2: http://www.site.com/test?hello%5B%5D**=all&howdy**%5B%5D**=all&ciao**%5B%5D**=all Question: Will search engines view these as duplicate content? Technically there is a difference in characters, but it's only because one version encodes the brackets, and the other does not (See: http://www.w3schools.com/tags/ref_urlencode.asp) We are asking the developer to encode ALL URLs because this seems cleaner but they are telling us that Google will see zero difference. We aren't sure if this is true, since engines can get so _hung up on even one single difference in character. _ We don't want to unnecessarily fracture the internal link structure of the site, so again - any feedback is welcome, thank you. 🙂0 -
Multiple URL's exist for the same page, canonicaliazation issue?
All of the following URL's take me to the same page on my site: 1. www.mysite.com/category1/subcategory.aspx 2. www.mysite.com/subcategory.aspx 3. www.mysite.com/category1/category1/category1/subcategory.aspx All of those pages are canonicalized to #1, so is that okay? I was told the following my a company trying to make our sitemap: "the site's platform dynamically creates URLs that resolve as 200 and should be 404. This is a huge spider trap for any search engine and will make them wary of crawling the site." What would I need to do to fix this? Thanks!
Intermediate & Advanced SEO | | pbhatt0 -
Is it worth submitting a blog's RSS feed...
to as many RSS feed directories as possible? Or would this have a similar negative impact that you'd get from submitting a site to loads to "potentially spammy" site directories?
Intermediate & Advanced SEO | | PeterAlexLeigh0 -
How to let Search engines index login-first SNS sites?
What's the Effective way to let major search engine to index Login-first SNS sites? the reason of asking that is because i saw a search engines index Millon of SNS pages but most of them requested to login, how search engine get through this? http://www.baidu.com/s?wd=site%3Akaixin001.com&pn=50 thanks Boson
Intermediate & Advanced SEO | | DarwinChinaSEO0