Block search engines from URLs created by internal search engine?
-
Hey guys,
I've got a question for you all that I've been pondering for a few days now. I'm currently doing an SEO Technical Audit for a large scale directory.
One major issue that they are having is that their internal search system (Directory Search) will create a new URL everytime a search query is entered by the user. This creates huge amounts of duplication on the website.
I'm wondering if it would be best to block search engines from crawling these URLs entirely with Robots.txt?
What do you guys think? Bearing in mind there are probably thousands of these pages already in the Google index?
Thanks
Kim
-
That sounds perfect - if the user-generated URLs are getting enough traffic, make them permanent pages and 301-redirect or canonical. If not, weed them out of the index.
-
Thanks for your reply Dr. Meyers. I think you're probably right.
Yes I'm recommending they define a canonical set of pages that are the most popular searches, categories and locations which can be reached via internal links and we'll get all those duplicates re-directed back to that canonical set.
But for pages that fall outside those categories and locations, I'll recommend a meta-no-index tag.
-
It can be a complicated question on a very large site, but in most cases I'd META NOINDEX those pages. Robots.txt isn't great at removing content that's already been indexed. Admittedly, NOINDEX will take a while to work (virtually any solution will), as Google probably doesn't crawl these pages very often.
Generally, though, the risk of having your index explode with custom search pages is too high for a site like yours (especially post-Panda). I do think blocking those pages somehow is a good bet.
The only exception I would add is if some of the more popular custom searches are getting traffic and/or links. I assume you have a solid internal link structure and other paths to these listings, but if it looks like a few searches (or a few dozen) have attracted traffic and back-links, you'll want to preserve those somehow.
-
Sure, check below and some of the duplication I mean:
Capitalization Duplication
http://yellow.co.nz/yellow+pages/Car+dealer/Auckland+Region
http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland+Region
With a few URL parameters
And with location duplication
http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland
Let me know if you need any more info!
Cheers
Kim
-
Whats the content look like on the new url? Can you give us an example?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Migrating From Parameter-Driven URL's to 'SEO Friendly URL's (Slugs)
Hi all, hope you're all good and having a wonderful Friday morning. At the moment we have over 20,000+ live products on our ecomms site, however, all of the products are using non-seo friendly URL's (/product?p=1738 etc) and we're looking at deploying SEO friendly url's such as (/product/this-is-product-one) etc. As you could imagine, making such a change on a big ecomms site will be a difficult task and we will have to take on A LOT of content changes, href-lang changes, affiliate link tests and a big 301 task. I'm trying to get some analysis together to pitch the Tech guys, but it's difficult, I do understand that this change has it's benefits for SEO, usability and CTR - but I need some more info. Keywords in the slugs - what is it's actual SEO weight? Has anyone here recently converted from using parameter based URL's to keyword-based slugs and seen results? Also, what are the best ways of deploying this? Add a canonical and 301? All comments greatly appreciated! Brett
Intermediate & Advanced SEO | | Brett-S0 -
Mixing static.htm urls and dynamic urls on a Windows IIS Server?
Hi all, We've had a website originally built using static html with .htm extensions ranking well in Google hence we want to keep those pages/urls. We are on a dedicated sever (Windows IIS). However our developer has custom made a new DYNAMIC section for the site which shows new added products dynamically and allows them to be booked online via shopping cart. We are having problems displaying them both on the same domain even if we put the dynamic section withing its own subfolder and keep the static htms in the root. Is it possible to have both function on IIS (even if they may have to function a little separately)? Does anyone have previous experience of this kind of issue or a way of making both work? What setup do we need to do on the dedicated server.
Intermediate & Advanced SEO | | emerald0 -
Company Blog at a different URL
Ok, I have been doing a lot of work over the past 6 months, disavowing low quality links from spammy directories to our company website, etc. However, my efforts seem to have had a negative, not positive effect. This has brought me back to reconsidering what we are doing as we have lost a good amount of traction on the nationwide Google rankings specifically. Considering our company blog - platinumcctv(dot)net - we have used this blog for a long time to inform customers of new products, software developments and then to provide them links to purchase those components. Last week, I revamped the nearly default wordpress theme to another on a piece of advice. However, someone told me that all of our links should be nofollow, even though it is a company blog because we have many links coming from this domain, and it could be found as spammy. Potato/Potato - But before I start the tedious task of changing every link to no follow on a whim, i searched a lot, but have found no CLEAR substantiation of this. Any ideas? Other recommendations appreciated as well! Platinum-CCTV(dot)com
Intermediate & Advanced SEO | | PTCCTV0 -
Are these URLs too Keyword-packed?
Hi guys, Here is the URL: http://www.consumerbase.com/mailing-lists/dog-stores-mailing-list.html The target keywords are "Dog stores mailing list" and "Dog stores mailing lists" Does having "mailing-list" and "mailing-lists" in my URL hurt me?
Intermediate & Advanced SEO | | Travis-W0 -
Blocking out specific URLs with robots.txt
I've been trying to block out a few URLs using robots.txt, but I can't seem to get the specific one I'm trying to block. Here is an example. I'm trying to block something.com/cats but not block something.com/cats-and-dogs It seems if it setup my robots.txt as so.. Disallow: /cats It's blocking both urls. When I crawl the site with screaming flog, that Disallow is causing both urls to be blocked. How can I set up my robots.txt to specifically block /cats? I thought it was by doing it the way I was, but that doesn't seem to solve it. Any help is much appreciated, thanks in advance.
Intermediate & Advanced SEO | | Whebb0 -
SEOMoz and Facebook Graph Search
Are SEOMoz looking to integrate Facebook Graph Search (the web search section) into the product? At the moment we can measure and track rankings for Google, Bing/Yahoo, but not Facebook graph search. What are the general thoughts among the community? Do you think it will be adopted as a real search engine? I'm not overly concerned - I reckon it will take a lot to change people behaviour and have them moving away from the other search engines. It's throwing up some interesting results though in searches!
Intermediate & Advanced SEO | | littlesthobo0 -
Removing dashes in our URLs?
Hi Forum, Our site has an errant product review module that is resulting in about 9-10 404 errors per day on Google Webmaster Tools. We've found that by changing our product page URLs to only include 2 dashes, the module stops causing 404 errors for that page. Does changing our URL from "oursite.com/girls-pink-yoga-capri.html" to "oursite.com/girlspink-yoga-capri.html" hurt our SEO for a search for "girls pink yoga capri"? If so, by how much (assuming everthing else on the page is optimized properly) Thanks for your input.
Intermediate & Advanced SEO | | pano0 -
Best way to block a search engine from crawling a link?
If we have one page on our site that is is only linked to by one other page, what is the best way to block crawler access to that page? I know we could set the link to "nofollow" and that would prevent the crawler from passing any authority, and we can set the page to "noindex" to prevent it from appearing in search results, but what is the best way to prevent the crawler from accessing that one link?
Intermediate & Advanced SEO | | nicole.healthline0