Internal search: rel=canonical vs noindex vs robots.txt
-
Hi everyone,
I have a website with a lot of internal search results pages indexed. I'm not asking whether they should be indexed; I know they shouldn't be, according to Google's guidelines. They also create a bunch of duplicate pages, so I want to solve this problem.
The thing is, if I noindex them, the site is going to lose a non-negligible chunk of traffic: nearly 13% according to Google Analytics!
I thought of blocking them in robots.txt. This solution would not keep them out of the index, but the pages appearing in Google's SERPs would then look empty (no title, no description), so their CTR would plummet and I would lose a bit of traffic too...
The last idea I had was to use a rel=canonical tag pointing to the original search page (the one that is empty, without results), but it would probably have the same effect as noindexing them, wouldn't it? (I've never tried it, so I'm not sure.)
Of course I did some research on the subject, but each of my findings recommended only one of the 3 methods! One even recommended noindex plus a robots.txt block, which is stupid because the noindex would then be useless: a page that is blocked from crawling is never fetched, so the tag on it is never seen...
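To make the robots.txt point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `/searchpage` path prefix, and the `example.com` URLs are assumptions for illustration only, not the actual site's configuration. It only shows which URLs a compliant crawler would be allowed to fetch; it says nothing about what stays in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption, not the real site's file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /searchpage
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "Googlebot") -> bool:
    """Return True if a compliant crawler may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/products/widget.htm",
        "https://example.com/searchpage.htm?action=search&query=widget",
    ):
        print(url, is_crawlable(url))
```

A URL for which this returns False is never fetched by a compliant crawler, so any noindex tag in its HTML goes unread, which is why combining the two methods defeats the purpose.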
Can anybody tell me which option is best for keeping this traffic?
Thanks a million
-
Yeah, normally I'd say to NOINDEX those user-generated search URLs, but since they're collecting traffic, I'd have to side with Alan - a canonical may be your best bet here. Technically, they aren't "true" duplicates, but you don't want the 1K pages in the index, you don't want to lose the traffic (which NOINDEX would do), and you don't want to kill those pages for users (which a 301 would do).
Only thing I'd add is that, if some of these pages are generating most of the traffic (e.g. 10 pages = 90% of the traffic for these internal searches), you might want to make those permanent pages, like categories in your site architecture, and then 301 the custom URLs to those permanent pages.
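One rough way to check whether a handful of search pages carries most of the traffic is sketched below. The function name, the visit counts, and the 90% threshold are all hypothetical; in practice the numbers would come from an analytics export.

```python
def top_pages_by_share(visits: dict[str, int], share: float = 0.9) -> list[str]:
    """Return the smallest set of top pages whose visits reach `share` of the total."""
    total = sum(visits.values())
    if total == 0:
        return []
    selected: list[str] = []
    running = 0
    # Walk pages from most to least visited until the threshold is covered.
    for url, count in sorted(visits.items(), key=lambda kv: kv[1], reverse=True):
        if running >= share * total:
            break
        selected.append(url)
        running += count
    return selected


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    sample = {
        "/searchpage-blue-widget-1.htm": 600,
        "/searchpage-red-widget-1.htm": 300,
        "/searchpage-green-widget-1.htm": 60,
        "/searchpage-old-widget-1.htm": 40,
    }
    print(top_pages_by_share(sample, share=0.8))
```

If the returned list is short, those URLs are the candidates for permanent category-style pages, with the custom search URLs 301-redirected to them.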
-
Hmm, not sure, since I'm not a developer (and didn't work on that website's development), but I'd say all of the above ^^. In case it's useful, here is their URL structure. There are two kinds:
- /searchpage.htm?action=search&pagenumber=xx&query=product+otherterms
So I guess they are generated when a user makes a search.
They are paginated (about 15 pages generally),
and I can only roughly estimate how duplicated they are: some probably overlap when there are a lot of variations of the product. There are just a few complete duplicates (when the product searched is the same with different added terms, which doesn't happen a lot in this list).
- /searchpage-searchterm-addedterm-number.htm
Those I find surprising. I don't know if they are pages generated with a fixed URL, or if they are rewritten (I haven't looked at the .htaccess yet, but I will. God, I have a headache just thinking about reading that thing lol).
There are about a thousand of them in all (according to Google Analytics, about half of each sort, and nearly all are indexed by Google), on a website with about 12 thousand pages in total.
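For sorting an analytics export into the two URL kinds described above, a small sketch like the following might help. It assumes the trailing `number` in the second pattern is numeric and that search terms contain no other structure; both are guesses from the two sample URLs, and the function name is hypothetical.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Assumed shape of the second kind: /searchpage-<terms>-<number>.htm
REWRITTEN = re.compile(r"^/searchpage-(?P<terms>.+)-(?P<number>\d+)\.htm$")


def describe_search_url(url: str) -> Optional[dict]:
    """Classify a URL as one of the two search-page kinds, or return None."""
    parts = urlsplit(url)
    if parts.path == "/searchpage.htm":
        params = parse_qs(parts.query)
        if params.get("action") != ["search"]:
            return None
        return {
            "kind": "query-string",
            # parse_qs decodes "+" as a space, so split() yields the terms.
            "terms": params.get("query", [""])[0].split(),
            "number": params.get("pagenumber", [""])[0],
        }
    match = REWRITTEN.match(parts.path)
    if match:
        return {
            "kind": "rewritten",
            "terms": match.group("terms").split("-"),
            "number": match.group("number"),
        }
    return None


if __name__ == "__main__":
    print(describe_search_url("/searchpage.htm?action=search&pagenumber=2&query=blue+widget"))
    print(describe_search_url("/searchpage-blue-widget-3.htm"))
```

Grouping URLs by their sorted `terms` would then give a first approximation of which pages overlap or duplicate each other.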
Maybe the traffic loss will be compensated by removing the competition between those search pages and the product pages (and the rel=canonical is surely way less brutal than a noindex for that matter), but without experience in this kind of situation it's hard to make a decision...
Really appreciate you guys taking the time to help!
-
Alan's absolutely right about how canonical works, but I just want to clarify something - what about these pages is duplicated? In other words, are these regular searches (like product searches) with duplicate URLs, are these paginated searches (with page 2, 3, etc. that appear thin), or are these user-generated searches spinning out into new search pages (not exact duplicates but overlapping)? The solutions can vary a bit with the problem, and internal search is tricky.
-
Just one more point: a canonical is just a hint to the search engines, not a directive. If they think the pages should not be merged, they will ignore it, so in that way they may make the decision for you.
-
There aren't a lot of real duplicates; the pages are more alike than identical, and the most visited ones are unique, so I'll keep the most important ones and just toss a few duplicates.
Thanks a lot for your help, problem solved!
-
No, not like a noindex. More like a merge.
Will it make you rank for many keywords? Not necessarily, as a page all about blue widgets is going to rank higher than a page that covers many different subjects, including blue widgets.
A canonical is really for duplicate content, or very similar content.
So you have to decide what your page is: is it duplicate or similar content, or is it unique?
If the pages are unique, then do nothing and let them rank. If you think they are alike, then use a canonical. If there are only a few, then I would not worry either way.
If you decide they are unique, then I would look at making the page titles unique as well, and maybe even the descriptions too.
-
Thanks for your answer.
OK, so you're saying it will indeed act like a noindex over time.
So if one of the result pages would have ranked for a particular query, it will not rank any more, like with a noindex => the site will lose the 13% of traffic those pages generated...
Otherwise it would be too easy to make a page rank for the keywords used in a bunch of other pages that refer to it via rel=canonical... wouldn't it?
I'm starting to think I can't do anything... Maybe just noindex a bunch of them that cause duplicates, and leave the rest in the index.
-
Rel=canonical is the way to go. It will tell the search engines that all credit for all the different URLs goes to the original search page. Eventually only the original search page will exist in the index.
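For reference, the tag itself is a single `<link>` element in the `<head>` of each search results page. The sketch below builds it in Python; the site root, the helper name, and the choice of `/searchpage.htm` as the canonical target are assumptions based on the URLs mentioned in this thread.

```python
from html import escape
from urllib.parse import urljoin


def canonical_link_tag(site_root: str, canonical_path: str = "/searchpage.htm") -> str:
    """Build the <link rel="canonical"> element pointing at the chosen target page."""
    href = urljoin(site_root, canonical_path)
    # Escape the URL so characters like & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(href, quote=True)}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://www.example.com/"))
```

Every paginated or user-generated search URL would emit the same tag, which is what signals to the search engines that the variants should be treated as one page.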