Similar pages: noindex or rel:canonical or disregard parameters?!
-
Hey all!
We have a hotel booking website with a search results page per destination (e.g. hotels in NYC are at dayguest.com/nyc). Pages are also generated for each destination based on various parameters, such as star rating, amenities, style of the properties, etc. (e.g. dayguest.com/nyc/4stars, dayguest.com/nyc/luggagestorage, dayguest.com/nyc/luxury, etc.).
In general, all of these pages are very similar. For example, there might be 10 hotels in NYC and all of them offer luggage storage, so the pages can be nearly identical. Hence the problems of duplicate content and link juice lost to dilution.
I was wondering what the best practice is in such a situation: should I just set all pages except the most important ones (e.g. dayguest.com/nyc) to noindex? Or set that page as the canonical for all variations? Or ask Google, in Google Webmaster Tools, to disregard the URLs for the various parameters? Or do something else altogether?!
Thanks for the help!
-
Sorry, I don't think I explained (1) very well. What I mean is that you may want to gradually change the site architecture so that not all of the search options are crawlable pages. This could mean putting some filters in form variables, for example (instead of links). It could also mean making sure that certain paths always converge. There's no easy solution. This is a problem all big sites face, and it's very dependent on the platform/CMS.
With (2), a "level" could be anything. Maybe there are major cities you need to cover but everything else could stay out of the index. This really depends on your information architecture, but there's always something that's high priority and something that's low priority. If you can focus Google on the high-priority pages, it can definitely work in your favor. The trick is figuring out how to build the logic such that you can code that dynamically. I've found there's almost always an answer, but it can take some creative thinking. I definitely don't encourage doing it manually.
If the results are easy to group by city and you can code that logic, the canonical may be fine. Since the search results could be different in some cases, canonical isn't technically the best choice, but it does often work. It really depends on how different they can be, so it's a bit tricky.
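To illustrate the "group by city" logic mentioned above, here is a minimal Python sketch. It assumes the site can fetch the list of property IDs shown on the city page and on a filtered page; the function names and the ID lists are hypothetical, and only the dayguest.com/city URL pattern comes from this thread. A filtered page points its canonical at the city page only when it lists exactly the same properties.

```python
from typing import Iterable, Optional

BASE_URL = "http://www.dayguest.com"  # domain taken from the thread


def canonical_for(city_slug: str,
                  city_ids: Iterable[int],
                  filtered_ids: Iterable[int]) -> Optional[str]:
    """Return the city URL when the filtered page shows exactly the same
    properties as the city page; otherwise None (no canonical override)."""
    if set(filtered_ids) == set(city_ids):
        return f"{BASE_URL}/{city_slug}"
    return None


def canonical_tag(url: Optional[str]) -> str:
    """Render the <link> element, or an empty string when no override applies."""
    return f'<link rel="canonical" href="{url}" />' if url else ""


# Hypothetical data: all three NYC hotels offer luggage storage.
print(canonical_tag(canonical_for("nyc", [1, 2, 3], [3, 2, 1])))
```

Because the check runs on every page render, the tag would drop away on its own once the filtered list starts to differ from the city list. Whether that flip-flopping is acceptable depends on how often inventory changes.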
-
Honestly, option 1 would be a nightmare. Imagine that we add one property in a city not yet covered. There are about 50 amenities, and most hotels feature most of them, so about as many new pages would be generated. That would quickly become unmanageable to handle manually.
Not sure I understand your second option. There are not several "levels", only one under the "city" in which the property is located. But multiplied by several cities, these pages quickly number in the hundreds, if not thousands.
Why would it not be possible/desirable to point all such pages to their city page as the canonical?
-
Ugh - that's what I was afraid you'd say. Unfortunately, the coincidental problem can't really be easily solved with code, which makes it hard to use canonical tags. There's no good way to tell the site when to use them.
So, a couple of options:
(1) Try to gradually rework the structure so that there are fewer of these paths.
(2) Consider using META NOINDEX on some lower-value paths. Internal search results don't have great value for Google, so you could let the major categories/options be indexed, but then cut off at a certain level (index nothing "below" it). That may be more feasible from a code standpoint.
(3) Use rel=prev/next, use unique TITLEs if possible (based on the query) and just clean things up the best you can, but leave everything indexed.
It depends a lot on your scope, structure, and your future plans. I'm not sure there's one "right" answer.
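Option (2) can be sketched in a few lines of Python. This assumes "level" means URL path depth and that anything with a query string counts as low value; both are assumptions for illustration, as is the `MAX_INDEXED_DEPTH` setting. The real cutoff depends on the site's information architecture.

```python
from urllib.parse import urlsplit

MAX_INDEXED_DEPTH = 1  # assumption: only /city pages stay in the index


def robots_meta(url: str, max_depth: int = MAX_INDEXED_DEPTH) -> str:
    """Return a robots meta tag: index shallow paths, noindex deeper ones."""
    parts = urlsplit(url)
    depth = len([segment for segment in parts.path.split("/") if segment])
    if depth > max_depth or parts.query:
        # Keep "follow" so links on the page can still be crawled.
        return '<meta name="robots" content="noindex, follow">'
    return '<meta name="robots" content="index, follow">'


print(robots_meta("http://www.dayguest.com/nyc"))
print(robots_meta("http://www.dayguest.com/nyc/luggagestorage"))
```

A whitelist of high-priority filters (say, star ratings) could replace the depth check if some second-level pages deserve to be indexed.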
-
These pages return the same results coincidentally, that's the issue... The more properties we get on board, the less likely it is that these pages will be similar. But it might take a long time to build that up, and we may never achieve it.
-
Ah, got it - yeah, I think rel=canonical would be fine there, but I'd want to understand your architecture better. Are these pages returning the same results coincidentally, or are these two URLs that basically land on the same combination of search options/filters? If it's the former, it's a lot tougher, because that's just a coincidence happening at large scale. If it's the latter, a solid canonical scheme could help a lot, but I'd also explore whether these paths are useful (or should be indexed at all). In other words, in the long term, it might be better to use one URL consistently, even if people navigate by different paths to reach it.
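For the second case (different URLs that land on the same combination of filters), one possible canonical scheme is to normalize every URL to a single form. The Python sketch below sorts and de-duplicates query parameters and strips a trailing slash. The `stars` parameter is hypothetical; only `amenities` appears in the URLs posted in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Normalize a search URL so the same filter combination always
    yields the same string, whatever order the filters were applied in."""
    parts = urlsplit(url)
    # Sort and de-duplicate (name, value) pairs from the query string.
    pairs = sorted(set(parse_qsl(parts.query)))
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), path, urlencode(pairs), "")
    )


a = canonical_url("http://www.dayguest.com/nyc?stars=4&amenities=10")
b = canonical_url("http://www.dayguest.com/nyc/?amenities=10&stars=4")
print(a == b)
```

The normalized string can go into the rel=canonical tag and, over the longer term, can be the one URL that internal links use consistently.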
-
That's odd, they were supposed to be the same. And yeah, results come and go as properties are added/removed from our inventory.
The following is what I wanted to highlight:
http://www.dayguest.com/rome-dayuse/concierge
http://www.dayguest.com/rome-dayuse/air-conditioning
As you can see, the pages are identical, except that one has 5 properties and the other has 6. Most overlap. There are so many property "features" or "categories" that some lists contain exactly the same properties. In fact, SEOmoz finds that I have over 1,700 pages with duplicate content, most of them search results pages with closely similar content such as these.
Hence my issue...
-
Are they duplicates in the sense that there are currently no results? I wouldn't generally use rel=canonical on these, because the search results should (theoretically) be different. These are distinct regions and, I assume, have unique properties.
If they're just returning no results, I'd actually consider a META NOINDEX until there are results available. Otherwise, this is likely to be treated as a soft 404 by Google (not a disaster, honestly). It depends on whether results come and go or if you're just building out the site and there will be data later. If the data isn't ready, I think META NOINDEX is a good way to go. Until results are available, these pages have no search value.
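The "noindex until there are results" idea is simple to express in code. This is a minimal Python sketch, assuming the page template knows how many properties the search returned; the function name is hypothetical.

```python
def robots_meta_for_results(result_count: int) -> str:
    """Emit a noindex tag for empty result pages, nothing otherwise."""
    if result_count <= 0:
        # No properties yet: keep the page out of the index for now.
        return '<meta name="robots" content="noindex">'
    return ""  # default behaviour: the page is indexable


print(robots_meta_for_results(0))
```

Since the tag is computed on each request, the page becomes indexable again as soon as the first property is added.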
-
Well, let me give you an example, look at this page: http://www.dayguest.com/milan-city-centre-dayuse?amenities=10
And this page: http://www.dayguest.com/milan-central-station-dayuse?amenities=10
Do you see what I'm talking about? The pages are identical but for the page title/description & a few words on the page.
So, you'd go for canonical?
-
The relationship is more hierarchical than next/previous. Judging from the post you mentioned, canonical would be more appropriate...
-
Sorry, I'm not clear on whether these are paginated search results or actual property pages that vary only by a small amount. As @SEO5 said, if these are paginated search results, you could use rel=prev/next. It's a bit tricky to set up with search filters (you need rel=prev/next + rel=canonical).
If these are nearly identical property pages, then it depends on how they differ. If they only differ by one attribute, I'd probably lean toward the canonical tag.