Glossary index and individual pages create duplicate content. How much might this hurt me?
-
I've got a glossary on my site with an index page for each letter of the alphabet that has a definition. So the M section lists every definition (the whole definition).
But each definition also has its own individual page (and we link to those pages internally so the user doesn't have to hunt down the entire M page).
So I definitely have duplicate content ... 112 instances (112 terms). Maybe it's not so bad because each definition is just a short paragraph(?)
How much does this hurt my potential ranking for each definition? How much does it hurt my site overall?
Am I better off making the individual pages noindex, or canonicalizing them?
-
Thanks, Ryan!
-
From here: http://moz.com/messages/write to Dirk's username: DC1611. There used to be a button in profiles, but it looks like it got shuffled in the redesign.
-
PM? Does Moz offer that function?
-
It's a bit difficult to assess which of the pages is more important without knowing the site. Having a lot of content is good, but if the only link between the items is that they all start with the same letter, the page could be pretty weak or pretty strong depending on the situation.
I'll give 2 examples:
Suppose the index is on first names starting with S. In this case the page is a valuable one because a lot of people are searching for it, and the search volume is potentially bigger than the number of people looking for the first name Steve (= one specific item).
Suppose the index is about illnesses starting with S. In this case the index page has very little value for a searcher, because people search for illnesses based on the symptoms; the fact that the illnesses start with S doesn't link them together.
It could be helpful if you send me the actual URLs via PM if you don't want to disclose them here.
rgds
Dirk
-
Oops. Sorry. Poor wording there. Meant to say ...
Definitely not concerned that the M index page and the M definition page BOTH show up in the search results.
We definitely do want at least one of the pages to not only show up in the rankings, but to rank highly. I'm guessing the M index page would actually have a chance of ranking high because it will have so many long tails related to our short-tail.
But it would seem weird to put a noindex on the M definition page ... since we have multiple internal links to those pages.
Thanks again for your patience. Really appreciate the feedback.
Steve
-
That's exactly what I am saying: your index page with all the definitions is, from Google's perspective, completely different from the detailed definition page (the first being much richer in content than the second). If getting these pages ranked is the least of your concerns, you can keep it as it is. If you want to play it safe, you can put a noindex on the index page.
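If you do add the noindex, it is worth confirming the tag really ends up in the page source. Here is a minimal sketch of such a check in Python, assuming the directive is delivered as a robots meta tag in the HTML (it does not look at an `X-Robots-Tag` HTTP header). The names `RobotsMetaParser` and `has_noindex` are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of any <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; values keep their original case.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def has_noindex(html_text):
    """Return True if the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives
```

You would pass in the HTML of the index page, however you fetch it, and expect `True` once the tag is in place.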
rgds,
Dirk
-
Just having a bit of a dilemma. Trying to make it easier for people who come to the glossary and then go to ... say ... the M page. Don't have to keep clicking away to see the definitions. Result: More user-friendly
But we also want to have a very specific definition page so that when we link from an article to the definition, the user doesn't have to see all of the M definitions. Result: More user-friendly.
Definitely not concerned that both the M index page and the M definition page show up in the search results. That would actually be swell. Just more concerned that our overall site ranking or domain authority will somehow suffer.
If you're saying that the M index page and the M definition page are dramatically different (because the M index page is much, much longer) and so I shouldn't worry, that's great. (Hope that's what you're saying.)
Thanks!
-
Hi,
As far as I understand, it's not really a question of duplicate content in the SEO sense. Although all the definitions starting with M are on the M index page, this page is quite different from the pages that contain the individual definitions of the terms that start with M.
A problem on many sites is that the pages that contain the explanation of only one term are very light in terms of content, and that the page listing all these terms is generally not very interesting from a user (and search) perspective. I don't know your site, so it's difficult to assess whether this is the case.
You could make the index page noindex/follow and just list the terms, linking to the explanation pages. For the explanation pages, which are probably the most interesting for users & search engines, try to enrich them by adding more content, like links to articles on your site that use the term or have more information on it.
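A minimal sketch of what that index page could look like, written as a small Python function that renders the HTML. The function name `render_index_page`, the example terms, and the `/glossary/...` URLs are all assumptions for illustration; I don't know how your site is actually built.

```python
from html import escape


def render_index_page(letter, terms):
    """Return HTML for a letter index page that only lists terms.

    `terms` maps each term to the URL of its own definition page
    (hypothetical structure, adapt to your CMS).
    """
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(term)}</a></li>'
        for term, url in sorted(terms.items())
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"  <title>Glossary: {escape(letter)}</title>\n"
        # Keep this page out of the index, but let crawlers follow its links.
        '  <meta name="robots" content="noindex, follow">\n'
        "</head>\n<body>\n"
        f"  <h1>{escape(letter)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n</html>\n"
    )


# Example terms and URLs are invented for the sketch.
page = render_index_page(
    "M", {"Margin": "/glossary/margin", "Markup": "/glossary/markup"}
)
print(page)
```

The index page then carries no definitions of its own, so the individual definition pages are the only place each definition appears.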
Hope this helps,
Dirk