The Bible and Duplicate Content
-
We have our complete set of scriptures online, including the Bible at http://lds.org/scriptures. Users can browse to any of the volumes of scriptures. We've improved the user experience by allowing users to link to specific verses in context which will scroll to and highlight the linked verse. However, this creates a significant amount of duplicate content. For example, these links:
http://lds.org/scriptures/nt/james/1.5
http://lds.org/scriptures/nt/james/1.5-10
http://lds.org/scriptures/nt/james/1
All of those will link to the same chapter in the book of James, yet the first two will highlight the verse 5 and verses 5-10 respectively. This is a good user experience because in other sections of our site and on blogs throughout the world webmasters link to specific verses so the reader can see the verse in context of the rest of the chapter.
Another bible site has separate html pages for each verse individually and tends to outrank us because of this (and possibly some other reasons) for long tail chapter/verse queries. However, our tests indicated that the current version is preferred by users.
We have a sitemap ready to publish which includes a URL for every chapter/verse. We hope this will improve indexing of some of the more popular verses. However, Googlebot is going to see some duplicate content as it crawls that sitemap!
So the question is: is the sitemap a good idea realizing that we can't revert back to including each chapter/verse on its own unique page? We are also going to recommend that we create unique titles for each of the verses and pass a portion of the text from the verse into the meta description. Will this perhaps be enough to satisfy Googlebot that the pages are in fact unique? They certainly are from a user perspective.
Thanks all for taking the time!
-
Dave,
Thanks for the clarification. You're definitely in a rare circumstance as compared to most web sites.
In reality, since it's the Bible, there is going to be a duplicate content issue regardless, given how many sites currently and how many more will most likely publish the same content now and in the future. From Eternalministries.org to KingJamesBibleOnline.org, concordance.biblebrowser.com, and so many other sites are all offering this content.
If you can find a way to offer your content in a unique way, and within your own site, offer different versions of it (individual verses compared to entire chapters), then ideally yes, you'd want it all indexed.
How you do that without adding your own unique text above or below each page's direct biblical content is the issue though.
Given this challenge,this is why I offered the concept of not indexing variations. Even if you weren't hit by the Panda update, any time Google has to evaluate multiple pages across sites where the content is either identical or "mostly" identical, someone's content is going to suffer to one degree or another. Any time it's a conflict within a single site, some versions are going to be given less ranking value than others.
So unfortunately it's not a simple, straight forward situation where duplication avoidance can be guaranteed to provide the maximum reach, nor is there a simple way to boost multiple versions in a way to guarantee that they'll all be found, let alone show up above "competitor" sites.
This is why I initially offered what are essentially SEO best practices for addressing duplicate content.
If you don't want to lose the traffic you have now that come in by multiple means, the only other way to bolster what you've got already is to focus on high quality long term link building, and social media.
The link building would need to focus on obtaining high quality links pointing to deep content. (Specific chapter pages and specific verse pages), where the anchor text used in those links varies between chapter or verse specific words, broader bible related phrases, and the LDS brand.
On the other hand, by implementing canonical tags, you will definitely reduce at least a number of visits that currently come in by variation URLs. Will that be compensated for by an equal or greater number of visits to the new "preferred" URL? In this rather unique situation there's no way to truly know. It is a risk.
Which brings me back to the concept that you'd potentially be better off finding ways to add truly unique content around the biblical entries. It's the only on-site method I can think of that would allow you to continue to have multiple paths indexed. Combined with unique page Titles, chapter/verse targeted links and social media, it could very well make the difference.
With what, over 1100 chapters, and 31,000 verses, that's a lot of footwork. Then again, it's a labor of love, and every journey is made up of thousands of steps.
-
So you're saying it would not be a good idea to try and get every verse url listed in Google? Perhaps we could try adding a canonical tag to point the the chapter only? For example, browsing the site you can't actually navigate to http://lds.org/scriptures/nt/james/1.5?lang=eng. You can only navigate to /james/1?lang=eng. However, the other URLs exist when someone links externally to a specific chapter and verse. The code on the page will highlight the desired verse. In our example the entire chapter exists on its own url and the content is unique.
Your suggestion may work if we just canonicalize all those "verse" urls like /james/1.5?lang=eng and james/1.5-10?lang=eng to /james/1?lang=eng. Some of the more popular verses with great page authority could actually help prop up the rest of the content on the page.
My concern though is that MUCH of the scripture related traffic comes through queries of the exact chapter/verse reference. So I can see where having individual pages for each passage could be valuable for rankings. But that user experience is poor when someone wants to see a range of passages like ch 5 vs 1-4 or similar. So we are looking for the best way to get our URLs indexed and ranked as individual passages or ranges of passages that are popular on search engines.
I can tell you that this section was not hit by the Panda update. The content is not "thin" as could be the case if we put each verse on a single page.
The ?lang=eng parameter is how we handle language versions. We have the scriptures online in several languages. I'm sure there are better ways to handle that as well. Due to the size of the organization we're certainly trying to get the low hanging fruit out of the way first.
-
Dave,
You're facing a difficult challenge - satisfy the needs of SEO, or user experience. In light of all that Google has done going back to their May Day update last year and right through the Panda/Farmer update, duplicate content, as well as "thin" content, is more of a concern than ever.
Just having unique titles on each page is not enough. It's the entire weight of uniqueness.
Since you're not intending to go to individual pages for each verse, as long as you've got multiple methods of getting tocontent that is found by other methods, only one method should be designated as the primary search engine preferred method. All others should be blocked from being indexed.
From there, users can choose to explore other methods of finding content as they bookmark your site if they find it of help to their goals.
Unfortunately, this does of course, mean that you're going to end up with many less pages indexed. However every page that is indexed will become stronger in their individual rankings, and that in turn will boost all of the pages above them, and the entire site over time.
And here's another issue - when I go to any of the URLs you posted above, your site automatically tacks on "?lang=eng" using 301 Redirects. This means any inbound links you have pointing to the non-appended URLs are not providing maximum value to your site, since they point to pages designated as permanently moved.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Footer Content Issue
Please check given screenshot URL. As per the screenshot we are using highlighted content through out the website in the footer section of our website (https://www.mastersindia.co/) . So, please tell us how Google will treat this content. Will Google count it as duplicate content or not? What is the solution in case if the Google treat it as duplicate content. Screenshot URL: https://prnt.sc/pmvumv
Technical SEO | | AnilTanwarMI0 -
Shopify Duplicate Content in products
Hello Moz Community, New to Moz and looking forward to beginning my journey towards SEO education and improving our clients' sites. Our client's website is a Shopify store. https://spiritsofthewestcoast.com/ Our first Moz reports show 686 duplicate content issues. I will show the first 4 as examples. https://spiritsofthewestcoast.com/collections/native-earrings-and-studs-in-silver-and-gold/products/haida-eagle-teardrop-earrings https://spiritsofthewestcoast.com/collections/native-earrings-and-studs-in-silver-and-gold/products/haida-orca-silver-earrings https://spiritsofthewestcoast.com/collections/native-earrings-and-studs-in-silver-and-gold/products/silver-oval-earrings https://spiritsofthewestcoast.com/collections/native-earrings-and-studs-in-silver-and-gold/products/haida-eagle-spirit-silver-earrings As you can see, URL titles are unique. But I know that the content in each of those products have very similar product descriptions but not exactly. But since they have been flagged as a site issue by Moz, I am guessing that the content is 95% duplicate. So can a rel=canonical be the right solution for this type of duplicate content? Or should I be considering adding new content to each of 686 products to drop below the 95% threshold? Or another solution that I may not be aware of. Thanks in advance for your assistance and expertise! Sean
Technical SEO | | TheUpdateCompany1 -
Duplicate Content - Different URLs and Content on each
Seeing a lot of duplicate content instances of seemingly unrelated pages. For instance, http://www.rushimprint.com/custom-bluetooth-speakers.html?from=topnav3 is being tracked as a duplicate of http://www.rushimprint.com/custom-planners-diaries.html?resultsperpg=viewall. Does anyone else see this issue? Is there a solution anyone is aware of?
Technical SEO | | ClaytonKendall0 -
How to avoid duplicate content
Hi, I have a website which is ranking on page 1: www.oldname.com/landing-page But because of legal reason i had to change the name.
Technical SEO | | mikehenze
So i moved the landing page to a different domain.
And 301'ed this landing page to the new domain (and removed all products). www.newname.com/landing-page All the meta data, titles, products are still the same. www.oldname.com/landing-page is still on the same position
And www.newname.com/landing-page was on page 1 for 1 day and is now on page 4. What did i do wrong and how can I fix this?
Maybe remove www.oldname.com/landing-page from Google with Google Webmaster Central or not allow crawling of this page with .htaccess ?0 -
Tired of finding solution for duplicate contents.
Just my site was scanned by seomoz and seen lots of duplicate content and titles found. Well I am tired of finding solutions of duplicate content for a shopping site product category page. You can see the screenshot below. http://i.imgur.com/TXPretv.png You can see below in every link its showing "items_per_page=64, 128 etc.". This happened in every category in which I was created. I am already using Canonical add-on to avoid this problem but still it's there. You can check my domain here - http://www.plugnbuy.com/computer-software/pc-security/antivirus-internet-security/ and see if the add-on working correct. I recently submitted my sitemap to GWT, so that's why it's not showing me any report regarding duplicate issues. Please help ME
Technical SEO | | chandubaba0 -
Hiding Duplicate Content using Javascript
We have e-commerce site selling books. Besides basic information on books, we have content for “About the book” , “Editorial Reviews”, “About the author” etc. But the content in all these section are duplicate and are available on all sites selling similar books. Our question is: 1.Should we worry about the content being duplicate?2.If yes, then will it by a good idea to hide this duplicate content using javascript or iframe?
Technical SEO | | CyrilWilson0 -
URL query considered duplicate content?
I have a Magento site. In order to reduce duplicate content for products of the same style but with different colours I have combined them on to 1 product page. I would like to allow the pictures to be dynamic, i.e. allow a user to search for a colour and all the products that offer that colour appear in the results, but I dont want the default product image shown but the product image for that colour applying to the query. Therefore to do this I have to append a query string to the end of the URL to produce this result: www.website.com/category/product-name.html?=red My question is, will the query variations then be picked up as duplicate content: www.website.com/category/product-name.html www.website.com/category/product-name.html?=red www.website.com/category/product-name.html?=yellow Google suggest it has contingencies in its algorithm and I will not be penalised: http://googlewebmastercentral.blogspot.co.uk/2007/09/google-duplicate-content-caused-by-url.html But other sources suggest this is not accurate. Note the article was written in 2007.
Technical SEO | | BlazeSunglass0 -
How to resolve duplicate content and title errors?
Hello, I'm new to this resource and SEO. I have taken the time to read other posts but am not entirely sure about the best way to resolve the issues I am experiencing and so am hoping for a helpful hand. My site diagnostics advise me that most of my errors relate to duplicate content and duplicate page titles. Most of these errors seem to relate to our ecommerce product pages. A little about us first, we manufacture and retail over the internet our own line of unique products which can only be purchased through our website. So it’s not so important to make our product pages stand out from competitors. An example of one of our product pages can be found here: http://www.nabru.co.uk/product/Sui+2X2+Corner+Sofa In terms of SEO we are focusing on improving the rankings of our category pages which compete much more with our competitors, but would also like our product pages to be found via a google search for those potential customers that are at the late stage of a buying cycle. So my question: Whilst it would be good to add more content to the product pages, user reviews, individual product descriptions etc (and have good intentions to do this over time, which unfortunately is limited) is there an easy way to fix the duplicate content issues, ensure our products can be found and ensure that the main focus is on our category pages? Many thanks.
Technical SEO | | jannkuzel0