Duplicate without user-selected canonical excluded
-
We have PDF files uploaded to the WordPress media library and used on our website. Because these PDFs duplicate content from the original publishers, we have marked links to the PDF URLs as nofollow. These pages are also disallowed in robots.txt.
Now Google Search Console shows these pages as Excluded, with the status "Duplicate without user-selected canonical".
As it turns out, we cannot use a canonical tag on PDF pages to point to the original PDF source.
If we embed a PDF viewer in our website and fetch the PDFs by passing the URLs of the original publisher, would Google still read the PDFs as text and again create a duplicate content issue? Also, when a PDF expires and is removed, it would lead to a 404 error.
If we direct our users to the third-party website, it would add to our bounce rate.
What is the appropriate way to handle duplicate PDFs?
Thanks
-
From what I have read, so much of the web is duplicate content that it really doesn't matter if the PDF is on other sites; let Google figure it out. (For example, every car brand dealer has a PDF of the same car model brochure on their dealer site.) No big deal. Visitors will be landing on your site through other search relevance, so the duplicate PDF doesn't matter. Just my take. Adrian
-
Sorry, I mean pdf files only
-
As the PDF pages are marked as duplicates and not the PDF files, you should check which page has duplicate content compared to each of them, and take the needed measures (canonical tags or a 301 redirect) from the page with less rank to the page with more rank. Alternatively, you can edit the content so that it is no longer a duplicate.
If I had a link to the site and duplicate pages, I would be able to give you a more detailed response.
Daniel Rika - Dalerio Consulting
https://dalerioconsulting.com/
info@dalerioconsulting.com
-
Hello Daniel
The PDFs are duplicates from another site.
The thing is that we have already disallowed the PDFs in the robots.txt file.
Here is what happened: we have a set of pages (let's call them content pages) which we had disallowed in the robots.txt file because they had thin content. Those pages link to their respective third-party PDFs, and those links are marked as nofollow. The PDFs are also disallowed in the robots.txt file.
A few days ago, we improved our content pages and removed them from the robots.txt file so that they can be indexed. The PDFs are still disallowed. Despite being disallowed, the PDF pages now show the "Duplicate without user-selected canonical" issue.
I hope I have made myself clear. Any insights now, please?
-
If the PDFs are duplicated within your own site, then the best solution would be to link to the same document from the different sources. Then you can delete the duplicated documents and 301 redirect them to the original.
If the PDFs are duplicates from another site, then disallowing them in robots.txt will stop them from being marked as duplicates, as the crawler will not be able to access them at all. It will just take some time for this to be updated in Google Search Console.
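One way to confirm that the disallow rule really covers the PDF URLs is to test it locally. The following is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt rule, the upload path, and the example URLs are assumptions for illustration, not taken from the site in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file and upload path may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-content/uploads/
"""


def is_crawl_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows crawling of url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical URLs: one PDF in the media library, one content page.
pdf_url = "https://www.example.com/wp-content/uploads/2020/01/brochure.pdf"
page_url = "https://www.example.com/content-page/"

print("PDF blocked:", is_crawl_blocked(ROBOTS_TXT, pdf_url))
print("Content page blocked:", is_crawl_blocked(ROBOTS_TXT, page_url))
```

The sketch uses a plain path-prefix rule and only checks whether crawling is disallowed; it says nothing about what Search Console will report for the URL.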
If, however, you want to add canonical tags to the PDF documents (or other non-HTML documents), you can add them to the HTTP header through the .htaccess file. You can find a tutorial on how to do that in this article.
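As a rough illustration of the HTTP header approach, the sketch below generates .htaccess rules that attach a `Link: <...>; rel="canonical"` header to individual PDF files. It assumes an Apache server with mod_headers enabled; the function name, the file name, and the publisher URL are hypothetical. Note that a crawler has to be able to fetch the PDF to see this header, so it would not be seen on URLs that are disallowed in robots.txt.

```python
def htaccess_canonical_rules(canonicals: dict[str, str]) -> str:
    """Build Apache <Files> blocks that set a rel="canonical" Link header.

    canonicals maps a local PDF file name to the URL of the original
    document. Assumes Apache with mod_headers enabled.
    """
    blocks = []
    for filename, canonical_url in canonicals.items():
        blocks.append(
            f'<Files "{filename}">\n'
            f'  Header set Link "<{canonical_url}>; rel=\\"canonical\\""\n'
            f'</Files>'
        )
    return "\n\n".join(blocks) + "\n"


# Hypothetical mapping: local file name -> original publisher URL.
rules = htaccess_canonical_rules(
    {"brochure.pdf": "https://publisher.example/brochure.pdf"}
)
print(rules)
```

The printed text can be reviewed and pasted into the .htaccess file by hand. Generating it from a mapping keeps each local file paired with its original URL in one place.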
Daniel Rika - Dalerio Consulting
https://dalerioconsulting.com/
info@dalerioconsulting.com