PDFs - Duplicate Content
-
Hi
I have some PDFs linked from a page with little content, so I'm thinking it's best to extract the copy from the PDF and put it on the page as body text, with the PDF still linked too. Will this count as duplicate content?
Or is it better to use a PDF plugin so the page opens the PDF automatically and gets its page content that way?
Cheers
Dan
-
Should be different, but you would have to look at them to make sure.
-
PS - is a PDF-to-HTML converter different from a plugin that loads the PDF as an open page when you click it? Or is it the same thing?
-
That is what I was going to suggest: setting a rel="canonical" in the HTTP header of the PDF that points back to the article.
https://support.google.com/webmasters/answer/139394?hl=en
As another option, you can just block access to the PDFs to keep them out of the index as well.
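To make the canonical suggestion concrete, here is a minimal sketch of the response headers a server could send with the PDF. The function names and the example.com URL are made up for illustration; only the `Link: <url>; rel="canonical"` header format comes from the Google article linked above. How you attach the header depends on your own server setup.

```python
def canonical_link_header(html_url: str) -> str:
    """Build the value of an HTTP Link header that marks html_url as canonical."""
    return f'<{html_url}>; rel="canonical"'


def pdf_response_headers(html_url: str) -> dict:
    """Headers a server could send alongside the PDF file (illustrative only)."""
    return {
        "Content-Type": "application/pdf",
        "Link": canonical_link_header(html_url),
    }


if __name__ == "__main__":
    # Hypothetical article URL, not taken from the thread.
    headers = pdf_response_headers("https://www.example.com/article")
    for name, value in headers.items():
        print(f"{name}: {value}")
```

The idea is that the PDF stays downloadable while the HTML page is flagged as the preferred version.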
-
Thanks Chris
Yes, you can canonicalise the PDF to the HTML (according to the comments on that article I just linked to, anyway).
-
Hi Dan,
Yes, PDFs are crawlable (sorry for the confusion!). If you were to put it into, say, a .zip or .rar (or similar) it wouldn't be crawled, or you could noindex the PDF, I guess. You would need to put the PDF download behind something that can't be crawled. You could try rel=canonical, but I've never tried it with a PDF so I'm not sure how that would go.
Hope that enlightens you a bit.
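One way to put the PDF "behind something that can't be crawled" is a robots.txt rule. Below is a rough sketch, using Python's standard `urllib.robotparser`, of how you might check that a rule covers the PDF but not the article. The `/downloads/` folder, the URLs and the `is_crawlable` helper are assumptions for illustration, not anything from Dan's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of a PDF download folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /downloads/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/downloads/guide.pdf",
        "https://www.example.com/article",
    ):
        print(url, "crawlable:", is_crawlable(ROBOTS_TXT, url))
```

Bear in mind that robots.txt only controls crawling, and a crawler that is blocked from the PDF also can't see any canonical header on it, so it probably makes sense to pick one approach or the other.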
-
Thanks Chris, although I thought PDFs were crawlable? See: http://www.lunametrics.com/blog/2013/01/10/seo-pdfs/
That's why I'm worried about duplicate content if I use the PDF's content as body text too. Or are you saying I should nofollow the link to the PDF if I use its content as body text, because it would be considered duplicate content in that scenario?
Ideally I want both: the PDF's copy used as body text on the page and the PDF as a linkable download, or the page embedding the open PDF via a plugin.
-
What would give the user the best experience is the real question. I would say put it on the page, so that users lacking a plugin can still read it. If you have it as a downloadable PDF, it shouldn't be able to get crawled, thus avoiding the problem.
Hope that helps.