PDF's - Dupe Content
-
Hi
I have some pdfs linked to from a page with little content. Hence thinking best to extract the copy from the pdf and have on-page as body text, and the pdf will still be linked too. Will this count as dupe content ?
Or is it best to use a pdf plugin so page opens pdf automatically and hence gives page content that way ?
Cheers
Dan
-
Should be different, but you would have to look at them to make sure.
-
ps - is a pdf to html coverter different from a plugin that loads the pdf as an open page when you click it ? or same thing ?
-
That is what I was going to suggest - setting up a canonical in the http header of the PDF back to the article
https://support.google.com/webmasters/answer/139394?hl=en
As another option, you can just block access to the PDFs to keep them out of the index as well.
-
thanks Chris
yes you can canonicalise the pdf to the html (according to the comments of that article i just linked to anyway)
-
Hi Dan,
Yes PDFs are crawlable (sorry for confusion!) if you were to put it into say a .zip or .rar (or similar) it wouldn't be crawled or you could no index the link i guess. You would need to stick the PDF (download) behind some thing that couldn't be crawled. You could try rel= canonical but I've never tried it with a PDF so i'm not sure how that would go.
Hope that enlightens you a bit.
-
Thanks Chris although i thought PDFS were crawlable??: http://www.lunametrics.com/blog/2013/01/10/seo-pdfs/
Hence why im worried about dupe content if use content of pdf as body text too OR are you saying should no-follow the link to the pdf if use its content as body text because it is considered dupe content in that scenario ?
Ideally i want both - the copy on it used as body text copy on page and the pdf a linkable download, or page as embed of open pdf via a plugin.
-
What would give the user the best experience is the really question,I would;d say put it on page then if the user is lacking a plugin they can still read it, if you have it as a downloadable PDF is shouldn't be able to get crawled and thus avoiding the problem.
Hope that helps.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to NOT keyword stuff in long form content
My homepage has an authoritative guide with long form content. As a result, my main keyword is mentioned 40+ times. It's not forced, but natural, and the frequency of it is a function of the length of the article. Is that okay? Any suggestions or advise?
On-Page Optimization | | ntaparia171 -
Can bots crawl this homepage's content?
The website is https://ashleydouglas.com.au/ I tried using http://www.seo-browser.com/ to see if bots could see the content on the site, but the tool was unable to retrieve the page. I used mobile-friendly test and it just rendered some menu links - no content and images. I also used Fetch and Render on Search Console. The result for 'how google sees the page' and 'how a visitor sees the page' are the same and only showing the main header image. Anything below isn't shown. Does this mean that bots can't actually read all content on the page past the header image? I'm not well versed with what's going on with the code. Why are the elements below the header not rendering? Is it the theme? Plugins? Thank you.
On-Page Optimization | | nhhernandez0 -
Duplicate Content
When I say duplicate content, I don't mean that the content on a clients site is displaying on another site on the web or taken from a site on the web. A client has a few product pages and each product page has content on the bottom of the page (4-5 paragraphs) describing the product. Now, this content is also displaying on other pages, but re-worded so it's not 100% duplicate. Some pages show a duplicate content % ranging from 12% to 35% and maybe 40%. Just curious if I should suggest having each product page less than 10% duplicated. Thanks for your help.
On-Page Optimization | | Kdruckenbrod0 -
Duplicated content by the product pages
Hi,Do you thing those pages have duplicate content:https://www.nobelcom.com/Afghanistan-phone-cards/from-Romania-235-2.htmlhttps://www.nobelcom.com/Afghanistan-phone-cards-2.htmlhttps://www.nobelcom.com/Afghanistan-Cell-phone-cards-401.htmlhttps://www.nobelcom.com/Afghanistan-Cell-phone-cards/from-Romania-235-401.html.And also how much impact will it have on a panda update?I'm trying to figure out if all the product pages, (that are in the same way as the ones above) are the reson for a Panda Penalty
On-Page Optimization | | Silviu0 -
H2's vs Meta description
in some of my serp results the h2's are showing up instead of the meta description. i have read that H2's arent really valid anymore. can someone clarify this for me?
On-Page Optimization | | dhanson240 -
Does 'XXX' in Domain get filtered by Google
I have a friend that has xxx in there domain and they are a religious based sex/porn addiction company but they don't show up for the queries that they are optimized against. They have a 12+ year old domain, all good health signs in quality links and press from trusted companies. Google sends them adult traffic, mostly 'trolls' and not the users they are looking for. Has anyone experienced domain word filtering and have a work around or solution? I posted in the Google Webmaster help forums and that community seems a little 'high on their horses' and are trying to hard to be cool. I am not too religious and don't necessarily support the views of the website but just trying to help a friend of a friend with a topic that I have never encountered. here is the url: xxxchurch.com Thanks, Brian
On-Page Optimization | | Add3.com0 -
What are the guidelines for writing good content ?
Hello Guys I am bit confused on how to write good content. I have read so many articles where they are mentioning you have add your keyword in the first paragraph,the last paragraph and use latex words etc I am confused ! I want to create reviews that rank on the first page and stay their. Please could someone point out what are the basic guidelines for writing good content for google & users ?
On-Page Optimization | | umk0 -
Duplicate Page Content Question
This article was published on fastcompany.com on March 19th. http://www.fastcompany.com/magazine/164/designing-facebook It did not receive much traffic, so it was re-posted on Co.Design today (March 27th) where it has received significantly more traffic. http://www.fastcodesign.com/1669366/facebook-agrees-the-secret-to-its-future-success-is-design My question is if google will dock us for reprinting/reusing content on another site (even if it is a sister site within the same company). If they do frown on that, is there a proper way to attribute the content to the source material/site (fastcompany.com)?
On-Page Optimization | | DanAsadorian0