PDFs - Dupe Content
-
Hi
I have some PDFs linked to from a page with little content, so I'm thinking it's best to extract the copy from the PDF and put it on the page as body text, with the PDF still linked to as well. Will this count as dupe content?
Or is it best to use a PDF plugin so the page opens the PDF automatically and gets its content that way?
Cheers
Dan
-
Should be different, but you would have to look at them to make sure.
-
PS - is a PDF-to-HTML converter different from a plugin that loads the PDF as an open page when you click it? Or is it the same thing?
-
That is what I was going to suggest: setting up a canonical in the HTTP header of the PDF that points back to the article.
https://support.google.com/webmasters/answer/139394?hl=en
As another option, you can just block access to the PDFs to keep them out of the index as well.
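A rough sketch of both options, assuming you can control the response headers your server sends for the PDF. The function name and the example.com URL are made up for illustration, and reading "block" as an `X-Robots-Tag: noindex` header is my assumption (a robots.txt rule is a different approach). It only builds the header values; wiring them into your server or CMS is up to you.

```python
from typing import Dict, Optional


def pdf_response_headers(canonical_url: Optional[str] = None,
                         noindex: bool = False) -> Dict[str, str]:
    """Build extra HTTP response headers for a PDF download (hypothetical helper)."""
    headers = {"Content-Type": "application/pdf"}
    if canonical_url:
        # HTTP-header form of rel="canonical", usable for non-HTML files like PDFs.
        headers["Link"] = f'<{canonical_url}>; rel="canonical"'
    if noindex:
        # Asks search engines not to index this file (assumed meaning of "block").
        headers["X-Robots-Tag"] = "noindex"
    return headers


# Hypothetical article URL that carries the same copy as the PDF.
print(pdf_response_headers("https://www.example.com/articles/white-paper/"))
```

You would normally pick one or the other: the canonical if you want the HTML page to get the credit, or noindex if you just want the PDF out of the results.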
-
Thanks Chris
Yes, you can canonicalise the PDF to the HTML (according to the comments on that article I just linked to, anyway).
-
Hi Dan,
Yes, PDFs are crawlable (sorry for the confusion!). If you were to put it into, say, a .zip or .rar (or similar), it wouldn't be crawled, or you could noindex the link, I guess. You would need to stick the PDF (download) behind something that couldn't be crawled. You could try rel=canonical, but I've never tried it with a PDF, so I'm not sure how that would go.
Hope that enlightens you a bit.
-
Thanks Chris, although I thought PDFs were crawlable? http://www.lunametrics.com/blog/2013/01/10/seo-pdfs/
That's why I'm worried about dupe content if I use the content of the PDF as body text too. Or are you saying I should nofollow the link to the PDF if I use its content as body text, because it is considered dupe content in that scenario?
Ideally I want both: the PDF's copy used as body text on the page with the PDF as a linkable download, or the page as an embed of the open PDF via a plugin.
-
What would give the user the best experience is the real question. I would say put it on the page; then if the user is lacking a plugin, they can still read it. If you have it as a downloadable PDF, it shouldn't be able to get crawled, thus avoiding the problem.
Hope that helps.