PDFs and webpages
-
If a website provides PDF versions of the page as a download option, should the PDF be no-indexed in your opinion?
We have to offer PDF versions of the webpages because our customers want them; they are a group who will download and print the PDFs. I thought of leaving the PDFs alone as they sit on a subdomain, but the more I think about it, the more I think I should noindex them. My reasons:
- They sit on a subdomain, so if users have linked to them, my main domain isn't getting the rank juice
- Duplication issues: they might be affecting the rank of the existing webpages
- I can't track the PDFs as they are on a subdomain, though I can see event clicks to them from the main site
On the flip side:
- I could lose the traffic the PDFs bring when a user opens one from an organic search, as well as any links within the PDF
What are your experiences?
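For context on what "noindex" means for a PDF: a PDF can't carry a meta robots tag, so the directive has to be sent as an `X-Robots-Tag` HTTP response header. Below is a minimal Python sketch of that idea. The function name `robots_headers_for` is hypothetical, and how the headers actually get attached depends on the server or framework serving the files, which isn't stated here.

```python
def robots_headers_for(path: str) -> dict[str, str]:
    """Return extra response headers for a requested URL path.

    Paths ending in ".pdf" get an X-Robots-Tag header asking search
    engines not to index them; every other path gets no extra headers.
    Query strings are not handled in this sketch.
    """
    if path.lower().endswith(".pdf"):
        return {"X-Robots-Tag": "noindex"}
    return {}


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(robots_headers_for("/guides/widget.pdf"))
    print(robots_headers_for("/guides/widget"))
```

The same header is usually set in the web server configuration rather than in application code; the sketch only shows which responses would carry it.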
-
Cool. It's advisable to add canonical HTTP headers to the PDFs too, if you can.
-
Thanks Alex,
I do have canonical tags on the webpages to ensure they are seen as the main one. I'll look into tracking subdomains.
-
Google now classes subdomains pretty much as part of your main domain: http://www.youtube.com/watch?v=_MswMYk05tk - so you will be getting some of that rank juice.
I'd think that the major search engines wouldn't have a problem knowing that an HTML version of a page is preferred over a PDF. However, you can use canonical HTTP headers to make sure there are no problems with duplicate content: http://moz.com/blog/how-to-advanced-relcanonical-http-headers
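To make the canonical header idea concrete, here is a rough Python sketch that builds the value of a `Link` response header pointing a PDF at its HTML equivalent. The host names `pdf.example.com` and `www.example.com`, the function name, and the assumption that each PDF lives at the same path as its webpage plus a `.pdf` extension are all hypothetical; adjust the mapping to however your URLs actually correspond.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hosts: replace with the real PDF subdomain and main domain.
PDF_HOST = "pdf.example.com"
MAIN_HOST = "www.example.com"


def canonical_link_header(pdf_url: str) -> str:
    """Build a Link header value naming the HTML page as canonical.

    Assumes (for illustration) that the PDF at /some/page.pdf on the
    subdomain mirrors the webpage at /some/page on the main domain.
    """
    parts = urlsplit(pdf_url)
    if parts.netloc != PDF_HOST or not parts.path.lower().endswith(".pdf"):
        raise ValueError(f"not a PDF on {PDF_HOST}: {pdf_url}")
    html_path = parts.path[:-4]  # drop the ".pdf" extension
    canonical = urlunsplit((parts.scheme, MAIN_HOST, html_path, "", ""))
    return f'<{canonical}>; rel="canonical"'


if __name__ == "__main__":
    print(canonical_link_header("https://pdf.example.com/guides/widget.pdf"))
```

The returned string would be sent as the value of the `Link` header on the PDF response, so the PDF can stay indexable while the webpage is named as the preferred version.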
If you use Google Analytics you will be able to track the subdomain. You can do it as part of your existing profile or by setting up a separate one: https://developers.google.com/analytics/devguides/collection/gajs/gaTrackingSite (ensure this is the version of Analytics you have installed).
There's a short guide here on getting more data about PDFs through Google Analytics: http://moz.com/ugc/how-to-track-pdf-traffic-links-in-google-analytics-open-site-explorer