Can Google index the text content in a PDF?
-
I really thought the answer was always no. There are plenty of other things you can do to improve search visibility for a PDF, but I thought the nature of the file type made the content itself unparsable by search engine crawlers...
But now, my client's competitor is ranking for my client's brand name with a PDF that contains comparison content.
Thing is, my client's brand isn't in the title, the alt text, the URL... it's only in the actual text of the PDF.
Did I miss a major update? Did I always have this wrong?
-
Yes, Google can crawl and index the contents of PDFs, and it does so extensively. It's nothing new, actually. As long as the PDF contains real text and not only images, Google will be able to read that text.
Interesting article with tips to make your PDFs SEO-friendly: https://www.searchenginejournal.com/pdf-seo-best-practices/59975/
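If you want to check this for a specific phrase, one rough way is to combine the site: and filetype:pdf search operators with the phrase in quotes. Here is a minimal Python sketch that builds such a search URL. The function name, example.com, and "Acme Widgets" are placeholders made up for illustration, not anything from the case above:

```python
from urllib.parse import urlencode


def pdf_text_query_url(domain: str, phrase: str) -> str:
    """Build a Google search URL limited to PDFs on one domain
    that contain the exact phrase."""
    # site: restricts results to the domain, filetype:pdf restricts them
    # to PDF files, and the quotes ask for an exact-phrase match.
    query = f'site:{domain} filetype:pdf "{phrase}"'
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Hypothetical domain and brand name, for illustration only.
    print(pdf_text_query_url("example.com", "Acme Widgets"))
```

Open the printed URL in a browser. If a PDF comes back for a phrase that only appears in the body of the file, that suggests the text itself has been indexed.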
Cheers,
Cesare