Can Google index the text content in a PDF?
-
I really really thought the answer was always no. There's plenty of other things you can do to improve search visibility for a PDF, but I thought the nature of the file type made the content itself not-parsable by search engine crawlers...
But now, my client's competitor is ranking for my client's brand name with a PDF that contains comparison content.
Thing is, my client's brand isn't in the title, the alt-text, the url... it's only in the actual text of the PDF.
Did I miss a major update? Did I always have this wrong?
-
Yes they can crawl and index also the contents of PDF's and they are doing that extensively. Its nothing new actually. As long as the contents of the PDF is not only images but also text they will be able to scan the actual text.
Interesting article with tips to make your PDF's SEO-friendly: https://www.searchenginejournal.com/pdf-seo-best-practices/59975/
Cheers,
Cesare
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Content Issues: Duplicate Content
Hi there
Technical SEO | | Kingagogomarketing
Moz flagged the following content issues, the page has duplicate content and missing canonical tags.
What is the best solution to do? Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/industrial-flooring/ Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/index.php/industrial-flooring Industrial Flooring » IRL Group Ltd
https://irlgroup.co.uk/index.php/industrial-flooring/0 -
URLs dropping from index (Crawled, currently not indexed)
I've noticed that some of our URLs have recently dropped completely out of Google's index. When carrying out a URL inspection in GSC, it comes up with 'Crawled, currently not indexed'. Strangely, I've also noticed that under referring page it says 'None detected', which is definitely not the case. I wonder if it could be something to do with the following? https://www.seroundtable.com/google-ranking-index-drop-30192.html - It seems to be a bug affecting quite a few people. Here are a few examples of the URLs that have gone missing: https://www.ihasco.co.uk/courses/detail/sexual-harassment-awareness-training https://www.ihasco.co.uk/courses/detail/conflict-resolution-training https://www.ihasco.co.uk/courses/detail/prevent-duty-training Any help here would be massively appreciated!
Technical SEO | | iHasco0 -
Meta Titles and Meta Descriptions are not Indexing in Google
Hello Every one, I have a Wordpress website in which i installed All in SEO plugin and wrote meta titles and descriptions for each and every page and posts and submitted website to index. But after Google crawl the Meta Titles and Descriptions shown by Google are something different that are not found in Content. Even i verified the Cached version of the website and gone through Source code that crawled at that moment. the meta title which i have written is present there. Apart from this, the same URL's are displaying perfect meta titles and descriptions which i wrote in Yahoo and Bing Search Engines. Can anyone explain me how to resolve this issue. Website URL: thenewyou (dot) in Regards,
Technical SEO | | SatishSEOSiren0 -
How can I provide titles and descriptive text for our list of USPs on the same page optimized both for usability and SEO
I am rebuilding our website together with an agency and I am stuck with the following problem: We have a page which will provide the visitor with a quick and convincing impression why he should chose our enterprise. On this page we want to show our USPs (Unique Selling Points) each with a title and a short description. Now my preferred way of presenting those USPs would be of a list of the titles (which permits to see all USPs without having to read a lot of text) where each title can be clicked to expand the description (in case you want to know more about this specific USP) and if you click on another title the previously clicked title description will collapse and the new description expand and so on (similar to this page: http://www.berlin-city-immobilien.de/38.html - I'm talking about the list in the middle of the page starting with the headline "Dabei profitieren Sie von folgenden Vorteilen"). Since I also want to use these descriptions as on page SEO-texts I checked whether Google might not index or at least value "click to expand content" less than plain text in the body of the page and I stumbled over this article: https://www.seroundtable.com/google-hidden-tab-content-seo-19489.html. According to this article Google will definitely discount the descriptions on my page. Does anyone have an idea how to solve this problem? Either by suggesting a different way to show titles and descriptions on the page or maybe by suggesting a workaround so Google will not treat the descriptions as "click to expand text". Thank you already in advance for your input.
Technical SEO | | Benni
Ben0 -
How can I stop google indexing an image
I have put a map of cornwall on my site on the Corwnall Page, and for some reason Google.de has picked it up and shows it up in the top 4 images for a search for cornwall? The result is I am getting about 80% of the traffic coming to my site for the search Cornwall (I get about 50 unique visits per day, over 40 a day are landing on the Cornwall page. Is this a problem for my normal SEO as a Close up Magician? Will google start to think my site is about Cornwall? Should I noindex the image (I say that like I know how! - How do I noindex that image? ) Or is any traffic to a site good traffic, I imagine they will be clicking on the link landing on the page and then leaving, which I suspect is not good for google reputation. Any thoughts anyone Thanks Roger http://www.rogerlapin.co.uk Where they land http://www.google.de/imgres?imgurl=http://www.rogerlapin.co.uk/wp-content/uploads/2013/09/map-of-cornwall.jpg&imgrefurl=http://www.rogerlapin.co.uk/magician-cornwall-magicians-hire-cornwall&h=904&w=1000&sz=167&tbnid=9GFlDv3BTz4ikM:&tbnh=99&tbnw=110&zoom=1&usg=__-b4bUYWREU_wAy2M04LrsrkzZpw=&docid=AUFmzso0arbGDM&sa=X&ei=HLZ2UpGYDMrY0QWXp4D4Dg&ved=0CEgQ9QEwAw&dur=2958
Technical SEO | | rnperki0 -
Pages not indexed by Google
We recently deleted all the nofollow values on our website. (2 weeks ago) The number of pages indexed by google is the same as before? Do you have explanations for this? website : www.probikeshop.fr
Technical SEO | | Probikeshop0 -
Forget Duplicate Content, What to do With Very Similar Content?
All, I operate a Wordpress blog site that focuses on one specific area of the law. Our contributors are attorneys from across the country who write about our niche topic. I've done away with syndicated posts, but we still have numerous articles addressing many of the same issues/topics. In some cases 15 posts might address the same issue. The content isn't duplicate but it is very similar, outlining the same rules of law etc. I've had an SEO I trust tell me I should 301 some of the similar posts to one authoritative post on the subject. Is this a good idea? Would I be better served implementing canonical tags pointing to the "best of breed" on each subject? Or would I be better off being grateful that I receive original content on my niche topic and not doing anything? Would really appreciate some feedback. John
Technical SEO | | JSOC0 -
Google Panda and ticketing sites: quality of content
Hi from Madrid! I am managing the Marketing Department of a ticketing site in Europe similar to Stubhub.com. We have thousands of events and, until now, we used templates for their descriptions. A lot of events share the same description with minor changes. They also have a lot of tickets on sale, so that's unique content different on each event. Now the last Google Panda update hit Europe and I was wondering if that will affect us a lot. It's hard to tell for now, because we are in the middle of the summer and the volume of searches in our industry depends decreases a lot during this time of the year. I know that ideally we should have unique descriptions but that would need a lot of resources and they are not important for our users: they just want to know the venue, the time and the price of the tickets! Have you experienced something about Google Panda update with a similar site or with another e-commerce industry? Thanks!
Technical SEO | | jorgediaz0