Getting Google to index a large PDF file
-
Hello! We have a 100+ MB PDF with multiple pages that we want Google to fully index on our server/website. First of all, is it even possible for Google to index a PDF file of this size?
It's been up on our server for a few days, and my colleague did a Googlebot fetch via Webmaster Tools, but it still hasn't happened yet.
My theories as to why this may not work:
A) We have no actual link(s) to the pdf anywhere on our website.
B) This PDF is approx 130 MB and very slow to load. I added some compression to it, but that only got it down to 105 MB.
Any tips or suggestions on getting this thing indexed in Google would be appreciated. Thanks!
-
- Link to it
- Add it to sitemap
- Compress it and make it load faster
My guess is that the file size is too large. Why not turn it into an HTML page? Or several pages?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My brand name has 2 words but Google only indexing as 1 word. Is there a fix?
Hi all...I'm at a loss. I've never had this happen. Google only shows pages of my site when I search the brand name as one word. When I Google the site as one word BrandBrand- it only shows my blog page and about us page plus Twitter and Facebook on page 1. The homepage does not show up at all. When I Google the site as two words Brand Brand - My Facebook page is on page 1 but nothing else. The homepage isn't showing up at all. When I search both words on Bing and Yahoo both are indexing it as two words and shows on page 1. Any ideas?
Technical SEO | | TexasBlogger0 -
Homepage no longer indexed in Google
Have been working on a site and the hompage has recently vanished from Google. I submit the site to Google webmaster tools a couple of days ago and checked today and the homepage has vanished. There are no no follow tags, and no robots.txt stopping the page from being crawled. It's a bit of a worry, the site is http://www.beyondthedeal.com
Technical SEO | | tonysandwich
Any insights would be massively appreciated! Thanks.0 -
How can I best find out which URLs from large sitemaps aren't indexed?
I have about a dozen sitemaps with a total of just over 300,000 urls in them. These have been carefully created to only select the content that I feel is above a certain threshold. However, Google says they have only indexed 230,000 of these urls. Now I'm wondering, how can I best go about working out which URLs they haven't indexed? No errors are showing in WMT related to these pages. I can obviously manually start hitting it, but surely there's a better way?
Technical SEO | | rango0 -
Problem with indexed files before domain was purchased
Hello everybody, We bought this domain a few months back and we're trying to figure out how to get rid of indexed pages that (i assume) existed before we bought this domain - the domain was registered in 2001 and had a few owners. I attached 3 files from my webmasters tools, can anyone tell me how to get rid of those "pages" and more important: aren't this kind of "pages" result of some kind of "sabotage"? Looking forward to hearing your thoughts on this. Thank you, Alex Picture-5.png Picture-6.png Picture-7.png
Technical SEO | | pwpaneuro0 -
How do I get content to be indexed at the top?
I have a paragraph at the top of my homepage. I was told I could use css to make the content visually appear at the bottom of the page but it would still get indexed at the top of the page, still giving it the same level of importance. Can anyone tell me how to do this?
Technical SEO | | BradBorst0 -
How can I get unimportant pages out of Google?
Hi Guys, I have a (newbie) question, untill recently I didn't had my robot.txt written properly so Google indexed around 1900 pages of my site, but only 380 pages are real pages, the rest are all /tag/ or /comment/ pages from my blog. I now have setup the sitemap and the robot.txt properly but how can I get the other pages out of Google? Is there a trick or will it just take a little time for Google to take out the pages? Thanks! Ramon
Technical SEO | | DennisForte0 -
Existing Pages in Google Index and Changing URLs
Hi!! I am launching a newly recoded site this week and had a another noobie question. The URL structure has changed slightly and I have installed a 301 redirect to take care of that. I am wondering how Google will handle my "old" pages? Will they just fall out of the index? Or does the 301 redirect tell Google to rewrite the URLs in the index? I am just concerned I may see an "old" page and a "new" page with the same content in the index. Just want to make sure I have covered all my bases. Thanks!! Lynn
Technical SEO | | hiphound0 -
How do I get Google to display categories instead of the URL in results?
I've seen that for some domains Google will show a nice clickable site heirarchy in place of the actual URL of a search result. See attached for an example. How do I go about achieving this type of results? categorized.png
Technical SEO | | Carlito-2569610