PDFs and indexing
-
Hello and good morning.
I work for a paint manufacturing company in the UK on their SEO campaigns across a couple of websites, and this is my question. By law, paint and chemical products must have data sheets and tech sheets available to download. Should these be included in the sitemap? We auto-generate our sitemaps, which include these files with low priorities, and the file names never change.
They basically have a name like 092847.pdf, which cannot be changed, but from an SEO point of view that name doesn't mean a thing. So there's my question: should they be included, and would they carry any value?
-
Thank you. I'm not saying they couldn't be changed; it would just cause a lot of stress for our labs and tech guys, who create these and work by the number. With a descriptive naming structure, things would become a mess and everything would be up in the air.
I will look into the back-end keywords, author and company name fields, which may give them some sort of impact, from what I read on the link above.
-
Hi
Sitemaps - yes, include anything in your sitemaps that you want users to be able to find. The more ways you can lead a search engine to it, the better.
Filename - it would help if you could change the filenames to include keywords, but if that's not an option then there are other things you can do to optimise each PDF.
There's a good overview of optimising PDFs here - How To Optimize PDF Documents For Search
As that post mentions, include links back to your site for maximum value, especially if these documents are shared on other websites. Also, a bit of branding within each PDF (just add a logo) could help you out in some way.
Hope that's helpful
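For what it's worth, here is a minimal sketch of what sitemap entries for the PDFs could look like if you generate them with a small script. The base URL, the function name `build_pdf_sitemap` and the priority value of 0.3 are my own assumptions for illustration; only the file name 092847.pdf comes from the question, and your existing sitemap generator may well work differently.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_pdf_sitemap(base_url, pdf_names, priority="0.3"):
    """Return sitemap XML (as a string) with one <url> entry per PDF.

    base_url and priority are hypothetical example values, not taken
    from the original poster's site.
    """
    # Serialise the sitemap namespace as the default (un-prefixed) one.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for name in pdf_names:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url, f"{{{SITEMAP_NS}}}loc")
        loc.text = f"{base_url.rstrip('/')}/{name}"
        prio = ET.SubElement(url, f"{{{SITEMAP_NS}}}priority")
        prio.text = priority
    return ET.tostring(urlset, encoding="unicode")


# Example with the numeric file name from the question.
xml_text = build_pdf_sitemap("https://www.example.com/datasheets", ["092847.pdf"])
print(xml_text)
```

The numeric file names stay exactly as the labs use them; the sitemap only tells the search engine where the files live.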
-
Case A:
If the content of the PDFs is valuable, and it also contains some text about the product, I would make them indexable. It will help niche searchers find you.
You might want to make a separate sitemap for these PDFs, just to keep things clean.
Case B:
If it's only numbers and very technical jibber-jabber, I wouldn't let it be indexed, since Google won't understand it either.
Update with an interesting story:
A client of mine also had technical PDF sheets online. He had put a lot of effort into them. There were a few (4-5) competitors linking directly to the PDFs. After a while, we redirected all that competitor traffic to a special landing page explaining why my client is a better deal. It's still online on some of the sites, since some competitors never really checked the PDFs.
Made my client very happy.
-
Hey there
I can't imagine them having any SEO value, but I can't see the PDFs doing any harm either.
PDFs are crawlable and indexable by the search engines, so I would keep them in your sitemap for users. I'm quite familiar with your industry (my dad worked in supplying paint and chemical coatings) and I can imagine your target audience being quite specific in their searches, looking for products by code and specification. A PDF would probably be the ideal solution for this, so having it indexed and sitting on your domain could bring in some organic traffic.
I'd make sure that the PDFs are branded if possible and contain clear links back to your site, in order to funnel any long-tail traffic back to your homepage and sales pages.
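If you do decide that some sheets shouldn't be indexed (the "only numbers" case mentioned earlier in the thread), one option for non-HTML files such as PDFs is the `X-Robots-Tag: noindex` HTTP response header. Below is a rough sketch of the decision logic only. The function name `robots_headers` and the `index_pdfs` switch are hypothetical, and how you actually attach the header depends on your web server or framework.

```python
def robots_headers(path, index_pdfs):
    """Return extra HTTP response headers for the requested path.

    Hypothetical helper: when index_pdfs is False, PDF responses get an
    X-Robots-Tag header asking search engines not to index them.
    """
    if path.lower().endswith(".pdf") and not index_pdfs:
        return {"X-Robots-Tag": "noindex"}
    # Everything else (and PDFs you do want indexed) gets no extra header.
    return {}


# Example with the numeric file name from the question.
print(robots_headers("/datasheets/092847.pdf", index_pdfs=False))
print(robots_headers("/datasheets/092847.pdf", index_pdfs=True))
```

Bear in mind that the crawler has to be able to fetch the file to see the header, so a PDF handled this way shouldn't also be blocked in robots.txt.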