How long does it take for customized Google Site Search to show results from pdf files?
-
The site in question is http://www.ejmh.eu
I am pretty unsatisfied with the results I am getting from the Site Search provided by Google.
We have over 160 PDF files in this subfolder: http://www.ejmh.eu/mellekletek
The files are the digital versions of articles. When I search for content in those PDF files, Google does not show results. It does show results from older pages, dating back 1-2 years, but it is not showing anything from the PDF files that I put up 3 weeks ago.
My questions:
If I place a Google Search on a site, does it not automatically display results from ALL the content in the root domain?
Is there any correlation between how the Site Search is indexing the files and how Google is indexing the urls in general?
Should I just wait and see whether Site Search performance improves, or should I switch to other search software such as Zoom Search?
It is vital to have a proper, high-quality search functioning on that site in the very near future.
What are your experiences? Any tips are greatly appreciated.
-
Hi, everyone: problem solved.
Here is what I did: I created a separate sitemap XML file and linked to all the new PDFs.
I updated the general sitemap.xml and linked to the new sitemap as well.
I (re)submitted both sitemaps via the Webmaster Tools.
Within a few hours, most of the PDFs got indexed and the overall quality of search has improved dramatically. Thanks for all your help.
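For anyone who wants to script the sitemap step, here is a minimal sketch in Python of how such files could be generated. The base URL is the subfolder mentioned in the question; the PDF file names, the sitemap file names, and the function names are hypothetical placeholders. The sitemap index shown is one standard way in the sitemaps.org protocol to point at several sitemaps; it is not necessarily the exact layout used above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_pdf_sitemap(base_url, filenames):
    """Return a <urlset> sitemap (as a string) with one <url> per PDF."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for name in filenames:
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        # Percent-encode the file name so spaces etc. yield a valid URL.
        loc.text = f"{base_url.rstrip('/')}/{quote(name)}"
    return ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> (as a string) that references each sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        loc = ET.SubElement(sitemap, "loc")
        loc.text = sitemap_url
    return ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file names; replace with the real list of PDFs.
    pdfs = ["example_article_1.pdf", "example_article_2.pdf"]
    print(build_pdf_sitemap("http://www.ejmh.eu/mellekletek", pdfs))
    # Hypothetical sitemap locations on the same site.
    print(build_sitemap_index([
        "http://www.ejmh.eu/sitemap.xml",
        "http://www.ejmh.eu/sitemap-pdf.xml",
    ]))
```

The generated files would still need to be uploaded to the site and submitted through Webmaster Tools, as described above.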
-
It may be a good idea to include all the PDF files in the sitemap, even if it is a troublesome process.
Otherwise it just takes too long for Google to index them.
What still surprises me is that even for a site search, you need to win the 'indexing battle'. I thought that Google indexes everything within the sitemap for the sake of the site search and displays the results when a visitor is searching within the site. Less fancy software actually does the job. I thought Google Site Search would provide something even better.
-
Last crawl - thanks, great info.
Yes, all new PDFs are linked from the HTML files.
This is the summary page of one article: http://www.ejmh.eu/5archives_ppr_jaggle_061.html
In the middle of the page, you see 'download full text' - this is where the individual papers (PDF) are linked from.
-
Do you have the new PDFs linked from pages, like the old ones?
Try creating a page that lists all the new PDFs. Google may simply need time to recrawl your site and add these new PDFs (by the way, the last copy saved in Google Cache is from Feb 11).
-
You are great, thanks for your time. Yeah, I did check things out with this Google command: there are PDFs listed, but these are all old PDFs I put up a long time ago. None of the PDFs I have put up recently are among those indexed.
Do you think that only those URLs that are indexed by Google come up through a customized site search? Does Google not crawl the site and make a list of URLs purely for the sake of the site search? (Zoom Search does it, for example.) In theory, there could be two different types of 'crawls': one for the site search and one for the larger world, searching in the browser.
As for the settings... can you please help me further: what exactly would you change?
-
If you check here, all the PDFs are indexed in Google,
so I will check the settings on the CSE.
Reference here: http://www.google.com/cse/docs/resultsxml.html#wsQueryTerms
-
Thanks for the tip, it's a good one. But they are all 100% text.
-
If a search engine cannot read the text because it is a graphic rather than text, it won't be able to fully index the words in the document.
So make sure all your PDFs are 100% text that was converted to PDF, and not a "scan" (image) of the original document that was saved as a PDF.