How to allow googlebot past paywall
-
Does anyone know of any ways or ideas to allow Google/Bing etc. to index your content, but have it behind a paywall for users?
-
Thanks Mark,
I have been researching this idea from Google, but it is only for Google News and not Google Web Search.
Also, users would be able to jump the paywall by returning to Google News to search fro more links through to the site.
-
Google has a program called first click free - basically, you need to allow google bot, along with users, to view the first full article they land on. So if you have multiple page articles, you need to give them access to the entire article. After that though, the rest of the content can be behind a paywall.
You can read more about it here - http://support.google.com/webmasters/bin/answer.py?hl=en&answer=74536
And here are the technical guidelines for implementation - http://support.google.com/news/publisher/bin/answer.py?hl=en&answer=40543
Hope this helps,
Mark
-
Not possible. Google's not going to index something that is not accessible to everyone.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Will putting a one page site up for all other countries stop Googlebot from crawling my UK website?
I have a client that only wants UK users to be able to purchase from the UK site. Currently, there are customers from the US and other countries purchasing from the UK site. They want to have a single webpage that is displayed to users trying to access the UK site that are outside the UK. This is fine but what impact would this have on Google bots trying to crawl the UK website? I have scoured the web for an answer but can't find one. Any help will be greatly appreciated. Thanks 🙂
Technical SEO | | lbagley0 -
Googlebot and other spiders are searching for odd links in our website trying to understand why, and what to do about it.
I recently began work on an existing Wordpress website that was revamped about 3 months ago. https://thedoctorwithin.com. I'm a bit new to Wordpress, so I thought I should reach out to some of the experts in the community.Checking ‘Not found’ Crawl Errors in Google Search Console, I notice many irrelevant links that are not present in the website, nor the database, as near as I can tell. When checking the source of these irrelevant links, I notice they’re all generated from various pages in the site, as well as non-existing pages, allegedly in the site, even though these pages have never existed. For instance: https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/feedback-and-testimonials/ allegedly linked from: https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/ (doesn’t exist) In other cases, these goofy URLs are even linked from the sitemap. BTW - all the URLs in the sitemap are valid URLs. Currently, the site has a flat structure. Nearly all the content is merely URL/content/ without further breakdown (or subdirectories). Previous site versions had a more varied page organization, but what I'm seeing doesn't seem to reflect the current page organization, nor the previous page organization. Had a similar issue, due to use of Divi's search feature. Ended up with some pretty deep non-existent links branching off of /search/, such as: https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/feedback-and-testimonials/online-continuing-education/consultations/ allegedly linked from: https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/feedback-and-testimonials/online-continuing-education/ (doesn't exist). I blocked the /search/ branches via robots.txt. No real loss, since neither /search/ nor any of its subdirectories are valid. There are numerous pre-existing categories and tags on the site. The categories and tags aren't used as pages. I suspect Google, (and other engines,) might be creating arbitrary paths from these. Looking through the site’s 404 errors, I’m seeing the same behavior from Bing, Moz and other spiders, as well. I suppose I could use Search Console to remove URL/category/ and URL/tag/. I suppose I could do the same, in regards to other legitimate spiders / search engines. Perhaps it would be better to use Mod Rewrite to lead spiders to pages that actually do exist. Looking forward to suggestions about best way to deal with these errant searches. Also curious to learn about why these are occurring. Thank you.
Technical SEO | | linkjuiced0 -
Help with Getting Googlebot to See Google Charts
We received a message from Google saying we have an extremely high number of URLs that are linking to pages with similar or duplicate content. The main difference between these pages are the Google charts we use. It looks like Google isn't able to see these charts (most of the text are very similar) and the charts (lots of it) are the main differences between these pages. So my question is what is the best approach to allowing Google to see the data that exists in these charts? I read from here http://webmasters.stackexchange.com/questions/69818/how-can-i-get-google-to-index-content-that-is-written-into-the-page-with-javascr that a solution would be to have the text that is displayed on the charts coded into the html and hidden by CSS. I'm not sure but it seems like a bad idea to have it seen by Google but hidden to the user by CSS. It just sounds like a cloaking hack. Can someone clarify if this is even a solution or is there a better solution?
Technical SEO | | ERICompensationAnalytics1 -
Can Googlebot crawl the content on this page?
Hi all, I've read the posts in Google about Ajax and javascript (https://support.google.com/webmasters/answer/174992?hl=en) and also this post: http://moz.com/ugc/can-google-really-access-content-in-javascript-really. I am trying to evaluate if the content on this page, http://www.vwarcher.com/CustomerReviews, is crawlable by Googlebot? It appears not to be. I perused the sitemap and don't see any ugly Ajax URLs included as Google suggests doing. Also, the page is definitely indexed, but appears the content is only indexed via its original source (Yahoo!, Citysearch, Google+, etc.). I understand why they are using this dynamic content, because it looks nice to an end-user and requires little to no maintenance. But, is it providing them any SEO benefit? It appears to me that it would be far better to take these reviews and simply build them into HTML. Thoughts?
Technical SEO | | danatanseo0 -
We are using Hotlink Protection on our server for jpg mostly. What is moz.com address to allow crawl access?
We are using Hotlink Protection on our server for jpg mostly. What is moz.com crawl url address so we can allow it in the list of allowed domains? The reason is that the crawl statistics gives our a ton of 403 Forbidden errors. Thanks.
Technical SEO | | sergeywin10 -
Googlebot Crawl Rate causing site slowdown
I am hearing from my IT department that Googlebot is causing as massive slowdown/crash our site. We get 3.5 to 4 million pageviews a month and add 70-100 new articles on the website each day. We provide daily stock research and marke analysis, so its all high quality relevant content. Here are the crawl stats from WMT: http://imgur.com/dyIbf I have not worked with a lot of high volume high traffic sites before, but these crawl stats do not seem to be out of line. My team is getting pressure from the sysadmins to slow down the crawl rate, or block some or all of the site from GoogleBot. Do these crawl stats seem in line with sites? Would slowing down crawl rates have a big effect on rankings? Thanks
Technical SEO | | SuperMikeLewis0 -
Why is either Rogerbot or (if it is the case) Googlebots not recognizing keyword usage in my body text?
I have a client that does liposuction as one of their main services, they have been ranked in the top 1-5 for their keywords "sarasota liposuction" with different variations of the words for a long time, and suddenly have dropped about 10-12 places down to #15 in the engine. I went to investigate this and actually came to the "on-page analysis" tool for SEOmoz pro, where oddly enough it says that there is no mention of the target keyword in the body content (on-page analysis tool screenshot attached). I didn't quite understand why it would not recognize the obvious keywords in the body text so I went back to the page and inspected further. The keywords have an odd featured link that links up to an internally hosted keyword glossary for definitions of terms that people might not know directly. These definitions pop up in a lightbox upon clicking the keyword (liposuction lightbox screenshots attached). I have no idea why google would not recognize these words as they have the text in between the link, yet if there is something wrong with the code syntax etc. it might possibly hender the engine from seeing the body text of the link? any help would be greatly appreciated! Thank you so much! Phn2m Phn2m.png bWr5K.png V36CL.png
Technical SEO | | jbster130 -
Is Googlebot ignoring directives? Or is it Me?
I saw an answer to a question in this forum a few days ago, that said it was a bad idea to use robots.txt to tell googlebot to go away. That SEO said it was much better to use the META tag to say noindex,nofollow. So I removed the robots directive and added the META tag <meta robots='noindex,nofollow'> Today, I see google showing my send to a friend page where I expected the real page to be. Does it mean Google is stupid? Does it mean google ignores the Robots META tag? Does it mean short pages have more value than long pages? Does it mean if I convert my whole site to snippets, I'll get more traffic? Does it mean garbage trumps content? I have more questions, but this is more than enough.
Technical SEO | | loopyal0