Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Getting Pages Requiring Login Indexed
-
Somehow certain newspapers' webpages show up in the index but require login. My client has a whole section of the site that requires a login (registration is free), and we'd love to get that content indexed. The developer offered to remove the login requirement for specific user agents (eg Googlebot, et al.). I am afraid this might get us penalized.
Any insight?
-
My guess: It's possible, but it would be an uphill battle. The reason being Google would likely see the page as a duplicate of all the other pages on your site with a login form. Not only does Google tend to drop duplicate pages from it's index (especially if it has a duplicate title tag - more leeway is giving the more unique elements you can place on a page) but now you face a situation where you have lots of duplicate or "thin" pages, which is juicy meat for a Panda-like penalty. Generally, you want to keep this pages out of the index, so it's a catch 22.
-
That makes sense. I am looking into whether any portion of our content can be made public in a way that would still comply with industry regulations. I am betting against it.
Does anyone know whether a page requiring login like this could feasibly rank with a strong backlink profile or a lot of quality social mentions?
-
The reason Google likes the "first click free" method is because they want the user to have a good result. They don't want users to click on a search result, then see something else on that page entirely, such as a login form.
So technically showing one set of pages to Google and another to users is considered cloaking. It's very likely that Google will figure out what's happening - either through manual review, human search quality raters, bounce rate, etc - and take appropriate actions against your site.
Of course, there's no guarantee this will happen, and you could argue that the cloaking wasn't done to deceive users, but the risk is high enough to warrant major consideration.
Are there any other options for displaying even part of the content, other than "first-click-free"? For example, can you display a snippet or few paragraphs of the information, then require login to see the rest? This at least would give Google something to index.
Unfortunately, most other methods for getting anything indexed without actually showing it to users would likely be considered blackhat.
Cyrus
-
Should have read the target:
"Subscription designation, snippets only: If First Click Free isn't a feasible option for you, we will display the "subscription" tag next to the publication name of all sources that greet our users with a subscription or registration form. This signals to our users that they may be required to register or subscribe on your site in order to access the article. This setting will only apply to Google News results.
If you prefer this option, please display a snippet of your article that is at least 80 words long and includes either an excerpt or a summary of the specific article. Since we do not permit "cloaking" -- the practice of showing Googlebot a full version of your article while showing users the subscription or registration version -- we will only crawl and display your content based on the article snippets you provide. If you currently cloak for Googlebot-news but not for Googlebot, you do not need to make any changes; Google News crawls with Googlebot and automatically uses the 80-word snippet.
NOTE: If you cloak for Googlebot, your site may be subject to Google Webmaster penalties. Please review Webmaster Guidelines to learn about best practices."
-
"In order to successfully crawl your site, Google needs to be able to crawl your content without filling out a registration form. The easiest way to do this is to configure your webservers not to serve the registration page to our crawlers (when the user-agent is "Googlebot") so that Googlebot can crawl these pages successfully. You can choose to allow Googlebot access to some restricted pages but not others. More information about technical requirements."
-http://support.google.com/webmasters/bin/answer.py?hl=en&answer=74536
Any harm in doing this while not implementing the rest of First Click Free??
-
What would you guys think about programming the login requirement behavior in such a way that only Google can't execute it--so Google wouldn't know that it is the only one getting through?
Not sure whether this is technically possible, but if it were, would it be theoretically likely to incur a penalty? Or is it foolish for other reasons?
-
Good idea--I'll have to determine precisely what I can and cannot show publicly and see if there isn't something I can do to leverage that.
I've heard about staying away from agent-specific content, but I wonder what the data are and whether there are any successful attempts?
-
First click free unfortunately won't work for us.
How might I go about determining how adult content sites handle this issue?
-
Have you considered allowing only a certain proportion of each page to show to any visitors including search engines. This way your pages will have some specific content that can be indexed and help you rank in the SERPs.
I have seen it done where publications behind a pay wall only allow the first paragraph or two to show - just enough to get them ranked appropriately but not enough to stop user wanting to register to access the full articles when they find them either through the SERPs, other sites or directly.
However for this to work it all depends on what the regualtions you mention require - would a proportion of the content being shown to all be ok??
I would definitely stay away from serving up different content to different users if I were you as this is likely to end up causing you trouble in the search engines..
-
I believe newspapers use a feature called "first click free" that enables this to work. I don't know if that will work with your industry regulations or not, however. You may also want to see how sites that deal with adult content, such as liquor sites, have a restriction for viewing let allow indexing.
-
Understood. The login requirement is necessary for compliance with industry regulations. My questions is whether I will be penalized for serving agent-specific content and/or whether there is a better way to get these pages in the index.
-
Search engines aren't good at completing online forms (such as a login), and thus any content contained behind them may remain hidden, so the developers option sounds like a good solution.
You may want to read:
http://www.seomoz.org/beginners-guide-to-seo/why-search-engine-marketing-is-necessary
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
No Index thousands of thin content pages?
Hello all! I'm working on a site that features a service marketed to community leaders that allows the citizens of that community log 311 type issues such as potholes, broken streetlights, etc. The "marketing" front of the site is 10-12 pages of content to be optimized for the community leader searchers however, as you can imagine there are thousands and thousands of pages of one or two line complaints such as, "There is a pothole on Main St. and 3rd." These complaint pages are not about the service, and I'm thinking not helpful to my end goal of gaining awareness of the service through search for the community leaders. Community leaders are searching for "311 request service", not "potholes on main street". Should all of these "complaint" pages be NOINDEX'd? What if there are a number of quality links pointing to the complaint pages? Do I have to worry about losing Domain Authority if I do NOINDEX them? Thanks for any input. Ken
Intermediate & Advanced SEO | | KenSchaefer0 -
Google Indexing Of Pages As HTTPS vs HTTP
We recently updated our site to be mobile optimized. As part of the update, we had also planned on adding SSL security to the site. However, we use an iframe on a lot of our site pages from a third party vendor for real estate listings and that iframe was not SSL friendly and the vendor does not have that solution yet. So, those iframes weren't displaying the content. As a result, we had to shift gears and go back to just being http and not the new https that we were hoping for. However, google seems to have indexed a lot of our pages as https and gives a security error to any visitors. The new site was launched about a week ago and there was code in the htaccess file that was pushing to www and https. I have fixed the htaccess file to no longer have https. My questions is will google "reindex" the site once it recognizes the new htaccess commands in the next couple weeks?
Intermediate & Advanced SEO | | vikasnwu1 -
Password Protected Page(s) Indexed
Hi, I am wondering if my website can get a penalty if some password protected pages are showing up when I search on google: site:www.example.com/sub-group/pass-word-protected-page That shows that my password protected page was indexed either before or after adding the password protection. I've seen people suggest no indexing the page. Is that the best method to take care of this? What if we are planning on pushing the page live later on? All of these pages have no title tag, meta description, image alt text, etc. Should I add them for each page? I am wondering what is the best step, especially if we are planning on pushing the page(s) live. Thanks for any help!
Intermediate & Advanced SEO | | aua0 -
Do internal links from non-indexed pages matter?
Hi everybody! Here's my question. After a site migration, a client has seen a big drop in rankings. We're trying to narrow down the issue. It seems that they have lost around 15,000 links following the switch, but these came from pages that were blocked in the robots.txt file. I was wondering if there was any research that has been done on the impact of internal links from no-indexed pages. Would be great to hear your thoughts! Sam
Intermediate & Advanced SEO | | Blink-SEO0 -
Is there a way to get a list of Total Indexed pages from Google Webmaster Tools?
I'm doing a detailed analysis of how Google sees and indexes our website and we have found that there are 240,256 pages in the index which is way too many. It's an e-commerce site that needs some tidying up. I'm working with an SEO specialist to set up URL parameters and put information in to the robots.txt file so the excess pages aren't indexed (we shouldn't have any more than around 3,00 - 4,000 pages) but we're struggling to find a way to get a list of these 240,256 pages as it would be helpful information in deciding what to put in the robots.txt file and which URL's we should ask Google to remove. Is there a way to get a list of the URL's indexed? We can't find it in the Google Webmaster Tools.
Intermediate & Advanced SEO | | sparrowdog0 -
Why does my home page show up in search results instead of my target page for a specific keyword?
I am using Wordpress and am targeting a specific keyword..and am using Yoast SEO if that question comes up.. and I am at 100% as far as what they recommend for on page optimization. The target html page is a "POST" and not a "Page" using Wordpress definitions. Also, I am using this Pinterest style theme here http://pinclone.net/demo/ - which makes the post a sort of "pop-up" - but I started with a different theme and the results below were always the case..so I don't know if that is a factor or not. (I promise .. this is not a clever spammy attempt to promote their theme - in fact parts of it don't even work for me yet so I would not recommend it just yet...) I DO show up on the first page for my keyword.. however.. instead of Google showing the page www.mywebsite.com/this-is-my-targeted-keyword-page.htm Google shows www.mywebsite.com in the results instead. The problem being - if the traffic goes only to my home page.. they will be less likely to stay if they dont find what they want immediately and have to search for it.. Any suggestions would be appreciated!
Intermediate & Advanced SEO | | chunkyvittles0 -
Number of Indexed Pages are Continuously Going Down
I am working on online retail stores. Initially, Google have indexed 10K+ pages of my website. I have checked number of indexed page before one week and pages were 8K+. Today, number of indexed pages are 7680. I can't understand why should it happen and How can fix it? I want to index maximum pages of my website.
Intermediate & Advanced SEO | | CommercePundit0 -
Tool to calculate the number of pages in Google's index?
When working with a very large site, are there any tools that will help you calculate the number of links in the Google index? I know you can use site:www.domain.com to see all the links indexed for a particular url. But what if you want to see the number of pages indexed for 100 different subdirectories (i.e. www.domain.com/a, www.domain.com/b)? is there a tool to help automate the process of finding the number of pages from each subdirectory in Google's index?
Intermediate & Advanced SEO | | nicole.healthline0