To noindex or not to noindex
-
Our website lets users test whether any given URL or keyword is censored in China. For each URL and keyword that a user looks up, a page is created, such as https://en.greatfire.org/facebook.com and https://zh.greatfire.org/keyword/freenet. From a search engine's perspective, all these pages look very similar. For this reason we have implemented a noindex function based on certain rules. Basically, only highly ranked websites are allowed to be indexed - all other URLs are tagged as noindex (for example https://en.greatfire.org/www.imdb.com). However, we are not sure that this is a good strategy, so we are asking: what should a website with a lot of similar content do?
- Don't noindex anything - let Google decide what is and isn't worth indexing.
- Noindex most content, but allow some popular pages to be indexed. This is our current approach. If you recommend this one, we would like to know what we can do to improve it.
- Noindex all the similar content. In our case, only allow pages with unique content, such as overview pages and blog posts, to be indexed.
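To make the second option concrete, here is a minimal sketch of a rule-based noindex decision. It is not the site's actual code: the `TestedUrl` type, the `global_rank` field, and the `RANK_THRESHOLD` cutoff are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff: only URLs ranked at or above this position stay indexable.
RANK_THRESHOLD = 1000


@dataclass
class TestedUrl:
    url: str
    global_rank: Optional[int]  # hypothetical popularity rank; None if unranked


def robots_meta(page: TestedUrl, threshold: int = RANK_THRESHOLD) -> str:
    """Return the robots meta tag for the page created for a tested URL."""
    indexable = page.global_rank is not None and page.global_rank <= threshold
    # "follow" is kept in both cases so links on the page can still be crawled.
    content = "index, follow" if indexable else "noindex, follow"
    return f'<meta name="robots" content="{content}">'


# Example with made-up ranks:
print(robots_meta(TestedUrl("popular.example", 5)))
print(robots_meta(TestedUrl("obscure.example", None)))
```

The first call prints an indexable tag and the second a noindex tag. The other two options correspond to always returning one branch or the other for URL pages.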
Another factor in our case is that our website is multilingual. All pages are available (and equally indexed) in Chinese and English. Should that affect our strategy?
References:
https://zh.greatfire.org
https://en.greatfire.org
https://www.google.com/search?q=site%3Agreatfire.org
-
1. Yes - if you noindex all but 20 pages, those 20 pages would get a boost in rankings. You would end up losing the long-tail searches from those other thousands of pages, so you'll need to do some cost / benefit analysis on that.
2. You'll need to do a cost / benefit analysis on this one too. Are most of the visitors to your site searching in Chinese or English? Are your search terms mainly in Chinese or mainly in English? Are your Chinese-speaking visitors more likely to want to visit the zh. subdomain?
You could publish 20 to 50 pages on each subdomain, and then focus on doing some link building. If you have strong rankings across those 40 to 100 pages, then you could start adding more pages slowly over time.
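As a starting point for the cost / benefit analysis in point 1, you could estimate how much of your current search traffic lands on pages that would be noindexed. This is only a rough sketch: the `visits_by_page` mapping stands in for whatever export your analytics tool provides, and the numbers below are made up.

```python
from typing import Dict, Set


def long_tail_share(visits_by_page: Dict[str, int], keep: Set[str]) -> float:
    """Fraction of search visits landing on pages outside the `keep` set."""
    total = sum(visits_by_page.values())
    if total == 0:
        return 0.0
    at_risk = sum(v for page, v in visits_by_page.items() if page not in keep)
    return at_risk / total


# Made-up numbers for illustration only.
visits = {"/popular-a": 60, "/popular-b": 30, "/long-tail-1": 6, "/long-tail-2": 4}
share = long_tail_share(visits, keep={"/popular-a", "/popular-b"})
print(f"{share:.0%} of search visits land on pages that would be noindexed")
```

If that share is small, the long-tail traffic you give up may be worth the possible ranking gain on the remaining pages; if it is large, it probably is not.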
-
Nope, there is no need to add the noindex tag. The canonical tag already tells Google which pages are the originals that the search engine should crawl and index, so all pages other than the category pages will be taken care of automatically.
-
Hi Moosa. Thanks very much for your reply and great suggestions. If I add canonical tags on each URL page referencing the category page where it belongs, should I also add noindex tags to it? Should all URL pages then have noindex tags, so that only category pages are indexed?
-
Thanks for the suggestions. I have some follow-up questions and would really like to know what you think about the following:
- The "page rank will get shared to all of the pages that you have across your site". In general, does this mean that if I add noindex tags to all but a few pages, they will be ranked much higher? Currently thousands of pages are indexed. Is it correct to say that if only say 20 pages were indexed that would greatly improve their ranking?
- The zh and en versions of the website have different templates and most of the text content is also translated (with the main exception of old blog posts). We could add noindex on all of the zh website or all except the main pages. Would you recommend that?
-
OK, I might sound completely stupid here, as I have never come across this case before, but here is my hypothesis...
When searching for a keyword or URL, you add another field (maybe a checkbox) that represents the category of the search.
So, once the new URL is generated, it will automatically come under the specific category.
Customize the category pages so that they look different from each other.
Index the category pages and, on every newly generated URL, add a canonical tag pointing to its category page. For example, if the newly generated page is http://www.yourwebsite.com/movies/ice-age-3/, it should have a canonical tag pointing to http://www.yourwebsite.com/movies/
Why?
Creating category pages will allow you to get more unique pages indexed in the SERPs without the duplicate content issue. Adding a canonical tag on all other URLs will tell Google that the category pages are the real pages it should consider.
This might give you more chances to earn search traffic from Google.
**This is only my assumption of what should work!
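A rough sketch of the canonical tag idea, assuming the category is simply the first path segment of the generated URL (that URL layout is my assumption, not something the site has today). Keep in mind that search engines treat the canonical tag as a hint rather than a directive.

```python
from urllib.parse import urlsplit, urlunsplit


def category_canonical(page_url: str) -> str:
    """Build a canonical link tag pointing at the page's category page."""
    parts = urlsplit(page_url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: the first path segment names the category, e.g. "movies".
    category_path = f"/{segments[0]}/" if segments else "/"
    canonical = urlunsplit((parts.scheme, parts.netloc, category_path, "", ""))
    return f'<link rel="canonical" href="{canonical}">'


print(category_canonical("http://www.yourwebsite.com/movies/ice-age-3/"))
# <link rel="canonical" href="http://www.yourwebsite.com/movies/">
```

The returned tag would go in the head of each generated page, while the category pages themselves stay indexable.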
-
Creating a page every time someone performs a search could probably spiral out of control pretty quickly. If you have a certain amount of 'page rank', based on all of the back links you have, that page rank will get shared to all of the pages that you have across your site.
One way you could more naturally control what gets indexed is by what you link to from your home page. For instance, if you track the most blocked big sites, as well as the most blocked keywords, and keep those pages one link away from your homepage, you could expect them to get indexed naturally when your site is spidered.
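Picking those homepage links could look something like the sketch below. The `TrackedItem` fields, including the `lookups` popularity measure, are hypothetical; use whatever ranking signal you already store.

```python
from typing import Iterable, List, NamedTuple


class TrackedItem(NamedTuple):
    path: str      # page path on the site, e.g. "/keyword/example"
    lookups: int   # hypothetical popularity measure (number of user lookups)
    blocked: bool  # whether the latest test found it blocked


def homepage_links(items: Iterable[TrackedItem], limit: int = 20) -> List[str]:
    """Return paths of the most looked-up blocked items, most popular first."""
    blocked = [item for item in items if item.blocked]
    blocked.sort(key=lambda item: item.lookups, reverse=True)
    return [item.path for item in blocked[:limit]]


# Made-up data for illustration only.
items = [
    TrackedItem("/a.example", 50, True),
    TrackedItem("/b.example", 90, False),
    TrackedItem("/keyword/x", 70, True),
]
print(homepage_links(items, limit=2))
```

Only the pages returned here would be linked directly from the home page; everything else stays reachable through search on the site.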
As you get more links from other sites, and your trust from the search engines and page rank grows, you should be able to support more pages getting indexed across your site.
There is the issue of your site contents potentially being regarded as 'thin content', since many of the pages appear to be the same from page to page.
One question I had - I saw your site hosts both Chinese language words and English language words, and checks whether those words are being filtered. Perhaps it would make more sense to only show the words in Chinese characters on the zh. subdomain, and the English words on the en. subdomain? Just a thought. Is there any difference between the zh and en subdomains, aside from the language of the template?
Really interesting website.
Related Questions
-
Should I use NoIndex on short-lived pages?
Hello, I have a large number of product pages on my site that are relatively short-lived: probably in the region of a million+ pages that are created and then removed within a 24 hour period. Previously these pages were being indexed by Google and did receive landings, but in recent times I've been applying a NoIndex tag to them. I've been doing that as a way of managing our crawl budget but also because the 410 pages that we serve when one of these product pages is gone are quite weak and deliver a relatively poor user experience. We're working to address the quality of those 410 pages but my question is should I be no-indexing these product pages in the first place? Any thoughts or comments would be welcome. Thanks.
Intermediate & Advanced SEO | | PhilipHGray0 -
Why is our noindex tag not working?
Hi, I have the following page where we've implemented a noindex tag. But when we run this page in Screaming Frog or this tool here to verify that the noindex is present and functioning, it shows that it's not. But if you view the source of the page, the code is present in the head tag. And unfortunately we've seen instances where Google is indexing pages we've noindexed. Any thoughts on the example above or why this is happening in Google? Eddy
Intermediate & Advanced SEO | | eddys_kap0 -
Conditional Noindex for Dynamic Listing Pages?
Hi, We have dynamic listing pages that are sometimes populated and sometimes not populated. They are clinical trial results pages for disease types, some of which don't always have trials open. This means that sometimes the CMS produces a blank page -- pages that are then flagged as thin content. We're considering implementing a conditional noindex -- where the page is indexed only if there are results. However, I'm concerned that this will be confusing to Google and send a negative ranking signal. Any advice would be super helpful. Thanks!
Intermediate & Advanced SEO | | yaelslater0 -
Noindex / Nofollow multiple reviews pages?
I have well over a hundred pages of reviews (10 per page). I know this is solid content and I'd hate to not be able to leverage it, but I'm running into the issue of having duplicate title tags and H1s on all of the pages. What's the best way to make use of the review content without having those types of issues? Is a noindex / nofollow strategy something I should be considering here for Page 2 and beyond? Thanks! Edit: I did additional digging into pagination strategies and found this terrific article on Moz. I'm thinking it should address my questions regarding review pages as well.
Intermediate & Advanced SEO | | Andrew_Mac0 -
Noindex
I have been reading a lot of conflicting information on the Link Juice ramifications of using "NoIndex". Can I get some advice for the following situation? 1. I have pages that I do not want indexed on my site. They are lead conversion pages. Just about every page on my site has links to them. If I just apply a standard link, those pages will get a ton of Link Juice that I'd like to allocate to other pages. 2. If I use "nofollow", the pages won't rank, but the link juice evaporates. I get that. I won't use "nofollow" 3. I have read that "noindex, follow" will block the pages in the SERPs, but will pass Link Juice to them. I don't think that I want this either. If I "dead end" the lead form with no navigation or links, will the juice be locked up on the page? 4. I assume that I should block the pages in robots.txt In order to keep the pages out of the SERPs, and conserve Link Juice, what should I do? Can someone please give me a step by step process with the reasoning for what I should do here?
Intermediate & Advanced SEO | | CsmBill0 -
Why is a page with a noindex code being indexed?
I was looking through the pages indexed by Google (with site:www.mywebsite.com) and one of the results was a page with "noindex, follow" in the code that seems to be a page generated by blog searches. Any ideas why it seems to be indexed or how to de-index it?
Intermediate & Advanced SEO | | theLotter0 -
Canonical vs noindex for blog tags
Our blog started to use tags and I know this is bad for Panda, but our product team wants to use them for user experience. Should we canonicalize these tags to the original blog URL or noindex them?
Intermediate & Advanced SEO | | nicole.healthline0 -
NOINDEX content still showing in SERPS after 2 months
I have a website that was likely hit by Panda or some other algorithm change. The hit finally occurred in September of 2011. In December my developer set the following meta tag on all pages that do not have unique content: <meta name="robots" content="NOINDEX" /> It's been 2 months now and I feel I've been patient, but Google is still showing 10,000+ pages when I do a search for site:http://www.mydomain.com I am looking for a quicker solution. Adding this many pages to the robots.txt does not seem like a sound option. The pages have been removed from the sitemap (for about a month now). I am trying to determine the best of the following options or find better options.
- 301 all the pages I want out of the index to a single URL based on the page type (location and product). The 301 worries me a bit because I'd have about 10,000 or so pages all 301ing to one or two URLs. However, I'd get some link juice to that page, right?
- Issue an HTTP 404 code on all the pages I want out of the index. The 404 code seems like the safest bet, but I am wondering if that will have a negative impact on my site with Google seeing 10,000+ 404 errors all of a sudden.
- Issue an HTTP 410 code on all pages I want out of the index. I've never used the 410 code and while most of those pages are never coming back, eventually I will bring a small percentage back online as I add fresh new content. This one scares me the most, but I am interested to know if anyone has ever used a 410 code.
Please advise and thanks for reading.
Intermediate & Advanced SEO | | NormanNewsome0