Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to check if the page is indexable for SEs?
-
Hi, I'm building the extension for Chrome, which should show me the status of the indexability of the page I'm on.
So, I need to know all the methods to check if the page has the potential to be crawled and indexed by a Search Engines. I've come up with a few methods:
- Check the URL in robots.txt file (if it's not disallowed)
- Check page metas (if there are not noindex meta)
- Check if page is the same for unregistered users (for those pages only available for registered users of the site)
Are there any more methods to check if a particular page is indexable (or not closed for indexation) by Search Engines?
Thanks in advance!
-
I understand the difference between what you're doing and what Google shows, I guess I'm just not sure when I'd want to know that something could technically be indexed, but isn't?
I guess I'm not your target market!
Good luck with your tool.
-
With "site:site.com" you can only see if the page is indexED, but to know if it's indexABLE you need to dig deeper. That is why I've decided to automate this process.
As I already told, this gonna be a browser extension, once you got on any page, this ext. automatically checks the page, and show the status (with color, I guess), if this page indexed, if not - it shows if its indexABLE. When I'm looking for linkbuilding resources, this little tool should help a lot
-
Ah, gotcha. Personally, I use Google itself to find out if something is indexable: if it's my own site, I can use Fetch as Google, and the robots.txt tester; if it's another site, you can search for "site:[URL]" to see if Google's indexed it.
I think this tool could be really good if you keep it as an icon and it glows or something if you've accidentally deindexed the page? Then it's helping you proactively.
Hope this helps!
Kristina
-
Actually I'm not. That's why I'm asking, to not to miss this basic stuff, so I really appreciate your advice. Thank you!
If I get your question correctly, you are asking why this extension is need for?
Well, 2 main aims:
-
When I want to check any of pages on my own websites, I just visit the page and see if it's ok with all the robots stuff. (or if it should be closed from robots, see if it really is)
-
For linkbuilding purposes. When I come to the page and see a link from it to external website and I know for sure that I can get the same link to my site, I'm asking myself, if it worth getting link from the page like this, if it's gonna be indexed. Why waste your time on getting links from pages that are closed from indexation.
-
-
Hello Peter,
First of all, thank you for the great ideas.
I don't think it's necessary to call the API, as this check references to only one URL (so no aggressiveness) , I need it to be done as fast as possible. But the idea with Structured Data - bravo!
Thanks a lot!
-
You're probably already doing this, but make sure that all of your tests are using the Googlebot user agent! That could cause different results, especially with the robots.txt check.
A sense check: what is your plugin going to offer over Google Search Console's Fetch as Google and robots.txt Tester?
-
You also can check for HTTP header results for crawling too:
https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tagAlso you can use some of Google services for this. Specially PageSpeed API:
https://developers.google.com/speed/docs/insights/v2/reference/Once you call this API it return JSON with list of blocked resources. It's little bit slower but i found that this is safe. Some hostings have IDS (intruder detection systems) and when some crawl them little bit aggressive they block whole IP or IP range. I know few cases when site is OK to be seen from users, but blocked from Google IP. Webmasters wasn't happy when they discover this. They call hosting few times and got "there isn't issues from our side, we didn't block anything". And 6 hours later they get "seems that another department was blocked this server for few specific IPs".
About checking for logged/nonloged users. You can use StructuredData Testing Tool. Also one call to get JSON with full HTTP response and then compare it with your result.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should I index resource submission forms, thank you pages, etc.?
Should I index resource submission forms, thank you, event pages, etc.? Doesn't Google consider this content too thin?
Intermediate & Advanced SEO | | amarieyoussef0 -
Is it ok to repeat a (focus) keyword used on a previous page, on a new page?
I am cataloguing the pages on our website in terms of which focus keyword has been used with the page. I've noticed that some pages repeated the same keyword / term. I've heard that it's not really good practice, as it's like telling google conflicting information, as the pages with the same keywords will be competing against each other. Is this correct information? If so, is the alternative to use various long-winded keywords instead? If not, meaning it's ok to repeat the keyword on different pages, is there a maximum recommended number of times that we want to repeat the word? Still new-ish to SEO, so any help is much appreciated! V.
Intermediate & Advanced SEO | | Vitzz1 -
Category Page as Shopping Aggregator Page
Hi, I have been reviewing the info from Google on structured data for products and started to ponder.
Intermediate & Advanced SEO | | Alexcox6
https://developers.google.com/search/docs/data-types/products Here is the scenario.
You have a Category Page and it lists 8 products, each products shows an image, price and review rating. As the individual products pages are already marked up they display Rich Snippets in the serps.
I wonder how do we get the rich snippets for the category page. Now Google suggest a markup for shopping aggregator pages that lists a single product, along with information about different sellers offering that product but nothing for categories. My ponder is this, Can we use the shopping aggregator markup for category pages to achieve the coveted rich results (from and to price, average reviews)? Keen to hear from anyone who has had any thoughts on the matter or had already tried this.0 -
Google indexing only 1 page out of 2 similar pages made for different cities
We have created two category pages, in which we are showing products which could be delivered in separate cities. Both pages are related to cake delivery in that city. But out of these two category pages only 1 got indexed in google and other has not. Its been around 1 month but still only Bangalore category page got indexed. We have submitted sitemap and google is not giving any crawl error. We have also submitted for indexing from "Fetch as google" option in webmasters. www.winni.in/c/4/cakes (Indexed - Bangalore page - http://www.winni.in/sitemap/sitemap_blr_cakes.xml) 2. http://www.winni.in/hyderabad/cakes/c/4 (Not indexed - Hyderabad page - http://www.winni.in/sitemap/sitemap_hyd_cakes.xml) I tried searching for "hyderabad site:www.winni.in" in google but there also http://www.winni.in/hyderabad/cakes/c/4 this link is not coming, instead of this only www.winni.in/c/4/cakes is coming. Can anyone please let me know what could be the possible issue with this?
Intermediate & Advanced SEO | | abhihan0 -
Putting "noindex" on a page that's in an iframe... what will that mean for the parent page?
If I've got a page that is being called in an iframe, on my homepage, and I don't want that called page to be indexed.... so I put a noindex tag on the called page (but not on the homepage) what might that mean for the homepage? Nothing? Will Google, Bing, Yahoo, or anyone else, potentially see that as a noindex tag on my homepage?
Intermediate & Advanced SEO | | Philip-DiPatrizio0 -
How long takes to a page show up in Google results after removing noindex from a page?
Hi folks, A client of mine created a new page and used meta robots noindex to not show the page while they are not ready to launch it. The problem is that somehow Google "crawled" the page and now, after removing the meta robots noindex, the page does not show up in the results. We've tried to crawl it using Fetch as Googlebot, and then submit it using the button that appears. We've included the page in sitemap.xml and also used the old Google submit new page URL https://www.google.com/webmasters/tools/submit-url Does anyone know how long will it take for Google to show the page AFTER removing meta robots noindex from the page? Any reliable references of the statement? I did not find any Google video/post about this. I know that in some days it will appear but I'd like to have a good reference for the future. Thanks.
Intermediate & Advanced SEO | | fabioricotta-840380 -
Dynamic pages - ecommerce product pages
Hi guys, Before I dive into my question, let me give you some background.. I manage an ecommerce site and we're got thousands of product pages. The pages contain dynamic blocks and information in these blocks are fed by another system. So in a nutshell, our product team enters the data in a software and boom, the information is generated in these page blocks. But that's not all, these pages then redirect to a duplicate version with a custom URL. This is cached and this is what the end user sees. This was done to speed up load, rather than the system generate a dynamic page on the fly, the cache page is loaded and the user sees it super fast. Another benefit happened as well, after going live with the cached pages, they started getting indexed and ranking in Google. The problem is that, the redirect to the duplicate cached page isn't a permanent one, it's a meta refresh, a 302 that happens in a second. So yeah, I've got 302s kicking about. The development team can set up 301 but then there won't be any caching, pages will just load dynamically. Google records pages that are cached but does it cache a dynamic page though? Without a cached page, I'm wondering if I would drop in traffic. The view source might just show a list of dynamic blocks, no content! How would you tackle this? I've already setup canonical tags on the cached pages but removing cache.. Thanks
Intermediate & Advanced SEO | | Bio-RadAbs0 -
Should the sitemap include just menu pages or all pages site wide?
I have a Drupal site that utilizes Solr, with 10 menu pages and about 4,000 pages of content. Redoing a few things and we'll need to revamp the sitemap. Typically I'd jam all pages into a single sitemap and that's it, but post-Panda, should I do anything different?
Intermediate & Advanced SEO | | EricPacifico0