Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to check if the page is indexable for SEs?
-
Hi, I'm building the extension for Chrome, which should show me the status of the indexability of the page I'm on.
So, I need to know all the methods to check if the page has the potential to be crawled and indexed by a Search Engines. I've come up with a few methods:
- Check the URL in robots.txt file (if it's not disallowed)
- Check page metas (if there are not noindex meta)
- Check if page is the same for unregistered users (for those pages only available for registered users of the site)
Are there any more methods to check if a particular page is indexable (or not closed for indexation) by Search Engines?
Thanks in advance!
-
I understand the difference between what you're doing and what Google shows, I guess I'm just not sure when I'd want to know that something could technically be indexed, but isn't?
I guess I'm not your target market!
Good luck with your tool.
-
With "site:site.com" you can only see if the page is indexED, but to know if it's indexABLE you need to dig deeper. That is why I've decided to automate this process.
As I already told, this gonna be a browser extension, once you got on any page, this ext. automatically checks the page, and show the status (with color, I guess), if this page indexed, if not - it shows if its indexABLE. When I'm looking for linkbuilding resources, this little tool should help a lot
-
Ah, gotcha. Personally, I use Google itself to find out if something is indexable: if it's my own site, I can use Fetch as Google, and the robots.txt tester; if it's another site, you can search for "site:[URL]" to see if Google's indexed it.
I think this tool could be really good if you keep it as an icon and it glows or something if you've accidentally deindexed the page? Then it's helping you proactively.
Hope this helps!
Kristina
-
Actually I'm not. That's why I'm asking, to not to miss this basic stuff, so I really appreciate your advice. Thank you!
If I get your question correctly, you are asking why this extension is need for?
Well, 2 main aims:
-
When I want to check any of pages on my own websites, I just visit the page and see if it's ok with all the robots stuff. (or if it should be closed from robots, see if it really is)
-
For linkbuilding purposes. When I come to the page and see a link from it to external website and I know for sure that I can get the same link to my site, I'm asking myself, if it worth getting link from the page like this, if it's gonna be indexed. Why waste your time on getting links from pages that are closed from indexation.
-
-
Hello Peter,
First of all, thank you for the great ideas.
I don't think it's necessary to call the API, as this check references to only one URL (so no aggressiveness) , I need it to be done as fast as possible. But the idea with Structured Data - bravo!
Thanks a lot!
-
You're probably already doing this, but make sure that all of your tests are using the Googlebot user agent! That could cause different results, especially with the robots.txt check.
A sense check: what is your plugin going to offer over Google Search Console's Fetch as Google and robots.txt Tester?
-
You also can check for HTTP header results for crawling too:
https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tagAlso you can use some of Google services for this. Specially PageSpeed API:
https://developers.google.com/speed/docs/insights/v2/reference/Once you call this API it return JSON with list of blocked resources. It's little bit slower but i found that this is safe. Some hostings have IDS (intruder detection systems) and when some crawl them little bit aggressive they block whole IP or IP range. I know few cases when site is OK to be seen from users, but blocked from Google IP. Webmasters wasn't happy when they discover this. They call hosting few times and got "there isn't issues from our side, we didn't block anything". And 6 hours later they get "seems that another department was blocked this server for few specific IPs".
About checking for logged/nonloged users. You can use StructuredData Testing Tool. Also one call to get JSON with full HTTP response and then compare it with your result.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My url disappeared from Google but Search Console shows indexed. This url has been indexed for more than a year. Please help!
Super weird problem that I can't solve for last 5 hours. One of my urls: https://www.dcacar.com/lax-car-service.html Has been indexed for more than a year and also has an AMP version, few hours ago I realized that it had disappeared from serps. We were ranking on page 1 for several key terms. When I perform a search "site:dcacar.com " the url is no where to be found on all 5 pages. But when I check my Google Console it shows as indexed I requested to index again but nothing changed. All other 50 or so urls are not effected at all, this is the only url that has gone missing can someone solve this mystery for me please. Thanks a lot in advance.
Intermediate & Advanced SEO | | Davit19850 -
If a page ranks in the wrong country and is redirected, does that problem pass to the new page?
Hi guys, I'm having a weird problem: A new multilingual site was launched about 2 months ago. It has correct hreflang tags and Geo targetting in GSC for every language version. We redirected some relevant pages (with good PA) from another website of our client's. It turned out that the pages were not ranking in the correct country markets (for example, the en-gb page ranking in the USA). The pages from our site seem to have the same problem. Do you think they inherited it due to the redirects? Is it possible that Google will sort things out over some time, given the fact that the new pages have correct hreflangs? Is there stuff we could do to help ranking in the correct country markets?
Intermediate & Advanced SEO | | ParisChildress1 -
No Index thousands of thin content pages?
Hello all! I'm working on a site that features a service marketed to community leaders that allows the citizens of that community log 311 type issues such as potholes, broken streetlights, etc. The "marketing" front of the site is 10-12 pages of content to be optimized for the community leader searchers however, as you can imagine there are thousands and thousands of pages of one or two line complaints such as, "There is a pothole on Main St. and 3rd." These complaint pages are not about the service, and I'm thinking not helpful to my end goal of gaining awareness of the service through search for the community leaders. Community leaders are searching for "311 request service", not "potholes on main street". Should all of these "complaint" pages be NOINDEX'd? What if there are a number of quality links pointing to the complaint pages? Do I have to worry about losing Domain Authority if I do NOINDEX them? Thanks for any input. Ken
Intermediate & Advanced SEO | | KenSchaefer0 -
Redirecting homepage to internal page (2nd Tier page)
We are planning to experiment redirecting our homepage to one of the 2nd tier page. I mean....example.com to example.com/page. We need this page to rank well, but it doesn't have much internal links or external back-links, so we opt for this redirect. Advantage with this page is, it has "keyword" we want to rank for in URL. "page" in example.com/page. Will this help or hurt us in SEO? I think we are missing keyword in our root domain, so interested to highlight this page. Thanks, Satish
Intermediate & Advanced SEO | | vtmoz0 -
Can noindexed pages accrue page authority?
My company's site has a large set of pages (tens of thousands) that have very thin or no content. They typically target a single low-competition keyword (and typically rank very well), but the pages have a very high bounce rate and are definitely hurting our domain's overall rankings via Panda (quality ranking). I'm planning on recommending we noindexed these pages temporarily, and reindex each page as resources are able to fill in content. My question is whether an individual page will be able to accrue any page authority for that target term while noindexed. We DO want to rank for all those terms, just not until we have the content to back it up. However, we're in a pretty competitive space up against domains that have been around a lot longer and have higher domain authorities. Like I said, these pages rank well right now, even with thin content. The worry is if we noindex them while we slowly build out content, will our competitors get the edge on those terms (with their subpar but continually available content)? Do you think Google will give us any credit for having had the page all along, just not always indexed?
Intermediate & Advanced SEO | | THandorf0 -
Do internal links from non-indexed pages matter?
Hi everybody! Here's my question. After a site migration, a client has seen a big drop in rankings. We're trying to narrow down the issue. It seems that they have lost around 15,000 links following the switch, but these came from pages that were blocked in the robots.txt file. I was wondering if there was any research that has been done on the impact of internal links from no-indexed pages. Would be great to hear your thoughts! Sam
Intermediate & Advanced SEO | | Blink-SEO0 -
Are pages with a canonical tag indexed?
Hello here, here are my questions for you related to the canonical tag: 1. If I put online a new webpage with a canonical tag pointing to a different page, will this new page be indexed by Google and will I be able to find it in the index? 2. If instead I apply the canonical tag to a page already in the index, will this page be removed from the index? Thank you in advance for any insights! Fabrizio
Intermediate & Advanced SEO | | fablau0 -
There's a website I'm working with that has a .php extension. All the pages do. What's the best practice to remove the .php extension across all pages?
Client wishes to drop the .php extension on all their pages (they've got around 2k pages). I assured them that wasn't necessary. However, in the event that I do end up doing this what's the best practices way (and easiest way) to do this? This is also a WordPress site. Thanks.
Intermediate & Advanced SEO | | digisavvy0