undefined
Skip to content
Moz logo Menu open Menu close
  • Products
    • Moz Pro
    • Moz Pro Home
    • Moz Local
    • Moz Local Home
    • STAT
    • Moz API
    • Moz API Home
    • Compare SEO Products
    • Moz Data
  • Free SEO Tools
    • Domain Analysis
    • Keyword Explorer
    • Link Explorer
    • Competitive Research
    • MozBar
    • More Free SEO Tools
  • Learn SEO
    • Beginner's Guide to SEO
    • SEO Learning Center
    • Moz Academy
    • SEO Q&A
    • Webinars, Whitepapers, & Guides
  • Blog
  • Why Moz
    • Agency Solutions
    • Enterprise Solutions
    • Small Business Solutions
    • Case Studies
    • The Moz Story
    • New Releases
  • Log in
  • Log out
  • Products
    • Moz Pro

      Your all-in-one suite of SEO essentials.

    • Moz Local

      Raise your local SEO visibility with complete local SEO management.

    • STAT

      SERP tracking and analytics for enterprise SEO experts.

    • Moz API

      Power your SEO with our index of over 44 trillion links.

    • Compare SEO Products

      See which Moz SEO solution best meets your business needs.

    • Moz Data

      Power your SEO strategy & AI models with custom data solutions.

    NEW Keyword Suggestions by Topic
    Moz Pro

    NEW Keyword Suggestions by Topic

    Learn more
  • Free SEO Tools
    • Domain Analysis

      Get top competitive SEO metrics like DA, top pages and more.

    • Keyword Explorer

      Find traffic-driving keywords with our 1.25 billion+ keyword index.

    • Link Explorer

      Explore over 40 trillion links for powerful backlink data.

    • Competitive Research

      Uncover valuable insights on your organic search competitors.

    • MozBar

      See top SEO metrics for free as you browse the web.

    • More Free SEO Tools

      Explore all the free SEO tools Moz has to offer.

    NEW Keyword Suggestions by Topic
    Moz Pro

    NEW Keyword Suggestions by Topic

    Learn more
  • Learn SEO
    • Beginner's Guide to SEO

      The #1 most popular introduction to SEO, trusted by millions.

    • SEO Learning Center

      Broaden your knowledge with SEO resources for all skill levels.

    • On-Demand Webinars

      Learn modern SEO best practices from industry experts.

    • How-To Guides

      Step-by-step guides to search success from the authority on SEO.

    • Moz Academy

      Upskill and get certified with on-demand courses & certifications.

    • MozCon

      Save on Early Bird tickets and join us in London or New York City

    Unlock flexible pricing & new endpoints
    Moz API

    Unlock flexible pricing & new endpoints

    Find your plan
  • Blog
  • Why Moz
    • Small Business Solutions

      Uncover insights to make smarter marketing decisions in less time.

    • Agency Solutions

      Earn & keep valuable clients with unparalleled data & insights.

    • Enterprise Solutions

      Gain a competitive edge in the ever-changing world of search.

    • The Moz Story

      Moz was the first & remains the most trusted SEO company.

    • Case Studies

      Explore how Moz drives ROI with a proven track record of success.

    • New Releases

      Get the scoop on the latest and greatest from Moz.

    Surface actionable competitive intel
    New Feature

    Surface actionable competitive intel

    Learn More
  • Log in
    • Moz Pro
    • Moz Local
    • Moz Local Dashboard
    • Moz API
    • Moz API Dashboard
    • Moz Academy
  • Avatar
    • Moz Home
    • Notifications
    • Account & Billing
    • Manage Users
    • Community Profile
    • My Q&A
    • My Videos
    • Log Out

The Moz Q&A Forum

  • Forum
  • Questions
  • Users
  • Ask the Community

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

  1. Home
  2. SEO Tactics
  3. Technical SEO
  4. Indexed pages

Moz Q&A is closed.

After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

Indexed pages

Technical SEO
4
6
2.7k
Loading More Posts
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as question
Log in to reply
This topic has been deleted. Only users with question management privileges can see it.
  • muzzmoz
    muzzmoz last edited by Oct 26, 2016, 10:42 PM

    Just started a site audit and trying to determine the number of pages on a client site and whether there are more pages being indexed than actually exist. I've used four tools and got four very different answers...

    • Google Search Console: 237 indexed pages
    • Google search using site command: 468 results
    • MOZ site crawl: 1013 unique URLs
    • Screaming Frog: 183 page titles, 187 URIs (note this is a free licence, but should cut off at 500)

    Can anyone shed any light on why they differ so much? And where lies the truth?

    1 Reply Last reply Reply Quote 1
    • MikeGracia
      MikeGracia last edited by Oct 30, 2016, 3:13 PM Oct 30, 2016, 3:12 PM

      Another option is if the site uses a CMS. If so, then you can create a sitemap for content pages/posts etc,.

      Personally, I'm with Krzysztof Furtak  on SF. Screaming Frog rocks. It'll find most pages, except perhaps Orphan pages as it wouldn't be able to find a link to crawl to discover the page.

      If it's really important to get as many pages as possible, I'd do the following (I've put an Astrix (*) next to ones that some people may think are a tad extreme)

      • Run a Screaming Frog crawl
      • Grab a sitemap from your CMS
      • Check any server-based analytics (AWSTATS etc)
      • Check your access_log file & parse out URLs in there**(*)**
      • site: queries, with & without www, and also using * as a subdomain (use something like Moz's toolbar to export)
      • As Krzysztof suggests, Scrapebox would extract data too, but be careful scraping, you may get an IP slap.(*)
      • Export crawl data from Moz & a tool such as Deep Crawl
      • Throw the pages from all into Excel and de-dupe.
      • Once you have a de-duped list, as an optional last step, go back to Screaming Frog and enter list mode (I have the paid version, not sure if it's possible with the free one) and run a crawl over all the de-duped URLs to get status codes etc

      If you're going to do this sort of thing a fair bit - buy a Screaming Frog license, it's an awesome tool and can be useful in a multitude of situations. 🙂

      1 Reply Last reply Reply Quote 0
      • MikeGracia
        MikeGracia @Insomniacs last edited by Oct 30, 2016, 3:04 PM Oct 30, 2016, 3:04 PM

        The site: command is handy for asking Google what pages it knows about, however if Muzzmoz wants to know the number of pages on a site, you'd need more than this.

        Also, re: your different ways or querying, I like to use:

        site:*.domain.com - This can show other subdomains too, that may otherwise be missed 😉

        1 Reply Last reply Reply Quote 0
        • PenaltyHammer
          PenaltyHammer @Insomniacs last edited by Oct 27, 2016, 7:07 AM Oct 27, 2016, 7:07 AM

          Ok so check with site something under 1000 pages and go to the last results page. You'll see that there'll be different number (in almost all cases).

          1 Reply Last reply Reply Quote 1
          • Insomniacs
            Insomniacs last edited by Oct 27, 2016, 7:05 AM Oct 27, 2016, 7:05 AM

            I Will Always Prefer To Check Manually Using Site Command Because,  site: operator, which will show us how many pages Google currently has indexed for the domain.

            There Will Be Difference Between Index status in search console and current index as search console update the data after few days.

            The number of indexed URLs is almost always significantly smaller than the number of crawled URLs, because Total indexed excludes URLs identified as duplicates, non-canonical or those that contain a meta no index tag.

            Also, Check For Index(Preferred)  Version Of Your Site

            For E.g-

            • http://abc.com
            • http://www.abc.com
            • https://abc.com
            • https://www.abc.com

            You can check More About this Here - https://support.google.com/webmasters/answer/2642366?hl=en

            PenaltyHammer MikeGracia 2 Replies Last reply Oct 30, 2016, 3:04 PM Reply Quote 0
            • PenaltyHammer
              PenaltyHammer last edited by Oct 27, 2016, 7:06 AM Oct 27, 2016, 7:05 AM

              Hi

              Most accurate number is from screaming frog (if you have less than 500 pages or paid version if more than 500).

              Google indexes what it wants and if good enough to show in google index. If some pages are similar, got quality issues, blocked by robots etc then it won't show all. BTW don't think number in GSC or google index is good, check it manually because there can be 468 but in fact 200 only.

              Moz can have "historical" pages that now don't exists or don't care about quality issues.

              The truth is in screaming frog - most accurate number. If you used google user agent then number is the max that can appear in google index. If screaming frog user agent with turned off robots then you'll see bigger number (but google won't show it because of blocks).

              If you want to check what's indexed then use tool like scrapebox. First get all urls (maybe without images if you don't care), then check indexed with sb. What's not indexed, can have some issues.

              1 Reply Last reply Reply Quote 0
              • 1 / 1
              1 out of 6
              • First post
                1/6
                Last post

              Got a burning SEO question?

              Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.


              Start my free trial


              Browse Questions

              Explore more categories

              • Moz Tools

                Chat with the community about the Moz tools.

              • SEO Tactics

                Discuss the SEO process with fellow marketers

              • Community

                Discuss industry events, jobs, and news!

              • Digital Marketing

                Chat about tactics outside of SEO

              • Research & Trends

                Dive into research and trends in the search industry.

              • Support

                Connect on product support and feature requests.

              • See all categories

              Related Questions

              • SameerBhatia

                Indexing Issue of Dynamic Pages

                Hi All, I have a query for which i am struggling to find out the answer. I unable to retrieve URL using "site:" query on Google SERP. However, when i enter the direct URL or with "info:" query then a snippet appears. I am not able to understand why google is not showing URL with "site:" query. Whether the page is indexed or not? Or it's soon going to be deindexed. Secondly, I would like to mention that this is a dynamic URL. The index file which we are using to generate this URL is not available to Google Bot. For instance, There are two different URL's. http://www.abc.com/browse/ --- It's a parent page.
                http://www.abc.com/browse/?q=123 --- This is the URL, generated at run time using browse index file. Google unable to crawl index file of browse page  as it is unable to run independently until some value will get passed in the parameter and is not indexed by Google. Earlier the dynamic URL's were indexed and was showing up in Google for "site:" query but now it is not showing up. Can anyone help me what is happening here? Please advise. Thanks

                Technical SEO | Mar 28, 2018, 2:43 AM | SameerBhatia
                0
              • linklander

                How to check if an individual page is indexed by Google?

                So my understanding is that you can use site: [page url without http] to check if a page is indexed by Google, is this 100% reliable though? Just recently Ive worked on a few pages that have not shown up when Ive checked them using site: but they do show up when using info: and also show their cached versions, also the rest of the site and pages above it (the url I was checking was quite deep) are indexed just fine. What does this mean? thank you p.s I do not have WMT or GA access for these sites

                Technical SEO | Aug 25, 2015, 6:45 PM | linklander
                0
              • SierraPCB

                Why google indexed pages are decreasing?

                Hi, my website had around 400 pages indexed  but from February, i noticed a huge decrease in indexed numbers and it is continually decreasing. can anyone help me to find out the reason. where i can get solution for that?  will it effect my web page ranking ?

                Technical SEO | Jun 19, 2014, 11:28 PM | SierraPCB
                0
              • Dan-Lawrence

                Staging & Development areas should be not indexable (i.e. no followed/no index in meta robots etc)

                Hi I take it if theres a staging or development area on a subdomain for a site, who's content is hence usually duplicate then this should not be indexable i.e. (no-indexed & nofollowed in metarobots) ? In order to prevent dupe content probs as well as non project related people seeing work in progress or finding accidentally in search engine listings ? Also if theres no such info in meta robots is there any other way it may have been made non-indexable, or at least dupe content prob removed by canonicalising the page to the equivalent page on the live site ? In the case in question i am finding it listed in serps when i search for the staging/dev area url, so i presume this needs urgent attention ? Cheers Dan

                Technical SEO | Aug 23, 2013, 11:41 AM | Dan-Lawrence
                0
              • inlinear

                Correct linking to the /index of a site and subfolders: what's the best practice? link to: domain.com/ or domain.com/index.html ?

                Dear all, starting with my .htaccess file: RewriteEngine On
                RewriteCond %{HTTP_HOST} ^www.inlinear.com$ [NC]
                RewriteRule ^(.*)$ http://inlinear.com/$1 [R=301,L] RewriteCond %{THE_REQUEST} ^./index.html 
                RewriteRule ^(.)index.html$ http://inlinear.com/ [R=301,L] 1. I redirect all URL-requests with www. to the non www-version...
                2. all requests with "index.html" will be redirected to "domain.com/" My questions are: A) When linking from a page to my frontpage (home) the best practice is?: "http://domain.com/" the best and NOT: "http://domain.com/index.php" B) When linking to the index of a subfolder "http://domain.com/products/index.php" I should link also to: "http://domain.com/products/" and not put also the index.php..., right? C) When I define the canonical ULR, should I also define it just: "http://domain.com/products/" or in this case I should link to the definite file: "http://domain.com/products**/index.php**" Is A) B) the best practice? and C) ? Thanks for all replies! 🙂
                Holger

                Technical SEO | Jul 25, 2013, 6:54 PM | inlinear
                0
              • TalkInThePark

                De-indexing millions of pages - would this work?

                Hi all, We run an e-commerce site with a catalogue of around 5 million products. Unfortunately, we have let Googlebot crawl and index tens of millions of search URLs, the majority of which are very thin of content or duplicates of other URLs. In short: we are in deep. Our bloated Google-index is hampering our real content to rank; Googlebot does not bother crawling our real content (product pages specifically) and hammers the life out of our servers. Since having Googlebot crawl and de-index tens of millions of old URLs would probably take years (?), my plan is this: 301 redirect all old SERP URLs to a new SERP URL. If new URL should not be indexed, add meta robots noindex tag on new URL. When it is evident that Google has indexed most "high quality" new URLs, robots.txt disallow crawling of old SERP URLs. Then directory style remove all old SERP URLs in GWT URL Removal Tool This would be an example of an old URL:
                www.site.com/cgi-bin/weirdapplicationname.cgi?word=bmw&what=1.2&how=2 This would be an example of a new URL:
                www.site.com/search?q=bmw&category=cars&color=blue I have to specific questions: Would Google both de-index the old URL and not index the new URL after 301 redirecting the old URL to the new URL (which is noindexed) as described in point 2 above? What risks are associated with removing tens of millions of URLs directory style in GWT URL Removal Tool? I have done this before but then I removed "only" some useless 50 000 "add to cart"-URLs.Google says themselves that you should not remove duplicate/thin content this way and that using this tool tools this way "may cause problems for your site". And yes, these tens of millions of SERP URLs is a result of a faceted navigation/search function let loose all to long.
                And no, we cannot wait for Googlebot to crawl all these millions of URLs in order to discover the 301. By then we would be out of business. Best regards,
                TalkInThePark

                Technical SEO | Aug 7, 2012, 8:20 AM | TalkInThePark
                0
              • Petersen11

                How to get Google to index another page

                Hi, I will try to make my question clear, although it is a bit complex. For my site the most important keyword is "Insurance" or at least the danish variation of this. My problem is that Google are'nt indexing my frontpage on this, but are indexing a subpage - www.mydomain.dk/insurance instead of www.mydomain.dk. My link bulding will be to subpages and to my main domain, but i wont be able to get that many links to www.mydomain.dk/insurance. So im interested in making my frontpage the page that is my main page for the keyword insurance, but without just blowing the traffic im getting from the subpage at the moment. Is there any solutions to do this? Thanks in advance.

                Technical SEO | Apr 3, 2012, 4:10 AM | Petersen11
                0
              • dreadmichael

                How to block "print" pages from indexing

                I have a fairly large FAQ section and every article has a "print" button. Unfortunately, this is creating a page for every article which is muddying up the index - especially on my own site using Google Custom Search. Can you recommend a way to block this from happening? Example Article: http://www.knottyboy.com/lore/idx.php/11/183/Maintenance-of-Mature-Locks-6-months-/article/How-do-I-get-sand-out-of-my-dreads.html Example "Print" page: http://www.knottyboy.com/lore/article.php?id=052&action=print

                Technical SEO | Mar 16, 2012, 8:32 PM | dreadmichael
                0

              Get started with Moz Pro!

              Unlock the power of advanced SEO tools and data-driven insights.

              Start my free trial
              Products
              • Moz Pro
              • Moz Local
              • Moz API
              • Moz Data
              • STAT
              • Product Updates
              Moz Solutions
              • SMB Solutions
              • Agency Solutions
              • Enterprise Solutions
              Free SEO Tools
              • Domain Authority Checker
              • Link Explorer
              • Keyword Explorer
              • Competitive Research
              • Brand Authority Checker
              • Local Citation Checker
              • MozBar Extension
              • MozCast
              Resources
              • Blog
              • SEO Learning Center
              • Help Hub
              • Beginner's Guide to SEO
              • How-to Guides
              • Moz Academy
              • API Docs
              About Moz
              • About
              • Team
              • Careers
              • Contact
              Why Moz
              • Case Studies
              • Testimonials
              Get Involved
              • Become an Affiliate
              • MozCon
              • Webinars
              • Practical Marketer Series
              • MozPod
              Connect with us

              Contact the Help team

              Join our newsletter
              Moz logo
              © 2021 - 2025 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.
              • Accessibility
              • Terms of Use
              • Privacy

              Looks like your connection to Moz was lost, please wait while we try to reconnect.