Indexed Pages in Google: How Do I Find Out?
-
Is there a way to get a list of the pages that Google has indexed?
Is there some software that can do this?
I do not have access to Webmaster Tools, so I am hoping there is another way to do this.
It would be great if I could also see whether each indexed page returns a 404 or another status.
Thanks for your help, and sorry if it's a basic question.
-
If you want to find all your indexed pages in Google, just search for site:yourdomain.com (or .co.uk, or whatever your domain ending is), without the www.
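For anyone who wants to build that query for several domains, here is a minimal sketch in Python. The function names `site_query` and `search_url` are made up for this illustration, and the code only builds the search URL; it does not fetch or parse any results.

```python
from urllib.parse import urlencode, urlparse


def site_query(address: str) -> str:
    """Return a 'site:' search operator for a bare domain or a full URL."""
    # urlparse only recognises the host when a scheme or a leading "//" is present.
    parsed = urlparse(address if "//" in address else "//" + address)
    host = (parsed.hostname or "").lower()
    # Drop a leading "www." so the query covers the whole domain.
    if host.startswith("www."):
        host = host[4:]
    return f"site:{host}"


def search_url(address: str) -> str:
    """Build a Google search URL that runs the 'site:' query."""
    return "https://www.google.com/search?" + urlencode({"q": site_query(address)})


print(search_url("http://www.yourdomain.co.uk/some/page"))
```

Opening the printed URL in a browser runs the same search you would otherwise type by hand.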
-
Hi John,
Hope I'm not too late to the party! When checking URLs for their cache status, I suggest using Scrapebox (with proxies).
Be warned: it was created as a black-hat tool and as such is frowned upon, but there are a number of excellent white-hat uses for it! It costs $57 as a one-off payment.
-
Sorry to keep sending you messages, but I wanted to make sure you know that SEOmoz has a fantastic tool for what you are requesting. Please look at this link, then click where it says "show more" at the bottom. I believe you will agree it does everything you've asked and more.
http://pro.seomoz.org/tools/crawl-test
Sincerely,
Thomas
Does this answer your question?
-
What is giving you a 100 limit?
Try using Raven Tools or spider mate; they both have excellent free trials and give you quite a bit of information.
-
Neil, you are correct. I agree that Screaming Frog is excellent; it will definitely show you your site. Here is a link from an SEOmoz associate that I believe will benefit you:
http://www.seomoz.org/q/404-error-but-i-can-t-find-any-broken-links-on-the-referrer-pages
Sincerely,
Thomas
-
This is what I am looking for, thanks.
Strange that there is no tool I can buy to do this in full without the 100 limit.
Anyway, I will give that a go.
-
Can I get your site's URL? By the way, this might be a better way into Google Webmaster Tools:
if you have a Gmail account, use that; if you don't, just sign up using your regular e-mail.
Of course, using SEOmoz via http://pro.seomoz.org/tools/crawl-test will give you a full rundown of all of your links and how they're running. Are you not seeing all of them?
Another tool I have found very useful is the website analysis, as well as the midsize product, from Alexa.
I hope I have helped,
Tom
-
If you don't have access to Webmaster Tools, the most basic way to see which pages Google has indexed is obviously to do a site: search on Google itself - like "site:google.com" - to return pages of SERPs containing the pages from your site which Google has indexed.
Problem is, how do you get the data from those SERPs in a useful format to run through Screaming Frog or similar?
Enter Chris Le's Google Scraper for Google Docs.
It will let you scrape the first 100 results, then let you offset your search by 100 and get the next 100, and so on. It is slightly cumbersome, but it will achieve what you want to do.
Then you can crawl the URLs using Screaming Frog or another crawler.
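If you only need the status code for each scraped URL (for example, to spot 404s) and don't want a full crawler, a small script can do that part. The following is a rough sketch, not what Screaming Frog does internally: the function names `fetch_status` and `check_urls` are invented here, it uses only the Python standard library, and it assumes the servers answer HEAD requests (some do not, in which case a GET would be needed).

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; the code is what we want.
        return error.code


def check_urls(
    urls: Iterable[str],
    fetch: Callable[[str], int] = fetch_status,
) -> Dict[str, int]:
    """Map each non-empty URL to its status code, or 0 if it could not be reached."""
    results: Dict[str, int] = {}
    for url in urls:
        url = url.strip()
        if not url:
            continue  # skip blank lines from a pasted list
        try:
            results[url] = fetch(url)
        except OSError:
            # DNS failures, timeouts, and refused connections end up here.
            results[url] = 0
    return results


if __name__ == "__main__":
    # Placeholder list; paste the URLs scraped from the site: search here.
    scraped = ["https://example.com/", "https://example.com/no-such-page"]
    for url, status in check_urls(scraped).items():
        print(status, url)
```

The `fetch` parameter is there so the checker can be tried out with a stand-in function before pointing it at a live site.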
-
Just thought I might add these links; they might help explain it better than I did.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1352276
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=2409443&topic=2446029&ctx=topic
http://pro.seomoz.org/tools/crawl-test
You should definitely sign up for Google Webmaster Tools. It is free, and all you need to do is add an e-mail address and password. Here is a link:
http://support.google.com/webmasters/bin/topic.py?hl=en&topic=1724121
I hope I have been of help to you. Sincerely,
Thomas
-
Thanks for the reply.
I do not have access to Webmaster Tools, and the SEOmoz tools do not show a great deal of the pages on my site for some reason.
Majestic shows up to 100 pages. Ahrefs shows some also.
I need to compare what Google has indexed with the status of each page.
Does Screaming Frog do this?
-
Google Webmaster Tools should supply you with this information. In addition, the SEOmoz tools will tell you that and more. Run your website through the campaign section of SEOmoz and you will then see any issues with your website.
You may also want to use Google Webmaster Tools to run a test as Googlebot; Googlebot should show you any issues you are having, such as 404s or other fun things that websites do.
If you're running WordPress, there are plenty of plug-ins; I recommend 404 returned.
Sincerely,
Thomas