What tool do you use to check for URLs not indexed?
-
What is your favorite tool for getting a report of URLs that are not cached/indexed in Google & Bing for an entire site? Basically I want a list of URLs not cached in Google and a seperate list for Bing.
Thanks,
Mark
-
-
I can work on building this tool if there's enough interest.
-
I generally just use Xenu's hyperlink sleuth (if you thousands of pages) to listing out all the URLs you have got and I might then manually take a look at them, however, see the guitar in demand I have not come upon an automatic device yet. If all people are aware of any, I'd like to recognize as properly.
-
This post from Distilled mentions that SEO for Excel plugin has a "Indexation Checker":
https://www.distilled.net/blog/seo/awesome-examples-of-how-to-use-seotools-for-excel/Alas, after downloading and installing, it appears this feature was removed...
-
Unless I'm missing something, there doesn't seem to be a way to get Google to show more than 100 results on a page. Our site has about 8,000 pages, and I don't relish the idea of manually exporting 80 SERPs.
-
Annie Cushing from Seer Interactive made an awesome list of all the must have tools for SEO.
You can get it from her link which is http://bit.ly/tools-galore
In the list there is a tool called scrapebox which is great for this. In fact there are many uses for the software, it is also useful for sourcing potential link partners.
-
I would suggest using the Website Auditor from Advanced Web Ranking. It can parse 10.000 pages and it will tell you a lot more info than just if it's indexed by Google or not.
-
hmm...I thought there was a way to pull those SERPs urls into Google docs using a function of some sort?
-
I think you need not any tool for this, you can directly go to google.com and search: Site:www.YourWebsiteNem.com Site:www.YourWebsiteName.com/directory I think this will be the best option to check if your website is crwled by google or not.
-
I do something similar but use Advanced Web Ranking, use site:www.domain.com as your phrase, run it to retrieve 1000 results and generate a Top Site Report in Excel to get the indexed list.
Also remember that you can do it on sub-directories (or partial URL paths) as a way to get more than 1000 pages from the site. In general I run it once with site:www.domain.com, then identify the most frequent sub-directories, and add those as additional phrases to the project and run a second time, i.e.: site:www.domain.com site:www.domain.com/dir1 site:www.domain.com/dir2 etc.
Still not definitive, but think it does give indication of where value is.
-
David Kauzlaric has in my opinion the best answer. If google hasn't indexed it and you've investigated your Google webmaster account, then there isn't anything better out there as far as I'm concerned. It's by far the simplest, quickest and easiest way to identify a serp result.
re: David Kauzlaric
We built an internal tool to do it for us, but basically you can do this manually.
Go to google, type in "site:YOURURLHERE" without the quotes. You can check a certain page, a site, a subdomain, etc... of course if you have thousands of URLs this method is not ideal, but it can be done.
Cheers!
-
I concur, Xenu is an extremely valuable tool for me that I use daily. Also, once you get a list of all the URLs on your site, you can compare the two lists in excel (two lists being the Xenu page list for your site and the list of pages that have been indexed by Google).
-
Nice solution Kieran!
I use the same method, to compare URL list from Screaming Frog output with URL Found column from my Keyword Ranking tool - of course it doesn't catch all pages that might be indexed.
The intention is not really to get a complete list, more to "draught" out pages that need work.
-
I agree, this is not automated but so far, from what we know, looks like a nice and clean option. Thanks.
-
Saw this and tried the following which isn't automated but is one way of doing it.
- First install SEO Quake plugin
- Go to Google
- Turn off Google Instant (http://www.google.com/preferences)
- Go to Advanced search set the number of results you want displayed (estimate the number of pages on your site)
- Then run your site:www.example.com search query
- Export this to CSV
- Import to Excel
- Once then do a Data to columns conversion using ; as a delimiter (this is the CSV delimiter)
- This gives you a formatted list.
- Then import your sitemap.xml into another TAB in Excel
- Run a vlookup between the URL tabs to flag which are on sitemap or vice versa.
Not exactly automated but does the job.
-
Curious about this question also, it would be very useful to see a master list of all URLs on our site that are not indexed by Google so that we can take action to see what aspects of the page are lacking and what we need for it to get indexed.
-
I usually just use Xenu's link sleuth (if you thousands of pages) to list out all the URLs you have and I would then manually check them, but I haven't come across an automated tool yet. If anyone knows any, I'd love to know as well.
-
Manual is a no go for large sites. If someone knows a tool like this, it woul be cool to know which/ where to find. Or..... This would make a cool SEOmoz pro tool
-
My bad - you are right that it doesn't display the actual URLs. So I guess the best thing you can do is site:examplesite.com and see what comes up.
-
That will tell you the number indexed, but it still doesn't tell you which of those URLs are or are not indexed. I think we all wish it would!
-
I would use Google Webmaster Tools as you can see how many URLs are indexed based on your sitemap. Once you have that, you can compare it to your total list. The same can be done with Bing.
-
Yeah I do it manually now so was looking for something more efficient.
-
We built an internal tool to do it for us, but basically you can do this manually.
Go to google, type in "site:YOURURLHERE" without the quotes. You can check a certain page, a site, a subdomain, etc... of course if you have thousands of URLs this method is not ideal, but it can be done.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Redirect indexed lightbox URLs?
Hello all, So I'm doing some technical SEO work on a client website and wanted to crowdsource some thoughts and suggestions. Without giving away the website name, here is the situation: The website has a dedicated /resources/ page. The bulk of the Resources are industry definitions, all encapsulated in colored boxes. When you click on the box, the definition opens in a lightbox with its own unique URL (Ex: /resources/?resource=augmented-reality). The information for these colored lightbox definitions is pulled from a normal resources page (Ex: /resources/augmented-reality/). Both of these URLs are indexed, leading to a lot of duplicate indexed content. How would you approach this? **Things to Consider: ** -Website is built on Wordpress with a custom theme.
Technical SEO | | Alces
-I have no idea how to even find settings for the lightbox (will be asking the client today).
-Right now my thought is to simply disallow the lightbox URL in robots.txt and hope Google will stop crawling and eventually drop from the index.
-I've considered adding the main resource page canonical to the lightbox URL, but it appears to be dynamically created and thus there is no place to access (outside of the FTP, I imagine?). I'm most rusty with stuff like this, so figured I'd appeal to the masses for some assistance. Thanks! -Brad0 -
Change URL or use Canonicals and Redirects?
We just completed a conclusive a/b test on a client's landing page. The new page saw a 30% bump in conversions, yay! Now what? Option 1: Change the url of the new page to that of the old page, retire the old page. Option 2: Redirect the old page and anything that was pointing to it to the new page, make the new page the canonical. I'm afraid of option 1 because I think Google's WTF penalty will be a bit harsher than option 2, but I wanted to sanity check that here. Any thoughts or experienced advice would be very appreciated!
Technical SEO | | LindsayDayton0 -
No index
Screaming frog spider does index pages on our website like: wp-content/plugins/woocommerce/assets/js/frontend/jquery-ui-touch-punch.min.js?ver=2.3.9 wp-content/plugins/mailchimp-for-wp/assets/css/checkbox.min.css?ver=2.3.2 Is it a bad/good idea to set my parameters in Webmastertools and tell Google not to crawl pages that begin with wp/content? Thanks!
Technical SEO | | Happy-SEO1 -
Removed URLs
Hi all, We have recently removed 200+ articles from our blog. However, those links are still being shown on Google weeks after their removal. In there a way to speed up the process? What effect will this have on our SEO ranking?
Technical SEO | | businessowner0 -
Magento URL change
We have a Magento website parked at HostGator. The site is comprised of both a PC and a mobile version. We changed the URL to a new one ... We made the domain changes in the ‘core_config_data’ (phpMyAdmin) ... We flushed the cache in the ‘File Manager’ part of cPanel (regular and mobile version) Currently we can access the http://newsite.com (on a desktop) with no problem ... We can also access http://m.newsite.com BUT… only from a desktop PC. When we try http://newsite.com from a MOBILE device, it routes to: http://m.OLDsite.com (it keeps going to the old URL) Need some help please. Thanks in advance!
Technical SEO | | Prime850 -
Does it really matter to maintain 301 redirect after de-indexing of old URLs?
Today, I was reading latest blog post on SEOmoz blog about. Uncrawled 301s - A Quick Fix for When Relaunches Go Too Well This is very interesting study about 301 & How it useful to maintain traffic. I'm working on eCommerce website and I have done similar stuff on my website. I have big confusion to manage 301 redirect. My website generates new URLs due to following actions. Re-write dynamic URLs. Re-launch entire website on different eCommerce platform. [osCommerce to Magento Commerce] Re-name category. Trasfer one product from one category to another category. I'm managing my 301 redirect with old practice. Excel sheet data from Google webmaster tools and set specific new URLs for redirect. Hoooo... Now, I have 8.5K redirect in htaccess... And, I'm thinking it's too much. Can we remove old 301 redirect from htaccess or not? This is big question for me. Because, all pages are not hyperlink on external website. Google have just de-indexed old URLs and indexed new URLs. So, Is it require to maintain 301 redirect after Google process?
Technical SEO | | CommercePundit0 -
URL with tracking code
Hi there, At the company i am currently working for we have a problem with shortcut url with tracking in it. They send a lot of brochures with a shortcut URL which redirects to the page of the event with tagging. For example The real URL is:
Technical SEO | | RuudHeijnen
http://www.sbo.nl/cursussen/schoolleider-primair-onderwijs/ The URL in the brochure is:
www.sbo.nl/schoolleiderpo this then redirects to: h
ttp://www.sbo.nl/cursussen/schoolleider-primair-onderwijs/?utm_source=direct&utm_medium=shortcut&utm_campaign=schoolleiderpo Now we can measure the effect of the brochure on on-line traffic and conversion. This is great but a lot of website link to that shortcut url and if the event is put offline the links to it generate an 404. We have now about 800 backlinks that generate this 404 and i want to fix it. Another big problem "i think" is the possibility that google will index this url with tagging. Now i have 2 options: 1. look at al the url with that 404 and redirect them with a 301 to the best page 2. create the shortcut on an page that is most suitable but then i will get the tagging in the URL and i guess google will see this as dublicate content. It is possible that in the future the shortcut url will be used again. What would you suggest as the best sollution.0 -
Is there any value to a home page URL adding the /index.html ?
For proper SEO, which version would you prefer? A. www.abccompany.com B. www.abccompany.com/index.html Is there any value or difference with either home page URL??
Technical SEO | | theideapeople0