How do I find out which pages are being indexed on my site and which are not?
-
Hi,
I'm doing my first technical audit on my site. I am learning how to do an audit as I go and am a bit lost. I know some pages won't be indexed, but how do I:
1. Check the site for all pages, both indexed and not indexed
2. Run a report to show indexed pages only (I am presuming I can do this via Screaming Frog or Webmaster Tools)
3. Compare the two lists and work out which pages are not being indexed.
I'll then need to figure out why. I'll cross that bridge once I get to it.
Thanks Ben
-
Hi Ben,
I'd echo what Patrick has said and probably recommend his first suggestion the most. Google Webmaster Tools is a good way of checking indexation and if you have a large site with lots of categories, you can even break down the sitemaps by category so that you can see if certain areas are having problems.
Here is an old, but still relevant post on the topic:
http://www.branded3.com/blogs/using-multiple-sitemaps-to-analyse-indexation-on-large-sites/
In terms of creating the sitemap, Screaming Frog has an option under Advanced Export for creating an XML sitemap file for you which works very well. You just need to make sure you're only including pages that you want indexed in there.
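As a rough illustration of the per-category sitemaps idea, here is a minimal Python sketch using only the standard library. It is not what Screaming Frog produces, just one way to get a similar result. It assumes you already have a list of the URLs you want indexed; grouping by the first path segment, the `root` bucket, the `example.com` URLs and the function names are all assumptions made for the example.

```python
from collections import defaultdict
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def group_by_section(urls):
    """Group URLs by first path segment (assumed rule: /blog/post -> 'blog').

    URLs with one path segment or none go into a 'root' bucket.
    """
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        section = segments[0] if len(segments) > 1 else "root"
        groups[section].append(url)
    return dict(groups)


def build_sitemap(urls):
    """Return the XML text of one sitemap listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(urls):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical list of pages you want indexed.
    pages = [
        "https://example.com/",
        "https://example.com/about",
        "https://example.com/blog/post-1",
        "https://example.com/shop/widget",
    ]
    for section, urls in group_by_section(pages).items():
        print(f"sitemap-{section}.xml: {len(urls)} URLs")
```

Each string returned by `build_sitemap` can be saved as its own file and submitted separately, so indexation can be checked one section at a time.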
Cheers.
Paddy
-
Hi Patrick,
Thanks for replying.
Can you recommend any tools for creating the sitemap? I've had a look around, and the few I've found all seem to deliver different results. One has been submitted previously, but I need to go through the process myself so I can understand these basics.
I've read up on robots.txt, so I understand what is happening there from an exclusion perspective, and once I understand how the XML sitemap works I'll be able to do the audit as mentioned above.
Ben
-
Ben,
You can check a couple things:
- Have you submitted your XML sitemap to Google? If not, create one and get it submitted so you tell Google what pages you want indexed.
- Submit your domain and all pages through Google Webmaster Tools as well (Login > left sidebar > Crawl > Fetch as Google)
- Screaming Frog is awesome software, so yes, if you have it, use it to scan your pages
- Try and do a simple "site:domainname.com" search in Google to see what is being indexed from your domain
Cross-reference it all and you will then have a better understanding. I do believe your sitemap is crucial in telling Google which pages you want indexed. They treat it as a strong hint rather than a guarantee, and keeping a page out of the index takes a noindex tag rather than leaving it out of the sitemap. You're on the right track and I hope my input was helpful! - Patrick
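The cross-referencing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes you have one list of URLs from a crawl (for example a Screaming Frog export) and one list of URLs you have confirmed as indexed (for example collected from a `site:` search). The `normalize` rule, the function names and the `example.com` URLs are hypothetical, not part of any tool mentioned here.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Lower-case the host and drop trailing slashes and fragments.

    This is an assumed normalization so that https://Example.com/about/
    and https://example.com/about compare as the same page.
    """
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme}://{parts.netloc.lower()}{path}{query}"


def compare(crawled, indexed):
    """Split URLs into indexed, not indexed, and indexed-but-not-crawled."""
    crawled_set = {normalize(u) for u in crawled if u.strip()}
    indexed_set = {normalize(u) for u in indexed if u.strip()}
    return {
        "indexed": sorted(crawled_set & indexed_set),
        "not_indexed": sorted(crawled_set - indexed_set),
        "indexed_but_not_crawled": sorted(indexed_set - crawled_set),
    }


if __name__ == "__main__":
    # Hypothetical sample data standing in for the two exports.
    crawled = [
        "https://Example.com/",
        "https://example.com/about/",
        "https://example.com/blog/post-1",
    ]
    indexed = [
        "https://example.com",
        "https://example.com/about",
        "https://example.com/old-page",
    ]
    for label, urls in compare(crawled, indexed).items():
        print(label, urls)
```

The `indexed_but_not_crawled` bucket is worth a look too: it tends to surface old or orphaned pages that Google still knows about but that your crawl could not reach.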