Investigating a huge spike in indexed pages
-
I've noticed an enormous spike in the number of indexed pages reported by Google Webmaster Tools (WMT) in the last week. Now, I know WMT can be a bit (OK, a lot) off base in its reporting, but this was pretty hard to explain. See, we're in the middle of a huge campaign against duplicate content and we've put a number of measures in place to fight it. For example, we have:
-
Implemented a strong canonicalization effort
-
Programmatically NOINDEX'd content we know to be duplicate
-
Started fixing true duplicate content issues by rewriting titles, descriptions, etc.
So I was pretty surprised to see the blow-up. Any ideas as to what else might cause such a counterintuitive trend? Has anyone else seen Google suddenly glom onto a bunch of phantom pages?
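One way to confirm that the canonical and NOINDEX measures listed above are actually present in the served HTML is to parse a page and read the two signals back. The following is only a minimal sketch using Python's standard library; the `IndexingSignals` class, the `indexing_signals` helper, and the sample markup are hypothetical names made up for illustration, not anything from the site in question.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Collects the canonical URL and meta robots directives from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = set()

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. <meta name>), so normalise to "".
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href")
        elif tag == "meta" and attrs.get("name", "").lower() == "robots":
            # The content attribute looks like "noindex, follow".
            parts = attrs.get("content", "").lower().split(",")
            self.robots.update(p.strip() for p in parts if p.strip())


def indexing_signals(html):
    """Return the canonical URL (or None) and whether the page is noindexed."""
    parser = IndexingSignals()
    parser.feed(html)
    return {"canonical": parser.canonical, "noindex": "noindex" in parser.robots}


# Hypothetical page source, for illustration only.
sample = """
<html><head>
<link rel="canonical" href="https://www.example.com/widgets">
<meta name="robots" content="NOINDEX, follow">
</head><body></body></html>
"""
print(indexing_signals(sample))
```

This only inspects the HTML. A noindex sent through an `X-Robots-Tag` response header would not show up here and would need a separate check of the response headers.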
-
-
I haven't contacted the forum yet but that's my next step.
Pages indexed: 91k
Blocked by robots.txt: 8.4 million
I don't even know how you could create 8.4 million indexable pages from our content.
-
Have you contacted the Google Webmaster Help forums? This looks like a glitch on Google's side.
How many pages does Mozbot crawl? If the number Mozbot shows is different, then you should either sit and wait until Google removes those indexed pages or start a thread on the forums so someone at Google can give you a hint about what is going on.
-
Any help out there? Since the original question was posted, I've seen some improvement, but even with aggressive canonicalization and noindexing, I'm still seeing a boatload of indexed pages. That includes pages I've explicitly asked to be omitted via robots.txt (/search.aspx and */filter). I'm guessing it's just going to take a while to deindex what's there. Still, 91k pages indexed is quite a lot when you consider we only have about 3-4k pages and some articles.
Is anyone aware of any significant releases by Google?
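To sanity-check which URLs the two robots.txt patterns mentioned above (/search.aspx and */filter) would cover, here is a rough sketch of Google-style wildcard matching, where `*` matches any run of characters and a trailing `$` anchors the end of the path. The `rule_to_regex` and `is_disallowed` helpers and the test paths are hypothetical, and the sketch ignores `Allow` rules, rule precedence, and user-agent groups, so treat it as an approximation rather than how Google evaluates the file.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule with * and $ wildcards into a regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with "any run of characters".
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path, disallow_rules):
    """True if any Disallow rule matches the path from its start (case-sensitive)."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# The two patterns quoted in the post; the paths below are made up.
rules = ["/search.aspx", "*/filter"]

for path in ["/search.aspx?q=shoes", "/products/shoes/filter/red", "/products/shoes"]:
    print(path, is_disallowed(path, rules))
```

Worth keeping in mind: a Disallow rule stops crawling rather than indexing. A URL that was already indexed can stay in the index after it is blocked, and a noindex tag on a blocked page cannot be seen because the page is never fetched.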
-
Quite recent. We were actually seeing a nice downward trend in the huge number of pages indexed and then the number tripled. Crazy is an understatement. I would have thought the number of pages would fall given the number of pages that now use canonicals.
-
How long has it been since you applied all the rules to avoid duplicate content? If it was just recently, Google is probably still "rebuilding" its index of your site, and the stats may be a little crazy while that is happening.
If it was over 2 months ago and you are only seeing the increase now, then I'd suggest reviewing the rules you created to check whether your own website is creating all those new pages.
Hope that helps.
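As one way to check whether the site itself is generating the extra pages, you could take a list of crawled or indexed URLs and count how many distinct URLs collapse onto the same path once query strings are dropped. This is only a sketch: the `variants_per_path` helper and the example URLs are hypothetical, and a real list would come from a crawler export or server logs.

```python
from collections import Counter
from urllib.parse import urlsplit


def variants_per_path(urls):
    """Count distinct URLs per (host, path), ignoring query strings and fragments."""
    counts = Counter()
    for url in set(urls):  # de-duplicate exact repeats first
        parts = urlsplit(url)
        counts[(parts.netloc.lower(), parts.path)] += 1
    return counts


# Hypothetical URLs, for illustration only.
crawled = [
    "https://www.example.com/search.aspx?q=red",
    "https://www.example.com/search.aspx?q=blue",
    "https://www.example.com/search.aspx?q=blue&page=2",
    "https://www.example.com/articles/widgets",
]

# Paths with the most URL variants are the likeliest sources of phantom pages.
for (host, path), n in variants_per_path(crawled).most_common():
    print(n, host + path)
```

Paths that show hundreds or thousands of variants are the first places to look for search, filter, or pagination parameters that multiply the page count.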