Index bloating issue
-
Hello,
In the last month, I noticed a huge spike in the number of pages indexed on my site, which I think is impacting my SEO quality score.
While I only have about 90 pages in my sitemap, the number of pages indexed jumped to 446, with about 536 pages being blocked by robots. At first we thought this might be due to duplicate product pages showing up in different categories on my site, so we added a rule to our robots.txt file to keep those pages from being indexed. But the number has not gone down. I've tried to consult with our hosting vendor, but no one seems to be concerned or to have any idea why there was such a big jump in the last month.
Any insights or pointers would be so greatly appreciated, so that I can fix/improve my SEO as quickly as possible!
Thanks!
-
To determine whether your website is hacked, this is one of the best tools I know of, both to find out and to remove the malware.
To determine whether or not you have on-site SEO problems on a very technical and granular scale, I would use
https://www.deepcrawl.com/ At $80 a month, you cannot go wrong.
Another amazing tool, which is free for the first 500 pages and only about $150 a year if you want the added features (which you do) or more pages, is
-
Thank you. These are helpful suggestions.
-
A couple of things to note:
- As Robert mentioned, I would definitely make sure there is no longer an issue on your WordPress site relating to your previous hack.
- A robots.txt disallow does not stop pages from being indexed. It merely tells search engines to stop crawling those pages from here on out. The meta noindex tag is the better fit for removing pages that are already in the index. Keep in mind that a crawler has to be able to fetch a page to see its noindex tag, so a page that is disallowed in robots.txt will never have its noindex read.
- I would check your Search Console crawl errors to see if there's a hefty spike in 404 errors as well, as those may be old spam pages you removed from the site.
- If the pages bloating your index are all old spam-filled pages from when you were hacked, you could start by using Search Console's "Remove URLs" tool, which removes those URLs from the index temporarily. For a longer-term fix, instead of letting removed pages return a 404, have the server return a 410 response. That tells Google the pages are gone for good, and they will drop out of the index over time.
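The noindex and 410 points above can be sketched in Python. This is only an illustration under assumptions, not anyone's actual setup: the paths in `REMOVED_PATHS` and `NOINDEX_PREFIXES` and the names `response_for` and `Handler` are hypothetical. It answers 410 for pages that were deleted on purpose, and serves duplicate category pages normally but with an `X-Robots-Tag: noindex` header, which is the HTTP-header counterpart of the meta noindex tag.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical examples; replace with your own URL lists.
REMOVED_PATHS = {"/cheap-pills.html", "/spam/page-1.html"}
NOINDEX_PREFIXES = ("/category/",)


def response_for(path):
    """Return (status_code, extra_headers) for a requested path."""
    if path in REMOVED_PATHS:
        # 410 Gone: the page was removed on purpose and will not come back.
        return 410, {}
    if path.startswith(NOINDEX_PREFIXES):
        # Still served to visitors, but search engines are asked not to index it.
        return 200, {"X-Robots-Tag": "noindex"}
    return 200, {}


class Handler(BaseHTTPRequestHandler):
    """Minimal request handler that applies response_for to every GET."""

    def do_GET(self):
        status, headers = response_for(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"Gone\n" if status == 410 else b"OK\n")
```

To try it locally, pass `Handler` to `http.server.HTTPServer`. On a real site the same logic would normally live in the web server or CMS configuration rather than in a script like this.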
-
When I do the search for my main URL, the results are clean. Just the pages on my site show up. Yet the index count for this site is still bloated. However, my WordPress site, which is a subdomain and on a different platform from my main site, does have some issues (it was hacked, as Rob noted below). We have since cleaned up the pages, re-uploaded the sitemaps, etc. So I'm a little stumped on my main site (which wasn't hacked, as far as I'm aware).
-
What do you see if you do a search for site:yoursite.com ?
-
Hello Julie,
This sounds like you might have a hacking issue on your website. You probably need someone to conduct a full code audit of your site to determine whether any files you have uploaded (plugins, for example) were contaminated. If a site is hacked, new pages can be added that are hidden from view and difficult to detect unless handled by a security specialist.
We recently brought on a new client who had this issue and discovered that his site had thousands of pages dedicated to testosterone pills, etc. We had to go through GWT and the site logs to determine what new pages had been created, and it was a complete hack job.
In terms of fixing your SEO, the first step is to determine where/if the hack exists. Once that is decided, you have to clean up the site and restore the site's security.
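One way to start that check is to compare the URLs your server has actually been serving against the URLs you expect to exist. Here is a minimal Python sketch, assuming an access log in the common/combined log format and a standard XML sitemap. The sample data, dates, and function names are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Matches the request and status part of a log line,
# e.g. "GET /some/page HTTP/1.1" 200
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')


def sitemap_paths(sitemap_xml):
    """Return the set of URL paths listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return {urlparse(loc.text.strip()).path for loc in locs}


def served_paths(log_lines):
    """Return the set of paths that were answered with a 200 status."""
    paths = set()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2) == "200":
            # Drop the query string so /page?ref=x counts as /page.
            paths.add(urlparse(match.group(1)).path)
    return paths


def unexpected_paths(sitemap_xml, log_lines):
    """Paths served successfully that the sitemap does not list."""
    return sorted(served_paths(log_lines) - sitemap_paths(sitemap_xml))


# Hypothetical sample data.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/products/widget</loc></url>
</urlset>"""

SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 5120',
    '203.0.113.5 - - [01/Jan/2024:10:00:02 +0000] "GET /products/widget?ref=home HTTP/1.1" 200 2048',
    '203.0.113.9 - - [01/Jan/2024:10:01:00 +0000] "GET /wp-content/cache/pills-online.html HTTP/1.1" 200 900',
    '203.0.113.9 - - [01/Jan/2024:10:01:05 +0000] "GET /old-page HTTP/1.1" 404 150',
]

print(unexpected_paths(SAMPLE_SITEMAP, SAMPLE_LOG))
# prints ['/wp-content/cache/pills-online.html']
```

Legitimate files that are not in the sitemap (images, stylesheets, feeds) will show up in the result too, so treat the output as a list to review by hand rather than as proof of a hack.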
I would be happy to help you with the next steps if you would like. I am always available!
Thanks and best of luck,
Rob