What would be considered a bad ratio to determine Index Bloat?
-
I am using Annie Cushing's most excellent site audit checklist from Google Docs. My question concerns Index Bloat because it is mentioned in her "Index" tab.
We have 6,595 indexed pages, and only 4,226 of those pages have received one or more visits since January 1, 2013.
Is this an acceptable ratio? If not, why not and what would be an acceptable ratio? I understand the basic concept that "dissipation of link juice and constrained crawl budget can have a significant impact on SEO traffic." [Thanks to Reid Bandremer http://www.lunametrics.com/blog/2013/04/08/fifteen-minute-seo-health-check/#sr=g&m=o&cp=or&ct=-tmc&st=(opu%20qspwjefe)&ts=1385081787]
If we make this an action item I'd like to have some idea how to prioritize it compared to other things that must be done. Thanks all!
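For reference, here is the arithmetic behind those two numbers. This is only a back-of-the-envelope sketch: the counts are the ones quoted above, and the function and variable names are made up for illustration. It does not say what an "acceptable" ratio is, only what the current one works out to.

```python
# Counts taken from the question; names are illustrative, not from any tool.
indexed_pages = 6595
pages_with_visits = 4226


def index_ratio(indexed, visited):
    """Return (pages with no visits, share of indexed pages with no visits)."""
    unvisited = indexed - visited
    return unvisited, unvisited / indexed


unvisited_pages, unvisited_share = index_ratio(indexed_pages, pages_with_visits)
print(f"{unvisited_pages} of {indexed_pages} indexed pages had no visits "
      f"({unvisited_share:.1%})")
```

So roughly 36% of the indexed pages (2,369) have had no visits in the period.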
-
Hi EGOL,
Wow, thank you so very much. This is one of the best answers I've ever received, probably the best, here in Q & A. Your thoughtful comments and suggestions are so appreciated. Honestly, you gave me a check list of things that have potential to be pure gold for us if we act on them.
Yes, you are correct, this is the site that had many issues with content being under tabs. It also has a tremendous number of duplicate and thin content issues, in addition to orphaned pages. Progress has been coming along, slowly and surely, but your comments, so specific, pointed and concise, are something I can take to my team and say "Here's an awesome checklist of things that we can actually address right now, without re-platforming the site [you know, there are always people who think that the root of all a site's problems is the platform that it's on...pure mythology]."
I hope many others find your check list useful. Combined with Annie's audit spreadsheet in Google docs, I feel like I have the tools I need to go to battle and help this site fulfill its potential. Nearly every point you mentioned struck a chord. Better yet, now that I know my way around the "guts" of this homegrown CMS, I feel like I can actually make the necessary changes.
Egol, I really can't thank you enough.
-
I totally agree, Keri. Every word EGOL wrote is, to me, worth its weight in gold. I think this may be the best response I have ever received here in Q & A.
-
If only people realized how much good information members drop in Q&A...
Once again, thanks for this EGOL!
-
From my experience, that is a frightening number of pages that have not received a visit. I would definitely be taking some type of action. This strikes me as a site in very bad health. I have lots of little pages on a weak little site that have each had far more than zero visits since January. This would be high on my priority list of things to solve. Solving this could bring major income, so this is a potential opportunity as much as it is a problem.
To diagnose, I would check the following. I know you, and I suspect you have already looked at all of these, but I am making a list just in case.
A) Duplicate content problem? Does this site have lots of pages that are very similar to other pages on the same site? Does the company have another site that is running the same product descriptions? Does the site run product descriptions that come from a datafeed supplied to vendors? Are affiliates using the same content? Have other websites stolen the content?
B) Have you been scraped and republished by a strong website? Just one is all it would take. A strong site was once scraping and republishing some of my short content pages, and that killed the traffic into a section of my site. As soon as I asked them to stop, traffic was back within days. One site can hurt you like that, or numerous small sites can; even minor sites in Asia can do this.
C) Lots of thin content? Do you have a lot of pages that might only have two or three unique sentences? Google could be disrespecting your entire site because of this.
D) Technical problem? I would be looking at robots.txt, .htaccess, noindex tags, badly coded links, and a content management system causing duplicated title tags or other problems. Also check for faulty analytics that make it look like these pages are not getting traffic when really they are.
E) Content cannibalization? Lots of separate pages for red widgets that are being filtered from the SERPs.
F) Inadequate linkjuice? This is not a huge site but not a small one. Does it have a nice amount of linkjuice coming in?
G) Does this site have pages that are really deeeeep down in the link structure? Many clicks down? Fix that either with a new link structure or with some kickass powerful links that hit nodes deep in the site to force spiders down. I would solve it with link structure.
H) This isn't the site that had all of the content behind tabs that I remember from a while ago? (My memory is really bad so it might not even be your site.) If you have pages like that I would get rid of those tabs immediately. I have a personal opinion that Google does not treat content hidden behind tabs as well as content that is out in the open.
I) Are there a lot of other sites, strong ones, publishing very similar pages (like product description pages) and competing for the same keywords? If that is the case, you could be crowded out of the SERPs and receiving no traffic on these pages.
J) Does this site have a bad history? Does it have something that might be causing a penalty or filtering?
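For point G, and for the orphaned pages that often go with it, a rough way to measure click depth is a breadth-first walk over the internal links. The sketch below assumes you already have a crawl export that can be loaded as a dictionary of page to linked pages; the `site_links` data and the threshold of three clicks are invented for illustration and are not a recommendation.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search over an internal-link graph.

    links: dict mapping a URL to the list of URLs it links to.
    Returns a dict of URL -> minimum number of clicks from start.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first time reached = shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical toy site, standing in for a real crawl export.
site_links = {
    "/": ["/widgets", "/about"],
    "/widgets": ["/widgets/red"],
    "/widgets/red": ["/widgets/red/large"],
    "/orphan": ["/"],  # links out, but nothing links to it
}

depths = click_depths(site_links, "/")
deep_pages = sorted(url for url, d in depths.items() if d >= 3)
orphans = sorted(set(site_links) - set(depths))
print("Deep pages:", deep_pages)
print("Not reachable from the home page:", orphans)
```

Pages that show up as deep or unreachable are the ones a flatter link structure would help.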
After doing all of that you might have something that is really worth fixing. If you can't identify the problem I would be slashing, hatcheting those pages from the site right away.
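Before slashing anything, it helps to have the actual list of zero-visit pages in hand. A minimal sketch, assuming you can export one list of indexed URLs and one list of URLs with at least one visit: the sample data and the `normalize` helper are hypothetical, and real exports will likely need more cleanup (query strings, protocol, host). As noted in D above, rule out faulty analytics first.

```python
def normalize(url):
    """Trim whitespace and a trailing slash so the two exports line up."""
    return url.strip().rstrip("/") or "/"


def pages_without_visits(indexed_urls, visited_urls):
    """Return indexed URLs that never appear in the visited list."""
    visited = {normalize(u) for u in visited_urls}
    return sorted({normalize(u) for u in indexed_urls} - visited)


# Hypothetical sample data standing in for the two exports.
indexed = ["/", "/widgets/", "/widgets/red", "/widgets/blue", "/old-page"]
visited = ["/", "/widgets", "/widgets/red/"]

candidates = pages_without_visits(indexed, visited)
print(candidates)  # pages to review, improve, merge, or remove
```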