Anyone managed to decrease the "not selected" graph in WMT?
-
Hi Mozzers.
I am working with a very large E-com site that has a big issue with duplicate or near duplicate content. The site actually received a message in WMT listing out pages that Google deemed it should not be crawling. Many of these were the usual pagination / category sorting option URL issues etc.
We have since fixed the issue with a combination of site changes, robots.txt, parameter handling and URL removals, however I was expecting the "not selected" graph in WMT to start dropping.
The number of roboted pages has increased by around 1 million pages (which was expected) and indexed pages has actually increased despite removing hundreds of thousands of pages. I assume this is due to releasing some crawl bandwidth for more important pages like products.
I guess my question is two-fold:
1. Is the "not selected" graph cumulative, as this would explain why it isn't dropping?
2. Has anyone managed to get this figure to significantly drop? Should I even care? I am relating this to Panda by the way.
Important to note that the changes were made around 3 weeks ago and I am aware not everything will be re-crawled yet.
Thanks,
Chris -
Very interesting. I'm also convinced the "not selected" graph is a big clue towards a Panda penalty. I guess I will have to wait another couple of weeks to see if our changes have affected the graph. Maybe this time lag is why it can take upwards of 6 months to recover from Panda!
-
Hi Chris
Here is some good information about the "Not Selected" data in WMT. I hope this post helps you understand the Not Selected graph: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=2642366
-
The "Not Selected" count isn't cumulative. The "Ever Crawled" count is, though.
I have a large WordPress content site. It was hit by Panda on the very same day that my "not selected" count multiplied by 8. I don't think it was a coincidence, and I didn't make any large changes to the site besides the regular addition of about 10 posts per week.
I've been able to effect a downward movement in the not selected count by removing/redirecting things like "replytocom" variable URLs in the comments section, reworking print and email versions of each article, etc. It's very slow, though, only reducing by an average of 100 per week.
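For anyone tackling the same thing, here's a minimal sketch of the kind of redirect rule I mean, assuming Apache and WordPress's standard replytocom parameter (adapt to your own setup):

```apache
# Sketch only: 301-redirect any URL carrying a ?replytocom= parameter
# back to the clean post URL. The trailing "?" in the target drops the
# query string entirely.
<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteCond %{QUERY_STRING} (^|&)replytocom= [NC]
  RewriteRule ^(.*)$ /$1? [R=301,L]
</IfModule>
```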
Needless to say, I think the not selected metric means quite a lot.
Related Questions
-
Google indexed "Lorem Ipsum" content on an unfinished website
Hi guys. So I recently created a new WordPress site and started developing the homepage. I completely forgot to disallow robots to prevent Google from indexing it and the homepage of my site got quickly indexed with all the Lorem ipsum and some plagiarized content from sites of my competitors. What do I do now? I’m afraid that this might spoil my SEO strategy and devalue my site in the eyes of Google from the very beginning. Should I ask Google to remove the homepage using the removal tool in Google Webmaster Tools and ask it to recrawl the page after adding the unique content? Thank you so much for your replies.
Intermediate & Advanced SEO | Ibis150
-
Anyone have a good process for Schema.org auditing?
I am looking to do a Schema.org audit across a large number of websites that all run on the same platform. I'm not really sure where to start and what format to use for a deliverable. I suppose starting by checking for errors on the current schema and documenting them, and then moving on to additional schema that could be added to the JSON-LD? In my last structured data audit I just used a spreadsheet, and it didn't come out as neat as I would have liked. Anyone who has some experience in this, your input would be much appreciated!
Intermediate & Advanced SEO | MJTrevens0
-
When "pruning" old content, is it normal to see a drop in Domain Authority on the Moz crawl report?
After reading several posts about the benefits of pruning old, irrelevant content, I went through a content audit exercise to kick off the year. The biggest category of changes so far has been to noindex + remove from sitemap a number of blog posts from 2015/2016 (which were very time-specific, i.e. software release details). I assigned many of the old posts a new canonical URL pointing to the parent category. I realize it'd be ideal to point to a more relevant/current blog post, but could this be where I've gone wrong? Another big change was to hide the old posts from the archive pages on the blog. Any advice/experience from anyone doing something similar much appreciated! Would be good to be reassured I'm on the right track and a slight drop is nothing to worry about. 🙂 If anyone is interested in having a look: https://vivaldi.com https://vivaldi.com/blog/snapshots [this is the category where changes have been made, primarily] https://vivaldi.com/blog/snapshots/keyboard-shortcut-editing/ [example of a pruned post]
Intermediate & Advanced SEO | jonmc1
-
Redirect old "not found" url (at http) to new corresponding page (now at https)
My least favorite part of SEO 😉 I'm trying to redirect an old URL that no longer exists to our new website, which is built with https. The old URL: http://www.thinworks.com/palm-beach-gardens-team/ New URL: https://www.thinworks.com/palm-beach-gardens/ This isn't working with my standard process of the quick redirection plugin in WP or through htaccess, because the old site URL is at http and not https. Any help would be much appreciated! How do I accomplish this, where do I do it, and what's the code I'd use? Thank you Moz community! Ricky
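Not an authoritative answer, but a sketch of the usual .htaccess approach, assuming Apache: redirect rules match the request path, not the protocol, so the old page being http doesn't change the rule; you just point the target at the https URL.

```apache
# .htaccess on www.thinworks.com (sketch): the pattern matches the path
# of the old URL regardless of protocol; the target is the new HTTPS page.
Redirect 301 /palm-beach-gardens-team/ https://www.thinworks.com/palm-beach-gardens/
```

One caveat: if the http and https sites are served by separate vhosts, the rule needs to live in (or be inherited by) the http vhost's configuration.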
Intermediate & Advanced SEO | SUCCESSagency0
-
"Hot Desk" type office space to establish addresses in multiple locations
Hello Mozzers, I'm noticing increasing numbers of clients' competitors getting physical addresses and phone numbers in multiple locations, no doubt partly for SEO purposes. These are little more than ghost presences (in hot desk style office space) and the phone numbers are simply diverted. Do such physical addresses put them at an SEO advantage (over and above those who don't have hot desk style space and location phone numbers). Or does Google weed out hot desk type office spaces where they can? Your thoughts/experience would be very welcome! Thanks in advance, Luke
Intermediate & Advanced SEO | McTaggart0
-
De-indexing product "quick view" pages
Hi there, The e-commerce website I am working on seems to index all of the "quick view" pages (which normally occur as iframes on the category page) as their own unique pages, creating thousands of duplicate pages / overly-dynamic URLs. Each indexed "quick view" page has the following URL structure: www.mydomain.com/catalog/includes/inc_productquickview.jsp?prodId=89514&catgId=cat140142&KeepThis=true&TB_iframe=true&height=475&width=700 where the only thing that changes is the product ID and category number. Would using "disallow" in Robots.txt be the best way to de-indexing all of these URLs? If so, could someone help me identify how to best structure this disallow statement? Would it be: Disallow: /catalog/includes/inc_productquickview.jsp?prodID=* Thanks for your help.
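As a sketch (not a definitive answer), matching the shared path prefix in robots.txt would cover every product/category combination without needing wildcards in the parameters:

```
User-agent: *
# Blocks every quick-view URL, since they all share this path prefix;
# the query string does not need to be matched.
Disallow: /catalog/includes/inc_productquickview.jsp
```

One caveat: Disallow stops crawling but doesn't drop URLs already in the index; a noindex directive or a URL removal request handles the already-indexed ones.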
Intermediate & Advanced SEO | FPD_NYC0
-
My website hasn't been cached for over a month. Can anyone tell me why?
I have been working on an eCommerce site www.fuchia.co.uk. I asked an earlier question about how to get it working and ranking, and I took on board what people said (such as optimising product pages etc...) and I think I'm getting there. The problem I have now is that Google hasn't indexed my site in over a month, and the homepage cache is 404'ing when I check it on Google. At the moment there is a problem with the site being live for both WWW and non-WWW versions; I have told Google in Webmaster Tools what preferred domain to use and will also be getting developers to 301 to the preferred domain. Would this be the problem stopping Google properly indexing me? Also, I'm only having around 30 pages of 137 indexed from the last crawl. Can anyone tell me or suggest why my site hasn't been indexed in such a long time? Thanks
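For the www/non-www consolidation mentioned in the question, a minimal .htaccess sketch, assuming Apache and that the www version is preferred (the domain is taken from the question; swap the direction if non-www is preferred):

```apache
RewriteEngine On
# 301 any request for the bare domain to the www version.
RewriteCond %{HTTP_HOST} ^fuchia\.co\.uk$ [NC]
RewriteRule ^(.*)$ http://www.fuchia.co.uk/$1 [R=301,L]
```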
Intermediate & Advanced SEO | SEOAndy0
-
What does "base" link mean here?
On http://www.google.com/support/webmasters/bin/answer.py?answer=139394, it says: rel="canonical" can be used with relative or absolute links, but we recommend using absolute links to minimize potential confusion or difficulties. If your document specifies a base link, any relative links will be relative to that base link. Where would a document specify a base link? And how?
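To the "where and how": a base link is a single `<base>` tag in the document's `<head>`. A minimal sketch (example.com is a placeholder):

```html
<head>
  <!-- All relative URLs in this document now resolve against this base. -->
  <base href="https://www.example.com/products/">
  <!-- Per the quoted guidance, a relative canonical like "widgets/"
       would resolve to https://www.example.com/products/widgets/ -->
  <link rel="canonical" href="widgets/">
</head>
```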
Intermediate & Advanced SEO | nicole.healthline0