I have removed 2000+ pages but Google still says I have 3000+ pages indexed

apogeecorp

Good Afternoon,

I run an office equipment website called top4office.co.uk.

My predecessor decided to make an exact copy of the content on our existing site, top4office.com, and place it on the top4office.co.uk domain. That copy included over 2k thin pages.

Since coming in I have hired a copywriter who has rewritten all the important content, and I have removed over 2k thin pages.

I have set up 301s and blocked the thin pages using robots.txt, and then used Google's removal tool to remove the pages from the index, which was done successfully.

But although they were removed and can no longer be found in Google, when I use site:top4office.co.uk I still have over 3k indexed pages (originally I had 3700).

Does anyone have any ideas why this is happening and, more importantly, how I can fix it?

Our ranking on this site is woeful in comparison to what it was in 2011. I have a deadline and was wondering how quickly, in your opinion, all these changes will impact my SERP rankings?

Look forward to your responses!

CleverPhD

I agree with DrPete. You can't have the pages blocked in robots.txt, otherwise Google will not crawl the pages and "see" the 301s to then update the index.

Something else to consider: on the new pages, have them canonical to themselves. We had a site where Google was caching old URLs even though their 301 redirects had been up for 2 years. Google was finding the new pages, new titles and new content, but was still referencing the old URLs. We were seeing this in the SERPs and also in GWT, which was reporting duplicate titles and descriptions for sets of pages that were 301ed. Adding the canonical to self helped get that cleaned up.

Cheers.

Dr-Pete

This process can take a painfully long time, even done right, but I do have a couple of concerns:

(1) Assuming I understand the situation, I think using Robots.txt on top of 301-redirects is a bad idea. If Google doesn't recrawl the pages, they won't process the 301s, and Robots.txt is bad for removal (good for prevention, but not once something is in the index). Basically, you're telling Google not to re-crawl these pages, and if they don't re-crawl, they won't process the 301s. So, I'd drop the Robots.txt blocking for now, honestly.
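A rough way to check which redirected URLs are still blocked is sketched below in Python, using the standard library's robots.txt parser. The robots.txt rules and the URL paths here are made up for illustration; substitute the real file and the real list of 301ed URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file on the site may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-thin-pages/
"""

def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that robots.txt stops the given crawler from fetching."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]

# Hypothetical URLs that now 301 elsewhere.
redirected = [
    "http://top4office.co.uk/old-thin-pages/desk-123",
    "http://top4office.co.uk/office-chairs",
]

for url in blocked_urls(ROBOTS_TXT, redirected):
    print("Blocked, so the 301 will not be crawled:", url)
```

Any URL this prints has a redirect that Google cannot reach while the Disallow rule is in place.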

(2) What's your internationalization strategy? You could potentially try rel="alternate"/hreflang to specify US vs. UK English, target each domain in webmaster tools, and leave the duplicates alone. If you 301-redirect, you're not giving the UK site a chance to rank properly on Google.co.uk (if that's your objective).
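As a minimal sketch of the hreflang idea, the Python below builds the pair of link tags that would go in the head of each matching page on both domains. The "www" hostnames, the http scheme, the en-us/en-gb codes and the example path are assumptions for illustration, not details confirmed in this thread.

```python
from html import escape

# Assumed mapping of language-region codes to site roots (hypothetical hostnames).
DOMAINS = {
    "en-us": "http://www.top4office.com",
    "en-gb": "http://www.top4office.co.uk",
}

def hreflang_links(path, domains):
    """Build rel="alternate" link tags for a page that exists at the same path on every domain."""
    return [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(base + path)}" />'
        for lang, base in domains.items()
    ]

# Hypothetical page path; both versions of the page should carry the full set of tags.
for tag in hreflang_links("/office-chairs", DOMAINS):
    print(tag)
```

Each version of a page lists every alternate including itself, so the same two tags appear on both the .com and the .co.uk copy.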

Kurt_Steinbrueck

It sounds like you have done pretty much everything you could do to remove those pages from Google, and that Google has removed them.

There are two possibilities that I can think of. First, Google may be finding new pages, or at least new URLs. These may be old pages with some sort of parameter on them, which causes Google to find "new" pages even though you're not adding any.

The other possibility is that the site: search is not entirely accurate. Like most numbers Google gives us, it's more of an estimate than the actual figure. It's possible that the original 3700 was lower than the number of pages Google really had, and that now they're just reporting more of the pages that were already in their index but weren't being shown before.

By the way, when I do a search for site:top4office.co.uk, I only get 2600 results.
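If you can export a list of indexed or crawled URLs, a quick way to spot parameter variants is to strip the query string and group what is left. This is only a sketch in Python; the sample URLs and parameter names are invented for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit

def group_by_base_url(urls):
    """Group URLs that differ only by query string or fragment; keep groups with more than one variant."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        groups[base].append(url)
    return {base: variants for base, variants in groups.items() if len(variants) > 1}

# Hypothetical sample of URLs; replace with a real export.
sample = [
    "http://top4office.co.uk/office-chairs",
    "http://top4office.co.uk/office-chairs?sort=price",
    "http://top4office.co.uk/office-chairs?page=2",
    "http://top4office.co.uk/desks",
]

for base, variants in group_by_base_url(sample).items():
    print(base, "has", len(variants), "URL variants")
```

Pages with many variants are the likeliest source of extra indexed URLs.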

apogeecorp

I no longer see the pages. No chance Google has seen any additional pages as we spend every day looking at new pages indexed by using the filter and site:top4office.co.uk.

Any ideas?

HiveDigitalInc

Just a quick question, do you see the URLs you "removed" still in the index? Or is it possible that Google has found a different set of 3000 URLs on your site?
