Sudden Increase In Number of Pages Indexed By Google Webmaster When No New Pages Added
-
Greetings MOZ Community:
On June 14th, Google Webmaster Tools showed an increase in the number of indexed pages, from 676 to 851. No new pages had been added to the domain in the previous month. The number of pages blocked by robots increased over the same period, from 332 (June 1st) to 551 (June 22nd), yet the number of indexed pages still rose to 851.
The following changes occurred between June 5th and June 15th:
- A redesigned version of the site was launched on June 4th, with some links to social media and the blog removed on some pages, but with no new URLs added. The design platform was and is WordPress.
- Google Tag Manager (GTM) code was added to the site.
- Our hosting company made an exception to ModSecurity on our server (for iframes) to allow GTM to function.
In the last ten days my web traffic has declined about 15%. However, the quality of traffic has declined enormously, and the number of new inquiries we get is off by around 65%. Pages viewed per visit have declined from about 2.55 to about 2.
Obviously this is not a good situation.
My SEO provider, a reputable firm endorsed by MOZ, believes the extra 175 pages indexed by Google, pages that do not offer much content, may be causing the ranking decline.
My developer is examining the issue. They think there may be some tie-in with the installation of GTM. They are noticing an additional issue: the site's Contact Us form will not work if the GTM script is enabled. They find it curious that both issues occurred around the same time.
Our domain is www.nyc-officespace-leader.com. Does anyone have any idea why these extra pages are appearing and how they can be removed? Does anyone have experience with GTM causing issues like this?
Thanks everyone!!!
Alan -
Yes, and I appreciate it!
Alan -
I did what I asked you to do in my first post and repeated frequently.
-
Hi Egol:
How did you locate this duplicate or re-published content?
Obviously what you have pointed out is a major source of concern, so I ran a Copyscape search this afternoon for duplicate content and did not locate any of the URLs you mention in the "this" and "this" links above. It appears you entered the URL of the blog post in Google's search bar. Would that work? This method would be pretty slow going with 600 URLs.
Thanks,
Alan -
Those are the 448 URLs from your website that have been filtered.
You should find garbage in them like that shown below.
Have you done what I have suggested three times above? Do that if you want to identify the problem pages.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
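Results like the three above are what Google shows for a URL it knows about but is not allowed to crawl: robots.txt stops crawling, not indexing, so a blocked URL can still be listed without a description. A rough sketch for checking which URLs a given set of rules blocks, using Python's standard `urllib.robotparser`. The rules and the example.com URLs are assumptions for illustration; the site's real robots.txt is not shown in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the site's real robots.txt is not shown in this thread.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-content/plugins/
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URLs on example.com, not real pages from the site.
    for url in (
        "http://www.example.com/wp-content/plugins/form.js",
        "http://www.example.com/listings/",
    ):
        print(url, "crawlable" if is_crawlable(ASSUMED_ROBOTS_TXT, url) else "blocked")
```

If the blocked URLs should drop out of the index entirely, the usual route is to let them be crawled and serve a noindex directive, since Google cannot see a noindex on a page it is not allowed to fetch.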
-
-
Hi Egol:
Thanks for the suggestion.
When I click on "repeat the search with the omitted results included" I get 448 results, not the entire 859. That seems very strange. Some of these URLs have light content, but I don't believe they are dupes. I don't see any content outside our website when I click this.
Am I doing something wrong? I would think the total of 859 would appear, not 447 URLs.
Thanks!!
Alan -
I don't know. You should ask someone who knows a lot about canonicalization.
Did you drill down through all of those indexed pages to see if you can identify all of them?
I've suggested it twice.
-
Hi Egol:
In the context of launching an upgraded site, could the canonicalization have been implemented incorrectly? That could account for the sudden appearance of 175 new pages, as the thin content has been there for some time.
I am particularly suspicious regarding canonicalization, as there was an issue involving multi-page URLs of property listings when the site was migrated from Drupal to WordPress last summer.
Thoughts?
Thanks, Alan
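One way to test the canonicalization theory is to pull the rel="canonical" tag out of each page's HTML and compare it with the URL you expect. A minimal sketch using only Python's standard `html.parser`; the sample HTML and the example.com URL are made up for illustration, and fetching the pages is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return all canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


if __name__ == "__main__":
    # Hypothetical page source; a real check would load each indexed URL.
    sample = (
        "<html><head>"
        '<link rel="canonical" href="http://www.example.com/listing/">'
        "</head><body></body></html>"
    )
    print(find_canonicals(sample))
```

A page with no canonical, more than one canonical, or a paginated page pointing at itself rather than at the main listing would each be worth a closer look.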
-
Apparently infitter24.rssing.com/chan-13023009/all is poaching my content, taking my original content and adding it to their site. I am not quite sure what to do about that.
You can have an attorney demand that they stop, or you can file DMCA complaints. Be careful.
**However it does not explain the sudden appearance of the 175 pages on Google's index**
-
Do this query: site:www.nyc-officespace-leader.com
-
Start drilling down the SERPs. One page at a time. Look for content that you didn't make. Look for duplicates.
-
Get a spreadsheet that has all of your URLs. Drill down through the SERPs, checking every one of them. Can you account for your pagination? You have a lot of it, and that type of page is usually rubbish in the index. Combine, canonicalize, or get rid of them.
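The spreadsheet check can be roughed out in a few lines of Python: normalize both lists of URLs, then subtract the ones you know about from the ones found in the index. This is only a sketch. The function names are hypothetical, the example.com URLs are placeholders, and it assumes you have already copied the indexed URLs out of the SERPs by hand or from an export.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Drop the scheme, fragment and trailing slash; lower-case the host."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    query = "?" + parts.query if parts.query else ""
    return parts.netloc.lower() + path + query


def unaccounted(known_urls, indexed_urls):
    """Return indexed URLs that do not appear in the list of known URLs."""
    known = {normalize(u) for u in known_urls}
    indexed = {normalize(u) for u in indexed_urls}
    return sorted(indexed - known)


if __name__ == "__main__":
    # Placeholder data; substitute the spreadsheet column and the SERP list.
    known = ["http://www.example.com/listings/", "http://www.example.com/about/"]
    indexed = [
        "www.example.com/listings",
        "www.example.com/tag/midtown/page/2/",
    ]
    for url in unaccounted(known, indexed):
        print(url)
```

Anything it prints is a page Google has that is not on your list, which is where the extra 175 are most likely hiding.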
-
-
Hi Egol:
Thanks so much for taking the time for your thorough response!!
Apparently infitter24.rssing.com/chan-13023009/all is poaching my content, taking my original content and adding it to their site. I am not quite sure what to do about that.
You have pointed out something very useful, and I appreciate it and will act upon it. However, it does not explain the sudden appearance of the 175 pages on Google's index that did not appear at the end of May and somehow coincided with the uploading of the new version of our website in early June. Any ideas?
Thanks,
Alan -
-
Do this query: site:www.nyc-officespace-leader.com
-
Start drilling down the SERPs. One page at a time. Look for content that you didn't make. Look for duplicates.
-
When you drill down about 44 pages you will find this...
In order to show you the most relevant results, we have omitted some entries very similar to the 440 already displayed.
If you like, you can repeat the search with the omitted results included.
The bad stuff is usually behind that link. Google doesn't want to show that stuff to people. It could be thin, it could be duplicate, it could be spammy, they just might not like it.
- Find out what is in there.
Possible problems that I see....
I see dupe content like this and this. Either your guys are grabbin' somebody else's content or they are grabbin' yours. That can get you in trouble with Panda. You need original and unique. Anything that is not original and unique should be deleted, noindexed or rewritten.
A lot of these pages are really skimpy. Thin content can get you into trouble with Panda. Anything that is skimpy should be deleted, noindexed or beefed up.
I see multiple links to tags on lots of these posts. That can cause duplicate content problems.
The tag pages are paginated with just a few posts on each. These can generate extra pages that are low value, suck up your link juice or compound duplicate content problems.
You have archive pages, and category pages and more pagination problems.
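To see how much of the index is made up of these page types, the indexed paths can be sorted into buckets by URL pattern. A rough sketch follows. The patterns assume WordPress's default URL structure (/tag/, /category/, /page/N/, /wp-content/, date archives), which may not match how this site is configured, and the sample paths are made up.

```python
import re
from collections import Counter

# Assumed WordPress-default URL shapes; adjust to the site's real permalinks.
# Order matters: a paginated tag page is counted as "paginated".
PATTERNS = [
    ("plugin_or_theme_file", re.compile(r"^/wp-content/")),
    ("paginated", re.compile(r"/page/\d+/?$")),
    ("tag", re.compile(r"^/tag/")),
    ("category", re.compile(r"^/category/")),
    ("date_archive", re.compile(r"^/\d{4}/(\d{2}/)?$")),
]


def bucket(path):
    """Return the name of the first pattern the URL path matches."""
    for name, pattern in PATTERNS:
        if pattern.search(path):
            return name
    return "other"


def summarize(paths):
    """Count how many paths fall into each bucket."""
    return Counter(bucket(p) for p in paths)


if __name__ == "__main__":
    # Hypothetical paths for illustration only.
    sample = ["/tag/midtown/", "/tag/midtown/page/2/", "/2014/06/", "/about/"]
    for name, count in summarize(sample).most_common():
        print(name, count)
```

Large counts for tag, paginated or archive pages point at the pages to combine, canonicalize or noindex first.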
-