Unnecessary pages getting indexed in Google for my blog
-
I have a blog dapazze.com and I am suffering from a problem for a long time. I found out that Google have indexed hundreds of replytocom links and images attachment pages for my blog.
I had to remove these pages manually using the URL removal tool. I had used "Disallow: ?replytocom" in my robots.txt, but Google disobeyed it. After that, I removed the parameter from my blog completely using the SEO by Yoast plugin.
But now I see that Google has again started indexing these links even after they are not present in my blog (I use #comment). Google have also indexed many of my admin and plugin pages, whereas they are disallowed in my robots.txt file.
Have a look at my robots.txt file here: http://dapazze.com/robots.txt
Please help me out to solve this problem permanently?
-
Me too have the same issue ! but not indexed in the Google ! but URL parameters in Google Webmasters shows there are 5K errors !
Should i use the URL Parameters settings or which one ?
Also make sure replytocom links are not blocked using Robots.txt, as it will stop Google bots from crawling and this your links won’t get deindexed. This is one mistake which I did, and later after removing replytocom parameter from robots.txt file, I was able to get most of my replytocom links deindexed. These are warning by the blogger ! http://www.shoutmeloud.com/how-to-fix-replytocom-links-issue-in-wordpress.html - he showed how to do that ! but my problem is different - It's Good that it's not indexed but i don't want to take any risk ! how to avooid them for future !
Someone else told me here that some plugins are doing/helping for you ! and not seen in your Robot.txt !
Confused confused ! so much confused ! Please help me !
-
Actually previously I had removed the links manually. But I am seeing them come up again even after removing the parameter completely.
Can you please point our the problem for me?
-
Please check that the comment pages are blocked by robots.txt file -
However, the blocked pages are now getting redirected to the main landing page of the blog posts.
Seems like it will take a while for Google to recrawl these pages and sort the issue.
In the mean time, could you please show some pages that are getting indexed by Google.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can't get Google to index our site although all seems very good
Hi there, I am having issues getting our new site, https://vintners.co indexed by Google although it seems all technical and content requirements are well in place for it. In the past, I had way poorer websites running with very bad setups and performance indexed faster. What's concerning me, among others, is that the crawler of Google comes from time to time when looking on Google Search Console but does not seem to make progress or to even follow any link and the evolution does not seem to do what google says in GSC help. For instance, our sitemap.xml was submitted, for a few days, it seemed like it had an impact as many pages were then visible in the coverage report, showing them as "detected but not yet indexed" and now, they disappeared from the coverage report, it's like if it was not detected any more. Anybody has any advice to speed up or accelerate the indexing of a new website like ours? It's been launched since now almost two months and I was expected, at least on some core keywords, to quickly get indexed.
Technical SEO | | rolandvintners1 -
Only fraction of the AMP pages are indexed
Back in June, we had seen a sharp drop in traffic on our website. We initially assumed that it was due to the Core Update that was rolled out in early June. We had switched from http to https in May, but thought that should have helped rather than cause a problem. Until early June the traffic was trending upwards. While investigating the issue, I noticed that only a fraction (25%) of the AMP pages have been indexed. The pages don't seem to be getting indexed even though they are valid. Accordingly to Google Analytics too, the percentage of AMP traffic has dropped from 67-70% to 40-45%. I wonder if it is due to the indexing issue. In terms of implementation it seems fine. We are pointing canonical to the AMP page from the desktop version and to the desktop version from the AMP page. Any tips on how to fix the AMP indexing issue. Should I be concerned that only a fraction of the AMP pages are indexed. I really hope you can help in resolving this issue.
Technical SEO | | Gautam1 -
Why Are Some Pages On A New Domain Not Being Indexed?
Background: A company I am working with recently consolidated content from several existing domains into one new domain. Each of the old domains focused on a vertical and each had a number of product pages and a number of blog pages; these are now in directories on the new domain. For example, what was www.verticaldomainone.com/products/productname is now www.newdomain.com/verticalone/products/product name and the blog posts have moved from www.verticaldomaintwo.com/blog/blogpost to www.newdomain.com/verticaltwo/blog/blogpost. Many of those pages used to rank in the SERPs but they now do not. Investigation so far: Looking at Search Console's crawl stats most of the product pages and blog posts do not appear to be being indexed. This is confirmed by using the site: search modifier, which only returns a couple of products and a couple of blog posts in each vertical. Those pages are not the same as the pages with backlinks pointing directly at them. I've investigated the obvious points without success so far: There are a couple of issues with 301s that I am working with them to rectify but I have checked all pages on the old site and most redirects are in place and working There is currently no HTML or XML sitemap for the new site (this will be put in place soon) but I don't think this is an issue since a few products are being indexed and appearing in SERPs Search Console is returning no crawl errors, manual penalties, or anything else adverse Every product page is linked to from the /course page for the relevant vertical through a followed link. None of the pages have a noindex tag on them and the robots.txt allows all crawlers to access all pages One thing to note is that the site is build using react.js, so all content is within app.js. However this does not appear to affect pages higher up the navigation trees like the /vertical/products pages or the home page. So the question is: "Why might product and blog pages not be indexed on the new domain when they were previously and what can I do about it?"
Technical SEO | | BenjaminMorel0 -
Mobile site not getting indexed
My site is www.findyogi.com - a shopping comparison site The mobile site is hosted at m.findyogi.com I fixed my sitemap and attribution to mobile site in May last week. My mobile site pages are getting de-indexed since then. Website - www.findyogi.com/mobiles/motorola/motorola-moto-g-16gb-b95ef8/price - indexed Mobile - m.findyogi.com/mobiles/motorola/motorola-moto-g-16gb-b95ef8/price - _not indexed. _ Google is crawling my website and mobile site normally. What am I am doing wrong?
Technical SEO | | namansr0 -
How to get found on local google search?
Hey When look for particular local businesses on Google like "Manchester Hotels" for example; at the top of Google page come all business up that are marked on the city map (with A,B,C..), then all the others follow. So my question is: "How can I get my business on that map?". Thank you! Ve
Technical SEO | | MissVe0 -
CDN Being Crawled and Indexed by Google
I'm doing a SEO site audit, and I've discovered that the site uses a Content Delivery Network (CDN) that's being crawled and indexed by Google. There are two sub-domains from the CDN that are being crawled and indexed. A small number of organic search visitors have come through these two sub domains. So the CDN based content is out-ranking the root domain, in a small number of cases. It's a huge duplicate content issue (tens of thousands of URLs being crawled) - what's the best way to prevent the crawling and indexing of a CDN like this? Exclude via robots.txt? Additionally, the use of relative canonical tags (instead of absolute) appear to be contributing to this problem as well. As I understand it, these canonical tags are telling the SEs that each sub domain is the "home" of the content/URL. Thanks! Scott
Technical SEO | | Scott-Thomas0 -
Home page deindexed by google
when I search my website on google by site:www.mydomain.com I have found my domain with www has been de-indexed by google, but when I search site:mydomain.com, my home page--**mydomain.com **show up on the search results without www, put it simple, google only index my domain without www, I wonder how to make my domain with www being indexed, and how to prevent this problem occure again.
Technical SEO | | semer0 -
The number of pages indexed on Bing DROPPED significantly.
I haven't signed in to bing webmaster tool for a while. and I found that Bing is not indexing my site properly all of a sudden. IT DROPPED SIGNIFICANTLY Any idea why it is behaving this way? (please check the attachment) INg1o.png
Technical SEO | | joony20080