Why are so many pages indexed?
-
We recently launched a new website and it doesn't consist of that many pages. When you do a "site:" search on Google, it shows 1,950 results. Obviously we don't want this to be happening. I have a feeling it's effecting our rankings. Is this just a straight up robots.txt problem? We addressed that a while ago and the number of results aren't going down. It's very possible that we still have it implemented incorrectly. What are we doing wrong and how do we start getting pages "un-indexed"?
-
What's to stop google from finding them? They're out there and available on the internet!
Block or remove pages using a robots.txt file
You can do this by putting:
User-agent: * Disallow: /
in the robots.txt file.
You might also want to stop humans from accessing the content too - can you put this content behind a password using htaccess or block access based on network address?
-
Sounds like you need to put a robots.txt on those subdomains (and maybe consider some type of login too).
Quick fix: put a robots.txt on the subdomains to block them from being indexed. Go into Google Webmaster Tools and verify each subdomain as its own site, then request removal of each of those subdomains (which should be approved, since you've already blocked it in robots.txt).
I took a quick look at lab.capacity.com/robots.txt and it isn't blocking the entire subdomain, though the robots.txt at fb.capacitr.com is.
-
I most certainly do not want those pages indexed, they're used for internal purposes only. That's exactly what I'm trying to figure out here. Why are those subdomains being indexed? They should obviously be private. Any insights would be great.
Thanks!
-
What are are you searching for? I notice that if you do a site:.capacitr.com you get the 1,950 results you mention above.
If you do a search for site:www.capacitr.com then you only get 29 results.
Its looks like there's a whole load of pages being indexed on other subdomains - fb.capacitr.com and lab.capacity.com. (Which has 1,860 pages!)
What are these used for, do you really want these in the index!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Webshop landing pages and product pages
Hi, I am doing extensive keyword research for the SEO of a big webshop. Since this shop sells technical books and software (legal books, tax software and so on), I come across a lot of very specific keywords for separate products. Isn't it better to try and rank in the SERP's with all the separate product pages, instead of with the landing (category) pages?
Intermediate & Advanced SEO | | Mat_C0 -
URL structure - Page Path vs No Page Path
We are currently re building our URL structure for eccomerce websites. We have seen a lot of site removing the page path on product pages e.g. https://www.theiconic.co.nz/liberty-beach-blossom-shirt-680193.html versus what would normally be https://www.theiconic.co.nz/womens-clothing-tops/liberty-beach-blossom-shirt-680193.html Should we be removing the site page path for a product page to keep the url shorter or should we keep it? I can see that we would loose the hierarchy juice to a product page but not sure what is the right thing to do.
Intermediate & Advanced SEO | | Ashcastle0 -
E-Commerce Site Collection Pages Not Being Indexed
Hello Everyone, So this is not really my strong suit but I’m going to do my best to explain the full scope of the issue and really hope someone has any insight. We have an e-commerce client (can't really share the domain) that uses Shopify; they have a large number of products categorized by Collections. The issue is when we do a site:search of our Collection Pages (site:Domain.com/Collections/) they don’t seem to be indexed. Also, not sure if it’s relevant but we also recently did an over-hall of our design. Because we haven’t been able to identify the issue here’s everything we know/have done so far: Moz Crawl Check and the Collection Pages came up. Checked Organic Landing Page Analytics (source/medium: Google) and the pages are getting traffic. Submitted the pages to Google Search Console. The URLs are listed on the sitemap.xml but when we tried to submit the Collections sitemap.xml to Google Search Console 99 were submitted but nothing came back as being indexed (like our other pages and products). We tested the URL in GSC’s robots.txt tester and it came up as being “allowed” but just in case below is the language used in our robots:
Intermediate & Advanced SEO | | Ben-R
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkout
Disallow: /9545580/checkouts
Disallow: /carts
Disallow: /account
Disallow: /collections/+
Disallow: /collections/%2B
Disallow: /collections/%2b
Disallow: /blogs/+
Disallow: /blogs/%2B
Disallow: /blogs/%2b
Disallow: /design_theme_id
Disallow: /preview_theme_id
Disallow: /preview_script_id
Disallow: /apple-app-site-association
Sitemap: https://domain.com/sitemap.xml A Google Cache:Search currently shows a collections/all page we have up that lists all of our products. Please let us know if there’s any other details we could provide that might help. Any insight or suggestions would be very much appreciated. Looking forward to hearing all of your thoughts! Thank you in advance. Best,0 -
To Many Links On Page Problem
Hello My Moz report is showing I have an error for too many links on my sitemap and blog. The links on both pages are relevant and I'm not sure if this has to be sorted out, as I would have thought Google would expect sitemaps and blogs to have lots of links. If I were to reduce the number of links how much of a positive affect would it have on my site? If any of you feel it is best practice to reduce number of links on these particular pages, do you have any suggestions on how I can tackle this? http://www.dradept.com/blog.php http://www.dradept.com/sitemap.php Thank you Christina
Intermediate & Advanced SEO | | ChristinaRadisic0 -
Can use of the id attribute to anchor t text down a page cause page duplication issues?
I am producing a long glossary of terms and want to make it easier to jump down to various terms. I am using the<a id="anchor-text" ="" attribute="" so="" am="" appending="" #anchor-text="" to="" a="" url="" reach="" the="" correct="" spot<="" p=""></a> <a id="anchor-text" ="" attribute="" so="" am="" appending="" #anchor-text="" to="" a="" url="" reach="" the="" correct="" spot<="" p="">Does anyone know whether Google will pick this up as separate duplicate pages?</a> <a id="anchor-text" ="" attribute="" so="" am="" appending="" #anchor-text="" to="" a="" url="" reach="" the="" correct="" spot<="" p="">If so any ideas on what I can do? Apart from not do it to start with? I am thinking 301s won't work as I want the URL to work. And rel=canonical won't work as there is no actual page code to add it to. Many thanks for your help Wendy</a>
Intermediate & Advanced SEO | | Chammy0 -
Google is indexing wordpress attachment pages
Hey, I have a bit of a problem/issue what is freaking me out a bit. I hope you can help me. If i do site:www.somesitename.com search in Google i see that Google is indexing my attachment pages. I want to redirect attachment URL's to parent post and stop google from indexing them. I have used different redirect plugins in hope that i can fix it myself but plugins don't work. I get a error:"too many redirects occurred trying to open www.somesitename.com/?attachment_id=1982 ". Do i need to change something in my attachment.php fail? Any idea what is causing this problem? get_header(); ?> /* Run the loop to output the attachment. * If you want to overload this in a child theme then include a file * called loop-attachment.php and that will be used instead. */ get_template_part( 'loop', 'attachment' ); ?>
Intermediate & Advanced SEO | | TauriU0 -
NOINDEX listing pages: Page 2, Page 3... etc?
Would it be beneficial to NOINDEX category listing pages except for the first page. For example on this site: http://flyawaysimulation.com/downloads/101/fsx-missions/ Has lots of pages such as Page 2, Page 3, Page 4... etc: http://www.google.com/search?q=site%3Aflyawaysimulation.com+fsx+missions Would there be any SEO benefit of NOINDEX on these pages? Of course, FOLLOW is default, so links would still be followed and juice applied. Your thoughts and suggestions are much appreciated.
Intermediate & Advanced SEO | | Peter2640 -
Can a XML sitemap index point to other sitemaps indexes?
We have a massive site that is having some issue being fully crawled due to some of our site architecture and linking. Is it possible to have a XML sitemap index point to other sitemap indexes rather than standalone XML sitemaps? Has anyone done this successfully? Based upon the description here: http://sitemaps.org/protocol.php#index it seems like it should be possible. Thanks in advance for your help!
Intermediate & Advanced SEO | | CareerBliss0