Removing indexed pages
-
Hi all, this is my first post so be kind. I have a one-page WordPress site that has the Yoast plugin installed. Unfortunately, when I first submitted the site's XML sitemap to Google Search Console, I hadn't checked the Yoast settings, and the sitemap included some example pages from a theme demo I was using. These got indexed, which is a pain, so now I am trying to remove them. Originally I set up a bunch of 301s, but that didn't remove them from the index (at least not after about a month), so now I have set up 410s. These also don't seem to be working, and I am wondering if it is because I re-submitted the sitemap with only the index page on it (as it is just a single-page site). Could that have stopped Google from recrawling the original pages and actually seeing the 410s?
Thanks in advance for any suggestions.
-
Thanks for all the responses!
At the moment I am serving the 410s using the .htaccess file, as I removed the actual pages a while ago. The pages don't show in most searches; however, two of them do show up in some instances under the sitelinks, which is the main pain. I manually asked for them to be removed using 'Remove URLs', but that only lasts a couple of months and they are now back.
So I guess the best way is to recreate the pages and insert a noindex?
Thanks again for everyone's time, it's much appreciated.
-
I agree with ViviCa1's methods, so go with that.
One thing I just wanted to bring up though, is that unless people are actually visiting those pages you don't want indexed, or it does some type of brand damage, then you don't really need to make it a priority.
Just because they're indexed doesn't mean they're showing up for any searches - and most likely they aren't - so people will realistically never see them. And if you only have a one-page site, you're not wasting much crawl budget on those.
I just bring this up since sometimes we (I'm guilty of it too) can get bogged down by small distractions in SEO that don't really help much, when we should be creating and producing new things!
"These also seem to not be working and I am wondering if it is because I re-submitted the sitemap with only the index page on it (as it is just a single page site) could that have now stopped Google indexing the original pages to actually see the 410's?"
There was a good related response from Google employee Susan Moskwa:
“The best way to stop Googlebot from crawling URLs that it has discovered in the past is to make those URLs (such as your old Sitemaps) 404. After seeing that a URL repeatedly 404s, we stop crawling it. And after we stop crawling a Sitemap, it should drop out of your "All Sitemaps" tab.”
A bit older, but shows how Google discovers URLs through the sitemap. Take a look at the rest of that thread as well.
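One way to confirm what a crawler would actually receive for the removed URLs is to request them without following redirects and look at the raw status code. The following is only a minimal sketch, not a definitive tool: `fetch_status` and `is_removal_signal` are hypothetical helper names, and treating only 404 and 410 as "page is gone" signals is a simplifying assumption.

```python
import http.client
from urllib.parse import urlsplit

# Assumption for this sketch: 404 (Not Found) and 410 (Gone) are the codes
# that tell a crawler the page no longer exists.
REMOVAL_STATUSES = {404, 410}


def is_removal_signal(status: int) -> bool:
    """Return True when the status code says the page is gone."""
    return status in REMOVAL_STATUSES


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the raw status code.

    http.client does not follow redirects, so a 301 is reported as 301
    rather than as the status of the redirect target.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling `fetch_status` on each old demo URL (for example a placeholder such as `https://example.com/demo-page/`) and passing the result to `is_removal_signal` shows whether the .htaccess rules are returning 410 as intended.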
-
I'd suggest adding a noindex robots meta tag to the affected pages (see how to do this here: https://support.google.com/webmasters/answer/93710?hl=en) and until Google recrawls use the remove URLs tool (see how to use this here: https://support.google.com/webmasters/answer/1663419?hl=en).
If you use the noindex robots meta tag, don't disallow the pages through your robots.txt, or Google won't even see the tag. Disallowing Google from crawling a page doesn't mean it won't be indexed (or that it will be removed from the index); it just means Google won't crawl the page.
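If you recreate the pages, it may be worth checking that the tag really ends up in the served HTML. Here is a rough sketch using only Python's standard library; `RobotsMetaParser` and `has_noindex` are names made up for this example, and it only looks at the meta tag, not at any `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() not in {"robots", "googlebot"}:
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the page's robots meta tags ask not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in parser.directives or "none" in parser.directives
```

For example, `has_noindex('<meta name="robots" content="noindex, follow">')` returns `True`, while a page with `content="index, follow"` returns `False`.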
-
Couple of ideas spring to mind
- Use the robots.txt file
- Demote the site link in Google search console (see https://support.google.com/webmasters/answer/47334)
Example of a robots.txt file:
User-agent: *
Disallow: /the-link/you-dont/want-to-show.html
Disallow: /the-link/you-dont/want-to-show2.html
Don't include the domain, just the path to the page. There are plenty of tutorials out there; it's worthwhile having a look at http://www.robotstxt.org
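Before uploading rules like these, you can sanity-check which URLs they block with Python's built-in `urllib.robotparser`. This is just a sketch: `is_crawlable` is a hypothetical helper, `example.com` is a placeholder domain, and the standard-library parser does plain prefix matching, so it may not mirror every Google-specific extension.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, with a wildcard user-agent group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /the-link/you-dont/want-to-show.html
Disallow: /the-link/you-dont/want-to-show2.html
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, `is_crawlable(ROBOTS_TXT, "https://example.com/the-link/you-dont/want-to-show.html")` is `False` and `is_crawlable(ROBOTS_TXT, "https://example.com/")` is `True`. Keep in mind that a blocked page can no longer be crawled, so a noindex tag or 410 on it will not be seen.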