404'd pages still in index
-
I recently launched a site and shortly after performed a URL rewrite (not the greatest idea, i know). The developer 404'd the old pages instead of a permanent 301 redirect. This caused a mess in the index. I have tried to use Google's removal tool to remove these URL's from the index. These pages were being removed but now I am finding them in the index as just URL's to the 404'd page (i.e. no title tag or meta description). Should I wait this out or now go back and 301 redirect the old URL's (that are 404'd now) to the new URL's? I am sure this is the reason for my lack of ranking as the rest of my site is pretty well optimized and I have some quality links.
-
Will do. Thanks for the help.
-
I think the latter - robot and 301.
but (if you can) leave a couple without 301 and see what (if any) difference you get - would love to hear how it works out.
-
Is it better to remove the robots.txt entries that are specific to the old URL's so Google can see the 404 so Google will remove those pages at their own pace or remove those bits of the robots.txt file specific to the old URL's and 301 them to the new URL's. It seems those are my two options....? Obviously, I want to do what is best for the site's rankings and will see the fastest turnaround. Thanks for your help on this by the way!
-
I'm not saying remove the whole robots.txt file - just the bits relating to the old urls (if you have entries in a robots.txt that affect the old urls).
e.g. say you're robots.txt blocks access to
then you should remove that line from the robots.txt otherwise google won't be able to crawl those pages to 'see' the 404 and realise that they're not there.
My guess is a few weeks before it all settles down, but that really is a finger in the air guess. I went through a similar scenario with moving urls and then moving them again shortly after the first move - took a month or two.
-
I am a little confused regarding removal of the robots.txt file since that is a step in requesting removal from google (per their removal tool requirements). My natural tendency is to 301 redirect the old URL's to the new ones. Will I need to remove the robots.txt file prior to permanently redirecting the old URL's to the new ones? How long does it take Google (estimate) to remove old URL's after a 301?
-
Ok, got that, so that sounds like an external rewrite - which is fine. url only, but no title or description - that sounds like what you get when you block crawling via robots.txt - if you've got that situation, I'd suggest removing the block so that google can crawl them and find that they are 404s. Sounds like they'll fall out of the index eventually. Another thing you could try to hurry things along is: 301 the old urls to the new ones. submit a sitemap containing the old urls (so that they get crawled and the 301s are picked up) update your sitemap and resubmit with only the new urls.
-
When I say URL rewrite, I mean we restructured the URL's to be cleaner and more search friendly. For example, take a URL that was www.example.com/index/home/keyword and structure it to be www.example.com/keyword. Also, the old URL's (i.e. www.example.com/index/home/keyword) are being shows towards the end of the site:example.com search with just the old URL - no title or meta description. Is this a sign that they are on the way out of the index? Any insight would be helpful.
-
Couple of things probably need clarifying: When you say URL rewrite, I'm assuming you mean an external rewrite (in effect, a redirect)? If you do an internal rewrite, that (of itself) should make no difference at all to how any external visitors/engines see your urls/pages. If the old pages had links or traffic I would be inclined to 301 them to the new pages. If the old pages didn't have traffic/links, leave them, they'll fall out eventually - they're not in an xml sitemap by any chance are they (in which case update the sitemap). You often see a drop in rankings when restructuring a site and (in my experience), it can take a few weeks to recover. To give you an example, it took nearly two months for the non-www version of our site to disappear from the index after a similar move (and messing about with redirects).
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is there a way to no index no follow sections on a page to avoid duplicative text issues?
I'm working on an event-related site where every blog post starts with an introductory header about the event and then a Call To Action at the end which gives info about the Registration Deadline. I'm wondering if there is something we can and should do to avoid duplicative content penalties. Should these go in a widget or is there some way to No Index, No Follow a section of text? Thanks!
Intermediate & Advanced SEO | | Spiral_Marketing0 -
HTML5: Changing 'section' content to be 'main' for better SEO relevance?
We received an HTML5 recommendation that we should change onpage text copy contained in 'section" to be listed in 'main' instead, because this is supposedly better for SEO. We're questioning the need to ask developers spend time on this purely for a perceived SEO benefit. Sure, maybe content in 'footer' may be seen as less relevant, but calling out 'section' as having less relevance than 'main'? Yes, it's true that engines evaluate where onpage content is located, but this level of granular focus seems unnecessary. That being said, more than happy to be corrected if there is actually a benefit. On a side note, 'main' isn't supported by older versions of IE and could cause browser incompatibilities (http://caniuse.com/#feat=html5semantic). Would love to hear others' feedback about this - thanks! 🙂
Intermediate & Advanced SEO | | mirabile0 -
Alternative HTML Structure for indexation of JavaScript Single Page Content
Hi there, we are currently setting up a pure html version for Bots on our site amazine.com so the content as well as navigation will be fully indexed by google. We will show google exactly the same content the user sees (except for the fancy JS effects). So all bots get pure html and real users see the JS based version. My questions are first, if everyone agrees that this is the way to go or if there are alternatives to this to get the content indexed. Are there best practices? All JS-based websites must have this problem, so I am hoping someone can share their experience. The second question regards the optimal number of content pieces ('Stories') displayed per page and the best method to paginate. Should we display e.g. 10 stories and use ?offset in the URL or display 100 stories to google per page and maybe use rel=”next”/"pref" instead. Generally, I would really appreciate any pointers and experiences from you guys as we haven't done this sort of thing before! Cheers, Frank
Intermediate & Advanced SEO | | FranktheTank-474970 -
What to do when you buy a Website without it's content which has a few thousand pages indexed?
I am currently considering buying a Website because I would like to use the domain name to build my project on. Currently that domain is in use and that site has a few thousand pages indexed and around 30 Root domains linking to it (mostly to the home page). The topic of the site is not related to what I am planing to use it for. If there is no other way, I can live with losing the link juice that the site is getting at the moment, however, I want to prevent Google from thinking that I am trying to use the power for another, non related topic and therefore run the risk of getting penalized. Are there any Google guidelines or best practices for such a case?
Intermediate & Advanced SEO | | MikeAir0 -
Duplicate Content From Indexing of non- File Extension Page
Google somehow has indexed a page of mine without the .html extension. so they indexed www.samplepage.com/page, so I am showing duplicate content because Google also see's www.samplepage.com/page.html How can I force google or bing or whoever to only index and see the page including the .html extension? I know people are saying not to use the file extension on pages, but I want to, so please anybody...HELP!!!
Intermediate & Advanced SEO | | WebbyNabler0 -
How do I increase rankings when the indexed page is the homepage?
Hi Forum, This is a two-part question. The first is: "what may be the cause of some rank declines?" and the second is "how do I bring them back up when the indexed page is the homepage?" Over the last week I noticed some declines in several of my top keywords, many of which point to the site's homepage. The site itself is an eCommerce site, which had less visits last week than normal (holidays it seems, since the data jibes with key dates). Can a decline in traffic cause ranking declines? Any other ideas of where to look? Secondly, for those keywords that link to the homepage, how do we bring these back up since a homepage can't be optimized for every single keyword? We sell yoga products and can't have a homepage that is optimized for keywords like "yoga mat," "yoga blocks," "yoga pilates clothing," and several others, as these are our category pages' keywords. Any thoughts? Thanks!
Intermediate & Advanced SEO | | pano0 -
Duplicate content on index.htm page
How do I avoid duplicate content on the index.htm page . I need to redirect the spider from the /index.htm file to the main root of http://www.manandhisvan.com.au and hence avoid duplicate content. Does anyone know of a foolproof way of achieving this without me buggering up the complete site Cheers Freddy
Intermediate & Advanced SEO | | Fatfreddy0 -
Should I index tag pages?
Should I exclude the tag pages? Or should I go ahead and keep them indexed? Is there a general opinion on this topic?
Intermediate & Advanced SEO | | NikkiGaul0