Dealing with 404 pages
-
I built a blog on my root domain while I worked on another part of the site at .....co.uk/alpha. I was really careful not to have any links point to /alpha, but it seems Google found and indexed it anyway. The problem is that part of /alpha was a copy of the blog, so we now have a lot of duplicate content. The /alpha section is now ready to be moved over to the root domain, and the initial plan was to then delete /alpha. But now that it's indexed, I'm worried that I'll have all these 404 pages. I'm not sure what to do. I know I can just 301 redirect all those pages to their counterparts in case any links point to them, but I need to delete those pages because the server is already very slow. Or does a 301 redirect mean that I don't need those pages anymore? Will those pages still get indexed by Google as separate pages? Please assist.
-
After a 301 redirect, can I delete the pages and the databases/folders associated with them?
Yes. Think of a 301 redirect like mail forwarding. If you have an address at 1000 Main Street and then move to a new address, you would leave a forwarding order (i.e., the 301 redirect) with the post office. Once that is done, you can bulldoze the house (i.e., delete the webpage/database) and the mail will still be forwarded properly.
How does one create a 301 redirect?
The method of creating a 301 redirect varies based on your server setup. If you have a LAMP setup with cPanel, there is a Redirect tool. Otherwise, I would suggest contacting your host and asking how to create a redirect for your particular setup.
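For example, if you do have a LAMP setup, a redirect can usually be added through an .htaccess file in the web root. This is only a sketch, assuming Apache with mod_alias enabled and .htaccess overrides allowed; the page names are hypothetical:

# Permanently (301) redirect one old page to its new home
Redirect 301 /alpha/about.html /about.html

cPanel's Redirect tool writes similar rules into .htaccess behind the scenes.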
-
Ryan,
Two things.
First: after a 301 redirect, can I delete the pages and the databases/folders associated with them?
Second: how does one create a 301 redirect?
-
Hi Ryan,
I agree with you, but I thought I'd offer an alternate solution to the problem. I know it is harder and not the preferred one.
But as I said, he should delete the pages from the index only if he isn't getting any traffic from them. Also, he said in the question that the /alpha folder was indexed by mistake, which fits what you said in your comment: "That tool was designed to remove content which is damaging to businesses such as when confidential or personal information is indexed by mistake." It also seems to contradict your other statement, "The indexed content are pages you want in the index but simply have the wrong URL" - here the wrong URL means a different page.
Anyway, I will definitely go with your solution, but sometimes having two options helps you choose the better one.
Thanks
-
Semil, your answer is a working solution, but I would like to share why it is not a best practice.
Once the /alpha pages were indexed, you could have had traffic on them. You cannot possibly know who has linked to those pages, e-mailed links, bookmarked them, etc. By providing a simple 301, the change will be completely seamless to users. All their links and bookmarks will still work. Additionally, if any website did link to your /alpha pages, you will retain that link.
The site will also benefit because the content is already indexed by Google. You will not have to wait for Google to index your pages. This means more traffic for the site.
The 301 is very quick and easy to implement. If you are simply moving from the /alpha directory to your main site, then a single 301 redirect can cover your entire site.
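As a rough sketch of that single rule on an Apache server (assuming mod_alias is available; test it before deleting anything):

# Map every URL under /alpha to the same path on the root,
# e.g. /alpha/blog/post-1 -> /blog/post-1
RedirectMatch 301 ^/alpha/(.*)$ /$1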
I will offer a simple SEO best practice (my belief, which not everyone agrees with) that I do my best to follow: NEVER EVER EVER use the robots.txt file unless you have exhausted every other possibility. The robots.txt file is an inferior solution that many people latch on to because it is quick and easy. In your case, there is no need to adjust your robots.txt file at all. The original poster stated an intention to delete the /alpha pages. Those pages will no longer exist. Why block URLs which don't exist? It doesn't offer any benefit.
Also, it makes no sense to use the Google removal tool. That tool was designed to remove content which is damaging to businesses, such as when confidential or personal information is indexed by mistake. The indexed content are pages you want in the index but simply have the wrong URL. The 301 redirect will allow your pages to remain in the index and the URLs to be properly updated. In order for the 301 to work correctly, you need to NOT block the /alpha pages with robots.txt, because Google has to be able to recrawl the old URLs to see the redirects.
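To make that concrete, a robots.txt rule like the one suggested elsewhere in this thread is exactly what you would want to avoid while the 301s do their work (this mirrors the Disallow example in the other answer):

User-agent: *
Disallow: /alpha

While that rule is in place, Googlebot never requests the /alpha URLs, so it never sees the 301 responses, and the old URLs linger in the index.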
The solution you shared would work, but it is not as friendly all around.
-
Whoops! Thanks for correcting my answer...
-
The reason for not using a 301 is that /alpha is not a page or folder you created for your users, so I wouldn't put a 301 there. It got indexed, that's all. Are you getting any traffic from it?
If not, then why do you need to redirect? Remove the pages and ask the search engine to remove them from its index. That is all.
-
Thanks, Dan.
Is there a way of blocking an entire folder, or do I have to add each link?
-
How can I ask them to remove it in Webmaster Tools? And how can I ask for everything in the /alpha folder not to be indexed, or do I have to write out each link?
Why do you think my case isn't good for 301 redirects?
-
You have to be very careful from the start, but Google has already indexed your /alpha, so don't worry about that now.
Using a 301 is something I don't like to do in your case. Ask Google to remove those URLs from the index via Google Webmaster Tools (GWT), and use robots.txt to prevent /alpha from being indexed.
Thanks,
-
You can perform the 301 redirect, and you will not need those pages anymore. Using the redirect is a superior SEO solution to using the robots.txt file. Since the content is already indexed, it will stay indexed, and Google will update each page over the next 30 days as it crawls your site.
If you block /alpha with robots.txt, Google will still retain the pages in its index, users will experience 404s, and your new pages won't start to be properly indexed until Google drops the existing pages, which takes a while. The redirect is better for everyone.
-
Hi
If you do not want them in the index, you should block them in your robots.txt file like so:
User-agent: *
Allow: /
Disallow: /alpha
-Dan
PS - Some documentation on robots.txt
EDIT: I left my answer, but don't listen to it. Do what Ryan says.
-
Related Questions
-
Page content not being recognised?
I moved my website from Wix to WordPress in May 2018. Since then, it's disappeared from Google searches. The site and pages are indexed, but no longer ranking. I've just started a Moz campaign, and most pages are being flagged as having "thin content" (50 words or less), when I know there are 300+ words on most of the pages. Looking at the page source, I find this bit of code: page contents. Does this mean that Google is finding this and thinks I have only two words ("page contents") on the page? Or is this code to grab the page contents from somewhere else in the code? I'm completely lost with this and would appreciate any insight.
-
Why does Google not remove my page?
Hi everyone, last week I added a "noindex" tag to my page, but that site still appears in organic search. What else can I do to remove it from Google?
-
Indexed pages
Just started a site audit and trying to determine the number of pages on a client site and whether there are more pages being indexed than actually exist. I've used four tools and got four very different answers...
Google Search Console: 237 indexed pages
Google search using the site: command: 468 results
Moz site crawl: 1013 unique URLs
Screaming Frog: 183 page titles, 187 URIs (note this is a free licence, but it should cut off at 500)
Can anyone shed any light on why they differ so much? And where lies the truth?
-
Question about 404 Errors
About two months ago, we deleted some unnecessary pages on our website that were no longer relevant. However, Moz is still saying that these deleted pages are returning 404 errors when a crawl test is done. The page is no longer there, at least as far as I can see. What is the best solution for this? I have a page that is similar to the old page, so is it a good choice to just redirect the bad page to my good page? If so, what's the best way to do this? I found some useful information searching, but none of it truly pertained to me. I went around my site to make sure there were no old links that directed traffic to the non-existent page, and there are none.
-
What to do with temporary empty pages?
I have a website listing real estate for sale in different areas. In small villages, towns, and areas, sometimes there is nothing for sale, and the page is therefore completely empty, with no content except a heading and some footer text. I have thousands of landing pages for different areas, for example "Apartments in Tibro" or "Houses in Ljusdahl", and Moz Pro gives me some warnings for "Duplicate Content" on the empty ones (I think it does so because the pages are so empty that they are quite similar). I guess Google could also think badly of my site if I have hundreds or thousands of empty pages, even out of a total of 100,000. So, what should I do with these pages for the small cities, towns, and villages where there are not always houses for sale? Should I remove them completely? Should I return a 404 when there are no houses for sale and a 200 OK when there are? Please note that I have 100,000+ pages in total, and this is only about 5% of all my pages.
-
Translating Page Titles & Page Descriptions
I am working on a site that will be published in the original English, with localized versions in French, Spanish, Japanese and Chinese. All the versions will use the English information architecture. As part of the process, we will be translating the page titles and page descriptions. Translation quality will be outstanding. The client is a translation company. Each version will get at least four pairs of eyes, including expert translators, editors, QA experts and proofreaders. My question is what special SEO instructions should be issued to translators re: the page titles and page descriptions. (We have to presume the translators know nothing about SEO.) I was thinking of:
stick to the character counts for titles and descriptions
make sure the title and description work together
avoid over-repetition of keywords in page titles (over-optimization peril)
think of the descriptions as marketing copy
try to repeat some title phrases in the description (to get the bolding and promote click-through)
That's the micro stuff. The macro stuff: we haven't done extensive keyword research for the other languages. Most of the clients are in the US. The other language versions are more a demo of translation ability than looking for clients elsewhere. Are we missing something big here?
-
Page Not Found Help!
Hi, I recently (about 2 months ago) moved a blog from a separate domain name over to my eCommerce site to help with marketing: http://www.moondoggieinc.com/blog. I seem to have gotten it all to work right, but I'm getting tons of 404 errors, and they all have ” in them, for example: http://www.moondoggieinc.com/blog/”http://www.moondoggieinc.com/custom_dog_tanks_and_tees.php” I'm not sure how this happened or how to fix it, but there are about 250 pages like this. I know how to redirect them all with a 301 in .htaccess, but I'm not sure if that's the appropriate way to fix this or if it's just putting a patch on something that's causing a bigger issue. Or do I just need to write 250 301 redirects? Thanks! Kristy O
-
How to proceed with this 404 situation
Hi folks, I have a website (www.mysite.com) where I can't host a 404 page inside the www subdomain because of a CMS issue. In order to provide a 404 page, I can create a subdomain like “404.mysite.com” that returns 404, and then, if I find that a page does not exist, I can redirect it to 404.mysite.com. My question is: should I redirect as a 301 or a 302? Does it make any difference?