Status Code 404: But why?
-
Google Web Master Tool reported me that I have several 404 staus code.,
First they were 2, after 4..6 and 10, right now. Every time I add a new page.
I've got a no CMS managed website. After old website was been deleted, I installed Wordpress, created new page and deleted and blocked (robots.txt) old page.
Infact all page not found don't exist!!! (Pic: Page not found).
The strange thing is that no pages link to those 404 pages (All Wordpress Created page are new!!!). Seomoz doesn't report me any 404 error (Pic 3)
I controlled all my pages:
- No "strange" link in any pages
- No link reported by Seomoz tool
Bu why GWMT reports me that one? How can I risolve that problem?
I'm going crazy!!!Regards
Antonio -
Antonio,
Ryan has explained this perfectly.
For a more detailed explanation of methods for controlling page indexing, you could read this post on Restricting Robot Access for Improved SEO
It seems from your comments and questions about 301 redirects, that there is some confusion on how they work and why we use them.
A 301 redirect is an instruction to the server which is most commonly done by adding a .htaccess file (if you are using an Apache server).
The .htaccess file is read by the server when it receives a request to serve any page on the site. The server reads each rule in the file and checks to see if the rule matches the existing situation. When a rule matches, the server carries out the action required. If no rule matches, then the server proceeds to serve the reqested page.
So, in Ryan's first example above, there would be a line of code in the .htaccess file that basically says to the server IF the page requested is /apples, send the request to /granny-smith-apples using a 301 (Permanent) Redirect.
The intent of using a 301 Redirect is to achieve two things:
- To prevent loss of traffic and offer the visitor an alternative landing page.
- To send a signal to Search Engines that the old page should be removed from the index and replaced with the new page.
The 301 Redirect is referred to as Permanent for this reason. Once the 301 Redirect is recognized and acted upon by the search engine, the page will be permanently removed from the index.
In contrast, the request to remove a page via Google WMT is a "moment in time" option. The page can possibly be re-indexed because it is accessible to crawlers via an external link from another site (unless you use the noindex meta tag instead of robots.txt). Then you would need to resubmit a removal request.
I hope this makes clearer the reasons for my response - basically, the methods you have used are not "closing the door" on the issue, but leaving the possibility open for it to occur again.
Sha
-
But I think, tell me I'm right, that robots.txt is better than noindex tag.
Definitely not. The opposite is true.
A no-index tag tells search engines not to index the page. The content will not be considered as duplicate anymore. But the search engines can still crawl the page and follow all the links. This allows your PR to flow naturally throughout your site. This also allows search engines to naturally read any changes in meta tags. A robots.txt disallow prevents the search engine from looking at any of the page's code. Think of it as a locked door. The crawler cannot read any meta tags and any PR from your site that flows to the page simply dies.
Do I need "real" page to create a 301 redirect?
No. Let's look at a redirect from both ends.
Example 1 - you delete the /apples page from your site. The /apples page no longer exists. After reviewing your site you decide the best replacement page would be the /granny-smith-apples page. Solution: a 301 redirect from the non-existent /apples page to the /granny-smith-apples page.
Example 2 - you delete the /apples page from your site. You no longer carry any form of apples but you do carry other fruit. After some thought you decide to redirect to the /fruit/ category page. Solution: a 301 redirect from the non-existent /apples page to the /fruit/ category page.
Example 3 - you delete the /apples page from your site but you no longer carry anything similar. You can decide to let the page 404. A 404 error is a natural part of the internet. Examine your 404 page to ensure it is helpful. Ideally it should contain your normal site navigation, a site search field and a friendly "sorry the page you are looking for is no longer available" message.
Since you asked about existence of redirected pages, you can actually redirect to a page that does not exist. You could perform a 301 from /apples to a non-existent /apples2 page. When this happens it is almost always due to user error by the person who added the redirect. When that happens anyone who tries to reach the /apples page will be redirected to the non-existent /apples2 page and therefore receive a 404 error.
-
Ryan,
what you say is right: The best robots.txt file is a blank one. But I think, tell me I'm right, that robots.txt is better than noindex tag.
You have presented 404 errors. Those errors are links TO pages which don't exist, correct? Yes.If so, I believe Sha was recommending you can create a 301 redirect from the page which does not exist...
**Ok. But Do I need "real" page to create a 301 redirect?
I deleted those one.So, to resolve my problem must i redirect old page to most relevant page?**
-
Greenman,
I have a simple rule I learned over time. NEVER EVER EVER EVER use robots.txt unless there is absolutely no other method possible to achieve the required result. It is simply bad SEO and will cause problems. The best robots.txt file is a blank one.
When you use CMS software like WP, then it is required for some areas but it's use should be minimized.
How can I add a 301 redirect to a page that doesn't exit?
You have presented 404 errors. Those errors are links TO pages which don't exist, correct? If so, I believe Sha was recommending you can create a 301 redirect from the page which does not exist, to the most relevant page that does exist.
It's a bit of semantics but if you chose to do such, you can create 301s from or to pages that don't exist.
-
Greenman,
As I suspected many of the dates of the bad URLs are old, some even being from 2010. I took a look at your home page specifically checking for the URL you highlighted in red on the 4th image. It is not present.
My belief is your issue has been resolved by the changes you made. I recommend you continue to monitor WMT for any NEW errors. If you see any fresh dates with 404, that would be a concern which should be investigated. Otherwise the problem appears to be resolved.
I also very much support Sha's reply above.
-
Hi Sha, thanks for your answer.
1.** robots.txt is not the most reliable method of ensuring that pages are not indexed**
If you use tag noindex, spider will acces to your page but it will not get enough information. So, page will be semi-indexed.
My old pages ware been removed, no indexed (by robots) and I sent remove request to Google. No problem with that, no result on the SERP.
2. So, the simple answer is that there are links out there which still point to your old pages...does not mean that they don't exist.
You can see by screenshot the link's source: just my old "ghost" pages. No other sources.
3. If you know that you have removed pages you should add 301 redirects to send any traffic to another relevant page.
How can I add a 301 redirect to a page that doesn't exit?
Old page -> 301 -> New page (Home?). But Old page doesn't exit in Wordpress!!!**I don't want stop 404, I want remove link that bring to deleted pages. **
-
-
My gut feeling is that a catch al 301, is not a good thing, I cant give you any evidence, just a bit of reasoning and gut feeling.
I always try to put myself in the SE shoes, would i think a lot of 301's pointing to one not relavant page is natual? and would it be hard to detect? I would answer No and No. Although i used to do it to my home page a while ago, I guess i had a different gut feeling back then
-
Hi Greenman,
I would guess that your problem is most likely caused by the fact that you have used the robots.txt method to block the pages you removed.
robots.txt is not the most reliable method of ensuring that pages are not indexed. Even though robots.txt tells bots not to crawl a page, Google has openly stated that if a page is found through an external link from another site, they can be crawled and indexed.
The most effective way to block pages is to use the noindex meta tag.
So, the simple answer is that there are links out there which still point to your old pages. Just because links are not highlighted in OSE or even Google WMT, does not mean that they don't exist. WMT should provide you with the most accurate link information, but even that is not necessarily complete according to Google.
Don't forget that there may also be "links" out there in the form of bookmarks or favorites that people keep in their browsers. When clicked these will also generate a 404 response from your server.
If you know that you have removed pages you should add 301 redirects to send any traffic to another relevant page. If you do not know the URL's of the pages that have been removed the best way to stop them from returning 404's is to add a catch-all 301 redirect so that any request for a page that does not exist is redirected to a single page. Some people send all of this traffic to the home page, but my preference would be to send it to a custom designed 404 or a relevant category page.
Hope that helps,
Sha
-
When did you change over to the WP site?
Today is October 1st and the most recent 404 error shared in your image is from 9/27. If you have made the changes after 9/27, then no new errors have been found since you made the change.
Since the moz report shows no crawl errors, your current site is clean assuming your site navigation allowed your website to be fully crawled.
The Google errors can be from any website. The next step is to determine the source of the link causing the 404 error. Using the 2nd image you shared, click on each link in the left column of your WMT report. For example, http://www.mangotano.eu/ge/doc/tryit.php shows 3 pages. Click on it and you should see a list of those 3 pages so you can further troubleshoot.
-
I dont think they are, i thing they found them long ago, and no matterr if you block them, remove them of whatever, google take for ever to sort itsself out
-
Sorry Alan,
but I think that Google can looking for old page yet. This is the reason:I deleted old page form index by GMWT "remove url request"
I dissalowed old page by robots.txtThe problem is why Google find in NEW page links to OLD page.
-
The 404's are from pages that used to be linked in your old site correct?, if so I suggest that google is still looking for them. Unless you changed your domain name this would be the reason
-
Yes, link come from my page. Bu I created new page by Wordpress (and deleted OLD website). So, there are NO link beetwen OLD and NEW pages. How GWMT can find a connection? Webpage Source Code HTML doesn't show any link to those page.
-
From you own web page i would asume.
i would suuggest that even that they are not in index, google is till trying, and that WMT is a bit behind. i have simular for links that i took down moths ago.
-
Hi Alan,
404 not found pages are not indexed. My big problem is that I don't now where (and How) GWMT found source link (pages that link to not found page)
-
If they were in a SE index, they will try them for some time before removing from index., i would not worry
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can have FAQ schema code in home page?
Hi, can we have FAQ schema code in Home page ? we wrote some question and answer in drop down box on home page, and also add the schema code script to the head tag of page, but it does not work !,( other pages of our site show the FAQ in the SERP correctly) I want to know this problem is because its on Home Page ? or because its in Drop down style ? its the url of our site: http://payamgostar.com/
Intermediate & Advanced SEO | | allenwebmaster0 -
How to Handle a Soft 404 error to an admin page in WordPress
I'm seeing this error on Google Webmaster Console: | URL: | http://www.awlwildlife.com/wp-admin/admin-ajax.php | | | Error details | Linked from | | |
Intermediate & Advanced SEO | | aj613
| Last crawled: 11/15/16First detected: 11/15/16 The target URL doesn't exist, but your server is not returning a 404 (file not found) error. Learn more Your server returns a code other than 404 or 410 for a non-existent page (or redirecting users to another page, such as the homepage, instead of returning a 404). This creates a poor experience for searchers and search engines. More information about "soft 404" errors | Any ideas what I should do about it? Thanks!0 -
Page Count in Webmaster Tools Index Status Versus Page Count in Webmaster Tools Sitemap
Greeting MOZ Community: I run www.nyc-officespace-leader.com, a real estate website in New York City. The page count in Google Webmaster Tools Index status for our site is 850. The page count in our Webmaster Tools Sitemap is 637. Why is there a discrepancy between the two? What does the Google Webmaster Tools Index represent? If we filed a removal request for pages we did not want indexed, will these pages still show in the Google Webmaster Tools page count despite the fact that they no longer display in search results? The number of pages displayed in our Google Webmaster Tools Index remains at about 850 despite the removal request. Before a site upgrade in June the number of URLs in the Google Webmaster Tools Index and Google Webmaster Site Map were almost the same. I am concerned that page bloat has something to do with a recent drop in ranking. Thanks everyone!! Alan
Intermediate & Advanced SEO | | Kingalan10 -
OSE link report showing links to 404 pages on my site
I did a link analysis on this site mormonwiki.com. And many of the pages shown to be linked to were pages like these http://www.mormonwiki.com/wiki/index.php?title=Planning_a_trip_to_Rome_By_using_Movie_theatre_-_Your_five_Fun_Shows2052752 There happens to be thousands of them and these pages actually no longer exist but the links to them obviously still do. I am planning to proceed by disavowing these links to the pages that don't exist. Does anyone see any reason to not do this, or that doing this would be unnecessary? Another issue is that Google is not really crawling this site, in WMT they are reporting to have not crawled a single URL on the site. Does anyone think the above issue would have something to do with this? And/or would you have any insight on how to remedy it?
Intermediate & Advanced SEO | | ThridHour0 -
How to Destroy Old 404 Pages
Hello Mozzers, So I just purchased a new domain and to my surprise it has a domain authority of 13 right out of the box (what luck!). I needed to investigate. To make a long story short the domain used to be home to a music blog that had hundreds of pages which of course are all missing now. I have about 400 pages on my hands that are resulting in a 404. How or what is the best method for eliminating these pages. Does deleting the Crawl Errors in Google Webmaster Tools do anything? Thanks
Intermediate & Advanced SEO | | TheOceanAgency0 -
Do 410 show in the 404 not found section in Google Webmaster Tools?
Question: Do 410 show in the 404 not found section in Google Webmaster Tools? Specific situation: We got rid of an entire subdomain except for a few pages that we 301'd to relevant content on our main domain. The rest return a 404 not found. These show up in our google webmaster tools as crawl errors. I was wondering since 410 is a content gone error and we intentionally want this content gone, if we switch it to 410, does Google still report it as a 404 error? Thanks
Intermediate & Advanced SEO | | MarloSchneider0 -
How to get around Google Removal tool not removing redirected and 404 pages? Or if you don't know the anchor text?
Hello! I can’t get squat for an answer in GWT forums. Should have brought this problem here first… The Google Removal Tool doesn't work when the original page you're trying to get recached redirects to another site. Google still reads the site as being okay, so there is no way for me to get the cache reset since I don't what text was previously on the page. For example: This: | http://0creditbalancetransfer.com/article375451_influencial_search_results_for_.htm | Redirects to this: http://abacusmortgageloans.com/GuaranteedPersonaLoanCKBK.htm?hop=duc01996 I don't even know what was on the first page. And when it redirects, I have no way of telling Google to recache the page. It's almost as if the site got deindexed, and they put in a redirect. Then there is crap like this: http://aniga.x90x.net/index.php?q=Recuperacion+Discos+Fujitsu+www.articulo.org/articulo/182/recuperacion_de_disco_duro_recuperar_datos_discos_duros_ii.html No links to my site are on there, yet Google's indexed links say that the page is linking to me. It isn't, but because I don't know HOW the page changed text-wise, I can't get the page recached. The tool also doesn't work when a page 404s. Google still reads the page as being active, but it isn't. What are my options? I literally have hundreds of such URLs. Thanks!
Intermediate & Advanced SEO | | SeanGodier0 -
SEOMoz Internal Dupe. Content & Possible Coding Issues
SEOmoz Community! I have a relatively complicated SEO issue that has me pretty stumped... First and foremost, I'd appreciate any suggestions that you all may have. I'll be the first to admit that I am not an SEO expert (though I am trying to be). Most of my expertise is with PPC. But that's beside the point. Now, the issues I am having: I have two sites: http://www.federalautoloan.com/Default.aspx and http://www.federalmortgageservices.com/Default.aspx A lot of our SEO efforts thus-far have done good for Federal Auto Loan... and we are seeing positive impacts from them. However, we recently did a server transfer (may or may not be related)... and since that time a significant number of INTERNAL duplicate content pages have appeared through the SEOmoz crawler. The number is around 20+ for both Federal Auto Loan and Federal Mortgage Services (see attachments). I've tried to include as much as I can via the attachments. What you will see is all of the content pages (articles) with dupe. content issues along with a screen capture of the articles being listed as duplicate for the pages: Car Financing How It Works A Home Loan is Possible with Bad Credit (Please let me know if you could use more examples) At first I assumed it was simply an issue with SEOmoz... however, I am now worried it is impacting my sites (I wasn't originally because Federal Auto Loan has great quality scores and is climbing in organic presence daily). That being said, we recently launched Federal Mortgage Services for PPC... and my quality scores are relatively poor. In fact, we are not even ranking (scratch that, not even showing that we have content) for "mortgage refinance" even though we have content (unique, good, and original content) specifically around "mortgage refinance" keywords. All things considered, Federal Mortgage Services should be tighter in the SEO department than Federal Auto Loan... but it is clearly not! I could really use some significant help here... Both of our sites have a number of access points: http://www.federalautoloan.com/Default.aspx and http://www.federalmortgageservices.com/Default.aspx are both the designated home pages. And I have rel=canonical tags stating such. However, my sites can also be reached via the following: http://www.federalautoloan.com http://www.federalautoloan.com/default.aspx http://www.federalmortgageservices.com http://www.federalmortgageservics.com/default.aspx Should I incorporate code that "redirects" traffic as well? Or is it fine with just the relevancy tags? I apologize for such a long post, but I wanted to include as much as possible up-front. If you have any further questions... I'll be happy to include more details. Thank you all in advance for the help! I greatly appreciate it! F7dWJ.png dN9Xk.png dN9Xk.png G62JC.png ABL7x.png 7yG92.png
Intermediate & Advanced SEO | | WPColt0