Generating 404 Errors but the Pages Exist
-
Hey
I have recently come across an issue with several of a sites urls being seen as a 404 by bots such as Xenu, SEOMoz, Google Web Tools etc. The funny thing is, the pages exist and display fine.
This happens on many of the pages which use the Modx CMS, but the index is fine. The wordpress blog in /blog/ all works fine.
The only thing I can think of is that I have a conflict in the htaccess, but troubleshooting this is difficult, any tool I have found online seem useless.
Have tried to rollback to previous versions but still does not work.
Anyone had any experience of similar issues?
Many thanks
K.
-
FYI, we finally found our error. The short URL turned out to be the same name as the folder (photo-gallery) so once this was changed, wordpress was able to access the correct path. A bit of custom javascript had to be amended as well, but that was limited to our custom code. Using your web-sniffer.net link we were able to test immediately and fix it fairly quickly. Thank you for your help!
-
That's true Ryan I guess it is coding related really.
Issues like this are a real pain in the ass. And most people don't even check WMT to realise the issues exist. TBH, I don't check as often as I should.
-
I agree with you Paul.
As you pointed out one possible cause is a CMS-related issue which I would refer to as "coding" meaning something in the code which was used to present the website. Perhaps there is a better way to phrase it but nothing comes to mind at the moment.
Another possibility you mentioned is Litespeed which would be a server-side issue directly. Either way, it is a legitimate issue which should be addressed.
-
FWIW, I don't think it's a coding issue. If it were coding, it would either show a 200OK or it would show a 404. It wouldn't sometimes serve a 404.
If you're using Litespeed, I'd guarantee that is the issue and if you're using Joomla, it's another prime culprit.
-
Please keep in mind, that 404 error does not mean the page doesn't exist. It means your server, is sending a response code to indicate that it doesn't exist.
When I installed Litespeed on my server, this issue happened over and over again.
I believe Joomla for example, has some kind of security module that serves a 404 if a single IP requests a page too many times. I remember running SEOFrog on a friends Joomla site and tons of 404's were showing up.
-
Dev team are looking into it, must be quite a complex htaccess issue. Will get to the bottom of it this week and post any findings.
-
Thanks Ryan! I will get it looked at...Sue
-
@DentalID, the same reply I offered to Guy applies for you as well. This is an SEO issue which does need to be fixed. Something on your end is causing the page to show with a 403 response code. You really need a programmer to get in there and determine the root cause of the issue. You could try asking your web host if you have managed hosting, but this level of assistance would normally be outside the support of managed hosting.
-
Guy,
In looking at the page this appears to be a legitimate problem. Your server settings allow you to present a page with any header code you wish. You can 301 a page but still present the page with a 200 code if you want. Presently it appears the page is being presented fine but your server is offering a 404 header code.
I can't tell the actual source of the problem other then to say it appears to be on your end and should be fixed. I originally looked at the code with the MOZbar but then checked independently with another tool as well. http://web-sniffer.net/
All tools show a 404 header code for the page. This response code is generated by your web server.
-
We are having a similar problem with this URL: http://dentalimplantsportland.com/photo-gallery/ and also the following locations:
http://cosmeticdentistportland.net/photo-gallery/
http://dentalveneersportland.com/photo-gallery/
SEO Moz and Google webmaster tools show it as a 403 error but the pages display fine. I am not able to tell if this is really a problem for SEO or if we should reconstruct this gallery system and would really love your input.
This is Wordpress with a Spry gallery...
Thanks so much!
-
It is just a small affiliate site I am looking at - this page creates a 404.
http://www.insure-uk.com/post-office-car-insurance.html
Currently testing on some beta servers. Hopefully should fix soon as otherwise it will lose indexation.
-
I also see this now and again, but next crawl they fix themselfs. i assume robots can not always reach page for a number of reasons
-
Can you offer an example of a URL which is causing this problem?
-
I have had the same issues, I think it is often the bot's problem
Just to be certain check your links are correct and manually test them. Also ensure your sitemap is up to date and that you are not blocking the crawlers with metarobots, robots.txt, or some weird stuff in htaccess.
I have found that renaming pages or moving them will often cause 404 issues with crawlers
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Viewing search results for 'We possibly have internal links that link to 404 pages. What is the most efficient way to check our sites internal links?
We possibly have internal links on our site that point to 404 pages as well as links that point to old pages. I need to tidy this up as efficiently as possible and would like some advice on the best way to go about this.
Intermediate & Advanced SEO | | andyheath0 -
Substantial difference between Number of Indexed Pages and Sitemap Pages
Hey there, I am doing a website audit at the moment. I've notices substantial differences in the number of pages indexed (search console), the number of pages in the sitemap and the number I am getting when I crawl the page with screamingfrog (see below). Would those discrepancies concern you? The website and its rankings seems fine otherwise. Total indexed: 2,360 (Search Consule)
Intermediate & Advanced SEO | | Online-Marketing-Guy
About 2,920 results (Google search "site:example.com")
Sitemap: 1,229 URLs
Screemingfrog Spider: 1,352 URLs Cheers,
Jochen0 -
404 errors
Hi, we have plenty of 404 errors. We just deal with those that are of the highest priority (the ones that have high page authority). We have also a lot of errors like this: http://www.weddingrings.com/www.yoy-search.com . Does it make sense to redirect those to the home page or leave them as an 404 error?
Intermediate & Advanced SEO | | alexkatalkin0 -
How can you indexed pages or content on pages that are behind a pay wall or subscription login.
I have a client that has a boat of awesome content they provide to their client that's behind a pay wall ( ie: paid subscribers can only access ) Any suggestions mozzers? How do I get those pages index? Without completely giving away the contents in the front end.
Intermediate & Advanced SEO | | BizDetox0 -
How do you transition a keyword rank from a home page to a sub-page on the site?
We're currently ranking #1 for a valuable keyword, but the result on the SERP is our home page. We're creating a new product page focused on this keyword to provide a better user experience and create more relevant content. What is the best way to make a smooth transition to make the product page rank #1 for the keyword instead of the home page?
Intermediate & Advanced SEO | | buildasign0 -
Pages that were ranking now are not?
Hi Folks Noticed something strange just now pages that were ranking on position 10 on Google for searches such as 'ufc trainer kinect best price' at the start of this week are no longer ranking? Is what has happened to the site the famous google dance or sandbox effect as the site only officially went live on Monday If this is the case what is the recommended course of action to get back ranking competitively again? as I have no idea on what has gone wrong as I has always tried to follow best practice from these forums and the SEOMOZ and YOUMOZ articles My site is www.cheapfindergames.com Many Thanks Ian
Intermediate & Advanced SEO | | ocelot0 -
Should we deindex duplicate pages?
I work on an education website. We offer programs that are offered up to 6 times per year. At the moment, we have a webpage for each instance of the program, but that's causing duplicate content issues. We're reworking the pages so the majority of the content will be on one page, but we'll still have to keep the application details as separate pages. 90% of the time, application details are going to be nearly identical, so I'm worried that these pages will still be seen as duplicate content. My question is, should we deindex these pages? We don't particularly want people landing on our application page without seeing the other details of the program anyway. But, is there problem with deindexing such a large chunk of your site that I'm not thinking of? Thanks, everyone!
Intermediate & Advanced SEO | | UWPCE0 -
Removing pages from index
Hello, I run an e-commerce website. I just realized that Google has "pagination" pages in the index which should not be there. In fact, I have no idea how they got there. For example, www.mydomain.com/category-name.asp?page=3434532
Intermediate & Advanced SEO | | AlexGop
There are hundreds of these pages in the index. There are no links to these pages on the website, so I am assuming someone is trying to ruin my rankings by linking to the pages that do not exist. The page content displays category information with no products. I realize that its a flaw in design, and I am working on fixing it (301 none existent pages). Meanwhile, I am not sure if I should request removal of these pages. If so, what is the best way to request bulk removal. Also, should I 301, 404 or 410 these pages? Any help would be appreciated. Thanks, Alex0