What if a 404 error is not possible?
-
Hi Everyone,
I get an 404 error in my page if the URL is simply wrong, but for some parameters, like if a page has been deleted, or has expired, I get an error page indicating that the ID is wrong, but no 404 error.
It is for me very difficult to program a function in php that solve the problem and modify the .htaccess with the mod_rewrite. I ask the developer of the system to give a look, but I am not sure if I will get an answer soon.
I can control the content of the deleted/expired page, but the URL will be very similar to those that are ok (actually the url could has been fine, but now expired).
Thinking of solutions I can set the expired/deleted pages as noindex, would it help to avoid duplicated title/description/content problem? If an user goes to i.e., mywebsite.com/1-article/details.html I can set the head section to noindex if it has expired. Would it be good enough?
Other question, is it possible anyhow to set the pages as 404 without having to do it directly in the .htacess, so avoiding the mod_rewrite problems that I am having? Some magical tag in the head section of the page?
Many thanks in advance for your help,
Best Regards,
Daniel
-
The pages should not show up at all once they are de-indexed.
-
Hi Takeshi, thanks for the answer again.
Would it prevent the deleted/expired pages from showing up as soft 404s in Webmaster Tools?
-
Ok, sounds like a noindex,follow in the header is the best solution then. That will keep the no-longer-existent pages from being indexed while still preserving any link juice the page may have acquired.
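For reference, a sketch of what that would look like in the head section of the expired-listing template (just the one robots meta tag matters here):

```html
<head>
  <!-- Keeps the expired listing out of the index while still
       letting crawlers follow the links on the page -->
  <meta name="robots" content="noindex,follow">
</head>
```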
-
Hi Again,
@Takeshi Young: Thanks for your answer.
I will try to explain what is happening a little better.
We are using a CMS for classified ads. The script can generate "SEO friendly" URLs, which are based on mod_rewrite. If a listing has an ID number, let's say "5", that listing's URL will look like this:
http://mydomain.com/5-listingname/details.html
After the listing expires, the URL is no longer valid, and if a user tries to visit the listing, the script delivers a page with a message indicating that the listing is not active any longer. The HTTP status code is 200 "OK". If the listing is deleted, a user visiting the URL gets a similar message, also with HTTP status 200. This is a problem, because that page should return a 404 status, telling the search engine that the page is gone.
If a user tries to visit an invalid page, for example:
http://mydomain.com/invalidpage.html
then the system delivers the 404 page set in the .htaccess file. But since the script recognises the numeric parameter in the deleted/inactive listing URL, it does not deliver a 404 error but a page with a message, and that page is a soft 404 error, which is bad for SEO.
It is beyond my knowledge to repair the script so that it delivers the proper 404 header, but I can customise the error page itself as much as I want.
Then I have two questions:
-
If I set the soft 404 error page to noindex, will that be enough to avoid the problem?
-
Is there any way of telling the search engine that a page is a 404, other than using the Apache .htaccess? A tag in the head section, or any other trick that would help with this problem?
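In case it helps whoever answers: this is roughly what I imagine, a sketch assuming the error template is plain PHP and runs before any HTML output. The listing-status flag here is hypothetical; the real check would have to come from the CMS.

```php
<?php
// Sketch: send a real 404 status from PHP itself, no .htaccess change needed.
// This must run before any output is sent, or the status line can no longer
// be modified. The $listingGone flag is hypothetical -- it stands in for
// whatever expired/deleted check the CMS actually provides.

function send_gone_listing_response(bool $listingGone): int
{
    if ($listingGone) {
        http_response_code(404);   // real 404 status line (PHP >= 5.4)
        return 404;
    }
    return 200;                    // active listing: leave the status alone
}

// In the expired/deleted-listing template, before any HTML:
// send_gone_listing_response(true);
// ...then render the custom "listing no longer active" message as usual.
```

That way the custom error message could still be shown to visitors while crawlers receive a genuine 404 status.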
Thanks in advance for your help,
Daniel
-
Why are these parameters an issue for you? Where are they getting linked from? If it's from a high authority external site, it may make sense to 301 redirect them. If they're just low quality sites, it's probably safe to ignore.
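For example, a redirect along these lines in .htaccess would pass the old URL's value to a relevant live page (the paths here are made up for illustration):

```apache
# Hypothetical example: permanently redirect one expired listing URL
# that has valuable external links pointing at it.
Redirect 301 /5-old-listing/details.html http://mydomain.com/5-new-listing/details.html
```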