404 from a 404 that 301s
-
I must be missing something or skipping a step or lacking proper levels of caffeine.
Under my High Priority warnings I have a handful of 404s which are like that on purpose, but I'm not sure how Moz is finding them. When I check the referrer info, the 404 is being linked to from a different 404 which is now a 301 (due to the craziness of our system and what was easiest for the coders when they fixed a different problem ages ago). Basically, if a user types a non-existent model number into the URL, there is a specific 404 that comes up. While the 404 error is "site.com/product/?model=abc123", the referrer is "site.com/product?model=abc123" (or more simply, one slash is missing). I can't see how Moz is finding the referrer, so I can't figure out how to make Moz stop crawling it. I actually have the same problem in Google WMT for the same group of 404s.
What am I just not seeing that will fix this?
-
Let me know if it works, Mike. There is actually a third possibility:
Some page(s) might generate a dynamic URL only when visited by a browser or search agent. If that's the case, you can set up event tracking on your website in conjunction with Google Analytics and track the referrer:
_gaq.push(['_trackEvent', 'Error', '404', 'page: ' + document.location.pathname + document.location.search + ' ref: ' + document.referrer ]);
After you collect some data (submit your website to Google WMT or wait for the next Moz crawl), you can export it and run your filter.
The alternative to this method could be one of the following two:
- Enable extreme debug/log mode on your programming platform and collect the logs for further processing. You can run a small Python script to find the RegEx pattern. I'd advise setting up a demo copy of your website on a subdomain and running this experiment there. You can then submit the demo subdomain to Google Webmaster Tools and wait for the crawlers.
- Reconfigure your web server logging (httpd.conf if you're using Apache) to log more detail. Make sure you switch back to the normal logging configuration afterwards so the logs don't eat up your storage.
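Here is a minimal sketch of the kind of small Python script meant above. It assumes the server writes Apache's "combined" log format (which includes the Referer and User-Agent headers) and that the URLs look like the /product?model=... examples from the question; the function name and the URL pattern are illustrative, so adjust both to your own setup.

```python
import re
from collections import Counter

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed pattern: /product?model=... with or without the slash before "?"
MODEL_URL = re.compile(r'^/product/?\?model=')

def referrers_for_model_hits(lines, statuses=("301", "404")):
    """Count (path, status, referer, agent) for model URLs answered with 301 or 404."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in combined format
        if m["status"] in statuses and MODEL_URL.match(m["path"]):
            counts[(m["path"], m["status"], m["referer"], m["agent"])] += 1
    return counts
```

You could pass it an open log file (it only needs an iterable of lines) and print `counts.most_common()`. A 301 row for the slash-less URL with a non-empty referer is the one that should tell you where the crawler picked the link up.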
Good luck,
Ali
-
I had done about half of that... I'll take a look at all of it and try again tomorrow following your suggestions and see if I can figure it out then. Thanks.
-
Hi Mike,
Hope all is well. There are two things that might have caused this confusion. Either you have some outdated links somewhere on your website that lead to the custom 404 page, or some external link is pointing back to your website with a wrong URL or a missing product. To find the link (I say so because a crawler has to hit a link in order to crawl, so there is definitely one), you can use a tool like Ahrefs link analysis and see what is pointing where. Export the results to Excel and filter them with a RegEx built from the 404-generating pattern you already have from Moz or Google WMT. Once you find one, you'll know where they are coming from and how to fix them. If there aren't many, you can write custom redirects in your htaccess. If there are many, though, htaccess could slow down your website, and the best way would be a back-end redirect, either custom coded or through a plugin, depending on your platform. I would start from:
- my error_logs in the web server logs, and match them with the WMT and Moz reports
- download the CSV and import it into Excel or the program of your choice
- filter based on the pattern
- match it with where you've found the link through Ahrefs
- and voila, now you know exactly how to clean them up
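If you'd rather not do the filter step in Excel, the same thing can be sketched in Python. This assumes the export is a CSV with a header row; the column names "URL" and "Referrer" are placeholders, since the real names depend on which tool produced the file, and the pattern is based on the example URLs from the question.

```python
import csv
import io
import re

# Assumed 404-generating pattern: /product?model=... with or without the slash
MODEL_404 = re.compile(r'/product/?\?model=[^&#]+', re.IGNORECASE)

def filter_rows(csv_text, url_column="URL"):
    """Return only the rows whose URL column matches the 404 pattern."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if MODEL_404.search(row.get(url_column) or "")]

# Tiny made-up export to show the shape of the result
data = (
    "URL,Referrer\n"
    "http://site.com/product/?model=abc123,http://site.com/product?model=abc123\n"
    "http://site.com/about,http://site.com/\n"
)
for row in filter_rows(data):
    print(row["URL"], "<-", row["Referrer"])
```

The rows that survive the filter are the ones to compare against the link report, since their referrer column shows which page each bad URL was reached from.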
Hope this helps Mike,
Have a nice day,
Ali