Killing 404 errors on our site in Google's index
-
Having moved a site across to Magento, redirects were obviously a large part of the job: we had to make sure all the old products and categories linked up correctly with the new site structure.
However, we came up against an issue where we needed to add, delete, then re-add products. This, coupled with a misunderstanding of how the CSV upload is processed, meant that although the old URLs redirected, some of the new Magento URLs changed and then didn't redirect:
For example:
mysite/product
would get deleted, re-added, and become:
mysite/product-1324
We now know what we did wrong, so it won't keep happening if we were to delete and re-add a product. However, Google still has all these old URLs in its index, so people search for products on Google, click through, and land on the 404 page, which is far from ideal.
We kind of assumed that, with continual updating of sitemaps and time, Google would realise and update the URLs accordingly. But this hasn't happened: we are still getting plenty of 404 errors on certain product searches. (These aren't appearing in SEOmoz because there are no links to the old URLs on the site itself; only Google has them, as its index contains the old URLs.)
Aside from going through and finding the products affected (no easy task), and setting up redirects for each one, is there any way we can tell Google 'These URLs are no longer a thing, forget them and move on, let's make a fresh start and Happy New Year'?
-
No canonical back to the main product page?
-
Both helpful replies, thanks. Further investigation led me to this Magento bug:
http://www.magentocommerce.com/bug-tracking/issue/?issue=13662
(You need a Magento account to see the bug report.)
It seems there's a separate underlying issue which we need to fix first: the rewrite table grows exponentially every time we index Magento, and a new URL is created for every configurable product (i.e. a product that has one or more associated products with the same name, used for displaying different sizes and colours). This means that Google is picking up a new page for each configurable product each time it indexes: different URL, same content, same product SKU. A technical SEO nightmare!
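To get a feel for how many products are affected, the "different URL, same SKU" pattern can be detected from an export of the rewrite data. This is only a sketch: it assumes you can produce (sku, url) pairs from your own store, and the function name and sample rows are hypothetical rather than anything Magento provides.

```python
from collections import defaultdict


def duplicate_urls_by_sku(rows):
    """Map each SKU to its URLs, keeping only SKUs that have more than one URL."""
    urls = defaultdict(set)
    for sku, url in rows:
        urls[sku].add(url)
    # A SKU with a single URL is healthy; several URLs means duplicate pages.
    return {sku: sorted(paths) for sku, paths in urls.items() if len(paths) > 1}


# Hypothetical sample rows standing in for a real export.
sample_rows = [
    ("SKU-1", "/product"),
    ("SKU-1", "/product-1324"),
    ("SKU-2", "/another-product"),
]
duplicates = duplicate_urls_by_sku(sample_rows)
```

Each key in the result is a product that currently has more than one indexable URL, which is the list you would need for canonicals or redirects.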
-
Hey Sean
This should take care of itself but there are a few things you can do to help.
1. Firstly, using webbug or some such, make sure the page is returning an HTTP 404 or 410 code. It may be displaying some kind of 404-like page, but it needs to actually send the 4XX code back to Google (so they can update their index and remove the URLs).
2. Then, you can log into webmaster tools and remove URLs from your site:
Webmaster Tools > Optimisation > Remove URLs
This way you can manually remove them.
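If there are a lot of URLs, the status check in step 1 can be scripted rather than done one page at a time. A minimal sketch using only the Python standard library; the function names are made up for illustration, and it assumes the server answers HEAD requests the same way it answers GET.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the final HTTP status code for url (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is on the exception.
        return error.code


def classify_status(code):
    """Describe what a status code means for a URL we want dropped from the index."""
    if code in (404, 410):
        return "gone"  # the signal Google needs to remove the URL
    if 200 <= code < 300:
        # Either a working page (possibly reached via a redirect) or an
        # error page served with a success code, i.e. a soft 404.
        return "ok-or-soft-404"
    return "check-manually"
```

Calling `classify_status(fetch_status(url))` for each old URL gives a quick list of pages that look like a 404 to visitors but do not send one to Google.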
Alternatively, you could always just manually add some 301 redirects for those pages which may be the quickest way to sort this out and certainly provides the best experience for any users clicking on those links in the SERPs.
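The redirect list does not have to be typed out by hand if the changed URLs follow the pattern from the question, where a re-added product gained a numeric suffix. The sketch below rests on that assumption (old path = current path minus a trailing "-digits"), so a product whose real name ends in a number would be a false positive and the output needs reviewing. The function name is hypothetical, and the output uses Apache's `Redirect 301` directive purely as an example format; adapt it to your server or to Magento's own URL rewrite management.

```python
import re

# Assumption: the re-import appended "-<digits>" to the original URL key.
SUFFIX = re.compile(r"-\d+$")


def redirect_rules(current_paths):
    """Build 'Redirect 301 old new' lines for paths that carry a numeric suffix."""
    rules = []
    for path in current_paths:
        old_path = SUFFIX.sub("", path)
        if old_path != path:
            rules.append(f"Redirect 301 {old_path} {path}")
    return rules


# Hypothetical current paths; only the suffixed one yields a rule.
rules = redirect_rules(["/product-1324", "/another-product"])
```

Here `rules` contains the single line `Redirect 301 /product /product-1324`.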
Hope that helps!
Marcus
-
Complex thing. Not sure whether this will help you or not:
Example meta tag
Add the following meta tag in the HTML source of your page:
<meta http-equiv="expires" content="Mon, 27 Sep 2010 14:30:00 GMT">