Killing 404 errors on our site in Google's index
-
We recently moved a site across to Magento. Redirects were obviously a large part of that, ensuring all the old products and categories linked up correctly with the new site structure.
However, we came up against an issue where we needed to add, delete, then re-add products. This, coupled with a misunderstanding of the CSV upload processing, meant that although the old URLs redirected, some of the new Magento URLs changed and then didn't redirect.
For example:
mysite/product
would get deleted, re-added, and become:
mysite/product-1324
We now know what we did wrong, so it shouldn't happen again if we were to delete and re-add a product. But Google still holds all these old URLs in its index, so people search for products on Google, click through, and land on the 404 page - far from ideal.
We kind of assumed that, with continual updating of sitemaps and time, Google would realise and update the URLs accordingly. But this hasn't happened - we are still getting plenty of 404 errors on certain product searches. (These aren't appearing in SEOmoz; there are no links to the old URLs on the site itself, only in Google, as its index contains the old URLs.)
Aside from going through and finding the products affected (no easy task), and setting up redirects for each one, is there any way we can tell Google 'These URLs are no longer a thing, forget them and move on, let's make a fresh start and Happy New Year'?
-
No canonical back to the main product page?
-
Both helpful replies, thanks. Further investigation led me to this Magento bug:
http://www.magentocommerce.com/bug-tracking/issue/?issue=13662
(You need a Magento account to see the bug report.)
It seems there's a separate underlying issue which we need to fix first: the rewrite table grows exponentially every time we reindex Magento, and a new URL is created for every configurable product (i.e. a product that has one or more associated products with the same name, used for displaying different sizes and colours). This means that Google is picking up a new page for each configurable product after each reindex: different URL, same content, same product SKU - a technical SEO nightmare!
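To get a feel for how many products are affected, something like the following could help. It is only a rough sketch: it assumes you can export (SKU, URL) pairs from a crawl or a database dump into a list, and the function name and sample rows are made up for illustration. It does not read Magento's rewrite table directly.

```python
from collections import defaultdict


def duplicate_urls_by_sku(rows):
    """Group URLs by SKU and keep only SKUs reachable under more than one URL.

    `rows` is an iterable of (sku, url) pairs. This is a hypothetical export
    format, not a Magento API.
    """
    urls_by_sku = defaultdict(set)
    for sku, url in rows:
        urls_by_sku[sku].add(url)
    # A SKU with a single URL is fine; more than one suggests duplicate pages.
    return {
        sku: sorted(urls)
        for sku, urls in urls_by_sku.items()
        if len(urls) > 1
    }


# Illustrative data only.
sample_rows = [
    ("SKU-1", "/shirt"),
    ("SKU-1", "/shirt-1324"),
    ("SKU-2", "/hat"),
]
duplicates = duplicate_urls_by_sku(sample_rows)
```

Every SKU that comes back with more than one URL is a candidate for a redirect or a canonical tag once the underlying bug is fixed.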
-
Hey Sean
This should take care of itself but there are a few things you can do to help.
1. Firstly, using webbug or some such, make sure the page is returning an HTTP 404 or 410 code. It may be displaying some kind of 404-like page, but you need to confirm that it is actually sending the 4XX code back to Google (so they can update their index and remove the URLs).
2. Then, you can log into webmaster tools and remove URLs from your site:
Webmaster Tools > Optimisation > Remove URLs
This way you can manually remove them.
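If there are a lot of URLs to check, the status check in step 1 can be scripted. Below is a minimal sketch in Python using only the standard library. The function names and labels are my own, and the request is made without following redirects so that a 301 shows up as a 301 rather than as the final page.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Return the raw HTTP status for `url` without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def classify(status):
    """Label a status code from the point of view of getting a URL dropped."""
    if status in (404, 410):
        return "gone"            # a real 4XX is being sent
    if status in (301, 308):
        return "redirected"      # a permanent redirect is already in place
    if 200 <= status < 300:
        return "soft-404-risk"   # page says "not found" but sends 2XX
    return "check-manually"


def report(urls, fetch=fetch_status):
    """Map each URL to (status, label). `fetch` is injectable for testing."""
    return {url: (status, classify(status)) for url in urls for status in [fetch(url)]}
```

Calling `report([...])` with the dead URLs returns a dictionary you can scan for anything labelled `soft-404-risk`, which are the pages that look like a 404 to a visitor but not to Google.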
Alternatively, you could always just manually add some 301 redirects for those pages, which may be the quickest way to sort this out and certainly provides the best experience for any users clicking on those links in the SERPs.
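If the dead URLs differ from the live ones only by a numeric suffix (as in the mysite/product versus mysite/product-1324 example), the redirect list could be drafted with a script rather than by hand. This is a sketch under that assumption: it matches paths by stripping a trailing "-digits" suffix and prints Apache mod_alias `Redirect 301` lines. That assumes the site runs on Apache, and the function names are hypothetical. Products whose genuine URL ends in a number will be mismatched, so review the output before using it.

```python
import re

# Trailing "-<digits>" suffix, e.g. "/product-1324" -> "/product".
NUMERIC_SUFFIX = re.compile(r"-\d+$")


def base_key(path):
    """Strip a trailing slash and a trailing numeric suffix from a URL path."""
    return NUMERIC_SUFFIX.sub("", path.rstrip("/"))


def build_redirects(dead_paths, live_paths):
    """Pair each dead path with the single live path sharing its base key.

    Returns (rules, unmatched): `rules` are Apache `Redirect 301` lines,
    `unmatched` are dead paths with zero or several candidates, which need
    a manual decision.
    """
    live_by_key = {}
    for path in live_paths:
        live_by_key.setdefault(base_key(path), []).append(path)

    rules, unmatched = [], []
    for dead in dead_paths:
        candidates = [p for p in live_by_key.get(base_key(dead), []) if p != dead]
        if len(candidates) == 1:
            rules.append(f"Redirect 301 {dead} {candidates[0]}")
        else:
            unmatched.append(dead)
    return rules, unmatched


# Illustrative data only.
rules, unmatched = build_redirects(
    ["/product", "/shirt"],
    ["/product-1324", "/shirt-1", "/shirt-2"],
)
```

Ambiguous paths are deliberately left in `unmatched` rather than guessed, since a wrong 301 is worse than a 404.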
Hope that helps!
Marcus
-
Complex thing. Not sure if this will help you or not:
Example meta tag
Add the following meta tag in the HTML source of your page:
<meta http-equiv="expires" content="Mon, 27 Sep 2010 14:30:00 GMT">