Magento Core_URL_Rewrite Problems
-
Hi Everyone,
We are currently caught between a rock and a hard place with Magento and are wondering if anyone else has had similar problems and could share their advice.
Our Core_URL_Rewrite table now contains 1.3 million records for an account that has 12,000 products on 4 different store views. It has ballooned to the point that we are no longer able to reindex our URL Management.
The option that is being suggested to us is to truncate the table and start over, though this will essentially kill our SEO for those pages. (Since there are duplicates, I can only imagine how much the pages are going to be penalized for it.)
Would anyone have any advice other than truncating and starting over?
Any advice would be greatly appreciated.
Thanks!
-
Hi,
I found the exact problem you are facing, along with a solution, at this link:
http://magento.stackexchange.com/questions/17553/magento-core-url-rewrite-table-excessively-large
There are patches available at that link; however, do read this reply from that page first:
Bugs in earlier (and possibly current) versions of Magento are one cause. Another is that there's logic in this table that tries to track changes to the URL key value so that 301/302 rewrites are set up for old products. Because of this, and complicating things, truncating the table and regenerating may make existing URL rewrites go away, and this will have an unknown effect on your search engine listing (not necessarily bad, just hard to predict).
My general advice to clients who ask is
-
Leave the giant growing table as is if you don't have a good handle on your URL/SEO situation
-
Until the table size starts being a problem (generating site maps, for example). When that happens, get a handle on your URL/SEO situation.
-
Once you have a handle on your URL/SEO situation, backup the table, then truncate the table and regenerate. Address any URL/SEO problems caused by the truncating.
-
Automate step 3
Trying to fix this on the Magento code level is admirable, but you'll be swimming upstream. Sometimes it's better to accept that "That's just Magento being Magento", and to solve the problem with an external process.
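As a rough illustration of what such an external process might look like, here is a minimal Python sketch that only builds the SQL for the "backup, then truncate" step. The backup table naming, the `backup_and_truncate_sql` helper, and the assumption of a Magento 1 install (where the URL rewrites are regenerated with `php shell/indexer.php --reindex catalog_url`) are mine, not from the linked answer, so treat it as a starting point and try it on a copy of the database first.

```python
from datetime import date


def backup_and_truncate_sql(table="core_url_rewrite", stamp=None):
    """Return the SQL statements for a backup-then-truncate run, in order.

    The backup table name (<table>_backup_<stamp>) is a hypothetical
    convention chosen for this sketch.
    """
    stamp = stamp or date.today().strftime("%Y%m%d")
    backup = f"{table}_backup_{stamp}"
    return [
        # Copy the table structure, then the data, before touching anything.
        f"CREATE TABLE {backup} LIKE {table};",
        f"INSERT INTO {backup} SELECT * FROM {table};",
        # Only after the backup exists is the live table emptied.
        f"TRUNCATE TABLE {table};",
    ]


# Assumption: Magento 1, where the URL rewrite index is rebuilt from the CLI.
REINDEX_COMMAND = ["php", "shell/indexer.php", "--reindex", "catalog_url"]

if __name__ == "__main__":
    for statement in backup_and_truncate_sql(stamp="20240101"):
        print(statement)
    print(" ".join(REINDEX_COMMAND))
```

The statements are returned as strings on purpose, so they can be reviewed or run by hand before anything is wired into a scheduled job.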
I hope this helps, if you have further questions, then post a response, I will be happy to answer.
Regards,
Vijay
-
-
I'm not sure the answers previously presented are related to the issues you're having. Having worked with Magento for a long time, this can be an issue that occurs over and over again.
To answer your initial question, truncating your core_url_rewrite table will remove all of these URLs, but it'll only delay the problem until it recurs (unless it was caused by a past problem that has since been rectified). You're also correct that any rewrites previously in the system will disappear, so you'll probably end up with a lot of crawl issues appearing in Search Console.
Your best move would be to find out why you have so many URLs in there in the first place. Do you have a huge product catalog with multiple stores? Or is this something to do with an issue in your Magento version, or some setup issue? This most commonly occurs when two products get added to your site with the same URL Key: every time the reindex process runs, your core_url_rewrite table will grow. You could check this by noting the number of rows in the table and then reindexing the site; if the count grows further, this is likely to be the problem. The quickest way to fix this is to ensure all URL keys are unique.
There's also an article here about duplicate keys - https://firebearstudio.com/blog/magento-url-reindex-core_url_rewrite-duplicates-patch.html - this should hopefully clear the issue.
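If it helps, the duplicate URL key check could be sketched roughly like this in Python. It assumes you have exported your products as (product ID, store ID, URL key) rows; that layout and the `find_duplicate_url_keys` helper are made up for illustration, since where the URL key is stored depends on your Magento version.

```python
from collections import defaultdict


def find_duplicate_url_keys(rows):
    """Group product IDs by (store_id, url_key) and keep keys used more than once.

    rows: iterable of (product_id, store_id, url_key) tuples, e.g. from a
    catalog export. The tuple layout is an assumption for this sketch.
    """
    products_by_key = defaultdict(set)
    for product_id, store_id, url_key in rows:
        # Normalise case and whitespace so "Blue-Widget " matches "blue-widget".
        products_by_key[(store_id, url_key.strip().lower())].add(product_id)
    return {key: sorted(ids) for key, ids in products_by_key.items() if len(ids) > 1}


# Made-up sample data: products 101 and 102 share a key in store 1.
sample = [
    (101, 1, "blue-widget"),
    (102, 1, "blue-widget"),
    (103, 1, "red-widget"),
    (101, 2, "blue-widget"),
]
duplicates = find_duplicate_url_keys(sample)
print(duplicates)  # {(1, 'blue-widget'): [101, 102]}
```

Any key that comes back from this is a candidate for renaming before the next reindex.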
I hope this helps! If it doesn't solve the problem, then sending over a little more information about the number of stores, the catalog size and the split between system-generated URL rewrites and custom URL rewrites would be great, so we can try to help further!
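For the split between system-generated and custom rewrites, something like the query below should give the numbers per store. This is a sketch only: it runs against an in-memory SQLite stand-in with made-up rows, and it assumes the `store_id` and `is_system` columns as they appear in the Magento 1 `core_url_rewrite` table (1 for system rewrites, 0 for custom ones). On a real install you would run the same SELECT against MySQL.

```python
import sqlite3

# Count rewrites per store, split by system (1) versus custom (0).
SPLIT_QUERY = """
SELECT store_id, is_system, COUNT(*) AS rewrites
FROM core_url_rewrite
GROUP BY store_id, is_system
ORDER BY store_id, is_system
"""


def rewrite_split(conn):
    """Return (store_id, is_system, rewrites) rows for the given connection."""
    return conn.execute(SPLIT_QUERY).fetchall()


# Stand-in table with only the columns this sketch needs (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE core_url_rewrite "
    "(store_id INTEGER, is_system INTEGER, request_path TEXT)"
)
conn.executemany(
    "INSERT INTO core_url_rewrite VALUES (?, ?, ?)",
    [
        (1, 1, "blue-widget.html"),
        (1, 0, "blue-widget-old.html"),
        (1, 0, "blue-widget-1.html"),
        (2, 1, "blue-widget.html"),
    ],
)
split = rewrite_split(conn)
print(split)  # [(1, 0, 2), (1, 1, 1), (2, 1, 1)]
```

A store where the custom count dwarfs the system count is usually the place to start looking.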
Thanks,
Lewis
-
This is an issue due to set-up. When you set up multiple ecommerce websites on Magento as 'Stores', all SKUs will load on the other domains. If they were set up as 'Websites', this would alleviate the issue. However, with Stores you are able to share shopping carts (i.e. add a product from website A and check out on website B).
What I did was turn off the XML cron jobs and set up cross-domain canonicals. Also make sure your session IDs (/?SID=) are working properly. Not sure if this solves the technical issues, but it should help clear up dupe content.
-
Is it creating a new URL for each option (size, color, etc.), as well as for each page the product shows up on, the various sort orders (by price, by size, etc.) and session IDs, which you could exclude? Are you sure they are truly duplicates?