Sitemap.xml - autogenerated by CMS is full of crud
-
Hi all,
Hope you can help.
The Magento ecommerce system I'm working with autogenerates sitemap.xml. It's well formed, with priority and frequency parameters.
However, it has generated lots of URLs that point to broken pages returning fatal errors, duplicate URLs (not canonicals), 404s, etc.
I'm thinking of hand-creating sitemap.xml. The site has around 50 main pages, including products and categories, and I can get the main page URLs listed by Screaming Frog or Xenu.
Then I'll have to hand-edit the crud pages to add noindex, and add canonicals to the useful duplicates.
Is this the way to go, or is there another solution?
Thanks in advance for any advice.
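For anyone weighing up the hand-built route, here is a minimal Python sketch of what it could look like, assuming the crawler's URL list has already been exported as plain strings. The function name `build_sitemap` and the default `changefreq` and `priority` values are placeholders for illustration, not anything Magento itself produces.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, changefreq="weekly", priority="0.5"):
    """Return sitemap XML as a string for an iterable of absolute URLs.

    Blank entries and exact duplicates are skipped. The changefreq and
    priority defaults are illustrative assumptions; adjust per page.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in urls:
        url = url.strip()
        if not url or url in seen:
            continue
        seen.add(url)
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped on output
        ET.SubElement(entry, "changefreq").text = changefreq
        ET.SubElement(entry, "priority").text = priority
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical URLs standing in for a crawler export.
    pages = ["https://example.com/", "https://example.com/category/widgets"]
    print(build_sitemap(pages))
```

This only drops exact duplicates. Deciding which of several near-duplicate URLs is the canonical one still has to be done by hand or in the CMS.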
-
If the cron is working, then I would personally turn to the Magento forums to see if anyone knows a way to rope those messy URLs in and get them under control. I try to avoid manually generating and updating sitemaps whenever I can, because it's a hassle even on a small site, not to mention the trouble on an ecommerce site.
If your site is going to stay that small, then a manual sitemap might be less of a headache for you than customizing Magento.
I would worry about keeping a clean sitemap. If the search engines learn that you keep a messy sitemap, they will rely on it less and less. URLs returning 404 and 500 codes are the worst offenders, but redirects and perhaps duplicate content count against you too.
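One way to check a URL list before it goes into a sitemap is sketched below in Python, using only the standard library. It keeps URLs that answer 200 and sets aside 404s, 500s and redirects. The names `http_status` and `split_clean` are hypothetical, and the sketch assumes the server answers HEAD requests sensibly; some servers don't, in which case a GET would be needed. It does not detect duplicate content.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 3xx codes stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # unhandled redirect surfaces as an HTTPError


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it is unreachable."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx and 5xx responses land here
    except OSError:
        return None  # DNS failure, refused connection, timeout


def split_clean(urls, get_status=http_status):
    """Split urls into (clean, rejected).

    clean is a list of URLs that returned 200; rejected maps every other
    URL to the status it returned (None when unreachable).
    """
    clean, rejected = [], {}
    for url in urls:
        status = get_status(url)
        if status == 200:
            clean.append(url)
        else:
            rejected[url] = status
    return clean, rejected
```

Passing `get_status` as a parameter makes it possible to try the logic against canned responses before pointing it at a live site.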
For further reading:
Google Sitemaps Ask For Clean URLs - http://www.johnfdoherty.com/google-sitemaps-ask-for-clean-urls/
-
Hi Kane,
The sitemap is new. It's just that Magento creates lots of duplicate files on the fly, and it's not putting the canonical URLs in the sitemap, etc.
I just wondered whether it's worth hand-creating a sitemap.xml containing the content pages (60 or 70 of them) for this relatively small site, or whether I shouldn't worry too much about the sitemap, since the site is pretty well indexed by Google already.
I'll head over to the Magento forums again to see if I can find more info.
Many thanks for your help.
-
If it's returning 404 pages, that sounds like a dated sitemap. Have you activated the cron service?
See the "Refreshing Sitemaps at Regular Intervals" section of this page if not:
Magento can be set up to automatically refresh Google Sitemaps at regular intervals. This function is configured in Admin > System > Configuration > Google Sitemap.
To use Magento’s automatic generation of Google Sitemaps, you must activate the Magento Cron service.
If you do have that set up, and you're certain it's working correctly, then I would turn to the forums at MagentoCommerce.com. You're going to get a much faster answer there, since everyone is familiar with that exact platform.