Sitemap.xml - autogenerated by CMS is full of crud
-
Hi all,
hope you can help.
the Magento ecommerce system I'm working with autogenerates sitemap.xml - it's well formed with priority and frequency parameters.
However, it has generated lots of URLs that are pointing to broken pages returning fatal erros, duplicate URLs (not canonicals), 404s etc
I'm thinking of hand creating sitemap.xml - the site has around 50 main pages including products and categories, and I can get the main page URLs listed by screaming frog or xenu.
Then I'll have to get into the hand editing the crud pages with noindex, and useful duplicates with canonicals.
Is this the way to go or is there another solution
thanks in advance for any advice
-
If the cron is working then I would personally turn to the other forum to see if anyone knows a way to rope those messy URLs in and get them under control. I try to avoid manually generating and updating sitemaps whenever I can, because it's a hassle on a small site, not to mention the trouble on an ecommerce site.
If your site is going to stay that small, then a manual sitemap might be less of a headache for you than customizing Magento.
I would worry about keeping a clean sitemap. If the search engines learn that you keep a messy sitemap, they will rely on it less and less. 404 & 500 codes especially, but also redirects and perhaps duplicate content.
For Further Reading:
Google Sitemaps Ask For Clean URLs - http://www.johnfdoherty.com/google-sitemaps-ask-for-clean-urls/
-
Hi Kane,
the sitemap is new - it's just that Magento create lots of duplicate files on the fly & it's not putting the canonical URLs in the sitemap etc.
I just wondered whether its worth hand creating a sitemap.xml containing the content pages (60 or 70 of them) for this relatively small site, or not worry too much about the sitemap, the site is pretty well indexed by google already
I'll head over to the Magento forums again to see if I can find more info
many thanks for you help
-
If it's returning 404 pages, that sounds like a dated sitemap. Have you activated the cron service?
See the "Refreshing Sitemaps at Regular Intervals" section of this page if not:
Magento can be set up to automatically refresh Google Sitemaps at regular intervals. This function is configured in Admin > System > Configuration > Google Sitemap.
To use Magento’s automatic generation of Google Sitemaps, you must activate the Magento Cron service.
If you do have that setup, and you're certain it's working correctly, then I would turn to the forums at MagentoCommerce.com - you're going to get a lot faster answer there since everyone is familiar with that exact platform.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Moving Shopify from a Sub Domain to the Full Domain
Apologies if this has been asked before. Currently we have a Shopify shop on a subdomain shop.lucybee.com a blog on subdomain blog.lucybee.com and the full domain of course. We'd now like to move everything onto Shopify on the full domain. Therefore the blog rolls into Shopify. We'll manually move pages into Shopify. Any advice or links to resources on how to manage would be gratefully received. Thank you , Jim
Technical SEO | | LucyBee0 -
Sitemap
I have a question for the links in a sitemap. Wordpress works with a sitemap that first link to the different kind of pages: pagesitemap.xml categorysitemap.xml productsitemap.xml etc. etc. These links on the first page are clickable. We have a website that also links to the different pages but it's not clickable, just a flat link. Is this an issue?
Technical SEO | | Happy-SEO0 -
Type of sitemap
I have a client with a large sitemap in html for his web shop. I am wondering though if i would be better to have a xml sitemap for Google. Is there any advantage in type of sitemap?
Technical SEO | | auke18100 -
Some pages on my site are not linked - should I add a Visual SiteMap?
Hello, I have a site that does not have a blog feed.
Technical SEO | | NikitaG
And unless it is done Manually there is no way to see the blog links.
www.MigrationLawyers.co.za Now, I submit the the Sitemap to google, but will it be a good Idea to include an actual sitemap of the site (for example in the footer of the site)
http://migrationlawyers.co.za/sitemap-immigration-south-africa and should i Make the "sitemap" link a follow or nofollow? Thanks so much in advance
Nikita0 -
How to display the full structure of website on Google serps
I have been searching around but unable to gather information as to how we can control or list top pages of a website on Google's first page , i.e. if we type seomoz in google , we can see the main listing with 6 subdomain listings , which link to Blog , Seo tool , Beginner Seo guide , Learn Seo , Pricing & Plans and login My question is can we control these listings i.e. what to display and what not , and if yes how can we make this type of visibility on first page , by using html or xml sitemaps or theirs something mostly websites are missing. Cause this type of data is coming up for very less websites and mostly websites are with single urls. c43Ki.jpg
Technical SEO | | ngupta10 -
Host sitemaps on S3?
Hey guys, I run a dynamic web service and I will start building static sitemaps for it pretty soon. The fact that my app lives in a multitude of servers doesn't make it easy to distribute frequently updated static files throughout the servers. My idea was to host the files in AWS S3 and point my robots.txt sitemap directive there. I'll use a sitemap index so, every other sitemap will be hosted on S3 as well. I could dynamically mirror the content from the files in S3 through my app, but that would be a little more resource intensive than just serving the static files from a common place. Any ideas? Thanks!
Technical SEO | | tanlup0 -
Does Bing support a news sitemap yet?
With Bing's new app that will integrate their news feed into Facebook, I'd like to optimize for inclusion in Bing news pickup. Does Bing accept news sitemaps yet?
Technical SEO | | Aggie0 -
Benefits to having an HTML sitemap?
We are currently migrating our site to a new CMS and in part of this migration I'm getting push-back from my development team regarding the HTML sitemap. We have a very large news site with 10s of thousands of pages. We currently have an HTML sitemap that greatly helps with distributing PR to article pages, but is not geared towards the user. The dev team doesn't see the benefit to recreating the HTML sitemap despite my assurance that we don't want to lose all these internal links since removing 1000s of links could have a negative impact on our Domain Authority. Should I give in and concede the HTML sitemap since we have an XML one? Or am I right that we don't want to get rid of it?
Technical SEO | | BostonWright0