Sitemap.xml - autogenerated by CMS is full of crud
-
Hi all,
hope you can help.
the Magento ecommerce system I'm working with autogenerates sitemap.xml - it's well formed with priority and frequency parameters.
However, it has generated lots of URLs that are pointing to broken pages returning fatal erros, duplicate URLs (not canonicals), 404s etc
I'm thinking of hand creating sitemap.xml - the site has around 50 main pages including products and categories, and I can get the main page URLs listed by screaming frog or xenu.
Then I'll have to get into the hand editing the crud pages with noindex, and useful duplicates with canonicals.
Is this the way to go or is there another solution
thanks in advance for any advice
-
If the cron is working then I would personally turn to the other forum to see if anyone knows a way to rope those messy URLs in and get them under control. I try to avoid manually generating and updating sitemaps whenever I can, because it's a hassle on a small site, not to mention the trouble on an ecommerce site.
If your site is going to stay that small, then a manual sitemap might be less of a headache for you than customizing Magento.
I would worry about keeping a clean sitemap. If the search engines learn that you keep a messy sitemap, they will rely on it less and less. 404 & 500 codes especially, but also redirects and perhaps duplicate content.
For Further Reading:
Google Sitemaps Ask For Clean URLs - http://www.johnfdoherty.com/google-sitemaps-ask-for-clean-urls/
-
Hi Kane,
the sitemap is new - it's just that Magento create lots of duplicate files on the fly & it's not putting the canonical URLs in the sitemap etc.
I just wondered whether its worth hand creating a sitemap.xml containing the content pages (60 or 70 of them) for this relatively small site, or not worry too much about the sitemap, the site is pretty well indexed by google already
I'll head over to the Magento forums again to see if I can find more info
many thanks for you help
-
If it's returning 404 pages, that sounds like a dated sitemap. Have you activated the cron service?
See the "Refreshing Sitemaps at Regular Intervals" section of this page if not:
Magento can be set up to automatically refresh Google Sitemaps at regular intervals. This function is configured in Admin > System > Configuration > Google Sitemap.
To use Magento’s automatic generation of Google Sitemaps, you must activate the Magento Cron service.
If you do have that setup, and you're certain it's working correctly, then I would turn to the forums at MagentoCommerce.com - you're going to get a lot faster answer there since everyone is familiar with that exact platform.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How would I make an image sitemap for a site with 50k images?
Is it worth creating an image sitemap? I read an image sitemap can only have 1k entries would I need to make 50 different ones?
Technical SEO | | EcommerceSite0 -
Submitting a new sitemap index file. Only one file is getting read. What is the error?
Hi community, I am working to submit a new a new sitemap index files, where about 5 50,000 sku files will be uploaded. Webmasters is reporting that only 50k skus have been submitted. Google Webmasters is accepting the index, however only the first file is getting read. I have 2 errors and need to know if this is the reason that the multiple files are not getting uploaded. Errors: | 1 | | Warnings | Invalid XML: too many tags | Too many tags describing this tag. Please fix it and resubmi | | 2 | | Warnings | Incorrect namespace | Your Sitemap or Sitemap index file doesn't properly declare the namespace. | 1 | Here is the url I am submitting: http://www.westmarine.com/sitemap/wm-sitemap-index.xml | 1 | | | | |
Technical SEO | | mm9161570 -
Will sitemap generated in Yoast for a combined wordpress/magento site map entire site ?
Hi For an ecommerce site thats been developed via a combination of wordpress and magento and has yoast installed, will the sitemap (& other yoast features) map (& apply to) the entire site or just wordpress aspects ? In other words does one need to do anything else to have a full sitemap for a combined magento/wordpress site or will Yoast cover it all ? This link seems to suggest should be fine but seeing if anyone else encountered this and had problems or if straightforward ? http://fishpig.co.uk/wordpress-integration/docs/plugins.html cheers dan
Technical SEO | | Dan-Lawrence0 -
Do I need a link to my sitemap?
I have a very large sitemap. I submit it to both Google and Bing, but do I need a link to it? If someone went there it would probably lock their browser. Is there any danger of not having a link if I submit it to Google and Bing?
Technical SEO | | EcommerceSite0 -
Does Bing support a news sitemap yet?
With Bing's new app that will integrate their news feed into Facebook, I'd like to optimize for inclusion in Bing news pickup. Does Bing accept news sitemaps yet?
Technical SEO | | Aggie0 -
WordPress blog and XML sitemap
I have a friend that just spent 15K on a new site and believe it or not the developer did not incorporate a CMS into the site. If a WP blog is built and the URL is added to the site's XML sitemap, for all intensive purposes, would Google view this URL as part of the site in terms of overall number of links, referring domains etc.? The developer is saying that even if the WP URL is added to the XML sitemap, Google will not view this URL as part of the site domain. I cannot think of another way of adding unique content to the site unless the developer is paid to build new pages every month. If the WP blog is not part of the overall domain, then we're left with the URL simply pointing back to the domain with anchor text and such and not adding to the total number of links and RD... ANY THOUGHTS ON THIS WOULD BE GREATLY APPRECIATED! Thanks Mozzers!
Technical SEO | | hawkvt10 -
Sitemap for dynamic website with over 10,000 pages
If I have a website with thousands of products, is it a good idea to create a sitemap for this website for the search engines where you show maybe 250 products on a page so it makes it easy for the search engine to find the part and also puts that part closer to the home page? Seems like google likes pages that are the closest to the home page (less clicks the better)
Technical SEO | | roundbrix0 -
301 redirects inside sitemaps
I am in the process of trying to get google to follow a large number of old links on site A to site B. Currently I have 301 redirects as well a cross domain canonical tags in place. My issue is that Google is not following the links from site A to site B since the links no longer exist in site A. I went ahead and added the old links from site A into site A's sitemap. Unfortunately Google is returning this message inside webmaster tools: When we tested a sample of URLs from your Sitemap, we found that some URLs redirect to other locations. We recommend that your Sitemap contain URLs that point to the final destination (the redirect target) instead of redirecting to another URL. However I do not understand how adding the redirected links from site B to the sitemap in site A will remove the old links. Obviously Google can see the 301 redirect and the canonical tag but this isn't defined in the sitemap as a direct correlation between site A and B. Am I missing something here?
Technical SEO | | jmsobe0