I need an XML sitemap expert for 5 minutes!
-
Hi all!
I'm hoping that someone with a lot of experience with XML sitemaps can help me out here...
When submitting my sitemap in Google Webmaster Tools, these are the results:
2,414,714 Submitted
34,721 IndexedAnd there's also tonnes of warnings.
Would anyone be able to take a quick look at these sitemaps to perhaps advise me on what's going wrong there? These do not load without the www, not sure if this is an issue?
http://www.eumom.ie/sitemap.xml
http://www.eumom.ie/sitemap.xml.gzThanks everyone in advance!!
Gavin
-
Few rules about sitemaps;
-
You should only include in them pages you also want crawled and indexed
-
They should not contain URLs with 404s or blocked by robots.txt
My guess is there are too many URLs in the sitemaps, since I'd guess the website is not over 2 million actual "real" pages,
Also, I randomly clicked on a URL in one of the sitemaps and it 404'd;
http://www.eumom.ie/forums/topic/oakhill-school-leopardstown-/
This is probably causing a lot of the errors you see. It's honestly not a 5 minute fix - but if it were my site, I would be using the Yoast SEO plugin and using the sitemap feature within Yoast. It makes it very easy to include / exclude certain pages and updated automatically etc.
I think there must be a way to tell your plugin what to include / exclude from the sitemap but I don't have as much experience with it.
But generally - only include pages you want crawled and indexed. Don't include pages that 404.
-
-
Hi all,
Many thanks for your input so far, much appreciated!
The sitemaps that you are seeing actually were generated using that plugin you mentioned. Formatting-wise, do you see anything wrong with the sitemaps?
Thanks!!
Gavin -
I couldn't agree more altecdesign!
http://wordpress.org/plugins/google-sitemap-generator/ all the way!
-
That XML sitemap you linked too is formatted in an odd way. I noticed the site you are generating the xml sitemap for is based in wordpress. There is a really solid sitemap plugin you could use to generate your XML and submit to google instead of the current plugin you are using: http://wordpress.org/plugins/google-sitemap-generator/
I've used that plugnin numerous times and submitted sitemaps to google with no errors. Hopefully that helps you out.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I submitted Sitemaps from AIO SEO to google search console, if I now delete the AIO plugin, do my sitemaps become invalid?
I use Yoast as SEO for my new Wordpress website https://www.satisfiedshoes.com/, however I couldn't get the sitemaps with Yoast as it was giving me error 404, and regardless of what I tried, it wasn't working. So I then got the All In One SEO while still having Yoast installed, I easily got the AIO sitemaps and then submitted them successfully to the Google search console. My question is that now I got the sitemaps on Google, since I'd rather use Yoast, If I want to delete AIO, will the sidemaps given to Google become invalid? There is no point keeping both SEO plugins active right? Thank You
Technical SEO | | iamzain160 -
Should all pagination pages be included in sitemaps
How important is it for a sitemap to include all individual urls for the paginated content. Assuming the rel next and prev tags are set up would it be ok to just have the page 1 in the sitemap ?
Technical SEO | | Saijo.George0 -
Sitemap all of a sudden only indexing 2 out of 5000+ pages
Any ideas why this happened? Our sitemap looks the same. Also, our total number of pages indexed has not decreased, just the sitemap. Could this eventually affect my pages being in the index?
Technical SEO | | rock220 -
Host sitemaps on S3?
Hey guys, I run a dynamic web service and I will start building static sitemaps for it pretty soon. The fact that my app lives in a multitude of servers doesn't make it easy to distribute frequently updated static files throughout the servers. My idea was to host the files in AWS S3 and point my robots.txt sitemap directive there. I'll use a sitemap index so, every other sitemap will be hosted on S3 as well. I could dynamically mirror the content from the files in S3 through my app, but that would be a little more resource intensive than just serving the static files from a common place. Any ideas? Thanks!
Technical SEO | | tanlup0 -
RegEx help needed for robots.txt potential conflict
I've created a robots.txt file for a new Magento install and used an existing site-map that was on the Magento help forums but the trouble is I can't decipher something. It seems that I am allowing and disallowing access to the same expression for pagination. My robots.txt file (and a lot of other Magento site-maps it seems) includes both: Allow: /*?p= and Disallow: /?p=& I've searched for help on RegEx and I can't see what "&" does but it seems to me that I'm allowing crawler access to all pagination URLs, but then possibly disallowing access to all pagination URLs that include anything other than just the page number? I've looked at several resources and there is practically no reference to what "&" does... Can anyone shed any light on this, to ensure I am allowing suitable access to a shop? Thanks in advance for any assistance
Technical SEO | | MSTJames0 -
Google Webmaster Sitemap *pending*
Hey guys, I've noticed that my sitemap has been "pending" for quite some time in Google Webmaster tools. This leads me to believe that Google is not indexing my site. Could someone help me and point me to what I'm doing wrong? My site is The Tech Block
Technical SEO | | ttb0 -
Summarize your question.Sitemap blocking or not blocking that is the question?
Hi from wet & overcast wetherby UK 😞 Ones question is this... " Is the sitemap plus boxes blocking bots ie they cant pass on this page http://www.langleys.com/Site-Map.aspx " Its just the + boxes that concern me, i remeber reading somewherte javascript nav can be toxic. Is there a way to test javascript nav set ups and see if they block bots or not? Thanks in advance 🙂
Technical SEO | | Nightwing0 -
Sitemap question
My sitemap includes www.example.com and www.example.com/index.html, they are both the same page, will this have any negative effects, or can I remove the www.example.com/index.html?
Technical SEO | | Aftermath_SEO0