I need an XML sitemap expert for 5 minutes!
-
Hi all!
I'm hoping that someone with a lot of experience with XML sitemaps can help me out here...
When submitting my sitemap in Google Webmaster Tools, these are the results:
2,414,714 Submitted
34,721 Indexed
And there are also tonnes of warnings.
Would anyone be able to take a quick look at these sitemaps and advise me on what's going wrong? They do not load without the www; I'm not sure whether that is an issue.
http://www.eumom.ie/sitemap.xml
http://www.eumom.ie/sitemap.xml.gz
Thanks everyone in advance!!
Gavin
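For anyone wanting to dig into numbers like the ones above, here is a rough Python sketch that counts the `<loc>` entries in one sitemap file and flags duplicates. It only parses XML text you have already downloaded. The names `summarize` and `EXAMPLE_SITEMAP` and the example.com URLs are made up for illustration, and the 50,000 figure is the per-file URL limit from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol


def summarize(sitemap_xml):
    """Count <loc> entries in one sitemap (or sitemap index) and list duplicates."""
    root = ET.fromstring(sitemap_xml)
    # "urlset" for a normal sitemap, "sitemapindex" for an index of child sitemaps.
    kind = root.tag.replace(SITEMAP_NS, "")
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    counts = Counter(locs)
    return {
        "kind": kind,
        "total": len(locs),
        "unique": len(counts),
        "duplicates": sorted(url for url, n in counts.items() if n > 1),
        "over_limit": len(locs) > MAX_URLS_PER_FILE,
    }


# Hypothetical sitemap content, for illustration only.
EXAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/a/</loc></url>
  <url><loc>http://www.example.com/b/</loc></url>
  <url><loc>http://www.example.com/a/</loc></url>
</urlset>
"""

if __name__ == "__main__":
    print(summarize(EXAMPLE_SITEMAP))
```

For a sitemap index, the entries counted are the child sitemap files, so each child would need to be downloaded and summarized in turn to get a per-file breakdown.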
-
A few rules about sitemaps:
-
Only include pages that you want crawled and indexed.
-
They should not contain URLs that return a 404 or are blocked by robots.txt.
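If it helps, the robots.txt rule can be checked mechanically. This is only a sketch using Python's standard `urllib.robotparser`; the function name `blocked_urls`, the sample rules and the example.com URLs are invented for illustration, and the standard-library parser may not match Googlebot's handling of every directive.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules and URLs, for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /forums/private/
"""

EXAMPLE_URLS = [
    "http://www.example.com/forums/topic/hello/",
    "http://www.example.com/forums/private/secret/",
]

if __name__ == "__main__":
    for url in blocked_urls(EXAMPLE_ROBOTS, EXAMPLE_URLS):
        print("blocked by robots.txt:", url)
```

Any URL this reports is a candidate for removal from the sitemap, or for a change to robots.txt if the page is actually meant to be indexed.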
My guess is that there are too many URLs in the sitemaps, since I doubt the website has over 2 million actual "real" pages.
Also, I randomly clicked on a URL in one of the sitemaps and it 404'd:
http://www.eumom.ie/forums/topic/oakhill-school-leopardstown-/
This is probably causing a lot of the errors you see. It's honestly not a 5 minute fix - but if it were my site, I would be using the Yoast SEO plugin and the sitemap feature within Yoast. It makes it very easy to include / exclude certain pages, and it updates automatically.
I think there must be a way to tell your plugin what to include / exclude from the sitemap but I don't have as much experience with it.
But generally - only include pages you want crawled and indexed. Don't include pages that 404.
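To find the 404s without clicking through by hand, something along these lines could work. It is a sketch, not a tested crawler: `extract_locs`, `http_status` and `broken_urls` are names made up here, the sample sitemap is hypothetical, and it assumes the server answers HEAD requests sensibly (some do not).

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(sitemap_xml: str) -> List[str]:
    """Return the text of every <loc> element in a sitemap or sitemap index."""
    root = ET.fromstring(sitemap_xml)
    return [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]


def http_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the final HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx and 5xx responses are raised as exceptions


def broken_urls(
    sitemap_xml: str, get_status: Callable[[str], int] = http_status
) -> Dict[str, int]:
    """Map each sitemap URL whose status is 400 or above to that status."""
    statuses = {url: get_status(url) for url in extract_locs(sitemap_xml)}
    return {url: code for url, code in statuses.items() if code >= 400}


# Hypothetical sitemap content, for illustration only.
EXAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/ok/</loc></url>
  <url><loc>http://www.example.com/gone/</loc></url>
</urlset>
"""
```

Calling `broken_urls(xml_text)` with the default `http_status` makes one request per URL, so on a sitemap with millions of entries it would make sense to check a random sample first. The `get_status` parameter is there so a different fetcher, or a stub for offline testing, can be swapped in.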
-
-
Hi all,
Many thanks for your input so far, much appreciated!
The sitemaps that you are seeing actually were generated using that plugin you mentioned. Formatting-wise, do you see anything wrong with the sitemaps?
Thanks!!
Gavin
-
I couldn't agree more altecdesign!
http://wordpress.org/plugins/google-sitemap-generator/ all the way!
-
That XML sitemap you linked to is formatted in an odd way. I noticed the site you are generating the XML sitemap for is built on WordPress. There is a really solid sitemap plugin you could use to generate your XML and submit it to Google instead of the plugin you are currently using: http://wordpress.org/plugins/google-sitemap-generator/
I've used that plugin numerous times and submitted sitemaps to Google with no errors. Hopefully that helps you out.
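On the formatting question, a few basic checks can be scripted whichever plugin generates the file. The sketch below is an assumption about what is worth checking, not an official validator: it tests that the XML parses, that the root is `urlset` or `sitemapindex` in the sitemaps.org namespace, and that every entry has an absolute `<loc>` on the same host as the sitemap itself (which would also surface a www versus non-www mismatch). `format_problems` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
ENTRY_TAG = {"urlset": "url", "sitemapindex": "sitemap"}


def format_problems(sitemap_xml, sitemap_url):
    """Return a list of human-readable problems found in one sitemap file."""
    try:
        root = ET.fromstring(sitemap_xml)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    if not root.tag.startswith(SITEMAP_NS):
        return [f"unexpected root element or namespace: {root.tag}"]
    kind = root.tag[len(SITEMAP_NS):]
    if kind not in ENTRY_TAG:
        return [f"unexpected root element: {kind}"]

    problems = []
    expected_host = urlsplit(sitemap_url).netloc.lower()
    entries = root.findall(SITEMAP_NS + ENTRY_TAG[kind])
    for index, entry in enumerate(entries, start=1):
        loc = entry.findtext(SITEMAP_NS + "loc", default="").strip()
        parts = urlsplit(loc)
        if not loc:
            problems.append(f"entry {index}: missing <loc>")
        elif parts.scheme not in ("http", "https"):
            problems.append(f"entry {index}: <loc> is not an absolute URL: {loc}")
        elif parts.netloc.lower() != expected_host:
            problems.append(f"entry {index}: host differs from the sitemap's: {loc}")
    return problems


# Hypothetical sitemap content, for illustration only.
EXAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/a/</loc></url>
  <url><loc>http://example.com/b/</loc></url>
  <url><loc>/c/</loc></url>
  <url></url>
</urlset>
"""

if __name__ == "__main__":
    for problem in format_problems(EXAMPLE_SITEMAP, "http://www.example.com/sitemap.xml"):
        print(problem)
```

An empty result only means these particular checks passed; it says nothing about whether the listed pages deserve to be indexed.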