XML Sitemap Issue or not?
-
Hi Everyone,
I submitted a sitemap within the google webmaster tools and I had a warning message of 38 issues.
Issue: Url blocked by robots.txt.
Description: Sitemap contains urls which are blocked by robots.txt.
Example: the ones that were given were urls that we don't want them to be indexed: Sitemap: www.example.org/author.xml
Value: http://www.example.org/author/admin/
My issue here is that the number of URL indexed is pretty low and I know for a fact that Robot.txt aren't good especially if they block URL that needs to be indexed. Apparently the URLs that are blocked seem to be URLs that we don't to be indexed but it doesn't display all URLs that are blocked.
Do you think i m having a major problem or everything is fine?What should I do? How can I fix it?
FYI: Wordpress is what we use for our website
Thanks
-
Hi Dan
Thanks for your answer. Would you really recommend using the plugin instead of just uploading the xml sitemap directly to the website's root directory? If yes why?
Thanks
-
Lisa
I would honestly switch to the Yoast SEO plugin. It handles the SEO (and robots.txt) a lot better, as well as the XML sitemaps all within that one plugin.
I'd check out my guide for setting up WordPress for SEO on the moz blog.
Most WP robots.txt files will look like this;
User-agent: * Disallow: /wp-admin/ Disallow: /wp-includes/
And that's it.
You could always just try changing yours to the above setting first,
before switching to Yoast SEO - I bet that would clear up
the sitemap issues.
Hope that helps!
-Dan ```
-
Lisa, try checking manually which URL is not getting indexed in Google. Make sure you do not have any no follows on those pages. If all the pages are connected / linked together, then Google will crawl your whole site eventually, just a matter of time.
-
Hi
when generating sitemap there are 46 URLs detected by xml-sitemaps.com but when adding the sitemap to WMT only 12 get submitted and 5 are indexed which is really kind of worrying me. This might be because of the xml sitemap plugin that I installed. May be something is wrong with my settings(doc attached 1&2)
I am kind of lost especially that SEOmoz hasn't detected any URLs blocked by Robot.txt
It would be great if you could tell me what should I do next ?
Thanks
-
The first question i would ask is how big is the difference. If the difference is a large in the # of pages on your site and the ones indexed by Google, then you have an issue. The blocked pages might be the ones linking to the ones that have not been indexed and causing issues. Try removing the no follow on those pages and then resubmit your sitemap and see if that fixes the issue. Also double check your site map to make sure you have correctly added all the pages in it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to implement multilingual sitemaps when not all pages have translations
We are trying to implement sitemaps for a site that has localized content for a few countries. We’ve concluded that we should utilize sitemapindex and then create one sitemap per country. Now to the problems we’re facing. Not all urls on the site have translations, how should these urls be presented in the sitemap? Should they be stated simply like so? <url><loc>https://example.com/sdfsdf</loc></url> So urls with the hreflang attribute and without are mixed in the same sitemap, or is that a problem? (I have added empty rows to make it easier to read) <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" <br="">xmlns:xhtml="http://www.w3.org/1999/xhtml"></urlset> <url><loc>http://www.example.com/english/page.html</loc>
Technical SEO | | Telsenome
<xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/deutsch/page.html"><xhtml:link rel="alternate" hreflang="de-ch" href="http://www.example.com/schweiz-deutsch/page.html"><xhtml:link rel="alternate" hreflang="en" href="http: www.example.com="" english="" page.html"=""></xhtml:link rel="alternate" hreflang="en" href="http:></xhtml:link></xhtml:link></url> <url><loc>http://www.example.com/page-with-no-translations</loc></url> <url><loc>http://www.example.com/page-with-no-translations2</loc></url> <url><loc>http://www.example.com/page-with-no-translations3</loc></url> <url><loc>http://www.example.com/deutsch/page.html</loc>
<xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/deutsch/page.html"><xhtml:link rel="alternate" hreflang="de-ch" href="http://www.example.com/schweiz-deutsch/page.html"><xhtml:link rel="alternate" hreflang="en" href="http://www.example.com/english/page.html"></xhtml:link rel="alternate"></xhtml:link></xhtml:link></url>0 -
Sitemap and canonical
In my sitemap I have two entries for my page ContactUs.asp ContactUs.asp?Lng=E ContactUs.asp?Lng=F What should I use in my page ContactUS.asp ? Is this correct?
Technical SEO | | CustomPuck0 -
Image Indexing Issue by Google
Hello All,My URL is: www.thesalebox.comI have Submitted my image Sitemap in google webmaster tool on 10th Oct 2013,Still google could not indexing any of my web images,Please refer my sitemap - www.thesalebox.com/AppliancesHomeEntertainment.xml and www.thesalebox.com/Hardware.xmland my webmaster status and image indexing status are below,
Technical SEO | | CommercePunditCan you please help me, why my images are not indexing in google yet? is there any issue? please give me suggestions?Thanks!
0 -
Is there a way for me to automatically download a website's sitemap.xml every month?
From now on we want to store all our sitemap.xml over the next years. Its a nice archive to have that allows us to analyse how many pages we have on our website and which ones were removed/redirected. Any suggestions? Thanks
Technical SEO | | DeptAgency0 -
Http & https canonicalization issues
Howdyho I'm SEOing a daily deals site that mostly runs on https Versions. (only the home page is on http). I'm wondering what to do for canonicalization. IMO it would be easiest to run all pages on https. But the scarce resources I find are not so clear. For instance, this Youmoz blog post claims that https is only for humans, not for bots! That doesn't really apply anymore, right?
Technical SEO | | zeepartner0 -
Duplicate Page Issue
Dear All, I am facing stupid duplicate page issue, My whole site is in dynamic script and all the URLs were in dynamic, So i 've asked my programmer make the URLs user friendly using URL Rewrite, but he converted aspx pages to htm. And the whole mess begun. Now we have 3 different URLs for single page. Such as: http://www.site.com/CityTour.aspx?nodeid=4&type=4&id=47&order=0&pagesize=4&pagenum=4&val=Multi-Day+City+Tours http://www.tsite.com/CityTour.aspx?nodeid=4&type=4&id=47&order=0&pagesize=4&pagenum=4&val=multi-day-city-tours http://www.site.com/city-tour/multi-day-city-tours/page4-0.htm I think my programmer messed up the URL Rewrite in ASP.net(Nginx) or even didn't use it. So how do i overcome this problem? Should i add canonical tag in both dynamic URLs with pointing to pag4-0.htm. Will it help? Thanks!
Technical SEO | | DigitalJungle0 -
Mobile sitemaps - how much value?
Hi, We have a main www website with a standard sitemap. We also have a m. site for mobile content (the mobile site only contains our top pages and doesn't include the entire site). If a mobile client accesses one of our www pages we redirect to the m. page. If we don't have a m. version we keep them on the www site. Since we already have a www sitemap, is there much value in creating a mobile site map? The mobile site (although missing all pages) is pretty robust and contains most content people are looking for. Will the mobile sitemap help for Mobile searches (more so than our standard sitemap)? I'm also planning on rel canonical the m. pages to the www. pages (per other suggestios on SEOMoz) Thanks
Technical SEO | | NicB10 -
Crawl issues/ .htacess issues
My site is getting crawl errors inside of google webmaster tools. Google believe a lot of my links point to index.html when they really do not. That is not the problem though, its that google can't give credit for those links to any of my pages. I know I need to create a rule in the .htacess but the last time I did it I got an error. I need some assistance on how to go about doing this, I really don't want to lose the weight of my links. Thanks
Technical SEO | | automart0