Xml Sitemap
-
Hi mozzers,
I am about to submit a sitemap for one of my clients via webmaster tools. The issue is that I have way too many urls that I don't want them to be indexed by Google such as testing pages, auto generated pages...
Is there way to remove certain URL from the XML sitemap or is this impossible?
If impossible, is the only way to control these urls is to "No index" all these pages that i don't want the search engine to see?
Thanks Mozzers,
-
That is correct, you just submit as you would normally. There are two ways to submit the file:
-
Via the webmaster tools interface. Have you created your webmaster tools account yet? Optimization -> Sitemaps -> Add Sitemap
-
By referencing it in your robots.txt. Just add the following on a new line: Sitemap: http://www.yourdomain.com/sitemap.xml
-
-
Hi Greg,
Since I am not an expert into sitemaps yet, once i finish removing URLs I don t want, should I just save the text editor document and then how do I submit this doc into webmaster tool?
Is it just "add sitemap" and put the name of the doc "www.example.com/sitemap.xml"? or is there another manipulation I should be aware of?
Thank you,
-
Great question! You can manually remove all pages from the sitemap, by opening it up in a text editor of your choice, removing offending entries, and saving the file. Make sure you use a simple text editor like notepad on windows or textwrangler on mac.
It is best to do this before your submission, rather then add pages which you know you don't want indexed and then have to ask google to remove them.
-
You should absolutely be able to exercise complete control over what URLs are contained in your site map. It is dependent upon your sitemap software. There are hundreds of software solutions available.
Regardless of the site map, you should definitely no index the pages you do not wish to appear in search results. A robots.txt entry is definitely not the best solution.
-
Like you mentioned, you could either use robots.txt, or submit a URL removal request through webmaster tools. I've used both methods.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to implement multilingual sitemaps when not all pages have translations
We are trying to implement sitemaps for a site that has localized content for a few countries. We’ve concluded that we should utilize sitemapindex and then create one sitemap per country. Now to the problems we’re facing. Not all urls on the site have translations, how should these urls be presented in the sitemap? Should they be stated simply like so? <url><loc>https://example.com/sdfsdf</loc></url> So urls with the hreflang attribute and without are mixed in the same sitemap, or is that a problem? (I have added empty rows to make it easier to read) <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" <br="">xmlns:xhtml="http://www.w3.org/1999/xhtml"></urlset> <url><loc>http://www.example.com/english/page.html</loc>
Technical SEO | | Telsenome
<xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/deutsch/page.html"><xhtml:link rel="alternate" hreflang="de-ch" href="http://www.example.com/schweiz-deutsch/page.html"><xhtml:link rel="alternate" hreflang="en" href="http: www.example.com="" english="" page.html"=""></xhtml:link rel="alternate" hreflang="en" href="http:></xhtml:link></xhtml:link></url> <url><loc>http://www.example.com/page-with-no-translations</loc></url> <url><loc>http://www.example.com/page-with-no-translations2</loc></url> <url><loc>http://www.example.com/page-with-no-translations3</loc></url> <url><loc>http://www.example.com/deutsch/page.html</loc>
<xhtml:link rel="alternate" hreflang="de" href="http://www.example.com/deutsch/page.html"><xhtml:link rel="alternate" hreflang="de-ch" href="http://www.example.com/schweiz-deutsch/page.html"><xhtml:link rel="alternate" hreflang="en" href="http://www.example.com/english/page.html"></xhtml:link rel="alternate"></xhtml:link></xhtml:link></url>0 -
I want to load my ecommerce site xml via CDN
Hello Experts. My ecommerce site - abcd.com
Technical SEO | | micey123
My ecommrece site sitemap abcd.com/sitemap.xml
My subdomain - xyz.abcd.com ( this is blank page but status is 200 which runs from cdn) My ecommerce site sitemap abcd.com/sitemap.xml contains only 1 link of subdomain sitemap- xyz.abcd.com/sitemap.xml
And this sitemap- xyz.abcd.com/sitemap.xml contains all category and product links of abcd.com So my query is :- Above configuration is okay? In search console I will add new property - xyz.abcd.com. and add sitemap xyz.abcd.com/sitemap.xml So Google will able to give errors for my website abcd.com Purpose - I want to run my xml sitemap from cdn that's why i have created subdomain like xyz.abcd.com Hope you understood my query. Thanks!0 -
Google Webmaster Tools is saying "Sitemap contains urls which are blocked by robots.txt" after Https move...
Hi Everyone, I really don't see anything wrong with our robots.txt file after our https move that just happened, but Google says all URLs are blocked. The only change I know we need to make is changing the sitemap url to https. Anything you all see wrong with this robots.txt file? robots.txt This file is to prevent the crawling and indexing of certain parts of your site by web crawlers and spiders run by sites like Yahoo! and Google. By telling these "robots" where not to go on your site, you save bandwidth and server resources. This file will be ignored unless it is at the root of your host: Used: http://example.com/robots.txt Ignored: http://example.com/site/robots.txt For more information about the robots.txt standard, see: http://www.robotstxt.org/wc/robots.html For syntax checking, see: http://www.sxw.org.uk/computing/robots/check.html Website Sitemap Sitemap: http://www.bestpricenutrition.com/sitemap.xml Crawlers Setup User-agent: * Allowable Index Allow: /*?p=
Technical SEO | | vetofunk
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/ Directories Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /includes/
Disallow: /lib/
Disallow: /magento/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /stats/
Disallow: /var/ Paths (clean URLs) Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /aitmanufacturers/index/view/
Disallow: /blog/tag/
Disallow: /advancedreviews/abuse/reportajax/
Disallow: /advancedreviews/ajaxproduct/
Disallow: /advancedreviews/proscons/checkbyproscons/
Disallow: /catalog/product/gallery/
Disallow: /productquestions/index/ajaxform/ Files Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt Paths (no clean URLs) Disallow: /.php$
Disallow: /?SID=
disallow: /?cat=
disallow: /?price=
disallow: /?flavor=
disallow: /?dir=
disallow: /?mode=
disallow: /?list=
disallow: /?limit=5
disallow: /?limit=10
disallow: /?limit=15
disallow: /?limit=20
disallow: /*?limit=250 -
Xml sitemaps giving 404 errors
We have recently made updates to our xml sitemap and have split them into child sitemaps. Once these were submitted to search console, we received notification that the all of the child sitemaps except 1 produced 404 errors. However, when we view the xml sitemaps in a browser, there are no errors. I have also attempted crawling the child sitemaps with Screaming Frog and received 404 responses there as well. My developer cannot figure out what is causing the errors and I'm hoping someone here can assist. Here is one of the child sitemaps: http://www.sermonspice.com/sitemap-countdowns_paged_1.xml
Technical SEO | | ang0 -
301 Redirects Relating to Your XML Sitemap
Lets say you've got a website and it had quite a few pages that for lack of a better term were like an infomercial, 6-8 pages of slightly different topics all essentially saying the same thing. You could all but call it spam. www.site.com/page-1 www.site.com/page-2 www.site.com/page-3 www.site.com/page-4 www.site.com/page-5 www.site.com/page-6 Now you decided to consolidate all of that information into one well written page, and while the previous pages may have been a bit spammy they did indeed have SOME juice to pass through. Your new page is: www.site.com/not-spammy-page You then 301 redirect the previous 'spammy' pages to the new page. Now the question, do I immediately re-submit an updated xml sitemap to Google, which would NOT contain all of the old URL's, thus making me assume Google would miss the 301 redirect/seo juice. Or do I wait a week or two, allow Google to re-crawl the site and see the existing 301's and once they've taken notice of the changes submit an updated sitemap? Probably a stupid question I understand, but I want to ensure I'm following the best practices given the situation, thanks guys and girls!
Technical SEO | | Emory_Peterson0 -
Sitemap indexation
3 days ago I sent in a new sitemap for a new platform. Its 23.412 pages but until now its only 4 pages (!!) that are indexed according to the Webmaster Tools. Why so few? Our stage-enviroment got indexed (more than 50K pages) in a few days by a mistake.
Technical SEO | | Morten_Hjort0 -
How could i create sitemap with 1000 page and should i update sitemap frequently?
My website have over 1000 pages but the sitemap creator tools i knew only create maximum 500 pages, how could i create sitemap with full of my webpage?
Technical SEO | | magician0 -
What to include on a sitemap for a huge site?
I have a very large site and I'm not sure what all to include on the sitemap page. We have categories such as items1, items2 and in the items1 category are 100 vendors with their individual vendor pages. Should I link all 100 vendor pages on the sitemap or just the main items1 category?
Technical SEO | | CFSSEO0