Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content (many posts will still be viewable), we have locked both new posts and new replies.
Should I include URLs that are 301'd or only include 200 status URLs in my sitemap.xml?
-
I'm not sure if I should be including old URLs (content) that are being redirected (301) to new URLs (content) in my sitemap.xml. Does anyone know if it is best to include or leave out 301ed URLs in a xml sitemap?
-
I agree with Logan.
If the ratio of redirected or broken URLs in your XML sitemap is too high, there is a chance that Google won't crawl it as frequently as it should, because the search robot doesn't want to waste crawl resources on those URLs.
The only time redirected URLs are useful in an XML sitemap is when you're migrating domains or making information architecture (IA) changes and you want to make sure the search engine discovers the 301 redirects as quickly as possible.
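To gauge that ratio in practice, a short script can fetch the sitemap and check each URL's status. A minimal Python sketch, assuming the `requests` library is available and using a placeholder sitemap URL:

```python
# Minimal sketch: report the ratio of non-200 URLs in a sitemap.
# The sitemap URL is a placeholder; requires the `requests` library.
import xml.etree.ElementTree as ET
import requests

SITEMAP_URL = "https://www.example.com/sitemap.xml"  # hypothetical
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(requests.get(SITEMAP_URL, timeout=10).content)
urls = [loc.text.strip() for loc in root.findall(".//sm:loc", NS)]

bad = []
for url in urls:
    # HEAD without following redirects, so 301s show up as 301s
    status = requests.head(url, allow_redirects=False, timeout=10).status_code
    if status != 200:
        bad.append((url, status))

print(f"{len(bad)} of {len(urls)} sitemap URLs are not 200s "
      f"({len(bad) / max(len(urls), 1):.1%})")
for url, status in bad:
    print(status, url)
```

Using `allow_redirects=False` matters here: otherwise a 301'd URL would report the status of its final destination rather than the redirect itself.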
-
Hi,
Your XML sitemap should contain only 'clean' URLs, by which I mean URLs that return a 200 status.
You should not have any redirects or error pages in it. You should also make sure every URL uses your preferred format, i.e. www vs. non-www and https vs. http.
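A minimal sketch of that normalization step, using only Python's standard library and placeholder host names:

```python
# Sketch: force every URL into the preferred https + www form before it
# is written into the sitemap. Host names here are placeholders.
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"  # hypothetical preferred hostname

def normalize(url: str) -> str:
    parts = urlsplit(url)
    host = parts.netloc
    if host in ("example.com", "www.example.com"):
        host = PREFERRED_HOST
    return urlunsplit(("https", host, parts.path, parts.query, parts.fragment))

print(normalize("http://example.com/page?id=1"))
# -> https://www.example.com/page?id=1
```
-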
Related Questions
-
Can't generate a sitemap with all my pages
I am trying to generate a sitemap for my site nationalcurrencyvalues.com, but none of the tools I have tried captures all of my 70,000 HTML pages. I have found that the one at check-domains.com crawls all my pages, but when it writes the XML file most of them are gone, seemingly at random. I have used this same site before and it worked without a problem. Can anyone help me understand why this is, or point me to a utility that will map all of the pages? Kindly, Greg
Intermediate & Advanced SEO | Banknotes
-
Duplicate Content through 'Gclid'
Hello, We've had the known problem of duplicate content through the gclid parameter caused by Google AdWords. As per Google's recommendation, we added the canonical tag to every page on our site, so when the bot came to each page it would go 'Ah-ha, this is the original page'. We also added the parameter to the URL parameters in Google Webmaster Tools. However, it now seems as though a canonical is automatically being given to these newly created gclid pages; see below: https://www.google.com.au/search?espv=2&q=site%3Awww.mypetwarehouse.com.au+inurl%3Agclid&oq=site%3A&gs_l=serp.3.0.35i39l2j0i67l4j0i10j0i67j0j0i131.58677.61871.0.63823.11.8.3.0.0.0.208.930.0j3j2.5.0....0...1c.1.64.serp..8.3.419.nUJod6dYZmI Therefore these new pages are now being indexed, causing duplicate content. Does anyone have any idea what to do in this situation? Thanks, Stephen.
Intermediate & Advanced SEO | MyPetWarehouse
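For reference, the canonical for a gclid-tagged URL is normally just the same URL with the tracking parameter stripped. A sketch of deriving that value (the one that belongs in the rel="canonical" tag), with an illustrative path:

```python
# Sketch: derive a canonical URL by dropping the gclid tracking parameter.
# The example path is illustrative.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def canonical_url(url: str) -> str:
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() != "gclid"]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

print(canonical_url("https://www.mypetwarehouse.com.au/page?gclid=XYZ"))
# -> https://www.mypetwarehouse.com.au/page
```
-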
Partial Match or RegEx in Search Console's URL Parameters Tool?
So I currently have approximately 1,000 of these URLs indexed, when I only want roughly 100 of them. Let's say the URL is www.example.com/page.php?par1=ABC123=&par2=DEF456=&par3=GHI789= All the indexed URLs follow that same kinda format, but I only want to index the URLs that have a par1 of ABC (but that could be ABC123 or ABC456 or whatever). Using the URL Parameters tool in Search Console, I can ask Googlebot to only crawl URLs with a specific value. But is there any way to get a partial match, using regex maybe? Am I wasting my time with Search Console, and should I just disallow any page.php without par1=ABC in robots.txt?
Intermediate & Advanced SEO | Ria_
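For reference, robots.txt supports only simple `*` wildcards and `$` anchors, not full regex, but the partial match asked about is easy to express and test offline. A sketch using the parameter names from the question and illustrative URLs:

```python
# Sketch: classify URLs by whether par1 starts with "ABC".
# Parameter names follow the question; the URL list is illustrative.
import re

KEEP = re.compile(r"[?&]par1=ABC[^&]*")

urls = [
    "https://www.example.com/page.php?par1=ABC123=&par2=DEF456=",
    "https://www.example.com/page.php?par1=XYZ123=&par2=DEF456=",
]

for url in urls:
    print("keep " if KEEP.search(url) else "block", url)
```

In robots.txt itself, the closest approximation would likely be a broad `Disallow: /page.php` paired with a more specific `Allow: /page.php?par1=ABC` (Google honours the longest matching rule), though that should be verified with a robots.txt tester before relying on it.
-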
Changing URL structure of date-structured blog with 301 redirects
Howdy Moz, We've recently bought a new domain and we're looking to change over to it. We're also wanting to change our permalink structure. Right now, it's a WordPress site that uses the post date in the URL. As an example: http://blog.mydomain.com/2015/01/09/my-blog-post/ We'd like to use mod_rewrite to change this using regular expressions, to: http://newdomain.com/blog/my-blog-post/ Would this be an appropriate solution? RedirectMatch 301 /.*/.*/.*/(.*) /blog/$1
Intermediate & Advanced SEO | IanOBrien
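For reference, a pattern anchored to the date format is safer than bare `.*` groups, and can be sanity-checked offline before deploying. A sketch (the tightened pattern is a suggestion, not the one from the question):

```python
# Sketch: test a date-based permalink rewrite before deploying it.
# The pattern, anchored to 4/2/2-digit dates, is a suggested tightening.
import re

PATTERN = re.compile(r"^/\d{4}/\d{2}/\d{2}/(.*)$")

old_path = "/2015/01/09/my-blog-post/"
match = PATTERN.match(old_path)
if match:
    print("/blog/" + match.group(1))  # -> /blog/my-blog-post/
```

In Apache terms, and since the domain is changing too, that would correspond to something like `RedirectMatch 301 ^/\d{4}/\d{2}/\d{2}/(.*)$ http://newdomain.com/blog/$1` on the old blog's host, though the exact server setup isn't known here.
-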
Does anyone know of any tools that can help split up xml sitemap to make it more efficient and better for seo?
Hello All, We want to split up our sitemap. Currently it's almost 10K pages in one XML sitemap, but we want to break it into smaller chunks, splitting it by category or location or both. Ideally 100 URLs per sitemap, which is what I read is the best number to help improve indexation and SEO ranking. Any thoughts on this? Does anyone know of any good tools out there which can assist us in doing this? Also, another question: should we put all of our products (1,250) in one sitemap, or should this also be split up into, say, products per category etc.? Thanks, Pete
Intermediate & Advanced SEO | PeteC12
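For reference, splitting a flat URL list into fixed-size sitemaps plus a sitemap index is straightforward to script. A sketch with a placeholder domain, stand-in URLs, and the 100-URL chunk size from the question:

```python
# Sketch: split a URL list into sitemaps of 100 URLs each, plus an index.
# Domain, file names, and URLs are placeholders; chunk size follows the question.
from datetime import date

BASE = "https://www.example.com"
CHUNK = 100

def write_sitemap(filename, urls):
    with open(filename, "w") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for url in urls:
            f.write(f"  <url><loc>{url}</loc></url>\n")
        f.write("</urlset>\n")

urls = [f"{BASE}/product-{i}" for i in range(1, 1251)]  # stand-in for real URLs

names = []
for i in range(0, len(urls), CHUNK):
    name = f"sitemap-{i // CHUNK + 1}.xml"
    write_sitemap(name, urls[i:i + CHUNK])
    names.append(name)

# Sitemap index pointing at the chunk files
with open("sitemap-index.xml", "w") as f:
    f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    f.write('<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
    for name in names:
        f.write(f"  <sitemap><loc>{BASE}/{name}</loc>"
                f"<lastmod>{date.today().isoformat()}</lastmod></sitemap>\n")
    f.write("</sitemapindex>\n")
```

Note that the sitemap protocol's only hard limits are 50,000 URLs and 50 MB per file; smaller chunks mainly make indexation reporting more granular per category or location.
-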
For URLs that require login, should our redirect be 301 or 302?
We have a login-required section of our website that is being crawled and reported as potential issues in Webmaster Tools. I'm not sure what the best solution to this is: is it to make URLs requiring a login noindex/nocrawl? Right now, we have them 302 redirecting to the login page; since it's a temporary redirect, it seems like it isn't the right solution. Is a 301 better?
Intermediate & Advanced SEO | alecfwilson
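For reference, one common pattern keeps the 302 (the content exists, access is just gated) and adds an `X-Robots-Tag: noindex` header so the gated URLs stay out of the index. A hedged sketch using Flask; the routes and app structure are illustrative, not the asker's actual stack:

```python
# Sketch: 302 to login plus X-Robots-Tag so gated URLs stay unindexed.
# Routes and app structure are illustrative.
from flask import Flask, redirect

app = Flask(__name__)

@app.route("/account/<path:page>")
def account(page):
    resp = redirect("/login", code=302)  # temporary: the page exists behind login
    resp.headers["X-Robots-Tag"] = "noindex"  # keep gated URLs out of the index
    return resp

if __name__ == "__main__":
    app.run()
```
-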
Multiple 301 redirects for a HTTPS URL. Good or bad?
I'm working on an ecommerce website that has a few snags and issues with its coding. They're using HTTPS, and when you access the website through domain.com, there's a 301 redirect to http://www.domain.com, which is then, in turn, redirected to https://www.domain.com. Would this have a detrimental effect, or is that considered the best way to do it: have the website redirect to http, and then all http access redirected to the https URL? Thanks
Intermediate & Advanced SEO | jasondexter
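For reference, the hop count is easy to measure: `requests` records every intermediate response in `response.history`. A sketch with a placeholder URL:

```python
# Sketch: trace the redirect chain for a URL and flag multi-hop chains.
# The test URL is a placeholder; requires the `requests` library.
import requests

resp = requests.get("http://example.com/", allow_redirects=True, timeout=10)

for hop in resp.history:
    print(hop.status_code, hop.url, "->", hop.headers.get("Location"))
print(resp.status_code, resp.url)

if len(resp.history) > 1:
    print("Multi-hop chain: consider one direct 301 to the https+www URL.")
```

A single 301 straight from each variant (http, non-www, etc.) to the final https://www URL avoids the extra hop.
-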
XML Sitemap Index Percentage (Large Sites)
Hi all, I'm wanting to find out from those who have experience dealing with large sites (10s/100s of millions of pages). What's a typical (or highest) percentage of indexed pages vs. submitted pages you've seen? This information can be found in Webmaster Tools, where Google shows you the pages submitted and indexed for each of your sitemaps. I'm trying to figure out:
- What the average index % out there is
- Whether there is a ceiling (i.e. it will never reach 100%)
- Whether it's possible to improve the indexing percentage further
Just to give you some background, sitemap index files (per the sitemaps.org protocol) have been implemented to improve crawl efficiency, and I'm wanting to find out other ways to improve this further. I've been thinking about looking at the URL parameters to exclude, as there are hundreds (e-commerce site), to help Google improve crawl efficiency and utilise the daily crawl quota more effectively to discover pages that have not been discovered yet. However, I'm not sure yet whether this is the best path to take, or whether I'm just flogging a dead horse if there is such a ceiling or if I'm already in the average ballpark for large sites. Any suggestions/insights would be appreciated. Thanks.
Intermediate & Advanced SEO | danng