Editing A Sitemap
-
Would there be any positive effect from editing a sitemap down to a more curated list of pages that perform, or that we hope will begin to perform, in organic search?
A site I work with has a sitemap of about 20,000 pages that is automatically generated by a Drupal plugin.
Of those pages, only about 10% really produce out of search. There are old sections of the site that are thin, obsolete, discontinued and/or noindexed that are still on the sitemap.
For instance, would it focus Google's crawl budget more efficiently or have some other effect?
Your thoughts? Thanks! Best... Darcy
-
Hi Darcy
Looking at what has been mentioned previously I would agree with the train of thought that a more focussed sitemap would generally be advantageous.
Andrew
-
Hi Dmitrii,
Always fun to watch Matt's Greatest Hits; in this case, the value of making things better.
I guess the "make it better or delete it" choice seems super black and white to me.
Economically, who is able to make thousands of pages dramatically better with compelling original content? So, instead, the only other option is apparently radical elective surgery and massive amputation? I guess I'd choose the chemo first and don't really see what the downside is for noindex/follow and exclude from the sitemap.
Anyway, thanks again! Best... Darcy
-
"I really read the above linked post differently than Google saying 'just delete it.'"
Well, here is a video from Matt Cutts about thin content. In this particular video he's talking about websites that have already taken a hit for thin content, but your case is the same, since you're trying to prevent that.
https://www.youtube.com/watch?v=w3-obcXkyA4&t=322
So, there are two options he talks about: delete it or make it better. From your previous responses I understand that making it better is not an option, so there is only one option left.
As for link juice through those pages: if those pages have a good number of links and traffic and are quite popular on your website, then surely DON'T delete them, but rather make them better. However, I understood that those pages are not popular and don't get much traffic, so that leaves deleting them.
-
Hi Thomas,
Thanks for the message.
To answer your question, part of the reason is link juice via a noindex/follow and then there are some pages that serve a very very narrow content purpose, but have absolutely no life in search.
All things being equal, do you think a smaller, more focused, sitemap is generally an advantage? In the extreme and on other sites I've seen sitemaps with noindexed pages on them.
Thanks... Darcy
-
Thanks for the suggestion, Andrew.
Whether or not priority is set in a sitemap, do you think a smaller, more focused sitemap is generally an advantage?
Thanks... Darcy
-
Thomas & Dmitrii,
Thanks for the message. With all due respect, I really read the above linked post differently than Google saying "just delete it."
Also, I don't see how deleting it preserves whatever link juice those pages had, as opposed to a "noindex, follow" and taking them out of the sitemap.
Finally, I don't necessarily equate all of Google's suggestions as synonymous with a "for best effect in search." I assume their suggestions mean, "it's best for Google if you..."
Thanks, again!
Best... Darcy
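To make the "noindex, follow and take them out of the sitemap" idea concrete, here is a minimal sketch of how one might detect noindexed pages so they can be excluded from the sitemap. It assumes the page HTML has already been fetched; the names `RobotsMetaParser`, `robots_directives` and `keep_in_sitemap` are made up for illustration, and it only looks at the robots meta tag, not at an `X-Robots-Tag` header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # content looks like "noindex, follow"; split it into directives
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots meta directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.directives.discard("")
    return parser.directives


def keep_in_sitemap(html):
    """Hypothetical rule: a page marked noindex is left out of the sitemap."""
    return "noindex" not in robots_directives(html)
```

For example, a page containing `<meta name="robots" content="noindex, follow">` would be dropped from the sitemap while its links can still be followed.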
-
You misunderstand the meaning of that article.
"...that when you do block thin or bad content, Google prefers when you use the noindex over 404ing the page..."
They are talking about a workaround: blocking pages INSTEAD of removing them.
So, if for whatever reason you don't want to delete a page, returning a 404 status for it is worse than putting noindex on it. Basically, what they're saying is:
- if you have thin content, DELETE it;
- if for whatever reason you don't want to delete it, put NOINDEX on it.
P.S. My suggestion stays the same. Delete all the bad content and, if you really want, return a 410 Gone status for the deleted pages, so Google understands immediately that they are gone forever rather than inaccessible by mistake.
Hope this makes sense.
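As a rough illustration of the 410 idea, here is a minimal sketch using Python's standard library. On a Drupal site this would normally be handled in the CMS or the web server configuration, so treat it as a generic example only; the paths in `REMOVED_PATHS` and the names `status_for` and `GoneAwareHandler` are hypothetical.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical list of URL paths that were deleted for good.
REMOVED_PATHS = {"/old-section/widget-2009", "/discontinued/catalog"}


def status_for(path, removed=REMOVED_PATHS):
    """Return 410 Gone for permanently removed paths, 200 OK otherwise."""
    # Ignore the query string and a trailing slash when matching.
    clean = path.split("?", 1)[0].rstrip("/") or "/"
    return HTTPStatus.GONE if clean in removed else HTTPStatus.OK


class GoneAwareHandler(BaseHTTPRequestHandler):
    """Toy handler that answers every GET with the status chosen above."""

    def do_GET(self):
        status = status_for(self.path)
        body = b"Gone" if status is HTTPStatus.GONE else b"OK"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

The point is only the status code: 410 tells a crawler the removal is intentional and permanent, where a 404 leaves that open.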
-
Darcy,
Whilst noindex would be a good solution, if the page has no benefit why would you noindex instead of deleting it?
-
Dmitrii & Thomas,
Thanks for your thoughts.
Removal would be one way to go. I note with some interest this post:
https://www.seroundtable.com/google-block-thin-content-use-noindex-over-404s-21011.html
According to that, removal would be the third thing after making it better and noindexing.
With thousands of pages, making it better is not really an option.
Best... Darcy
-
Hi Darcy
I don't know about scaling the sitemap down, but you could use one part of the sitemap to make the crawl more efficient.
The part in question is the priority field, which tells the search engines which pages on your site you consider most important. The theory is that pages with a higher priority (say 1.0, i.e. 100%) are more likely to get indexed by the search engines than pages with a lower priority (say 0.1, i.e. 10%), although not everyone in the industry agrees.
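For reference, the sitemap protocol expresses priority as a value between 0.0 and 1.0 in a `<priority>` element. Below is a minimal sketch of generating such entries; the function name `build_sitemap` and the example.com URLs are assumptions for illustration, and whether search engines act on the value is, as noted, debated.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, priority) pairs, priority in 0.0..1.0."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, priority in pages:
        if not 0.0 <= priority <= 1.0:
            raise ValueError(f"priority out of range for {url}: {priority}")
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # The protocol uses one decimal place by convention, e.g. 0.5
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    ("https://www.example.com/", 1.0),
    ("https://www.example.com/archive/", 0.1),
])
```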
-
"There are old sections of the site that are thin, obsolete, discontinued and/or noindexed that are still on the sitemap."
Why not remove these from the site?
I personally believe it'll have a positive impact. By submitting this sitemap to Google you're giving it a route through your whole site, so why would you hand it low-quality pages? You want to provide Google (and your users) with the best possible experience, so if you've got out-of-date pages, update them, or if they're no longer relevant, delete them. A user who lands on such a page would just bounce anyway because it's not relevant anymore.
If these out of date pages can't be found by crawling, then 100% it's best to craft your sitemap to show the best pages.
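Crafting the sitemap could be as simple as filtering the generated file against a list of pages worth keeping. This is a minimal sketch under the assumption that you already have that list (for example, from analytics); `curate_sitemap` and `keep_urls` are hypothetical names, and it handles a plain `<urlset>` sitemap rather than a sitemap index.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def curate_sitemap(sitemap_xml, keep_urls):
    """Return the sitemap XML with only the <url> entries listed in keep_urls."""
    # Serialize with the sitemap namespace as the default (no ns0: prefixes).
    ET.register_namespace("", NS["sm"])
    root = ET.fromstring(sitemap_xml)
    for entry in root.findall("sm:url", NS):
        loc = entry.findtext("sm:loc", default="", namespaces=NS).strip()
        if loc not in keep_urls:
            root.remove(entry)
    return ET.tostring(root, encoding="unicode")
```

Called with the full sitemap and a set of the URLs that perform, it returns a smaller sitemap containing just those entries.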
-
Hi there.
"Of those pages, only about 10% really produce out of search. There are old sections of the site that are thin, obsolete, discontinued and/or noindexed that are still on the sitemap."
Have you considered removing those pages/sections, rather than altering the sitemap? It would make more sense I think.