Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Sitemap use for very large forum-based community site
-
I work on a very large site with two main types of content, static landing pages for products, and a forum & blogs (user created) under each product. Site has maybe 500k - 1 million pages. We do not have a sitemap at this time.
Currently our SEO discoverability in general is good, Google is indexing new forum threads within 1-5 days roughly. Some of the "static" landing pages for our smaller, less visited products however do not have great SEO.
Question is, could our SEO be improved by creating a sitemap, and if so, how could it be implemented? I see a few ways to go about it:- Sitemap includes "static" product category landing pages only - i.e., the product home pages, the forum landing pages, and blog list pages. This would probably end up being 100-200 URLs.
- Sitemap contains the above but is also dynamically updated with new threads & blog posts.
Option 2 seems like it would mean the sitemap is unmanageably long (hundreds of thousands of forum URLs). Would a crawler even parse something that size? Or with Option 1, could it cause our organically ranked pages to change ranking due to Google re-prioritizing the pages within the sitemap?
Not a lot of information out there on this topic, appreciate any input. Thanks in advance. -
Agreed, you'll likely want to go with option #2. Dynamic sitemaps are a must when you're dealing with large sites like this. We advise them on all of our clients with larger sites. If your forum content is important for search then these are definitely important to include as the content likely changes often and might be naturally deeper in the architecture.
In general, I'd think of sitemaps from a discoverability perspective instead of a ranking one. The primary goal is to give Googlebot an avenue to crawl your sites content regardless of internal linking structure.
-
Hi
Go with option 2, there is no scaling issue here. I have worked with and for sites that have a high multiplier on the number of sitemaps and pages that they're submitting, in some cases up to 100M pages. In all cases, Google was totally fine in crawling and processing the data that was there. As long as you follow the guidelines (max 50K URLs in a sitemap) you're fine as you're just providing another file that usually doesn't exceed about 50MB (depending on if you also add images to the sitemap). If you have an engineering team build the right infrastructure you can easily deal with thousands of these files and run them automated every day/week.
My main focus on big sites is also to streamline their sitemaps to have sitemaps with just the last 50.000 pages and the same for the last 50.000 pages that were updated. This way you're able to also monitor the indexation level of these pages. If you are able to, for example, combine the data from log file analysis you can say: we added 50K pages and Google in the last days were able to crawl X percentage of that.
Hope this gives you some extra insights.
Martijn.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Sitemap.xml strategy for site with thousands of pages
I have a client that has a HUGE website with thousands of product pages. We don't currently have a sitemap.xml because it would take so much power to map the sitemap. I have thought about creating a sitemap for the key pages on the website - but didn't want to hurt the SEO on the thousands of product pages. If you have a sitemap.xml that only has some of the pages on your site - will it negatively impact the other pages, that Google has indexed - but are not listed on the sitemap.xml.
Technical SEO | | jerrico10 -
In writing the url, it is better to use the language used by the people of my country or English?
We speak Persian and all people search in Persian on Google. But I read in some sources that the url should be in English. Please tell me which language to use for url writing?
Technical SEO | | ghesta
For example, I brought down two models: 1fb0e134-10dc-4737-904f-bfdf07143a98-image.png https://ghesta.ir/blog/how-to-become-rich/
2)https://ghesta.ir/blog/چگونه-پولدار-شویم/0 -
Help Setting Up 301 Redirects from Coldfusion Site to Wordpress Site.
I have created a new website and need to redirect all of the previous pages to the new one. The old website was built in coldfusion and the new site is built in wordpress. One of the pages I'm trying to redirect is www.norriseal.com/products.cfm to http://norrisealwellmark.com/products/. This is what I have in my .htaccess file <ifmodule mod_rewrite.c="">Options +FollowSymlinks
Technical SEO | | MarketHubb
RewriteEngine On
RewriteBase /
Redirect 301 /products.cfm http://norrisealwellmark.com/products/</ifmodule> The result of this redirect is http://norrisealwellmark.com/products.cfm How do I prevent the .cfm from appending to the destination URL?1 -
Automate XML Sitemaps
Quick question, which is the best method that people have for automating sitemaps. We publish around 200 times a day and I would like to make sure as soon as we publish it gets updated in the site map. What is the best method of updating a sitemap so it gets updated immediately after it is published.
Technical SEO | | mattdinbrooklyn0 -
302 redirect used, submit old sitemap?
The website of a partner of mine was recently migrated to a new platform. Even though the content on the pages mostly stayed the same, both the HTML source (divs, meta data, headers, etc.) and URLs (removed index.php, removed capitalization, etc) changed heavily. Unfortunately, the URLs of ALL forum posts (150K+) were redirected using a 302 redirect, which was only recently discovered and swiftly changed to a 301 after the discovery. Several other important content pages (150+) weren't redirected at all at first, but most now have a 301 redirect as well. The 302 redirects and 404 content pages had been live for over 2 weeks at that point, and judging by the consistent day/day drop in organic traffic, I'm guessing Google didn't like the way this migration went. My best guess would be that Google is currently treating all these content pages as 'new' (after all, the source code changed 50%+, most of the meta data changed, the URL changed, and a 302 redirect was used). On top of that, the large number of 404's they've encountered (40K+) probably also fueled their belief of a now non-worthy-of-traffic website. Given that some of these pages had been online for almost a decade, I would love Google to see that these pages are actually new versions of the old page, and therefore pass on any link juice & authority. I had the idea of submitting a sitemap containing the most important URLs of the old website (as harvested from the Top Visited Pages from Google Analytics, because no old sitemap was ever generated...), thereby re-pointing Google to all these old pages, but presenting them with a nice 301 redirect this time instead, hopefully causing them to regain their rankings. To your best knowledge, would that help the problems I've outlined above? Could it hurt? Any other tips are welcome as well.
Technical SEO | | Theo-NL0 -
What is the best way to find missing alt tags on my site (site wide - not page by page)?
I am looking to find all the missing alt tags on my site at once. I have a FF extension that use to do it page by page, but my site is huge and that will take forever. Thanks!!
Technical SEO | | franchisesolutions1 -
Forum Profile Links
Are they really important? Many preach they are, and there are tonnes of services out there who give you thousands of forum profile links in no time. I strictly believe in genuine links built the hard way, and definitely don't want to get into anything which is black hat. Please suggest if building several Forum Profile Links is an appropriate way of building links?
Technical SEO | | KS__2 -
Do we need to manually submit a sitemap every time, or can we host it on our site as /sitemap and Google will see & crawl it?
I realized we don't have a sitemap in place, so we're going to get one built. Once we do, I'll submit it manually to Google via Webmaster tools. However, we have a very dynamic site with content constantly being added. Will I need to keep manually re-submitting the sitemap to Google? Or could we have the continually updating sitemap live on our site at /sitemap and the crawlers will just pick it up from there? I noticed this is what SEOmoz does at http://www.seomoz.org/sitemap.
Technical SEO | | askotzko0