XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap encompasses the core framework of your website. As long as the sitemap is well-organized, omitting a few internal pages is acceptable since Googlebot will crawl all pages based on the sitemap. Take a look at the <a href="https://convowear.in">example page</a> that also excludes some pages, yet it doesn't impact the site crawler's functionality.
-
Yes Yoast on WordPress works fine for sitemap generation. I would also recommend that. Using on all of my blog sites.
-
If you are using WordPress then I would recommend to use Yoast plugin. It generates sitemap automatically regularly. I am also using it on my blog.
-
I'm using Yoast SEO plugin for my website. It generates the Sitemap automatically.
-
My new waterproof tent reviews blog facing the crawling problem. How can I fix that?
-
use Yoast or rankmath ot fix it
آموزش سئو در اصفهان https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://moz.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
-
This doesn't really answer my question of why I am not able to get all links into the XML sitemap when using xml sitemap generators.
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is constructed in a really decent structure, then missing some internal pages are acceptable because Googlebot will crawl all of your pages based on your site map. You can see the following page which also doesn't cover all of its pages, but there's no influence in terms of site crawler.
-
Thanks Boyd but unfortunately I am still missing a good chunk of URLs here and I am wondering why? Do those check on internal links in order to find these pages?
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site you can click on the Sitemaps tab and generate an XML sitemap file to use.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Sitemap into SE
Hi Moz community experts, I have a question about the sitemap into search engine like here : http://i.imgur.com/gQ0JhuH.jpg. Do you know what I need to do to get the same structure or do decide which pages we want to present into our result. We created a new page and we would like to see it into the resultat when the visitor is searching for our branded keywords. Thank in advance for your support. gQ0JhuH.jpg.
Intermediate & Advanced SEO | | johncurlee0 -
Google can't access/crawl my site!
Hi I'm dealing with this problem for a few days. In fact i didn't realize it was this serious until today when i saw most of my site "de-indexed" and losing most of the rankings. [URL Errors: 1st photo] 8/21/14 there were only 42 errors but in 8/22/14 this number went to 272 and it just keeps going up. The site i'm talking about is gazetaexpress.com (media news, custom cms) with lot's of pages. After i did some research i came to the conclusion that the problem is to the firewall, who might have blocked google bots from accessing the site. But the server administrator is saying that this isn't true and no google bots have been blocked. Also when i go to WMT, and try to Fetch as Google the site, this is what i get: [Fetch as Google: 2nd photo] From more than 60 tries, 2-3 times it showed Complete (and this only to homepage, never to articles). What can be the problem? Can i get Google to crawl properly my site and is there a chance that i will lose my previous rankings? Thanks a lot
Intermediate & Advanced SEO | | granitgash
Granit FvhvDVR.png dKx3m1O.png0 -
Bad site migration - what to do!
Hi Mozzers - I'm just looking at a site which has been damaged by a very poor site migration. Basically, the old URLs were 301'd to a page on the new website (not a 404) telling everyone the page no longer existed. They did not 301 old pages to equivalent new pages. So I just checked Google WMT and saw 1,000 crawl errors - basically the old URLs. This migration was done back in February, since when traffic to the website has never recovered. Should I fix this now? Is it worth implementing the correct 301s now, after such a timelapse?
Intermediate & Advanced SEO | | McTaggart0 -
Site rankings down
Our site is over 10 years old and has consistently ranked highly in google.co.uk for over 100 key phrases. Until the middle of April, we were 7th for 'nuts and bolts' and 5th for 'bolts and nuts' - we have been around these positions for 5-6 years easily now. Our rankings dropped mid-April, but now (presumably as a result of Penguin 2.0), we've seen larger decreases across the board. We are now 5th page on 'nuts and bolts', and second page on 'bolts and nuts'. Can anyone please shed any light on this? Although we'd fallen some before Penguin 2.0, we've fallen quite a bit further since. So I'm wondering if it's that. We do still rank well on our more specialised terms though - 'imperial bolts', 'bsw bolts', 'bsf bolts', we're still top 5. We've lost out with the more generic terms. In the past we did a bit of (relevant) blog commenting and obtained some business directory links, before realising the gain was tiny if at all. Are those likely to be the issue? I'm guessing so. It's hard to know which to get rid of though! Now, I use social media sparingly, just Facebook, Twitter and G+. The only linkbuilding I do now is by sending polite emails to people who run classic car clubs that would use our bolts, stuff like that. I've had a decent response from that, and a few have become customers directly. Here's our link profile if anyone would be kind enough as to have a look: http://www.opensiteexplorer.org/links?site=www.thomassmithfasteners.com Also, SEOMOZ says we have too many links on our homepage (107) - the dropdown navigation is the culprit here. Should I simply get rid of the dropdown and take users to the categories? Any advice here would be appreciated before I make changes! If anyone wants to take a look at the site, the URL is in the link profile above - I'm terrified of posting links anywhere now! Thanks for your time, and I'd be very grateful for any advice. Best Regards, Stephen
Intermediate & Advanced SEO | | stephenshone1 -
How do I create a XML Sitemap?
It appears that the free online tools limit the number of URLs they'll include. What tools have you had success with?
Intermediate & Advanced SEO | | NaHoku1 -
Is my site being penalized?
I launched http://rumma.ge in February of this year. Because I'm using a domain hack (the Georgian domain), I'd really like to rank for just the word "rummage". After launching, I was steady at around page 4/5 on searches for "rummage". However since then I've tumbled out of the first 100. In fact I can't even find the site in the first 20 pages on Google for that search. Even a search for my exact homepage title text doesn't bring up the site, despite the fact that the site is still in the index. I'm wondering if one of the following could be the root cause: We have a ccTLD (.ge)--not sure about the impacts of this, but seems like it might not be the root cause because we were ranking for "rummage" when we first launched. Tried running an Adwords campaign but the site was flagged as a "bridge page" (working on getting this addressed). I'm wondering if this could have carryover impacts into natural search rankings? We've tried doing some press and built up a decent number of backlinks over the past couple of months, many of which had "rummage" in the anchor text. This was all organic, but happened over the span of a month which may be too fast? Am I being penalized? Beyond checking indexing of the site, is there a way to tell if I've been flagged for some bad behavior? Any help or thoughts would be greatly appreciated. I'm really confused by this since I feel like I've been doing things right and my rankings have been travelling downward. Thanks!! Matt
Intermediate & Advanced SEO | | minouye0 -
Splitting a Site into Two Sites for SEO Purposes
I have a client that owns a business that really could be easily divided into two separate business in terms of SEO. Right now his web site covers both divisions of his business. He gets about 5500 visitors a month. The majority go to one part of his business and around 600 each month go to the other. So about 11% I'm considering breaking off this 11% and putting it on an entirely different domain name. I think I could rank better for this 11%. The site would only be SEO'd for this particular division of the company. The keywords would not be in competition with each other. I would of course link the two web sites and watch that I don't run into any duplicate content issues. I worry about placing the redirects from the pages that I remove to the new pages. I know Google is not a fan of redirects. Then I also worry about the eventual drop in traffic to the main site now. How big of a factor is traffic in rankings? Other challenges include that the business services 4 major metropolitan areas. Would you do this? Have you done this? How did it work? Any suggestions?
Intermediate & Advanced SEO | | MSWD0 -
Sitemap in SERPS
What's up guys, Having some troubles with SERP rankings. My sitemap (navigation) is appearing instead of my actual keywords. I have tried a few methods to fix this; setting a preferred domain, using a 301 redirects, deleting out of date pages via Google webmaster tools. Nothing seems to work. My next step was to refresh the cache for my entire site - does anyone know how to do this? Can't see any tools... Any help would be great. Cheers, Jon.
Intermediate & Advanced SEO | | jamesjk240