XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap encompasses the core framework of your website. As long as the sitemap is well-organized, omitting a few internal pages is acceptable since Googlebot will crawl all pages based on the sitemap. Take a look at the <a href="https://convowear.in">example page</a> that also excludes some pages, yet it doesn't impact the site crawler's functionality.
-
Yes Yoast on WordPress works fine for sitemap generation. I would also recommend that. Using on all of my blog sites.
-
If you are using WordPress then I would recommend to use Yoast plugin. It generates sitemap automatically regularly. I am also using it on my blog.
-
I'm using Yoast SEO plugin for my website. It generates the Sitemap automatically.
-
My new waterproof tent reviews blog facing the crawling problem. How can I fix that?
-
use Yoast or rankmath ot fix it
آموزش سئو در اصفهان https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://moz.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
-
This doesn't really answer my question of why I am not able to get all links into the XML sitemap when using xml sitemap generators.
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is constructed in a really decent structure, then missing some internal pages are acceptable because Googlebot will crawl all of your pages based on your site map. You can see the following page which also doesn't cover all of its pages, but there's no influence in terms of site crawler.
-
Thanks Boyd but unfortunately I am still missing a good chunk of URLs here and I am wondering why? Do those check on internal links in order to find these pages?
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site you can click on the Sitemaps tab and generate an XML sitemap file to use.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is there an issue with my site?
Been mostly hanging around top of page two for the last couple of years for “Liverpool Wedding photographer” although got myself on page 1 for “Liverpool photographer” I have split the title of the page to target these two keywords. I took the Liverpool photographer off the title to see if it was being detrimental to the “Liverpool wedding photographer” I didn’t see no increase in ranking so put it back as I get a bit of commercial work from it. Since last year I have got onto page 1 at least three times around position 5-6. Within a week or two I start sliding down again and end up back at top of page two. I could understand this slow push out if my competitors were busy SEO wise but from what I have seen they are not. There is a guy using the keywords in URL and calls himself “Liverpool wedding photographer” last time I checked he literally had no links but is in the first 5 positions. I have I think a better link profile than every one else. Although I am on and off with Facebook and Instagram, (more off) so that probably isn’t helping. Although I have a colleague in the video side of things and he doesn’t use social media at all and it hasn’t harmed him. A few years ago I was burned quite badly by a total charlatan. He sunk my home page to page 4. He talked the talk about creating landing pages but his methods were shoddy to say the least. I can’t believe I was taken in by him, although I was only with him for 2 months. He was still using spammy link techniques to generate lots of toxic links for me! I disavowed all of his links and put the keywords back on the home page and was back to my usual top of page 2 position within a week. Since then I have disavowed all directory links and anything not wedding related. I have an article which ranks 1st or second for “Nikon CLS”. I have also another article of 2000 words or so on another reasonable placed photography website. A few links from other vendors or people I have taken photographs for. I have about 10 featured weddings with a link on 4 good weddings blogs. I don’t think a massive amount of blog comments although I have stopped doing this. If I look at most of the competitors these are their main links, with directories as well! Last winter I put a quite substantial article about documentary wedding photography on my home page. I flew to number 2, although I photographed The World Transformed (the alternative labour conference in Liverpool). I got a lot of clicks to a gallery page (few thousand off social media} so I don’t know if that coincided with it. Same thing – watching the website go down a few positions every day until within just over a week or two I was about 4<sup>th</sup> on page 2! Its like my website is on a spring which can push into page 1 but rebounds back to top of page 2. I am staring to worry that my site has been marked as a bad character in some way because I get what seems to be rough treatment from google compared to my peers. I have written I think 4 or 5 (1500 word) articles the last couple of months talking about lenses and wedding photography related topics and Google pushed me back to page 1, peaking At position 5. I was there for a few weeks and then the slide happened again. Bit demoralised at the moment, what to do? Any help or pointers would be most appreciated. Best wishes. David.
Intermediate & Advanced SEO | | WallerD0 -
Our parent company has included their sitemap links in our robots.txt file - will that have an impact on the way our site is crawled?
Our parent company has included their sitemap links in our robots.txt file. All of their sitemap links are on a different domain and I'm wondering if this will have any impact on our searchability or potential rankings.
Intermediate & Advanced SEO | | tsmith1310 -
Open Site Explorer - Spam analysis: need help with inbound links... from my site!
hallo, reading my spam analysis report from open explorer, I found somenthing I don't understand (please see attached image): The long list of links inside the red rectangle are inbound links with a spam score of 5 coming from my same site. How is that possible? Should I remove those links? Also , I see that many of those links are links present in the top navigation bar (about page, home page, service description etc.) or in the sidebar section of the website (categories, recent posts, recent comments). Should I treat them differently? Thank you for your time.
Intermediate & Advanced SEO | | micvitale0 -
Launching a new website. Old inherited site cannot be saved after lifted penalty. When should we kill the old site and how?
Background Information A website that we inherited was severely penalized and after the penalty was revoked the site still never resurfaced in rankings or traffic. Although a dramatic action, we have decided to launch a completely new version of the website. Everything will be new including the imagery, branding, content, domain name, hosting company, registrar account, google analytics account, etc. Our question is when do we pull the plug on the old site and how do we go about doing it? We had heard advice that we should make sure we run both sites at the same time for 3 months, then deindex the old site using a noindex meta robots tag.We are cautious because we don't want the old website to be associated in any way, shape or form with the new website. We will purposely not be 301 redirecting any URLs from the old website to the new. What would you do if you were in this situation?
Intermediate & Advanced SEO | | peteboyd0 -
Linking from a corporate site to a brand site.
Is there an SEO impact to a large corporation linking from a corporate and/or a divisional site to a specific brand site with it's own top level domain? We would like to keep the traffic coming, but not if it will be seen as a black hat tactic. My guess is that Google will be smart enough to see that the corporation owns the brand and at least not penalize us, but I am wondering if anyone else has this experience? Google Analytics is calling it self-referral.
Intermediate & Advanced SEO | | mrbobland0 -
Redirecting Pages from site A to site B
Hi, I have a client who have a solid, high ranking content based site (site A). They have now created an ecommerce site in addition (site B). To give site B a boost in terms of search engine visibility upon launch, they now wish to redirect approx 90% of site As pages to site B. What would be the implications of this? Apart from customers being automatically redirected from the page they thought they where landing on, how would google now view site A? What are your thoughts to thier idea. I am trying to talk them out of it as I think its a poor one.
Intermediate & Advanced SEO | | Webrevolve0 -
Site Navigation
Hi Mozzers, I am an SEO at uncommongoods.com and looking for your opinion on our site nav. Currently our nav & URLs are structured in 3 levels. From the top level down, they are: 1. Category ex: http://www.uncommongoods.com/home-garden 2. Subcat ex: http://www.uncommongoods.com/home-garden/bed-bath 3. Family ex:http://www.uncommongoods.com/home-garden/bed-bath/bath-accessories Right now, all levels are accessible from our top nav but we are considering removing the family pages. If we did that, Google could still find & crawl links to the family pages, but they would have to drill down to the subcat pages to find them. Do you guys think this would help or hurt our SEO efforts? Thanks! -Zack
Intermediate & Advanced SEO | | znotes0 -
SEOMOZ crawl all my pages
SEOMOZ crawl all my pages including ".do" (all web pages after sign up ) . Coz of this it finishes all my 10.000 crawl page quota and be exposed to dublicate pages. Google is not crawling pages that user reach after sign up. Because these are private pages for customers I guess The main question is how we can limit SEOMOZ crawl bot. If the bot can stay out of ".do" java extensions it'll perfect to starting SEO analysis. Do you know think about it? Cheers Example; .do java extension (after sign up page) (Google can't crawl) http://magaza.turkcell.com.tr/showProductDetail.do?psi=1001694&shopCategoryId=1000021&model=Apple-iPhone-3GS-8GB Normal Page (Google can crawl) http://magaza.turkcell.com.tr/telefon/Apple-iPhone-3GS-8GB/1001694/.html
Intermediate & Advanced SEO | | hcetinsoy0