Sitemap generator partially finding list of website URLs
-
Hi everyone,
When creating my XML sitemap here it is only able to detect a portion of the website. I am missing at least 20 URLs (blog pages + newly created resource pages). I have checked those missing URLs and all of them are index and they're not blocked by the robots.txt.
Any idea why this is happening? I need to make sure all wanted URLs to be generated in an XML sitemap.
Thanks!
-
Gaston,
Interestingly enough by default the generator only located only half of the URLs. I hope that one of those 2 fields will do the trick.
-
Hi Taysir,
I´ve never used that service. I suspect that the section you refer to should do the trick.
I believe that you do know how many URLs there are in the whole site, so you can compare how much pro-sitemaps.com finds to your numbers.Best luck!
GR -
Thanks for your response Gaston. These pages are definitely not blocked by the robots.txt file. I think that it is an internal linking problem. I actually subscribed to pro-sitemap.com and was wondering if I should use this section and add remaining sitemap URLs that are missing: https://cl.ly/0k0t093f0Y1T
Do you think this would do the trick?
-
Google not only provides a basic template you could do the sitemap manually if you wished, and this link has Google listing several dozen open source sitemap generators.
If Google Webmaster's can't read the one you generated fully, then clearly an alternate generator should definitely fix that for you. Good luck!
-
Hi taysir!
Have you tried any other crawler to check whether those pages can be finded?
I'd strongly suggest you Screaming Frog spider, the free version allows you up to 500 URLs. Also, it has a feature to create sitemaps from the crawled URLs. Even though dont know if that available in the free version.
Here some info about that feature: XML sitemap genetator - Screaming FrogUsual issues in not being findable are:
- Poor internal linking
- Not having a sitemap (this is why you find out)
- Blocked resources in robots.txt
- Blocked pages with robots meta tag
That being said, its completely normal that Google has indexed pages that you cant find in a AdHoc crawl, that is because GoogleBot could have found those pages from external linking.
Also keep in mind that having pages blocked with Robots.txt or robots meta tag will not prevent that page from being indexed nor will make them deindex if you add some rules to block them.Hope it helps.
Best luck
GR
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Image Sitemap
I currently use a program to create our sitemap (xml). It doesn't offer creating an mage sitemaps. Can someone suggest a program that would create an image sitemap? Thanks.
Technical SEO | | Kdruckenbrod0 -
Are mobile annotation in PC xml sitemaps a replacement for mobile xml sitemaps?
These two links confused me as to what I should do... https://developers.google.com/webmasters/smartphone-sites/details https://support.google.com/webmasters/answer/34648?hl=en
Technical SEO | | JasonOliveira0 -
Website not crawled
i added website www.nsale.in in add campaign, it shows only 1 page crawled. but its working fine for other sites, any idea why it failed ?
Technical SEO | | Dhinesh0 -
How do I use only one URL
my site can be reach by both www.site.com and site.com. How do I make it only use www?
Technical SEO | | Weblion0 -
Weird Local Listings same company in listing
Have a look at this It's the local listing of weddings Gretna Here is a short version
Technical SEO | | ibexinternet
http://bit.ly/HD6ay4 There are two listing with the same address and they are the same company but just different Wed address I though this was against the rules! www.gretnaweddings.co.uk and www.gretnaweddings.com Any ideas how they getting away with it!? Cheers Steve0 -
Where to place your brandname in your URL?
Hello everybody! Quick and short question: What is better when you want to rank for your your brandname? www.jobsbrandname.com or www.brandnamejobs.com I think for SEO it's better to use the last one but marketing has the wish to use the first one. Thanks for your responce!
Technical SEO | | ltom0 -
Generating a Sitemap for websites linked to a wordpress blog
Greetings, I'm looking for a way to generate a sitemap that will include static pages on my home directory, as well as my wordpress blog. The site that I'm trying to build this for is in a temporary folder, and can be accessed at http://www.lifewaves.com/Website 3.0 I plan on moving the contents of this folder to the root directory for lifewaves.com whenever we are go for launch. What I'm wondering is, is there a way to build a sitemap or sitemap index that will point to the static pages of my site, as well as the wordpress blog while taking advantage of the built in wordpress hierarchy? If so, what's an easy way to do this. I have generated a sitemap using Yoast, but I can't seem to find any xml files within the wordpress folder. Within the plugin is a button that I can click to access the sitemap index, but it just brings me to the homepage of my blog. Can I build a sitemap index that points to a sitemap for the static pages as well as the sitemap generated by yoast? Thank you in advance for your help!! P.S. I'm kind of a noob.
Technical SEO | | WaveMaker0 -
URL Rewrite
We are trying to convince a client to do a massive rewrite from all URL's looking like this: "www.company.com/category/categoryId=82374" to something like "www.company.com/womens/jackets/rain" How would you describe the importance and impact of doing URL rewrites to an ecommerce site? What evidence/research can we share with them to convince them it is worth the time and effort to do?
Technical SEO | | Hakkasan0