Google only crawling a small percentage of the sitemap
-
Hi,
The company I work for has developed a new website for a customer; their URL is https://www.wideformatsolutions.co.uk. I've created a sitemap which has 25,555 URLs. I submitted this to Google around 4 weeks ago, and the most pages that have ever been crawled is 2,379.
I've checked everything I can think of, including:
- Speed of website
- Canonical Links
- 404 errors
- Setting a preferred domain
- Duplicate content
- robots.txt
- .htaccess
- Meta Tags
I did read that Matt Cutts revealed in an interview with Eric Enge that the number of pages Google crawls is roughly proportional to your PageRank, but I'm sure it should crawl more than 2,000 pages.
The website is based on Opencart. If anyone has experienced anything like this, I would love to hear from you.
-
No problem! I meant to mention this in my first comment, but I also noticed that there's no robots.txt file in place. That's obviously not going to help your indexation problem too much, but nonetheless something you should know about.
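For anyone wanting to sanity-check a robots.txt once one is in place, here is a minimal sketch using Python's standard `urllib.robotparser`. The sample rules, the `/admin/` path, and the example.com URLs are assumptions for illustration only, not the actual configuration of the site in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and domain are placeholders.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: https://www.example.com/sitemap.xml
"""


def inspect_robots(robots_txt, url, user_agent="Googlebot"):
    """Return (is_allowed, declared_sitemaps) for the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present (Python 3.8+).
    return parser.can_fetch(user_agent, url), parser.site_maps() or []


allowed, sitemaps = inspect_robots(
    SAMPLE_ROBOTS_TXT, "https://www.example.com/some-product"
)
print(allowed, sitemaps)
```

A missing robots.txt does not block crawling by itself, so this is mainly useful for confirming that nothing important is disallowed and that the sitemap location is declared.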
-
I did have some issues with this when we first launched the site, so I will look into it further now. The HTTPS certificate is fairly new.
Thanks for commenting
-
Looks to me like Google can't properly access your XML sitemap. I tried it in 2 different validator tools and URI Valet, and none of them were able to access it. It could be something to do with HTTPS. Did you recently switch the site over to secure?
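If you want to reproduce this kind of check yourself, a rough sketch in Python (standard library only) could look like the following. The function names, the `sitemap-check/0.1` user agent string, and the sample XML are all made up for illustration; `fetch` is not called here, so point it at your own sitemap URL when you try it.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def fetch(url, timeout=10.0):
    """Download a URL and return (status, body).

    urlopen raises an exception on HTTP errors (4xx/5xx) and on TLS or
    certificate problems, which is the kind of failure worth ruling out.
    """
    request = urllib.request.Request(url, headers={"User-Agent": "sitemap-check/0.1"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read()


def list_locations(xml_bytes):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_bytes)
    return [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]


# Hypothetical two-URL sitemap used to exercise the parser offline.
SAMPLE_SITEMAP = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/product-1</loc></url>
</urlset>"""

print(len(list_locations(SAMPLE_SITEMAP)))
```

If `fetch` fails with a certificate error, or `list_locations` raises a parse error or returns far fewer URLs than expected, that points at the sitemap itself rather than at Google's crawl rate.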