Crawling issue
-
Hello,
I am working on a new Magento website that is 3 weeks old. In GWT, under Index Status > Advanced, I can see only 1 crawl, on the 4th day after launch, and I don't see any numbers for indexed or blocked status.
| Total indexed | Ever crawled | Blocked by robots | Removed |
| 0 | 1 | 0 | 0 |
I can see traffic in Google Analytics, and the website appears in the SERPs when I search for some of the keywords, so the links do show up on Google, but I don't see any numbers in GWT. As far as I can tell there is no 'noindex' or robots block issue, but Google doesn't seem to crawl the website for some reason.
Any ideas why I cannot see any numbers for indexed or crawled status in GWT?
Thanks
Seda
-
Thanks Davenport and Everett. I've already submitted an XML sitemap and checked robots.txt, noindex, etc., but there are no stats yet. I'll wait a few more weeks, but it just doesn't make sense not to get any stats after a month. Meanwhile, if I figure anything out, I'll reply here.
-
The data in GWT is not always updated regularly. Also, for a new site that has never been indexed before and has no, or few, external links, it would not be surprising to experience infrequent crawls. The more links you earn and the more of a history of fresh content and updated pages you develop, the more often and deeply you'll be crawled.
As Davenport-Tractor mentioned, an XML sitemap submitted to GWT will also help if you haven't done that already.
If most of your pages are indexed when you do a (site:yourdomain.com) search on Google I wouldn't worry about it too much. If they aren't indexed, you may have a problem, such as inadvertently blocking the crawlers via robots meta tag or robots.txt file. I'd have to see the site to know that though.
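Both kinds of accidental block mentioned above can be checked without waiting for GWT to update. Here is a minimal Python sketch, using only the standard library, that tests a robots.txt file and a page's robots meta tag. The helper names `blocked_by_robots_txt` and `has_noindex` are hypothetical, the rules and URLs are placeholders, and it assumes you have already downloaded the robots.txt and the page HTML as text.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def blocked_by_robots_txt(robots_txt, url, agent="Googlebot"):
    """Return True if the robots.txt text disallows `agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # content looks like "noindex, follow"; split into single directives
            for directive in attr.get("content", "").split(","):
                self.directives.append(directive.strip().lower())


def has_noindex(html):
    """Return True if the page HTML carries a noindex (or none) robots directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives or "none" in finder.directives


if __name__ == "__main__":
    # Placeholder rules and markup, for illustration only.
    sample_rules = "User-agent: *\nDisallow: /checkout/"
    print(blocked_by_robots_txt(sample_rules, "https://example.com/checkout/cart"))
    print(has_noindex('<meta name="robots" content="noindex, follow">'))
```

Keep in mind that `urllib.robotparser` does simple prefix matching, so its verdict may differ from Googlebot's for rules that rely on wildcard patterns. It also won't catch a noindex sent in an `X-Robots-Tag` HTTP header, so treat the result as a first check only.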
-
Seda,
Have you submitted a sitemap to GWMT?
That will greatly help the Google spiders crawl your site. Kind of like telling someone how to find your business vs providing them a road map. They will get there a whole lot quicker if you provide a map on how to find all the different locations.
There are quite a few different sitemap generator programs available. These programs will crawl your site and build the sitemap.xml file for you. You can then save the file to your website's root directory and point GWMT to the sitemap.
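If you already have a list of your URLs (for example, exported from Magento), you can also build the file yourself. This is a rough Python sketch of that idea, assuming a plain list of URLs. The function name `build_sitemap` and the example URLs are made up for illustration, and it writes only the required `<loc>` element for each page.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL automatically.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder URLs; replace with the real pages of your site.
    pages = ["https://example.com/", "https://example.com/category/shoes.html"]
    with open("sitemap.xml", "w", encoding="utf-8") as handle:
        handle.write(build_sitemap(pages))
```

Upload the resulting sitemap.xml to the site root and submit its URL in GWMT as described above.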