Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Google stopped crawling my site. Everybody is stumped.
-
This has stumped the Wordpress staff and people in the Google Webmasters forum.
We are in Google News (have been for years), and so new posts are crawled immediately. On Feb 17-18 Crawl Stats dropped 85%, and new posts were no longer indexed (not appearing on News or search). Data highlighter attempts return "This URL could not be found in Google's index."
No manual actions by Google. No changes to the website; no custom CSS. No Site Errors or new URL errors. No sitemap problems (resubmitting didn't help). We're on wordpress.com, so no odd code. We can see the robot.txt file.
Other search engines can see us, as can social media websites. Older posts still index, but loss of News is a big hit. Also, I think overall Google referrals are dropping.
We can Fetch the URL for a new post, and many hours later it appears on Google and News, and we can then use Data Highlighter.
It's now 6 days and no recovery. Everybody is stumped. Any ideas?
I just joined, so this might be the wrong venue. If so, apologies.
-
I'm sure you've gone over this and if so feel free to ignore me but did you make sure your sitemap is up to scratch - https://support.google.com/news/publisher/answer/184732?hl=en
You can also look to see if your data is up to par - https://developers.google.com/structured-data/testing-tool/
You can always work back from the date it stopped working see if you've done anything that may have caused this (some one changing a tag etc.) lastly there is always the giant list of things to check for Google news - http://www.adamsherk.com/seo/the-most-common-google-news-errors/
Best of luck and let us know how you get on.
-
Of course, I should have included that: Fabiusmaximus.com
-
Are you able to share the website that we're talking about?
-
Thanks for the ingenious advice, out of the box thinking. However, we're still in Google News. When we force Google to index a post via Fetch, it shows up in News. Plus the New Publisher page tells you that you're in News.
It's some sort of technical problem. A weird one.
-
My guess is that it's less of a technical constraint and more of a non-match for Google News guidelines. There are several things in there that could trigger the boot, (https://support.google.com/news/publisher/answer/40787) such as:
Stick to the news--we mean it! Google News is not a marketing service. We don't want to send users to sites created primarily for promoting a product or organization, or to sites that engage in commerce journalism. If your site mixes news content with other types of content, especially paid advertorials or promotional content, we strongly recommend that you separate non-news types of content. Otherwise, if we find non-news content mixed with news content, we may exclude your entire publication from Google News.
Journalistic standards. Original reporting and honest attribution are longstanding journalistic values. If your site publishes aggregated content, you will need to separate it from your original work, or restrict our access to those aggregated articles via your robots.txt file.
News content. Sites included in Google News should offer timely reporting on matters that are important or interesting to our audience. We generally do not include how-to articles, advice columns, job postings, or strictly informational content such as weather forecasts and stock data.
And so on... If you think you're in the clear try going to the Google News specific help center (on the previously linked page) and submit your site for inclusion.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Removing site subdomains from Google search
Hi everyone, I hope you are having a good week? My website has several subdomains that I had shut down some time back and pages on these subdomains are still appearing in the Google search result pages. I want all the URLs from these subdomains to stop appearing in the Google search result pages and I was hoping to see if anyone can help me with this. The subdomains are no longer under my control as I don't have web hosting for these sites (so these subdomain sites just show a default hosting server page). Because of this, I cannot verify these in search console and submit a url/site removal request to Google. In total, there are about 70 pages from these subdomains showing up in Google at the moment and I'm concerned in case these pages have any negative impacts on my SEO. Thanks for taking the time to read my post.
Technical SEO | | QuantumWeb620 -
Site indexed by Google, but (almost) never gets impressions
Hi there, I have a question that I wasn't able to give it a reasonable answer yet, so I'm going to trust on all of you. Basically a site has all its pages indexed by Google (I verified with site:sitename.com) and it also has great and unique content. All on-page grades are A with absolutely no negative factors at all. However its pages do not get impressions almost at all. Of course I didn't expect it to be on page 1 since it has been launched on Dec, 1st, but it looks like Google is ignoring (or giving it bad scores) for some reason. Only things that can contribute to that could be: domain privacy on the domain, redirect from the www to the subdomain we use (we did this because it will be a multi-language site, so we'll assign to each country a subdomain), recency (it has been put online on Dec 1st and the domain is just a couple of months old). Or maybe because we blocked crawlers for a few days before the launch? Exactly a few days before Dec 1st. What do you think? What could be the reason for that? Thanks guys!
Technical SEO | | ruggero0 -
How to remove all sandbox test site link indexed by google?
When develop site, I have a test domain is sandbox.abc.com, this site contents are same as abc.com. But, now I search site:sandbox.abc.com and aware of content duplicate with main site abc.com My question is how to remove all this link from goolge. p/s: I have just add robots.txt to sandbox and disallow all pages. Thanks,
Technical SEO | | JohnHuynh0 -
What is the best way to find missing alt tags on my site (site wide - not page by page)?
I am looking to find all the missing alt tags on my site at once. I have a FF extension that use to do it page by page, but my site is huge and that will take forever. Thanks!!
Technical SEO | | franchisesolutions1 -
Does Google pass link juice a page receives if the URL parameter specifies content and has the Crawl setting in Webmaster Tools set to NO?
The page in question receives a lot of quality traffic but is only relevant to a small percent of my users. I want to keep the link juice received from this page but I do not want it to appear in the SERPs.
Technical SEO | | surveygizmo0 -
Should we use Google's crawl delay setting?
We’ve been noticing a huge uptick in Google’s spidering lately, and along with it a notable worsening of render times. Yesterday, for example, Google spidered our site at a rate of 30:1 (google spider vs. organic traffic.) So in other words, for every organic page request, Google hits the site 30 times. Our render times have lengthened to an avg. of 2 seconds (and up to 2.5 seconds). Before this renewed interest Google has taken in us we were seeing closer to one second average render times, and often half of that. A year ago, the ratio of Spider to Organic was between 6:1 and 10:1. Is requesting a crawl-delay from Googlebot a viable option? Our goal would be only to reduce Googlebot traffic, and hopefully improve render times and organic traffic. Thanks, Trisha
Technical SEO | | lzhao0 -
Delete old site but redirect domain to a new domain and site
I just have a quick query and I have a feeling about what the answer is so just wanted to see what you guys thought... Basically I am working on a client site. This client has a few other websites that are divisions of their company. However these divisions/websites are no longer used. They are wanting to delete the websites but redirect the domains to their name main website. They believe this will pass on SEO benefits as these old division sites are old and have a good PR and history. I'm unsure for DEFINITE, which way is correct?
Technical SEO | | Weerdboil0 -
Crawling image folders / crawl allowance
We recently removed /img and /imgp from our robots.txt file thus allowing googlebot to crawl our image folders. Not sure why we had these blocked in the first place, but we opened them up in response to an email from Google Product Search about not being able to crawl images - which can/has hurt our traffic from Google Shopping. My question is: will allowing Google to crawl our image files eat up our 'crawl allowance'? We wouldn't want Google to not crawl/index certain pages, and ding our organic traffic, because more of our allotted crawl bandwidth is getting chewed up crawling image files. Outside of the non-detailed crawl stat graphs from Webmaster Tools, what's the best way to check how frequently/ deeply our site is getting crawled? Thanks all!
Technical SEO | | evoNick0