I'm pulling my hair out trying to figure out why Google stopped crawling. Any help is appreciated.
-
This is going to be kind of long, simply because the domain name has a background that isn't typical for anybody in the world, really, and I'm not sure whether it was penalized or ranked lower because of that. So I'm including it in the hope that, given the full picture, some nice soul who knows more about this than I do sees something or knows something and can point me in the right direction.
Our site has been around for a few years. At one point the domain was seized by Homeland Security (ICE), and then they had to give it back in Dec., which sparked a lot of the SOPA/PIPA stuff, and we became the poster child, so to speak. The site had previously been up since 2008, but due to that whole mess it was down for 13 months on the dreaded seized server with a scary warning graphic and site title, which quite obviously caused a bunch of 404 errors and who knows what other damage to anything we'd had before that as far as PageRank and incoming links. We had a lot of incoming links from high-quality sites. We were advised upon getting the domain back to pretty much scrap all the old content that was on the site and just start fresh, which we did. Googlebot started crawling slowly, but as we got back into the swing of things people started linking to us, some with high PageRank, and we were getting indexed quite frequently and ranking high in search results in our niche. Then something happened. On March 4th we had arguably our best day of Google traffic; we'd been linked by places like Huff Post for content in our niche. The next day it was literally a freefall. Darn near nothing. I've attached a screenshot from Webmaster Tools so you can see how drastic it was.
I went crazy trying to figure out what was wrong, searching obsessively through Webmaster Tools for any indication of a problem. I searched the site on Google with site:dajaz1.com, and what comes up is page 2, page 3, page 45, page 46. It's also taken to indexing our category and tag pages and even our search pages. I've now set those all to noindex, follow, but when I look at where the Googlebots are on the site, they're on the categories, pages, author pages, and tags. Some of our links are still getting indexed, but on a search for just our site name we're ranking below many of the media sites that have written about our legal issues, when a month ago we were at least the top result for our own name. I've racked my brain trying to figure out the issue. I've disabled plugins, and I'm on Fetch as Googlebot all the time making sure our stuff is at least coming back as 200 (we had 2 days where we were getting 403 errors due to a super-cache issue, but once that was fixed Googlebot returned like it never left). I've literally watched 1000 videos, read 100 forums, added SEO plugins, and tried to optimize the site to the point I'm worried I'm overdoing it, and still they've barely begun to crawl. As you can see there is some activity in the last 2-3 days, but even after submitting a new sitemap once I changed the theme out of desperation, it's only indexed 16.
I've looked for errors all through Webmaster Tools and I can't find anything to tell me why that happened, how to fix it, or how to get Googlebot to like us again. I'm pulling my hair out here. The links we have incoming are high-quality links from Huffington Post, Spin, Complex, etc. Those haven't slowed down at all, and our outgoing links go to high-quality sites we trust as well. I've got interns working on how they're writing titles and such, I've gone through and attempted to fix duplicate pages and titles, and I've been going through and rewriting meta description tags. What am I missing? I'm pulling my hair out trying to figure out what the issue is. Eternally grateful for any help provided. [Attached screenshot: jnzb6.png]
-
Glad to hear there's at least some progress. So often in these cases, I'm seeing that there are multiple things going on, and all you can do is fix them one at a time until the situation gets clearer. Give it a little time and then make the next set of changes - hopefully, it's all cumulative. Just make sure you measure progress and give things time to work. Where I see the most problems is when someone makes a change, it doesn't work in 3-4 days and then they reverse it or pile on new changes. Half the time, Google hasn't even re-cached 90% of the pages at that point.
-
Well, Google made a strong return yesterday, so that definitely helped. Not sure if they're indexing us, though my sitemap no longer says 16 indexed and now says 2000 indexed. I'm still not positive how many of those are pages and categories, but it looks like some of the posts are starting to show up under site:dajaz1.com.
Thank you so much for your help. I'm not sure the issue is fixed (I currently have all SEO plugins disabled and I'm too scared to turn them on), but at least Google is crawling again and the next few days should tell the tale.
Another thing I did was get rid of the sub-categories and turn them into tags. Hopefully this will help with the site structure, even if we take an initial hit with 404s. At this point I don't want the tags indexed anyway, since I want Google to focus on the posts.
-
Sorry - my browser auto-completed the "/350" paginated URL. That change looks good, actually - it'll simplify the crawl path and should prevent those weird, empty results pages. Not sure why those are happening (and you may still need to fix something), but sticking to "Older" / "Newer" should help. I doubt this is the whole problem, but it's definitely an issue.
-
SORRY: SEE NEXT COMMENT
The current and prev/next link are just showing up as blank for me now (no articles). Something's off in the database, I suspect. It's almost like the "newest" articles are blank.
-
I removed the pagination navigation and just left the older and newer links to take care of that issue. Let's see if that works at all. I removed all the drop-down navigation, though to be honest I'd added that in after this all started, hoping to set up some type of site structure, which I clearly didn't do a good job of.
I'm ok with radical at this point, I just want our posts to start getting indexed again. I really appreciate the suggestions..
-
The category/tag issue is tricky - sometimes, it does make sense to NOINDEX/FOLLOW, if those categories and tags spin out into a ton of internal search pages. My worry is that you're noindex'ing main navigation links - which is really a mixed signal. Those pages are important enough to get top billing, but not to index (?)
I have a feeling the best answer is a balance, but that takes a really deep knowledge of the site to sort out. If it's only been a week, it's really tough to tell. On a big site and deeper pages, it can take weeks to sort out, I'm afraid to say. NOINDEX is far from instantaneous.
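To see which templates actually carry the NOINDEX tag, a quick audit of the page source can help. The following is only a minimal sketch using Python's standard library; the two HTML snippets are hypothetical stand-ins for a category page and a blog post, not the site's real markup.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # A value can be None for bare attributes, so fall back to "".
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            for part in attr.get("content", "").split(","):
                part = part.strip().lower()
                if part:
                    self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_noindexed(html):
    """True if the page asks search engines not to index it."""
    return "noindex" in robots_directives(html)


# Hypothetical source of a category page set to NOINDEX, FOLLOW.
category_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
# Hypothetical source of a blog post with no robots meta tag at all.
post_page = "<html><head><title>A post</title></head></html>"
```

Running `is_noindexed` over the saved source of each main-navigation page would show whether those top-level links are the ones being excluded, which is the mixed signal described above.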
Your main categories could potentially get ranked over your blog posts in some cases, but again, that's partially the site architecture. If you don't want them to rank, why are they main navigation links? Another, more radical option would be to kill the drop-down menus and add a layer (each main-nav home page would have left-side links or something to that effect) - that would drive more link-juice to the main sections and articles, and less to the sub-categories.
There's something very wrong with your article/blog navigation. If I click the ">>", I jump to page 417, but then there are dozens of empty pages. This is part of why all those paginated versions are getting indexed. I don't know if this is a WP issue or a quirk of your implementation (maybe a bunch of blank entries in the database), but it's creating dozens or hundreds of empty pages for Google to crawl. Since that's available right from the home-page, it's definitely a problem.
-
Thanks for the response. Honestly I don't know why it's indexing the pages like that. I was told to make the categories and tags noindex, follow because google is indexing us, but it's indexing our pages, categories and tags. We've even had search results pages show up.. But it doesn't seem to like our actual posts.
I've turned the WP SEO plugin off because honestly I'm just confused and not sure what to do. The site was innocent, so I would think that it wouldn't be penalized for it. It was (and still sorta is) a case that's in the mainstream media, and they are fully aware of the situation all the way up the food chain at Google. We were very active in the anti-SOPA/PIPA effort and were used by companies like Google and the EFF as an example of why it was a bad law. I can't see Google penalizing us for that.
Searching site:dajaz1.com, none of our blog posts come up. I don't understand why it's not ranking them but will rank our pages with numbers behind them, or categories and tags, nor do I understand how to fix it.
We've been noindex, follow on the category, tag, page, search, and author pages for about a week now, and it's not indexing new stuff, and it's still not indexing our actual blog posts.
So if I should set the categories in the nav bar to index, what's to stop Google from indexing them over our blog posts? Because our posts are not ranking at all (but our category pages are?).
That scroller at the top I think you're talking about is a gawd-awful Nike ad. We're on an ad network so we're stuck with it. However, that appeared after all this ranking stuff started.
I'm confused about the site speed. It loads very quickly for me and scores an 86/100 on Google's page test and 87/100 on webpagetest.org, yet it shows slow load in Analytics and Webmaster Tools, and occasionally people will tell me it loads slow for them. Unless it's that ad, in which case there's not much I can do about it. We're in a contract with the ad network (which is a very reputable and well-known one).
-
Unfortunately, it can be very difficult to separate a bad history from a penalty from a large-scale technical problem, especially on large sites. I've seen many people assume they got hit by Panda when it was really a link-based penalty, and vice versa. The site's history makes this go from difficult to nearly impossible, at least without a very deep dive, but I'll see what I can see.
Alan's right on one thing - Google Webmaster Tools has huge gaps in what they warn you about, and it's typically only manual penalties. Many sites have massive problems that never trigger a warning from Google.
I notice that you're NOINDEX'ing even high-level pages (in the navigation), such as:
http://dajaz1.com/music/alternative/
That seems like a bad message to Google - if it's important enough to appear in navigation, it's important enough to index. That's a pretty extreme culling of pages.
The paginated content is a bit of a mess, such as:
In some cases, these don't even seem to return any results, so I'm not sure how they got crawled in the first place. The trick with META NOINDEX here is that, until Google re-crawls, they won't process the tag. This gets tricky, but I'd recommend a couple of possibilities:
(1) If the page returns no results, 301-redirect to the last page of search that has results.
(2) If none of these pages have search value, you could block "/page" as a folder in Google Webmaster Tools. This is a bit dangerous, so I'd want to make sure none of these pages had search value.
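Option (1) can be sketched roughly as follows. This is only an illustration of the decision logic, not WordPress code: the `/page/<n>/` URL pattern, the function names, and the posts-per-page default are all assumptions made for the example.

```python
import math


def last_page_with_results(total_posts, posts_per_page):
    """Highest page number that still lists at least one post (minimum 1)."""
    return max(1, math.ceil(total_posts / posts_per_page))


def resolve_paginated_request(page, total_posts, posts_per_page=10):
    """Return (status, location) for a request to /page/<page>/.

    Pages past the end of the archive get a 301 to the last page that
    has results; every other page is served normally with a 200.
    The /page/<n>/ path is an assumed URL pattern, not taken from the site.
    """
    last = last_page_with_results(total_posts, posts_per_page)
    if page > last:
        return 301, f"/page/{last}/"
    return 200, None
```

With a hypothetical archive of 95 posts at 10 per page, a request for page 40 would come back as `(301, "/page/10/")`, while page 10 itself would be served as `(200, None)`.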
Are you getting any page load (speed) warnings? The initial hit on your site is massive - about 4MB, by my count, with a ton of JavaScript, most of which just fuels the rotating images at the top (which are loading very slowly on my machine). It seems like overkill, both from an SEO and usability standpoint, and is probably in the way of your recovery. I'd seriously consider stripping down the size of the code and pruning back some of the active elements for a while.
If you can re-open the important paths, get rid of the thin content (this is going to be complicated and probably involves multiple steps), and speed up the site, you'll know enough to see if this is a technical issue (such as Panda).
There is certainly weak ranking even on your indexed pages, which could indicate a penalty, but it's really tough to tell. Too much of your content is competitive or uses shared phrases or videos, so it's hard to see whether a search for:
"Dwayne Wade 70-Foot Buzzer Beater" ...has you in 6th place for competitive reasons or because your site has been devalued. I don't think it's a penalty, at least in this case. It's a YouTube video and there are other, similar videos for a fairly recent, competitive term, so this may be an accurate ranking (in Google's POV).
The history is a lot tougher, and Q&A just isn't adequate to comment on a situation that complicated, as there are not only SEO but legal ramifications. Honestly, I'd have to know a lot more details on that. If you suspect the history has hit you permanently, there may come a time when you have to completely re-brand and re-launch under a new domain.
I suspect, though, that cleaning up the crawl problems, removing the thinnest content, speeding up the site, and generally fixing some technical issues could help quite a bit. It's going to be a difficult process, though. The thing about changes like the Panda Update is that it's not just one factor. I can't point to one thing and say "fix this" - you have to aggressively attack multiple factors, since Google is wrapping multiple signals into Panda and won't tell you which one is the problem.
I should say that I'm not saying this is Panda, but that it's a Panda-like situation - you've got a lot of crawl/index issues that are going to cause you problems. The question is whether those are compounded by your history (and, unfortunately, they probably are). The combination means that you have to be even more aggressive with the clean-up.
-
Well, I'm sorry I wasted your time. Like you, I'm looking for answers, and that's all. Don't worry, I won't respond to any of your threads again. I did the best I could.
If you look back through the things I said, you may find something useful. Don't underestimate your own powers of investigation and testing.
I can almost guarantee you this - there is no simple answer. Nobody has it, other than google. The only way to work out what it is - is to ask, listen, experiment and test. There are a lot of smart people in SEOMoz, and if they haven't responded, it is because they don't have "the answer" you want.
And just remember, nobody responding to you here, if they do respond, is making any money from giving you suggestions, nor are they trying to wind you up. This is a voluntary thing of people helping people. The only possible reward is a thumbs up on a post, but I see I didn't give you anything that was even remotely useful.
I hope you figure it out.
-
When I post something on the internet, freaked out or not, that's never an invite to go looking for my personal contact information to contact me outside of this medium.
Anyway.. thanks for your help. My situation might not be like yours. I was looking for actual advice, tips, and knowledge on things that might really be wrong. What I got was "Google hates your guts, but hey, I'm in the same boat, maybe they just manually told you to screw off."
How does that help me in any way when what I'm looking for are things on my site that I can use to improve my situation. I know you're well meaning, but you're not giving me any answers here. I'm trying to improve the SEO on my site.. not get phone calls from guys telling me their story in the middle of the night because I made a mistake in putting my url. I'll ask my question in another thread elsewhere.
Thanks
-
it was the only number I could find - and you seemed to be pretty well freaked out about the situation.
I am in Australia right now, and I didn't know you were on the east coast.
Since the timezone changed, I messed up the time calculation, sorry.
-
Melissa.
If anyone knew the answer, they would be going around selling the ability to fix panda and other problems, but I don't think there is anyone outside of google who knows the answer. We are on our own. The same thing that happened to you happened to us over 12 months ago and I'm still struggling to find a cure.
Working on a drastic change next week, but that change will not help you - your site is much too small for this change.
Not a lot of this makes any sense.
Linking to other sites - so is everyone else and they are not affected.
short content - so are many others and we're not hearing from them
some racy content - maybe, but look at all the others.
You have many incoming links from big sites. Maybe their algorithm is so paranoid about new links that it has hit you.
Here are just about my final ideas for you, after this, I can't think of anything.
1. Do you have any copyright content that someone could have reported you for?
2. Are you linking out to any bad sites - check all of your links - including the ones in your footers and sidebars.
3. check your pages with "fetch as googlebot" in case they can't find what they want
4. Check to see if you've been linkbombed with a ton of bad links, by someone who doesn't like you.
5. Register your writers with a google profile and link from their profile on their site to their google profile
You said you checked WMT many times, so you didn't get an unnatural links message
Two possible reasons nobody else has answered - they don't have an answer - or maybe they are too frightened to connect with you, because of the takedown.
-
Yeah well, I prefer to think it's not that, since that would be a pretty big deal, not to mention there is nothing going on on that page that would be worthy of a penalty. Having said that, I'm pretty disappointed with SEOmoz in that nobody else is responding to this. Certainly not worth the money per month.
And the phone call from you at 1:30am to the number listed on our DMCA page was really REALLY creepy..
-
Melissa, I doubt you'd get anything in webmaster tools.
That is because it would be a manual penalty, and I have no idea whether they would admit to it or not.
-
Anyone? I've posted this here and on Google's webmaster forums, and so far all the responses I've gotten are that Google has pretty much told me to screw off, plus marveling at how drastic a fall it was, but nobody can give me any real indication as to why. It's very frustrating. Granted, we don't write big long drawn-out paragraphs on a lot of our content, and it's content that's elsewhere, but we're usually first or close to it with a lot of things, and we've earned enough respect by being quick with the exclusives that we're linked in Huff Post's "from around the web" and linked to as a source by Spin and MTV. We ranked high on things like the Super Bowl and Whitney Houston's death, so the sudden fall is shocking, as we're certainly not a scrape, copy and paste site. We get scraped and copied and pasted a lot.
-
Well, considering that the domain was seized for 13 months by the government, and that when they had to give the domain back the timing was such that it got the ball rolling on an internet uproar that cost them 2 bills, I'd say we've got some enemies..
But in that case wouldn't we get some sort of notification from Google in webmaster tools? There is nothing showing there..
-
It is possible that one or more people, or a group, has reported you for something.
Some time ago, someone emailed us to say they had reported us to google for something. I don't know what it was, but about a week later, our ranking dropped.
When you have two million pages, it's a bit hard to work out what might have caused it.
-
exactly.. I need to know what is causing it.
-
That is part of the problem - other sites ranking above you.
But that's not the reason it is happening.
The reason is something else - this is just the fallout from that.
I was just searching for one of our stories that I wrote myself.
Everyone that links to us - and even sites that copied the story, with my byline on it - they are all ranking, but we're not.
It's killing us.
-
Actually we do get original stuff, and post exclusives all the time. The problem is there are a lot of blogs who then turn around and repost the same thing we post. Not that they don't link back or credit, because the bigger and reputable blogs like Spin and MTV etc. do; it's the trillion blogs that just copy and paste.
As for the length of content we're working on that, but with the type of blog we have it's often difficult to write out paragraphs on a freestyle that isn't attached to any project.
I need to figure out what the issue is.. it's going to make me crazy until I do.
-
Melissa,
That is a dramatic drop, much worse than ours was.
The issue of Google indexing index pages and showing them and tag pages in SERPs, but not the original, is typical.
For some reason I don't see yet, google hates your guts.
Your stories are showing up in Australian results, but back a page or two.
If you search for
Drake Announces Second Leg of Club Paradise Tour
you won't find it until you get to the last result and then click the link that says they will show you the results they hid.
This means they hate something on your site and until you fix it, you are out of luck.
I know, because I've been trying to fix this same thing for more than 12 months.
From looking at just a few of your pages, it appears that you have nothing that is 100% original and unique to your site. That probably means they see you as an aggregator or content farm.
I could be wrong, but that's what it looks like.
Show me something that is 100% unique to you.
Some of your content is very thin, too, 20-30 words and a video.
That probably makes your content look like it is mostly duplicate, with just a few words changed.
Your sitemap looks OK.