Massive Amount of Pages Deindexed
-
On or about 12/1/17, a massive number of my site's pages were deindexed. I have done the following:
- Ensured all pages are "index,follow"
- Ensured there are no manual penalties
- Ensured the sitemap correlates to all the pages
- Resubmitted to Google
ALL pages are gone from Bing as well.
In the new SC interface, there are 661 pages that are Excluded, with 252 being "Crawled - currently not indexed: The page was crawled by Google, but not indexed. It may or may not be indexed in the future; no need to resubmit this URL for crawling." What in the world does this mean, and how the heck do I fix this? This is CRITICAL. Please help!
The url is https://www.hkqpc.com
-
The report was run prior to the canonical directives being added.
Also, remember to noindex your robots.txt:
https://yoast.com/x-robots-tag-play/
There are cases in which the robots.txt file itself might show up in search results. By using an alteration of the previous method, you can prevent this from happening to your website:
<FilesMatch "robots.txt">
Header set X-Robots-Tag "noindex"
</FilesMatch>
**And in Nginx:**
location = /robots.txt {
add_header X-Robots-Tag "noindex";
}
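Once a header rule like the ones above is in place, it may help to confirm what the server actually sends. Here is a minimal Python sketch, not an official tool: the function names `has_noindex` and `fetch_x_robots_tag` are made up for illustration, and the parsing only handles the common comma-separated form of the header.

```python
import urllib.request


def has_noindex(x_robots_tag: str) -> bool:
    """Return True if an X-Robots-Tag header value contains a noindex directive."""
    directives = [part.strip().lower() for part in x_robots_tag.split(",")]
    # A directive may carry a user-agent prefix, e.g. "googlebot: noindex",
    # so only the text after the last colon is compared.
    # "none" is shorthand for "noindex, nofollow".
    return any(d.split(":")[-1].strip() in ("noindex", "none") for d in directives)


def fetch_x_robots_tag(url: str) -> str:
    """Send a HEAD request and return all X-Robots-Tag values joined by commas."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        values = response.headers.get_all("X-Robots-Tag") or []
    return ", ".join(values)


# Offline examples of the parsing step:
print(has_noindex("noindex"))                      # True
print(has_noindex("googlebot: noindex, nofollow"))  # True
print(has_noindex("index, follow"))                # False
```

To check a live file, something like `has_noindex(fetch_x_robots_tag("https://www.hkqpc.com/robots.txt"))` should work, assuming the server answers HEAD requests.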
-
Looking at the first report, "Redirect Chains": as I understand the table, these are correct.
Column A is the page (source) with the redirecting link
Column B is the link that is redirecting (http://www.hkqlaw.com)
Column C shows 2 redirects happening
Column I shows the first redirect (http://www.hkqlaw.com -> http://www.hkqpc.com) (non-SSL version)
Column N shows the second redirect (http://www.hkqpc.com -> https://www.hkqpc.com) (SSL version)
The original link (hkqlaw.com) is a link in the footer of our news section, so it is common on those pages, which is why it shows up so often. So, like I said, this appears to be correct.
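To make the two-hop chain in that table concrete, here is a small Python sketch that walks a redirect map and counts the hops. The map only contains the two redirects described above; the function name `trace_redirects` and the `max_hops` limit are assumptions for illustration, and nothing here talks to the live server.

```python
def trace_redirects(start, redirects, max_hops=10):
    """Follow a mapping of source URL -> redirect target and return the full chain."""
    chain = [start]
    while chain[-1] in redirects and len(chain) <= max_hops:
        target = redirects[chain[-1]]
        if target in chain:
            # Stop on a redirect loop instead of following it forever.
            break
        chain.append(target)
    return chain


# The two redirects from the "Redirect Chains" report.
redirects = {
    "http://www.hkqlaw.com": "http://www.hkqpc.com",
    "http://www.hkqpc.com": "https://www.hkqpc.com",
}

chain = trace_redirects("http://www.hkqlaw.com", redirects)
hops = len(chain) - 1
print(" -> ".join(chain))
print("hops:", hops)
```

If the first redirect pointed straight at https://www.hkqpc.com, the same function would report one hop instead of two, which is usually what a crawler report means by "fixing" a chain.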
I added the canonical directives to the pages earlier so perhaps that report was run prior to me doing that?
Again, thanks so much for your effort in helping me!
-
Now I'm really baffled. I just ran Screaming Frog and don't see any of the redirects or other stats. Which software are you using that is showing this information? I'm trying to replicate it and figure out if there's something, somewhere else doing this.
-
Wow, I got it.
You're 301 redirecting a ton of URLs back to the homepage.
- Redirect chains https://bseo.io/cZW0w0
- internal URLs https://bseo.io/4sFqUk
- insecure content https://bseo.io/YDDKGD
- no canonical https://bseo.io/fWey1Q
- crawl overview https://bseo.io/Zg6bpM
- canonical errors https://bseo.io/YtTh7W
-
Ok, canonical is set for each page (and I fixed the // issue). I used x-robots header to noindex the robots.txt and sitemap.xml files, along with a few other extensions while I was at it.
I'll get the secured cookie header set after this is resolved. We don't store any sensitive data via cookies for this site so it's not of immediate concern but still one I'll address.
EDIT: The https://www.hkqpc.com/attorney/David-Saba.html/ page no longer exists which was the cause of the errors. I've redirected that to the appropriate page.
-
https://cryptoreport.websecurity.symantec.com/checker/
This server cannot be scanned for these vulnerabilities:
- Heartbleed: Server scan unsuccessful.
- Poodle (TLS): Server scan unsuccessful.
- BEAST: This server is vulnerable to a BEAST attack.
I am sorry I said your IP was Network Solutions when it was 1&1. I still strongly recommend changing hosting companies, even though I am German and so is 1&1.
DNS resolves www.hkqpc.com to 74.208.236.66
The SSL certificate used to load resources from https://www.hkqpc.com will be distrusted in M70. Once distrusted, users will be prevented from loading these resources. See https://g.co/chrome/symantecpkicerts for more information.
Look: https://cl.ly/pCY5
Look: https://cl.ly/pAKa
Symantec SSL certificates are now owned by DigiCert:
https://www.digicert.com/help/
https://www.dareboost.com/en/report/5a70b33e0cf28f017576367f
The Set-Cookie HTTP header can be configured with your Apache server. Make sure that the mod_headers module is enabled. Then, you can specify the header (in your .htaccess file, for example). Here is an example:
<IfModule mod_headers.c>
# only for Apache > 2.2.4:
Header edit Set-Cookie ^(.*)$ $1;HttpOnly;Secure
# lower versions:
Header set Set-Cookie HttpOnly;Secure
</IfModule>
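As a rough way to verify the result of a rule like that, the sketch below inspects a raw Set-Cookie header value and lists which of the two flags are still missing. It is only an illustration: `missing_cookie_flags` is a made-up helper, and the `PHPSESSID` cookie values are hypothetical sample input, not something taken from the site.

```python
from typing import List


def missing_cookie_flags(set_cookie: str) -> List[str]:
    """Return the flags (HttpOnly, Secure) absent from a Set-Cookie header value."""
    # Everything after the first ";" is an attribute; the first part is name=value.
    attributes = {
        part.strip().split("=", 1)[0].lower()
        for part in set_cookie.split(";")[1:]
    }
    return [flag for flag in ("HttpOnly", "Secure") if flag.lower() not in attributes]


# Hypothetical header values for illustration only.
print(missing_cookie_flags("PHPSESSID=abc123; path=/"))
print(missing_cookie_flags("PHPSESSID=abc123; path=/; HttpOnly; Secure"))
```

An empty list means both flags are present on that cookie.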
- The robots.txt file is showing up in the SERPs (large screenshot): https://i.imgur.com/cJeDR9t.png
- The XML sitemap is showing up in the SERPs and should be noindexed (large screenshot): https://i.imgur.com/tlx5jc7.png
There are double forward slashes after "verdicts" in the first URL below, and the same page also loads without the double forward slashes. You need to add rel=canonical tags; there are zero canonicals on any page whatsoever.
- https://www.hkqpc.com/news/verdicts//hkq-attorneys-win-carbon-county-real-estate-case/
- https://www.hkqpc.com/news/verdicts/hkq-attorneys-win-carbon-county-real-estate-case/
The URLs above need a rel=canonical tag. I have created an example below for you. It should point to the page without the double forward slashes; this tells Google which one you'd prefer to have indexed, and it also keeps the query string pages and junk pages out of Google's index. Please see the resources below and add canonical tags to your website. Because I do not know what type of CMS you're using, I cannot recommend a plug-in to do it, but if you were using something like WordPress, it would be done automatically by something like Yoast WordPress SEO. For the site that you have, it may be a wise move to migrate to something like WordPress; it is a solid platform for a site that size and makes it a lot easier to implement changes across the entire site quickly.
- https://moz.com/blog/complete-guide-to-rel-canonical-how-to-and-why-not
- https://yoast.com/rel-canonical/
- https://moz.com/blog/canonical-url-tag-the-most-important-advancement-in-seo-practices-since-sitemaps
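For a site without a CMS plug-in, the canonical tag can be generated with a few lines of code. The Python sketch below is one possible approach, not the site's actual implementation: it collapses repeated slashes in the path, drops the query string and fragment (an assumption that fits the goal of keeping query string pages out of the index), and prints the tag. The function names are made up for illustration.

```python
import re
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Collapse repeated slashes in the path and drop the query string and fragment."""
    parts = urlsplit(url)
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the <head> of the page."""
    return '<link rel="canonical" href="%s" />' % escape(canonical_url(url), quote=True)


duplicate = "https://www.hkqpc.com/news/verdicts//hkq-attorneys-win-carbon-county-real-estate-case/"
print(canonical_tag(duplicate))
```

Both the double-slash URL and the clean URL would then emit the same tag, pointing at the version without the double forward slashes.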
You need to add a canonical
- Bigger photo of problem https://i.imgur.com/1qMMPSM.png
- this page https://www.hkqpc.com/attorney/David-Saba.html/
- Warning: Creating default object from empty value in /homepages/43/d238880598/htdocs/classes/class.attorneys.php on line 38
- Warning: Invalid argument supplied for foreach() in /homepages/43/d238880598/htdocs/headers/attorney.php on line 15
- **Fix for this**
- https://stackoverflow.com/questions/14806959/how-to-fix-creating-default-object-from-empty-value-warning-in-php
- http://thisinterestsme.com/invalid-argument-supplied-for-foreach/
You have
Heartbleed Vulnerability
An unknown error occurred while scanning for the Heartbleed Bug.
-
Thanks for the great feedback! The hkqlaw.com url simply forwards (301) to hkqpc.com. The IP address you have is for hkqlaw.com which is registered through Network Solutions, but hosting of hkqpc.com is on 1and1.com hosting. Also, the timeout error you're getting is because there is no SSL cert for hkqlaw.com, again, it's just forwarded to hkqpc.com (which does have an SSL attached to it). As far as SC, everything is setup to index hkqpc.com.
-
Right now I cannot get that site to load in my browser, and when I used https://tools.pingdom.com it was unable to load as well. You could be having some serious server problems, and that could be causing the issue, although I was able to run it through Screaming Frog, which is surprising.
This is a zip file of your Screaming Frog results. It will show whether there are any noindex pages; I found none. It looks to me like you have a server issue. Zip file: http://bseo.io/BXYpZh
I checked your site for malware using https://sitecheck.sucuri.net/results/www.hkqlaw.com/ (please understand this only checks the homepage and a handful of other pages) and found none, though when I checked your IP address I noticed a lot of ransomware information tied directly to your IP:
https://ransomwaretracker.abuse.ch/ip/205.178.189.131/
Here is a large screenshot of when I tried to browse your website: https://i.imgur.com/OzcLhbx.png
Here is Pingdom (remember to test on something outside of your local computer, because you have caching and other things that could give you incorrect results):
https://tools.pingdom.com/#!/bd6d52/https://www.hkqlaw.com/
In my experience, Network Solutions hosting is terrible. I would strongly suggest doing two things.
Get a better hosting company for your site.
Good managed hosts that are not too expensive include Liquid Web, Cloudways, Rackspace, and pairNIC. You can also build out your own system on non-managed hosting like Linode, DigitalOcean, AWS, Google Cloud, or Microsoft Azure. If you want a high-quality, inexpensive managed host that offers more than one backend, like the ones I've listed above, https://www.cloudways.com/en/ will host anything and manage it. If you want what I think is the best and price is not a big deal, considering you're not running WordPress, https://armor.com is my preferred hosting company. Otherwise, Cloudways or Liquid Web would be where I would host your site.
Considering you already have an IP address attached to ransomware and you're using a hosting company that will not be beneficial to you in security terms, I would add a web application firewall/reverse proxy. You can do that with https://sucuri.net/website-firewall/, https://incapsula.com, or https://fastly.com, and if you want the most basic and least secure option, but one still better than what you have, https://cloudflare.com
At the very least, put Cloudflare on there, but what I'm seeing is a severe problem coming from your web host, and knowing that hosting company, I would strongly advise you to move to a better host.
I hope this was of help,
Thomas
-
Not sure if this is of help to you, I suppose it depends how many pages you are expecting to be indexed, but according to John Mu at Google - Google does not necessarily index all pages.
https://www.seroundtable.com/google-index-all-pages-20780.html
-
Not recently. It migrated well over a year ago to HTTPS.
-
First thing to confirm - did you recently migrate to HTTPS?