WEBMASTER console: increase in the number of URLs we were blocked from crawling due to authorization permission errors.
-
Hi guys, I received this warning in my Webmaster console: "Google detected a significant increase in the number of URLs we were blocked from crawling due to authorization permission errors." So I went to the "Crawl Errors" section and found these errors under the "Access denied" status:
?page_name=Cheap+Viagra+Gold+Online&id=471
?page_name=Cheapest+Viagra+Us+Licensed+Pharmacies&id=1603
and many happy URLs like these. Does anybody know what this is and where it comes from?
Thanks in advance!
-
Thank you Tom!
-
Hi
To remove any chance of infection (and I am not telling you that I am 100% sure the site is infected):
You must be certain that the original infection was removed. If it was not, and links were created by a third party rather than by you, you are better off getting the site completely cleaned.
Use Sucuri.net to rule out a hack.
Just type this into Google
- ?page_name=Cheap+Viagra+Gold+Online&id=471
- ?page_name=Cheapest+Viagra+Us+Licensed+Pharmacies&id=1603
http://www.pearsonified.com/2010/04/wordpress-pharma-hack.php
https://blog.sucuri.net/2010/07/understanding-and-cleaning-the-pharma-hack-on-wordpress.html
https://sitecheck.sucuri.net/results/www.davidandsonsjewelers.com/articles/author/carole/
I used deepcrawl.com to create the audit you referenced, and Screaming Frog SEO Spider to create the site map.
I hope that helps,
Tom
-
Hello Thomas,
I really appreciate your help! You said I can look at your site's structure. What is your site address?
Unfortunately, I still don't know what I need to do to remove the pharma hack from my site. If you know where to point me to get the answer, I'll be very grateful.
Also, what tool did you use to generate this report http://crawl.blueprintmarketing.com/projects/reports/215533?ro=75ad0c6e4afacc428b553d449dfd281f82ec2ad6 ?
And what tool did you use to create the XML site map?
Thanks
-
I found no site map. I checked multiple common XML site map locations and came up with nothing, and found no redirects either (for example, /sitemap_index.xml might exist separately or redirect to /sitemap.xml).
http://www.davidandsonsjewelers.com/sitemap.xml shows a 404
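The check described above can be sketched in Python. This is only a rough sketch, not how deepcrawl.com does it: the list of candidate paths is an assumption (two common conventions, not an exhaustive list), and the function names are made up for illustration.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Assumption: these are only common conventions, not an exhaustive list.
CANDIDATE_PATHS = ["/sitemap.xml", "/sitemap_index.xml"]


def candidate_sitemap_urls(base_url):
    """Build absolute URLs for the common sitemap locations."""
    return [urljoin(base_url, path) for path in CANDIDATE_PATHS]


def fetch_status(url, timeout=10):
    """Return (status code, final URL after redirects) for a GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.geturl()
    except urllib.error.HTTPError as err:
        # urllib raises 4xx and 5xx responses as exceptions.
        return err.code, url


def find_sitemaps(base_url):
    """Map each candidate sitemap URL to its (status, final URL)."""
    return {url: fetch_status(url) for url in candidate_sitemap_urls(base_url)}
```

Calling find_sitemaps("http://www.davidandsonsjewelers.com") would request each candidate. A 404 on every entry matches the "no site map" finding, while a final URL that differs from the requested one indicates a redirect. Network failures such as DNS errors are not handled here.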
Tools:
deepcrawl.com, https://varvy.com/mobile/ & https://varvy.com/tools/
-
detect mobile issues
-
If I were you, I would look at the site structure and make sure it was built the way it is for the right reasons.
If your traffic is all right, you really do not want to change the site that much. If you do change the site, change it slowly.
(A great example of this is how FireHost.com is becoming Armor.com.)
The tool I primarily used to find out whether or not you had a site map was deepcrawl.com.
To detect mobile issues:
https://varvy.com/mobile/ & https://varvy.com/tools/
http://i.imgur.com/W7BDaq7.png
http://www.screamingfrog.co.uk/seo-spider/
http://i.imgur.com/LbCBmmW.png
I used Screaming Frog to create an XML site map for you here
I would definitely add an XML site map.
Sincerely,
Thomas
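For reference, an XML site map is a small file in the sitemaps.org format. Below is a minimal sketch of generating one in Python from a list of URLs. It is not what Screaming Frog produces (its output typically carries more fields), and build_sitemap is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL text.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

The returned string would be saved as sitemap.xml at the site root and then submitted in Webmaster Tools. Only the required loc element is written; optional fields such as lastmod are left out to keep the sketch short.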
-
Also, are you saying that the mobile site is blocked? And how do you see that the site doesn't have an XML site map? What tool shows you all this info?
Thanks
-
Hi Thomas,
I really appreciate your help! Can you advise me on what I should do? I see all these reports, but I don't know how I need to clean the site.
Thank you!
-
The URLs you are showing are definitely pharma hack URLs. There are certain things Sucuri is unable to detect because it is a front-end tool, not the PHP tool that would be needed for the two-part WordPress and PHP version of your site.
Just type this into Google
- ?page_name=Cheap+Viagra+Gold+Online&id=471
- ?page_name=Cheapest+Viagra+Us+Licensed+Pharmacies&id=1603
http://www.pearsonified.com/2010/04/wordpress-pharma-hack.php
https://blog.sucuri.net/2010/07/understanding-and-cleaning-the-pharma-hack-on-wordpress.html
https://sitecheck.sucuri.net/results/www.davidandsonsjewelers.com/articles/author/carole/
https://www.virustotal.com/en/ip-address/216.120.237.225/information/
http://dnsbl.inps.de/query.cgi?lang=en&ip=216.120.237.225&action=check&quick=0
-
and switch everything to WordPress
view-source:http://www.davidandsonsjewelers.com/
-
Some of your links are really not supposed to be there.
Here is your report. Please use the URL below to navigate the entire report.
Your URLs are relative for the most part; that should be fixed. You have a JavaScript redirect that definitely needs to be fixed.
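To see which links on a page are relative and what they resolve to, something like the following Python sketch could be used. It is an illustration only, not the audit tool's method. The names LinkCollector and resolve_relative_links are hypothetical, and fragment-only links such as "#top" are also reported as relative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def resolve_relative_links(html, page_url):
    """Return (relative href, absolute URL) pairs for links without scheme or host."""
    collector = LinkCollector()
    collector.feed(html)
    pairs = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        if not parts.scheme and not parts.netloc:
            pairs.append((href, urljoin(page_url, href)))
    return pairs
```

Each pair shows the href as written and the absolute URL a crawler would request, which is the value to use when replacing relative links with absolute ones.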
PDF & XML outline
- http://cl.ly/d6Sv/www.davidandsonsjewelers.com_http-www-davidandsonsjewelers-com-_13-09-2015_overview_215533.pdf
- http://cl.ly/d6S7/public-report_files-215533-www.davidandsonsjewelers.com_http-www-davidandsonsjewelers-com-_13-09-2015_overview_215533.xls
You have roughly 108 indexed URLs according to Google
https://marketing.grader.com/report/www.davidandsonsjewelers.com/overall
Unfortunately, you do not have an XML site map. I found that out in the first five minutes, but you can also find these things out using
https://moz.com/researchtools/crawl-test
Upon a quick check with another tool, I found:
http://i.imgur.com/Y60WnIc.png
I love deepcrawl; however, your site is not large, and you can learn a lot about it with
http://www.screamingfrog.co.uk/seo-spider/ (free).
I hope this is a help. Without analytics and Webmaster Tools access, I obviously cannot give you a much better picture.
Tom
-
I will run the audit now. Sorry for the delay.
-
-
The best way to solve this problem is to use
Or http://screamingfrog.co.uk SEO Spider
If you give me the URL, I will do a quick check for you.
-
Thank you Thomas,
My site is clean, though, according to Sucuri. I spoke to the owner of this website, and they said that they were hacked in the past and blocked those pages themselves. So is Google now detecting those pages again? Or what exactly is happening? Does anybody know?
Thanks
-
Remember that not every URL is in Google's index. That does not mean that your backlink is not in
https://moz.com/researchtools/ose/
You should very quickly make sure that your website is not still completely full of malware, as it sounds like it is.
Use this tool to determine what has happened to your site and whether it is infected. It is free.
If it is hacked, as I believe it may be based on what you have described, I would then purchase the malware removal and web application firewall:
https://sucuri.net/website-antivirus/
If you would like a much more secure hosting environment, https://armor.com is the best.
Once you have removed your site from the blacklists and removed all the badware/malware, make sure to crawl it in Webmaster Tools using Fetch as Googlebot.
Your nightmare should be short-lived. Sorry to hear that your site was hacked; hopefully this will get you back on track quickly.
-
Hi Dirk,
In Webmaster Tools, if I click those links one by one, I can see "Linked from" URLs. There are URLs like this:
http://schwagginwagon.com/?page_name=Buying+Tadalis+SX+Safely+No+Prescription+Tadalis+SX&id=1810
and there is also one URL coming from my domain. I'm not sure what it means.
I went through every single URL in Google's index, but all of them are normal URLs. Nothing related to spam. Any ideas?
Thanks
-
Try a search like viagra site:yourdomain.com and see if any pages of a suspicious nature are listed.
In the crawl errors section in Webmaster Tools, you can also check where these URLs are coming from (external or internal links).
If your site is hacked - you can find more info here http://www.google.com/webmasters/hacked/ on what to do next.
rgds,
Dirk
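When the list of flagged or "Linked from" URLs is long, the same triage can be done in bulk. A minimal Python sketch, assuming the spam URLs follow the ?page_name=...&id=... pattern seen in this thread. The keyword list is illustrative only, and spam_terms_in_url is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption: an illustrative keyword list, not a complete spam signature.
SPAM_TERMS = ("viagra", "tadalis", "pharmac")


def spam_terms_in_url(url):
    """Return the spam terms found in a URL's page_name query parameter."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs decodes "+" to spaces, so "Cheap+Viagra" becomes "Cheap Viagra".
    page_names = " ".join(query.get("page_name", [])).lower()
    return [term for term in SPAM_TERMS if term in page_names]
```

Running this over a crawl errors export separates pharma-style URLs from ordinary ones. An empty list means no listed term was found in page_name, not that the URL is clean.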
-
Hello Dirk,
Thank you for the fast reply! I thought so too right away. All of these URLs are forbidden when I try to access them. This is the message from Google Webmaster Tools: "Googlebot couldn't crawl your URL because your server either requires authentication to access the page, or it is blocking Googlebot from accessing your site."
Any ideas? Thanks
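One way to see what the server returns for these URLs is to request them with a Googlebot-like User-Agent and look at the status code. A minimal Python sketch follows. It assumes that "access denied" corresponds to a 401 or 403 response, which fits the quoted message but is not confirmed in this thread. The function names are hypothetical.

```python
import urllib.error
import urllib.request

# Assumption: a Googlebot-like User-Agent string for illustration. A server
# may treat the real Googlebot differently (for example, by IP address).
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def classify_status(status):
    """Map an HTTP status code to a rough crawl-error label."""
    if status in (401, 403):
        return "access denied"
    if status in (404, 410):
        return "not found"
    if 200 <= status < 300:
        return "ok"
    return "other"


def status_as_googlebot(url, timeout=10):
    """Fetch a URL with a Googlebot-like User-Agent and return the status code."""
    request = urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises 4xx and 5xx responses as exceptions.
        return err.code
```

If classify_status(status_as_googlebot(url)) gives "access denied" for the spam URLs and "ok" for normal pages, the server is blocking those paths, which would be consistent with the warning.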
-
Hi
At first sight, I would guess your site has been hacked. Do these URLs exist when you try them?
Dirk