Strange Webmaster Tools Crawl Report
-
Until recently my robots.txt blocked Google from crawling my PDF files, which are all manuals for products we sell. Last week I changed it to allow those files, and now my Webmaster Tools crawl report lists all of my PDFs as Not Found.
What is really strange is that Webmaster Tools lists an incorrect link structure: "domain.com/file.pdf" instead of "domain.com/manuals/file.pdf".
Why is Google picking up these particular URLs incorrectly? My robots.txt has nothing else in it besides a disallow for an entirely different folder on my server, and my htaccess is not redirecting anything related to my manuals folder either. Even for the outside links that the crawl report lists as pointing to these 404 URLs, when I visit those third-party pages they have the correct link structure.
I hope someone can help, because right now my Not Found errors are up in the 500s and that can't be good.
Thanks in advance!
-
Hello,
Did you check the "Linked From" tab? Click on each error and see which pages link to that URL.
-
Thanks for the help Wissam!
What I have done is change all relative paths to absolute ones. Then I ran Screaming Frog and it did not pick up any 404s at all; this was last Thursday. Unfortunately, Webmaster Tools is still reporting the same style of 404s as having been discovered since then. Is there a reason why Screaming Frog and Webmaster Tools would be seeing different crawl results?
-
Every link reported in GWT is based on a crawl, so there is either an external or an internal link pointing to these domain.com/file.pdf URLs.
So what I would do is fire up Screaming Frog or Xenu, do a full site crawl, and check the reports. You might find some pages that link to the wrong location or that use relative URLs in their a href elements.
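To illustrate why relative URLs in a href elements can produce this kind of 404, here is a minimal Python sketch using the standard library's `urljoin`, which resolves a relative href against the URL of the page it appears on. The page URLs and the helper name `resolve_href` are hypothetical, chosen only to mirror the domain.com/manuals/file.pdf example from the question; the real cause on the site may be different.

```python
from urllib.parse import urljoin


def resolve_href(page_url: str, href: str) -> str:
    """Resolve an href relative to the page it appears on."""
    return urljoin(page_url, href)


# Hypothetical linking pages and hrefs, for illustration only.
examples = [
    # Relative href on a page inside /manuals/ resolves correctly.
    ("https://domain.com/manuals/index.html", "file.pdf"),
    # The same relative href on a root-level page points at the site root.
    ("https://domain.com/index.html", "file.pdf"),
    # A page URL without a trailing slash also drops the folder.
    ("https://domain.com/manuals", "file.pdf"),
    # Including the folder in the relative href fixes the root-level case.
    ("https://domain.com/index.html", "manuals/file.pdf"),
]

for page, href in examples:
    print(f"{page} + {href} -> {resolve_href(page, href)}")
```

The second and third cases resolve to domain.com/file.pdf, which is the pattern showing up in the crawl report.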
If you find that external links are pointing to the wrong URLs, I would recommend either contacting those sites or simply adding a 301 redirect from /file.pdf to /manuals/file.pdf.
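With around 500 affected PDFs, writing the redirects by hand would be tedious. The sketch below is one possible way to generate them, assuming an Apache server with mod_alias (the question mentions htaccess) and that every misplaced URL maps to the same file name under /manuals. The function name `redirect_rules` and the sample file names are hypothetical, and file names containing spaces are not handled.

```python
def redirect_rules(filenames, target_dir="/manuals"):
    """Build one mod_alias 'Redirect 301' line per misplaced PDF.

    Assumes each wrong URL is /<name> and the correct one is
    <target_dir>/<name>.
    """
    rules = []
    for name in filenames:
        rules.append(f"Redirect 301 /{name} {target_dir}/{name}")
    return rules


# Hypothetical file names; in practice, take them from the crawl error export.
not_found = ["file.pdf", "another-manual.pdf"]

for line in redirect_rules(not_found):
    print(line)
```

The printed lines could then be reviewed and pasted into the site's .htaccess file. Test one redirect first before adding the full list.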