First Crawl Report
-
Just joined SEOMoz today and am slightly overwhelmed, but excited about learning loads from it.
I've just received my Crawl Report and it shows a 404 error:
404 : UserPreemptionError:
http://www.iainmoran.com/comments/feed/
This is a WordPress site and I have no idea what the best course of action is. I've done some searching on Google, and a couple of sites suggest removing that URL from the robots.txt file. I'm using the Yoast plugin, which apparently creates a robots.txt file, but I can't see any way to edit it. Is there another solution for resolving the 404 error?
Many thanks,
Iain.
-
John, thank you so much for your help with this. It's very much appreciated.
That does explain why I couldn't find the robots.txt file.
Is there another way to resolve the 404 error warning that SEOMoz is telling me about?
404 : UserPreemptionError:
http://www.iainmoran.com/comments/feed/
I understand that disabling comments on my WP site would fix the issue, but I don't really want to do that if it can be avoided.
Thanks again,
Iain.
-
Actually, on further investigation, blocking the includes folder does not affect ranking, as the content is loaded server side (PHP).
I don't imagine that the robots.txt file is your issue, as it's just stopping crawlers from accessing your includes and admin folders, which is what you want.
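One way to check which URLs a set of robots.txt rules actually blocks is Python's standard-library `urllib.robotparser`. This is only a minimal sketch: the rules in `ASSUMED_RULES` are an assumption about a typical WordPress setup (admin and includes folders disallowed), not a copy of the live file, and `is_blocked` is a helper name made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: typical WordPress-style rules, not copied from the live site.
ASSUMED_RULES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""


def is_blocked(rules: str, url: str, agent: str = "*") -> bool:
    """Return True if the given rules disallow `agent` from fetching `url`."""
    parser = RobotFileParser()
    # parse() accepts the file's contents as a list of lines.
    parser.parse(rules.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.iainmoran.com/wp-includes/js/jquery.js",
        "http://www.iainmoran.com/comments/feed/",
    ):
        print(url, "blocked" if is_blocked(ASSUMED_RULES, url) else "allowed")
```

Under these assumed rules the comments feed is not blocked at all, which fits the view that robots.txt is probably not what is causing the 404.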
-
Iain
I don't use WP, but I just Googled this:
"I believe WordPress itself creates the robots.txt file and it is a virtual file, meaning there won't be a hard copy on your server for you to edit. You could create one yourself and upload it to your server and I think search engines will use that one instead, or you can use a plugin like this one, that will let you edit your robots.txt file from the WordPress admin."
That explains why you can't find it!
Basically, that suggests creating one yourself and uploading it to the server.
Create a Notepad file, add your content, name the file robots.txt (the extension must be .txt), and that should work.
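If you would rather script that step than use Notepad, the file can be generated locally before uploading. A rough sketch, assuming Python 3.9 or later: the rules in `EXAMPLE_RULES` and the helper `write_robots_txt` are hypothetical and only illustrate the format, and the upload to the server (FTP or cPanel) is not shown.

```python
from pathlib import Path

# ASSUMPTION: placeholder rules for illustration; use whatever your site needs.
EXAMPLE_RULES = ["User-agent: *", "Disallow: /wp-admin/"]


def write_robots_txt(directory: str, rules: list[str]) -> Path:
    """Write the rules to a file named robots.txt inside `directory`."""
    target = Path(directory) / "robots.txt"  # the name must be exactly robots.txt
    target.write_text("\n".join(rules) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    path = write_robots_txt(".", EXAMPLE_RULES)
    print(f"Wrote {path}; upload it to the site's root directory.")
```

Search engines only look for the file at the root of the site, so it needs to end up at /robots.txt on the server.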
-
Iain
Just to be sure, it's actually called robots.txt (not .text).
I just checked, and you do have one in your root directory:
http://www.iainmoran.com/robots.txt
Bizarrely, it's blocking Google from indexing your includes, which is a problem because most of your links will be in some sort of nav include, not to mention most of your website content.
It's definitely there; maybe check your public folder and www folder.
Let me know how you get on
John
-
Thanks for your quick response, Johnny,
I've just looked at my root directory via both my CPanel and FTP, but there is no robots.text file there.
I don't understand why this is, as in the Yoast section within each of my pages, there are some which I have set to "nofollow"
Also, within the Yoast CP there is a section titled "Robots.txt", which says the following:
"If you had a robots.txt file and it was editable, you could edit it from here."
Iain.
-
Iain
Your robots.txt file can be found within your root directory.
Wherever your website is hosted, log in there (through cPanel, which you more than likely have) and navigate to your public folder. The robots.txt file should reside there; download it and edit it in any text editor, such as WordPad.
If you don't have cPanel, you can add your FTP credentials to an FTP client and navigate the same way.
Once you have edited it, upload it back to your server.
Regards
J