Sitemap as Referrer in Crawl Error Report
-
I have just downloaded the SEOMoz crawl error report, and it lists a number of pages that all show FALSE.
The only common denominator is the referrer: the sitemap.
I can't find anything wrong with these pages. Should I be worried that they appear in the error report?
-
Thanks Tom.
The sitemap is pointing to the correct pages, and when visiting the pages from the search engines no problems arise.
I don't understand why these pages are listed in the crawl error report when I can't see any obvious issue.
-
Hi Christina
If the referrer is the sitemap, it means that the SEOMoz crawler has been directed to that page because of the sitemap you have submitted.
If you're getting 404 errors or access errors for certain pages and they can only be reached via the sitemap, then it's a good idea to remove those URLs from the sitemap altogether. It doesn't make sense to list URLs in your sitemap if they don't exist or have restricted access.
A cleaner sitemap will help in the long run. Hope this helps.
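If it helps, here is a rough sketch of how you could check the sitemap yourself before cleaning it up. It pulls every URL out of the sitemap XML and reports the ones that come back with a 4xx/5xx status or fail to load at all. The function names (extract_sitemap_urls, fetch_status, find_problem_urls) are made up for this example, and it assumes a standard sitemaps.org urlset file, so treat it as a starting point rather than the way the SEOMoz crawler actually works.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterable, List

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_sitemap_urls(xml_text: str) -> List[str]:
    """Return every <loc> value found in a sitemap XML document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for a URL, or 0 if the request fails outright.

    Redirects are followed, so this is the status of the final page.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; keep the code.
        return err.code
    except OSError:
        # DNS failures, refused connections, timeouts and similar.
        return 0


def find_problem_urls(
    urls: Iterable[str],
    get_status: Callable[[str], int] = fetch_status,
) -> Dict[str, int]:
    """Map each URL that failed (status 0) or returned 400+ to its status."""
    problems: Dict[str, int] = {}
    for url in urls:
        status = get_status(url)
        if status == 0 or status >= 400:
            problems[url] = status
    return problems
```

To use it, read your sitemap file into a string, pass it to extract_sitemap_urls, and hand the result to find_problem_urls. Anything in the returned dictionary is a candidate for fixing or removing from the sitemap. Some servers reject HEAD requests, so a URL flagged here is worth opening by hand before you delete it.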
Related Questions
-
Dropdown content on page being crawled
Hi, will the content within a dropdown on a page be crawled? That is, if the page visitor has to click to reveal the content in a dropdown, will it be crawled by bots? Thanks
Technical SEO | | BillSCC1 -
Sitemap: Linking horizontal pages on a sitemap that has a vertical hierarchy structure
I'm currently in the process of revamping a website and creating a sitemap for it so that all pages get indexed by search engines. The site is divided into two websites that share the same root domain. The marketing site is on example.com and the application is on go.example.com. To get to go.example.com from example.com, you need to go through one of three “action pages”. The action pages are accessed from every page on example.com where we have a CTA button on the site (that’s pretty much every page). These action pages do not link back to any other page on the site though, nor are they a necessary step to navigate to other webpages. These action pages are only viewed when a user is ready to be taken to the application site. My question is, how should these pages be set up in a vertical sitemap since these three pages have a horizontal structure? Any insight would be much appreciated!
Technical SEO | | RallyUp0 -
Should I disallow crawl of my Job board?
The Moz crawler is telling me we have loads of duplicate content issues. We use a job board plugin on our WordPress site and we have a lot of duplicate or very similar jobs (usually just a different location), but the plugin doesn't allow us to add any rel canonical tags to the individual jobs. Should I disallow the /jobs/ URL in the robots.txt file? This would solve the duplicate content issue, but then Google won't be able to crawl any of the individual job listings. Has anyone had any experience working with a job board plugin on WordPress and had a similar issue, or can anyone advise on how best to solve our duplicate content? Thanks 🙂
Technical SEO | | O2C0 -
Crawl issues
Hello there, I have found that when crawling my site I get errors regarding the meta description, saying it is missing from a few pages. I checked these pages and they do have a meta description. I also ran the same report with other tools and the same issues come up. What should I do?
Technical SEO | | PremioOscar0 -
403 error
Hey guys, I know that a 403 is not a terrible thing, but is it worthwhile fixing? If so, what is the best way to approach it? Cheers
Technical SEO | | Adamshowbiz0 -
Host sitemaps on S3?
Hey guys, I run a dynamic web service and I will start building static sitemaps for it pretty soon. The fact that my app runs across a multitude of servers doesn't make it easy to distribute frequently updated static files to all of them. My idea was to host the files in AWS S3 and point my robots.txt sitemap directive there. I'll use a sitemap index, so every other sitemap will be hosted on S3 as well. I could dynamically mirror the content from the files in S3 through my app, but that would be a little more resource intensive than just serving the static files from a common place. Any ideas? Thanks!
Technical SEO | | tanlup0 -
Children in this Sitemap index Warnings
Hi, I have just submitted a sitemap for one website, but I am getting this warning:
Number of children in this Sitemap index: 3
Sitemap contains urls which are blocked by robots.txt.
Sitemap: www.zemtube.com/videoscategory-sitemap.xml
Value: http://www.zemtube.com/videoscategory/exclusive/
Sitemap: www.zemtube.com/videoscategory-sitemap.xml
Value: http://www.zemtube.com/videoscategory/featured/
Sitemap: www.zemtube.com/videoscategory-sitemap.xml
Value: http://www.zemtube.com/videoscategory/other/
It is a WordPress website and the robots.txt file is:
# Exclude Files From All Robots:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /tag/
End robots.txt file#
I have also tried adding this to the robots.txt:
Sitemap: http://www.zemtube.com/sitemap_index.xml
Webmaster-Tools-Sitemaps-httpwww.zemtube.com_.pdf
Technical SEO | | knockmyheart0 -
4xx Client Error
I have 2 pages showing as errors in my Crawl Diagnostics, but I have no idea where these pages have come from; they don't exist on my site. I have done a site-wide search for them and they don't appear to be referenced or linked to from anywhere on my site, so where is SEOmoz pulling this info from? The two links are: http://www.adgenerator.co.uk/acessibility.asp and http://www.adgenerator.co.uk/reseller-application.asp. The first link has a spelling mistake and the second link should have an "S" on the end of "application".
Technical SEO | | IPIM0