Where does the crawler find the URLs?
-
The SEOmoz crawler has found a number of 500 error pages, 404s, etc., which is very useful.
However, some of the URLs are in weird or broken formats that we don't recognise and nobody remembers ever using - not weird enough to imply hacking, but something broken in the CMS.
Is there any way to find out where the crawler found these URLs? I can patch up and redirect the end result as best I can, but I would prefer to plug the leak.
Thanks
-
If you export the crawl diagnostics to a CSV, we do have this information in the last column.
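For anyone who wants to sort that export by source page, here is a minimal Python sketch. It assumes only what is stated above, that the referring page sits in the last column. The header names, the position of the crawled URL (first column) and the sample rows are made up for illustration and may not match the real export.

```python
import csv
import io
from collections import defaultdict


def group_by_referrer(csv_text, url_column=0):
    """Map each last-column value (the referring page) to the URLs found on it.

    url_column is an assumption: adjust it to wherever the crawled URL
    actually appears in your export.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    groups = defaultdict(list)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        groups[row[-1]].append(row[url_column])
    return dict(groups)


# Hypothetical sample data, not a real crawl export.
sample = (
    "URL,Status,Referrer\n"
    "http://example.com/a//b,404,http://example.com/index.html\n"
    "http://example.com/c%%,500,http://example.com/index.html\n"
    "http://example.com/old,404,http://example.com/news.html\n"
)

for referrer, urls in group_by_referrer(sample).items():
    print(referrer, len(urls))
```

Grouping by referrer rather than by broken URL tends to point at the template or CMS module that is generating the bad links, since one faulty page usually produces many of them.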
-
Thanks for the tips. It is a little frustrating that the information I need has passed through SEOmoz's system, but I guess they don't have the inclination or resources to show us the info.
Xenu reckons it can handle 1m URLs; we are in the position of not really knowing how many pages our site has!
-
You can pop the links into the free Xenu Link Sleuth* - after you've done a crawl just right-click on the URL you're interested in and click 'URL Properties' - you'll see any inlinks it finds listed there. Depending on the size of your site, it could take a while for the crawl to complete.
You could try the link: operator in Google first, though it won't be as thorough as Xenu.
*If you haven't seen it before, don't worry about how the Xenu website looks - the software is kosher - as recommended by many SEOmoz staff. Screaming Frog is a paid alternative (with a limited free version).
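To illustrate what an inlink report does under the hood, here is a rough Python sketch. It is not how Xenu or Screaming Frog are implemented; it just builds a "which pages link to this URL" index from HTML you already have. The `LinkCollector` and `build_inlinks` names and the sample page are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def build_inlinks(pages):
    """pages maps page URL -> HTML; returns target URL -> set of linking pages."""
    inlinks = defaultdict(set)
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links the way a crawler would.
            inlinks[urljoin(page_url, href)].add(page_url)
    return inlinks


# Hypothetical page with a relative link missing its leading slash.
pages = {
    "http://example.com/dogs/index.html": '<a href="dogs/dog.html">Dog</a>',
}

for target, sources in build_inlinks(pages).items():
    print(target, "<-", sorted(sources))
```

In the sample, a relative link without a leading slash resolves against the current directory, which is one common way a CMS ends up producing URLs that nobody remembers creating.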