Where does the crawler find the URLs?
-
The SEOmoz crawler has found a number of 500 error pages, 404s, etc., which is very useful.
However, some of the URLs are in weird or broken formats that we don't recognise and nobody remembers ever using - not weird enough to imply hacking, but enough to suggest something is broken in the CMS.
Is there any way to find out where the crawler found these URLs? I can patch up and redirect the end result as best I can, but I would prefer to plug the leak.
Thanks
-
If you export the crawl diagnostics to a CSV, we do have this information in the last column.
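As a rough sketch of how that export could be used, the snippet below groups the crawled URLs by the page that referred to them, so the pages leaking the most broken links stand out. It reads the referrer from the last column, as described above; treating the first column as the crawled URL, and the sample rows and header names, are assumptions for illustration only, so check them against your own export.

```python
import csv
import io
from collections import defaultdict


def group_by_referrer(csv_text, url_column=0):
    """Map each referring page (last CSV column) to the URLs it links to.

    url_column=0 is an assumption about where the crawled URL lives;
    adjust it to match the real export.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    by_referrer = defaultdict(list)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        referrer = row[-1].strip()
        by_referrer[referrer].append(row[url_column].strip())
    return dict(by_referrer)


# Hypothetical sample data; a real export has many more columns.
sample = """URL,Status,Referrer
http://example.com/products/undefined,404,http://example.com/news
http://example.com/old-page,500,http://example.com/news
http://example.com/gone,404,http://example.com/about
"""

result = group_by_referrer(sample)
for referrer, urls in result.items():
    print(referrer, len(urls))
```

For a real file, read its contents with `open(path, newline="").read()` and pass the text to `group_by_referrer`.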
-
Thanks for the tips. It is a little frustrating that the information I need has passed through SEOmoz's system, but I guess they don't have the inclination or resources to show us the info.
Xenu reckons it can handle 1m URLs, but we are in the position of not really knowing how many pages our site has!
-
You can pop the links into the free Xenu Link Sleuth* - after you've done a crawl just right-click on the URL you're interested in and click 'URL Properties' - you'll see any inlinks it finds listed there. Depending on the size of your site, it could take a while for the crawl to complete.
You could try the link: operator in Google first, though it won't be as thorough as Xenu.
*If you haven't seen it before, don't worry about how the Xenu website looks - the software is kosher - as recommended by many SEOmoz staff. Screaming Frog is a paid alternative (with a limited free version).
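To illustrate what an inlink report boils down to, here is a minimal sketch that builds a "which pages link to this URL" map from HTML you have already fetched. It is not how Xenu or Screaming Frog work internally; the class and function names and the sample pages are hypothetical, and it only looks at `<a href>` links.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def build_inlinks(pages):
    """Return {target URL: set of page URLs that link to it}.

    `pages` maps each page URL to its HTML source.
    """
    inlinks = defaultdict(set)
    for page_url, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links against the page they appear on.
            inlinks[urljoin(page_url, href)].add(page_url)
    return inlinks


# Hypothetical pages standing in for a crawl of your own site.
pages = {
    "http://example.com/": '<a href="/about">About</a> <a href="/products/undefined">Item</a>',
    "http://example.com/about": '<a href="products/undefined">Item</a>',
}

inlinks = build_inlinks(pages)
print(sorted(inlinks["http://example.com/products/undefined"]))
```

Looking up a broken URL in the resulting map lists the pages that need fixing, which is the same information the 'URL Properties' dialog surfaces.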
Related Questions
-
Why is the MOZ crawl returning URLs with variable results showing Missing Meta Desc? Example: http://nw-naturals.net/?page_number_0=47
Can you help me dive down into my website guts to find out why the MOZ crawl is returning URLs with variable results? And saying this is missing a description when it's not really a page? Example: http://nw-naturals.net/?page_number_0=47. I've asked MOZ but it's a web development issue so they can't help me with it. Has anyone had an issue with this on their website? Thank you!
Moz Pro | | lewisdesign0 -
Missing Meta Description in Pop Up Image Gallery URL
Hi there, Moz has reported that I have a lot of missing meta description in the Pop Up Image Gallery. There are over 150 of these that are missing. The issue is that in our CMS I cannot add meta description to images. I am also not entirely sure why the images I am loading up are generating their own URL like the one below. Possibly that is why Moz is saying it needs a meta description when it is only an image?. An example of the link that has missing meta description that I can't add would be something like this: http://www.domianhere.com/products/PopUpImageGallery/93/1?iframe=true When clicking these links it is purely an image with no text. So basically I am asking how do you add a meta description to a Pop Up image Gallery photo URL and/or possibly why would an image create a custom URL so I have to put in a Meta Description? Hopefully that makes sense, thank you in advance for your help - much appreciated!
Moz Pro | | marketing-gal0 -
Find Historical SERP Ranking for a Term?
Is there any way to find out what pages ranked for a given term historically? I.e. what were the top 10 search results for "Widgets" 6 months ago, 1 year ago, 2 years ago? If I had a campaign tracking that term, I'd be able to look back, but I do not. Does this data exist anywhere in a format that could be queried?
Moz Pro | | kpclaypool0 -
Blog Page URLs Showing Duplicate Content
In the SEOmoz Crawl Diagnostics, we are being told that we have duplicate page content on the blog page URLs. For example: blog/page/33/ blog/page/34/ blog/page/35/ blog/page/36/ These are older posts in our blog. Moz is saying that these are duplicate content. What is the best way to fix the URL structure of the pages?
Moz Pro | | _Thriveworks0 -
Why don't Google+ URL's work in OSE?
Is there any reason why Google+ URLs do not work in OSE? Is it just that it is a secure URL, or is there something bigger there? It would be cool to determine every website the person has been published on, especially if it is rel="author" verified. Jeff
Moz Pro | | WebBizIdeas1 -
Not able to find Do follow Link as shown in Seomoz Toolbar
The SEOmoz Toolbar shows 1 dofollow link on every forum question page. I have checked the source thoroughly twice and cannot trace that dofollow link or work out which site it points to. Some sample links: - http://www.mycarhelpline.com/index.php?option=com_easydiscuss&view=post&id=426&Itemid=78 - http://www.mycarhelpline.com/index.php?option=com_easydiscuss&view=post&id=63&Itemid=78 The Facebook and Google Plus links are both nofollow. Can someone please help me identify this mysterious dofollow link reported by the SEOmoz Toolbar?
Moz Pro | | Modi0 -
Canonical URLs for Search Parameters
Hi guys. Our SEOmoz campaign report is returning a lot of Rel Canonical issues similar to this for each page. The non-/ version redirects to the / version, but how do I get the ones with search parameters, i.e. '?datefrom&nights', to redirect?
http://www.lamangaclubresort.co.uk/accommodations/las-brisas-78
http://www.lamangaclubresort.co.uk/accommodations/las-brisas-78/
http://www.lamangaclubresort.co.uk/accommodations/las-brisas-78/?datefrom&nights
http://www.lamangaclubresort.co.uk/accommodations/las-brisas-78/?datefrom=&nights=
Any help would be welcome, thanks
Moz Pro | | JohnTulley0 -
How do I get the Page Authority of individual URLs in my exported (CSV) crawl reports?
I need to prioritize fixes somehow. It seems the best way to do this would be to filter my exported crawl report by the Page Authority of each URL with an error/issue. However, Page Authority doesn't seem to be included in the crawl report's CSV file. Am I missing something?
Moz Pro | | Twilio0