What crawler do you recommend for finding orphaned pages on a website?
-
Is there a crawler that you guys recommend for finding all pages on a website, including orphaned pages? A data export is not feasible. I saw a question from back in 2013 and was wondering if anything has changed since then with regard to crawling orphaned pages. Do most enterprise systems already have this built into their crawler? Or is it best to get a crawler like Xenu, Screaming Frog, or Deepcrawl?
-
Hi there!
I agree with Patrick. I was going to recommend using Screaming Frog or Google Search Console! Let me know if you try these, don't like them, and need another recommendation.
-
Hi there
I really like Screaming Frog, but I also really like Search Console and Moz Pro. The reason is that I like having different sets of data, because each tool reports something different. I also like seeing whether pages are being linked to from sources other than my own website, which Search Console does a great job of showing (and so do Majestic and Ahrefs). Different sources find different things, so it's nice to get other opinions on what you might have floating around out there.
Just my two cents! Hope this helps!
Patrick
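To make the "different sources find different things" point concrete, here is a rough sketch of how you might combine them. It assumes you can get a plain list of URLs out of your crawler (pages reachable by following internal links) and out of one or more other sources (Search Console, a backlink tool, and so on). Anything the other sources know about that the crawl never reached is a candidate orphan. The function names and the example.com URLs are made up for illustration, and the normalization rules are only a guess at what most sites need.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Reduce trivial URL variations so the same page compares equal.

    Assumption: scheme/host case, fragments, and trailing slashes do not
    distinguish pages on the site being audited.
    """
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def find_orphan_candidates(crawled, *other_sources):
    """Return URLs seen in any other source but never reached by the crawl."""
    crawled_set = {normalize(u) for u in crawled}
    known = set()
    for source in other_sources:
        known.update(normalize(u) for u in source)
    return sorted(known - crawled_set)


# Hypothetical data: in practice these lists come from your own tools.
crawled = ["https://example.com/", "https://example.com/about/"]
search_console = ["https://example.com/about", "https://example.com/old-promo"]
backlink_tool = ["https://EXAMPLE.com/old-promo/", "https://example.com/landing?x=1"]

orphans = find_orphan_candidates(crawled, search_console, backlink_tool)
print(orphans)
```

A URL on this list is only a candidate. It may be a redirect, a 404, or a page linked only through JavaScript that the crawler did not render, so it is worth checking each one by hand before treating it as orphaned.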
Related Questions
-
SEO Content Audits Questions (Removing pages from website, extracting data, organizing data).
Hi everyone! I have a few questions. We are running an SEO content audit on our entire website, and I am wondering what the best FREE way is to extract a list of all indexed pages. Would I need to use a mix of Google Analytics, Webmaster Tools, AND our XML sitemap, or could I just use Webmaster Tools to pull the full list? I just want to make sure I am not missing anything. Also, once the data is pulled and organized (it would be helpful to know the best way to pull detailed info about the pages as well!), would it be a best practice to sort by highly trafficked pages in order to rank them for prioritization (i.e., pages with the most visits will be edited and optimized first)? Lastly, I am wondering what constitutes a 'removable' page. For example, when is it appropriate to fully remove a page from our website? I understand that if you need to remove a page, it is best to redirect the visitor to another similar page OR the homepage. Is this the best practice? Thank you for the help! If you say it is best to organize by trafficked pages first in order to optimize them, I am wondering if it would be an easier process to use Moz tools like Keyword Explorer, Page Optimization, and Page Authority to rank pages and find ways to optimize them for the most relevant keywords. Let me know if this option makes MORE sense than going through the entire data extraction process.
Technical SEO | | PowerhouseMarketing0 -
What should I do with all these 404 pages?
I have a website that I'm currently working on that has been fairly dormant for a while and has just been given a facelift and brought back to life. I have some questions below about dealing with 404 pages. In Google WMT/Search Console there are reports of thousands of 404 pages going back some years. It says there are over 5k in total, but it seems I am only able to download 1k or so from WMT. I ran a crawl test with Moz and the report it sent back only had a few hundred 404s in it. Why is that? I'm also not sure what to do with all the 404 pages. I know that both Google and Moz recommend a mixture of leaving some as 404s and redirecting others, and I'd like to know what the community here suggests. The 404s are a mix of the following: blog posts and articles that have disappeared (some of these have good backlinks too); URLs that look like they used to belong to users (the site used to have a forum), which were deleted when the forum was removed, and some of which look like they were removed for spam reasons too, e.g. /user/buy-cheap-meds-online and others like that; and other URLs like /node/4455 (or some other random number). I'm thinking I should permanently redirect the blog posts to the homepage or the blog, but I'm not sure what to do about all the others. Surely having so many 404s like this is hurting my crawl rate?
Technical SEO | | linklander0 -
Reverify website
Hello, I want to disavow some really dodgy links. Although the website is verified (we can see a meta tag in the source code), we no longer have the login details for the email address used to verify it, so we can't access the tools console. How do I upload a disavow file (the links I got were from Ahrefs and Moz analysis)? Should I reverify the website using a new email address and upload a new meta tag? Is there a problem with doing this? Sorry if this is obvious!
Technical SEO | | AL123al0 -
How do I optimize a website for SEO for a client that is using a subdirectory as a separate website?
We launched a subdirectory site about two months ago for our client. What's happening is that searches for the topic covered by the subdirectory are yielding search results for the old site and not the new site. We'd like to change this. Are there best practices for the subdirectory site? Specifically, we're looking for things we can do using sitemapping and Webmaster Tools. Are there other technical things we can do? Thank you.
Technical SEO | | IVSeoTeam120 -
Switchboard Tags - Multiple desktop pages pointing to one mobile page
I have recently started to implement switchboard tags to connect our mobile and desktop pages, and to ensure that our mobile pages show up in rankings for mobile users. Because our desktop site is much deeper in content than our mobile site, there are a number of desktop pages we would like to have point to one mobile page. However, with the switchboard tags, this poses a problem because it requires multiple rel=canonical tags to be placed on the one mobile page. I'm assuming this will either confuse the search engines, or they will choose to ignore the rel=canonical tag altogether. Any ideas on how to approach this situation other than creating an equivalent mobile version of every desktop page or implementing a user agent detection redirect?
Technical SEO | | JBlank0 -
Can you 301 redirect a page to an already existing/old page ?
If you delete a page (say a sub-department/category page on an ecommerce store), should you 301 redirect its URL to the nearest equivalent page still on the site, or just delete it and forget about it? Generally, should you try to 301 redirect any old pages you're deleting if you can find a suitable page with similar content to redirect to? Won't Google consider it weird if you say a page has moved permanently to such and such an address when that page/address existed before? I presume it's fine, since in the scenario of consolidating departments on your store, you want to redirect the department page you're going to delete to the existing page/department you are consolidating the old department's products into?
Technical SEO | | Dan-Lawrence0 -
Website hacked
Hi, I've been asked to help a colleague with his website. It seems to have been hacked. He recently received an e-mail from Google saying his AdWords account was suspended 'due to high probability his site may be hosting or distributing malicious software'. I just checked his source and there seems to be loads of weird code on his pages, which would not have been put there by the website owners. Please see the attached image of what we get when we try to access his website via Google search. I just contacted the hosting provider. Does anyone have experience with this and with how to prevent such hacking in the future? The site is built using HTML with no CMS. IjW19.jpg
Technical SEO | | Socialdude0 -
Duplicate Page Title
The crawl of my website http://www.aboutaburningfire.com revealed an error showing a duplicate page title. Can someone please explain to me how to fix this? I'm not sure what it means or how to fix it. | House Church Chicago, Organic Church, Illinois http://www.aboutaburningfire.com/ 1 Pending Pending House Church Chicago, Organic Church, Illinois http://www.aboutaburningfire.com/index.html |
Technical SEO | | severity0