Unreachable Pages
-
Hi All
Is there a tool to check a website if it has stand alone unreachable pages?
Thanks for helping
-
The only possible way I can think of is if the other person's site has an xml sitemap that is accurate, complete, and was generated by the website's system itself. (As is often created by plugins on WordPress sites, for example)
You could then pull the URLs from the xml into the spreadsheet as indicated above, add the URLs from the "follow link" crawl and continue from there. If a site has an xml sitemap it's usually located at www.website.com/sitemap.xml. Alternately, it's location may be specified in the site's robots.txt file.
The only way this can be done accurately is if you can get a list of all URLs natively created by the website itself. Any third-party tool/search engine is only going to be able to find pages by following links. And the very definition of the pages you're looking for is that they've never been linked. Hence the challenge.
Paul
-
Thanks Paul! Is there any way to do that for another persons site, any tool?
-
The only way I can see accomplishing this is if you have a fully complete sitemap generated by your own website's system (ie not created by a third-party tool which simply follow links to map your site)
Once you have the full sitemap, you'll also need to do a crawl using something like Screaming Frog to capture all the pages it can find using the "follow link" method.
Now you should have a list of ALL the pages on the site (the first sitemap) and a second list of all the pages that can be found through internal linking. Load both into a spreadsheet and eliminate all the duplicate URLs. What you'll be left with "should" be the pages that aren't connected by any links - ie the orphaned pages.
You'll definitely have to do some manual cleanup in this process to deal with things like page URLs that include dynamic variables etc, but it should give a strong starting point. I'm not aware of any tool capable of doing this for you automatically.
Does this approach make sense?
Paul
-
pages without any internal links to them
-
Do you mean orphaned pages without any internal links to them? Or pages that are giving a bad server header code?
-
But I want to find the stand alone pages only. I don't want to see the reachable pages. Can any one help?
-
If the page is indexed you can just place the site url in quotes "www.site.com" in google and it will give you all the pages that has this url on it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Indexed pages
Just started a site audit and trying to determine the number of pages on a client site and whether there are more pages being indexed than actually exist. I've used four tools and got four very different answers... Google Search Console: 237 indexed pages Google search using site command: 468 results MOZ site crawl: 1013 unique URLs Screaming Frog: 183 page titles, 187 URIs (note this is a free licence, but should cut off at 500) Can anyone shed any light on why they differ so much? And where lies the truth?
Technical SEO | | muzzmoz1 -
Removed Product page on our website, what to do
We just removed an entire product category on our website, (product pages still exist, but will be removed soon as well) Should we be setting up re-directs, or can we simply delete this category and product
Technical SEO | | DutchG
pages and do nothing? We just received this in Google Webmasters tools: Google detected a significant increase in the number of URLs that return a 404 (Page Not Found) error. We have not updated the sitemap yet...Would this be enough to do or should we do more? You can view our website here: http://tinyurl.com/6la8 We removed the entire "Spring Planted Category"0 -
Page Speed or Size?
Hi everyone. I have a client who really wants to add a 1min html5 video to the background of their homepage. I have managed to reduce the size of the video to 20MB and I have tested the page in pingdom. The results are 1.85 s to load, and weighed in at 21.2 MB. My question is does Google factor page load speed or size in it's ranking factors? I am also mindful of the negative effect this could have on bounce rate. Thanks.
Technical SEO | | WillWatrous0 -
Duplicate Page Titles
I had an issue where I was getting duplicate page titles for my index file. The following URLs were being viewed as duplicates: www.calusacrossinganimalhospital.com www.calusacrossinganimalhospital.com/index.html www.calusacrossinganimalhospital.com/ I tried many solutions, and came across the rel="canonical". So i placed the the following in my index.html: I did a crawl, and it seemed to correct the duplicate content. Now I have a new message, and just want to verify if this is bad for search engines, or if it is normal. Please view the attached image. i9G89.png
Technical SEO | | pixel830 -
Is it better to delete web pages that I don't want anymore or should I 301 redirect all of the pages I delete to the homepage or another live page?
Is it better for SEO to delete web pages that I don't want anymore or should I 301 redirect all of the pages I delete to the homepage or another live page?
Technical SEO | | CustomOnlineMarketing0 -
Yahoo and Bing do not index all pages
Only 20% of our pages are indexed by Bing and Yahoo although we have correctly submitted the sitemap to bing webmaster tools and other search engines index all our content. Do you have any suggestions?
Technical SEO | | AEM130 -
On-Page Question
Im trying to increase value to specific pages by putting history, and additional images. Will copying snippets from other sites negatively affect me? Should the content be re-written completely?
Technical SEO | | Anest0 -
HTTPS attaching to home page
Hi!! Okay - weird tech question. Domain is http://hiphound.com. I have SSL attaching to checkout and my account pages. Tested and works well. Issue - I am able to reach the home page at https://hiphound.com AND http://hiphound.com. If I access the home page via HTTPS and click on a link (any link) then the site is redirected to HTTP again which is good. My concern is the home page displaying via HTTPS and HTTP. Is this is an issue that can be resolved or is it expected behavior I have to live with.? I am being told by DEV there is nothing they can do about it but want to understand why and if they are correct. Thoughts? Thank you!! Lynn
Technical SEO | | hiphound0