Unreachable Pages
-
Hi All
Is there a tool that can check whether a website has standalone, unreachable pages?
Thanks for helping
-
The only possible way I can think of is if the other person's site has an XML sitemap that is accurate, complete, and was generated by the website's system itself (as is often done by plugins on WordPress sites, for example).
You could then pull the URLs from the XML into the spreadsheet as indicated above, add the URLs from the "follow link" crawl and continue from there. If a site has an XML sitemap, it's usually located at www.website.com/sitemap.xml. Alternatively, its location may be specified in the site's robots.txt file.
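As a rough illustration of pulling the URLs out, here is a minimal Python sketch. It assumes you have already downloaded the robots.txt and sitemap files as text; the function names are made up for this example, and it only reads the standard `<loc>` entries from the sitemaps.org format.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return sitemap locations declared on 'Sitemap:' lines in robots.txt."""
    found = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the URL's own "://" stays intact.
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap (or sitemap index) file."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]
```

If the file turns out to be a sitemap index, the `<loc>` values it returns are themselves sitemap files, so you would need to fetch each one and run `urls_from_sitemap` on it again.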
The only way this can be done accurately is if you can get a list of all URLs natively created by the website itself. Any third-party tool/search engine is only going to be able to find pages by following links. And the very definition of the pages you're looking for is that they've never been linked. Hence the challenge.
Paul
-
Thanks Paul! Is there any way to do that for another person's site? Any tool?
-
The only way I can see of accomplishing this is if you have a fully complete sitemap generated by your own website's system (i.e. not created by a third-party tool that simply follows links to map your site).
Once you have the full sitemap, you'll also need to do a crawl using something like Screaming Frog to capture all the pages it can find using the "follow link" method.
Now you should have a list of ALL the pages on the site (from the sitemap) and a second list of all the pages that can be found through internal linking. Load both into a spreadsheet and eliminate every URL that appears in both lists. What you'll be left with "should" be the pages that aren't connected by any links - i.e. the orphaned pages.
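If you'd rather not do the comparison in a spreadsheet, the same step can be sketched in a few lines of Python. This is only an illustration: the function name and the example URLs are invented, and it assumes both lists have already been exported as plain URL strings.

```python
def find_orphans(sitemap_urls, crawled_urls):
    """Return URLs that are in the sitemap but were never reached by the crawl."""
    # Set difference: everything the site says exists, minus everything
    # a link-following crawler was able to find.
    return sorted(set(sitemap_urls) - set(crawled_urls))


# Hypothetical example data.
sitemap_list = [
    "https://www.website.com/",
    "https://www.website.com/about",
    "https://www.website.com/old-landing-page",
]
crawl_list = [
    "https://www.website.com/",
    "https://www.website.com/about",
]

print(find_orphans(sitemap_list, crawl_list))
# prints ['https://www.website.com/old-landing-page']
```

The comparison is an exact string match, so two URLs that differ only in a trailing slash or a tracking parameter will count as different pages.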
You'll definitely have to do some manual cleanup in this process to deal with things like page URLs that include dynamic variables, etc., but it should give you a strong starting point. I'm not aware of any tool capable of doing this for you automatically.
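Part of that cleanup can be scripted. The sketch below normalizes URLs before the two lists are compared. The list of ignored parameters and the trailing-slash rule are assumptions chosen for illustration, and would need adjusting to how the site in question actually builds its URLs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of parameters that don't change the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalize(url):
    """Reduce a URL to a comparable form (assumption-laden, adjust per site)."""
    parts = urlsplit(url.strip())
    # Drop ignored parameters and sort the rest so their order doesn't matter.
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in IGNORED_PARAMS
    )
    # Assumes /page and /page/ are the same page, which is not true on every server.
    path = parts.path.rstrip("/") or "/"
    # Lowercase scheme and host, keep the path's case, and drop the #fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )
```

Running both the sitemap list and the crawl list through `normalize` before comparing them should cut down the number of false "orphans", though anything unusual will still need checking by hand.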
Does this approach make sense?
Paul
-
Pages without any internal links to them.
-
Do you mean orphaned pages without any internal links to them? Or pages that are giving a bad server header code?
-
But I want to find the standalone pages only. I don't want to see the reachable pages. Can anyone help?
-
If the page is indexed, you can just put the site URL in quotes ("www.site.com") into Google, and it will give you all the pages that have this URL on them.