Find archived sitemap of a website that no longer exists
-
I am trying to figure out the site structure of a website and the URLs of all its pages. Normally this would be easy, but a couple of months ago the website went down and I don't think it will ever come back. Any help would be appreciated.
-
Use the Internet Archive (Wayback Machine), which effectdigital mentioned above, to find the /robots.txt file from the desired date. That file should reference the sitemap file (assuming the site properly included its sitemap reference in its robots.txt file). You can then use the same process to request the sitemap file that robots.txt referenced.
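In case it helps, here is a rough Python sketch of that robots.txt-then-sitemap process. It assumes the Wayback Machine's `https://web.archive.org/web/<timestamp>/<url>` address form, with the `id_` suffix to get the capture's original content rather than the archive's rewritten page. The function names, the `http://` scheme, and the example domain are my own placeholders, not anything from the site in question.

```python
import urllib.request

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, timestamp: str) -> str:
    """Build a Wayback Machine address for the capture nearest `timestamp`.

    `timestamp` is YYYYMMDD (optionally followed by hhmmss). The "id_"
    suffix asks for the original content without archive rewriting.
    """
    return f"{WAYBACK_PREFIX}/{timestamp}id_/{original_url}"


def extract_sitemaps(robots_txt: str) -> list[str]:
    """Return the URLs named in "Sitemap:" lines of a robots.txt body."""
    sitemaps = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split "Field: value" on the first colon.
        line = line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


def fetch_text(url: str) -> str:
    """Download a URL and decode it as text (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def find_archived_sitemaps(site: str, timestamp: str) -> dict[str, str]:
    """Map each sitemap URL listed in the archived robots.txt to its
    Wayback Machine address (hypothetical helper, assumes http://)."""
    robots = fetch_text(wayback_url(f"http://{site}/robots.txt", timestamp))
    return {sm: wayback_url(sm, timestamp) for sm in extract_sitemaps(robots)}
```

Calling `find_archived_sitemaps("example.com", "20190101")` would return the archived addresses to try next. If robots.txt was never captured, or it has no Sitemap line, the result is empty and you are back to guessing common locations.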
-
Hi Effect,
Does your second link automatically provide the sitemaps available, or does the user still need to "know" or be able to guess where they might be, e.g. /sitemap.xml?
Nick
-
You can use this site to see legacy sitemaps for some websites (though they may be partial or incomplete):
For example, check these sitemap results:
For smaller sites, the results are much easier to look at.
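If you would rather not guess where the sitemap lived, one option is to list every URL the Internet Archive captured for the domain and pick out the ones that look like sitemaps. The sketch below assumes the Wayback Machine's CDX search endpoint and its `url`, `matchType`, `output`, `fl`, `collapse`, and `limit` parameters. The function names and the example domain are placeholders of mine, and the "contains sitemap" check is only a heuristic.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(site: str, limit: int = 5000) -> str:
    """Build a CDX query listing captured URLs under `site`."""
    params = {
        "url": f"{site}/",
        "matchType": "prefix",       # everything under the site root
        "output": "json",            # JSON array, first row holds field names
        "fl": "original,timestamp",  # only the fields used here
        "collapse": "urlkey",        # one row per distinct URL
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(payload: str) -> list[dict[str, str]]:
    """Turn the CDX JSON response into one dict per capture."""
    rows = json.loads(payload) if payload.strip() else []
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def likely_sitemaps(captures: list[dict[str, str]]) -> list[dict[str, str]]:
    """Keep captures whose URL mentions "sitemap" (a heuristic only)."""
    return [c for c in captures if "sitemap" in c["original"].lower()]
```

Fetch the address from `cdx_query_url("example.com")` with any HTTP client, pass the response body to `parse_cdx_rows`, and the full result is itself a rough URL inventory of the old site even when no sitemap was ever archived.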