Pull meta descriptions from a website that isn't live anymore
-
Hi all, we moved a website over to WordPress 2 months ago. It was using .cfm before, so all of the URLs have changed. We implemented 301 redirects for each page, but we weren't able to copy over any of the meta descriptions.
We have an export file that contains all of the old web pages. Is there a tool that would let us upload the old pages and extract the meta descriptions so that we can get them onto the new website? We use the Yoast SEO plugin, which has a bulk meta description editor, so I'm assuming the easiest and most effective way would be to find a tool that generates a .csv or Excel file that we can copy and paste from. Any feedback or suggestions would be awesome, thanks!
-
You can pull the meta descriptions with Screaming Frog from the Wayback Machine if your site is archived. If you want to do this, let me know and I'll help you with the settings.
-
I would do it one better and crawl from a local web server, just to be sure. In reality, though, a password-protected directory is probably more accessible in this instance.
-
Note that Ray-pp suggests you use a private directory. Make sure to keep it out of the SERPs.
-
Thanks Ray, we've used the Screaming Frog Spider for some time now, and I've flirted with the idea of re-uploading the web files. This may be our best option, thanks.
-
Hi George,
If you can upload the old pages to a private directory, you can then use the Screaming Frog SEO Spider to crawl all of the pages and retrieve the meta descriptions. That would allow you to easily export much of the on-page SEO data, including your meta information.
Screaming Frog SEO Spider is a must-have tool for SEOs - check it out if you haven't already!
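If re-uploading the old pages isn't practical, the export could also be parsed locally. Below is a minimal sketch in Python, not a definitive tool. It assumes the export file has been unpacked into a folder of pages whose `<meta name="description">` tags are written out in the markup (descriptions that ColdFusion generated at request time would not be found this way). The names `MetaDescriptionParser`, `extract_description`, `export_descriptions`, and the CSV column names are made up for illustration.

```python
import csv
from html.parser import HTMLParser
from pathlib import Path


class MetaDescriptionParser(HTMLParser):
    """Remembers the first <meta name="description"> found in a page."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()


def extract_description(html_text):
    """Return the meta description of one page, or "" if there is none."""
    parser = MetaDescriptionParser()
    parser.feed(html_text)
    return parser.description or ""


def export_descriptions(source_dir, csv_path,
                        patterns=("*.html", "*.htm", "*.cfm")):
    """Walk source_dir and write one CSV row per page: old path, description.

    The file patterns are an assumption about what the export contains.
    """
    source = Path(source_dir)
    rows = []
    for pattern in patterns:
        for page in sorted(source.rglob(pattern)):
            text = page.read_text(encoding="utf-8", errors="replace")
            relative_path = page.relative_to(source).as_posix()
            rows.append((relative_path, extract_description(text)))

    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["old_path", "meta_description"])
        writer.writerows(rows)
    return rows
```

Calling `export_descriptions("old_site_export", "meta_descriptions.csv")` (both paths are placeholders) would produce a file that opens in Excel, so the descriptions can be matched to the new URLs through the 301 redirect list and pasted into the bulk editor.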