Pull meta descriptions from a website that isn't live anymore
-
Hi all, we moved a website over to WordPress 2 months ago. It was using .cfm before, so all of the URLs have changed. We implemented 301 redirects for each page, but we weren't able to copy over any of the meta descriptions.
We have an export file which has all of the old web pages. Is there a tool that would allow us to upload the old pages and extract the meta descriptions so that we can get them onto the new website? We use the Yoast SEO plugin, which has a bulk meta descriptions editor, so I'm assuming that the easiest/most effective way would be to find a tool that generates some sort of .csv or Excel file that we can just copy and paste from. Any feedback/suggestions would be awesome, thanks!
-
You can pull the meta descriptions with Screaming Frog from the Wayback Machine if your site is archived. If you want to do this, let me know and I'll help you with the settings.
-
I would go one better and crawl the pages from a local web server, just to be sure. Realistically, though, a password-protected directory is probably more accessible in this instance.
-
Note that Ray-pp suggests you use a private directory... Make sure to keep it out of the SERPs.
-
Thanks Ray, we've used the Screaming Frog Spider for some time now, and I've flirted with the idea of re-uploading the web files. This may be our best option, thanks.
-
Hi George,
If you can upload the old pages to a private directory, you can then use the Screaming Frog SEO Spider to crawl all of the pages and retrieve the meta descriptions. That would allow you to easily export much of the on-page SEO data, including your meta information.
Screaming Frog SEO Spider is a must-have tool for SEOs - check it out if you haven't already!
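If re-uploading the pages isn't practical, the same extraction could probably be done straight from the export on your own machine. Below is a rough sketch in Python (standard library only), not a tested migration tool. It assumes the export unpacks to a folder of files whose source contains a literal `<meta name="description">` tag; if the old .cfm templates built the description dynamically, the tag won't be in the source and a crawl of rendered pages is still the way to go. The names `extract_meta`, `export_descriptions`, and the CSV column headers are made up for illustration.

```python
import csv
from html.parser import HTMLParser
from pathlib import Path


class MetaDescriptionParser(HTMLParser):
    """Collects the <title> text and the first meta description of one page."""

    def __init__(self):
        super().__init__()
        self.description = None
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and self.description is None:
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_meta(html_text):
    """Return (title, meta description) for one page; empty strings if absent."""
    parser = MetaDescriptionParser()
    parser.feed(html_text)
    return parser.title.strip(), parser.description or ""


def export_descriptions(export_dir, csv_path,
                        patterns=("*.html", "*.htm", "*.cfm")):
    """Walk export_dir (assumed layout) and write one CSV row per page found."""
    root = Path(export_dir)
    rows = []
    for pattern in patterns:
        for page in sorted(root.rglob(pattern)):
            # Assumption: UTF-8 files; undecodable bytes are replaced, not fatal.
            text = page.read_text(encoding="utf-8", errors="replace")
            title, description = extract_meta(text)
            rows.append((page.relative_to(root).as_posix(), title, description))
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["old_path", "title", "meta_description"])
        writer.writerows(rows)
    return rows
```

Calling something like `export_descriptions("old_site_export", "meta_descriptions.csv")` (both paths are placeholders) should give you a spreadsheet keyed by old file path, which you can line up against your 301 redirect list and then copy from into the new pages.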