Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. We are not removing the content completely, and many posts can still be viewed, but both new posts and new replies are locked.
URL mapping for site migration
-
Hi all! I'm currently working on a migration for a large e-commerce site. The old site has around 2.5k URLs, the new one 7.5k. I now need to sort out the redirects from one to the other.
This is proving pretty tricky, as the URL structure has changed site-wide. There don't seem to be any consistent rules either, so using regex doesn't really work.
By and large, the copy appears to be the same though. Does anybody know of a tool I can crawl the sites with that will export each crawled URL and its related copy into a spreadsheet? That way I can crawl both sites and compare the copy to match them up.
Thanks!
-
Just to confirm mosquitohawk's comments, there's not a great way to do this other than sorting through the spreadsheet.
Hopefully URLs have distinct enough subfolders that you can break them out into sections easily.
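As a rough illustration of breaking the URLs out into sections, here is a minimal Python sketch that buckets a URL list by its first subfolder. The URLs are made-up placeholders and `group_by_section` is just an illustrative helper name, not part of any tool mentioned in this thread.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_section(urls):
    """Group URLs by their first path segment (e.g. '/shoes/...' -> 'shoes')."""
    sections = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # URLs with no path segments (the homepage) go into a '(root)' bucket.
        key = segments[0] if segments else "(root)"
        sections[key].append(url)
    return dict(sections)


# Hypothetical example URLs, not taken from the site being migrated.
old_urls = [
    "https://www.example.com/",
    "https://www.example.com/shoes/red-trainers",
    "https://www.example.com/shoes/blue-boots",
    "https://www.example.com/bags/leather-tote",
]

for section, members in group_by_section(old_urls).items():
    print(section, len(members))
```

Each bucket can then be worked through on its own tab of the spreadsheet, which keeps the manual matching to a manageable size.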
-
Darn!
Another alternative would be to use Screaming Frog to get a full list of URLs from each site, then use a scraping tool like Mozenda to scrape each list and pull the content area. That will create the data structure you want and make it available for export. Then you can basically do what I suggested before: compare the two spreadsheets.
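For the comparison step, here is a rough Python sketch. It assumes each export has already been loaded into a dict mapping URL to copy. The function names, the sample data and the 0.9 threshold are all illustrative assumptions. It tries an exact match on normalised copy first, then falls back to fuzzy matching with the standard library's `difflib`.

```python
import difflib
import re


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences don't block a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def match_by_copy(old_pages, new_pages, threshold=0.9):
    """Map each old URL to the new URL with the most similar copy.

    Old URLs with no candidate at or above the threshold map to None,
    so they can be reviewed by hand.
    """
    new_norm = {url: normalise(copy) for url, copy in new_pages.items()}
    # Index new pages by normalised copy so exact matches are a cheap lookup.
    exact = {}
    for url, text in new_norm.items():
        exact.setdefault(text, url)

    redirects = {}
    for old_url, copy in old_pages.items():
        text = normalise(copy)
        if text in exact:
            redirects[old_url] = exact[text]
            continue
        best_url, best_score = None, threshold
        for new_url, new_text in new_norm.items():
            # autojunk=False avoids the popularity heuristic on long strings.
            matcher = difflib.SequenceMatcher(None, text, new_text, autojunk=False)
            score = matcher.ratio()
            if score >= best_score:
                best_url, best_score = new_url, score
        redirects[old_url] = best_url
    return redirects


# Hypothetical sample data standing in for the two exported spreadsheets.
old_pages = {
    "/old/red-trainers": "Lightweight red trainers for everyday running.",
    "/old/about": "About our company history.",
}
new_pages = {
    "/shoes/red-trainers": "Lightweight  red trainers for everyday running!",
    "/bags/tote": "A roomy leather tote bag.",
}

for old_url, new_url in match_by_copy(old_pages, new_pages).items():
    print(old_url, "->", new_url)
```

The fuzzy pass compares every unmatched old page against every new page, so on a few thousand URLs with long copy it can be slow. Running it section by section, or only on the pages the exact pass missed, keeps it practical.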
-
Thank you for taking the time to answer. I did think of Screaming Frog, but the problem is that it only records the instances of custom parameters, not the contents. I tweeted the SF team to check, and they also said it wasn't possible. I've tried InSite Inspyder too, but that doesn't do it either.
-
Screaming Frog SEO Spider could do that for you. You'd need to set up a custom filter to look for a copy identifier (i.e. a div that always contains the main copy) and have it scrape that for you while it's crawling. Do the same for the other site and then you could match them up pretty easily, I think.
Here is a good resource on different ways of using the tool - http://www.seerinteractive.com/blog/screaming-frog-guide. We use it almost daily for a variety of tasks and find it to be pretty flexible. Good luck!
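To illustrate the "copy identifier" idea independently of any particular crawler, here is a minimal Python sketch that pulls the text out of the element with a given id from a page's HTML. The id `main-copy`, the sample HTML and the names `CopyExtractor` and `extract_copy` are assumptions for the example, and it uses only the standard library's `html.parser`.

```python
from html.parser import HTMLParser


class CopyExtractor(HTMLParser):
    """Collect the text inside the first element whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.tag = None     # tag name of the matched element, e.g. 'div'
        self.depth = 0      # > 0 while we are inside the matched element
        self.done = False   # True once the matched element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            # Count nested elements of the same tag so we find the right close.
            if tag == self.tag:
                self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.tag = tag
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_copy(html, target_id):
    """Return the whitespace-normalised text of the element with the given id."""
    parser = CopyExtractor(target_id)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


# Hypothetical page; 'main-copy' stands in for whatever id the real template uses.
sample_html = (
    '<html><body><div id="nav">Home</div>'
    '<div id="main-copy"><h1>Red trainers</h1>'
    '<div class="intro"><p>Lightweight and<br>durable.</p></div></div>'
    '<div id="footer">Contact</div></body></html>'
)
print(extract_copy(sample_html, "main-copy"))
```

Run over the saved HTML of each crawled URL, this gives one row of URL plus copy per page. Any script or style content inside the chosen element would be included too, so the element should be picked with that in mind.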
Related Questions
-
Site Migration - Pagination
Hi, we are migrating our website, and one issue we are facing is how to handle paginated content in our categories. Our new website will have the same structure but different URLs. Should we 301 redirect all the paginated content (if crawled by Google) to the URL of the main category? To put this into an example:
Old URLs: www.example.com/technology/tvs (main category of TVs, and also page 1) and www.example.com/technology/tvs?v=0&page=2 (page 2 of TVs).
New URLs: www.example.com/soundvision/tvs (main category of TVs, and also page 1) and www.example.com/soundvision/tvs?page=2 (page 2 of TVs).
Should we redirect all of the old TV URLs (including the paginated ones) to www.example.com/soundvision/tvs? There is no rel next/prev tag on our site and no canonicals. There is also a "view all products" page in each category, but it doesn't contain all the products (the maximum is 100 per page, so yes, the view-all page is also paginated). The same paginated view-all page will exist on the new website too. I checked Google Search Console, and Google has decided to treat the first page, www.example.com/technology/tvs, as the canonical page. Also, all the organic traffic to our categories goes to these pages (main category page, first page). I would appreciate any thoughts on this.
Intermediate & Advanced SEO | | HellasSITES0 -
Duplicate URLs ending with #!
Hi guys, does anyone know why a site can contain duplicate URLs ending with a hashtag and exclamation mark, e.g. https://site.com.au/#! We are finding a lot of these URLs (as duplicates) and I was wondering what they are from a developer standpoint. And do you think it's worth the time and effort to add a rel canonical tag or 301 to these URLs even though they're not getting indexed by Google? Cheers, Chris
Intermediate & Advanced SEO | | jayoliverwright0 -
Adding hreflang tags - better on each page, or the site map?
Hello, I am wondering if there is a preferred way of adding hreflang tags (from this article). My client just changed their site from gTLDs to ccTLDs, and a few sites have taken a pretty big traffic hit. One issue is definitely the number of redirects to the page, but I am also going to work with the developer to add hreflang tags. My question is: is it better to add them to the header of each page, or the sitemap, or both, or something else? Any other thoughts are appreciated. Our Australia site, which was at least findable using Australia Google before this relaunch, is not showing up, even when you search the company name directly. Thanks! Lauryn
Intermediate & Advanced SEO | | john_marketade0 -
Linking to URLs With Hash (#) in Them
How does link juice flow when linking to URLs with a hash in them? If I link to this page, which generates a pop-over on my homepage that gives info about my special offer, where will the link juice go? homepage.com/#specialoffer Will the link juice go to the homepage? Will it go nowhere? Will it go to the hash URL above? I'd like to publish an annual/evergreen sort of offer that will generate lots of links. Instead of driving those links to homepage.com/offer, I was hoping to get that link juice to flow to the homepage, or maybe even a product page, and just update the pop-over information each year as the offer changes. I've seen competitors do it this way but wanted to see what the community here thinks about linking to URLs with a hash in them. Could this also be a use case for hashes in URLs for tracking purposes?
Intermediate & Advanced SEO | | MiguelSalcido0 -
Canonical URL & sitemap URL mismatch
Hi. We're running a Magento store which doesn't have too much stock rotation. We've implemented a plugin that will allow us to give products custom canonical URLs (basically including the category slug, which is not possible through vanilla Magento). The sitemap feature doesn't pick up on these URLs, so we're submitting URLs to Google that are available and will serve content, but actually point to a longer URL via a canonical meta tag.
The content is available at each URL and is near identical (all apart from the breadcrumbs). All instances of the page point to the same canonical URL. We are using the longer URL in our internal architecture/link building to show this preference.
My questions are:
1. Will this harm our visibility?
2. Aside from editing the sitemap, are there any other signals we could give Google?
Thanks
Intermediate & Advanced SEO | | tomcraig860 -
Product or Shop in URL
Which do you think is better for SEO and for sales? I am using WooCommerce for a health products website: websitename.com/product/keyword OR websitename.com/shop/keyword
Intermediate & Advanced SEO | | MasonBaker0 -
Weird 404 URL Problem - domain name being placed at end of urls
Hey there. For some reason when doing crawl tests I'm finding pages with the domain name being tacked on the end and causing 404 errors.
For example: http://domainname.com/page-name/http://domainname.com
This is happening to all pages, posts and even category type
1. Site is in WordPress
2. Using Yoast SEO plugin
Any suggestions? Thanks!
Intermediate & Advanced SEO | | Jay3280 -
Recovery during domain migration
On average, how long does it take to recover 80% of the rankings if two high-authority domains are combined without changing any content? I totally understand that each domain is different and search engines can treat them differently, but if all the steps are followed to the T, what are the chances?
Intermediate & Advanced SEO | | ninjamarketer1