Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Way to spider Wordpress site
-
I have an old WordPress site and I want to move it to a new server and take it off WordPress (too many hacks). I am trying to spider the site so as to get static, non-WordPress pages.
I am having trouble doing this. When I spider the site, it changes the URLs. For instance, if the URL is www.domain.com/page/, the URL I get out of the spider is /page/index.html, and those are not the URLs in the search engine indices. There are about 2000 pages on this site, so it is not feasible to set up 301 redirects page by page.
I tried using these spidering programs: WinHTTrack Website Copier and PageNest.
Does anyone know of another method of turning a WordPress site into a non-WordPress site?
-
Hi Dan
Hmm, that's a little strange. Two things:
- Is WordPress updated? Do you get the normal URLs when viewing the site in your browser?
- Have you tried Screaming Frog SEO Spider? It's free to crawl up to 500 pages. Although it won't get the actual HTML of the pages, it could perhaps solve the URL issue.
This BlackHatWorld thread has a few options too.
-Dan
-
Hi Dan, I'm not so experienced in migrating a WP site to non-WP, but I understand that the issue you're having is that the spider is returning index.html files for URLs like domain/page/.
That's normal: whichever spider you use, you'll always end up with an index.html file. Every directory has its own index.html, which is the default file served for that directory unless you set up something different with rewrite rules.
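To make the directory/index.html point concrete, here is a minimal sketch of how a static server usually picks the file for a URL. It assumes the default document is named index.html (a common default, but configurable on most servers); `file_for_url` is just a hypothetical helper name for illustration.

```python
def file_for_url(url_path: str, index_name: str = "index.html") -> str:
    """Return the file (relative to the web root) a static server would
    typically serve for url_path.

    Assumption: the server's default document is `index_name`.
    """
    if url_path.endswith("/"):
        # A directory URL such as /page/ is answered with /page/index.html
        return (url_path + index_name).lstrip("/")
    # Anything else is treated as a plain file path
    return url_path.lstrip("/")


if __name__ == "__main__":
    for path in ("/", "/page/", "/style.css"):
        print(path, "->", file_for_url(path))
```

So the /page/index.html file the spider wrote is the file that answers www.domain.com/page/, and the public URL can stay the same.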
If you request /page/, the server will serve the index.html file in that directory. What you need to make sure of is that you set up a 301 redirect so that any index.html URL is redirected to the corresponding / URL (with wildcards it's a one-line rule), and that all your internal links point to the / URLs and not to the index.html versions. You can just find and replace the /index.html" string in the HTML code with /" (Dreamweaver or any HTML editor will do that in bulk).
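If you'd rather script the bulk find and replace than do it in an editor, something like the sketch below might work. It is only an illustration under a few assumptions: the spidered files are UTF-8 and end in .html, and links are quoted with " or '. The function names are hypothetical. `redirect_target` only shows the logic the one-line wildcard 301 rule would implement; it is not server configuration.

```python
import re
from pathlib import Path
from typing import Optional

# Matches /index.html immediately followed by a closing quote, i.e. the end of a link
INDEX_LINK = re.compile(r'/index\.html(["\'])')


def strip_index_links(html: str) -> str:
    """Replace /index.html" with /" (and the single-quoted form) in HTML source."""
    return INDEX_LINK.sub(r"/\1", html)


def redirect_target(request_path: str) -> Optional[str]:
    """Return the URL an index.html request should be 301-redirected to,
    or None if the path needs no redirect."""
    if request_path.endswith("/index.html"):
        return request_path[: -len("index.html")]
    return None


def rewrite_tree(root: str) -> int:
    """Rewrite every .html file under root in place; return how many files changed.

    Assumption: files are UTF-8 encoded.
    """
    changed = 0
    for page in Path(root).rglob("*.html"):
        original = page.read_text(encoding="utf-8")
        updated = strip_index_links(original)
        if updated != original:
            page.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Run `rewrite_tree` on a copy of the spidered folder first, since it edits the files in place.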
One comment on your idea: you might find it useful to build a PHP-driven site, using includes for the header, footer, and nav/sidebar. Thinking ahead, if you ever want to change a portion of the page that repeats throughout the site, with static files you'll have to edit every page and upload them all, which is a huge job and leaves room for many human/machine errors.
Hope that helped you out!