Site less than 20 pages shows 1,400+ pages when crawled
-
Hello! I’m new to SEO, and have been soaking up as much as I can. I really love it, and feel like it could be a great fit for me – I love the challenge of figuring out the SEO puzzle, plus I have a copywriting/PR background, so I feel like that would be perfect for helping businesses get a great jump on their online competition.
In fact, I was so excited about my newfound love of SEO that I offered to help a friend who owns a small business on his site. Once I started, though, I found myself hopelessly confused.
The problem comes when I crawl the site. It was designed in Wordpress, and is really not very big (part of my goal in working with him was to help him get some great content added!)
Even though there are only 11 pages – and 6 posts – for the entire site, when I use Screaming Frog to crawl it, it sees HUNDREDS of pages. It stops at 500, because that is the limit for their free version. In the campaign I started here at SEOmoz, and it says over 1,400 pages have been crawled…with something like 900 errors.
Not good, right?
So I've been trying to figure out the problem...when I look closer in Screaming Frog, I can see that some things are being repeated over and over. If I sort by the Title, the URLs look like they’re stuck in a loop somehow - one line will have /blog/category/postname…the next line will have /blog/category/category/postname…and the next line will have /blog/category/category/category/postname…and so on, with another /category/ added each time.
So, with that, I have two questions
- Does anyone know what the problem is, and how to fix it?
- Do professional SEO people troubleshoot this kind of stuff all of the time? Is this the best place to get answers to questions like that? And if not, where is?
Thanks so much in advance for your help! I’ve enjoyed reading all of the posts that are available here so far, it seems like a really excellent and helpful community...I'm looking forward to the day when I can actually answer the questions!!
-
Thanks, Irving! I am trying turning on/off the plugins - the person who designed the site used a WP Boxer plugin and Multiple Content Blocks plugin, and that is how the homepage is designed (feeding info from pages/posts) so I was wondering if that could be part of it...but when I turn them off/on that doesn't seem to help. So I'm trying the other plugins too (there are just a couple), and if that doesn't work, I'll try a fresh install!
I also tried changing the permalink structure to just /sample-post/ and that didn't seem to work either...but I'm going to keep working on it!
I haven't tried the Twitter approach yet - because I don't actually have a Twitter account (I'm trying to keep social media from taking over my life) - but if that's where the answers are, I guess I need to get on there!
-
Did you install plugins that might have caused the issue? I would deactivate all plugins and see if it has an effect then turn them on one at a time to see if you can isolate the issue.
If the plugins are not the issue, it might make sense to backup the DB and do a fresh install of WP which isn't hard.
-
I don't think the site moved hosts - I'm not the person who created it, but his business is relatively new, so if there was a change it would have been done with very little content on the site.
The permalink structure is custom and looks like this: /blog/%year%/%monthnum%/%day%/%postname%/
Would something else be better? Let me know! Thanks!!
-
Hey K,
If you could post a screen shot of the Settings>Permalink structure screen in the Wordpress Dashboard, or just copy and paste whatever is written in there in a reply, that might help diagnose the issue. Also, do you know if the site has moved hosts recently and was re-installed using the Wordpress export & import feature?
-
Thanks, Alan! I'll try contacting those guys!
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Redirecting Pages During Site Migration
Hi everyone, We are changing a website's domain name. The site architecture will stay the same, but we are renaming some pages. How do we treat redirects? I read this on Search Engine Land: The ideal way to set up your redirects is with a regex expression in the .htaccess file of your old site. The regex expression should simply swap out your domain name, or swap out HTTP for HTTPS if you are doing an SSL migration. For any pages where this isn’t possible, you will need to set up an individual redirect. Make sure this doesn’t create any conflicts with your regex and that it doesn’t produce any redirect chains. Does the above mean we are able to set up a domain redirect on the regex for pages that we are not renaming and then have individual 1:1 redirects for renamed pages in the same .htaccess file? So have both? This will not conflict with the regex rule?
Intermediate & Advanced SEO | | nhhernandez0 -
Is this a good sitemap hierarchy for a big eCommerce site (50k+ pages).
Hi guys, hope you're all good. I am currently in the process of designing a new sitemap hierarchy to ensure that every page on the site gets indexed and is accessible via Google. It's important that our sitemap file is well structured, divided and organised into relevant sub-categories to improve indexing. I just wanted to make sure that it's all good before forwarding onto the development team for them to consider. At the moment the site has everything thrown into /sitemap.xml/ and it exceeds the 50k limit. Here is what I have came up with: A primary sitemap.xml referencing other sitemap files, each of the following areas will have their own sitemap of which is referenced by /sitemap.xml/. As an example, sitemap.xml will contain 6 links, all of which link to other sitemaps. Product pages; Blog posts; Categories and sub categories; Forum posts, pages etc; TV specific pages (we have a TV show); Other pages. Is this format correct? Once it has been implemented I can then go ahead and submit all 6 separate sitemaps to webmaster tools + add a sitemap link to the footer of the site. All comments are greatly appreciated - if you know of a site which has a good sitemap architecture, please send the link my way! Brett
Intermediate & Advanced SEO | | Brett-S0 -
Our parent company has included their sitemap links in our robots.txt file - will that have an impact on the way our site is crawled?
Our parent company has included their sitemap links in our robots.txt file. All of their sitemap links are on a different domain and I'm wondering if this will have any impact on our searchability or potential rankings.
Intermediate & Advanced SEO | | tsmith1310 -
"No index" page still shows in search results and paginated pages shows page 2 in results
I have "no index, follow" on some pages, which I set 2 weeks ago. Today I see one of these pages showing in Google Search Results. I am using rel=next prev on pages, yet Page 2 of a string of pages showed up in results before Page 1. What could be the issue?
Intermediate & Advanced SEO | | khi50 -
Significantly reducing number of pages (and overall content) on new site - is it a bad idea?
Hi Mozzers - I am looking at new site (not launched yet) - it contains significantly fewer pages than the previous site - 35 pages rather than 107 before - content on the remaining pages is plentiful but I am worried about the sudden loss of a significant "chunk" of the website - significantly cutting the size of a website must surely increase the risks of post-migration performance problems? Further info - the site has run an SEO contract with a large SEO firm for several years. They don't appear to have done anything beyond tinkering with homepage content - all the header and description tags are the same across the current website. 90% of site traffic currently arrives on the homepage. Content quality/volume isn't bad across most of the current site. Thanks in advance for your input!
Intermediate & Advanced SEO | | McTaggart0 -
Multi-Location SEO: Sites vs Pages
I just started with a new company that requires multi-location SEO for its niche product/service. Currently, we have a main corporate website, as well as, 40+ individual dealer websites (we host all). Keep in mind each of these dealers consist of only 1-2 people, so corporate I will be managing the site or sites and content strategy. Many of the individual dealer sites actually rank very well (#1-#3) in their areas for our targeted keywords, but they all use the same duplicate content. Also, there are many dealer sites that have dropped off the radar in last year, which is probably because of the duplicate and static content. So I'm at a crossroads... Attempt to redo all of these location sites with unique and local content for each or Create optimized unique pages for each of them on our main site and redirect their current local domains to their page on our site Any advise regarding which direction to go in and why. Why is very important. It will be very difficult to convince a dealer that is #1 with his local site that we are redirecting to our main site, so I need some good ammo and reasoning. Also, any tips toward achieving local seo success will be greatly appreciated, too! Thank you!
Intermediate & Advanced SEO | | the-coopersmith0 -
How are pages ranked when using Google's "site:" operator?
Hi, If you perform a Google search like site:seomoz.org, how are the pages displayed sorted/ranked? Thanks!
Intermediate & Advanced SEO | | anthematic0 -
Push for site-wide https, but all pages in index are http. Should I fight the tide?
Hi there, First Q&A question 🙂 So I understand the problems caused by having a few secure pages on a site. A few links to the https version a page and you have duplicate content issues. While there are several posts here at SEOmoz that talk about the different ways of dealing with this issue with respect to secure pages, the majority of this content assumes that the goal of the SEO is to make sure no duplicate https pages end up in the index. The posts also suggest that https should only used on log in pages, contact forms, shopping carts, etc." That's the root of my problem. I'm facing the prospect of switching to https across an entire site. In the light of other https related content I've read, this might seem unecessary or overkill, but there's a vaild reason behind it. I work for a certificate authority. A company that issues SSL certificates, the cryptographic files that make the https protocol work. So there's an obvious need our site to "appear" protected, even if no sensitive data is being moved through the pages. The stronger push, however, stems from our membership of the Online Trust Alliance. https://otalliance.org/ Essentially, in the parts of the internet that deal with SSL and security, there's a push for all sites to utilize HSTS Headers and force sitewide https. Paypal and Bank of America are leading the way in this intiative, and other large retailers/banks/etc. will no doubt follow suit. Regardless of what you feel about all that, the reality is that we're looking at future that involves more privacy protection, more SSL, and more https. The bottom line for me is; I have a site of ~800 pages that I will need to switch to https. I'm finding it difficult to map the tips and tricks for keeping the odd pesky https page out of the index, to what amounts to a sitewide migratiion. So, here are a few general questions. What are the major considerations for such a switch? Are there any less obvious pitfalls lurking? Should I even consider trying to maintain an index of http pages, or should I start work on replacing (or have googlebot replace) the old pages with https versions? Is that something that can be done with canonicalization? or would something at the server level be necessary? How is that going to affect my page authority in general? What obvious questions am I not asking? Sorry to be so longwinded, but this is a tricky one for me, and I want to be sure I'm giving as much pertinent information as possible. Any input will be very much appreciated. Thanks, Dennis
Intermediate & Advanced SEO | | dennis.globalsign0