SEOMoz only crawling 5 pages of my website
-
Hello,
I've added a new website to my SEOmoz campaign tool. It only crawls 5 pages of the site. I know the site has way more pages then this and also has a blog.
Google shows at least 1000 results indexed.
Am I doing something wrong? Could it be that the site is preventing a proper crawl?
Thanks
Bill
-
You should have setup a subdomain (which is what you are very linkely to have done anyway) but this linking issue is a real sticking point for you at the moment.
It's difficult to give you concrete advise without knowing your friend's business model, marketing strategy and content, owever, lets just say for neatness he wants to keep his main squeeze page as it is - at www.kingofcopy.com - you could separate all of the squeeze pages from the 'subscribers' content by creating a sub folder called 'members-area' for example - so www.kingofcopy.com contains the squeeze page where it is now (and additional sqeeze pages reside at www.kingofcopy.com/maxoutsideusa.html etc)
and all of the opt in content is moved to www.kingofcopy.com/members-area/ ensuring all of the good info that shouldn't be visible is noindexed accordingly.
Of course, this advise is based on the assumption that you only want to rank squeeze pages.
If I were undertaking this project I would do things a little differently - as I believe that sqeeze pages have now lost some of their kick - perhaps due to the huge numbers of them I have seen... So instead I would have a lot of teaser articles and videos - which contain a lot of good keyword targeted content all SEOd to the max, making sure that there are some good nuggets of info in them - so that the reader thinks - Wow! If the stuff he gives away for free is this good then I can't wait to find out how much better the paid for stuff is!
In terms of onpage SEO and campaign management - separate content which you want highly visible from the members only content - store non-indexed pages emembers pages within a sub-folder - link all of the content you want visible and indexing in some way
-
well I am just doing this for a friend of mine. His site is not ranking as well as he would like it to. I know he has some issues but first I wanted to see what major errors I could find and then fix. Then of course I am only getting 5 pages of content.
The rest of his site is indexed in google. You can find lots of his pages. I was just trying to figure out why the tool is only crawling 5 pages.
I don't recall which campaign I set it up originally. Which one do you recommend?
-
Hey Bill,
Can you tell us what campaign type you initially setup: Sub domain, Root domain or Sub folder?
I believe you are going to struggle setting up your campaign to monitor all of these pages due to the current configuration - based on the link architecture/navigation.
Would it be fair to say that you are actually only concerned about monitoring the performance of the visible Sqeeze pages in the SERPS - because if every other page should only be visible when you opt in then it stands to reason that you would be better to have all of this content hidden using noindex, to preserve the value of the content within those pages - to give potential customers every reason to opt in?
If we had a better idea of what your end goal was it might help us better assist you.
-
I think what he has here is a squeeze page set as his home page. You can not access the rest of the site unless you optin. Of course some of the other subpages are indexed in Google so you can bypass the home page.
Because he is using a squeeze page with no navigation is this why there is no link to the rest of the sites content?
Sorry-Trying to follow along.
-
http://www.kingofcopy.com/sitemap.xml - references only 3 files (with the index and sitemap/xml link making it up to 5)
However the other sections of the site are installed into sub folders or are disconnected from the content referenced from your root www.kingofcopy.com
take a look at this sitemap further into the site from one of your subfolders http://www.kingofcopy.com/products/sitemap.xml and you will see what looks to be the 1000+ pages you refer to.
However, there is no connection between the root directory and these other pages and sub folders.
It appears that your main page is http://www.kingofcopy.com/main.html
Ordinarily you would want to bring them into one common, connected framework - with all accessible pages linked to in a structured and logical way - and if you have other exclusive squeeeze pages/landing pages that you do not want to show up in search results - and just direct users to them using mail shots etc then you can prevent them getting indexed - for example - you may want to prevent a pure sqeeze page like http://www.kingofcopy.com/max/maxoutsideusa.html from appearing in the SERPS.
To prevent all robots from indexing a page on your site, place the following meta tag into the section of your page:
Personally, I would consider a restructure to bring this content into the root directory - noindexing the squeeze pages as required - but this would need to be carefully planned and well executed with 301 redirects in place where content has moved from one directory to another
However, you could always shuffle around the first few pages - renaming main.html to index html and having the copy you currently have at www.kingofcopy.com in a lightbox/popup or similar over the top of the main page ?
I think the problem with the main.html page not being found as your default root/home page and the lack of connections between certain pages is the cause for a lot of the issues with your campaign crawling so few of the pages.
Incidentally, if you did restructure consider using Wordpress as would be a great fit with what you have produced already (and there are plently of wordpress squeeze page/product promotion themes available.
-
I feel like I've checked just about everything. I do not have access to his GWT.
Ryan, thanks for helping me with this.
-
Can you share the URL?
There are several things to check, starting with the robots.txt file and your site's navigation.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Page
I just Check Crawl the status error with Duplicate Page Content. As Mentioned Below. Songs.pk | Download free mp3, Hindi Music, Indian Mp3 Songs http://www.getmp3songspk.com Songs.pk | Download free mp3, Hindi Music, Indian Mp3 Songs http://getmp3songspk.com and then i added these lines to my htaccess file RewriteBase /
Moz Pro | | Getmp3songspk
RewriteCond %{HTTP_HOST} !^www.getmp3songspk.com$ [NC]
RewriteRule ^(.*)$ http://www.getmp3songspk.com/$1 [L,R=301] But Still See that error again when i crawl a new test.0 -
Since July 1, we've had a HUGE jump in errors on our weekly crawl. We don't think anything has changed on our website. Has MOZ changed something that would account for a large leap in duplicate content and duplicate title errors?
Our error report went from 1,900 to 18,000 in one swoop, starting right around the first of July. The errors are duplicate content and duplicate title, as if it does not see our 301 redirects. Any insights?
Moz Pro | | KristyFord0 -
Using seomoz.
Hi, I'm scouring the web for new links, and am taking a closer look at OSE; I'm trying to gauge whether the PA & DA rating(s) in the results page for a domain search takes into account any penalty signals the site may be showing, and whether there is a 'real' way to identify how much value links from the source will give you (if for example, there's already x10 other links away from that page or x100 links away from that domain). Basically - what else needs to be taken into account when using this app to identify link sources, etc. Thanks, David
Moz Pro | | newstd1000 -
Seomoz puts hyphen and company name after page title in the report, why?'
Hi Community, Can you figure out why this is happening? The title in the url itself does not contain - CodeBaby. Thank you! W6yg1cJ.png
Moz Pro | | CodeBaby0 -
Crawl Report Warnings
How much notice should be paid to the warnings on the SEO Moz crawl reports? We manage a fairly large property site and a lot of the errors on the crawl reports relate to automated responses. As a matter of priority which of the list below will have negative affects with the search engines? Temporary RedirectToo Many On-Page LinksOverly-Dynamic URLTitle Element Too Long (> 70 Characters)Title Missing or EmptyDuplicate Page ContentDuplicate Page TitleMissing Meta Description Tag
Moz Pro | | SoundinTheory0 -
Crawl credits how to buy more?
Just wondering if there is a way of increasing, my 2 crawl credits per day limit?
Moz Pro | | aussieseoguy0 -
Crawl Test has taken over 5 days and still has yet to complete
I am running some crawls on some sites and I have a number still pending. I have one from 7 days ago, a couple from 6 days ago, and 1 from 5 days ago. The confusing thing is that I have run a few others in that same period that have finished already. Do I need to restart the crawls or cancel them and start over?
Moz Pro | | DRSearchEngOpt0 -
Truncate page URLs
We have some pages (for example a contact us form) for which the URL is modified by the CMS depending on the referring page (this helps to put the form submission in context for the sales reps who get the contact submission). The SEOmoz crawler considers each URL a new page -- and so numbers like in diagnostics are all inflated as the same page is listed multiple times (e.g. for too many links) Is there a setting to change what the crawler considers to be the same page? Here are two URLs for the same page that the reports treat as separate pages: http://www.spirent.com/About-Us/Contact_us.aspx?referurl=0F528F4D703D8BB3523738D6373AA8AD http://www.spirent.com/About-Us/Contact_us.aspx?referurl=10ACDA6055244E369395223437FDCF30 The page is actually: http://www.spirent.com/About-Us/Contact_us.aspx Thanks Ken
Moz Pro | | spirent.marcom0