Moz crawler finding my homepage multiple times
-
Hi and thank you in advance for your help!
I have a Moz Pro campaign running (I am a complete Moz novice, by the way) for one of my websites (balloonsutah.com). After crawling my site, the Moz crawler informed me that I have 3 pages with duplicate content. I am not sure exactly why this is happening, but the crawler indexed my homepage 3 times under different URLs:
-balloonsutah.com
-balloonsutah.com/
-balloonsutah.com/index.html
I checked my FTP server and I cannot figure out for the life of me why the crawler is finding anything other than the index.html file.
I suppose I need to do something regarding a rel="canonical" but I am not terribly familiar with that either.
Any suggestions would be greatly appreciated!
Keenan
-
You're welcome!
-
Great answer! I appreciate the time you spent spelling everything out in detail. Thank you!
-
First things first: I checked all three web addresses, and they all exist. It would help if you could say whether or not you are using a CMS for your web pages.
All 3 pages have different page authority. That is, one of the versions ranks higher than the others. I did a quick check of that via the Moz toolbar, and it looks like index.html has the highest authority.
Note that each of the 3 versions you listed also exists in 2 forms: one with the www, and one without. Judging from the Moz toolbar, it looks like you rank better for the one without the 'www'. Rel canonical is a good option, but in this case I would try a 301 redirect from the server side first. Again, I'm not sure how much access you have to the server side. You might need to contact your web admin, hosting company, etc.
You can read more about redirects here: http://moz.com/learn/seo/redirection. If you don't have access to the server, you can try the rel canonical instead. Read more here: http://moz.com/learn/seo/duplicate-content
Example: you have www.example.com/page1.htm, /page2.htm, and /page3.htm. They all have the exact same content. Let's say that page1.htm is your main version. You can add the following to the head section of page2.htm and page3.htm:
<link rel="canonical" href="http://www.example.com/page1.htm" />
"This tag tells Bing and Google that the given page should be treated as though it were a copy of the URL www.example.com/page1.htm and that all of the links and content metrics the engines apply should actually be credited toward the provided URL."
I would recommend not deleting the other versions, but instead using a 301 redirect or a rel canonical, as they all have some page authority, with index.html (the non-www version) having the highest. You need to make that decision, but it looks like that's the one you want to be the main one anyway.
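Once a canonical tag is in place, it is worth checking that each duplicate page really declares it. Here is a minimal sketch using only Python's standard library. The `CanonicalFinder` class, the `find_canonical` helper, and the sample markup are made up for illustration and assume the page1.htm example above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also arrive here.
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonical(html_text):
    """Return a list of canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals


# Hypothetical markup for page2.htm, pointing at page1.htm as the main version.
PAGE2_HTML = """
<html><head>
<title>Page 2</title>
<link rel="canonical" href="http://www.example.com/page1.htm" />
</head><body>Same content as page 1.</body></html>
"""

print(find_canonical(PAGE2_HTML))
```

An empty list means the page has no canonical tag, and more than one entry usually means the tag was added twice.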
ALSO,
You can tell Google which version (www or non-www) you prefer in GWT (Google Webmaster Tools). You can read more here:
https://support.google.com/webmasters/answer/44231?hl=en
"Once you tell us your preferred domain name, we use that information for all future crawls of your site and indexing refreshes. For instance, if you specify your preferred domain as http://www.example.com and we find a link to your site that is formatted as http://example.com, we follow that link as http://www.example.com instead. In addition, we'll take your preference into account when displaying the URLs. If you don't specify a preferred domain, we may treat the www and non-www versions of the domain as separate references to separate pages."
"Note: Once you've set your preferred domain, you may want to use a 301 redirect to redirect traffic from your non-preferred domain, so that other search engines and visitors know which version you prefer."
You cannot control the fact that www and non-www versions of your website exist, but you can control whether duplicate pages get made, especially of your home page. I am guessing that the duplication is something that was done by your CMS. Index.html was probably made by you.
FURTHERMORE, I think .com/ and .com are one and the same thing. You probably had to choose one when you were setting up a new campaign in Moz. They probably asked you to put down the web address for your domain, and you probably entered something like "balloonsutah.com".
I'm not exactly sure why it showed you both .com and .com/, but it makes sense that it would show you .com and /index.html, as they are two different pages. Even though they have the same content, they are still two different URLs.
I probably wouldn't worry too much about it, but I'll let one of the Moz members answer about .com and .com/. I would concern myself more with 301 redirects and rel canonicals.
Hope I helped.
-
Thank you for the help!
-
Hello Keenan-price,
Welcome to the Moz community!
Moz is reporting these duplicates correctly. Each of the listed URLs is seen as a unique URL and a unique page. This is a common problem when a website does not have the proper canonical tags and 301 redirects in place for these URLs.
You'll want to decide on how your website should be displayed (which URL you prefer) and implement the canonical tag and 301 redirects.
The 301 redirects could be done with your .htaccess file, depending on your site environment. The canonical tags would depend on your site's environment (WordPress, custom development, etc.).
Also, make sure to go into your Google Webmaster Tools account and specify a single page as being the correct page, once you've decided on how you want the URL to be displayed.
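To make the 301 redirect idea concrete, here is a rough Python sketch of the decision a redirect rule encodes: which requested URLs should be sent to the preferred one. It assumes, purely for illustration, that the non-www root over http is the preferred version and that index.html is the only index filename in use. `PREFERRED_HOST`, `INDEX_FILES`, `preferred_url`, and `redirect_for` are made-up names, not part of any server or Moz API.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: the non-www host is the preferred one
# and index.html is the only directory-index filename on the site.
PREFERRED_HOST = "balloonsutah.com"
INDEX_FILES = {"index.html"}


def preferred_url(url):
    """Return the single preferred form of the given URL."""
    parts = urlsplit(url)
    netloc = parts.netloc.lower()
    if netloc == "www." + PREFERRED_HOST:
        netloc = PREFERRED_HOST
    # An empty path and "/" are the same request for the site root.
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last in INDEX_FILES:
        path = head + "/"
    return urlunsplit((parts.scheme, netloc, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) if url is a duplicate variant, else None."""
    parts = urlsplit(url)
    requested = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", parts.query, "")
    )
    target = preferred_url(url)
    return None if requested == target else (301, target)


for candidate in (
    "http://balloonsutah.com",
    "http://balloonsutah.com/",
    "http://balloonsutah.com/index.html",
    "http://www.balloonsutah.com/index.html",
):
    print(candidate, "->", redirect_for(candidate))
```

In this sketch the bare domain and the trailing-slash form need no redirect, because they are the same request to the server, while the index.html and www variants are pointed at the preferred URL. The actual rules would live in .htaccess or whatever your host provides.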