Moz crawler finding my homepage multiple times
-
Hi and thank you in advance for your help!
I have a Moz Pro campaign running (I am a complete Moz novice, by the way) for one of my websites (balloonsutah.com). After crawling my site, the Moz crawler reported that I have 3 pages with duplicate content. I am not sure exactly why this is happening, but the crawler indexed my homepage 3 times under different URLs:
-balloonsutah.com
-balloonsutah.com/
-balloonsutah.com/index.html
I checked my FTP server and I cannot figure out for the life of me why the crawler is finding anything other than the index.html file.
I suppose I need to do something with rel="canonical", but I am not terribly familiar with that either.
Any suggestions would be greatly appreciated!
Keenan
-
You're welcome!
-
Great answer! I appreciate the time you spent spelling everything out in detail. Thank you!
-
First things first: I checked all three web addresses, and they all exist. It would help if you could say whether or not you are using a CMS for your web pages.
All 3 pages have different page authority. That is, one of the versions is ranking higher than the others. I did a quick check with the Moz toolbar, and it looks like index.html has the highest authority.
Note that each of the 3 versions you listed has 2 variants of its own: one with the www and one without. Judging from the Moz toolbar, it looks like you rank better for the one without the 'www'. Rel canonical is a good option, but in this case I would try a server-side 301 redirect first. Again, I am not sure how much access you have to the server; you might need to contact your web admin, hosting company, etc.
You can read more about redirects here --> http://moz.com/learn/seo/redirection. If you don't have access to the server, you can try rel canonical instead. Read more here --> http://moz.com/learn/seo/duplicate-content
Example: you have www.example.com/page1.htm, /page2.htm, and /page3.htm. They all have exactly the same content. Let's say that page1.htm is your main version. You can add the following to the head section of page2.htm and page3.htm: <link rel="canonical" href="http://www.example.com/page1.htm" />
"This tag tells Bing and Google that the given page should be treated as though it were a copy of the URL www.example.com/page1.htm and that all of the links and content metrics the engines apply should actually be credited toward the provided URL."
I would recommend not deleting the other versions, but instead using a 301 redirect or a rel canonical, as they all have some page authority, with index.html (the non-www version) having the highest. That decision is yours to make, but it looks like that's the version you want as the main one anyway.
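If you go the rel canonical route, it is worth checking that the tag really ended up in the duplicate pages. Here is a rough sketch of one way to do that in Python with the standard library's html.parser. The CanonicalFinder class, the find_canonical helper, and the sample markup are made up for illustration; none of this is part of any Moz tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens and is case-insensitive
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html):
    """Return a list of canonical URLs declared in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical markup for page2.htm, pointing at page1.htm as the main version
page2 = """<html><head>
<title>Page 2</title>
<link rel="canonical" href="http://www.example.com/page1.htm" />
</head><body>Same content as page 1</body></html>"""

print(find_canonical(page2))
```

An empty list means the page declares no canonical URL, and more than one entry usually means a template added the tag twice.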
ALSO,
You can tell Google which version you prefer in GWT (Google Webmaster Tools). You can read more here:
https://support.google.com/webmasters/answer/44231?hl=en
"Once you tell us your preferred domain name, we use that information for all future crawls of your site and indexing refreshes. For instance, if you specify your preferred domain as http://www.example.com and we find a link to your site that is formatted as http://example.com, we follow that link as http://www.example.com instead. In addition, we'll take your preference into account when displaying the URLs. If you don't specify a preferred domain, we may treat the www and non-www versions of the domain as separate references to separate pages."
"Note: Once you've set your preferred domain, you may want to use a 301 redirect to redirect traffic from your non-preferred domain, so that other search engines and visitors know which version you prefer."
You cannot prevent the www and non-www versions of your website from both existing, but you can avoid creating duplicate pages, especially of your home page. I am guessing that the duplication is something that was done by your CMS. Index.html was probably created by you. FURTHERMORE, I think .com/ and .com are one and the same thing, and you probably had to choose one when you were setting up a new campaign in Moz. They probably asked you to enter the web address for your domain, and you probably put something like "balloonsutah.com".
I'm not exactly sure why it showed you both .com and .com/, but it makes sense that it would show you .com and /index.html, as they are two different URLs even though they have the same content.
I probably wouldn't worry too much about it, but I'll let one of the Moz members answer about .com and .com/. I would concern myself more with 301 redirects and rel canonicals.
Hope I helped.
-
Thank you for the help!
-
Hello Keenan-price,
Welcome to the Moz community!
Moz is reporting these duplicates correctly. Each of the listed URLs is seen as a unique URL and a unique page. This is a common problem when a website does not have the proper canonical tags and 301 redirects in place for these URLs.
You'll want to decide how your website should be displayed (which URL you prefer) and then implement the canonical tag and 301 redirects.
The 301 redirects could be done in your .htaccess file, depending on your site environment. The canonical tags would depend on your site's environment (WordPress, custom development, etc.).
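As a rough illustration of what those redirect rules need to accomplish, here is a minimal Python sketch (this is not .htaccess syntax). It assumes the non-www host balloonsutah.com over http is the preferred version and that index.html and index.htm are the default documents. PREFERRED_HOST, INDEX_FILES, preferred_url, and redirect_for are hypothetical names used only for this example.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "balloonsutah.com"        # assumption: non-www is preferred
INDEX_FILES = {"index.html", "index.htm"}  # assumption: default document names


def preferred_url(url):
    """Return the single URL that a page variant should resolve to."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "www." + PREFERRED_HOST:
        host = PREFERRED_HOST
    # An empty path and "/" are the same request over HTTP.
    path = parts.path or "/"
    directory, _, filename = path.rpartition("/")
    if filename.lower() in INDEX_FILES:
        path = directory + "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) for a non-preferred variant, otherwise None."""
    parts = urlsplit(url)
    # Normalise the empty path first, since ".com" and ".com/" are one request.
    requested = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", parts.query, "")
    )
    target = preferred_url(url)
    return None if target == requested else (301, target)


for variant in (
    "http://balloonsutah.com",
    "http://balloonsutah.com/",
    "http://balloonsutah.com/index.html",
    "http://www.balloonsutah.com/",
):
    print(variant, "->", redirect_for(variant))
```

Under these assumptions, the bare domain and the trailing-slash form need no redirect of their own, while /index.html and the www host are both sent to the one preferred URL.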
Also, once you've decided how you want the URL to be displayed, make sure to go into your Google Webmaster Tools account and specify that version as the preferred one.