Why do my crawl diagnostics show duplicate content?

MSSBConsulting

My crawl diagnostics show duplicate content at mysite.com and mysite.com/index.html, which are essentially the same file.

Dr-Pete

Michel is right - Google doesn't care that they're one template - if both URLs are being crawled, then they'll see that as two "pages". Every unique, crawlable URL can become an indexed page. That's why duplicate content problems are so common.

The good news is that you can put a canonical tag on just the one template/file and it will cover all of the paths/URLs that land on that file. The tag goes in your <head> section and looks like:

<link rel="canonical" href="http://mysite.com/" />
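If you want to confirm that both URLs really serve the canonical tag, a short script can pull the canonical URL out of a page's HTML. This is only a rough sketch using Python's standard library; the names `CanonicalFinder` and `find_canonical` are made up for illustration, and you would still need to fetch the HTML of each URL yourself.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page source, standing in for what mysite.com/index.html returns.
sample = '<html><head><link rel="canonical" href="http://mysite.com/" /></head></html>'
print(find_canonical(sample))
```

Running the same check against the HTML from both mysite.com and mysite.com/index.html should give the same canonical URL if the tag is on the shared file.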

I'd check the internal links, though, and see if you're linking to both versions. It's best to use one, consistent URL in your internal links for any given page.
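One way to spot mixed internal links is to collect every link on a page and group the ones that differ only by a trailing index.html. The sketch below is a minimal illustration, assuming the home page file is named index.html as in this thread; `LinkCollector`, `page_key`, and `find_inconsistent_links` are hypothetical names, not part of any SEO tool.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def page_key(url):
    """Map a URL to a key that is equal for the '/' and '/index.html' variants."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    return (parts.netloc.lower(), path)


def find_inconsistent_links(html, base_url):
    """Return {page_key: sorted URLs} for pages linked under more than one URL."""
    collector = LinkCollector()
    collector.feed(html)
    groups = defaultdict(set)
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        groups[page_key(absolute)].add(absolute)
    return {key: sorted(urls) for key, urls in groups.items() if len(urls) > 1}


# Hypothetical navigation markup that links to the home page in two ways.
sample = '<a href="/">Home</a> <a href="/index.html">Home</a> <a href="/about.html">About</a>'
print(find_inconsistent_links(sample, "http://mysite.com/"))
```

Any page that shows up in the result is being linked under more than one URL, which is where a crawler would pick up both versions.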

MSSBConsulting

mysite.com is a domain, not a file; mysite.com/index.html is the home page. I'm not sure how I would do what you suggest.

mozzello

If the crawl report found those two URLs, then your website has at least one link to each of those URLs (otherwise Rogerbot wouldn't have found them).

You should follow Collin's advice to define the canonical page.

It also won't hurt to figure out where those links are being used in your content, and then make sure you only use one to point to your page.

Cheers

Michel

CollinJarman

"Essentially" the same file isn't the same as "the same file." Your best bet is probably to mark one of them (probably mysite.com) with rel=canonical.
