Why do I see duplicate content errors when the rel="canonical" tag is present?
-
I was reviewing my first Moz crawler report and noticed the crawler returned a bunch of duplicate page content errors. The recommendations to correct this issue are to either put a 301 redirect on the duplicate URL or use the rel="canonical" tag so Google knows which URL I view as the most important and the one that should appear in the search results. However, after poking around the source code I noticed all of the pages that are returning duplicate content in the eyes of the Moz crawler already have the rel="canonical" tag.
Does the Moz crawler simply not catch whether that tag is being used? If I have that tag in place, is there anything else I need to do in order to get that error to stop showing up in the Moz crawler report?
-
We're seeing the same issue. Multiple pages are flagged as "duplicate content" but each retains a single rel canonical tag pointing to the same url.
-
Hey Webtraders,
I'm also looking at this issue. Any chance you got to the bottom of it?
-
We have the same problem with duplicate page titles in the Moz crawl errors. Has anyone found a solution for this already?
-
I had pages with a badly configured rel canonical, and the Moz crawl did not detect them as duplicate content. The Moz crawl shows me the rel canonical information under notices.
Although if you see duplicate content in the Moz crawl and you have rel canonical installed, it doesn't always mean it has
I have a lot of blog pages with the same title or description, and the Moz crawl shows them as duplicate metas, although I think that is not bad for Google, as it sees the canonical or rel page in this case.
-
Is the rel canonical pointing to the right page, or are they all just pointing to themselves?
A lot of times WordPress or similar creation tools will drop a canonical tag on each page that points to itself. What you need to do is ensure that the duplicated page is pointing to the one you want indexed...
If you cut and paste an example in here, perhaps we can be more helpful.
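For anyone who wants to check this on their own pages before pasting an example, here is a minimal sketch in Python, using only the standard library. It reads a page's HTML, pulls out the rel="canonical" link, and reports whether it points back to the page itself or to a different URL. The function and class names (`classify_canonical`, `CanonicalFinder`) and the example.com URLs are made up for illustration, and the URL comparison is a simple assumption (scheme and host compared case-insensitively, path and query compared exactly), not how the Moz crawler or Google actually compares URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> found in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def normalize(url):
    # Assumption for this sketch: scheme and host are case-insensitive,
    # while path and query must match exactly.
    parts = urlsplit(url)
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query)


def classify_canonical(page_url, html):
    """Return 'missing', 'multiple', 'self', or 'other' for one page."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        return "missing"
    if len(finder.canonicals) > 1:
        return "multiple"
    # Resolve relative hrefs against the page URL before comparing.
    target = urljoin(page_url, finder.canonicals[0])
    return "self" if normalize(target) == normalize(page_url) else "other"
```

If every URL in a flagged group comes back as "self", each duplicate is declaring itself the preferred version, which is the situation described above. A result of "other" on the duplicates, all pointing at the one page you want indexed, is what you are aiming for.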
Related Questions
-
Htaccess and robots.txt and 902 error
Hi, this is my first question in here and I truly hope someone will be able to help. It's quite a detailed problem and I'd love to be able to fix it through your kind help. It regards htaccess files, robots.txt files and 902 errors. In October I created a WordPress website from what was previously a non-WordPress site that was quite dated. I had built the new site on a subdomain I created on the existing site so that the live site could remain live whilst I built the new one on the subdomain. The site I built on the subdomain is now live, but I am concerned about the existence of the old htaccess files and robots.txt files and wonder if I should just delete the old ones to leave just the new ones on the new site. I created new htaccess and robots.txt files on the new site and have left the old htaccess files there. Just to mention that all the old content files are still sat on the server under a folder called 'old files', so I am assuming that these aren't affecting matters. I access the htaccess and robots.txt files by clicking on 'public html' via FTP. I did a Moz crawl and was astonished to see a 902 network error saying that it wasn't possible to crawl the site, but then I was alerted by Moz later on to say that the report was ready. I see 641 crawl errors ( 449 medium priority | 192 high priority | Zero low priority ). Please see attached image. Each of the errors seems to have status code 200; this seems to be applying mainly to the images on each of the pages, e.g. domain.com/imagename . The new website is built around the 907 Theme which has some page sections on the home page, and parallax sections on the home page and throughout the site. To my knowledge the content and the images on the pages are not duplicated because I have made each page as unique and original as possible. The report says 190 pages have been duplicated so I have no clue how this can be or how to approach fixing this.
Since October when the new site was launched, approx 50% of incoming traffic has dropped off at the home page and that is still the case, but the site still continues to get new traffic according to Google Analytics statistics. However Bing Yahoo and Google show a low level of Indexing and exposure which may be indicative of the search engines having difficulty crawling the site. In Google Analytics in Webmaster Tools, the screen text reports no crawl errors. W3TC is a WordPress caching plugin which I installed just a few days ago to speed up page speed, so I am not querying anything here about W3TC unless someone spots that this might be a problem, but like I said there have been problems re traffic dropping off when visitors arrive on the home page. The Yoast SEO plugin is being used. I have included information about the htaccess and robots.txt files below. The pages on the subdomain are pointing to the live domain as has been explained to me by the person who did the site migration. I'd like the site to be free from pages and files that shouldn't be there and I feel that the site needs a clean up as well as knowing if the robots.txt and htaccess files that are included in the old site should actually be there or if they should be deleted... ok here goes with the information in the files. Site 1) refers to the current website. Site 2) refers to the subdomain. Site 3 refers to the folder that contains all the old files from the old non-WordPress file structure.
**************** 1) htaccess on the current site: *********************
# BEGIN W3TC Browser Cache
<IfModule mod_deflate.c>
<IfModule mod_headers.c>
Header append Vary User-Agent env=!dont-vary
</IfModule>
<IfModule mod_filter.c>
AddOutputFilterByType DEFLATE text/css text/x-component application/x-javascript application/javascript text/javascript text/x-js text/html text/richtext image/svg+xml text/plain text/xsd text/xsl text/xml image/x-icon application/json
<IfModule mod_mime.c>
# DEFLATE by extension
AddOutputFilter DEFLATE js css htm html xml
</IfModule>
</IfModule>
</IfModule>
# END W3TC Browser Cache
# BEGIN W3TC CDN
<FilesMatch ".(ttf|ttc|otf|eot|woff|font.css)$">
<IfModule mod_headers.c>
Header set Access-Control-Allow-Origin "*"
</IfModule>
</FilesMatch>
# END W3TC CDN
# BEGIN W3TC Page Cache core
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule .* - [E=W3TC_ENC:_gzip]
RewriteCond %{HTTP_COOKIE} w3tc_preview [NC]
RewriteRule .* - [E=W3TC_PREVIEW:_preview]
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{QUERY_STRING} =""
RewriteCond %{REQUEST_URI} /$
RewriteCond %{HTTP_COOKIE} !(comment_author|wp-postpass|w3tc_logged_out|wordpress_logged_in|wptouch_switch_toggle) [NC]
RewriteCond "%{DOCUMENT_ROOT}/wp-content/cache/page_enhanced/%{HTTP_HOST}/%{REQUEST_URI}/_index%{ENV:W3TC_PREVIEW}.html%{ENV:W3TC_ENC}" -f
RewriteRule .* "/wp-content/cache/page_enhanced/%{HTTP_HOST}/%{REQUEST_URI}/_index%{ENV:W3TC_PREVIEW}.html%{ENV:W3TC_ENC}" [L]
</IfModule>
# END W3TC Page Cache core
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
....(((I have 7 301 redirects in place for old page url's to link to new page url's)))....
#Force non-www:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www.domain.co.uk [NC]
RewriteRule ^(.*)$ http://domain.co.uk/$1 [L,R=301]
**************** 1) robots.txt on the current site: *********************
User-agent: *
Disallow:
Sitemap: http://domain.co.uk/sitemap_index.xml
**************** 2) htaccess in the subdomain folder: *********************
# Switch rewrite engine off in case this was installed under HostPay.
RewriteEngine Off
SetEnv DEFAULT_PHP_VERSION 53
DirectoryIndex index.cgi index.php
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /WPnewsiteDee/
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /subdomain/index.php [L]
</IfModule>
# END WordPress
**************** 2) robots.txt in the subdomain folder: *********************
this robots.txt file is empty
**************** 3) htaccess in the Old Site folder: *********************
Deny from all
*************** 3) robots.txt in the Old Site folder: *********************
User-agent: *
Disallow: /
I have tried to be thorough so please excuse the length of my message here. I really hope one of you great people in the Moz community can help me with a solution. I have SEO knowledge I love SEO but I have not come across this before and I really don't know where to start with this one. Best Regards to you all and thank you for reading this. moz-site-crawl-report-image_zpsirfaelgm.jpg
Moz Pro | | SEOguy10 -
Magento Site "Title Missing or Empty"
Hi everyone... bear with me as I am a "noob". My Moz analysis brought up a few "critical" issues, one of which was a missing or empty title for a page. That page is: http://www.thirdcoastsigns.com/sales/guest/form Which does not even appear to be a working page... so I'm at a bit of a loss how to fix this. I suspect it's a little bit of SEO expertise and a little bit of Magento expertise. Thoughts and suggestions welcome! P.S. Look forward to my NEXT noob question about a bunch of pages I have flagged as "Pages with temporary redirects". Stay tuned!
Moz Pro | | damon12120 -
Yahoo Store Beginner with "duplicate content" errors. Can I pay for support? $$$
Hi. I have a Yahoo store that seems to have many errors. We built the site for utility knowing NOTHING about SEO. We just started with Moz and would love to PAY someone to help get us past the beginning stages. Is there someone familiar with the Yahoo! Store format who can charge us, perhaps in hourly blocks, to walk us through possible solutions to issues? One issue we are having seems to be with our subsections, which contain items that are the endpoints. I know of no way to label the sections anything but an "item". I'm wondering if this might be causing the "duplicate" error, because a specific item is listed both in the section and on its own page. Please help! Thom 888-567-5194
Moz Pro | | TITOJAX0 -
How to choose the best canonical URL
In a duplicate content situation, and assuming that both rel=canonical and a 301 redirect pass link equity (I know there is still some speculation on this), how should you choose the "best" version of the URL to establish as the redirect target or authoritative URL? For example, we have a series of duplicate pages on our site. Typically we choose the "cleanest" or shortest non-trailing-slash version of the URL as the canonical, but what if those pages are already established and have varying page authority/backlink profiles? The URLs are: example.com/stores/locate/index?parameters=tags - PA = 54, Inbound Links = 259 example.com/stores/locate/index - PA = 60, Inbound Links = 302 example.com/stores/ - This is the version that currently ranks. PA = 42, Inbound Links = 3 example.com/stores - PA = 40, Inbound Links = 8 This might not really even matter, but in the interests of conserving as much SEO value as possible, which would you choose as either the 301 redirect target and/or the canonical version? My gut is to go with the URL that's already ranking (example.com/stores/) but curious if PA, backlinks, and trailing slashes should be considered also. We of course would not 301 the URL with the tracking parameters. 🙂 Thanks for your help!
Moz Pro | | Critical_Mass0 -
How do I fix an 803 error?
I got an 803 error this week on the Moz crawl for one of my pages. The page loads normally in the browser. We use Cloudflare. Is there anything that I should do, or do I wait a week and hope it disappears? 803 Incomplete HTTP response received Your site closed its TCP connection to our crawler before our crawler could read a complete HTTP response. This typically occurs when misconfigured back-end software responds with a status line and headers but immediately closes the connection without sending any response data.
Moz Pro | | Zippy-Bungle1 -
Moz tools are returning "url is inaccessible"
Hello everyone, I have been trying to use the on page grader tool and I have also tried to do a site crawl test, and both tools have come back with a "Sorry, but that URL is inaccessible" error. This has not been a problem before. Any ideas why this is happening, e.g. what is blocking it? The URL is www.livinghouse.co.uk. Any help for a novice would be appreciated. PS. I have had another tool also not giving any results, so I assume it's something on the site which is blocking the tools. Could this also block Google? Thanks Giles
Moz Pro | | livinghouse0 -
Why does SEOMoz think I have duplicate content?
The SEOmoz crawl report shows me a large number of duplicate content pages. Our site is built on a CMS that creates the link we want it to be but also automatically creates its own longer version of the link (e.g. http://www.federalnational.com/About/tabid/82/Default.aspx and http://www.federalnational.com/about.aspx). We set the site up so that there are automatic redirects for our site. Google Webmaster does not see these pages as duplicate pages. Why does SEOmoz consider them duplicate content? Is there a way to weed this out so that the crawl report becomes more meaningful? Thanks!
Moz Pro | | jsillay0 -
Will canonical tag get rid of duplicate page title errors?
I have a directory on my website, paginated in groups of 10. On page 2 of the results, the title tag is the same as on the first page, as it is on the 3rd page and so on. This is giving me duplicate page title errors. If I use rel=canonical tags on the subsequent pages and href the first page of my results, will my duplicate page title warnings go away? Thanks.
Moz Pro | | fourthdimensioninc0
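One way to see what a paginated directory like this looks like from a crawl export is sketched below in Python. It groups pages by title and, for each group with more than one URL, notes whether all of them declare the same canonical target. The function name `duplicate_title_groups`, the tuple layout, and the example.com URLs are assumptions made up for this illustration. Nothing here establishes whether the Moz crawler stops reporting a duplicate title once the canonicals agree; the main thread above suggests it may still report it.

```python
from collections import defaultdict


def duplicate_title_groups(pages):
    """Group crawled pages by title and flag groups sharing one canonical.

    `pages` is an iterable of (url, title, canonical) tuples, where
    canonical is None when the page has no rel=canonical tag. A page
    without a tag is treated as canonical to itself (an assumption).
    """
    by_title = defaultdict(list)
    for url, title, canonical in pages:
        by_title[title.strip().lower()].append((url, canonical or url))

    report = {}
    for title, entries in by_title.items():
        if len(entries) < 2:
            continue  # a title used by only one page is not a duplicate
        targets = {canonical for _, canonical in entries}
        report[title] = {
            "urls": [url for url, _ in entries],
            "shared_canonical": len(targets) == 1,
        }
    return report
```

Called with rows such as `("http://example.com/dir?page=2", "Directory", "http://example.com/dir")`, a group where `shared_canonical` is False is the one worth fixing first, since those pages disagree about which URL is the preferred version.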