Quick Fix to "Duplicate page without canonical tag"?
-
When we pull up Google Search Console, in the Index Coverage section, under the category of Excluded, there is a sub-category called ‘Duplicate page without canonical tag’. The majority of the 665 pages in that section are from a test environment.
If we added a rule to the robots.txt file that covered every URL starting with that particular root URL ("www.domain.com/host/"), could we eliminate the majority of these errors?
That solution is not one of the 5 or 6 recommended solutions that the Google Search Console Help section text suggests. It seems like a simple effective solution. Are we missing something?
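For anyone who wants to see what such a rule would and would not block, here is a minimal sketch using Python's standard-library robots.txt parser. The robots.txt content and both URLs are assumptions made up for illustration; only the "/host/" path comes from the question. Disallow rules match by path prefix, so no trailing wildcard is needed (the standard-library parser does plain prefix matching and does not implement Google's `*` and `$` extensions).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "/host/" is the test-environment path from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /host/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Illustrative URLs only: one under the test path, one elsewhere on the site.
test_url_blocked = not parser.can_fetch("Googlebot", "https://www.domain.com/host/some-page")
live_url_allowed = parser.can_fetch("Googlebot", "https://www.domain.com/products/")

print("test URL blocked from crawling:", test_url_blocked)
print("live URL still crawlable:", live_url_allowed)
```

This only answers "may a crawler fetch this URL?". It says nothing about whether a URL that was already crawled stays in the index.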
-
No index & test Indexing Before You Launch
The domains are intended for development use and cannot be used for production. A custom or CMS-standard robots.txt will only work on Live environments with a custom domain. Adding sub-domains (i.e., dev.example.com, test.example.com) for DEV or TEST will remove the X-Robots-Tag: noindex header only, but still serve the domain robots.txt.
To support pre-launch SEO testing, we allow the following bots access to platform domains:
- Site Auditor by Raven
- SEMrush
- RogerBot by Moz
- Dotbot by Moz
If you’re testing links or SEO with other tools, you may request the addition of the tool to our robots.txt.
Pantheon's documentation on robots.txt: http://pantheon.io/docs/articles/sites/code/bots-and-indexing/
User-agent: *
Disallow: /
User-agent: RavenCrawler
User-agent: rogerbot
User-agent: dotbot
User-agent: SemrushBot
User-agent: SemrushBot-SA
Allow: /
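To check how a crawler would read the rules quoted above, here is a rough sketch with Python's standard-library parser. The blank line between the two groups and the dev.example.com URL are my assumptions for illustration; the rules themselves are copied from the quoted file. Other parsers, including Google's, may differ in details.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the quoted robots.txt; the blank line separating the
# two groups is an assumption about the original layout.
PLATFORM_ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: RavenCrawler
User-agent: rogerbot
User-agent: dotbot
User-agent: SemrushBot
User-agent: SemrushBot-SA
Allow: /
"""

parser = RobotFileParser()
parser.parse(PLATFORM_ROBOTS_TXT.splitlines())

# Hypothetical test-environment URL.
url = "https://dev.example.com/some-page"

# Map each crawler name to whether the rules let it fetch the URL.
access = {
    agent: parser.can_fetch(agent, url)
    for agent in ("Googlebot", "rogerbot", "SemrushBot")
}
print(access)
```

The named SEO tools fall into the second group and may crawl; every other crawler, Googlebot included, falls back to the catch-all group and is disallowed.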
-
The simplest solution would be to mark every page in your test environment "noindex". This is normally standard operating procedure anyway because most people don't want customers stumbling across the wrong URL in search by mistake and seeing a buggy page that isn't supposed to be "live" for customers.
Updating your robots.txt file would tell Google not to crawl those pages, but it would not remove them from the index: if Google has already crawled and indexed a page, it will simply retain the last crawled version and stop recrawling it. To get the pages out of the index, you have to direct Google to "noindex" them, and Google has to be able to crawl a page to see that directive, so don't block the same URLs in robots.txt while you wait. It will take some time as Google refreshes its crawl of each page, but eventually you'll see those errors drop off as Google removes the pages from its index. If I were consulting for a client, I would tell them to make the change and check back in two or three months.
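One way to mark every test page "noindex" without editing templates is to send an X-Robots-Tag: noindex response header from the test hosts. Below is a minimal sketch as a WSGI wrapper, assuming a Python web app; the hostnames, the function names, and the demo app are all hypothetical, and your CMS or web server may offer a simpler switch for the same thing.

```python
# Hypothetical test hostnames; replace with your own.
TEST_HOSTS = {"dev.example.com", "test.example.com"}


def is_test_host(host: str) -> bool:
    """Return True when the request host (port stripped) is a test hostname."""
    return host.split(":")[0].lower() in TEST_HOSTS


def noindex_on_test_hosts(app):
    """Wrap a WSGI app so responses on test hosts carry X-Robots-Tag: noindex."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            if is_test_host(environ.get("HTTP_HOST", "")):
                # Drop any existing value, then add the noindex directive.
                headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
                headers.append(("X-Robots-Tag", "noindex"))
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Tiny stand-in application used only for this illustration."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<h1>Hello</h1>"]


def response_headers(app, host):
    """Call a WSGI app once and return the headers it sent as a dict."""
    captured = []

    def start_response(status, headers, exc_info=None):
        captured.extend(headers)

    list(app({"HTTP_HOST": host}, start_response))
    return dict(captured)


wrapped_app = noindex_on_test_hosts(demo_app)
print(response_headers(wrapped_app, "dev.example.com"))
print(response_headers(wrapped_app, "www.example.com"))
```

The header only works if crawlers can still fetch the pages, so it should not be combined with a robots.txt block on the same URLs.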
Hope this helps!
-
The new version of Search Console will show all the pages available on your site, even the noindex pages. Why? I don't know. The truth is that even when you set those pages to nofollow and noindex, it will keep showing you the same error. That does not mean that there is something wrong with your site. I would not worry in your case.
Related Questions
-
Why do two pages compete while a canonical tag is active?
Hi guys, My SERP analysis shows me that two pages compete with each other for the keyword kinderfiets, which should not happen since a canonical tag is active. www.halfords.nl/fiets/kinderfiets/kinderfiets/ ranks #6 and www.halfords.nl/fiets/kinderfiets/ ranks #7. The first one is a subcategory which is one step deeper than the second one. I prefer consumers to land on the broader subcategory, because that one shows more products. Furthermore, we already did some SEO tweaking for the #7 page, but did not work on the #6 page. So it is even kind of strange that this page ranks higher. Can somebody help me out? Kind regards, Tom
Technical SEO | | Sebastiaan10 -
How to deal with duplicate pages on Shopify
Moz is alerting me that there are about 60 duplicate pages on my Shopify ecommerce site. Most of them are products. I'm not sure how to fix this since the coding for my site is in Liquid. I'm not sure if this is something I even need to be worried about. Most of these duplicate pages are a result of the product tags Shopify sites use to group products by characteristics that the user can select in the product view. Here are a couple of URLs: https://www.mamadoux.com/collections/all/hooded https://www.mamadoux.com/collections/all/jumpers https://www.mamadoux.com/collections/all/menswear
Technical SEO | | Mamadoux0 -
Joomla creating duplicate pages, then the duplicate page's canonical points to itself - help!
Using Joomla, every time I create an article a subsequent duplicate page is created, such as: /latest-news/218-image-stabilization-task-used-to-develop-robot-brain-interface and /component/content/article?id=218:image-stabilization-task-used-to-develop-robot-brain-interface The latter being the duplicate. This wouldn't be too much of a problem, but the canonical tag on the duplicate is pointing to itself, creating mayhem in Moz and Webmaster Tools. We have hundreds of duplicates across our website and I'm very concerned about the impact this is having on our SEO! I've tried plugins such as sh404SEF and Styleware extensions, but to no avail. Can anyone help, or does anyone know of any plugins to fix the canonicals?
Technical SEO | | JamesPearce0 -
What is the better way to fix duplication https and http?
Hi All! I have a question about how to fix the https vs. http duplication problem. What is the best way to fix it, in your opinion/experience? I, for instance, have chosen to put "noindex, nofollow" on the https version. Each page of my site has an https version, so I put this meta robots tag on it. But I am not sure what happens with all the backlinks with "https" URLs that I have; I've just checked and I have some. What do you think about it? Thanks in advance for helping!
Technical SEO | | Red_educativa0 -
Container Page/Content Page Duplicate Content
My client has a container page on their website. They are using SiteFinity, so it is called a "group page", in which individual pages appear and can be scrolled through. When links are followed, they first lead to the group page URL, in which the first content page is shown. However, when navigating through the content pages, the URL changes. When navigating BACK to the first content page, the URL is that for the content page, but it appears to indexers as a duplicate of the group page, that is, the URL that appeared when first linking to the group page. The client updates this on the regular, so I need to find a solution that will allow them to add more pages, the new one always becoming the top page, without requiring extra coding. For instance, I had considered integrating REL=NEXT and REL=PREV, but they aren't going to keep that up to date.
Technical SEO | | SpokeHQ1 -
Moz Crawl Reporting Duplicate content on "template" styled pages
We have a lot of detail pages on our site that reference specific scholarships. Each page has a different Title and Description. They also have unique information all regarding the same data points. The pages are displayed in a similar structure to the user so the data is easy to read. My problem is a lot of these pages are being reported as duplicate content when they certainly are not. Most of them are reported as duplicates when they have the same sponsor. They may have the same contact information listed. These two are being reported as duplicate of each other. They share some data but they are definitely different scholarships. http://www.collegexpress.com/scholarships/adelaide-mcclelland-garden-club-scholarship/9254/ http://www.collegexpress.com/scholarships/mary-wannamaker-witt-and-lee-hampton-witt-memorial-scholarship/10785/ Would it help to add a Canonical for each page to themselves? Any other suggestions would be great. Thanks
Technical SEO | | GeorgeLaRochelle0 -
Duplicate Home Page content and title ... Fix with a 301?
Hello everybody, I have the following errors after my first crawl: Duplicate Page Content http://www.peruviansoul.com http://www.peruviansoul.com/ http://www.peruviansoul.com/index.php?id=2 Duplicate Page title http://www.peruviansoul.com http://www.peruviansoul.com/ http://www.peruviansoul.com/index.php?id=2 Do you think I could fix them by redirecting to http://www.peruviansoul.com with a couple of 301s in the .htaccess file? Thank you all for your help. Gustavo
Technical SEO | | peruviansoul0 -
Canonical tags and relative paths
Hi, I'm seeing a problem with Roger Bot crawling a client's site. In a campaign I am seeing you say that the canonical tag is pointing to a different URL. The tag is as follows:- /~/Standards-and....etc Google say:- relative paths are recognized as expected with the tag. Also, if you include a <base> link in your document, relative paths will resolve according to the base URL. Is the issue with this, that there is a /~/, that there is no <base> link or just an issue with Roger? Best regards, Peter
Technical SEO | | peeveezee0