Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Quick Fix to "Duplicate page without canonical tag"?
-
When we pull up Google Search Console, in the Index Coverage section, under the category of Excluded, there is a sub-category called ‘Duplicate page without canonical tag’. The majority of the 665 pages in that section are from a test environment.
If we added a wildcard rule to the robots.txt file to cover every URL that starts with the particular root URL ("www.domain.com/host/"), could we eliminate the majority of these errors?
That solution is not one of the 5 or 6 recommended solutions that the Google Search Console Help section suggests, but it seems like a simple, effective fix. Are we missing something?
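For illustration, here is a minimal Python sketch of the rule being proposed, checked with the standard library's urllib.robotparser. It assumes the rule is a plain `Disallow: /host/` under `User-agent: *`. Robots.txt rules match by path prefix, so an explicit wildcard should not be needed to cover everything under that path. The two page URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: block every crawler from everything under /host/.
PROPOSED_ROBOTS_TXT = """\
User-agent: *
Disallow: /host/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs on the domain from the question.
test_page = "https://www.domain.com/host/some-test-page"
live_page = "https://www.domain.com/products/"

print(is_crawlable(PROPOSED_ROBOTS_TXT, "Googlebot", test_page))  # False
print(is_crawlable(PROPOSED_ROBOTS_TXT, "Googlebot", live_page))  # True
```

This only shows what the rule would stop crawlers from fetching. It says nothing about whether URLs Google already knows about would leave the Search Console report.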
-
No index & test Indexing Before You Launch
The platform domains are intended for development use and cannot be used for production. A custom or CMS-standard robots.txt will only work on Live environments with a custom domain. Adding sub-domains (e.g., dev.example.com, test.example.com) for DEV or TEST will remove only the X-Robots-Tag: noindex header, but will still serve the domain robots.txt.
To support pre-launch SEO testing, we allow the following bots access to platform domains:
- Site Auditor by Raven
- SEMrush
- RogerBot by Moz
- Dotbot by Moz
If you’re testing links or SEO with other tools, you may request the addition of the tool to our robots.txt.
Pantheon's documentation on robots.txt: http://pantheon.io/docs/articles/sites/code/bots-and-indexing/
User-agent: *
Disallow: /

User-agent: RavenCrawler
User-agent: rogerbot
User-agent: dotbot
User-agent: SemrushBot
User-agent: SemrushBot-SA
Allow: /
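To see what the robots.txt quoted above means in practice, here is a rough Python sketch using the standard library's urllib.robotparser. The rules are copied from the quote. The function name, the list of agents, and the dev.example.com page URL are assumptions made for the example. Real crawlers may interpret edge cases differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# Rules as quoted in the post: everyone is blocked except the listed SEO bots.
PLATFORM_ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: RavenCrawler
User-agent: rogerbot
User-agent: dotbot
User-agent: SemrushBot
User-agent: SemrushBot-SA
Allow: /
"""


def crawl_permissions(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Map each user agent to whether robots_txt allows it to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical page on a DEV sub-domain.
permissions = crawl_permissions(
    PLATFORM_ROBOTS_TXT,
    ["Googlebot", "rogerbot", "SemrushBot"],
    "https://dev.example.com/some-page",
)
for agent, allowed in permissions.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Under these rules Googlebot falls through to the catch-all group and is blocked, while the named SEO tools are let in.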
-
The simplest solution would be to mark every page in your test environment "noindex". This is normally standard operating procedure anyway because most people don't want customers stumbling across the wrong URL in search by mistake and seeing a buggy page that isn't supposed to be "live" for customers.
Updating your robots.txt file would tell Google not to crawl the pages, but if Google has already crawled them and added them to its index, it will simply retain the last crawled version and stop crawling them in the future. Blocking the crawl also means Google can never read a "noindex" directive on those pages, because it has to fetch a page to see it. You have to direct Google to "noindex" the pages instead. It will take some time as Google refreshes its crawl of each page, but eventually you'll see those errors drop off as Google removes the pages from its index. If I were consulting a client, I would tell them to make the change and check back in two or three months.
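If it helps, here is a rough Python sketch for spot-checking that a test page really does signal "noindex". It looks at the two usual places: an X-Robots-Tag response header and a meta robots tag in the HTML. The function and class names are made up for this example. It works on headers and HTML you have already fetched, and it ignores bot-specific variants such as a "googlebot" meta tag or a user-agent prefix in the header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives.extend(part.strip().lower() for part in content.split(","))


def is_noindexed(headers: dict[str, str], html: str) -> bool:
    """Return True if the header or a meta robots tag contains noindex."""
    header_directives = []
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_directives.extend(part.strip().lower() for part in value.split(","))

    meta_parser = RobotsMetaParser()
    meta_parser.feed(html)
    return "noindex" in header_directives or "noindex" in meta_parser.directives


# Hypothetical responses from a test environment.
print(is_noindexed({"X-Robots-Tag": "noindex"}, "<html></html>"))
print(is_noindexed({}, '<head><meta name="robots" content="noindex, nofollow"></head>'))
print(is_noindexed({}, "<head><title>Live page</title></head>"))
```

The first two calls should report True and the last one False. Either signal only works if the page stays crawlable, so it should not be combined with a robots.txt block on the same URLs.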
Hope this helps!
-
The new version of Search Console shows all the pages available on your site, even the noindexed ones. Why? I don't know. The truth is that even when you set those pages to nofollow and noindex, it will keep showing you the same error. That does not mean there is something wrong with your site. I would not worry in your case.