Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Quick Fix to "Duplicate page without canonical tag"?
-
When we pull up Google Search Console, in the Index Coverage section, under the category of Excluded, there is a sub-category called ‘Duplicate page without canonical tag’. The majority of the 665 pages in that section are from a test environment.
If we were to include in the robots.txt file, a wildcard to cover every URL that started with the particular root URL ("www.domain.com/host/"), could we eliminate the majority of these errors?
That solution is not one of the 5 or 6 recommended solutions that the Google Search Console Help section text suggests. It seems like a simple effective solution. Are we missing something?
-
No index & test Indexing Before You Launch
The domains are intended for development use and cannot be used for production. A custom or CMS-standard will only work
robots.txt onLive environments with a custom domain. Adding sub-domains (i.e.,dev.example.com , ``test.example.com) for DEV or TEST will remove the header only,X-Robots-Tag: noindexbut still, serve the domain.robots.txtTo support pre-launch SEO testing, we allow the following bots access to platform domains:
- Site Auditor by Raven
- SEMrush
- RogerBot by Moz
- Dotbot by Moz
If you’re testing links or SEO with other tools, you may request the addition of the tool to our
robots.txtPantheon's documentation on robots.txt: http://pantheon.io/docs/articles/sites/code/bots-and-indexing/User-agent: * Disallow: / User-agent: RavenCrawler User-agent: rogerbot User-agent: dotbot User-agent: SemrushBot User-agent: SemrushBot-SA Allow: /
-
The simplest solution would be to mark every page in your test environment "noindex". This is normally standard operating procedure anyway because most people don't want customers stumbling across the wrong URL in search by mistake and seeing a buggy page that isn't supposed to be "live" for customers.
Updating your robots.txt file would tell Google not to crawl the page, but if they've already crawled it and added it to their index it just means that they will retain the last crawled version of the page and will not crawl it in the future. You have to direct Google to "noindex" the pages. It will take some time as Google refreshes the crawl of each page, but eventually you'll see those errors drop off as Google removes those pages from their index. If I were consulting a client I would tell them to make the change and check back in two or three months.
Hope this helps!
-
The new version of search console will show all the pages available on your site. even the no-index pages, why? I don't know, the truth is even when you set up those pages as no-follow and no-index it will keeping show you the same error. That does not mean that there is something wrong with your site. I would not worry in your case.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
"Noindex, follow" for thin pages?
Hey there Mozzers, I have a question regarding Thin pages. Unfortunately, we have Thin pages, almost empty to be honest. I have the idea to ask the dev team to do "noindex, follow" on these pages. What do you think? Has someone faced this situation before? Will appreciate your input!
Technical SEO | | Europarl_SEO_Team0 -
Rel=Canonical For Landing Pages
We have PPC landing pages that are also ranking in organic search. We've decided to create new landing pages that have been improved to rank better in natural search. The PPC team however wants to use their original landing pages so we are unable to 301 these pages to the new pages being created. We need to block the old PPC pages from search. Any idea if we can use rel=canonical? The difference between old PPC page and new landing page is much more content to support keyword targeting and provide value to users. Google says it's OK to use rel=canonical if pages are similar but not sure if this applies to us. The old PPC pages have 1 paragraph of content followed by featured products for sale. The new pages have 4-5 paragraphs of content and many more products for sale. The other option would be to add meta noindex to the old PPC landing pages. Curious as to what you guys think. Thanks.
Technical SEO | | SoulSurfer80 -
Multiple H1 Tags on Page
Can having multiple H1 tags on a webpage be detrimental to its rankings?
Technical SEO | | AubbiefromAubenRealty0 -
Duplicate Content Issues on Product Pages
Hi guys Just keen to gauge your opinion on a quandary that has been bugging me for a while now. I work on an ecommerce website that sells around 20,000 products. A lot of the product SKUs are exactly the same in terms of how they work and what they offer the customer. Often it is 1 variable that changes. For example, the product may be available in 200 different sizes and 2 colours (therefore 400 SKUs available to purchase). Theese SKUs have been uploaded to the website as individual entires so that the customer can purchase them, with the only difference between the listings likely to be key signifiers such as colour, size, price, part number etc. Moz has flagged these pages up as duplicate content. Now I have worked on websites long enough now to know that duplicate content is never good from an SEO perspective, but I am struggling to work out an effective way in which I can display such a large number of almost identical products without falling foul of the duplicate content issue. If you wouldnt mind sharing any ideas or approaches that have been taken by you guys that would be great!
Technical SEO | | DHS_SH0 -
The Mysterious Case of Pagination, Canonical Tags
Hey guys, My head explodes when I think of this problem. So I will leave it to you guys to find a solution... My root domain (xxx.com) runs on WordPress platform. I use Yoast SEO plugin. The next page of root domain -- page/2/ -- has been canonicalized to the same page -- page/2/ points to page/2/ for example. The page/2/ and remaining pages also have this rel tags: I have also added "noindex,follow" to page/2/ and further -- Yoast does this automatically. Note: Yoast plugin also adds canonical to page/2/...page/3/ automatically. Same is the case with category pages and tag pages. Oh, and the author pages too -- they all have self-canonicalization, rel prev & rel next tags, and have been "noindex, followed." Problem: Am I doing this the way it should be done? I asked a Google Webmaster employee on rel next and prev tags, and this is what she said: "We do not recommend noindexing later pages, nor rel="canonical"izing everything to the first page." (My bad, last year I was canonicalizing pages to first page). One of the popular blog, a competitor, uses none of these tags. Yet they rank higher. Others following this format have been hit with every kind of Google algorithm I could think of. I want to leave it to Google to decide what's better, but then again, Yoast SEO plugin rules my blog -- okay, let's say I am a bad coder. Any help, suggestions, and thoughts are highly appreciated. 🙂 Update 1: Paginated pages -- including category pages and tag pages -- have unique snippets; no full-length posts. Thought I'd make that clear.
Technical SEO | | sidstar0 -
Two different canonical tags on one page
Due to an error, some of my pages now have two canonical tags on them. One is correct and the other goes to a nonsense URL (404 page). I know I should ideally remove the incorrect ones, but it's a big manual job. Are they doing any harm? Can I just leave them there and let Google figure it out? The correct ones are higher up in the code. Will this make a difference? Any help appreciated.
Technical SEO | | ShearingsGroup0 -
Why "title missing or empty" when title tag exists?
Greetings! On Dec 1, 2011 in a SEOMoz campaign, two crawl metrics shot up from zero (Nov 17, Nov 24). "Title missing or empty" was 9,676. "Duplicate page content" was 9,678. Whoa! Content at site has not changed. I checked a sample of web pages and each seems to have a proper TITLE tag. Page content differs as well -- albeit we list electronic part numbers of hard-to-find parts, which look similar. I found a similar post http://www.seomoz.org/q/why-crawl-error-title-missing-or-empty-when-there-is-already-title-and-meta-desciption-in-place . In answer, Sha ran Screaming Frog crawler. I ran Frog crawler on a few hundred pages. Titles were found and hash codes were unique. Hmmm. Site with errors is http://electronics1.usbid.com Small sample of pages with errors: electronics1.usbid.com/catalog_10.html
Technical SEO | | groovykarma
electronics1.usbid.com/catalog_100.html
electronics1.usbid.com/catalog_1000.html I've tried to reproduce errors yet I cannot. What am I missing please? Thanks kindly, Loren0 -
Should there be a canonical tag on my 404 error page?
In my crawl diagnostics, I notice some 4xx client errors. They are appearing for pages that no longer exist, so I'm not sure what the problem is. Shouldn't they just be dealt as 404's? Anyway, on closer inspection I noticed that my 404 error page contains a canonical tag which points to the missing page. Could this be the issue? Is it a good idea to remove the canonical tag from this error page? Thanks.
Technical SEO | | Leighm0