Duplicate Page Errors
-
Hey guys,
I'm wondering if anyone can help... Here is my issue...
Our website:
http://www.cryopak.com
It's built on Concrete5 CMS. I'm noticing a ton of duplicate page errors (9,530 to be exact). I'm looking at the issues and it looks like they're being caused by the CMS. For instance, the home page seems to be duplicating:
http://www.cryopak.com/en/
http://www.cryopak.com/en/?DepartmentId=67
http://www.cryopak.com/en/?DepartmentId=25
http://www.cryopak.com/en/?DepartmentId=4
http://www.cryopak.com/en/?DepartmentId=66
Do you think this is an issue? Is there any way to fix it? It seems to be happening on every page.
Thanks
Jim
-
Thanks everyone for the help. This should definitely help clean up some of the problems that I've been having with the website.
-
I ran a crawl with Xenu (similar to what Donna did with Screaming Frog), and came across some deep pages that may be causing this problem. For example, on this page...
...the last link to "phase change material" goes to:
http://www.cryopak.com/product_line/default.aspx?DepartmentId=67
...which then redirects to...
http://www.cryopak.com/en/?DepartmentId=67
It seems like multiple pages share that template, so one canonical tag might clean up a lot. I'd have to understand the site structure a lot better to advise, though. Google doesn't seem to be indexing these URLs, so they probably aren't a huge problem, but they could be diluting your ranking power. It's worth cleaning them up.
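To illustrate what that canonical tag would look like: the snippet below assumes the clean /en/ URL is the version you want indexed, which seems likely but is something only you can confirm. Since the parameterized variants all render from the same template, adding it to the template once should cover every variant.

```html
<!-- Goes in the <head> of the template, so it appears on every variant:
     /en/, /en/?DepartmentId=67, /en/?DepartmentId=25, and so on.
     The href should always point at the one preferred URL. -->
<link rel="canonical" href="http://www.cryopak.com/en/" />
```

Each other duplicated page on the site would need its own canonical href, but if the CMS exposes the page's "clean" URL in the template, that can usually be generated dynamically.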
-
James,
I did a scan of your site. Your problem appears to have several sources. Do you know how to use the Screaming Frog crawl utility? It's free for sites with fewer than 500 pages. When I ran a scan on your site, looking only at the HTML pages, I came up with 283.
- You have search result pages indexed that shouldn't be. They'll look like duplicates to Google.
- You have product pages that contain a lot of the same content, for example http://www.cryopak.com/en/cold-chain-packaging/pre-qualified-shipping-containers/timesaver-2-8-c-series/timesaver24-24-hour-pre-qualified-shipper/ and http://www.cryopak.com/en/cold-chain-packaging/pre-qualified-shipping-containers/timesaver-2-8-c-series/timesaver48-48-hour-pre-qualified-shipper/ (24-24-hour vs 48-48-hour).
- You have different pages with the exact same title tag.
- You have some pages that are identical with one extra character in the URL e.g. http://www.cryopak.com/en/about and http://www.cryopak.com/en/about/. See that extra slash at the end?
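For that last issue, the usual fix is to pick one trailing-slash convention and 301 everything else to it. As a sketch only: the rule below assumes the site runs on Apache with mod_rewrite (Concrete5 usually does, though the .aspx URLs elsewhere in this thread suggest it might actually be IIS, in which case you'd want an equivalent URL Rewrite rule in web.config instead).

```apache
# Hypothetical .htaccess sketch: 301-redirect URLs without a trailing
# slash to the trailing-slash version, e.g. /en/about -> /en/about/.
RewriteEngine On
# Skip real files (images, CSS, etc.) so they aren't redirected.
RewriteCond %{REQUEST_FILENAME} !-f
# Only act on URLs that don't already end in a slash.
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^(.*)$ /$1/ [R=301,L]
```

Whichever convention you choose, make sure your internal links and sitemap use the same version so you aren't redirecting your own navigation.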
I suggest you run a scan and inventory the results to get a good idea of where your problems are.
I'm not seeing your http://www.cryopak.com/en/?DepartmentId=xx (where xx represents 67, 25, 4 and 66) in the scan results. They're not redirecting and I don't see a canonical tag in the source code so I don't know what to tell you about those.
If it helps, I can direct message you a CSV file with the results of my scan.
-
Okay... well, I don't see any duplicate page issues in Webmaster Tools; I only see them in the SEOmoz Crawl Errors report. So if they aren't showing up in Webmaster Tools, should I really worry about this?
I can't edit those pages individually because those pages don't exist; they're just a product of the CMS generating those URL strings with the numbers. So I don't think I can add canonical tags to those pages.
I guess I can group them together and do 301 redirects??
Yes.. http://cryopak.spydertrapdev.com/ is just a dev environment.
-
Hi James,
I suggest you canonical the duplicate pages rather than 301 redirect them. Using canonical tags instead of 301 redirects will allow you to preserve any incoming link equity from external links to those pages. With a 301 redirect, you'll lose that equity.
David may have run your site through Open Site Explorer (OSE) and seen that there are very few incoming links to the duplicate pages, and therefore felt it unnecessary to canonicalize them. I see only 8 from the example you gave us above, but I don't want to assume that's all there is, especially when you're saying you see duplicates across the site. If you have Webmaster Tools set up, you can get a more exhaustive list of incoming links there.
The other thing I noticed is that the incoming links to the sample pages are coming from a cryopak subdomain on another site. Here are the ones I can see using OSE.
http://cryopak.spydertrapdev.com/product_line/default.aspx?DepartmentId=25
http://cryopak.spydertrapdev.com/product_line/default.aspx?DepartmentId=4
http://cryopak.spydertrapdev.com/product_line/default.aspx?DepartmentId=66
http://cryopak.spydertrapdev.com/product_line/default.aspx?DepartmentId=67
I get an error when I try to look at spydertrapdev.com so can't tell if that's a development environment that's been set up for your site or what. These may not be links you want to maintain. You’ll have to decide.
Good luck.
Donna
-
There are two ways to fix this.
The first is to redirect all the duplicate pages to the proper home page using a 301. Duplicate pages are bad for SEO; Google likes to see one set of content for each URL. See the Webmaster Tools article on duplicate content here.
The second is to go into Webmaster Tools and set the true URL for this page using the "URL parameters" function. That way you can tell Google the proper version of the page, so it knows what to index. Be very careful when doing this, as you can mess up the way Google sees your site. There is a video on that page; I'd watch it and do a bit of reading first.
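As a sketch of the first option: assuming the site runs on Apache with mod_rewrite (the .aspx URLs in this thread suggest it might actually be IIS, where a web.config URL Rewrite rule would be the equivalent), something along these lines would 301 the parameterized home page variants to the clean URL:

```apache
# Hypothetical .htaccess sketch: 301 /en/?DepartmentId=NN to /en/.
RewriteEngine On
# Match any request whose query string contains a DepartmentId parameter.
RewriteCond %{QUERY_STRING} (^|&)DepartmentId=\d+ [NC]
# The trailing "?" on the target strips the query string from the redirect.
RewriteRule ^en/?$ /en/? [R=301,L]
```

On Apache 2.4+ you could use the QSD flag instead of the trailing "?". Either way, test on a staging copy first, since a bad rewrite rule can take down live pages.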